K-Steering How It Works
This page covers installation, a quickstart walkthrough, and guidance for running larger models.
Installation
Clone the Repository
git clone https://github.com/withmartian/k-steering.git
Prerequisites
- Python 3.12 or higher
- uv: Fast Python package installer and resolver
To install uv, follow the
Astral installation guide.
Install Dependencies
For now, we recommend running K-Steering locally from the repository root:
uv sync
This creates the environment and installs the required dependencies.
Quick Start
Try it in Google Colab
You can explore K-Steering without any local setup using the Colab notebook.
Includes installation, training, and inference examples.
API Usage
See the
examples/ directory for complete
scripts for training different steering models.
K-Steering (Non-Linear Steering)
This example shows how to use K-Steering to guide a language model's behavior by training lightweight steering classifiers and applying them during inference.
1. Load Required Modules
from k_steering.steering.config import SteeringConfig
from k_steering.steering.k_steer import KSteering
2. Select a Base Model
# Hugging Face model to be steered
MODEL_NAME = "unsloth/Llama-3.2-1B-Instruct"
3. Configure Steering
Define which layers are used to train and apply steering.
steering_config = SteeringConfig(
train_layer=1, # Layer used to train the steering classifier
steer_layers=[1, 3], # Layers where steering is applied
)
4. Task and Generation Settings
TASK_NAME = "debates" # e.g. "debates" or "tones"
MAX_NEW_TOKENS = 100 # Maximum number of tokens to generate
MAX_SAMPLES = 10 # Maximum number of samples for training
GENERATION_KWARGS = {
"max_new_tokens": MAX_NEW_TOKENS,
"temperature": 1.0,
"top_p": 0.9,
}
5. Initialize K-Steering
Wrap the base model with K-Steering.
steer_model = KSteering(
model_name=MODEL_NAME,
steering_config=steering_config,
)
6. Train Steering Classifiers
Train steering classifiers on task-specific data. Remove max_samples to use the full dataset.
steer_model.fit(
task=TASK_NAME,
max_samples=MAX_SAMPLES,
)
7. Generate Steered Outputs
prompts = [
"Are political ideologies evolving in response to global challenges?"
]
output = steer_model.get_steered_output(
prompts,
target_labels=["Empirical Grounding"], # Behaviors to encourage
avoid_labels=["Straw Man Reframing"], # Behaviors to suppress
generation_kwargs=GENERATION_KWARGS,
)
print(output)
Large Model Setup
The table below provides approximate GPU memory requirements for transformer models at different parameter scales. Use it to estimate what can run on free Colab versus what needs larger hardware.
| Model Size | Params | FP16 VRAM (Inference) | 4-bit VRAM (Inference) | Recommended GPU | Colab Free Feasible? |
|---|---|---|---|---|---|
| Tiny | 100M-300M | ~0.5-1 GB | ~0.3-0.5 GB | Any GPU | Yes |
| Small | 500M-1B | ~2-3 GB | ~1-1.5 GB | T4 / L4 | Yes |
| Medium | 2B-3B | ~5-7 GB | ~2-3 GB | T4 (tight) / L4 | No |
| Upper-Mid | 7B | ~14-16 GB | ~4-6 GB | L4 / A100 | No |
| Large | 13B | ~26-28 GB | ~8-10 GB | A100 40GB | No |
| Very Large | 30B | ~60+ GB | ~18-22 GB | Multi-GPU | No |
| Frontier | 70B | ~140+ GB | ~35-40 GB | Multi A100/H100 | No |