Why We Built a Custom SSH Proxy
Most SSH proxies are about logging. Record what happened, audit it later, maybe raise an alert. That is useful, but it does not prevent anything. By the time your SIEM fires, rm -rf / has already finished.
We needed something different: a proxy that can block a command before it reaches the target server. The operator types DROP TABLE users, the proxy intercepts it, sends it to a reviewer over WebSocket, and holds the SSH session in limbo until a human says "yes" or "no." If denied, the command never executes. The target server never even sees it.
No off-the-shelf SSH proxy does this. OpenSSH's ForceCommand cannot selectively block. Session recording tools like asciinema are post-hoc. Bastion hosts authenticate, but they do not approve. So we built expacti-sshd from scratch in Rust, using the russh crate.
Architecture: Server + Client in One Process
The core insight is that an SSH proxy is both a server (accepting connections from the operator) and a client (connecting to the target host). With russh, we run both roles in the same Tokio runtime:
// Simplified connection flow
//
// SSH Client → expacti-sshd (Server Handler) → Target SSH Server
//                        │
//                        ▼
//              CommandBuffer (PTY parsing)
//                        │
//                        ▼
//             Approval Service (WebSocket)
When a client connects, the proxy's russh::server::Handler implementation creates a new SshProxy struct. That struct holds:
- A CommandBuffer — the PTY parser that reconstructs typed commands from raw bytes
- An Arc&lt;AtomicBool&gt; flag for the approval state
- A channel sender to forward bytes to the target via an mpsc channel
- A reference to the WebSocket approval client
The target-side connection runs in a separate spawned task, bridged to the client side through Tokio channels. This separation is critical: russh invokes the server handler sequentially, once per SSH message, so it must return quickly, while the target forwarding runs fully async in its own task.
PTY-Level Command Interception
This is where things get messy. When a user types in an interactive SSH session, the proxy does not receive neat, line-delimited commands. It receives raw bytes — one keystroke at a time, mixed with ANSI escape sequences, control characters, and multi-byte UTF-8.
Our CommandBuffer is a state machine that reconstructs the command the user intended to type. Here is a simplified version of the core logic:
pub struct CommandBuffer {
buf: Vec<char>,
utf8_buf: Vec<u8>,
esc_state: u8, // 0=normal, 1=saw ESC, 2=inside CSI
}
impl CommandBuffer {
/// Feed a byte. Returns Some(command) on Enter.
pub fn push(&mut self, b: u8) -> Option<String> {
match b {
// Enter → emit the buffered command
0x0d => {
let cmd: String = self.buf.iter().collect();
self.buf.clear();
return Some(cmd);
}
// Backspace / DEL
0x7f | 0x08 => { self.buf.pop(); }
// Ctrl+W → delete previous word
0x17 => {
    // Trim trailing spaces first, then the word itself
    while self.buf.last() == Some(&' ') {
        self.buf.pop();
    }
    while let Some(&c) = self.buf.last() {
        if c == ' ' { break; }
        self.buf.pop();
    }
}
// Ctrl+U → clear line
0x15 => { self.buf.clear(); }
// ESC → start escape sequence
0x1b => { self.esc_state = 1; }
// Printable ASCII
0x20..=0x7e if self.esc_state == 0 => {
self.buf.push(b as char);
}
_ => { /* handle escape sequences, UTF-8 */ }
}
None
}
}
The ANSI Escape Problem
Arrow keys, Home/End, F-keys, and terminal resize events all generate multi-byte ANSI escape sequences. If you naively push these into the command buffer, you get garbage like ls[A[A[B instead of ls.
Our parser uses a three-state machine: Normal (collecting printable chars), Saw ESC (just received 0x1b), and Inside CSI (processing a Control Sequence Introducer). In the CSI state, we consume parameter bytes until we hit a final byte (0x40–0x7E), then discard the entire sequence. This correctly handles everything from simple arrow keys (ESC [ A) to complex bracketed paste sequences.
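In isolation, that three-state machine can be sketched as a standalone filter. This is an illustrative sketch, not the production parser: strip_ansi is a hypothetical helper that keeps printable bytes and discards complete CSI sequences.

```rust
// Three-state ANSI filter: Normal → SawEsc → InCsi.
// Printable bytes pass through; escape sequences are swallowed whole.
fn strip_ansi(input: &[u8]) -> String {
    enum State { Normal, SawEsc, InCsi }
    let mut state = State::Normal;
    let mut out = String::new();
    for &b in input {
        match state {
            State::Normal => match b {
                0x1b => state = State::SawEsc,      // ESC starts a sequence
                0x20..=0x7e => out.push(b as char), // printable ASCII
                _ => {}                             // other controls dropped
            },
            State::SawEsc => {
                // '[' introduces a CSI sequence; anything else ends the escape
                state = if b == b'[' { State::InCsi } else { State::Normal };
            }
            State::InCsi => {
                // Parameter/intermediate bytes are consumed silently;
                // a final byte in 0x40..=0x7E terminates the sequence
                if (0x40..=0x7e).contains(&b) {
                    state = State::Normal;
                }
            }
        }
    }
    out
}

fn main() {
    // "ls" followed by two Up-arrows (ESC [ A) comes out clean
    assert_eq!(strip_ansi(b"ls\x1b[A\x1b[A"), "ls");
    assert_eq!(strip_ansi(b"echo hi"), "echo hi");
}
```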
UTF-8 Multi-Byte Sequences
SSH sends raw bytes. A single emoji or CJK character arrives as 3–4 bytes across potentially multiple data() callbacks. The buffer accumulates high bytes in a separate utf8_buf, checks the expected length from the leading byte, and only pushes to the command buffer once the full character is assembled:
// Multi-byte UTF-8 assembly
b if b >= 0x80 => {
self.utf8_buf.push(b);
let expected = match self.utf8_buf[0] {
0xc0..=0xdf => 2,
0xe0..=0xef => 3,
0xf0..=0xf7 => 4,
_ => { self.utf8_buf.clear(); return None; }
};
if self.utf8_buf.len() == expected {
if let Ok(s) = std::str::from_utf8(&self.utf8_buf) {
for c in s.chars() { self.buf.push(c); }
}
self.utf8_buf.clear();
}
}
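The leading-byte length rule is easy to check on its own. Here utf8_len is a hypothetical helper that mirrors the match arms above; it is not part of the real buffer.

```rust
// Expected sequence length from a UTF-8 leading byte, mirroring the
// match arms in the assembly code. Returns None for continuation or
// invalid leading bytes.
fn utf8_len(lead: u8) -> Option<usize> {
    match lead {
        0x00..=0x7f => Some(1),
        0xc0..=0xdf => Some(2),
        0xe0..=0xef => Some(3),
        0xf0..=0xf7 => Some(4),
        _ => None, // 0x80..=0xBF are continuation bytes, 0xF8+ invalid
    }
}

fn main() {
    let snowman = "☃".as_bytes(); // U+2603 encodes as E2 98 83
    assert_eq!(utf8_len(snowman[0]), Some(3));
    assert_eq!(utf8_len(b'a'), Some(1));
    assert_eq!(utf8_len(0x83), None); // bare continuation byte
}
```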
The Atomic Approval Flow
When the user presses Enter and the CommandBuffer emits a command, we need to pause the session and wait for a human decision. The challenge: russh calls the data() handler for every incoming SSH packet, and handler methods run one at a time. Blocking it, or awaiting a slow approval round-trip inside it, would freeze the entire SSH connection. So we use a shared atomic flag and push the wait into a spawned task.
pub struct SshProxy {
awaiting_approval: Arc<AtomicBool>,
cmd_buf: CommandBuffer,
// ...
}
The flow works like this:
- User types a command; CommandBuffer returns it on Enter
- Handler sets awaiting_approval.store(true, SeqCst)
- Handler spawns an async task that sends the command to the approval service over WebSocket
- While the flag is true, all subsequent keystrokes are silently dropped — the user cannot type anything new
- The spawned task receives the decision and sets the flag back to false
On approval: the task sends 0x0d (Enter) to the target, causing it to execute the command that was already echoed to its PTY. On denial: the task sends 0x15 (Ctrl+U) to clear the target's line buffer, and sends a denial message back to the client.
// Inside the data() handler's per-byte loop
if let Some(command) = self.cmd_buf.push(byte) {
    let command = command.trim().to_string();
    if command.is_empty() {
        self.send_to_target(vec![0x0d]).await;
        continue; // empty line: just forward Enter
    }
    self.awaiting_approval.store(true, Ordering::SeqCst);
    let flag = self.awaiting_approval.clone();
    let tx = self.target_tx.clone();
    tokio::spawn(async move {
        match request_approval(&command).await {
            Decision::Approved => {
                // Target already echoed the command; Enter executes it
                let _ = tx.send(TargetCmd::Data(vec![0x0d])).await;
            }
            Decision::Denied(_reason) => {
                // Ctrl+U clears the target's line buffer
                let _ = tx.send(TargetCmd::Data(vec![0x15])).await;
                // Send denial message to client...
            }
        }
        flag.store(false, Ordering::SeqCst);
    });
}
Why AtomicBool instead of a Mutex? The flag is only ever read or written as a single boolean. An atomic is lock-free, has no poisoning, and is trivially Send + Sync. A mutex would work but adds unnecessary overhead for a single-bit state.
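The drop-while-pending behavior can be demonstrated with plain std threads. This is a toy sketch, not the proxy code: gate is a hypothetical stand-in for the check the data() handler performs on each keystroke.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// A keystroke is forwarded only while no approval is pending.
fn gate(flag: &AtomicBool, byte: u8, out: &mut Vec<u8>) {
    if !flag.load(Ordering::SeqCst) {
        out.push(byte);
    } // else: silently dropped, exactly like the proxy
}

fn main() {
    let awaiting = Arc::new(AtomicBool::new(false));
    let mut sent = Vec::new();

    gate(&awaiting, b'l', &mut sent);       // forwarded
    awaiting.store(true, Ordering::SeqCst); // approval in flight
    gate(&awaiting, b's', &mut sent);       // dropped

    // A "reviewer" thread flips the flag back, as the spawned task does
    let flag = Arc::clone(&awaiting);
    thread::spawn(move || flag.store(false, Ordering::SeqCst))
        .join()
        .unwrap();
    gate(&awaiting, b'!', &mut sent);       // forwarded again

    assert_eq!(sent, vec![b'l', b'!']);
}
```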
Bidirectional Bridging with tokio::select!
The proxy needs to shuttle data in both directions: client-to-target and target-to-client. This is a classic multiplexing problem, and tokio::select! handles it elegantly.
When the client requests a PTY and shell, we open a channel to the target and spawn a forwarding task:
tokio::spawn(async move {
loop {
tokio::select! {
// Target → Client: forward output
msg = target_channel.wait() => {
match msg {
Some(ChannelMsg::Data { ref data }) => {
let _ = client_handle
.data(client_channel_id,
CryptoVec::from_slice(data))
.await;
}
Some(ChannelMsg::Eof) | None => {
let _ = client_handle
.close(client_channel_id).await;
break;
}
_ => {}
}
}
// Client → Target: receive forwarded commands
cmd = cmd_rx.recv() => {
match cmd {
Some(TargetCmd::Data(bytes)) => {
let _ = target_channel
.data(&bytes[..]).await;
}
None => {
let _ = target_channel.close().await;
break;
}
_ => {}
}
}
}
}
});
The key design choice: the client never writes directly to the target channel. Instead, the server handler sends TargetCmd messages through an mpsc channel. This decoupling lets the approval flow inject or suppress bytes without racing with the forwarding task.
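The decoupling pattern itself can be illustrated with std's synchronous channels. Cmd and forward_all below are hypothetical stand-ins for TargetCmd and the Tokio forwarding task, just to show the single-writer shape.

```rust
use std::sync::mpsc;
use std::thread;

// Messages the forwarding task understands, mirroring TargetCmd
enum Cmd {
    Data(Vec<u8>),
    Close,
}

// Run a forwarding "task" that is the only writer to the (simulated)
// target, and return everything the target received.
fn forward_all(cmds: Vec<Cmd>) -> Vec<u8> {
    let (tx, rx) = mpsc::channel::<Cmd>();
    let forwarder = thread::spawn(move || {
        let mut target = Vec::new();
        for cmd in rx {
            match cmd {
                Cmd::Data(bytes) => target.extend(bytes),
                Cmd::Close => break,
            }
        }
        target
    });
    for cmd in cmds {
        tx.send(cmd).unwrap();
    }
    drop(tx);
    forwarder.join().unwrap()
}

fn main() {
    // The approval flow injects bytes (0x0d = Enter on approve)
    // without ever touching the target directly
    let got = forward_all(vec![
        Cmd::Data(b"ls".to_vec()),
        Cmd::Data(vec![0x0d]), // approved → inject Enter
        Cmd::Close,
    ]);
    assert_eq!(got, b"ls\x0d");
}
```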
The auth_none Trick
SSH authentication happens before any channels are opened. A normal SSH proxy would need to handle authentication — checking passwords, validating public keys, managing certificates. That is a lot of complexity and a large attack surface.
Our approach: accept everything at the proxy level.
impl Handler for SshProxy {
async fn auth_none(
&mut self, _user: &str
) -> Result<Auth, Self::Error> {
Ok(Auth::Accept)
}
async fn auth_password(
&mut self, _user: &str, _password: &str
) -> Result<Auth, Self::Error> {
Ok(Auth::Accept)
}
async fn auth_publickey(
    &mut self, _user: &str, _key: &PublicKey
) -> Result<Auth, Self::Error> {
    Ok(Auth::Accept)
}
}
The proxy blindly accepts any credentials. Real authentication is deferred to the target server. When the proxy opens a client connection to the target, the target performs its own auth. If the target rejects the credentials, the session simply fails — no shell is established.
Security note: This design means the proxy itself does not enforce authentication. It must sit behind a network boundary (VPN, firewall, or mTLS) so that only authorized operators can reach it. The proxy's job is authorization (approving commands), not authentication (verifying identity).
This separation of concerns keeps the proxy simple. We do not need to replicate OpenSSH's authentication stack, manage authorized_keys files, or handle PAM. The target server — which already has that infrastructure — handles it.
Test Strategy
Testing an SSH proxy is hard. You need a real SSH server, a real SSH client, and a way to simulate the approval flow. Our solution: run everything in-process.
Mock SSH Target
We spin up a minimal russh::server that accepts all auth and echoes exec commands with a :ok suffix:
fn spawn_mock_target() -> u16 {
let port = free_port();
tokio::spawn(async move {
// Accept all auth, echo "{command}:ok\r\n"
// on exec requests
run_mock_ssh_server(port).await;
});
port
}
In-Process Approval Mock
Instead of running a real WebSocket server, we inject an mpsc::Sender<ApprovalRequest> directly into the proxy. The mock auto-approves everything unless the command contains "DENY":
fn spawn_approval_mock() -> mpsc::Sender<ApprovalRequest> {
let (tx, mut rx) = mpsc::channel(64);
tokio::spawn(async move {
while let Some(req) = rx.recv().await {
let decision = if req.command.contains("DENY") {
Decision::Denied("blocked".into())
} else {
Decision::Approved
};
let _ = req.respond.send(decision);
}
});
tx
}
What We Test
We have unit tests for the PTY parser (backspace handling, Ctrl+W, ANSI stripping, UTF-8 assembly) and the WebSocket client protocol. On top of that, 6 end-to-end integration tests exercise the full proxy stack:
- Approved exec — command reaches target, output contains :ok
- Denied exec — denial message returned, target never sees the command
- Sequential commands — multiple execs in one session, each independently approved
- Mixed decisions — approve one, deny the next, approve again
- Auth passthrough — proxy accepts any credentials
- Connection teardown — clean shutdown when client or target disconnects
The full test suite runs in under 3 seconds. No external processes, no Docker containers, no network calls.
Lessons Learned
PTY is Messier Than Expected
We started with a naive "split on newline" approach. It lasted about 10 minutes. Real terminal input includes:
- Backspace that erases the last character (but different terminals send 0x7f or 0x08)
- Arrow keys for history navigation (ESC [ A/B) that should not appear in the command
- Tab completion that changes the buffer contents server-side (we forward tabs and let the target handle completion)
- Multi-line commands with backslash continuation
- Paste events that arrive as a burst of bytes, sometimes with bracketed paste markers
The PTY parser went through four rewrites before stabilizing. The current version handles all of the above correctly, but we still discover edge cases occasionally (screen, tmux, and mosh all have their own PTY quirks).
russh API Quirks
The russh crate is excellent — it is the only pure-Rust SSH implementation that supports both server and client. But it has some sharp edges:
- Handler trait methods are async but called sequentially — you cannot spawn long-running work inside them without blocking subsequent SSH messages. We solved this by spawning tasks and communicating via channels.
- CryptoVec is not Clone — you need CryptoVec::from_slice() to copy data between channels, which adds an allocation per forwarded packet
- Channel IDs are opaque — you need to carefully track which server-side channel maps to which client-side channel, especially when multiplexing multiple channels (PTY + forwarded ports)
Testing Async SSH is Hard
The biggest challenge was not the SSH protocol itself but timing. Tests that connect, send a command, and check output need to account for:
- TCP connection establishment latency
- SSH handshake (key exchange, auth, channel open)
- Approval round-trip through the mock
- Output propagation back through the proxy
We solved this with generous timeouts, retry loops on connection attempts, and careful channel synchronization. The in-process approval mock (versus a WebSocket server) eliminates one entire network hop, which makes tests both faster and more deterministic.
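The connection-retry piece is ordinary bounded retry with backoff. A std-only sketch follows; connect_with_retry is a hypothetical helper, and the real tests use Tokio equivalents.

```rust
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::Duration;

// Retry a TCP connect with linear backoff, as the integration tests do
// while the in-process SSH server is still binding its port.
fn connect_with_retry(addr: &str, attempts: u32) -> Option<TcpStream> {
    for i in 0..attempts {
        if let Ok(stream) = TcpStream::connect(addr) {
            return Some(stream);
        }
        thread::sleep(Duration::from_millis(50 * (i as u64 + 1)));
    }
    None
}

fn main() {
    // Bind an ephemeral port to stand in for the mock SSH target
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap().to_string();

    assert!(connect_with_retry(&addr, 5).is_some());
    assert!(connect_with_retry("127.0.0.1:1", 2).is_none()); // nothing listens
}
```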
What's Next
We are working on recording and replay — capturing the full PTY stream so reviewers can see not just what command was typed but the entire terminal context. We are also exploring AI-assisted risk scoring for SSH commands, similar to what we already do for API-submitted commands in the main expacti backend.
If you are building tools that sit in the data path of infrastructure access, Rust + Tokio + russh is a compelling stack. The type system catches entire classes of concurrency bugs at compile time, and the performance is such that the proxy adds sub-millisecond latency to the SSH session (the approval round-trip dominates, as it should).
Try Expacti
Human-in-the-loop command approval for AI agents and infrastructure access. Free tier available.