🪷 tsh

The shell for long-running agents. Smart output limiting. Loop memory recovery. 99% context savings on repeat reads.

tsh$ cat large_module.py   # loop 1: 1,142 lines → 180 (85% saved)
tsh$ cat large_module.py   # loop 2: structure only → 50 (99% saved)
tsh$ cat large_module.py   # loop 3: structure only → 50 (99% saved)
claude + tsh: total context ~280 tokens (99% saved)
claude + bash: total context ~28,500 tokens (0% saved)

🫧 Three modes, one binary

SHELL

Interactive

Run tsh bare. Full POSIX shell with all bash builtins, powered by brush-core. Every command's stdout is safety-filtered.

-c

Command

Run tsh -c 'cmd'. Executes through the shell engine, filters output, exits with the command's status.

PIPE

Script

Pipe a script via stdin. tsh handles encoding (UTF-16LE on Windows), runs it through brush-core, filters output.

🪷 Loop memory recovery

The biggest context waste isn't the first read — it's the second, third, and fourth. After compaction, the agent re-reads the same file and dumps the whole thing again. tsh tracks this and stops it.

| Read | What the agent sees | Savings |
|---|---|---|
| 1st read | Head 100 lines + structural skeleton + tail 20 + footer | ~70–90% |
| 2nd read (same file) | Structure only — def/class/ERROR/WARN lines + footer | ~99% |
| N-th read | Same as 2nd — structure only + count indicator | ~99% |

Powered by SessionTracker — an ExecutionObserver on brush-core that watches every file-reading command (cat, head, tail, less, bat) and counts reads per file path across the session. No configuration needed. Fully automatic.
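The counting logic can be sketched in Python. This is an illustrative analogue, not the shipped code: the real SessionTracker is a Rust ExecutionObserver inside the shell, and the names below are invented for the sketch.

```python
from collections import Counter

# Commands tsh treats as file reads; flag arguments (e.g. `head -n 50`)
# are not handled in this simplified sketch.
READ_COMMANDS = {"cat", "head", "tail", "less", "bat"}

class ReadTracker:
    def __init__(self):
        self.reads = Counter()

    def observe(self, argv):
        """Call once per executed command; returns updated read counts per file."""
        if not argv or argv[0] not in READ_COMMANDS:
            return {}
        files = [a for a in argv[1:] if not a.startswith("-")]
        for f in files:
            self.reads[f] += 1
        return {f: self.reads[f] for f in files}

tracker = ReadTracker()
tracker.observe(["cat", "large_module.py"])  # 1st read: head + skeleton + tail
count = tracker.observe(["cat", "large_module.py"])["large_module.py"]
# count == 2: structure-only mode from here on
```

Keying on the file path means the savings survive however the agent phrases the command, since `cat x.py` and `tail x.py` both bump the same counter.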

Real benchmark: two reads of server_log.txt (2,000 lines)

| Metric | Without tsh | With tsh | Savings |
|---|---|---|---|
| Tokens sent to agent | 53,623 | ~3,369 | 94% |
| Lines sent to agent | ~4,000 | 158 | 96% |
| Found both needles (ERROR + FATAL) | Yes | Yes | No loss |

🪻 Why an agent should use tsh

An LLM agent running shell commands needs a shell. Most reach for bash. But bash dumps everything into the agent's context window — 48,000 lines of logs, raw binary, credentials. tsh manages what actually reaches the agent's memory.

🪷 It's a real shell

Pipes, redirects, heredocs, command substitution, process substitution, loops, traps, job control. Powered by brush-core — a Rust implementation of bash.

🫧 Smart output limiting

Large outputs are automatically reduced to their structural skeleton — function signatures, class defs, imports, errors. First 100 lines + structural extraction + last 20. No LLM call, just fast pattern matching.
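A minimal sketch of the head + skeleton + tail scheme. The patterns and function below are illustrative, not the shipped filter:

```python
import re

# Illustrative structural patterns: signatures, imports, errors.
STRUCTURAL = re.compile(
    r"^\s*(def |class |import |from |fn |pub fn |ERROR|WARN|Traceback)"
)

def limit_output(lines, head=100, tail=20):
    """Keep the first `head` lines, structural lines from the middle,
    and the last `tail` lines; short outputs pass through untouched."""
    if len(lines) <= head + tail:
        return lines
    skeleton = [l for l in lines[head:-tail] if STRUCTURAL.match(l)]
    return lines[:head] + skeleton + lines[-tail:]

big = ["request handled"] * 500 + ["def handler(event):"] + ["request handled"] * 500
out = limit_output(big)  # 100 head + 1 structural line + 20 tail = 121 lines
```

Because it is pure pattern matching over lines, the reduction costs no LLM call and runs in a single pass.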

🪻 Internal pipes untouched

When you run grep foo | sort | uniq, those internal pipes run at OS speed. Only the final output to the agent gets routed through the safety layer. Zero overhead on intermediate work.

🪷 Pluggable & configurable

Tune via env vars: TSH_HEAD_LINES, TSH_TAIL_LINES, TSH_MAX_LINES. Or set TSH_NO_LIMIT=1 for full pass-through. The Python filter is pluggable — swap in your own logic.
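A pluggable filter might pick up these knobs like so. The head/tail defaults follow the documented 100/20 split; the TSH_MAX_LINES default here is an illustrative assumption:

```python
import os

def load_config():
    """Read tsh's tuning knobs the way a pluggable Python filter might.
    head/tail defaults follow the documented 100/20 split; the
    TSH_MAX_LINES default of 500 is an illustrative assumption."""
    return {
        "head": int(os.environ.get("TSH_HEAD_LINES", "100")),
        "tail": int(os.environ.get("TSH_TAIL_LINES", "20")),
        "max_lines": int(os.environ.get("TSH_MAX_LINES", "500")),
        "no_limit": os.environ.get("TSH_NO_LIMIT") == "1",
    }
```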

🫧 Windows native

Auto-detects PowerShell's UTF-16LE encoding — with BOM, without BOM, or plain UTF-8. Agents on Windows need zero special configuration.

🪻 Multilingual structural awareness

Knows the structural patterns of Python, Rust, JavaScript, TypeScript, Java, C/C++, Go, and log formats. Extracts what matters — def, class, pub fn, ERROR, Traceback — elides the rest.
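In sketch form, per-language structural matching reduces to a pattern table. These regexes are an assumed, simplified set, not tsh's exact patterns:

```python
import re

# Illustrative per-language structural patterns (an assumed set, not tsh's exact one).
PATTERNS = {
    "python":     r"^\s*(def |class |import |from |@)",
    "rust":       r"^\s*(pub\s+)?(fn|struct|enum|trait|impl|mod|use)\b",
    "javascript": r"^\s*(function\b|class\b|import\b|export\b|const\s+\w+\s*=)",
    "go":         r"^\s*(func|type|import|package)\b",
    "log":        r"\b(ERROR|WARN|FATAL)\b|^Traceback",
}

def is_structural(line, lang):
    """True if the line matches the structural pattern for `lang`."""
    return re.search(PATTERNS[lang], line) is not None
```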

🫧 Try tsh in your browser

A live shell. Connect a local folder to browse real files.

tsh — token shell
Token Shell (tsh) 0.1.0
Type commands below. Use "mount" to browse your connected folder.

🪷 Quickstart

macOS / Linux / WSL

curl -fsSL https://raw.githubusercontent.com/maceip/tsh/main/install.sh | sh

Windows (PowerShell)

irm https://raw.githubusercontent.com/maceip/tsh/main/install.ps1 | iex

Then open a new terminal and run:

tsh

Usage

# Interactive shell (safety-filtered)
tsh

# Run a command
tsh -c 'cat /var/log/syslog | tail -50'

# Disable safety for debugging
tsh --no-safety -c 'echo "raw output"'

# Pipe a script
echo 'whoami && df -h' | tsh

Build from source

git clone https://github.com/maceip/tsh && cd tsh
cargo build --release
./target/release/tsh

Docker

docker compose up -d
docker compose run --rm tsh

🫧 Agent reference

Structured for LLM agents and automation tools. Exact types, defaults, and behavior for every flag, mode, and output.

Mode detection

tsh selects its mode automatically. No mode flag exists.

| Condition | Mode | Behavior |
|---|---|---|
| -c "cmd" provided | Command | Runs the string through brush-core. Stdout is safety-filtered. Exits with the command's status code. |
| stdin is a pipe (not a terminal) | Script | Reads all stdin bytes as a script (handles UTF-16LE on Windows). Runs through brush-core. Output filtered. |
| stdin is a terminal | Shell | REPL with tsh$ prompt. Full POSIX interactive shell (brush-core with bash-mode builtins). All stdout safety-filtered. |

Priority: -c is checked first. Then pipe detection. Shell is the fallback.
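The priority order can be sketched as follows. This is an illustrative Python mirror of the documented rules; tsh implements them natively in Rust:

```python
import sys

def detect_mode(argv):
    """Mirror tsh's documented mode priority: -c first,
    then pipe detection, then the interactive shell as fallback."""
    if "-c" in argv or "--command" in argv:
        return "command"
    if not sys.stdin.isatty():  # stdin is a pipe, not a terminal
        return "script"
    return "shell"
```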

CLI flags

| Flag | Type | Default | Description |
|---|---|---|---|
| -c / --command | String | None | Execute this command string and exit. Like bash -c. |
| --no-safety | Boolean | false | Disable the safety filter. Output passes through unfiltered. For debugging. |

Agent invocation patterns:

# Run commands with safety filtering
tsh -c 'ls -la /tmp && df -h'
tsh -c 'cat .env | head -10'
tsh -c 'grep -rn "password" config/'

# Pipe a script
echo 'whoami && env' | tsh

# Debug mode (no filtering)
tsh --no-safety -c 'echo "raw output"'

# Interactive
tsh

Output routing

How tsh handles command output: text stdout is routed through the safety filter; binary output (detected by a null byte in the first chunk) bypasses the filter and streams straight to the terminal; internal pipes between commands run at OS speed; stderr passes through unfiltered.

Shell capabilities

Full POSIX shell via brush-core: pipes, redirects, heredocs, command substitution, process substitution, loops, traps, job control.

Error handling

Errors go to stderr (unfiltered). Exit code 1 on failure.

| Condition | stderr message | Resolution |
|---|---|---|
| Empty stdin in pipe mode | No input via stdin. | Ensure data is piped. |
| Invalid encoding (Windows) | Not valid UTF-8 or UTF-16LE. | PowerShell: $OutputEncoding = [System.Text.Encoding]::UTF8 |
| Safety filter not found | [tsh] safety filter not found. Pass-through mode. | Ensure python/safety_filter.py exists and Python 3 is on PATH. |

Pipeline patterns

# Safe command execution — output is filtered
tsh -c 'find . -name "*.py" -exec grep -l "import os" {} +'
tsh -c 'cat /var/log/auth.log | grep "Failed password" | tail -20'

# Chain commands — internal pipes at OS speed, only final output filtered
tsh -c 'ps aux | grep python | awk "{print \$2, \$11}" | sort'

# Script via stdin
echo 'for f in *.log; do wc -l "$f"; done | sort -rn' | tsh

# Debug — bypass safety
tsh --no-safety -c 'env | sort'

🪷 Performance

🫧 Routing overhead

Pipe routing adds single-digit milliseconds. Internal pipes between commands are pure OS speed — untouched by the router.

🪻 Async I/O

Tokio async runtime. Non-blocking pipe reads. safety subprocess spawns once and stays warm for the entire session.

🪷 Binary bypass

First chunk scanned for null bytes. Binary output skips safety entirely — straight to terminal. No serialization overhead.
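The null-byte check is simple enough to show directly. An illustrative sketch:

```python
def is_binary(first_chunk: bytes) -> bool:
    """A null byte in the first chunk marks the stream as binary.
    Binary streams skip the safety filter and go straight to the terminal."""
    return b"\x00" in first_chunk

is_binary(b"\x7fELF\x02\x01\x01\x00")  # → True: ELF header, bypasses the filter
is_binary(b"hello world\n")            # → False: text, routed through the filter
```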

🫧 Startup

Native Rust binary. Sub-millisecond to shell ready. Python safety process spawns in parallel with first command.

🫧 LangExtract — companion extraction engine

tsh ships with langextract-host, a zero-copy chunked extraction library for routing documents through local LLMs. Run it independently via cargo run -p langextract-host.

The pipeline:

1. 📄 Document → chunk_text(): 24KB slices, 1KB overlap, zero-copy, whitespace-safe.
2. Chunks stream as JSON lines into python/shim.py, which redacts inputs, calls the LLM (Ollama / vLLM), tags outputs, and returns JSON.
3. Results are aggregated into AnnotatedDocument[]: extraction_class, char_interval, attributes, alignment.

Chunking
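The chunking scheme (24KB slices, 1KB overlap, snapped to whitespace) can be sketched in Python. This is illustrative only; the shipped langextract-host chunker is zero-copy Rust:

```python
def chunk_text(text, size=24_576, overlap=1_024):
    """Split text into overlapping chunks, snapping boundaries to whitespace
    so no word is cut in half. Illustrative sketch, not the Rust implementation."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        if end < len(text):
            ws = text.rfind(" ", start, end)
            if ws > start:
                end = ws  # snap back to the last whitespace in the window
        chunks.append(text[start:end])
        if end == len(text):
            break
        nxt = end - overlap  # neighbouring chunks share `overlap` characters
        start = nxt if nxt > start else end
    return chunks
```

The overlap gives the extractor context across chunk boundaries, so an entity straddling a slice edge still appears whole in at least one chunk.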

Environment variables (LangExtract)

| Variable | Default | Description |
|---|---|---|
| OPENAI_API_BASE | http://localhost:11434/v1 | LLM endpoint. |
| OPENAI_API_KEY | local-poc-key | API key. |
| LLM_MODEL_ID | llama3 | Model identifier. |
| TSH_MODEL_DIR | platform cache dir | Model download location. |

🪻 Architecture

crates/
  tsh/                      # Shell binary — pipe routing, safety integration, brush-core
  langextract-host/         # Extraction engine — zero-copy chunker, async streaming
  tsh-model-manager/        # Model lifecycle — download, cache, SHA-256 verify
python/
  safety_filter.py          # Safety output filter — redaction, PII masking (pluggable)
  shim.py                   # LangExtract shim — input mutations, LLM routing, output tagging
xtask/                      # CI — 42+ bash tests, 50+ Windows tests, platform-aware

Shell engine: brush-core — a Rust implementation of bash. tsh injects custom file descriptors so stdout/stderr flow through OS pipes into the async routing layer.

Safety boundary: The Python safety filter is a long-running subprocess. tsh writes text to its stdin and reads sanitized output from its stdout. Binary data bypasses it entirely. The filter is kill-on-drop.
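A drop-in replacement for the filter only needs to read stdin and write sanitized lines to stdout. A minimal skeleton, where the redaction rule is an illustrative assumption rather than the shipped filter's logic:

```python
import re
import sys

# Illustrative redaction rule: mask values assigned to secret-looking keys.
# The shipped safety_filter.py applies its own redaction and PII masking.
SECRET = re.compile(r"(?i)(password|token|api[_-]?key)\s*=\s*\S+")

def sanitize(line: str) -> str:
    return SECRET.sub(lambda m: m.group(0).split("=")[0] + "=[REDACTED]", line)

def run(stream_in, stream_out):
    """Filter loop: tsh writes to our stdin, reads sanitized text from our stdout."""
    for line in stream_in:
        stream_out.write(sanitize(line))
        stream_out.flush()  # tsh reads incrementally; don't buffer

# As python/safety_filter.py, this would be driven by: run(sys.stdin, sys.stdout)
```

Flushing per line matters: tsh streams output to the agent as it arrives, so a buffered filter would stall the session.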

References

The smart output limiter is informed by the following research on context compression for LLM agents.

  [1] Jha, Erdogan, Kim, Keutzer, Gholami. "Characterizing Prompt Compression Methods for Long Context Inference." ICML 2024. Extractive compression achieves up to 10x compression with minimal accuracy loss.
  [2] Lindenbauer & Slinko. "Simple Observation Masking Is as Efficient as LLM Summarization for Agent Context Management." NeurIPS DL4Code Workshop, Dec 2025. Halves cost vs. LLM summarization.
  [3] Tree-sitter code skeletonization (Repomix / Aider, 2024–2025). Parse code, return signatures + imports, strip bodies. ~70% token reduction.
  [4] Zhang, Zhao et al. "cAST: AST-Based Code Chunking." EMNLP 2025 Findings. Recursively breaks large AST nodes into semantically coherent chunks. +4.3 Recall@5.
  [5] Chirkova et al. "Provence: Context Pruning for RAG." ICLR 2025. Question-aware sentence pruning, plug-and-play for any LLM.
  [6] Jiang et al. "LLMLingua-2." ACL 2024. BERT-level encoder for token classification via data distillation. 3x–6x faster, up to 20x compression.
  [7] Li, Liu, Su, Collier. "Prompt Compression for LLMs: A Survey." NAACL 2025 (Oral). Comprehensive taxonomy of hard vs. soft prompt compression.
  [8] Kang et al. "ACON: Agent Context Optimization." arXiv, Oct 2025. Gradient-free compression guideline optimization. 26–54% memory reduction, 95%+ accuracy.