Work
Trust Bench
In progress
Open-source profiler for LLM trustworthiness. Extracts internal signals, evaluates across safety dimensions, diagnoses failure modes, and measures trust boundaries specific to deployment context.
Cross-lingual feature found in 6 languages sae-explorer
Found a single SAE feature in Gemma 2 2B that detects conjunctions across 6 languages with zero false positives.
DeltaNet from scratch on Apple Silicon hybrid-attention-150m
Trained Qwen3.5 hybrid architecture at 8M and 150M params. Found and fixed a numerical bug in the triangular solve.
ECE 0.107, overconfident by 3% calibration-probe
Measures whether LLMs know when they are wrong. 102 factual questions, forced confidence, calibration curves.
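For readers unfamiliar with the metric, here is a minimal sketch of how expected calibration error (ECE) is commonly computed from stated confidences and graded answers. The equal-width binning, the bin count, and the function names are assumptions for illustration, and the signed gap shown is only one common way to express overconfidence; none of this is necessarily how calibration-probe computes its numbers.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumption: equal-width bins over [0, 1]
) -> float:
    """Weighted average of |mean confidence - accuracy| over confidence bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


def overconfidence(confidences: Sequence[float], correct: Sequence[bool]) -> float:
    """Signed gap: mean confidence minus accuracy (positive = overconfident)."""
    mean_conf = sum(confidences) / len(confidences)
    accuracy = sum(1 for ok in correct if ok) / len(correct)
    return mean_conf - accuracy
```

A model that states 90% confidence on four questions and gets three right has an ECE of about 0.15 under this definition, with the same value as its signed gap.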
6 experiments, 5 repos building-intuition
Superposition, activation projections, loss landscapes, scaling laws, and attention variants. Each experiment answered a question I needed for Trust Bench.
10K+ daily queries Healthcare RAG Pipeline
Production RAG system for clinicians. Hallucination-aware retrieval with domain-specific re-ranking.
25% lower inference cost LLM Fine-tuning System
Fine-tuning Qwen2.5-3B with GRPO + LoRA. Studied how reward-guided optimization changes model behavior.
5 entity types, cross-format search Medical Knowledge Graph
Knowledge graph connecting diseases, drugs, treatments, and clinical trials. Powers retrieval grounding.
Early warning signals Drug Safety Sentiment Analysis
Transformer classifiers detecting sentiment shifts in medical expert opinions for pharmacovigilance.
90%+ accuracy, 50% less downtime Predictive Maintenance
Digital twin platform for industrial machinery. Anomaly detection and time-to-failure prediction from IoT streams.
Real-time token analytics crux
Terminal dashboard for Claude Code usage. Tracks context growth, cache efficiency, cost breakdowns, and session health.
brew install amaljithkuttamath/tap/crux
cargo install crux-cli
Compare 4 encodings tokenizer-arena
CLI that runs the same text through multiple LLM tokenizers and shows differences in token count and boundaries.
cargo install tokenizer-arena
Parse models byte by byte gguf-inspect
CLI that reads GGUF model files and prints architecture, quantization, tensor shapes, and memory estimates.
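To give a sense of what "byte by byte" means here, the following is a small Python sketch that reads only the fixed-size GGUF header. It assumes a little-endian file at format version 2 or later (where both counts are 64-bit), and the names `GGUFHeader` and `read_gguf_header` are hypothetical; it is not taken from gguf-inspect, which is a separate tool and goes on to parse metadata and tensor info.

```python
import io
import struct
from typing import BinaryIO, NamedTuple


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(stream: BinaryIO) -> GGUFHeader:
    """Read the 24-byte fixed header at the start of a GGUF file."""
    magic = stream.read(4)
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file: magic is {magic!r}")

    # Assumption: little-endian, version >= 2.
    # Layout: uint32 version, uint64 tensor count, uint64 metadata key/value count.
    raw = stream.read(20)
    if len(raw) != 20:
        raise ValueError("file is too short to contain a GGUF header")
    version, tensor_count, metadata_kv_count = struct.unpack("<IQQ", raw)
    return GGUFHeader(version, tensor_count, metadata_kv_count)


if __name__ == "__main__":
    # Synthetic header for demonstration; a real file would be opened in "rb" mode.
    fake_file = io.BytesIO(b"GGUF" + struct.pack("<IQQ", 3, 2, 5))
    print(read_gguf_header(fake_file))
```

Everything after the header (metadata key/value pairs, tensor descriptors, alignment padding) is variable-length and is where architecture, quantization, and tensor shapes are actually stored.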
Claude Code plugin skill-doctor
Audits your Claude Code skills and diagnoses issues. Builds upgrades in staging with rollback.
microGPT Playground
Train a transformer in your browser. Watch attention patterns, embeddings, and loss evolve in real time.