Work
experience, research, projects, and open source
Building AI agents, internal tooling, and automation pipelines across teams. Designing reusable AI infrastructure, prototyping LLM-powered workflows, and converting operational bottlenecks into production-grade AI systems.
Architected agent-based AI system for real-time client interaction analysis using Mistral 7B and OpenAI/Gemini APIs. Built end-to-end business intelligence platform with real-time KPI tracking and LLM-powered conversational analytics.
Productionized voice and audio ML pipelines achieving sub-200ms end-to-end latency. Owned core voice interaction stack including control logic, prompt routing, and eval-driven iteration loops.
Investigated whether implicit neural representations (SIRENs) can replace KV cache memory reads with compute during LLM inference. Ran 280 SIREN fits across 7 architectures on Llama 3.1-8B. Found that keys have learnable positional structure from RoPE (cosine similarity 0.91) but values do not (0.67). SVD beats the SIREN fits at every compression ratio tested and requires no training. The hypothesis was creative but wrong; the contribution is the structural characterization of the K/V asymmetry.
Designing hierarchical multi-agent framework modeling virtual research institute with independent labs exploring competing hypotheses.
Designing benchmark to detect "Silent Failure" in AI systems producing semantically incorrect but syntactically valid conjectures.
New Horizon Labs (VRI)
A virtual research institute that generates autonomous laboratories on demand, staffs them with specialised AI co-scientists, and coordinates cross-lab adversarial validation — manufacturing scientific contradictions at scale.
Cottus Runtime
C++/CUDA LLM inference engine for Llama with PagedAttention (40% memory reduction) and cuBLAS kernels (2.3x speedup). Custom CUDA kernels for multi-head attention, RoPE, and GEMM operations.
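For readers unfamiliar with RoPE, the rotation that such a kernel computes can be sketched in a few lines of plain Python. This is an illustrative sketch, not code from Cottus Runtime: the function names, the adjacent-pair layout, and the base of 10000 are assumptions (Llama implementations commonly pair element i with element i + d/2 instead). It shows the property that makes RoPE useful: the query-key dot product depends only on the relative offset between positions.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate each adjacent pair of `vec` by an angle that grows with position.

    Hypothetical helper for illustration; pair j uses frequency base**(-2j/d).
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even head dimension")
    out = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by theta.
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# Same relative offset (3) at two different absolute positions.
score_near = dot(rope(q, 5), rope(k, 2))
score_far = dot(rope(q, 13), rope(k, 10))
```

The two scores agree up to floating-point error because rotating both vectors by a common angle leaves their dot product unchanged.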
StoneDB-Engine
A modular, ACID-compliant embedded database written in C++ with two-phase locking, WAL, deadlock detection, and LRU page cache.
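The LRU page cache idea can be illustrated with a minimal Python sketch. It is not StoneDB-Engine's implementation (which is in C++); the class name, the `load_page` callback, and the hit/miss counters are hypothetical, and a real database cache would also need page pinning and dirty-page write-back.

```python
from collections import OrderedDict

class LRUPageCache:
    """Fixed-capacity cache that evicts the least recently used page."""

    def __init__(self, capacity, load_page):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.load_page = load_page   # hypothetical callback: page_id -> page data
        self.pages = OrderedDict()   # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, page_id):
        if page_id in self.pages:
            # Mark as most recently used.
            self.pages.move_to_end(page_id)
            self.hits += 1
            return self.pages[page_id]
        self.misses += 1
        data = self.load_page(page_id)
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            # Drop the least recently used page.
            self.pages.popitem(last=False)
        return data

cache = LRUPageCache(capacity=2, load_page=lambda pid: f"page-{pid}")
for pid in (1, 2, 1, 3):
    cache.get(pid)
resident = list(cache.pages)
```

After the four accesses, page 2 has been evicted because page 1 was touched more recently.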
Helix
A semantic vector engine for scalable similarity search built in C++20.
Added compaction via responses.compact API, fixed PCM audio duration calculation and VAD truncation, improved strict schema compatibility, added extensible data payloads to lifecycle hooks.
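As background for the PCM duration fix, the duration of raw PCM audio follows directly from its byte length and format. The sketch below is a generic illustration, not the patched upstream code: the function name and the defaults (24 kHz, mono, 16-bit samples) are assumptions.

```python
def pcm_duration_seconds(num_bytes, sample_rate=24000, channels=1, sample_width=2):
    """Return the duration of a raw PCM buffer in seconds.

    sample_width is bytes per sample (2 for 16-bit audio). Trailing bytes
    that do not form a complete frame are ignored.
    """
    frame_size = channels * sample_width
    if sample_rate <= 0 or frame_size <= 0:
        raise ValueError("sample_rate, channels and sample_width must be positive")
    frames = num_bytes // frame_size
    return frames / sample_rate

one_second_mono = pcm_duration_seconds(48000)
one_second_stereo = pcm_duration_seconds(96000, channels=2)
```

A common source of error in this calculation is dividing by the sample rate alone and forgetting the channel count or sample width, which inflates the result by the frame size.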
Formalizing Erdős Problems in Lean 4: Ramsey size linearity (Q3, K33, H5), doubly exponential lower bounds for hypergraph Ramsey numbers, asymptotic growth, logarithmic density of size-dependent congruences, and the Lebesgue-Nagell equation conjecture.
Fixed input mutation in optimize(), added missing get_job_events() to RuntimeBackend base class, fixed LocalProcessBackend job status never returning Complete, handled falsy values in get_args_from_peft_config. Added unit tests for Optimizer KubernetesBackend.
Fixed error formatting in trial controller package. Called on by maintainers to design multi-objective optimization support for the new OptimizationJob CRD — covering API changes, Pareto front status tracking, suggestion service integration, and webhook validation.