About

A bit about me

I work at the intersection of systems programming, machine learning infrastructure, and mathematics. I like building things close to the metal where performance is real and abstractions eventually give way to first principles. Most of my work lives in LLM inference engines, CUDA kernels, databases, and ML pipelines.

I come from an electronics background, which means I never really learned to treat hardware as a black box. I tend to think in signals, memory, bandwidth, and limits. That way of thinking shapes how I approach systems and optimization from the ground up.

I am deeply curious about how models actually work under the hood, from the math behind transformer architectures to the silicon they run on. A lot of my time goes into writing C++ and CUDA, thinking about memory layouts and cache behavior, and figuring out how to make inference faster than it has any right to be.

I am currently building Cottus, which is where I keep and grow most of my work. It is my lab, my notebook, and my long-term project. Ideas start messy there and slowly turn into systems that hold up under pressure.

When I am not writing code, I am usually reading papers, formalizing math in Lean 4, or contributing to open-source AI infrastructure. I optimize for depth of understanding, learning velocity, and work that compounds over time. I deliberately look for problems that are hard, unclear, and slightly uncomfortable, because that is where the most interesting systems seem to emerge.

Long term, I care about building intelligent systems that live in the physical world. If we are going to put models into bodies one day, I would prefer they turn out a little more capable and a little less awkward than a certain golden protocol droid. Understanding the math and the metal feels like a good place to start.