A non-profit research initiative advancing the frontiers of artificial intelligence. We focus on omni-modal AI systems, efficient architectures, and synthetic data at scale.
Developing large language models that understand and generate across text, images, audio, and video with near-zero latency.
Building systems that process multiple concurrent streams of audio, video, and data inputs without turn-taking constraints.
Creating large-scale synthetic datasets grounded in factual knowledge across languages, documents, and long-context scenarios.
Scaling efficient attention mechanisms to 1M+ tokens for all-day task memory and in-context learning.
A gauge-theoretic framework that reframes LLM inference from a process of physical data movement to one of dynamic coordinate transformation, leveraging the group properties of RoPE to rotate queries over a static KV cache manifold.
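As a rough illustration of the rotational identity behind that idea (not the framework's actual implementation), the NumPy sketch below shows why a static KV cache can be reused under a query-side coordinate change: because RoPE rotations compose as a group, the score obtained by rotating both the query and the cached key equals the score obtained by rotating only the query by the relative offset and leaving the key untouched. The `rope_rotate` helper and the positions chosen are illustrative assumptions.

```python
import numpy as np

def rope_rotate(x, pos, base=10000.0):
    """Rotate consecutive dimension pairs of x by the RoPE angles pos * theta_i."""
    d = x.shape[-1]
    theta = base ** (-np.arange(0, d, 2) / d)   # one frequency per 2-D pair
    angles = pos * theta
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

rng = np.random.default_rng(0)
q, k = rng.normal(size=64), rng.normal(size=64)
m, n = 120, 37  # query position m, cached key position n (illustrative)

# Conventional RoPE score: rotate q at position m and k at position n, then dot.
score_rotate_both = rope_rotate(q, m) @ rope_rotate(k, n)

# Group property: R(m)^T R(n) = R(n - m), so the joint rotation collapses to a
# single relative rotation that can be absorbed into the query alone
# (rotating q by m - n gives q^T R(n - m) k), leaving the cached key static.
score_rotate_query_only = rope_rotate(q, m - n) @ k

assert np.allclose(score_rotate_both, score_rotate_query_only)
```

In this view, "moving" cached keys to new positions is unnecessary: the query is re-expressed in the cache's coordinate frame instead.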
Retrieval-Based Multi-Turn Chat SFT Synthetic Data, a new 100k-entry multi-turn synthetic dialogue dataset for SFT, building on our work with CausalLM/Refined-Anime-Text.