About Us

OmniML is an artificial intelligence (AI) company that aims to bring powerful machine learning capabilities to edge devices. OmniML enables greater speed, accuracy, and efficiency in AI by creating deep learning models that bridge the gap between AI applications and the diverse range of devices found at the edge. OmniML is backed by established VCs, world-leading researchers, and industry experts. OmniML makes major ML tasks 10x faster on different edge devices with 1/10th of the engineering effort. OmniML’s technology has already demonstrated significant gains in model performance and cost reduction for enterprise customers across multiple vertical markets. OmniML was founded in 2021 and is headquartered in San Jose, CA.

Founded by Dr. Song Han, MIT EECS professor and serial entrepreneur; Dr. Di Wu, former Facebook engineer; and Dr. Huizi Mao, co-inventor of the “deep compression” technology developed at Stanford, OmniML solves a fundamental mismatch between AI applications and edge hardware to make AI more accessible for everyone, not just data scientists and developers. The core product offering is a model design platform that automates model co-design, training, and deployment, targeting GPUs, AI SoCs, and even tiny MCUs.

AI is already improving our lives in all imaginable areas, and many applications require AI to run on edge devices for reasons of latency, cost, and privacy. However, the AI industry still lacks a good solution for designing efficient models that target the increasingly diverse range of edge hardware. As a result, deploying a model takes repeated manual design and training iterations, which in turn demands an extraordinary amount of resources and engineering time before AI reaches production.

The team at OmniML is among a small cadre of AI/ML experts who know how to miniaturize deep learning models without sacrificing accuracy. As demonstrated by our publications at MIT and at prestigious machine learning conferences, which have won multiple awards along the way, our methods outperform peers in the market by a significant margin. OmniML is already helping multiple customers achieve massive savings in computation and energy costs, and our technology will enable powerful AI models to be deployed on all possible edge devices.

Our approach

A complete ecosystem of automated ML model-system co-design

For each ML application, we provide a collection of tailor-designed model architectures and runtime libraries for efficient execution on edge devices. Given the target hardware, our platform explores the vast space of possible combinations and performs a search to obtain the optimal design for each device without repeating training. With this co-design approach, the deployed model achieves orders-of-magnitude improvements in speed and memory footprint with little loss of accuracy.
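
The general idea can be illustrated with a minimal, hypothetical Python sketch of hardware-aware architecture search: enumerate a small combinatorial design space, score each candidate with cheap accuracy and latency predictors instead of retraining it, and pick the most accurate design that fits each device's latency budget. The search space, predictor functions, and budgets below are illustrative placeholders, not OmniML's actual platform.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List


@dataclass
class Candidate:
    """One point in the (toy) architecture search space."""
    depth: int          # number of blocks
    width_mult: float   # channel width multiplier
    resolution: int     # input image resolution


def search_space() -> List[Candidate]:
    """Enumerate a small combinatorial space of sub-network configurations."""
    depths = [2, 3, 4]
    widths = [0.5, 0.75, 1.0]
    resolutions = [128, 160, 192, 224]
    return [Candidate(d, w, r) for d, w, r in product(depths, widths, resolutions)]


def pick_for_device(accuracy_of: Callable[[Candidate], float],
                    latency_of: Callable[[Candidate], float],
                    latency_budget_ms: float) -> Candidate:
    """Return the most accurate candidate that fits the device's latency budget.

    Both predictors are assumed to be cheap (e.g. a lookup table or a small
    regression model), so no candidate is retrained during the search.
    """
    feasible = [c for c in search_space() if latency_of(c) <= latency_budget_ms]
    return max(feasible, key=accuracy_of)


if __name__ == "__main__":
    # Placeholder cost models; a real system would profile the target hardware
    # and evaluate sub-networks drawn from a jointly trained model family.
    acc = lambda c: 0.6 + 0.02 * c.depth + 0.1 * c.width_mult + 0.0005 * c.resolution
    gpu_latency = lambda c: 0.05 * c.depth * c.width_mult * (c.resolution / 32) ** 2
    mcu_latency = lambda c: 1.5 * c.depth * c.width_mult * (c.resolution / 32) ** 2

    print("GPU pick:", pick_for_device(acc, gpu_latency, latency_budget_ms=5.0))
    print("MCU pick:", pick_for_device(acc, mcu_latency, latency_budget_ms=30.0))
```

Because the predictors are cheap to evaluate, the same search can be rerun for each target (GPU, AI SoC, MCU) without any additional training, which is the key to co-designing one model family for many devices.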

Founding Team

The world’s best in efficient deep learning

Song Han

Co-Founder & Chief Scientist

  • Assistant professor at MIT, PhD from Stanford
  • Co-founder of DeePhi Tech (acquired by Xilinx)
  • “35 Innovators Under 35” by MIT Technology Review
  • Inventor of “Deep Compression”
  • 29k Google Scholar citations

Di Wu

Co-Founder & CEO

  • Tech lead at Facebook AI, PyTorch accelerator enablement
  • Product and engineering leader at Falcon Computing Solutions (acquired by Xilinx)
  • PhD from UCLA, with years of experience in customized hardware systems at Intel Labs and MSRA

Huizi Mao

Co-Founder & CTO

  • PhD from Stanford. Co-Inventor of “Deep Compression”
  • Early member of DeePhi and Megvii
  • Worked at Google Research, Facebook AML and NVIDIA
  • NVIDIA Fellowship recipient

Awards

The Best Hardware-Aware ML Platform