Designed for the Inference Era

Introducing Lumai Iris Nova, the first generation of the Lumai Iris Inference Server platform. Delivering the fastest optical compute available for data center evaluation today, Iris Nova far exceeds previous performance benchmarks and sets a new standard for what is possible in commercial optical compute.

Optical accelerator for heterogeneous AI computing
Designed for compute-bound, power-constrained workloads
Fully compatible with existing data center infrastructure

Meet Lumai Iris Nova

The Only Optical Compute Capable of Running Billion-Parameter LLMs

Lumai Iris Nova runs billion-parameter LLMs in real time, supporting Llama 8B and 70B models and bringing unprecedented inference performance to optical hardware today.
Lumai Iris Nova is ready for inference evaluation in data center environments.

Contact Us

Meet Lumai Iris Aura & Iris Tetra
The next-generation Lumai Iris Inference Servers, Iris Aura and Iris Tetra, deliver unmatched throughput, efficiency, and scalability in power-constrained environments. Sustained processing and high hardware utilization maximize tokens per watt at a fraction of the cost of digital chips.

Extreme Efficiency & Lower TCO

The high-throughput token engine delivers up to 100 TOPS per watt, enabling extremely high-performance clusters that break through the power limitations of conventional digital accelerators. Optimized for compute-bound inference workloads, the accelerator operates independently or as a prefill processor alongside decode-optimized accelerators, maximizing performance and efficiency across the inference pipeline.
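
To illustrate the prefill/decode split described above, here is a minimal, purely illustrative Python sketch of a disaggregated inference pipeline. It is not Lumai's software stack; all function and device names are hypothetical, and the accelerators are stood in by placeholder functions.

```python
# Hypothetical sketch: route the compute-bound prefill pass to an optical
# accelerator and the memory-bandwidth-bound decode loop to a digital one.
# Function names and data structures are illustrative placeholders only.

def prefill_on_optical(prompt_tokens):
    # Process the whole prompt in one large, matmul-heavy pass -- the kind of
    # compute-bound work an optical matrix engine is suited to.
    kv_cache = [f"kv({t})" for t in prompt_tokens]   # placeholder KV cache
    return kv_cache

def decode_on_digital(kv_cache, max_new_tokens):
    # Generate tokens one at a time; this phase is dominated by memory
    # bandwidth, so it stays on a decode-optimized digital accelerator.
    output = []
    for step in range(max_new_tokens):
        token = f"token_{step}"                      # placeholder sampling
        kv_cache.append(f"kv({token})")
        output.append(token)
    return output

def run_inference(prompt_tokens, max_new_tokens=4):
    kv_cache = prefill_on_optical(prompt_tokens)         # compute-bound phase
    return decode_on_digital(kv_cache, max_new_tokens)   # bandwidth-bound phase

print(run_inference(["The", "inference", "era"]))
```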

Optical Matrix Multiplication at Scale

A compact optical engine performs matrix multiplications of up to 2048×2048 at INT4/INT8-equivalent precision. The architecture scales efficiently by utilizing three-dimensional optical parallelism, enabling massive AI workloads that exceed conventional digital accelerator limits.
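
As a rough sketch of how a fixed-size matrix engine scales to larger workloads, the NumPy example below decomposes a big matrix multiplication into fixed-size tiles, each of which would be the unit of work dispatched to the optical engine. This is an assumption-level illustration of tiling in general, not Lumai's scheduler, and the tile size is shrunk so the example runs quickly.

```python
# Minimal NumPy sketch (not Lumai's software stack): tile a larger matmul into
# fixed-size blocks, standing in for the 2048x2048 optical matrix engine.
import numpy as np

TILE = 512  # stand-in for the 2048x2048 optical tile size

def tiled_matmul(a, b, tile=TILE):
    """Multiply a (M x K) by b (K x N) as a sum of tile-sized products."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2
    out = np.zeros((m, n), dtype=np.float32)
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            for p in range(0, k, tile):
                # Each tile-by-tile product is the unit of work that would be
                # dispatched to the optical matrix engine.
                out[i:i+tile, j:j+tile] += a[i:i+tile, p:p+tile] @ b[p:p+tile, j:j+tile]
    return out

a = np.random.rand(1024, 1024).astype(np.float32)
b = np.random.rand(1024, 1024).astype(np.float32)
assert np.allclose(tiled_matmul(a, b), a @ b, atol=1e-2)
```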

AI Workload Execution

Lumai Iris Nova supports billion-parameter-scale LLM inference workloads, including Llama 3. The system is available for inference performance evaluation in data center environments, demonstrating throughput scalability as workloads and model sizes increase.

Seamless Data Center Integration

Engineered for easy deployment in standard data center environments using conventional rack infrastructure and air-cooled systems. Designed to integrate directly into high-performance clusters using standard interfaces for flexible scale-up and scale-out architectures. A digital processor ensures seamless operation within existing software and digital compute workflows.

Delivering Breakthrough Performance

Lumai Iris is designed for maximum token throughput within a 10 kW power budget. Its high energy efficiency, high performance, and high hardware utilization allow it to process tokens faster and at a lower cost than conventional digital accelerators, as the figures below show.

1 ExaOPS
In a 10 kW power-constrained deployment
100 TOPS/W
2048×2048 optical tensor
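
As a back-of-the-envelope check, the headline figures are mutually consistent: 100 TOPS per watt sustained across a 10 kW envelope works out to 1 ExaOPS. A minimal calculation:

```python
# Sanity-check arithmetic only, not a measured benchmark result.
tops_per_watt = 100
power_watts = 10_000               # 10 kW deployment
total_tops = tops_per_watt * power_watts
print(total_tops)                  # 1_000_000 TOPS = 10**18 ops/s = 1 ExaOPS
```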

Request Your Evaluation

Contact Us
