GB10 - projects

MM RAG - Multimodal Retrieval-Augmented Generation

MM RAG (Multimodal RAG) is an advanced document intelligence system that combines vector search, OCR, and language models to extract and synthesise information from complex PDF documents. The system is specifically optimised to run efficiently on NVIDIA's GB10 platform.
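
To make the vector search step concrete, here is a minimal C++ sketch of top-k retrieval by cosine similarity over chunk embeddings. It is an illustration only, not MM RAG's actual code: the names `cosine_similarity` and `top_k` are hypothetical, and the embeddings are assumed to have been produced already (for example from OCR'd PDF text).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Cosine similarity between two equal-length embedding vectors.
// Returns 0 when either vector has zero norm.
double cosine_similarity(const std::vector<double>& a,
                         const std::vector<double>& b) {
    assert(a.size() == b.size());
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    if (na == 0.0 || nb == 0.0) return 0.0;
    return dot / (std::sqrt(na) * std::sqrt(nb));
}

// Hypothetical helper: indices of the k chunks most similar to the query,
// ordered from most to least similar.
std::vector<std::size_t> top_k(const std::vector<double>& query,
                               const std::vector<std::vector<double>>& chunks,
                               std::size_t k) {
    std::vector<std::pair<double, std::size_t>> scored;
    scored.reserve(chunks.size());
    for (std::size_t i = 0; i < chunks.size(); ++i)
        scored.emplace_back(cosine_similarity(query, chunks[i]), i);

    k = std::min(k, scored.size());
    // Only the first k positions need to be in sorted order.
    std::partial_sort(scored.begin(),
                      scored.begin() + static_cast<std::ptrdiff_t>(k),
                      scored.end(),
                      [](const auto& x, const auto& y) { return x.first > y.first; });

    std::vector<std::size_t> out;
    out.reserve(k);
    for (std::size_t i = 0; i < k; ++i) out.push_back(scored[i].second);
    return out;
}
```

In a RAG pipeline, the text of the returned chunks would then be passed to the language model as context. A real system would typically replace this brute-force scan with an approximate nearest-neighbour index.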

Huge model

A program built specifically for GB10 that loads models larger than the available memory.

Lib UnifiedFlow
CPU-GPU Zero-Copy Library for NVIDIA Grace-Blackwell

A C++/CUDA library optimised for the unified CPU-GPU memory architecture of the NVIDIA GB10 (Grace-Blackwell). It enables high-performance data processing pipelines with zero-copy data sharing between the Arm CPU and the Blackwell GPU.

HugeModel Inference Engine

Run Large Language Models Beyond GPU Memory Limits

A high-performance C++/CUDA inference engine designed for Mixture-of-Experts models on NVIDIA GB10 with unified memory architecture.
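
One common way to run a Mixture-of-Experts model whose weights exceed memory is to keep only a bounded set of experts resident and load the rest on demand. The C++ sketch below shows that idea with a least-recently-used (LRU) cache. It is an assumption made for illustration, not a description of how the HugeModel engine is implemented: `ExpertCache`, its `Loader` callback, and the eviction policy are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical LRU cache of expert weights (illustrative, not the engine's design).
class ExpertCache {
public:
    using Weights = std::vector<float>;
    // Loads the weights of one expert by id, e.g. from disk.
    using Loader = std::function<Weights(int)>;

    ExpertCache(std::size_t capacity, Loader loader)
        : capacity_(capacity), loader_(std::move(loader)) {
        assert(capacity_ > 0);
    }

    // Returns the weights for expert_id. On a miss, loads them and evicts the
    // least recently used expert if the cache is full. The returned reference
    // stays valid until that expert is evicted.
    const Weights& get(int expert_id) {
        auto it = index_.find(expert_id);
        if (it != index_.end()) {
            // Hit: move the entry to the front (most recently used).
            order_.splice(order_.begin(), order_, it->second);
            return it->second->second;
        }
        ++misses_;
        if (order_.size() >= capacity_) {
            // Full: drop the entry at the back (least recently used).
            index_.erase(order_.back().first);
            order_.pop_back();
        }
        order_.emplace_front(expert_id, loader_(expert_id));
        index_[expert_id] = order_.begin();
        return order_.front().second;
    }

    bool resident(int expert_id) const { return index_.count(expert_id) != 0; }
    std::size_t misses() const { return misses_; }

private:
    using Entry = std::pair<int, Weights>;
    std::size_t capacity_;
    Loader loader_;
    std::list<Entry> order_;  // front = most recently used
    std::unordered_map<int, std::list<Entry>::iterator> index_;
    std::size_t misses_ = 0;
};
```

Because a router activates only a few experts per token, a cache of this kind can serve most requests from resident weights while the full expert set lives in slower storage.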
