GSI Technology Announces Preliminary Benchmark Data For Gemini-II Compute-in-Memory Processor Demonstrating 3-Second Time-To-First-Token Performance For Multimodal Large Language Models

GSI Technology, Inc. (NASDAQ:GSIT), the inventor of the Associative Processing Unit (APU), a paradigm shift in artificial intelligence (AI) and high-performance compute processing, providing true compute-in-memory technology, today announced preliminary benchmark results for the Gemini-II Compute-in-Memory processor. These results demonstrated 3-second time-to-first-token ("TTFT") performance for multimodal large language models operating at the edge with video and text inputs.

Using the Gemma-3 12B vision-language model on GSI's production Gemini-II processor, GSI achieved the 3-second TTFT while consuming approximately 30 watts at the AI sub-system, including the chip. To GSI's knowledge, this 3-second TTFT at approximately 30 watts at the AI sub-system is the lowest publicly reported result for a multimodal 12B model running on an embedded edge processor.

Independent third-party testing of the same workload on competitive embedded platforms reported TTFT measurements of roughly 12 seconds on Qualcomm Snapdragon X Elite at approximately 30 W, and 3 seconds on NVIDIA Jetson Thor at more than 100 W. Gemini-II therefore matches the Jetson Thor TTFT at less than one third of the power and is roughly four times faster than Snapdragon X Elite at comparable power. On that basis, GSI concludes that Gemini-II offers a favorable responsiveness and power-efficiency profile for power- and thermally-constrained edge environments.
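
One way to put the reported figures on a single scale is to estimate the energy spent before the first token appears, as TTFT multiplied by power. The short Python sketch below does that arithmetic. It is an illustration only, not part of GSI's benchmark methodology: it assumes power draw is constant over the TTFT window, treats the "roughly" and "approximately" figures as nominal values, and uses 100 W as a lower bound for Jetson Thor, which was reported only as "over 100W". The function and variable names are hypothetical.

```python
# Nominal figures quoted in the announcement: (TTFT in seconds, power in watts).
# Assumption: power is constant while the first token is being produced.
reported = {
    "GSI Gemini-II": (3.0, 30.0),
    "Qualcomm Snapdragon X Elite": (12.0, 30.0),
    "NVIDIA Jetson Thor": (3.0, 100.0),  # reported as "over 100W"; 100 W is a lower bound
}


def energy_to_first_token(ttft_s: float, power_w: float) -> float:
    """Estimated energy in joules: seconds multiplied by watts."""
    return ttft_s * power_w


# Estimated joules consumed before the first token, per platform.
energies = {
    name: energy_to_first_token(ttft_s, power_w)
    for name, (ttft_s, power_w) in reported.items()
}

for name, joules in energies.items():
    print(f"{name}: about {joules:.0f} J to first token")
```

Under these assumptions the estimate is about 90 J for Gemini-II, about 360 J for Snapdragon X Elite, and at least 300 J for Jetson Thor. Measured energy would differ if power varies during prefill or if the comparison boundaries (chip versus sub-system) are not identical across platforms.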