APU maker achieves 100x speedup over Xeon for big data search

Similarity search across very large datasets is one of the harder problems in data science, and the hardware that runs it is often the bottleneck. GSI Technology says its latest chip addresses that bottleneck directly.
Similarity search is a fundamental capability in the analysis of unstructured data. It lets researchers sift through volumes of information that conventional query engines cannot manage effectively. Identifying faces in images, analyzing DNA sequences, and discovering new drug candidates all depend on it. One well-known example is Facebook’s FAISS library, an open source library for efficient similarity search.
Understanding the Limitations of Current Technologies
Traditionally, the heavy lifting in similarity searches has been performed by Xeon CPUs and GPUs. However, both these technologies have inherent limitations when it comes to processing large datasets efficiently. The primary challenge lies in the memory-to-CPU-core bottleneck, which becomes pronounced as the size of the dataset increases.
A Xeon CPU, for instance, processes one record at a time per core. When executing a similarity search, it loads portions of a dataset into memory and sequentially compares each entry against the search term. This method can be painfully slow, especially when dealing with substantial datasets, such as an image recognition database containing a billion records. The power requirements also spike as the system struggles to keep up with demand.
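The record-at-a-time scan described above can be sketched in a few lines of Python. This is a minimal illustration, not code from GSI or Intel: the names `bit_difference` and `linear_scan`, the toy 8-bit fingerprints, and the choice of counting differing bits as the comparison (the Hamming distance covered later in this article) are all assumptions made for the example.

```python
import heapq

def bit_difference(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")

def linear_scan(query: int, records: list[int], k: int = 3) -> list[tuple[int, int]]:
    """Compare the query against every record, one at a time.

    Returns the k closest records as (distance, record_index) pairs.
    The work grows linearly with len(records), which is why a single
    core slows down as the dataset gets larger.
    """
    scored = ((bit_difference(query, rec), idx) for idx, rec in enumerate(records))
    return heapq.nsmallest(k, scored)

# Hypothetical toy database of 8-bit fingerprints.
database = [0b11011001, 0b10011101, 0b00000000, 0b11011000, 0b11111111]
matches = linear_scan(0b11011001, database, k=2)
```

Every record has to be fetched from memory and passed through the core before it can be scored, so on a billion-record database the loop body runs a billion times per query.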
Nvidia GPUs run many cores in parallel, but searching a very large database on them still takes significant time. That gap is what GSI Technology is targeting with its latest chip.
Introducing GSI's Gemini APU
GSI Technology has developed the Gemini 'associative processing unit' (APU), a specialized solution designed to tackle the challenges of similarity searches head-on. The company claims that its Gemini APU can execute these searches on specific big data workloads up to 100 times faster than traditional Xeon processors, all while reducing power consumption by an impressive 70 percent.
What sets the Gemini APU apart is its design. Compute units sit directly inside the memory array, so data is processed in parallel where it is stored. There is no need to move data from memory to a Xeon CPU core, which avoids the trip through the L1, L2, and L3 caches that typically slows the process down.
Real-World Performance Metrics
GSI's Gemini APU has already showcased impressive performance metrics in real-world applications. For instance, a configuration with four onboard Gemini APUs successfully found matches for a scanned face within a billion-record database in just 1.25 milliseconds. In contrast, the same search executed on a Xeon server could take up to 125 milliseconds.
Moreover, a 1U server equipped with 16 Gemini chips achieved a remarkable throughput of 5.4 million hashes per second while running the 256-bit SHA1 algorithm. This performance surpasses that of a 4U server housing eight Nvidia V100 cards and consumes only half the electrical power.
Technical Specifications of the Gemini APU
The Gemini APU integrates static random access memory (SRAM) with two million bit-processors dedicated to in-memory computing functions. SRAM is faster than DRAM but also more expensive, so it is typically reserved for high-performance applications.
Within the Gemini chip, GSI intersperses 1-bit processing units alongside the read-modify-write lines of SRAM, allowing all processors to function in parallel. This design enables a seamless flow of data from memory cells to the chip, where each search term can be loaded onto individual processors for comparison.
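One way to picture many 1-bit processors working in parallel is a bit-serial software analogy. The sketch below is an assumption-laden illustration, not GSI's microarchitecture or instruction set. Each Python integer stands in for a row of 1-bit processors, with one bit per stored record, and every bitwise operation updates all records in the same step. The function names `to_bit_planes` and `parallel_mismatch_count` are hypothetical.

```python
def to_bit_planes(records: list[int], width: int) -> list[int]:
    """Transpose records into bit planes.

    planes[b] is an integer whose bit i holds bit b of records[i], so a
    single bitwise operation on a plane touches every record at once.
    """
    planes = [0] * width
    for i, rec in enumerate(records):
        for b in range(width):
            if (rec >> b) & 1:
                planes[b] |= 1 << i
    return planes

def parallel_mismatch_count(query: int, planes: list[int], n_records: int) -> list[int]:
    """Count differing bits between the query and every record.

    The loop runs once per bit position, not once per record.
    """
    all_ones = (1 << n_records) - 1
    counters: list[int] = []  # counters[k]: bit k of each record's running count
    for b, plane in enumerate(planes):
        # Bit i of carry is 1 where record i differs from the query at position b.
        carry = plane ^ (all_ones if (query >> b) & 1 else 0)
        # Ripple-carry add of that 1-bit value into all counters at once.
        for k in range(len(counters)):
            counters[k], carry = counters[k] ^ carry, counters[k] & carry
        if carry:
            counters.append(carry)
    # Read the per-record totals back out of the counter planes.
    return [
        sum(((c >> i) & 1) << k for k, c in enumerate(counters))
        for i in range(n_records)
    ]

# Hypothetical toy data: four 8-bit records and one query.
records = [0b11011001, 0b10011101, 0b00000000, 0b11111111]
planes = to_bit_planes(records, width=8)
counts = parallel_mismatch_count(0b11011001, planes, len(records))
```

In this analogy the number of steps depends on the record width, not on how many records are stored. That is the property the in-memory design relies on.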
Component | Gemini APU | Xeon 8280 | Nvidia A100 GPU |
---|---|---|---|
Processing Units | 2 million 1-bit | 28 x 2 x 512 bits | 104 x 4,096 bits |
Clock Speed | 400 MHz | 2.7 GHz | 1.4 GHz |
Memory Bandwidth | 26 TB/sec | 1 TB/sec | 7 TB/sec |
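As a rough way to read the table, the first two rows can be multiplied together to give a crude ceiling on bit operations per second. The sketch below assumes every bit lane completes one operation per clock cycle. That is a simplification introduced here, not a vendor figure, and it ignores memory bandwidth, instruction mix, and real workload behavior.

```python
# Figures copied from the table above: (processing units, bits per unit, clock in Hz).
chips = {
    "Gemini APU": (2_000_000, 1, 400e6),
    "Xeon 8280": (28 * 2, 512, 2.7e9),
    "Nvidia A100 GPU": (104, 4096, 1.4e9),
}

def peak_bit_ops_per_second(units: int, bits: int, clock_hz: float) -> float:
    """Crude ceiling: assumes one operation per bit lane per clock cycle."""
    return units * bits * clock_hz

peaks = {name: peak_bit_ops_per_second(*spec) for name, spec in chips.items()}
for name, peak in peaks.items():
    print(f"{name}: {peak:.2e} bit-ops/s")
```

These are back-of-envelope ceilings, not benchmark results. The measured search times reported earlier depend on how quickly data reaches the compute units, not just on how many there are.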
The Concept of Hamming Distance in Similarity Searches
At the heart of similarity searches lies the concept of Hamming distance. This metric quantifies the difference between two equal-length binary strings, such as a search term and a stored record. It is the number of positions at which the two strings differ, which gives a numerical measure of similarity.
For example, consider two binary strings: 11011001 and 10011101. XORing them bit by bit gives 01000100, which contains two 1s, so the Hamming distance is 2: the strings differ in two positions. In practical applications, smaller Hamming distances suggest greater similarity, making this a crucial metric in fields such as facial recognition, genomics, and drug discovery.
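The worked example can be checked in a few lines of Python. The comparison is a bitwise XOR, which produces a 1 wherever the inputs differ, followed by a count of the 1s. The function names `xor_bits` and `hamming_distance` are chosen for this illustration and do not come from any particular library.

```python
def xor_bits(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length binary strings, as a binary string."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return format(int(a, 2) ^ int(b, 2), f"0{len(a)}b")

def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length binary strings differ."""
    # Each '1' in the XOR result marks one differing position.
    return xor_bits(a, b).count("1")

difference = xor_bits("11011001", "10011101")
distance = hamming_distance("11011001", "10011101")
print(difference, distance)  # 01000100 2
```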
The efficiency of the Gemini APU in calculating Hamming distances far outstrips that of traditional Xeon processors, highlighting its potential to transform how we conduct similarity searches across multiple disciplines.
Market Position and Future Prospects
While specific pricing details for the Gemini APU remain undisclosed, the cost of GSI’s Leda-branded PCIe card, which features four Gemini chips, is approximately $15,000. This investment positions GSI Technology competitively within the high-performance computing market, especially as organizations increasingly recognize the need for optimized data processing solutions.
As the demand for efficient similarity searches grows, the Gemini APU stands out as a promising innovation that can significantly enhance data analytics capabilities across various industries. With its unmatched speed and efficiency, GSI Technology is carving a niche in an arena dominated by traditional CPU and GPU technologies.