Nvidia and DataStax just made generative AI smarter and leaner — here’s how




Nvidia and DataStax launched new technology today that dramatically reduces storage requirements for companies deploying generative AI systems, while enabling faster and more accurate information retrieval across multiple languages.

The new Nvidia NeMo Retriever microservices, integrated with DataStax’s AI platform, cut data storage volume 35-fold compared to traditional approaches, a crucial capability as enterprise data is projected to reach more than 20 zettabytes by 2027.

“Today’s enterprise unstructured data is at 11 zettabytes, roughly equal to 800,000 copies of the Library of Congress, and 83% of that is unstructured with 50% being audio and video,” said Kari Briski, VP of product management for AI at Nvidia, in an interview with VentureBeat. “Significantly reducing these storage costs while enabling companies to effectively embed and retrieve information becomes a game changer.”

Nvidia’s NeMo Retriever technology delivers a 35x improvement in data storage efficiency, as illustrated in a comparison of raw text storage, baseline vector embeddings, and reduced embedding dimensions. This breakthrough underpins the scalability of generative AI across enterprise applications. (Credit: Nvidia)
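To give a rough sense of why embedding dimensions matter for storage, here is a minimal back-of-the-envelope sketch in Python. The vector count, dimensions and 4-byte float size are hypothetical values chosen for illustration, not figures from Nvidia or DataStax, and the calculation ignores index overhead. It does not attempt to reproduce the 35x figure; it only shows how the raw size of an embedding table scales with its dimensions.

```python
def embedding_storage_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw size of a dense embedding table, ignoring any index overhead."""
    return num_vectors * dims * bytes_per_value


def reduction_factor(baseline_bytes: int, reduced_bytes: int) -> float:
    """How many times smaller the reduced table is than the baseline."""
    return baseline_bytes / reduced_bytes


# Hypothetical workload: 10 million vectors stored as 4-byte floats.
NUM_VECTORS = 10_000_000

# Assumed dimensions for illustration only: 2048 for the baseline, 256 after reduction.
baseline = embedding_storage_bytes(NUM_VECTORS, dims=2048)
reduced = embedding_storage_bytes(NUM_VECTORS, dims=256)

print(f"baseline: {baseline / 1e9:.2f} GB")
print(f"reduced:  {reduced / 1e9:.2f} GB")
print(f"reduction: {reduction_factor(baseline, reduced):.0f}x")
```

Under these assumed numbers, the table shrinks from about 82 GB to about 10 GB, an 8x reduction from dimension alone. Storing each value in fewer bytes would compound the saving, which is why the `bytes_per_value` parameter is left adjustable.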

The technology is already proving transformative for the Wikimedia Foundation, which used the integrated solution to reduce processing time for 10 million Wikipedia entries from 30 days to under three days. The system handles real-time updates across hundreds of thousands of entries being edited daily by 24,000 global volunteers.

“You can’t just rely on large language models for content — you need context from your existing enterprise data,” explained Chet Kapoor, CEO of DataStax. “This is where our hybrid search capability comes in, combining both semantic search and traditional text search, then using Nvidia’s re-ranker technology to deliver the most relevant results in real time at global scale.”
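The hybrid approach Kapoor describes can be sketched in a few lines of Python. This is a toy illustration, not DataStax's or Nvidia's implementation: the documents, the three-dimensional vectors and the helper names are all hypothetical, and reciprocal rank fusion is used here only as one common way to merge a keyword ranking with a semantic ranking, since the companies did not say which fusion method they use. The re-ranking stage is not modeled; in a real pipeline the fused candidates would be passed to a re-ranker model.

```python
import math

# Hypothetical corpus for illustration only.
DOCS = {
    "d1": "nvidia nemo retriever embeds enterprise documents",
    "d2": "hybrid search combines keyword and vector retrieval",
    "d3": "wikipedia volunteers edit entries every day",
}
# Hypothetical 3-dimensional embeddings; a real system would get these from an embedding model.
DOC_VECS = {"d1": [0.9, 0.1, 0.0], "d2": [0.6, 0.7, 0.1], "d3": [0.0, 0.2, 0.9]}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def keyword_ranking(query):
    """Rank documents by how many query terms they contain (ties broken by id)."""
    terms = set(query.lower().split())
    scores = {d: len(terms & set(text.split())) for d, text in DOCS.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))


def semantic_ranking(query_vec):
    """Rank documents by cosine similarity to the query vector."""
    return sorted(DOC_VECS, key=lambda d: (-cosine(query_vec, DOC_VECS[d]), d))


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several rankings: each list contributes 1 / (k + rank) per document."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda d: (-fused[d], d))


def hybrid_search(query, query_vec, top_n=2):
    """Fuse keyword and semantic rankings, then return the top candidates."""
    rankings = [keyword_ranking(query), semantic_ranking(query_vec)]
    return reciprocal_rank_fusion(rankings)[:top_n]


results = hybrid_search("hybrid keyword search", [0.5, 0.8, 0.1])
print(results)
```

The point of fusing ranks rather than raw scores is that keyword and vector scores live on different scales, so comparing positions sidesteps the need to normalize them.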

Enterprise data security meets AI accessibility

The partnership addresses a critical challenge facing enterprises: how to make their vast stores of private data accessible to AI systems without exposing sensitive information to external language models.

“Take FedEx — 60% of their data sits in our products, including all package delivery information for the past 20 years with personal details. That’s not going to Gemini or OpenAI anytime soon, or ever,” Kapoor explained.

The technology is finding early adoption across industries, with financial services firms leading the charge despite regulatory constraints. “I’ve been blown away by how far ahead financial services firms are now,” said Kapoor, citing Commonwealth Bank of Australia and Capital One as examples.

The next frontier for AI: Multimodal document processing

Looking ahead, Nvidia plans to expand the technology’s capabilities to handle more complex document formats. “We’re seeing great results with multimodal PDF processing — understanding tables, graphs, charts and images and how they relate across pages,” Briski revealed. “It’s a really hard problem that we’re excited to tackle.”

For enterprises drowning in unstructured data while trying to deploy AI responsibly, the new offering provides a path to make their information assets AI-ready without compromising security or breaking the bank on storage costs. The solution is available immediately through the Nvidia API catalog with a 90-day free trial license.

The announcement underscores the growing focus on enterprise AI infrastructure as companies move beyond experimentation to large-scale deployment, with data management and cost efficiency becoming critical success factors.


