Graph database maker Neo4j Inc. today launched Infinigraph, calling it a significant advancement in distributed graph technology.
The company said the architecture allows users to run both operational and analytical workloads on a single graph database platform at over 100 terabytes in scale without fragmenting the graph, duplicating infrastructure or compromising performance.
The product of more than two years of development, Infinigraph addresses the problem of harmonizing transactional systems and analytical workloads. Previous versions of Neo4j had to run on a single physical machine, meaning that organizations had to use extract/transform/load pipelines, synchronization or multiple databases to handle high volumes.
Infinigraph addresses this limitation by using sharding, a database technique that splits large datasets into smaller, more manageable pieces, to support billions of relationships and thousands of concurrent queries across multiple machines while maintaining the atomicity, consistency, isolation and durability guarantees, known as ACID, that are needed in transactional scenarios.
The architecture shards graph data across multiple machines while preserving its logical consistency, so data is distributed and scaled automatically without the need for application rewrites or manual intervention.
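To make the general idea of sharding concrete, here is a minimal Python sketch of hash-based partitioning. It is an illustration only: Neo4j has not published the placement scheme Infinigraph uses, and the names `shard_for` and `partition` are hypothetical.

```python
import hashlib


def shard_for(node_id: str, num_shards: int) -> int:
    """Map a node ID to a shard number with a stable hash.

    Assumption: simple modulo placement, chosen for illustration.
    This is not Infinigraph's actual placement logic.
    """
    digest = hashlib.sha256(node_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(node_ids, num_shards: int):
    """Group node IDs by the shard each one hashes to."""
    shards = {i: [] for i in range(num_shards)}
    for node_id in node_ids:
        shards[shard_for(node_id, num_shards)].append(node_id)
    return shards


if __name__ == "__main__":
    accounts = [f"account-{n}" for n in range(10)]
    for shard, members in partition(accounts, 3).items():
        print(shard, members)
```

Because the hash is stable, the same node ID always lands on the same shard, which is what lets a client or router find data without a lookup table. A production system would also need rebalancing when shards are added, which this sketch leaves out.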
Neo4j said the new architecture lets customers embed tens of millions of documents as vectors directly into the graph. This enables use cases such as fraud detection, product knowledge graphs, long-term compliance monitoring and semantic search to be conducted on much larger and richer data volumes.
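As a rough illustration of what semantic search over embedded documents involves, the Python sketch below ranks graph nodes by cosine similarity between a query vector and each node's stored embedding. The node layout, the `embedding` property name and the toy two-dimensional vectors are assumptions made for the example; this is not Neo4j's vector index API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, nodes, k=2):
    """Return the IDs of the k nodes whose embeddings best match the query.

    `nodes` maps a node ID to a property dict holding an "embedding" list
    (a hypothetical layout for this sketch).
    """
    ranked = sorted(
        nodes.items(),
        key=lambda item: cosine(query, item[1]["embedding"]),
        reverse=True,
    )
    return [node_id for node_id, _ in ranked[:k]]


if __name__ == "__main__":
    docs = {
        "doc1": {"embedding": [1.0, 0.0]},
        "doc2": {"embedding": [0.0, 1.0]},
        "doc3": {"embedding": [0.9, 0.1]},
    }
    print(top_k([1.0, 0.0], docs))
```

This brute-force scan compares the query against every document. At the scale the company describes, a real system would rely on an approximate nearest-neighbor index rather than a full scan.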
‘Billions of vectors’
“We’re now able to support billions of vectors in the Neo4j database,” said Sudhir Hasbe, president of technology at Neo4j. “This is particularly useful in life sciences, where companies are processing tens of millions of scientific documents for drug discovery. In the past, these documents would be orphaned. Now, they can be embedded directly into the graph.”
Neo4j laid the groundwork for the new architecture with the introduction of Fabric four years ago. That enabled federated graph queries across machines, but customers had to manage sharding themselves. Infinigraph automates this process while retaining full ACID compliance, a feature Hasbe said is critical for transactional reliability.
“Graph sharding is a difficult problem due to traversal queries,” which are a type of database query used to navigate relationships between connected data points in graph databases, Hasbe said. “We solved it by maintaining a global index in one environment for fast path queries, while distributing the actual data across machines for horizontal scalability. Even distributed transactions remain consistent and reliable.”
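The split Hasbe describes, a single index for path queries with the underlying data spread across machines, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Neo4j's implementation: the `ShardedGraph` class, its method names and the CRC32 placement rule are all hypothetical, and in-memory dictionaries stand in for separate machines.

```python
import zlib
from collections import deque


class ShardedGraph:
    """Toy model: one global topology index, properties spread over shards."""

    def __init__(self, num_shards):
        # Global index: node ID -> set of neighbor IDs (relationships only).
        self.adjacency = {}
        # Each dict stands in for a separate machine holding property data.
        self.shards = [dict() for _ in range(num_shards)]

    def _shard(self, node_id):
        # Stable hash so a node always maps to the same shard (assumed rule).
        return self.shards[zlib.crc32(node_id.encode("utf-8")) % len(self.shards)]

    def add_node(self, node_id, **props):
        self.adjacency.setdefault(node_id, set())
        self._shard(node_id)[node_id] = props

    def add_edge(self, src, dst):
        # Directed edge; raises KeyError if src was never added.
        self.adjacency[src].add(dst)

    def shortest_path(self, start, goal):
        """Breadth-first search that touches only the global index."""
        parents = {start: None}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if node == goal:
                path = []
                while node is not None:
                    path.append(node)
                    node = parents[node]
                return path[::-1]
            for nxt in self.adjacency.get(node, ()):
                if nxt not in parents:
                    parents[nxt] = node
                    queue.append(nxt)
        return None

    def properties(self, node_id):
        """Fetch a node's properties from whichever shard holds them."""
        return self._shard(node_id)[node_id]
```

In this model a traversal never hops between shards to follow a relationship. Only the final property lookups for nodes on the resulting path go to the shards, which is the intuition behind keeping the topology in one place.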
Neo4j quoted early-access customers Intuit Inc. and Dun & Bradstreet Corp. expressing their enthusiasm for the new features. “Running real-time queries while also analyzing broader patterns is critical,” said Moheesh Raj, D&B’s director of engineering. “That requires a graph to scale both.”
“Some of the biggest banks are now able to run fraud detection systems using Infinigraph, working with hundreds of terabytes of interconnected transactional data,” Hasbe said.
AI angle
Neo4j also trumpeted the value of graphs as vector databases used in generative artificial intelligence. AI training requires both structured and unstructured data. The company first added vector support in 2023, allowing documents to be stored as vector embeddings. Infinigraph enables storage at a much larger scale.
“Gen AI has made unstructured data more valuable than ever,” Hasbe said. “We’ve seen customers go from using [Elasticsearch BV’s] Elastic Store for vectors to managing everything within Neo4j. That’s a huge simplification of their stack.”
Infinigraph is available on an early access basis now in Neo4j’s self-managed Enterprise Edition, with broader availability set for October. The company said the feature will soon be available within its AuraDB cloud-native graph platform.
Pricing for Infinigraph will follow a decoupled model, separating compute and storage to provide greater flexibility. “We’re aligning our pricing model with how modern distributed systems operate,” said Hasbe. “It allows customers to scale their workloads without unexpected costs.” He said customers with smaller workloads will probably see costs decline from what they are currently paying.
John Furrier, co-host of SiliconANGLE’s theCUBE, spoke exclusively this week to Sudhir Hasbe, Neo4j’s president and chief product officer.