Scaling out with Neo4J?

 
Bartender
Posts: 2407
36
Scala Python Oracle Postgres Database Linux
I've seen a fair amount of evidence online (videos, blogs etc) describing how Neo4j performs very well when scaling up data volumes on a single machine, partly because relationships (edges) between data items can be followed directly instead of indirectly via FKs as in relational systems, and Neo4j's ACID transaction support is a real bonus in the NoSQL world. But to what extent is it amenable to scaling out across multiple servers in the way that Hadoop or MongoDB allow? How far is it possible to achieve high performance across multiple servers, given the fact that your data will inevitably have dependencies that cross machine boundaries, affecting how your transactions are managed?

I can see potential for Neo4j in some applications in my workplace, where we currently have a lot of relationships between tables on a strictly 3NF (unfortunately!) relational database that require lots of joins, because those relationships are expensive to traverse and Neo4j would make that process much more efficient. But I'm not clear on how far it would be possible to scale that solution out if the need arose, especially as the general trend is away from scaling up on big expensive kit and towards scaling out on cheaper servers.
 
Author
Posts: 10
5
Chris,

This is a great question.

Scaling graphs horizontally in terms of storage (data sharding) is a hard problem. When following a relationship can cost a network hop, traversing a graph quickly becomes too expensive.
So in those terms, Neo4j will not compete with the likes of Hadoop in the near future.

What Neo4j does provide is a very intelligent caching engine (both JVM heap caching and off-heap caching), which should speed things up significantly if the entire data set fits into memory.
For cases where the data is larger than the amount of memory available, you can run Neo4j in a clustered master-slave setup. Each instance (master and slaves) holds a full copy of the data, but with smart routing within your application you keep different subsets of the data cached on different instances, which keeps traversals fast on large data sets.
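As a rough illustration of that kind of routing, here is a minimal Python sketch. It assumes the application can derive a routing key for each request (a customer id, say) and hashes it to pick an instance, so requests for the same subset of data keep landing on the same warm cache. The instance addresses and the `pick_instance` helper are hypothetical, not part of Neo4j.

```python
import hashlib

# Hypothetical instance addresses; a real cluster would supply its own.
INSTANCES = ["neo4j-a.example", "neo4j-b.example", "neo4j-c.example"]


def pick_instance(routing_key, instances=INSTANCES):
    """Map a routing key to one instance, always the same one for the same key."""
    # Use a stable hash: Python's built-in hash() is randomized per process.
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


# Reads for the same customer always go to the same instance.
target = pick_instance("customer:42")
print(target)
```

Because every instance holds the full data set, a request sent to the "wrong" instance still returns the right answer; it just misses the warm cache. Simple modulo hashing also reshuffles most keys when an instance is added or removed, so a longer-lived setup might prefer consistent hashing.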
In addition, the core Neo4j data structures are very small - 9 bytes for a node and 33 bytes for a relationship - which means that you can fit billions of nodes and relationships in the RAM of a standard server. It is properties that add size to the data - and if you are after core traversal performance, you can store only a minimal amount of information in Neo4j and use another storage system to persist the rest - this is what polyglot persistence is all about.
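To put numbers on that, here is a back-of-the-envelope calculation in Python using the record sizes quoted above. The graph size (one billion nodes, one billion relationships) is an assumed example, properties are deliberately left out, and record sizes may differ between Neo4j versions.

```python
NODE_BYTES = 9           # per-node record size quoted above
RELATIONSHIP_BYTES = 33  # per-relationship record size quoted above


def core_store_bytes(node_count, relationship_count):
    """Estimate the size of node and relationship records only (no properties)."""
    return node_count * NODE_BYTES + relationship_count * RELATIONSHIP_BYTES


# Assumed example graph: one billion nodes and one billion relationships.
total = core_store_bytes(1_000_000_000, 1_000_000_000)
print(total / 1e9)  # 42.0 (decimal gigabytes)
```

At these sizes the graph structure alone comes to roughly 42 GB, which is within reach of a single well-specified server; properties would add to that.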

Aleksa
 
chris webster
Bartender
Posts: 2407
36
Scala Python Oracle Postgres Database Linux
Thanks again, Aleksa. It sounds like server-based Neo4j would probably be fine for some of our applications, and I like the idea of polyglot persistence using the right DB for the right kind of data. Just need to persuade my bosses...
 