
What kind of data fits Neo4j best (or worst)?

 
Bartender
What kind of data fits best/worst into Neo4j?

For example, we have an application currently running on an RDBMS with lots of tables that are mostly about relationships - a user works for an organisation, a message relates to a particular topic, a topic is part of a given survey, an organisation is responding to a particular survey etc. Right now we have to do a lot of joins to present this data to the end users as they prefer, and these joins will become expensive as we have to handle larger volumes of data and traverse foreign key lookups all the time, so Neo4j might be a good option here.

On the other hand, we have other applications, also running on an RDBMS, where most of our data would naturally suit wide rectangular formats, and the main challenge is to deal with high volumes of sparse data e.g. tables with hundreds of columns where individual columns may not be populated for every record. If we were looking for a NoSQL option here, I'd probably choose MongoDB rather than Neo4j.

So what are the sweet spots for Neo4j, and where would you probably choose a different database?
 
Author
Hi Chris,

Excellent question, and the most important one to ask when deciding which storage system to use for a specific use case.

In my opinion, Neo4j shines when it comes to real-time queries on localized data. When I say localized, I mean data closely connected to one or a few nodes.
For example, finding friends of friends in a social network, or finding which products a person buys together.
Anything that requires complex queries over interconnected, joined data is a good use case for Neo4j.
As you correctly put it, if you have SQL tables that represent relationships between entities, it will most likely be more efficient and performant to store this data in Neo4j instead.
Examples are plentiful: social networks, access control lists, master data management... We cover some of these use cases in the book, so take a look there as well.
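
To make "localized" a bit more concrete, here is a rough Python sketch of the friends-of-friends idea. It is not Neo4j code and not taken from the book: the in-memory adjacency map, the helper names `build_graph` and `friends_of_friends`, and the sample people are all made up for illustration. The point is only that the lookup starts at one node and touches its neighbourhood, not the whole data set.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected friendship graph: person -> set of direct friends."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def friends_of_friends(graph, person):
    """Return people exactly two hops away from `person`.

    Only the neighbours of `person` and their neighbours are visited,
    so the work depends on the local neighbourhood, not on graph size.
    """
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph.get(friend, set())
    # Exclude the starting person and anyone who is already a direct friend.
    return two_hops - direct - {person}


# Hypothetical sample data: "eve" and "fay" form a separate component.
edges = [
    ("ann", "bob"),
    ("bob", "cat"),
    ("bob", "dan"),
    ("ann", "dan"),
    ("eve", "fay"),
]
graph = build_graph(edges)
print(sorted(friends_of_friends(graph, "ann")))  # ['cat']
```

The query for "ann" never looks at "eve" or "fay". In a relational schema the same question would typically be a self-join on the friendship table, which is the kind of join-heavy query discussed above.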

Neo4j also supports ACID transactions, unlike most other NoSQL solutions, which makes it a unique proposition for use cases that require transaction support.

Neo4j does support storing tabular and semi-structured data (each node in Neo4j is just a bag of properties, not dissimilar to a Mongo document).
But Neo4j does not support sharding (which Mongo has built in, for example), so scaling out with large amounts of tabular data will most likely be less efficient in Neo4j.

Storing large blobs of data (like PDFs or images) is not Neo4j's strength either - I'd probably choose another db for that.

When talking about Big Data, any query that is likely to scan the entire graph (meaning the data is not localized) will probably not get the full benefit of the Neo4j graph engine - for such cases, depending on the size of the data, an HDFS/Hadoop-based solution may be a better option.

Aleksa
 
chris webster
Bartender
Thanks, Aleksa.

It sounds like I can probably make a case for learning two NoSQL databases then!
 