For example, we have an application currently running on an RDBMS with lots of tables that are mostly about relationships - a user works for an organisation, a message relates to a particular topic, a topic is part of a given survey, an organisation is responding to a particular survey etc. Right now we have to do a lot of joins to present this data to the end users as they prefer, and these joins will become expensive as we have to handle larger volumes of data and traverse foreign key lookups all the time, so Neo4j might be a good option here.
On the other hand, we have other applications, also running on an RDBMS, where most of our data would naturally suit wide rectangular formats, and the main challenge is to deal with high volumes of sparse data e.g. tables with hundreds of columns where individual columns may not be populated for every record. If we were looking for a NoSQL option here, I'd probably choose MongoDB rather than Neo4j.
So what are the sweet spots for Neo4j, and where would you probably choose a different database?
Excellent question, and the most important one to ask when deciding which storage system to use for a specific use case.
In my opinion, Neo4j shines when it comes to real-time queries over localized data. By localized, I mean data closely connected to one or a few nodes.
For example, finding friends of friends in a social network, or finding which products people buy together.
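To make "localized" concrete, here is a minimal sketch of a friends-of-friends lookup in plain Python. It does not use Neo4j or any of its APIs; the names and the `FRIENDSHIPS` data are made up for illustration, and an in-memory adjacency map stands in for the graph. The point it tries to show is that the query only touches the start node's neighbourhood, never the rest of the data.

```python
from collections import defaultdict

# Hypothetical toy data: each pair is an undirected "friend" relationship.
FRIENDSHIPS = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "dave"),
    ("carol", "dave"),
    ("carol", "erin"),
    ("frank", "grace"),  # unrelated part of the graph, never visited for alice
]


def build_adjacency(pairs):
    """Index every person to the set of their direct friends."""
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def friends_of_friends(adjacency, person):
    """Return people exactly two hops away: friends of friends who are
    neither the person themselves nor already a direct friend."""
    direct = adjacency.get(person, set())
    result = set()
    for friend in direct:
        # Only the neighbourhoods of direct friends are read.
        result |= adjacency.get(friend, set())
    return result - direct - {person}


adjacency = build_adjacency(FRIENDSHIPS)
print(sorted(friends_of_friends(adjacency, "alice")))  # ['dave', 'erin']
```

The work done depends on how many friends alice and her friends have, not on how many people exist in total. In a relational schema the same question would typically be a self-join on a friendship table.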
Anything that requires complex queries of interconnected, joined data is a good use case for Neo4j.
As you correctly put it, if you have SQL tables that represent relationships between entities, it will almost certainly be more efficient to store that data in Neo4j instead.
Examples are plentiful: social networks, access control lists, master data management... We cover some of these use cases in the book, so take a look there as well.
Neo4j also supports ACID transactions, unlike most other NoSQL solutions, which makes it a unique proposition for use cases that require transaction support.
Neo4j does support storing tabular and semi-structured data (each node in Neo4j is just a bag of properties, not dissimilar to a Mongo document).
But Neo4j does not support sharding (which Mongo has built in, for example), so scaling out with large amounts of tabular data will most likely be less efficient in Neo4j.
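As a rough illustration of the "bag of properties" idea mentioned above, the sketch below turns sparse rows from a wide table into per-record property maps that only keep the populated columns. This is plain Python with a dict standing in for a node or document; the column names and rows are hypothetical, and nothing here is a Neo4j or MongoDB API.

```python
# Hypothetical wide-table schema where most columns are empty for most rows.
COLUMNS = ["id", "name", "phone", "fax", "vat_number", "twitter"]

ROWS = [
    (1, "Acme", "555-0100", None, None, None),
    (2, "Globex", None, None, "GB123", "@globex"),
]


def row_to_properties(columns, row):
    """Keep only the populated columns, so each record carries
    just the properties it actually has."""
    return {col: val for col, val in zip(columns, row) if val is not None}


nodes = [row_to_properties(COLUMNS, row) for row in ROWS]
for node in nodes:
    print(node)
```

Each record ends up with a different set of keys, which is how both a graph node and a document can represent sparse data without reserving space for empty columns.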
Storing large blobs of data (like PDFs or images) is not Neo4j's strength either - I'd probably choose another database for that.
When talking about Big Data, any query that is likely to scan the entire graph (meaning the data is not localized) will probably not get the full benefit of Neo4j's graph engine - for such cases, depending on the size of the data, an HDFS/Hadoop-based solution may be a better option.