This is a big and important question, and one we have tried to answer in the Neo4j in Action book.
However, in a few points, the key things are:
- Neo4j allows fast and efficient querying of highly connected data: it can traverse millions of hops per second (a hop is a jump from one node to another via the relationship that connects them)
- In Neo4j, all data must live on a single machine/disk. This means that if you have a cluster of Neo4j instances, each one will hold a full copy of the data. Neo4j's core data structures (nodes and relationships) are very small, though (9 bytes for a node and 33 bytes for a relationship), which makes it easy to fit hundreds of millions of nodes/relationships on a single machine
- As for availability, Neo4j has an HA setup, which consists of a master-slave cluster of Neo4j servers. This is a commercial feature, though, and is only available with the licensed product.
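
To put the record sizes above in perspective, here is a rough back-of-the-envelope sketch in Python. The 9 and 33 byte figures are the ones quoted above; the node and relationship counts are made-up example values, and the estimate covers only the core node and relationship records (it assumes nothing about properties, indexes, or other storage overhead).

```python
# Record sizes as quoted in the post above.
NODE_RECORD_BYTES = 9
RELATIONSHIP_RECORD_BYTES = 33


def core_store_bytes(node_count: int, relationship_count: int) -> int:
    """Estimate bytes used by node and relationship records only.

    Properties, indexes and any other overhead are deliberately ignored.
    """
    return (node_count * NODE_RECORD_BYTES
            + relationship_count * RELATIONSHIP_RECORD_BYTES)


# Hypothetical graph size, chosen purely for illustration.
nodes = 100_000_000
relationships = 500_000_000

total = core_store_bytes(nodes, relationships)
print(f"{total:,} bytes (~{total / 1024**3:.1f} GiB)")
```

With these example counts the core records come to about 16.2 GiB, which is within reach of a single machine's disk.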
Hope this helps. For more details, please read the book!