This is a big and important question, and one we have tried to answer in the Neo4j in Action book.
In short, though, the key points are:
- Neo4j allows fast and efficient querying of highly connected data - it can traverse millions of hops per second (a hop is a jump from one node to another via the relationship that connects them)
- In Neo4j, all data must live on a single machine/disk. This means that if you have a cluster of Neo4j instances, each one holds a full copy of the data. Neo4j's core data structures (nodes and relationships) are very small, though (9 bytes for a node and 33 bytes for a relationship), which makes it easy to fit hundreds of millions of nodes/relationships on a single machine
- As for availability, Neo4j has an HA (high availability) setup, which consists of a master-slave cluster of Neo4j servers - this is a commercial feature, though, and is only available with the licensed product.
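
To make the storage point concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the two record sizes quoted above (9 bytes per node, 33 bytes per relationship); the function name and the example graph size are made up for illustration, and the estimate ignores properties, indexes and transaction logs, which can easily dominate in a real store.

```python
# Record sizes as quoted in the answer above (bytes per record).
NODE_RECORD_BYTES = 9
RELATIONSHIP_RECORD_BYTES = 33


def core_store_bytes(node_count: int, relationship_count: int) -> int:
    """Estimate the size of node and relationship records only.

    Hypothetical helper: properties, indexes and logs are NOT included.
    """
    return (node_count * NODE_RECORD_BYTES
            + relationship_count * RELATIONSHIP_RECORD_BYTES)


if __name__ == "__main__":
    # Assumed example graph: 200 million nodes, 800 million relationships.
    total = core_store_bytes(200_000_000, 800_000_000)
    print(f"{total / 1e9:.1f} GB for core records")
```

For that assumed graph the core records come to 28,200,000,000 bytes, i.e. about 28.2 GB, which fits on one disk without trouble.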
Hope this helps. For more details, please read the book!