To be honest, that all depends on the technology. Systems like Ceph, HDFS, Cassandra, and Elasticsearch, where fault tolerance is built into the service itself, should work just fine running on Mesos (and Marathon). Just be sure to run each instance of the overall cluster on its own machine, so that one host failure can't take out several nodes at once and you don't inadvertently create a single point of failure. And of course, take backups (and test them!).
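
One way to get the "one instance per machine" behaviour is a Marathon placement constraint. Here's a minimal sketch in Python that just builds the app definition you would POST to Marathon's `/v2/apps` endpoint. The `["hostname", "UNIQUE"]` constraint is Marathon's; the app id, image tag, and resource numbers are placeholders I made up, and a real Cassandra (or Ceph, etc.) deployment needs more than this (networking, seeds, storage).

```python
import json


def unique_host_app(instances: int) -> dict:
    """Build a Marathon app definition that places each instance on a different agent."""
    return {
        "id": "/cassandra-node",  # hypothetical app id
        "instances": instances,
        "cpus": 2,       # placeholder sizing
        "mem": 4096,     # placeholder sizing (MiB)
        "container": {
            "type": "DOCKER",
            "docker": {"image": "cassandra:3"},  # placeholder image tag
        },
        # Marathon constraint: never run two instances on the same hostname.
        "constraints": [["hostname", "UNIQUE"]],
    }


if __name__ == "__main__":
    print(json.dumps(unique_host_app(3), indent=2))
```

Keep in mind that with `UNIQUE` you can't scale past the number of agents in the cluster; Marathon will simply leave the extra instances waiting.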
For services that require some sort of underlying storage, you have a couple of options. If you're in an enterprise environment where you can get an NFS export from a SAN (or some other storage cluster), your best bet might be to make sure the export is mounted at the same path on every machine in the cluster and use it as a sort of shared storage, mounting a directory from the NFS export into the container. As a matter of fact, this is what we (Mesosphere) currently suggest for our customers running Jenkins on DC/OS.
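
To make the NFS approach a bit more concrete, here's a rough sketch of a Marathon app definition that bind-mounts a directory from the export into a Docker container. The `volumes` fields (`containerPath`, `hostPath`, `mode`) are Marathon's; the host path `/mnt/nfs/jenkins` and the sizing are assumptions for illustration, and this is not necessarily how the DC/OS Jenkins package wires it up internally.

```python
import json


def nfs_backed_app(nfs_host_path: str) -> dict:
    """Build a Marathon app definition that mounts an NFS-backed host directory."""
    return {
        "id": "/jenkins",  # hypothetical app id
        "instances": 1,
        "cpus": 1,      # placeholder sizing
        "mem": 2048,    # placeholder sizing (MiB)
        "container": {
            "type": "DOCKER",
            "docker": {"image": "jenkins/jenkins:lts"},
            "volumes": [
                {
                    # Where the data appears inside the container.
                    "containerPath": "/var/jenkins_home",
                    # Directory on the agent; must exist on EVERY agent,
                    # i.e. the NFS export is mounted at the same path everywhere.
                    "hostPath": nfs_host_path,
                    "mode": "RW",
                }
            ],
        },
    }


if __name__ == "__main__":
    # /mnt/nfs/jenkins is an assumed mount point, not something from your setup.
    print(json.dumps(nfs_backed_app("/mnt/nfs/jenkins"), indent=2))
```

Because the same path is available on every agent, it doesn't matter where Marathon reschedules the task after a failure; it comes back up with the same data.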
If that's not an option, or you're running in AWS, you might take a look at their EFS offering, which essentially does the same thing. The team over at EMC has also produced the open source tool REX-Ray (http://rexray.readthedocs.io/en/stable/), which is *fantastic* for provisioning a storage volume and attaching it to a running instance. If the instance is restarted, the volume is simply reattached before the instance comes back up, in a fully automated way; less work on your part!
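
If you go the REX-Ray route with Marathon, one way to hook it up is Marathon's external volume support, which talks to REX-Ray through the `dvdi` provider. The sketch below assumes Marathon has external volumes enabled and that REX-Ray and `dvdcli` are installed on the agents; the app id, command, volume name, and size are made up for illustration.

```python
import json


def rexray_volume_app(volume_name: str, size_gib: int) -> dict:
    """Build a Marathon app definition that uses a REX-Ray external volume."""
    return {
        "id": "/stateful-service",  # hypothetical app id
        "instances": 1,             # an external volume attaches to one task at a time
        "cpus": 0.5,
        "mem": 256,
        "cmd": "date >> data/log.txt && sleep 3600",  # placeholder workload
        "container": {
            "type": "MESOS",
            "volumes": [
                {
                    "containerPath": "data",  # relative to the task sandbox
                    "external": {
                        "name": volume_name,  # created on first use if missing
                        "size": size_gib,
                        "provider": "dvdi",
                        "options": {"dvdi/driver": "rexray"},
                    },
                    "mode": "RW",
                }
            ],
        },
        # Never run the old and new task side by side during a deploy,
        # since both would want the same volume.
        "upgradeStrategy": {"minimumHealthCapacity": 0, "maximumOverCapacity": 0},
    }


if __name__ == "__main__":
    print(json.dumps(rexray_volume_app("my-test-vol", 10), indent=2))
```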