Docker in Action: Performance and Monitoring

 
Ranch Hand
Posts: 112
Ian and Aidan,

Does your book cover performance tuning for, or other performance considerations of, Docker containers? I see production concerns covered in chapters 11 and 12, and I am pleased to see a chapter on security. I have used the "docker stats" command to watch performance from the CLI and in Shipyard, and I have used memory configuration properties in docker-compose.yml files. Do you go in depth on production performance tuning?
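For reference, the CLI-level monitoring and memory capping described above look roughly like this (the container name, image, and limits are illustrative, and both commands assume a running Docker daemon):

```shell
# One snapshot of live CPU/memory/network/block-I/O counters per container
docker stats --no-stream

# Cap memory at container start; Compose's mem_limit maps to the same knob
docker run -d --name myapp -m 512m --memory-swap 1g myimage
```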
 
Aidan Hobson Sayers
Author
Posts: 6

Kent O. Johnson wrote:Ian and Aidan,

Does your book cover performance tuning for, or other performance considerations of, Docker containers? I see production concerns covered in chapters 11 and 12, and I am pleased to see a chapter on security. I have used the "docker stats" command to watch performance from the CLI and in Shipyard, and I have used memory configuration properties in docker-compose.yml files. Do you go in depth on production performance tuning?



Hi Kent

Technique 91 (chapter 11) covers cAdvisor, a tool we felt would get people started with container monitoring. A few other techniques in chapter 11 deal with resource control, if you want to keep your containers in check. Technique 97 (chapter 12) talks about opting out of parts of Docker containerisation when you want to harness the full power of your machine: the most common examples are volumes, which are covered a fair amount throughout the book as well as there, and --net=host, which bypasses whatever method Docker is currently using to proxy network traffic.
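Those opt-outs look like this at the CLI (image and paths are placeholders, and both commands assume a running Docker daemon):

```shell
# Share the host's network namespace: no docker-proxy or NAT in the path,
# the container binds host ports directly
docker run -d --net=host myimage

# Mount a host directory as a volume so heavy I/O bypasses the storage driver
docker run -d -v /srv/appdata:/var/lib/app myimage
```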

Drilling down into the performance of Docker itself is tricky, because it is changing so rapidly. For example, I recall a regression in some version after 1.3 where having more than 100 containers would bring Docker performance to a halt; they fixed it very quickly, of course! Note that the performance of the Docker daemon itself (usually!) doesn't affect individual containers, so needing to tune the daemon is rare. For individual applications, it's likely best to use whatever tools are most appropriate for your application, injecting them into your container with nsenter.
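Injecting a host-installed tool with nsenter can be sketched like this (the container name is a placeholder; assumes a running daemon and root on the host):

```shell
# Find the container's init PID, then enter its namespaces and run a
# host tool inside them (here, one batch-mode snapshot from top)
PID=$(docker inspect --format '{{.State.Pid}}' mycontainer)
sudo nsenter --target "$PID" --mount --uts --ipc --net --pid -- top -b -n 1
```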

One thing that is missing from the book is a discussion of the different storage drivers and how they can impact the I/O performance of your application; a skeleton outline of this technique is actually sitting on my computer! However, it would end up being a look at the details of some Docker internals, which is perhaps not of huge interest to most readers. It's somewhat uncommon to do a lot of I/O in a container (a database being one of the few obvious examples), and the pain that can be caused by the storage drivers is why I tend to recommend using volumes for your database 'dbspace' (or 'tablespace') to get reliable performance.
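That volume recommendation, using the official postgres image as an example, might look like this (the host path is a placeholder):

```shell
# Keep the data directory on a host volume so database I/O goes straight
# to the host filesystem instead of through the storage driver
docker run -d --name pg -v /srv/pgdata:/var/lib/postgresql/data postgres
```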

So I suppose the short answer to your question is "no", mainly because the precise details of tuning depend on what you're doing in the container! Let me know if you have any specific questions and I'll see if I can answer directly or whether there's anything relevant in the book.

Aidan
 
Kent Bull
Ranch Hand
Posts: 112
Regarding database I/O performance, I had an issue where a PostgreSQL database container using volumes from a data-only container, all inside a VM, gave very unpredictable and erratic I/O behavior, which I suspect was because of multiple layers of swapping between the physical machine and the virtual machine. Since I was using volumes, I am thinking the problem was not a container problem at all, just the configuration of the I/O driver for VMware, the hypervisor I used.

Switching everything to physical machines outside of containers made things much faster and more consistent. I considered trying containers on the physical machine, though I hadn't gotten to that yet. I am glad you mentioned the different storage drivers and that they can affect performance, though it sounds like they are only relevant for in-container I/O. Is that right? If I wanted to run a PostgreSQL container, or any database container, without mounting volumes, how would I go about achieving performance similar to running the DBMS on bare metal?

Using databases inside of containers seems to complicate backups unnecessarily, since there is no SSHing into a container, though I could see an automated backup running to a directory mounted as a volume to solve that problem. Would you solve the backup use case like that?

Regarding network I/O, do you find that the virtual network device layer used by the Docker Engine adds enough overhead to warrant bypassing it altogether with the "--net=host" option? How much of a performance gain have you seen it give? This is the first time I have seen that option, and I would like to learn more about it and where it makes sense.

Kent
 
Tim Holloway
Saloon Keeper
Posts: 27764
Well, "I am thinking" isn't really what you want. It's "I have measured".

One thing a long and evil career in IT has taught me is that the bottlenecks are almost never where you "know" they're going to be.

You can "ssh" into Docker containers, even if the container doesn't have sshd running and a port mapped. Use a command like this: "docker exec -it mycontainer /bin/sh".
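The same mechanism covers the backup case raised above; a sketch, with the container, database, and paths as placeholders:

```shell
# Interactive shell in a running container, no sshd required
docker exec -it mycontainer /bin/sh

# Non-interactive: run pg_dump inside the container, capture it on the host
docker exec pg pg_dump -U postgres mydb > /backups/mydb.sql
```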

My primary databases are standalone because A) they predate Docker itself, B) they are used for more than just Docker container links, and C) having a few central databases makes it a lot easier to back up and maintain them. However, certain specialized systems do use linked container databases, and I haven't had any complaints with them.

The main thing I do like to do, however, is keep my data files (and that includes database data directories) in mounted volumes rather than within the images themselves. It makes it easier to bounce data around the SAN when moving containers, makes it easier to back up, and I don't have to worry about losing data changes if the container crashes and someone brings things back up from the original image.
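That pattern, sketched with a named volume (names are placeholders; assumes a running daemon):

```shell
# Data lives in the volume, not the image: the container can crash or be
# recreated from the image and the data directory survives
docker volume create appdata
docker run -d --name db -v appdata:/var/lib/postgresql/data postgres
```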
 
Kent Bull
Ranch Hand
Posts: 112
Yes, Tim, you are right: I want to measure it rather than go on a suspicion or a hunch. I don't yet know how to do detailed performance measurement of computer software properly, specifically for I/O, and I would like to learn. Though I wanted to dig in deep and learn how to measure things, my ops team members wanted to just try things first. I am not satisfied that we have found the root of the problem, or even know exactly what the problem was. It may have been our container configuration, or lack thereof, or something in VMware. We really don't know for sure, which I am sorry to say. I personally would like more certainty as to the source of the problem.

If I knew how to check and measure performance, I could ask better questions of my ops team members instead of having a discussion about an ambiguous performance villain of "it goes slow" or "it is inconsistent." I will be doing some Google searching and going through some training courses on Pluralsight, like Nigel Poulton's RHEL Storage Fundamentals course and his CompTIA Storage+ course set. Do you have any recommendations for resources on how to find bottlenecks in IT systems? The school of hard knocks is definitely a place I can learn, though I'd like to be prepared so I can be proactive about designing software and infrastructure architectures for performance.

Kent
 
Tim Holloway
Saloon Keeper
Posts: 27764
In that case, I invite you to visit our Performance forum here at the Ranch! You'll find lots of help there.

Of course, the first thing they'll tell you is not to design for performance. That's because it's easier to optimize a clean app than to re-optimize one with odd optimizations already in it.

There are plenty of performance tools available. On Linux/Unix, there are utilities like top and iostat and various resources under the /proc filesystem. On Windows, performance tools are mostly third-party items, but they can be a big help. Even if half the time they say that the resource hog is "rundll32" or one of its more recent relatives.
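On Linux, a first pass with those utilities might look like this (iostat comes from the sysstat package, so it may need installing):

```shell
top -b -n 1 | head -n 20          # one batch-mode snapshot: CPU and memory hogs
iostat -x 2 3                     # extended per-device stats, 3 samples 2s apart
grep MemAvailable /proc/meminfo   # raw counters under /proc
```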
 
Aidan Hobson Sayers
Author
Posts: 6

Kent O. Johnson wrote:Regarding database I/O performance, I had an issue where a PostgreSQL database container using volumes from a data-only container, all inside a VM, gave very unpredictable and erratic I/O behavior, which I suspect was because of multiple layers of swapping between the physical machine and the virtual machine. Since I was using volumes, I am thinking the problem was not a container problem at all, just the configuration of the I/O driver for VMware, the hypervisor I used.

Switching everything to physical machines outside of containers made things much faster and more consistent. I considered trying containers on the physical machine, though I hadn't gotten to that yet. I am glad you mentioned the different storage drivers and that they can affect performance, though it sounds like they are only relevant for in-container I/O. Is that right? If I wanted to run a PostgreSQL container, or any database container, without mounting volumes, how would I go about achieving performance similar to running the DBMS on bare metal?

Using databases inside of containers seems to complicate backups unnecessarily, since there is no SSHing into a container, though I could see an automated backup running to a directory mounted as a volume to solve that problem. Would you solve the backup use case like that?

Regarding network I/O, do you find that the virtual network device layer used by the Docker Engine adds enough overhead to warrant bypassing it altogether with the "--net=host" option? How much of a performance gain have you seen it give? This is the first time I have seen that option, and I would like to learn more about it and where it makes sense.

Kent



Hi Kent

As far as I've seen, databases (large binary files, frequently changed) are about the worst thing you can have inside a container. Aside from the storage-driver issues, Docker layering will also do very poorly; for example, if you run your schema in a new layer on top of a large dbspace, the whole dbspace will be copied. I generally avoid doing I/O-heavy things inside a container if possible.

Storage drivers are a fundamental part of Docker, but generally you can get by without knowing anything about them. However, knowing how they work gives you some insight into edge cases (like databases). I'll briefly note how each one works below:
  • aufs (https://docs.docker.com/engine/userguide/storagedriver/aufs-driver/) - each layer of an image consists of a set of files, and looking at the whole filesystem of a running container works by looking at the files of all layers for the container image (where files in layers higher up 'hide' files from layers lower down). When a container wants to write to a file, the file is copied from the highest layer with that file and used as the container copy.
  • overlayfs (https://docs.docker.com/engine/userguide/storagedriver/overlayfs-driver/) - basically the same as aufs
  • devicemapper (https://docs.docker.com/engine/userguide/storagedriver/device-mapper-driver/) - layers are a bunch of pointers to the actual block contents of files, and file contents are stored in a big pool shared between layers. When a container wants to write to a file, it just needs to copy the appropriate block of the file and alter one of the block pointers.
  • btrfs (https://docs.docker.com/engine/userguide/storagedriver/btrfs-driver/) - approximately the same idea as devicemapper
  • zfs (https://docs.docker.com/engine/userguide/storagedriver/zfs-driver/) - again, approximately the same idea as devicemapper
  • vfs - a 'fake' layer driver that copies the entire contents of the parent layer on creation
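To see which of these drivers a given daemon is actually using (assumes a running daemon):

```shell
# Prints e.g. "aufs" or "devicemapper"; the same value appears as
# "Storage Driver:" in the plain `docker info` output
docker info --format '{{.Driver}}'
```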


We can make some observations just on the designs outlined above:
  • aufs/overlayfs 'copy-file-on-write' means the first time you try and write to a large file (e.g. a database) it will copy the whole file, which will be slow, but once done should give fast access because you have your own copy
  • devicemapper/btrfs/zfs 'copy-block-on-write' means there isn't a huge penalty the first time you write to a file, so you can get started quicker and the disk space reuse is better, but it may not be as fast as aufs or native access
  • vfs is astoundingly slow to start up and shares no disk space between layers...but should be as fast as raw filesystem access once created

My experience is mainly with aufs and devicemapper, so you should definitely (at minimum) benchmark the others before accepting my claims. It doesn't hurt to benchmark them all! You can also look at the table at the bottom of https://docs.docker.com/engine/userguide/storagedriver/selectadriver/ for what Docker Inc suggests are the important points about each driver.
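As a starting point for such a benchmark, here is a crude sequential-write probe (GNU dd assumed); run the same function on the host and inside containers under each driver, and compare the summary lines:

```shell
# Time a 32 MiB synced write; conv=fdatasync forces the data to disk so
# the figure reflects the storage path, not the page cache
probe_write() {
  f=$(mktemp)
  dd if=/dev/zero of="$f" bs=1M count=32 conv=fdatasync 2>&1 | tail -n 1
  rm -f "$f"
}
probe_write
```

This measures only sequential writes; a real comparison would also probe random I/O (e.g. with fio), but it is enough to expose gross differences between drivers.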

Unfortunately, there's more to consider than just performance. For example:
  • AUFS is considered a highly stable driver by Docker Inc, but with a particularly unusual I/O-heavy application I was using 6 months ago, I would see weekly kernel panics bringing down my system!
  • Devicemapper is currently the only commercially supported driver on CentOS/Red Hat
  • BTRFS/overlayfs/devicemapper (partition)/zfs can require special setup, which may be nontrivial

Honestly, I look at heavy I/O in containers with great scepticism, just because I've lived through a lot of pain with unstable drivers. Of course, things are always improving! But volumes (and possibly volume drivers) feel like the most reliable approach to me for now.
     
Kent Bull
Ranch Hand
Posts: 112

Aidan Hobson Sayers wrote:I'll briefly note how each one works below:

Thanks for the in-depth answer, Aidan. I appreciate the perspective on each of the storage drivers.