Zombie containers -> Skip This Build?

Yet Another Update: A cure is found… https://freddysblog.com/2019/09/05/a-cure-for-zombie-containers/

Another Update: Docker Engine 19.03.2 (and Docker Desktop 2.1.0.2) should be released in the first week of September with a cure for Zombie containers.

Update: A way to bring your Zombie container back to life is in the comments by Mick Carr (THANKS MICK). I just tried it, and it works on my computer with 2.1.0.1 as well. Basically: stop Docker, modify the container’s config.v2.json to change Running from true to false, and start Docker again. Now your container is dead (not living dead). Use docker start to start the container and it comes back to life…

Over the last few days I have experienced strange behavior with the latest version of Docker Desktop Community edition (2.1.0.0 (36874), released July 31, 2019) on my Windows 10 1903 machine. Thinking this was a problem with my machine, I decided to postpone the investigation while working on other issues. Yesterday two partners contacted me with the same behavior, so it was going to be a long night…

The problem

It is very simple to reproduce the problem. If you restart Windows while your NAV or Business Central containers are running, they will still report as running after the restart (as they should, due to the restart setting), but the containers will be like zombies.

  • docker ps will say they are running
  • docker exec -it container cmd will say “no such container”
  • docker stop container will freeze and not be able to stop the container
  • docker rm container will freeze and if you break this and retry it will say that it is being removed.
  • Trying to ping the container, or contact it in any other way, fails

A living dead container – a zombie.
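The symptoms above suggest a simple check: a container that docker ps lists as running but that docker exec cannot reach is probably a zombie. Here is a minimal Python sketch of that idea. The function names and the cmd /c probe are my own assumptions (the probe presumes a Windows container), and a command that hangs is simply counted as a failure after a timeout.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes the docker arguments and returns (exit_code, stdout).
Runner = Callable[[List[str]], Tuple[int, str]]


def run_docker(args: List[str], timeout: float = 30.0) -> Tuple[int, str]:
    """Run `docker <args>`; a command that hangs is reported as a failure."""
    try:
        proc = subprocess.run(
            ["docker", *args], capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return 1, ""
    return proc.returncode, proc.stdout


def find_zombies(run: Runner = run_docker) -> List[str]:
    """Return names of containers that `docker ps` lists but `docker exec` cannot reach."""
    code, out = run(["ps", "--format", "{{.Names}}"])
    if code != 0:
        raise RuntimeError("docker ps failed")
    zombies = []
    for name in out.split():
        # Assumption: a Windows container, so `cmd /c exit 0` is a harmless probe.
        code, _ = run(["exec", name, "cmd", "/c", "exit", "0"])
        if code != 0:
            zombies.append(name)
    return zombies

# Usage (needs a working Docker CLI on the PATH):
#   print(find_zombies())
```

The runner is passed in as a parameter so the logic can be tried out without Docker installed. A failing docker exec can have other causes than a zombie, so treat the result as a hint, not a diagnosis.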

I can reproduce this using either New-NavContainer or docker run; there is no difference. I am in contact with the team at Microsoft, who are working with Docker on this issue.

I don’t think this is related to NAV or Business Central containers. It also doesn’t matter whether it is running Hyper-V or Process isolation nor does it make any difference if I have updatehosts, volume shares or any other things enabled.

The only thing that worked for me, was to uninstall Docker and reinstall this version: Docker Community Edition 2.0.0.3 2019-02-15, which you can find and download here: https://docs.docker.com/docker-for-windows/release-notes/

I will continue working with the team in Microsoft and with Docker to find a solution and see whether we need to change anything in our containers, but for now, my recommendation is to Skip This Build.

Note that when you uninstall, it will remove all containers and images. Since I haven’t found any way to bring the Zombie containers back to life, I guess this is the only thing to do.

When you start this version (or if you haven’t upgraded yet) you will be met with this dialog:

[Image: skipthisversion]

On this dialog, I pressed Skip This Build.

One more “workaround”

The only other workaround I found is to stop the containers before restarting Windows and start them manually afterwards. This led me to believe that I could set the restart option to no, but that doesn’t work either: the new Docker version still reports the container as running after a Windows restart.
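The stop-before-restart workaround can be scripted. Below is a minimal Python sketch, not something I have shipped: it records the IDs of the running containers in a file, stops them, and starts the same containers again after the reboot. The function names and the state file are hypothetical; the docker commands used (ps -q, stop, start) are the standard ones.

```python
import json
import subprocess
from pathlib import Path
from typing import Callable, List


def docker(args: List[str]) -> str:
    """Run `docker <args>` and return stdout; raises if the command fails."""
    proc = subprocess.run(
        ["docker", *args], capture_output=True, text=True, check=True
    )
    return proc.stdout


def stop_running(state_file: Path, run: Callable[[List[str]], str] = docker) -> List[str]:
    """Record the IDs of all running containers, then stop them."""
    ids = run(["ps", "-q"]).split()
    # Save the list first, so it survives even if `docker stop` fails halfway.
    state_file.write_text(json.dumps(ids))
    if ids:
        run(["stop", *ids])
    return ids


def start_saved(state_file: Path, run: Callable[[List[str]], str] = docker) -> List[str]:
    """Start the containers recorded by stop_running."""
    ids = json.loads(state_file.read_text())
    if ids:
        run(["start", *ids])
    return ids

# Usage, with a hypothetical state file location:
#   stop_running(Path("stopped-containers.json"))   # before restarting Windows
#   start_saved(Path("stopped-containers.json"))    # after the restart
```

Stopping the containers has to happen before Windows shuts down, so this only helps for planned restarts.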

What’s next

I will update this blog post as soon as I know more – for now, I just wanted to make sure that our partners aren’t wasting valuable time troubleshooting a problem, which there might be no solution to.

If you read this and have information on this subject, which you want to share, please create an issue here: https://github.com/microsoft/navcontainerhelper/issues or email me with your information.

Thanks

 

Freddy Kristiansen
Technical Evangelist

 

15 thoughts on “Zombie containers -> Skip This Build?”

      • Hi Freddy,

        I had something similar to this a while ago.

        Stop the Docker daemon and browse to the Docker folder where your containers are stored. For me this was C:\ProgramData\Docker\containers. You will need the ID of your container, which you can get from the message that was displayed or by doing ‘docker inspect ’. If you open the config.v2.json (the app needs admin rights to edit it) you’ll see that there are a few properties around the container’s state…

        "State": {
          "Running": true,
          "Paused": false,
          "Restarting": false,
          "OOMKilled": false,
          "RemovalInProgress": false,
          "Dead": false

        If you change the ‘Running’ state from true to false and then start the Docker daemon, with some luck you should be able to get your container back. At least this worked for me. There are a few scenarios where I’ve had to do this for my containers, such as when the network fails. Hopefully this will help! I did have a container stuck with “RemovalInProgress: true”, which I also got around by just resetting this value so that I could remove it.
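        The edit described in this comment can also be scripted. The following is a minimal Python sketch under a few assumptions: the Docker daemon is already stopped, the script runs with admin rights, and the file has the "State" / "Running" layout shown above. The function name and the .bak backup are my own additions, not part of Docker.

```python
import json
import shutil
from pathlib import Path


def mark_not_running(config_path: Path) -> bool:
    """Set State.Running to false in a config.v2.json; return True if the file changed."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    state = config.get("State", {})
    if not state.get("Running"):
        return False  # already marked as stopped (or no State section), nothing to do
    # Keep a copy next to the original in case the edit needs to be undone.
    shutil.copy2(config_path, config_path.with_name(config_path.name + ".bak"))
    state["Running"] = False
    config_path.write_text(json.dumps(config), encoding="utf-8")
    return True

# Usage, with the folder from the comment and a placeholder container ID:
#   container_id = "..."  # the full container ID
#   mark_not_running(
#       Path(r"C:\ProgramData\Docker\containers") / container_id / "config.v2.json"
#   )
```

        After this, start the Docker daemon again and use docker start on the container.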

  1. What I do is make sure Docker Desktop is not started automatically when Windows starts. Most of the time the containers start fine after this. Sometimes, with the latest Docker build 2.1.0.0 (36873), I have to start the D365 BC container manually.

    The container is installed using: New-NavContainer -accept_eula -auth NavUserPassword -updateHosts -assignPremiumPlan -containerName mybcsandbox -imageName mcr.microsoft.com/businesscentral/sandbox:nl -additionalParameters @('--cpu-count 4')

    Only after a bigger Windows Update do I most likely have to install the images again, by resetting Docker to factory defaults and running the PowerShell script again.

    Hope it helps:-)

  2. I can confirm that after a Windows Update (Cumulative Update for Windows 10 Version Next (10.18362.10013) (KB4508451)), the D365 BC container stopped working.
    After using the tip from Mick Carr (Running was indeed set to true), restarting the Docker daemon, and manually starting the container using Kitematic, everything came back alive.

    • My workaround has been to cleanly shut down the Docker container in PowerShell before I shut down the host. Then I have to start the container after a clean boot, but at least it doesn’t get itself into a zombie state.

  3. I am so glad for Freddy’s blog on containers 🙂 Glad to know this problem isn’t just me!

    Not only is this “zombie container” issue a problem – even installing Docker is becoming more and more problematic.

  4. Thanks so much for the fix (Mick Carr’s fix worked for me)! Docker has become such an essential tool for the NAV/BC developer that it is a bummer when it is not working. Plus I loved the title of this thread, as I am a huge fan of zombie movies!
