This episode discusses proper container start-up and shut-down. If you have more than one container to spin up, especially in your local environment, then you’re probably using Docker Compose. It’s a complicated problem because there can be multiple dependencies (and each dependency can have dependencies). Chris Hickman and Jon Christensen of Kelsus and Rich Staats from Secret Stache discuss this issue in the continuation of their micro series of episodes focused on Docker tips and tricks.
Some of the highlights of the show include:
- The database needs to be up and running before the microservice; manage your containers and make sure you’re starting them up dependably
- The Compose file can have settings that hint at start order, but a started container may not be ready to service calls – you need synchronization
- Code the polling of dependency readiness into your application code or a startup script
- The orchestrator can help with sequencing and handle some tasks, such as retrying until a container succeeds
- A container should be aware of its dependencies and either wait for them or identify what’s missing
- In a production environment, it’s maybe not a good idea to have application code inside a container waiting for a dependency to be available
- When Docker wants to stop a container, it sends the container a SIGTERM signal and waits for it to exit; handling the signal lets you clean up and exit quickly instead of hanging until the hard kill
Links and Resources:
Rich: In Episode 25 of Mobycast, we continue our micro-series on Docker Tips and Tricks. In particular, we discuss how to start up containers dependably and shut them down gracefully. Welcome to Mobycast, a weekly conversation about containerization, Docker and modern software deployment. Let’s jump right in.
Jon: Welcome, Chris and Rich. It’s another episode of Mobycast.
Rich: Hey, guys.
Jon: What have you been up to this week, Rich, besides producing another episode of Mobycast?
Rich: Yeah, fighting with Facebook ads and trying to figure out how to get some intelligence from what we’re seeing on a project that you and I worked on together. If you’ve come to this podcast through a Facebook ad, those ads have been working very well for us; we have those under control and they’re pretty predictable. But for another project, where we’re working on Facebook ads that lead to app installs, it’s inconsistent, the data conflicts with itself, and we’ve been really struggling to wrap our heads around it. So, a lot of time with that, plus a bunch of different WordPress projects that have been going through.
With all that said, we reached 10,000 listens, 10,000 downloads with Episode 24 so I thought I would say thank you to those who are actually listening to us. Although we’ve had 10,000 listens, we’ve had 0 reviews. With that milestone behind us, our next milestone will be the first review. If you’ve liked what you’ve been hearing, […] later and give us a five-star review. If you don’t think we’re doing a good job, find Jon on Twitter and reach out to him there.
Jon: Yeah, please do. That’s so funny that you said that, Rich, because I was thinking I was going to say the same thing. We’d love to hear from you if you’re listening. Chris, what have you been up to?
Chris: I am trying to hide from all the smoke that we have here in the Seattle area. The wildfires have been pretty bad in Canada and the other areas around us, and the smoke has just been sitting over Seattle, so we kind of look like we’re on Tatooine, with a red sun and a red moon at night. There looks to be some relief in sight, though, and I’ll be able to get back on my bike, so I’m looking forward to that.
Jon: That would be nice. Eagle, Colorado has been that way all summer, and I recently saw–I think it was maybe on NPR–that large parts of the west have had worse air quality than Beijing throughout the summer. I’ve read about bad air quality in Beijing and I’ve talked to people who say, “Oh, I wouldn’t want to travel to Beijing with my kids because the air quality is so bad there,” and it just kind of hits you in the face when you’re raising kids and you find out, “Oh, actually, the air quality right here is as bad as it gets.” It’s pretty hard to deal with this climate change, these fires and this awful air.
Jon: It’s really, really depressing.
Chris: I’m scraping ash off the cars.
Jon: Yeah, exactly. I mean, literally weeks, and weeks, and weeks of horrible air. Hopefully, it gets better. We’ve actually had a ton of rain here in the past couple of days so I think we’re on that swing too.
Jon: More fun things to think about: Docker and AWS. Let’s continue where we left off last week. Last week, we were doing a recap of a Tips and Tricks talk. Who was that by?
Chris: By Adrian Mouat, who’s a chief scientist at Container Solutions. He had a breakout session called Tips and Tricks of the Docker Captains. Adrian’s in the Docker Captain program, so he brainstormed with the rest of the folks in that program to come up with the common gotchas that people run into, and presented those in a session.
Jon: Cool, and we talked also last week about how this session is really just Tips and Tricks and there wasn’t an overarching theme other than, “Here’s great things for you to know about,” and so each tip and trick is not necessarily related to the next. We can jump right in. The last one we talked about was, “Beware of Latest.” We don’t need to talk about it again. It’s on the last episode so we can jump right into the next one. What’s the next one?
Chris: The next tip is: start up dependably. This is definitely a common situation when you have more than one container that needs to spin up, especially in your local environment. You’re probably using Docker Compose, you have multiple services defined in your Docker Compose file, and there are very real dependencies there. The typical example would be a microservice implementing an API that uses a database as a backing store. Obviously, the database needs to be up and running before the microservice is, so you need to manage all that and make sure you’re starting up dependably. So what are the gotchas, and how do you do that?
Chris: Again, this is a pretty typical problem that folks have, especially right out of the gate. In your Compose file, you can have settings that give hints to the Docker Compose tool itself about how things should be started, but that doesn’t mean it’s actually right. Just because a container has started doesn’t mean it’s actually ready to service calls. There’s a whole synchronization problem you have to be aware of. One of the best ways of dealing with it is to just code the dependency handling into the services themselves: you can either do that in your application code or, if you don’t do it there, in your startup script.
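For reference, the Compose-file “hints” Chris mentions are the `depends_on` setting; with the long syntax you can even require a passing healthcheck rather than a mere start. A minimal sketch (the service names and Postgres image are illustrative, not from the episode):

```yaml
# docker-compose.yml (illustrative)
services:
  api:
    build: .
    depends_on:
      db:
        condition: service_healthy   # wait for db's healthcheck to pass, not just for it to start
  db:
    image: postgres:15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 2s
      timeout: 3s
      retries: 10
```

Even with `service_healthy`, as Chris notes, the healthcheck is only as good as what it probes, so application-level readiness checks are still worth having.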
Again, in this example of a microservice with a backend database, you may go ahead and start the database service and then, as part of your microservice startup code, query for the database connection and check whether it’s ready. If it’s not ready, it sleeps and waits a little bit, then checks again. It’s polling, waiting for that service to come up and be alive. Once it is alive, it can proceed. You can do that kind of polling for readiness either in your startup script or in the application code itself, depending on what makes the most sense for you. Doing that is going to give a much more dependable operation of your code, and it becomes even more important the more dependencies you have.
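The polling approach Chris describes can be sketched in a few lines of Python. This is a minimal sketch, assuming the dependency is reachable over TCP; the host and port in the usage comment are placeholders:

```python
import socket
import time

def wait_for(host: str, port: int, retries: int = 30, delay: float = 1.0) -> bool:
    """Poll until a TCP service accepts connections, or give up after `retries` attempts."""
    for _ in range(retries):
        try:
            with socket.create_connection((host, port), timeout=2):
                return True          # dependency is accepting connections
        except OSError:
            time.sleep(delay)        # not ready yet; sleep, then poll again
    return False

# At startup (hypothetical service name/port):
# if not wait_for("db", 5432):
#     raise SystemExit("database never became ready")
```

The same logic works equally well as a shell loop in an entrypoint script; the point is that the dependent service owns the wait, rather than assuming the dependency is up.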
Jon: This is causing me to have some thoughts–and I realize you said this might be mostly in your local environment–because what I’m wondering is: can’t we use our orchestrator to do some of this for us, and wouldn’t it be better not to have application code sleeping if it’s not really working? What I’m thinking is, you have a container and it depends on a database. It seems like, in a perfect world, the container would start up and be like, “Oh, my database isn’t there. I’m going away. I’m done. Don’t worry about me. I shouldn’t be alive,” and the orchestrator should be the one retrying until it can succeed, as opposed to the application code sitting there in a running container that’s doing nothing and not useful, waiting and waiting for a database to come alive. What do you think?
Chris: It’s definitely pretty complicated, because what is your orchestrator? In your local environment, Docker Compose is your orchestrator. In this particular case, it could be something else; you could very well be running Kubernetes or something like that locally as well and use that as your orchestrator, and there are other options out there, too. But for the most part, if you’re a developer writing code, you’re probably going to be using Docker Compose, and that has only some basic functionality.
Again, the whole dependencies thing–how do you actually describe those dependencies? Then you get into the concept of shallow checks versus deep checks on health. It’s one thing to say something has started up, and you may have a shallow health check for that, but is it really ready? Maybe it has dependencies of its own that it has to wait on to be ready, and then you need a deep check. How do you describe all of this to an orchestrator? That’s a lot of complexity, and no one knows that complexity better than you, so it kind of gets pushed off the orchestrator and back onto you. The orchestrator can help out a little with, “Hey, this is the sequence I want things to go in,” but as far as really guaranteeing these things are up, ready, and responding to requests correctly–making sure the system as a whole is running correctly–that’s more your job.
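The shallow-versus-deep distinction Chris draws can be pictured as two check functions: a shallow one that only proves the process is up, and a deep one that also exercises each dependency. A hypothetical sketch in Python (the dependency-check callables are stand-ins, not a real API):

```python
def shallow_health() -> dict:
    # Process is alive and serving; says nothing about dependencies.
    return {"status": "ok"}

def deep_health(check_db, check_cache) -> dict:
    # Also exercise each dependency; report ready only if all of them answer.
    deps = {"db": check_db(), "cache": check_cache()}
    ready = all(deps.values())
    return {"status": "ok" if ready else "degraded", "dependencies": deps}
```

In practice these map onto two HTTP endpoints–a liveness check and a readiness check–so the orchestrator can restart a dead process without thrashing a service that is merely waiting on a dependency.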
Jon: I really think this is worth talking about a bit more. One of the things about Docker is that it’s supposed to let you make containers that you can run anywhere. You run them in production, you run them in staging, you run them on your local machine, and, hopefully, they’re not really different in those different environments; the container is happy to be run in different contexts. Having said that, it feels like, on one hand, yes, the container itself and its application code should be aware of its dependencies.
It should be able to say, “Hey, my dependencies aren’t here. That’s wrong. Something’s not okay.” Maybe the container says, “Okay, I’m going to wait for my dependency and give it some time to come up,” or maybe the other thing the container could do is say, “My dependencies aren’t where they’re supposed to be, so I’m going to tell you, as part of my graceful shutdown, what I’m missing and why I’m not happy.” At least you get some feedback.
The reason I feel like the orchestrator should be part of this is that orchestrators drive toward an optimal state. They’re like, “We want at least five containers running but not more than ten.” That’s how you do things like zero-downtime upgrades.
Zero-downtime upgrades require orchestrators to be smart about shutting down containers, spinning them up and waiting until they’re happy. Then, if there’s ever a container that fails, the orchestrator is like, “Yup, let me just take care of that,” and puts a new one in. Going back to what we were talking about, I agree that the orchestrator shouldn’t enforce a dependency order–it’s not really going to be in charge of making sure this one starts and then that one starts–but it may be okay to rely on the orchestrator to keep retrying something if it’s not happy.
If a container is like, “I’m not happy, my dependency is not there,” then it seems to me that the fact that Docker Compose is not really a full orchestrator is kind of problematic, because other orchestrators like ECS or Kubernetes are good enough to say, “I’m going to retry several times until I get a happy container.” A happy container is one that says, “Yup, I’ve got my dependencies and I’m running,” as opposed to, “I don’t have my dependencies but I’m running, so I might seem happy to the orchestrator even though I’m not actually.” That was a long description, but I’m trying to sum up something.
Chris: Again, it’s a super-complicated problem just because there could be multiple dependencies, and you could have dependencies of dependencies. You can think of it as a graph, and you can even have loops. Something like ECS or even Kubernetes is not looking at the graph; it’s looking at, “Okay, here’s a container, I’m spinning it up.” Kubernetes has the idea of health checks, so you can have health checks on your services; it can ping them, say, “Oh, this container’s not running,” and just go ahead and restart it. But again, if it’s not starting because some other thing’s not ready, and that other thing might be waiting on something else, it could just be thrashing.
Jon: Right, especially if that other thing is outside of your control plane, so it’s not going to get started. Nevertheless, I think the thing that really caused me to pause, the thing that made me feel like I don’t know if I agree with this, is just the idea of having a container that’s sitting there, waiting, and not telling anybody what it’s waiting for.
It’s just waiting for a dependency and polling, not being useful to anybody for anything. That was what made me say, “Wait a minute. That doesn’t sound right. Why would you ever want that?” I get that in a Docker Compose environment it’s maybe not going to be waiting that long, because you know Docker Compose is going to start the other container and it just takes a little bit to start. But if you’re not always running in Docker Compose, which you’re not, then that behavior is sometimes useful and sometimes really, really bad.
Chris: This is definitely something that’s much more applicable to the local environment than to your cloud or production environment. In this particular case–again, microservice, backend database–you’re spinning up the database locally on your machine, on demand. In a production environment, it’s probably an RDS instance. It’s always there, so that dependency is just there; you’re not going to wait for it. And you’ll do that, too: you’ll put in flags that say, “Only do this when I’m running locally. Don’t run it in production; there’s no reason to.” This goes for just about all the other dependencies as well.
When you’re running in the cloud, in your production environments, you could have 20, 50 or 100 services that are all clients of each other, but they’re all actively running. It’s not like you’re starting all of them every time you deploy something, whereas that is what’s happening on your local machine: you have to start everything up, like I said. So this discussion really is much more relevant to your local environment and also to your CI environments. If you’re trying to run tests that might require spinning up some dependencies, this really comes into play as well.
Jon: Cool. That makes sense. Can I get at least a nod of acknowledgement that, in a production environment, it’s maybe not a great idea to have application code that just sort of waits forever inside a container for a dependency to be available?
Chris: You’ll get more than just a nod. That’s a really good distinction to make. Again, in the example of polling in your startup or application code: definitely make sure that’s for your local environment only. Guard against that. Don’t do it in production, because if you’re doing that in production, you need to be doing some other things first–you’re probably doing some things wrong.
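Chris’s “guard against that” can be as simple as an environment flag, so the blocking wait only runs in local development. A minimal sketch; the `APP_ENV` variable name and its values are made up for illustration:

```python
import os

def should_wait_for_dependencies() -> bool:
    # Only block at startup when running locally; in production the
    # dependency (e.g. an RDS instance) is assumed to already be there.
    return os.environ.get("APP_ENV", "production") == "local"

# Startup logic would then be roughly:
# if should_wait_for_dependencies():
#     wait_for_database()   # hypothetical local-only polling helper
```

Defaulting to `"production"` when the variable is unset is the safer failure mode: a forgotten flag means the wait is skipped, rather than a production container silently blocking.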
Jon: Great. Okay, I think we have just a couple more minutes, so we can squeeze in this Shut Down Gracefully tip and trick and then finish off the rest of the tips and tricks next week.
Chris: This is another one from the grab bag of tips, and it’s: how do you handle shutdowns in your container? This is one of those things that I think a lot of people don’t give much thought to–it’s definitely not top of mind–but it is something to think about, to understand what’s going on and what you may be missing out on.
Just for reference, when Docker wants to stop a container, it sends that container a SIGTERM signal and then waits 10 seconds for the container to respond, giving it a chance to clean up and stop gracefully. If it doesn’t exit within those 10 seconds, Docker hard-kills it by sending a SIGKILL signal. What this means for you and your application is that if you properly handle that SIGTERM, you get the opportunity to clean up and make things tidy before you exit.
It might mean things like closing network connections, or sockets, or handles. Maybe it’s finish writing stuff to a log, flushing memory to a log. Maybe you’re dealing with caching for performance reasons. Maybe you need to write some stuff to a database and whatever else–what other kind of housekeeping your app may or may not have to do. Then, the other big benefit, of course, is just faster shutdown. If you actually respond to it, then your container will shut down very, very quickly as opposed to just hanging, basically, for 10 seconds.
If you’ve used various Docker images, even on your local machine–I’ve run into this many times–when you go ahead and try to stop something, it hangs for a while before it finally quits. That’s what’s going on: that particular container is not responding to the SIGTERM signal.
Jon: That is super interesting. I did not know that and I’ve definitely seen that behavior. It’s like 99.99% of Docker containers are not using this tip or trick.
Jon: Gosh, when something does go wrong, or when you get shut down unexpectedly, wouldn’t it be nice to have that log buffer written out before things go away?
Chris: Yeah, and we’ve seen this in some of the apps that we’ve done, too, where, the way the logging code was working, it was not getting a chance to flush when the container shut down. When it had an abnormal exit, or it was being shut down because it failed a health check, the logging wasn’t finishing–we were not seeing entries in the logs, and this was one of the reasons why.
Jon: Thank you very much. Those are two good ones. Thank you, really, to Adrian for coming up with those and letting us talk about them. We’ll talk to you again next week with a few more.
Rich: All right. Thanks, guys. Take care.
Jon: Thanks, Rich. Thanks, Chris.
Rich: Well, dear listener, you made it to the end. We appreciate your time and invite you to continue the conversation with us online. This episode, along with show notes and other valuable resources, is available at Mobycast. If you have any questions or additional insights, we encourage you to leave us a comment there. Thank you and we’ll see you again next week.