Musings on cloud initialization
________________________________________

I spent a while reading/learning about how to set up cloud instances. This document summarizes some of those things.

The Manual Way
________________________________________

This is the way I've been doing things up until now. If I need the machine set up a certain way, then I'll just start an instance, SSH in, and run the commands I need. Then, if I need to start up many instances like that, I'll make a [snapshot] and convert that to an [AMI].

This way definitely works, but it means that if I made a mistake on the 2nd command, I have to start over and manually re-run everything up until that point. It also means that explaining what I did is next to impossible unless I take proper notes.

The nice thing is that it is easy to debug and test things when you don't know how they're going to work. At every step along the way, I can manually test how it's working.

I also find that it's hard to (de)compose. The latest example is that I set up an instance with two main features: logs get forwarded to a single instance, and it runs an application on start up. If I want to decompose this into each separate part, and maybe run a different application with logging, or run two applications on start up but not forward logs, then I'm at a loss.

Pros: Easy to understand. Easy to test.
Cons: Hard to explain. Hard to reproduce. Hard to (de)compose.

Aside: Running applications on startup
________________________________________

I like to use [systemd] to do this because I understand it to a certain level. I also know that it will get things right with regards to starting my application up again if it crashes.

I want to be able to use systemd user services to run things because it means that I don't have to manage everything as root. I ran into problems where I couldn't get it to actually start running my user services on startup, and I'm not sure why. There might have been something wrong with the [linger] setting or something to that effect.

In the end, I just used regular services, which meant that I threw a [.service] file in [/etc/systemd/system/]. A simple one looks like this.

--------8<--------------------------------------------------------------
[Unit]
AssertPathExists=/usr/local/bin/myapplication

[Service]
ExecStart=/usr/local/bin/myapplication --my-flags
Restart=on-failure
RestartSec=10s

[Install]
WantedBy=multi-user.target
-------->8--------------------------------------------------------------

The [Type=simple] setting is implied by the [ExecStart=] one. I also find that this is the right level of restart logic. I had tinkered in the past with some restart settings like [StartLimitIntervalSec=] and [StartLimitBurst=] because they are referenced nearby in the man page, but they are almost certainly the wrong choice for the types of applications I write.

After writing this file to [/etc/systemd/system/myapplication.service], it is a simple matter of enabling and starting the service to get it to auto-run every time the instance starts.

--------8<--------------------------------------------------------------
$ sudo systemctl enable myapplication
$ sudo systemctl start myapplication
-------->8--------------------------------------------------------------

Automating the manual way
________________________________________

You can automate the manual way pretty trivially, either via [scp && ssh && bash] or using a library like [paramiko].
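For example, a minimal sketch of the [scp && ssh && bash] route might look like this (the user, host, and script names are hypothetical placeholders):

--------8<--------------------------------------------------------------
# Copy a setup script to a fresh instance and run it there.
$ scp setup.sh ec2-user@my-instance:/tmp/setup.sh
$ ssh ec2-user@my-instance bash /tmp/setup.sh
-------->8--------------------------------------------------------------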
Furthermore, you could use [Puppet] or [Chef] or whatever the latest tool like that is, and let it handle setting up the instance.

The main problem with all of these is that everything happens after boot, when the entire stack is already up, and they require SSH access. Beyond that, there is still the problem of having to write/run everything as one cohesive unit, which makes things harder to debug. One nice way to facilitate testing is to keep one instance around and copy lines over to a shell script as you run them, which does work, but it makes it hard to record what testing was done between steps.

Pros: Automatic. Simple to understand.
Cons: Tests aren't encoded. Runs after boot, not on boot.

Automating the Docker way
________________________________________

If [Docker] is running on the server, then it's pretty easy to push the Docker image to the cloud and then run it from the server. You can even set up some automation there to automatically pull the latest image, but it's not the most straightforward.

Automating the cloud way
________________________________________

This kind of problem has already been solved by cloud people who need to install different software on different instances. It's also been fairly standardized because of the availability of many good cloud services.

The [cloud-init] approach is a YAML file that encodes many different devops functions and runs through them. These functions include: write this content to this file, run this command in this directory, create this user with this name, and more.

I see one large flaw in this approach for general cloud instance configuration: it is non-trivial to have things run in a particular order or interleaved in a particular way.

Consider: to create a systemd service, we need to first [pip install] the application ("run a command"), then we need to create the systemd service file ("create a file"), and finally enable the service ("run a command"). As far as I can tell, cloud-init doesn't work like this. It would want to run both commands one after another and create the file either before or after both commands.

In practice, this doesn't have to be a problem because one could create the files using a shell script ("run a command"), but it does feel at odds with everything else.

There is a very good story about extending cloud-config, however. By creating custom "part handlers," one can define exactly what should happen when a certain type of part is found in the input. This means that one could create something that would allow the interleaved scenario from above to be cleanly defined.

I also think there would be a lot of value in creating a Docker-to-Cloud-Config converter. There is a similar Docker-to-AMI converter, but it functions like "Automating the manual way" from above.

Note: there is some nuance to the distinction between cloud-init and cloud-config. The cloud-init tool accepts many different types of input: a shell script, a custom part-handler, a URL to download and run, a cloud-config file, and more. It also allows you to combine multiple inputs together in one "package" which it will run through sequentially. The cloud-config file supports things like: running scripts, creating users, writing to files, installing packages, etc. Although cloud-config shares many functions with cloud-init, they serve different use cases.
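As a concrete sketch of what this could look like, here is the systemd example from above as a single cloud-config. It is untested; "myapplication" is a placeholder name, and it assumes that [write_files] runs before [runcmd] (which I believe is the default module ordering).

--------8<--------------------------------------------------------------
#cloud-config
write_files:
  # The same unit file from the systemd aside above.
  - path: /etc/systemd/system/myapplication.service
    content: |
      [Unit]
      AssertPathExists=/usr/local/bin/myapplication

      [Service]
      ExecStart=/usr/local/bin/myapplication --my-flags
      Restart=on-failure
      RestartSec=10s

      [Install]
      WantedBy=multi-user.target

runcmd:
  # "myapplication" is a placeholder package name.
  - pip install myapplication
  - systemctl enable myapplication
  - systemctl start myapplication
-------->8--------------------------------------------------------------

The ordering happens to be harmless here, since writing the unit file before installing the application doesn't hurt anything; the general interleaving problem remains.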
In the ideal case, I would be able to send multiple cloud-config files to cloud-init and have each file set up a different tool, but I believe cloud-init will merge these files together, thus breaking in the way I described above because it would interleave things together. I am unsure of this though, and it should be tested, because that would be an easy solution to my problem.

Pros: Supported by every cloud. Can be run earlier in the system setup.
Cons: File format is a little hard to understand (compared to a shell script). Might not be composable. Not applicable to non-cloud machines.

Conclusion
________________________________________

I haven't found the best way to set up a cloud instance. Currently, I'm thinking that cloud-init with cloud-config is the best way, but only if the composition story makes sense. Otherwise, cloud-init with regular shell scripts and a framework that facilitates easier testing in between steps is what I'll go with in the future.

Link Dump
________________________________________

[0]: https://cloudinit.readthedocs.io/en/latest/topics/format.html

Details how the cloud-init format works and how to combine different steps into one file.

[1]: https://cloudinit.readthedocs.io/en/latest/topics/examples.html

A nice set of examples of the cloud-config format.

[2]: https://serverfault.com/a/413408

I ran into a problem where I needed to set custom environment variables for different instances. This answer shows one way to do this, and it's the way I ended up using.

You can create a [/etc/systemd/system/myapplication.service.d] directory and put [.conf] files inside. Then systemd will read and merge all of these files together, thus ensuring that your environment variables will be loaded.

Note: You still need to include the right section headers. Don't forget to put the [Service] header before the [Environment=] settings, or it won't work.

[3]: https://serverfault.com/a/410438

Inside of the instance, I wanted to know what type it is (i.e. [t2.micro] or [t2.medium]). AWS exposes a web server that has this information at [169.254.169.254]. Here is a bash snippet to store some of these values as environment variables.

--------8<--------------------------------------------------------------
keys=( $(curl -s http://169.254.169.254/latest/meta-data/) )
for key in "${keys[@]}"; do
  case "$key" in
    (*/*) # Skip any multi-level keys
      continue;;
  esac
  value=$(curl -s "http://169.254.169.254/latest/meta-data/$key")
  # Before: $key looks like "ami-id"
  # After:  $key looks like "AMI_ID"
  key=${key^^}
  key=${key//-/_}
  # Safely use eval without any worry about escaping anything
  eval "AWS_$key=\$value"
done
-------->8--------------------------------------------------------------

Afterwards, you can use variables like [$AWS_AMI_ID] or [$AWS_INSTANCE_TYPE].

[4]: https://forums.aws.amazon.com/thread.jspa?threadID=250683

Apparently AWS instances take a while to disappear after terminating them, so it's fine if they don't disappear from the console immediately.

[5]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html

AWS has user-data which is sent to cloud-init, so it should be in one of the cloud-init formats (i.e. a shell script, a cloud-config, or several inputs combined into one archive).
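One more note on [5]: cloud-init decides what kind of input it was given by looking at the first line, and combining several inputs means wrapping them in a MIME multi-part archive. A sketch of both, with placeholder file names (the helper's flags are quoted from memory, so worth double-checking):

--------8<--------------------------------------------------------------
# cloud-init decides what each input is by its first line:
#   "#!"            -> shell script, run at boot
#   "#cloud-config" -> cloud-config YAML
#   "#part-handler" -> custom part handler
#   "#include"      -> list of URLs to fetch
# To send several inputs at once, wrap them in a MIME multi-part
# archive. Newer versions of cloud-init ship a helper for this:
$ cloud-init devel make-mime \
      --attach setup.sh:x-shellscript \
      --attach config.yaml:cloud-config > user-data.txt
-------->8--------------------------------------------------------------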