+++
title = "The Wrong Way to Switch Operating Systems on Your Server"
date = 2021-06-17
updated = 2022-06-15
[extra]
type = "post"
+++

After [moving my server to Hetzner][mv], I built up a large collection
of self-hosted services I use on a daily basis: from fun things like
an [RSS reader] and an [IRC bouncer], to critical services like my
[email]. I ran them all with `docker-compose` from a [Debian] VPS. For
the last couple months, however, I've been meaning to move away from
Debian and towards something more minimal and clean. Over this last
weekend, I decided to move to [Alpine Linux].

<!-- more -->

## The Plan

The transition was supposed to be quick and dirty:

1. Shut down all the services running on my VPS
2. Make a backup of relevant files with [Tarsnap]
3. Mount the Alpine Virtual ISO image and set up the OS
4. Restore files from Tarsnap backup
5. Bring everything back up

In a previous move between two servers, I simply `rsync`ed the
relevant files over to the new VPS. Here, where I'm just switching
operating systems on a single server, I figured I could make a backup
with Tarsnap, and be done within the day.

However, backups are much more complex than simply transferring files
from one server to another. My haphazard strategy resulted in three
days of stress and frustration as I scrambled to restore a
self-hosting empire that I myself had reduced to ash.

## Day One

I began my work on the transition full of optimism, if a bit stressed.
I had read through the Tarsnap online documentation a number of times,
and was ready to make my first attempt. I loaded my Tarsnap account up
with USD$10 and ran:

```sh
$ sudo tarsnap -c -f backup-name docker-compose.yml ...
```

My terminal sat empty for hours. There were no changes – the process
was running, but there was no feedback. I was nervous.

> What if it failed silently?
>
> How can I check?
>
> What should I do?

I pressed `<Ctrl-C>`.

To my horror, stats printed to the screen: the backup had been 90%
complete, and I had stopped it. Convinced I had ruined the backup
completely, I deleted the partial backup from Tarsnap and started
again from scratch.

This was my first, but not last, moment close to tears. I went to
sleep and let the backup run overnight.

## Day Two

Day Two began well: I woke and the backup was finished! I wiped the
VPS, installed Alpine, and brought it up to spec. I created a regular
user, configured SSH, and decided to use `doas` instead of `sudo` for
a change. Alpine, so far, feels great to use. None of the cruft that
bothered me when using Debian.

### Virgin Tarsnap

With Alpine set up, I started to restore the backup:

```sh
$ doas tarsnap -x -f backup-name
```

Once again, after running all day, it had not finished.

I opened up a new `tmux` window and poked around the filesystem. All
my files seemed like they were already there...

> What if it failed silently?
>
> How can I check?
>
> What should I do?

I pressed `<Ctrl-C>`, cutting off the download, and tried to bring
everything back online:

```sh
$ doas docker-compose up -d
```

It errored out. All my environment variables were undefined. Then it
hit me: I forgot to back up the `.env` file. My eyes welled up.

Still, I was determined. I worked to reconstruct the `.env` file from
secrets I had stored in [Bitwarden] (my offline copy, because my vault
is self-hosted and was thus down).

I ran it again:

```sh
$ doas docker-compose up -d
```

One of my services was missing a Dockerfile to build. I shouldn't have
pressed `<Ctrl-C>`! I was a total moron.

I put on a [sad song]. I was close to tears once again.

I gathered what was left of my resolve and trudged onwards. I searched
`tarsnap`'s manpages looking for something to speed up my download.

I found a number of flags that could have helped me *make* a backup
better the next time around, but nothing that would help me restore
the backup any faster. With nothing in the manpages, I went to look at
the [helper scripts].

### Chad Redsnapper

That's when I found it: [redsnapper]. A Ruby script that runs multiple
tarsnap clients at once to extract archives **fast**. Fucking
precisely. I wiped out the incomplete files I had restored, downloaded
Ruby and started restoring from the backup once again:

```sh
$ doas redsnapper backup-name
```

I changed [the song], and watched the files fly by on my screen. I
went to sleep, confident I would wake to good news.

## Day Three

The download had failed trying to download a large `.mkv` file.

### Manual Exclusion

I restarted `redsnapper`, explicitly excluding the `.mkv` it had
failed to download, and let it run until it came on another movie and
crashed again (an hour or so later). I excluded the second movie file
and sent it to run again.

This was a long, boring process. It sucked.

### An Afternoon Breakthrough

Then I realized something. `redsnapper` kept crashing when it hit
movies I had stored in [Jellyfin].

> I don't need Jellyfin at all. I've never watched a movie more than
> once.
>
> The movies take up massive storage on disk, and keep causing tarsnap
> to crash. They don't compress well either, so they take up a fuckton
> of space in the archives.
>
> I can always download the movies again if I want to give them
> another go.
>
> Why the fuck am I forcing myself to deal with this shit?

I stopped the download in the middle – the day's third, after two
earlier attempts that ended after encountering movie files – and
changed the command slightly before rerunning. After a number of
errors I couldn't explain, I realized my account was negative and
topped it up with another USD$25 before running:

```sh
$ doas redsnapper backup-name -- --exclude='*/jellyfin/*'
```

I returned to my computer a couple hours later. `redsnapper` had
stopped, with a whole lot of files extracted and a couple errors at
the bottom about symlinks.

I figured, this time, it had probably done everything properly but
couldn't create the symlinks (probably a flag missing somewhere). I
manually went through my files creating the symlinks, and then brought
everything up with `docker-compose`.

I checked the containers. All up.

I checked the logs – no immediate errors visible.

I opened [figbert.com] on my laptop. It appeared. Service was
restored. Hallelujah.

## Mistakes

I made a lot of them. Here are a few:

1. After shutting down my containers, I backed up my entire setup.
   This included a number of ["live" databases][live], `.git` folders,
   and other data that I either did not need or could reconstruct
   once the move had been completed.
2. I didn't back up the `.env` file I use to store secrets for use in
   `docker-compose.yml`. I was luckily able to reconstruct it from
   individual secrets I stored in my password manager.
3. A thorough read of the manpages before I started (rather than just
   the online guides) would have revealed several helpful flags:
   `-v` to see what files `tarsnap` is operating on,
   `--aggressive-networking` to take advantage of the datacenter
   internet speeds, and `--recover` to resume interrupted backups, to
   name a few.
4. We already talked about Jellyfin.
   Even with very little content in
   Jellyfin, the collection took up huge amounts of space on disk and
   in the backup (especially because video files don't compress
   well), and sat entirely unused. It is now gone. Good riddance.

## Future

What did I learn? Well, I'm still devising a plan to prevent things
like this from happening in the future. Here's the plan currently:

### Backups

Back up everything every day. I'll build a buffer of three "rolling"
backups, where backups collect up to a max of three and then, as new
backups are created, the older backups are removed.

The backup script will shut down the services, dump the databases
(i.e. convert as much content to plain-text, easily-compressible
formats as possible) and make a time-stamped backup (currently only
with Tarsnap, but perhaps in the future with a number of other
services).

### Restoring

Simply having high-quality backups to restore will already be a huge
leap forward. I'm also *definitely* going to continue using
[redsnapper]: the speed gains it gives on large backups are crucial.

### Manpages

I really should read all the documentation before I try something new.

## Bye Bye

I'll write further about my self-hosting setup as it evolves, and
publish the backup script once it's finished. I'll also maintain a
dedicated page on my site describing my self-hosting setup as it
changes.

Also, I'm sure there are people more knowledgeable about Tarsnap than
I. That's basically the point of this article. If you are one of these
people, please don't hesitate to [email me] if you've got corrections,
advice, or just want to flex that you know how to do backups better
than I do.
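
As a postscript to the Backups plan above, here is a minimal sketch of
what the rolling three-backup scheme might look like. It is not the
finished script. `PREFIX`, `KEEP`, `BACKUP_PATHS`, the archive naming
scheme and both function names are hypothetical, and the database dump
step is left as a comment because it depends on each service. The only
Tarsnap operations assumed are `-c -f`, `--list-archives` and `-d -f`.

```sh
#!/bin/sh
# Sketch only. PREFIX, KEEP, BACKUP_PATHS and both function names are
# hypothetical; adjust them to the real setup.
PREFIX="server"
KEEP=3
BACKUP_PATHS="docker-compose.yml .env"

# Read archive names on stdin and print the ones to delete: everything
# except the newest $KEEP. The names embed a sortable timestamp, so a
# reverse sort puts the newest first.
archives_to_prune() {
    sort -r | tail -n +"$((KEEP + 1))"
}

# Stop the services, create a time-stamped archive, restart, then prune.
run_backup() {
    name="$PREFIX-$(date +%Y-%m-%d_%H-%M-%S)"
    docker-compose stop
    # Database dumps would go here; the tool depends on each service.
    # $BACKUP_PATHS is left unquoted on purpose so it splits into paths.
    if ! tarsnap -c -f "$name" $BACKUP_PATHS; then
        docker-compose start
        return 1
    fi
    docker-compose start
    tarsnap --list-archives | grep "^$PREFIX-" | archives_to_prune |
        while read -r old; do
            tarsnap -d -f "$old"
        done
}
```

Calling `run_backup` from a daily cron job would give the rolling
window. The pruning logic lives in its own function so it can be tried
safely by piping a few made-up archive names into `archives_to_prune`
before it is ever allowed to delete anything.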

[mv]: @/posts/moving-to-hetzner-from-digitalocean/index.md
[RSS reader]: https://miniflux.app
[IRC bouncer]: https://thelounge.chat
[email]: https://maddy.email
[Debian]: https://www.debian.org
[Alpine Linux]: https://alpinelinux.org
[Tarsnap]: https://www.tarsnap.com
[Bitwarden]: https://bitwarden.com
[sad song]: https://www.youtube.com/watch?v=I-sH53vXP2A
[helper scripts]: https://www.tarsnap.com/helper-scripts.html
[redsnapper]: https://github.com/directededge/redsnapper
[Jellyfin]: https://jellyfin.org
[figbert.com]: /
[the song]: https://www.youtube.com/watch?v=gPOEBkcZHM4
[live]: https://www.tarsnap.com/tips.html#back-up-live
[email me]: mailto:figbert@figbert.com