+++
title = "The Wrong Way to Switch Operating Systems on Your Server"
date = 2021-06-17
updated = 2022-06-15
+++

After [moving my server to Hetzner][mv], I built up a large collection
of self-hosted services I use on a daily basis: from fun things like
an [RSS reader] and an [IRC bouncer], to critical services like my
[email]. I ran them all with `docker-compose` from a [Debian] VPS. For
the last couple of months, however, I've been meaning to move away from
Debian and towards something more minimal and clean. Over this last
weekend, I decided to move to [Alpine Linux].

<!-- more -->

## The Plan

The transition was supposed to be quick and dirty:

1. Shut down all the services running on my VPS
2. Make a backup of relevant files with [Tarsnap]
3. Mount the Alpine Virtual ISO image and set up the OS
4. Restore files from the Tarsnap backup
5. Bring everything back up

In a previous move between two servers, I simply `rsync`ed the
relevant files over to the new VPS. Here, where I'm just switching
operating systems on a single server, I figured I could make a backup
with Tarsnap, and be done within the day.

However, backups are much more complex than simply transferring files
from one server to another. My haphazard strategy resulted in three
days of stress and frustration as I scrambled to restore a
self-hosting empire that I myself had reduced to ash.

## Day One

I began my work on the transition full of optimism, if a bit stressed.
I had read through the Tarsnap online documentation a number of times,
and was ready to make my first attempt. I loaded my Tarsnap account up
with USD$10 and ran:

```sh
$ sudo tarsnap -c -f backup-name docker-compose.yml ...
```

My terminal sat empty for hours. There were no changes – the process
was running, but there was no feedback. I was nervous.

> What if it failed silently?
>
> How can I check?
>
> What should I do?

I pressed `<Ctrl-C>`.

To my horror, stats printed to the screen: the backup had been 90%
complete, and I had stopped it. Convinced I had ruined the backup
completely, I deleted the partial backup from Tarsnap and started
again from scratch.

This was my first, but not last, moment close to tears. I went to
sleep and let the backup run overnight.

## Day Two

Day Two began well: I woke and the backup was finished! I wiped the
VPS, installed Alpine, and brought it up to spec. I created a regular
user, configured SSH, and decided to use `doas` instead of `sudo` for
a change. Alpine, so far, feels great to use. None of the cruft that
bothered me when using Debian.

### Virgin Tarsnap

With Alpine set up, I started to restore the backup:

```sh
$ doas tarsnap -x -f backup-name
```

Once again, after running all day, it had not finished.

I opened up a new `tmux` window and poked around the filesystem. All
my files seemed like they were already there...

> What if it failed silently?
>
> How can I check?
>
> What should I do?

I pressed `<Ctrl-C>`, cutting off the download, and tried to bring
everything back online:

```sh
$ doas docker-compose up -d
```

It errored out. All my environment variables were undefined. Then it
hit me: I forgot to back up the `.env` file. My eyes welled up.

Still, I was determined. I worked to reconstruct the `.env` file from
secrets I had stored in [Bitwarden] (my offline copy, because my vault
is self-hosted and was thus down).

I ran it again:

```sh
$ doas docker-compose up -d
```

One of my services was missing a Dockerfile to build. I shouldn't have
pressed `<Ctrl-C>`! I was a total moron.

I put on a [sad song]. I was close to tears once again.

I gathered what was left of my resolve and trudged onwards.
I searched `tarsnap`'s manpages looking for something to speed up my
download.

I found a number of flags that could have helped me *make* a backup
better the next time around, but nothing that would help me restore
the backup any faster. With nothing in the manpages, I went to look at
the [helper scripts].

### Chad Redsnapper

That's when I found it: [redsnapper]. A Ruby script that runs multiple
tarsnap clients at once to extract archives **fast**. Fucking
precisely. I wiped out the incomplete files I had restored, installed
Ruby and started restoring from the backup once again:

```sh
$ doas redsnapper backup-name
```

I changed [the song], and watched the files fly by on my screen. I
went to sleep, confident I would wake to good news.

## Day Three

The restore had failed while trying to download a large `.mkv` file.

### Manual Exclusion

I restarted `redsnapper`, explicitly excluding the `.mkv` it had
failed to download, and let it run until it came upon another movie and
crashed again (an hour or so later). I excluded the second movie file
and set it running again.

This was a long, boring process. It sucked.

### An Afternoon Breakthrough

Then I realized something. `redsnapper` kept crashing when it hit
movies I had stored in [Jellyfin].

> I don't need Jellyfin at all. I've never watched a movie more than
> once.
>
> The movies take up massive storage on disk, and keep causing tarsnap
> to crash. They don't compress well either, so they take up a fuckton
> of space in the archives.
>
> I can always download the movies again if I want to give them
> another go.
>
> Why the fuck am I forcing myself to deal with this shit?


I stopped the download in the middle – the day's third, after two
earlier attempts that ended after encountering movie files – and
changed the command slightly before rerunning. After a number of
errors I couldn't explain, I realized my account balance was negative
and topped it up with another USD$25 before running:

```sh
$ doas redsnapper backup-name -- --exclude='*/jellyfin/*'
```

I returned to my computer a couple hours later. `redsnapper` had
stopped, with a whole lot of files extracted and a couple errors at
the bottom about symlinks.

I figured, this time, it had probably done everything properly but
couldn't create the symlinks (probably a flag missing somewhere). I
manually went through my files creating the symlinks, and then brought
everything up with `docker-compose`.

I checked the containers. All up.

I checked the logs – no immediate errors visible.

I opened [figbert.com] on my laptop. It appeared. Service was
restored. Hallelujah.

## Mistakes

I made a lot of them. Here are a few:

1. After shutting down my containers, I backed up my entire setup.
   This included a number of ["live" databases][live], `.git` folders,
   and other data that I either did not need or could reconstruct
   once the move had been completed.
2. I didn't back up the `.env` file I use to store secrets for use in
   `docker-compose.yml`. I was luckily able to reconstruct it from
   individual secrets I stored in my password manager.
3. A thorough read of the manpages before I started (rather than just
   the online guides) would have revealed several helpful flags:
   `-v` to see what files `tarsnap` is operating on,
   `--aggressive-networking` to take advantage of the datacenter
   internet speeds, and `--recover` to resume interrupted backups, to
   name a few.
4. We already talked about Jellyfin.
   Even with very little content in
   Jellyfin, the collection took up huge amounts of space on disk and
   in the backup (especially because video files don't compress
   well), and sat entirely unused. It is now gone. Good riddance.

## Future

What did I learn? Well, I'm still devising a plan to prevent things
like this from happening in the future. Here's the current plan:

### Backups

Back up everything every day. I'll build a buffer of three "rolling"
backups, where backups collect up to a max of three and then, as new
backups are created, the older backups are removed.

The backup script will shut down the services, dump the databases
(i.e. convert as much content to plain-text, easily-compressible
formats as possible) and make a time-stamped backup (currently only
with Tarsnap, but perhaps in the future with a number of other
services).

### Restoring

Simply having high-quality backups to restore will already be a huge
leap forward. I'm also *definitely* going to continue using
[redsnapper]: the speed gains it gives on large backups are crucial.

### Manpages

I really should read all the documentation before I try something new.

## Bye Bye

I'll write further about my self-hosting setup as it evolves, and
publish the backup script once it's finished. I'll also maintain a
dedicated page on my site describing my self-hosting setup as it
changes.

Also, I'm sure there are people more knowledgeable about Tarsnap than
I am. That's basically the point of this article. If you are one of
these people, please don't hesitate to [email me] if you've got
corrections, advice, or just want to flex that you know how to do
backups better than I do.
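
For what it's worth, the "rolling" part of the backup plan described
under Backups could look something like the following. This is a
minimal sketch, not the finished script: the `archives_to_prune`
function and the `backup-YYYY-MM-DD` archive naming are assumptions
made up for illustration, and the `tarsnap` wiring is left in comments
because it has not been run against a real account.

```sh
#!/bin/sh
# Sketch only. Assumes archives are named "backup-YYYY-MM-DD"
# (a hypothetical convention), so plain text sorting is also
# chronological sorting.

# Read archive names on stdin, one per line, and print every name
# that falls outside the newest $1 archives (default: 3).
archives_to_prune() {
    keep="${1:-3}"
    # Newest first, then skip the first $keep lines.
    sort -r | tail -n "+$((keep + 1))"
}

# Hypothetical wiring into tarsnap (not executed here):
#   tarsnap -c -f "backup-$(date +%Y-%m-%d)" docker-compose.yml .env
#   tarsnap --list-archives | grep '^backup-' | archives_to_prune 3 |
#       while read -r name; do tarsnap -d -f "$name"; done
```

Keeping the pruning decision in a small function that only reads names
from stdin means it can be checked with `printf` before it is ever
allowed near a real `tarsnap -d`.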

[mv]: @/posts/moving-to-hetzner-from-digitalocean/index.md
[RSS reader]: https://miniflux.app
[IRC bouncer]: https://thelounge.chat
[email]: https://maddy.email
[Debian]: https://www.debian.org
[Alpine Linux]: https://alpinelinux.org
[Tarsnap]: https://www.tarsnap.com
[Bitwarden]: https://bitwarden.com
[sad song]: https://www.youtube.com/watch?v=I-sH53vXP2A
[helper scripts]: https://www.tarsnap.com/helper-scripts.html
[redsnapper]: https://github.com/directededge/redsnapper
[Jellyfin]: https://jellyfin.org
[figbert.com]: /
[the song]: https://www.youtube.com/watch?v=gPOEBkcZHM4
[live]: https://www.tarsnap.com/tips.html#back-up-live
[email me]: mailto:figbert@figbert.com