figbert.com-website

[ACTIVE] the website and home of figbert on the clearnet
git clone git://git.figbert.com/figbert.com-website.git

wrong-way-to-switch-server-os.md (9491B)


      1 +++
      2 title = "The Wrong Way to Switch Operating Systems on Your Server"
      3 date = 2021-06-17
      4 updated = 2022-06-15
      5 [extra]
      6 type = "post"
      7 +++
      8 
      9 After [moving my server to Hetzner][mv], I built up a large collection
     10 of self-hosted services I use on a daily basis: from fun things like
     11 an [RSS reader] and an [IRC bouncer], to critical services like my
     12 [email]. I ran them all with `docker-compose` from a [Debian] VPS. For
     13 the last couple months, however, I've been meaning to move away from
     14 Debian and towards something more minimal and clean. Over this last
     15 weekend, I decided to move to [Alpine Linux].
     16 
     17 <!-- more -->
     18 
     19 ## The Plan
     20 
     21 The transition was supposed to be quick and dirty:
     22 
     23 1. Shut down all the services running on my VPS
     24 2. Make a backup of relevant files with [Tarsnap]
     25 3. Mount the Alpine Virtual ISO image and set up the OS
     26 4. Restore files from Tarsnap backup
     27 5. Bring everything back up
     28 
     29 In a previous move between two servers, I simply `rsync`ed the
     30 relevant files over to the new VPS. Here, where I'm just switching
     31 operating systems on a single server, I figured I could make a backup
     32 with Tarsnap, and be done within the day.
     33 
     34 However, backups are much more complex than simply transferring files
     35 from one server to another. My haphazard strategy resulted in three
     36 days of stress and frustration as I scrambled to restore a
     37 self-hosting empire that I myself had reduced to ash.
     38 
     39 ## Day One
     40 
     41 I began my work on the transition full of optimism, if a bit stressed.
     42 I had read through the Tarsnap online documentation a number of times,
     43 and was ready to make my first attempt. I loaded my Tarsnap account up
     44 with USD$10 and ran:
     45 
     46 ```sh
     47 $ sudo tarsnap -c -f backup-name docker-compose.yml ...
     48 ```
     49 
     50 My terminal sat empty for hours. There were no changes – the process
     51 was running, but there was no feedback. I was nervous.
     52 
     53 > What if it failed silently?
     54 >
     55 > How can I check?
     56 >
     57 > What should I do?
     58 
     59 I pressed `<Ctrl-C>`.
     60 
     61 To my horror, stats printed to the screen: the backup had been 90%
     62 complete, and I had stopped it. Convinced I had ruined the backup
     63 completely, I deleted the partial backup from Tarsnap and started
     64 again from scratch.
     65 
     66 This was my first, but not last, moment close to tears. I went to
     67 sleep and let the backup run overnight.
     68 
     69 ## Day Two
     70 
     71 Day Two began well: I woke and the backup was finished! I wiped the
     72 VPS, installed Alpine, and brought it up to spec. I created a regular
     73 user, configured SSH, and decided to use `doas` instead of `sudo` for
     74 a change. Alpine, so far, feels great to use. None of the cruft that
     75 bothered me when using Debian.
     76 
     77 ### Virgin Tarsnap
     78 
     79 With Alpine set up, I started to restore the backup:
     80 
     81 ```sh
     82 $ doas tarsnap -x -f backup-name
     83 ```
     84 
     85 Once again, after running all day it had not finished.
     86 
     87 I opened up a new `tmux` window and poked around the filesystem. All
     88 my files seemed like they were already there...
     89 
     90 > What if it failed silently?
     91 >
     92 > How can I check?
     93 >
     94 > What should I do?
     95 
     96 I pressed `<Ctrl-C>`, cutting off the download, and tried to bring
     97 everything back online:
     98 
     99 ```sh
    100 $ doas docker-compose up -d
    101 ```
    102 
    103 It errored out. All my environment variables were undefined. Then it
    104 hit me: I forgot to back up the `.env` file. My eyes welled up.
    105 
    106 Still, I was determined. I worked to reconstruct the `.env` file from
    107 secrets I had stored in [Bitwarden] (my offline copy, because my vault
    108 is self-hosted and was thus down).
    109 
    110 I ran it again:
    111 
    112 ```sh
    113 $ doas docker-compose up -d
    114 ```
    115 
    116 One of my services was missing a Dockerfile to build. I shouldn't have
    117 pressed `<Ctrl-C>`! I was a total moron.
    118 
    119 I put on a [sad song]. I was close to tears once again.
    120 
    121 I gathered what was left of my resolve and trudged onwards. I searched
    122 `tarsnap`'s manpages looking for something to speed up my download.
    123 
    124 I found a number of flags that could have helped me *make* a backup
    125 better the next time around, but nothing that would help me restore
    126 the backup any faster. With nothing in the manpages, I went to look at
    127 the [helper scripts].
    128 
    129 ### Chad Redsnapper
    130 
    131 That's when I found it: [redsnapper]. A Ruby script that runs multiple
    132 tarsnap clients at once to extract archives **fast**. Fucking
    133 precisely. I wiped out the incomplete files I had restored, downloaded
    134 Ruby and started restoring from the backup once again:
    135 
    136 ```sh
    137 $ doas redsnapper backup-name
    138 ```
    139 
    140 I changed [the song], and watched the files fly by on my screen. I
    141 went to sleep, confident I would wake to good news.
    142 
    143 ## Day Three
    144 
    145 The download had failed trying to download a large `.mkv` file.
    146 
    147 ### Manual Exclusion
    148 
    149 I restarted `redsnapper`, explicitly excluding the `.mkv` it had
    150 failed to download, and let it run until it came on another movie and
    151 crashed again (an hour or so later). I excluded the second movie file
    152 and sent it to run again.
    153 
    154 This was a long, boring process. It sucked.
    155 
    156 ### An Afternoon Breakthrough
    157 
    158 Then I realized something. `redsnapper` kept crashing when it hit
    159 movies I had stored in [Jellyfin].
    160 
    161 > I don't need Jellyfin at all. I've never watched a movie more than
    162 > once.
    163 >
    164 > The movies take up massive storage on disk, and keep causing tarsnap
    165 > to crash. They don't compress well either, so they take up a fuckton
    166 > of space in the archives.
    167 >
    168 > I can always download the movies again if I want to give them
    169 > another go.
    170 >
    171 > Why the fuck am I forcing myself to deal with this shit?
    172 
    173 I stopped the download in the middle – the day's third, after two
    174 earlier attempts that ended after encountering movie files – and
    175 changed the command slightly before rerunning. After a number of
    176 errors I couldn't explain, I realized my account was negative and
    177 topped it up with another USD$25 before running:
    178 
    179 ```sh
    180 $ doas redsnapper backup-name -- --exclude='*/jellyfin/*'
    181 ```
    182 
    183 I returned to my computer a couple hours later. `redsnapper` had
    184 stopped, with a whole lot of files extracted and a couple errors at
    185 the bottom about symlinks.
    186 
    187 I figured, this time, it had probably done everything properly but
    188 couldn't create the symlinks (probably a flag missing somewhere). I
    189 manually went through my files creating the symlinks, and then brought
    190 everything up with docker-compose.
    191 
    192 I checked the containers. All up.
    193 
    194 I checked the logs – no immediate errors visible.
    195 
    196 I opened [figbert.com] on my laptop. It appeared. Service was
    197 restored. Hallelujah.
    198 
    199 ## Mistakes
    200 
    201 I made a lot of them. Here are a few:
    202 
    203 1. After shutting down my containers, I backed up my entire setup.
    204    This included a number of ["live" databases][live], `.git` folders,
    205    and other data that I either did not need or could reconstruct
    206    once the move had been completed.
    207 2. I didn't back up the `.env` file I use to store secrets for use in
    208    `docker-compose.yml`. I was luckily able to reconstruct it from
    209    individual secrets I stored in my password manager.
    210 3. A thorough read of the manpages before I started (rather than just
    211    the online guides) would have revealed several helpful flags:
    212    `-v` to see what files `tarsnap` is operating on,
    213    `--aggressive-networking` to take advantage of the datacenter
    214    internet speeds, and `--recover` to recover a partial archive
    215    from a checkpoint after an interrupted backup, to name a few.
    216 4. We already talked about Jellyfin. Even with very little content in
    217    Jellyfin, the collection took up huge amounts of space on disk and
    218    in the backup (especially because video files don't compress
    219    well), and sat entirely unused. It is now gone. Good riddance.
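
The first three mistakes above could be guarded against with a small wrapper. This is only a sketch, not the script used for the move: `check_required`, `make_backup` and the `TARSNAP` variable are made-up names, and the `--exclude` pattern simply copies the style of the jellyfin example earlier in the post. The flags themselves (`-v`, `--print-stats`, `--aggressive-networking`) come from the `tarsnap` manpage.

```sh
#!/bin/sh
# Sketch only: TARSNAP, check_required and make_backup are illustrative names.
TARSNAP="${TARSNAP:-tarsnap}"

# Fail early if any path the stack depends on is missing from the backup set.
check_required() {
    missing=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Create an archive with per-file output and end-of-run statistics.
# Usage: make_backup ARCHIVE_NAME PATH...
make_backup() {
    name="$1"
    shift
    check_required "$@" || return 1
    # The exclude pattern is an assumption; adjust it to the real layout.
    "$TARSNAP" -c -v --print-stats --aggressive-networking \
        --exclude '*/.git/*' -f "$name" "$@"
}

# Example (not run here):
#   make_backup "vps-$(date +%Y-%m-%d)" docker-compose.yml .env data
```

Listing `.env` explicitly means a missing file stops the backup before anything is uploaded. The manpage also describes `tarsnap` printing the file it is currently working on when it receives `SIGUSR1`, which would be a gentler way to check on a silent terminal than `<Ctrl-C>`; it is worth confirming that against the installed version.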
    220 
    221 ## Future
    222 
    223 What did I learn? Well, I'm still devising a plan to prevent things
    224 like this from happening in the future. Here's the plan currently:
    225 
    226 ### Backups
    227 
    228 Back up everything every day. I'll keep a buffer of three "rolling"
    229 backups: once three exist, each new backup replaces the oldest, so
    230 there are never more than three stored at a time.
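
The rolling buffer could be sketched roughly like this. It assumes every archive name shares a prefix and ends in a sortable timestamp, so a plain reverse sort puts the newest first; `KEEP`, `archives_to_prune` and `prune_old_archives` are hypothetical names. `--list-archives` and `-d` are the `tarsnap` options for listing and deleting archives.

```sh
#!/bin/sh
# Sketch only: KEEP, archives_to_prune and prune_old_archives are illustrative names.
TARSNAP="${TARSNAP:-tarsnap}"
KEEP="${KEEP:-3}"

# Read archive names on stdin, one per line, and print all but the newest $KEEP.
# Assumes names sort chronologically, e.g. vps-2021-06-17.
archives_to_prune() {
    sort -r | tail -n +"$((KEEP + 1))"
}

# Delete every archive that archives_to_prune selects.
prune_old_archives() {
    "$TARSNAP" --list-archives | archives_to_prune | while read -r name; do
        "$TARSNAP" -d -f "$name"
    done
}
```

Keeping the selection logic in its own function makes it possible to pipe a list of names through `archives_to_prune` and inspect the result before anything is actually deleted.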
    231 
    232 The backup script will shut down the services, dump the databases
    233 (i.e. convert as much content to plain-text, easily-compressible
    234 formats as possible) and make a time-stamped backup (currently only
    235 with Tarsnap, but perhaps in the future with a number of other
    236 services).
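
A rough outline of such a script might look like the following. It is a sketch under assumptions rather than the finished script: `COMPOSE`, `archive_name`, `dump_databases` and `run_backup` are invented for illustration, and the dump step is left as an empty hook because the right dump command depends on which databases are in the stack.

```sh
#!/bin/sh
# Sketch only: COMPOSE, TARSNAP, archive_name, dump_databases and run_backup
# are illustrative names, not part of any existing script.
COMPOSE="${COMPOSE:-docker-compose}"
TARSNAP="${TARSNAP:-tarsnap}"

# Build a time-stamped archive name such as vps-2021-06-17T03-00-00Z.
archive_name() {
    printf '%s-%s\n' "${1:-vps}" "$(date -u +%Y-%m-%dT%H-%M-%SZ)"
}

# Hypothetical hook: replace the body with real dump commands
# (for instance pg_dump or sqlite3 .dump) for each database.
dump_databases() {
    :
}

# Stop services, dump, archive, then restart even if an earlier step failed.
# Usage: run_backup PATH...
run_backup() {
    status=0
    "$COMPOSE" stop || return 1
    dump_databases && "$TARSNAP" -c -f "$(archive_name)" "$@" || status=$?
    "$COMPOSE" start || status=$?
    return "$status"
}
```

Restarting the services regardless of how the archive step went keeps a failed backup from turning into an outage. One caveat: if a database itself runs in a container, its dump has to be taken while that container is still up, so the order of the first two steps may need adjusting.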
    237 
    238 ### Restoring
    239 
    240 Simply having high-quality backups to restore will already be a huge
    241 leap forward. I'm also *definitely* going to continue using
    242 [redsnapper]: the speed gains it gives on large backups are crucial.
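
For the "what if it failed silently?" question, one option is to compare the archive's file listing against what actually landed on disk. The sketch below assumes the archive stores relative paths and that `tarsnap -t -f` prints one path per line; `missing_paths` and `verify_restore` are hypothetical helper names.

```sh
#!/bin/sh
# Sketch only: missing_paths and verify_restore are illustrative names.
TARSNAP="${TARSNAP:-tarsnap}"

# Read archive member paths on stdin and print those absent under ROOT.
# The -L test keeps dangling symlinks from being reported as missing.
missing_paths() {
    root="${1:-.}"
    while IFS= read -r path; do
        [ -e "$root/$path" ] || [ -L "$root/$path" ] || printf '%s\n' "$path"
    done
}

# List the archive and report anything that did not make it to disk.
# Usage: verify_restore ARCHIVE_NAME [ROOT]
verify_restore() {
    "$TARSNAP" -t -f "$1" | missing_paths "${2:-.}"
}
```

An empty result would suggest every listed path exists on disk. It says nothing about whether file contents are complete, so it is a quick sanity check and not a full verification.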
    243 
    244 ### Manpages
    245 
    246 I really should read all the documentation before I try something new.
    247 
    248 ## Bye Bye
    249 
    250 I'll write further about my self-hosting setup as it evolves, and
    251 publish the backup script once it's finished. I'll also maintain a
    252 dedicated page on my site describing my self-hosting setup as it
    253 changes.
    254 
    255 Also, I'm sure there are people more knowledgeable about Tarsnap than
    256 I. That's basically the point of this article. If you are one of these
    257 people, please don't hesitate to [email me] if you've got corrections,
    258 advice, or just want to flex that you know how to do backups better
    259 than I do.
    260 
    261 [mv]: @/posts/moving-to-hetzner-from-digitalocean/index.md
    262 [RSS reader]: https://miniflux.app
    263 [IRC bouncer]: https://thelounge.chat
    264 [email]: https://maddy.email
    265 [Debian]: https://www.debian.org
    266 [Alpine Linux]: https://alpinelinux.org
    267 [Tarsnap]: https://www.tarsnap.com
    268 [Bitwarden]: https://bitwarden.com
    269 [sad song]: https://www.youtube.com/watch?v=I-sH53vXP2A
    270 [helper scripts]: https://www.tarsnap.com/helper-scripts.html
    271 [redsnapper]: https://github.com/directededge/redsnapper
    272 [Jellyfin]: https://jellyfin.org
    273 [figbert.com]: /
    274 [the song]: https://www.youtube.com/watch?v=gPOEBkcZHM4
    275 [live]: https://www.tarsnap.com/tips.html#back-up-live
    276 [email me]: mailto:figbert@figbert.com