zfs set volsize={Installed RAM ÷ 2}G zones/dump
dumpadm -d /dev/zvol/dsk/zones/dump
Today I continue my SmartOS saga. Actually, it’s done! I have SmartOS on my main server. It was quite a task. My dad’s server was replaced with SmartOS in April and has been running very well, but I have yet to get SmartOS on my friend’s server. Previously, I said I would probably install SmartOS on a friend’s box before mine, but events led me to do my own box first.
Recap: Why The Switch?
I had been using FreeBSD as my main server OS for 7 years. It has been great: reliable, performant, and it never gave me an issue. My first motivation is that this is a home lab server and I like to try and learn new things. SmartOS overlaps with FreeBSD in features that I like:
- ZFS
- bhyve
- Same bootloader as FreeBSD
- zones (like jails, but not entirely)
The other reason has to do with a couple of shortcomings in FreeBSD that SmartOS solves in its own way:
- VM images - native OS zones, Linux (LX) branded zones, a variety of Linux VMs, and I can still make my own from scratch
- Unified tooling for the sysadmin (vmadm, imgadm, zoneadm, fwadm, etc.)
- JSON manifests to create zones easily from images
- Available Ansible module for zone management
- It’s easier to wrap my mind around firewalls for each zone with fwadm and tags
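To give a feel for the JSON manifests mentioned in the list above, here is a minimal sketch of one for a native zone with a delegated dataset. Every value in it is a placeholder assumption (the alias, the all-zero image UUID, the memory and quota figures, the "admin" nic tag), not something taken from my actual setup. The script only writes the file and prints the vmadm command rather than running it.

```shell
#!/bin/sh
# Write a hypothetical manifest for a native (joyent brand) zone.
# All values are illustrative placeholders; image_uuid in particular
# must be replaced with a real image UUID (see `imgadm avail`).
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
{
  "brand": "joyent",
  "alias": "example-zone",
  "image_uuid": "00000000-0000-0000-0000-000000000000",
  "max_physical_memory": 512,
  "quota": 20,
  "delegate_dataset": true,
  "nics": [
    {
      "nic_tag": "admin",
      "ip": "dhcp"
    }
  ]
}
EOF

# Dry run: print the command that would create the zone from the manifest.
echo "vmadm create -f $manifest"
```

Here max_physical_memory is in MiB and quota is in GiB. Keeping manifests like this in version control is what makes rebuilding a zone a one-line operation.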
I could solve some of these things with third-party tools or by building some of my own stuff, but as I get busier with life, I sometimes need something that I can do fast or with minimal thought. SmartOS makes it a bit easier to automate some of the stuff that I do with my server.
The Long-winded Process
Since I wasn’t familiar with SmartOS and wanted to learn and kick the tires, I pulled out the only extra computer I had, a ThinkCentre with an Intel i5 4570 and 8GB of RAM. Once I had learned enough to feel comfortable, I decided it was time to get the migration on. It was quite the journey learning and starting to move my virtual machines over to this desktop. I only had a single 4TB NAS drive, and the amount of data I have on my server is just over 4TB. So I removed some extra data that was unneeded and planned to move another chunk of data somewhere else while I was preparing this migration. I started by moving some of my simpler VMs, ones that didn’t take up too much space and didn’t require much memory or CPU time.
I proceeded to one of my larger servers, which was Plex. Since I don’t update my Plex library very often and it was small enough to fit, moving it over wasn’t that big of an issue. A day later I noticed that ZFS had found 8 corrupt files in my Plex library. I eventually found out that the issue was a faulty SATA cable. After replacing it, everything worked fine. I ran a memtest to ensure my memory was okay and did a scrub on the zpool. Afterward, I replaced the corrupt files. I was then concerned about the reliability of this drive. It had been running 24/7 for 4 years, then I let it sit for 3 years. That had me concerned about the identical drive in my existing server, which has been running 24/7 for 7 years. To be safe, I decided it was time to spend the cash (which I was trying to avoid) and I ordered 2x14TB drives.
New Drive Configuration
In my initial plan, I had intended to install my zones zpool on one of my SSDs and then have the data in its own data zpool. However, I was advised not to do that since I would not see any gains from using these SSDs. Additionally, I would have needed workarounds in SmartOS, such as LOFS mounts, to get the data properly into each zone, which would have added complexity to my server. So on the advice of the maintainers, I decided to use only my new hard drives.
The Process Begins
Once I got the new drives, I couldn’t wait. I decided to start the migration over the upcoming weekend. I installed my new drives in my ThinkCentre, with one hanging out of the chassis since it only supported a single 3.5" drive. I started copying data from my jails into delegated datasets in my new zones. However, I started encountering issues. One was the constant battle with low memory while trying to move everything over to this new "server" without overloading the desktop. I had been using rsync up until this time, but since my destination zones had too little memory allocated to them, rsync would eventually fail. I found sftp to work, and eventually I reached a point where I had to just turn stuff off and allocate more memory to these zones to get data over efficiently.
I did my best to recreate all my jails as zones. Most of them were done beforehand since they had low CPU/memory requirements. Some others, such as Nextcloud, my samba server, Plex, and my friend’s Minecraft server, required some messing around to get them working correctly in joyent/lx branded zones. Once it was all done, I was ready to power down both machines and swap the guts.
Failure
At this point, I had my 2 disks in my server and everything else hooked up, but encountered an error: I was constantly booting into single user mode. After looking through the logs and beating my head against the wall, it was the end of the night. I had to revert everything and put both computers back in their messed up state, each running its share of the duties. I kept Nextcloud, Immich, Matrix, and Minecraft on my old server for the night and most of the next day.
The following day, I scoured the internet for some sort of answer, and came upon this post. In it the writer notes to size up the dump and swap space on the zones zpool, and even mentions the exact same error I had gotten. So I followed his instructions, but kept swap the same since I was doing this on my live system and not in single user mode. I did increase my dump volsize, using the two commands shown at the top of this post.
My server has 64GB of RAM, whereas the desktop I built the SmartOS install on had 8GB. So the dump size was too small for the server, and I set it to 40GB or something. When I put everything back in my server, it all worked!
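The "installed RAM ÷ 2" rule from the dump commands can be worked out with a small helper. This is only a sketch: the dump_size_gb function name is made up for illustration, and it assumes you pass in the installed memory in megabytes (on illumos, prtconf reports a "Memory size" line in megabytes, which is one place to get that number). It prints the commands instead of running them.

```shell
#!/bin/sh
# Compute a dump zvol size of half the installed RAM, in whole GiB.
# Argument: installed memory in megabytes (hypothetical helper, not a system tool).
dump_size_gb() {
    mem_mb=$1
    echo $(( mem_mb / 1024 / 2 ))
}

# Dry run for a 64GB machine (65536 MB): print the commands, do not execute them.
size=$(dump_size_gb 65536)
echo "zfs set volsize=${size}G zones/dump"
echo "dumpadm -d /dev/zvol/dsk/zones/dump"
```

For 65536 MB this works out to 32G, and for the 8GB desktop (8192 MB) it gives 4G, which shows why a dump volume sized on the small machine can end up too small on the big one. I went a little above the rule with roughly 40GB.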
Finishing the Move
The remaining bits were to put Nextcloud, Immich, Matrix, and Minecraft back together. I ensured my data was current on Nextcloud and did a database dump/restore, and Nextcloud was running well after that. For Immich, I had already moved the data and the samba share (namely, SmartOS’ in-kernel SMB server), so my steps were just to restore the database I had backed up and spin Immich up in a new VM. The Matrix and Minecraft restores went well too. At this point, I was fully on SmartOS.
Finishing Touches
Once done, I had some improvements to make. I had suspected that going from a raidz1 to a mirrored zpool might slow down my write speeds some, and it did. However, I also noticed that Nextcloud was performing particularly slowly. It turned out that I hadn’t installed the PHP redis module, the PHP opcache modules were missing too, and Nextcloud wasn’t configured right for them either. After making those changes, Nextcloud is back to running about where it was before. Some things still feel a little slower, and I am fairly certain they are disk I/O bound.
Monitoring and Backups
I like to receive alerts for problems, so I put in place SMART data monitoring, as well as monitoring to detect if a zpool is not in an ONLINE state or experiencing problems. Additionally, I have a monitor that will alert me if one of my zones goes offline.
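As a rough illustration of the pool and zone checks described above (not necessarily how my own monitors are built), the two filters below read the scripted output of zpool and vmadm and print anything that needs attention. The function names are made up for this sketch. The sample input at the bottom stands in for real command output so the script runs anywhere.

```shell
#!/bin/sh
# Read "name<TAB>health" lines, the format of `zpool list -H -o name,health`,
# and print the name of every pool whose health is not ONLINE.
unhealthy_pools() {
    awk -F '\t' '$2 != "ONLINE" { print $1 }'
}

# Read "uuid:state:alias" lines, the format of `vmadm list -p -o uuid,state,alias`,
# and print the alias of every zone that is not running.
stopped_zones() {
    awk -F ':' '$2 != "running" { print $3 }'
}

# Sample data (made up) in place of real command output.
printf 'zones\tONLINE\nbackup\tDEGRADED\n' | unhealthy_pools
printf 'uuid-1:running:nextcloud\nuuid-2:stopped:plex\n' | stopped_zones
```

On a SmartOS box the real pipelines would be `zpool list -H -o name,health | unhealthy_pools` and `vmadm list -p -o uuid,state,alias | stopped_zones`, with any non-empty output handed to whatever sends the alert.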
The last bit is backups. I am using znapzend to back up my delegated ZFS datasets, and I have backups of my bhyve virtual machines going to my storage zone. My last plan of action is to get all my configs backed up to a better spot so I can use Ansible or something to neatly rebuild my servers.
Final Thoughts
This project was a lot of fun to do, and so far I have been enjoying SmartOS. My hope is that it will continue to run as well as it does now. I still have some fwadm work to do on this server, and I want to get into tagging and improving my zone manifests for future deployments of whatever I want to do. I am also hoping that my vision of this being a time saver will come true. As family and other obligations keep me busy, I want something that I can let lowkey hum along in the background and check up on when I feel like it. SmartOS seems to be more of a rolling release, which is helpful for me. It also helps that when I update the global zone, my native zones get updated too.