• 0 Posts
  • 2 Comments
Joined 7 months ago
cake
Cake day: June 29th, 2024

help-circle
  • I don’t think ‘cattle not pets’ is all that corporate, especially w/r/t death of the author. For me, it’s more about making sure that failure modes have (rehearsed) plans of action, and being cognizant of any manual/unreplicable “hand-feeding” that you’re doing. Random and unexpected hardware death should be part of your system’s lifecycle, and not something to spend time worrying about. This is also basically how ZFS was designed from a core level, with its immense distrust for hardware allowing you to connect whatever junky parts you want and letting ZFS catch drives that are lying/dying. In the original example, uptime seems to be an emphasized tenet, but I don’t think it’s the most important part.

    RE replacements on scheduled time, that might be true for RAIDZ1, but IMO a big selling point of RAIDZ2 is that you’re not in a huge rush to get resilvering done. I keep a cold drive around anyway.


  • “Cattle not pets” in this instance means you have a specific plan for the random death of a HDD (which RAIDZ2 basically already handles), and because of that you can work your HDDs until they are completely dead. If your NAS is a “pet” then your strategy is more along the lines of taking extra-good care of your system (e.g. rotating HDDs out when you think they’re getting too old, not putting too much stress on them) and praying that nothing unexpected happens. I’d argue it’s not really “okay” to have pets just because you’re in a homelab, as you don’t really have to put too much effort into changing your setup to be more cynical instead of optimistic, and it can even save you money since you don’t need to worry about keeping things fresh and new.

    “In the old way of doing things, we treat our servers like pets, for example Bob the mail server. If Bob goes down, it’s all hands on deck. The CEO can’t get his email and it’s the end of the world. In the new way, servers are numbered, like cattle in a herd. For example, www001 to www100. When one server goes down, it’s taken out back, shot, and replaced on the line.”

    ~from https://cloudscaling.com/blog/cloud-computing/the-history-of-pets-vs-cattle/