Is bit rot really a threat that I should worry about?

A Mouse · 2 years ago

Is bit rot really a threat that I should worry about?

@dragontamer@lemmy.world · edit-2 2 years ago

Wait, what’s wrong with issuing “ZFS Scan” every 3 to 6 months or so? If it detects bitrot, it immediately fixes it. As long as the bitrot wasn’t too much, most of your data should be fixed. EDIT: I’m a dumb-dumb. The term was “ZFS scrub”, not scan.

If you’re playing with multiple computers, “choosing” one to be a NAS and being extremely careful with the data it’s storing makes sense. Regularly scanning all files and attempting repairs (which is just a few clicks with most NAS software) is incredibly easy, and could probably be automated.
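As a rough sketch of what that automation could look like, here is a small Python wrapper that you could call from cron or a systemd timer. The pool name `tank` and the mount point `/mnt/data` are made-up placeholders, and the script defaults to a dry run so that it only prints the command it would issue. It assumes the standard `zpool scrub` and `btrfs scrub start` commands are installed and that you run it with enough privileges.

```python
import subprocess


def scrub_command(filesystem: str, target: str) -> list[str]:
    """Build the scrub command for a ZFS pool or a mounted btrfs filesystem."""
    if filesystem == "zfs":
        return ["zpool", "scrub", target]           # target is a pool name
    if filesystem == "btrfs":
        return ["btrfs", "scrub", "start", target]  # target is a mount point
    raise ValueError(f"unsupported filesystem: {filesystem}")


def run_scrub(filesystem: str, target: str, dry_run: bool = True) -> list[str]:
    """Start a scrub, or only return the command when dry_run is True."""
    cmd = scrub_command(filesystem, target)
    if not dry_run:
        # Raises CalledProcessError if the tool exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


# "tank" and "/mnt/data" are hypothetical names; substitute your own.
print(run_scrub("zfs", "tank"))
print(run_scrub("btrfs", "/mnt/data"))
```

Both commands start the scrub in the background, so checking the result later (`zpool status` or `btrfs scrub status`) is still up to you.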

A Mouse · 2 years ago

I guess my primary concern was what happens if I didn’t have the computer with ZFS (in my case btrfs, but similar thing). Maybe it is for the best that I keep the RAID setup to scrub and make sure important data is safe, and use the smaller single-disk mini PC for services and data that aren’t as important.

@dragontamer@lemmy.world · edit-2 2 years ago

If you have a NAS, then just put iSCSI disks on the NAS, and network-share those iSCSI fake-disks to your mini-PCs.

iSCSI is “pretend to be a hard-drive over the network”. iSCSI can exist “after” ZFS or BTRFS (that is, layered on top of it), meaning your scrubs / scans will fix any issues. So your mini-PC can have a small C: drive, but then be configured so that most of its storage lives on the D: iSCSI / network drive.

iSCSI is very low-level. Windows literally thinks it’s dealing with a (slow) hard drive over the network. As such, it works even in complex situations like Steam installations, albeit at slower network speeds (it has to talk to the NAS before the data comes in) rather than the faster speeds of a directly connected hard drive (or SSD).

Bitrot is a solved problem. It is solved by using bitrot-resilient filesystems with regular scans / scrubs. You build everything on top of solved problems, so that you never have to worry about the problem ever again.

@vividspecter@lemm.ee · 2 years ago

As such, it works even in complex situations like Steam installations

Steam also works with NFS just fine (at least for the libraries, maybe not a full Steam installation), although Samba can be problematic in my experience. But your overall point is reasonable.

A Mouse · 2 years ago

Thanks for that information about iSCSI, I hadn’t looked into it. I will probably just stick with my primary server for the moment, maybe rebuild it into a NAS, and then use mini PCs with it as the storage.

@markstos@lemmy.world · 2 years ago

You don’t define bitrot. If you leave software alone with no updates for long enough, yes, there will be problems.

There will eventually be a security issue with no fix, or a new OS or hardware it doesn’t work on.

Backups can also fail over time if restores are not tested periodically.

This recently happened to me. A server wouldn’t boot anymore, so we restored from backup, but it still wouldn’t boot. The issue was that we’d introduced a change that caused a boot failure. To fix that by restoring from a backup, we’d need a backup from before that change. It turns out we had one, but didn’t realize what the issue was.

The other moral is to reboot frequently if only to confirm the system can still boot.

@dragontamer@lemmy.world · edit-2 2 years ago

That’s not what storage engineers mean when they say “bitrot”.

“Bitrot”, in the scope of ZFS and BTRFS, means the situation where a hard-drive’s “0” gets randomly flipped to “1” (or vice versa) during storage. It is a well known problem and can happen within “months”. Especially as a 20-TB drive these days is a collection of 160 trillion bits, there’s a high chance that at least some of those bits malfunction over a period of ~double-digit months.
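For anyone who wants to check the arithmetic, a quick Python sketch follows. The bit count follows directly from 20 decimal terabytes. The per-bit flip probability is a made-up placeholder just to show the shape of the calculation, not a measured figure for any real drive.

```python
import math

DRIVE_TB = 20                 # decimal terabytes, as drives are marketed
BYTES_PER_TB = 10**12
BITS_PER_BYTE = 8

total_bits = DRIVE_TB * BYTES_PER_TB * BITS_PER_BYTE
print(f"{total_bits:,} bits")  # 160,000,000,000,000 bits


def p_at_least_one_flip(bits: int, p_flip: float) -> float:
    """Chance that at least one of `bits` independent bits flips.

    Computes 1 - (1 - p_flip)**bits in a numerically stable way.
    """
    return -math.expm1(bits * math.log1p(-p_flip))


# ASSUMPTION: hypothetical per-bit flip probability over the storage period.
ASSUMED_P_FLIP = 1e-15
print(p_at_least_one_flip(total_bits, ASSUMED_P_FLIP))
```

The point is only that with this many bits, even a tiny per-bit probability adds up to a noticeable chance of at least one flip.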

Each problem has a solution. In this case, Bitrot is “solved” by the above procedure because:

Bitrot usually doesn’t happen within single-digit months. So regular scrubs every ~6 months nearly guarantee that any bitrot problems you find will be limited in scope, just a few bits at the most.
Filesystems like ZFS or BTRFS are designed to handle many, many bits of bitrot safely.
Scrubbing is a process where you read all of your data and, if necessary, restore any files where bitrot has been detected.
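
To make the scrub idea concrete, here is a toy Python model of it. This is not how ZFS or BTRFS are actually implemented: it assumes a simple two-disk mirror, one SHA-256 checksum per block stored separately from the data, and invented block contents. The real filesystems use their own checksum algorithms and on-disk layouts.

```python
import hashlib


def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def scrub(copies: list[list[bytearray]], checksums: list[str]) -> tuple[int, int]:
    """Read every block on every copy and repair bad ones from a good copy.

    Returns (repaired_blocks, unrecoverable_blocks).
    """
    repaired = 0
    unrecoverable = 0
    for i, expected in enumerate(checksums):
        # Find any copy of block i whose data still matches its checksum.
        good = next(
            (bytes(c[i]) for c in copies if checksum(bytes(c[i])) == expected),
            None,
        )
        if good is None:
            unrecoverable += 1  # every copy is damaged: data loss
            continue
        for c in copies:
            if checksum(bytes(c[i])) != expected:
                c[i][:] = good  # overwrite the rotted block in place
                repaired += 1
    return repaired, unrecoverable


# Hypothetical data: three blocks mirrored across two "disks".
blocks = [b"family photos", b"tax records", b"steam library"]
checksums = [checksum(b) for b in blocks]
disk_a = [bytearray(b) for b in blocks]
disk_b = [bytearray(b) for b in blocks]

disk_a[1][0] ^= 0x01  # simulate bitrot: flip a single bit on disk A

result = scrub([disk_a, disk_b], checksums)
print(result)  # (1, 0)
```

This also shows why scrubbing regularly matters: the repair only works while at least one copy of each block is still intact. If the same block rots on both disks before you scrub, it lands in the unrecoverable count.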

Of course, if hard drives are of noticeably worse quality than expected (ex: if you do have a large number of failures in a shorter time frame), or if you’re not using the right filesystem, or if you go too long between your checks (ex: taking 25 months to scrub for bitrot instead of just 6 months), then you might lose data. But we can only plan for the “expected” kinds of bitrot. The kinds that happen within 25 months, or 50 months, or so.

If you’ve gotten screwed by a hard drive (or SSD) that bitrots away in like 5 days or something awful (maybe someone dropped the hard drive and the head scratched a ton of the data away), then there’s nothing you can really do about that.

@iHUNTcriminals@lemm.ee · 2 years ago

Does the smart thing in omv take care of this? Anyone know? Obviously I’m a novice haha.