
Re: handling of lost-writes in linux filesystems



On Mon, 17 Mar 2008 22:35:01 +0530, Manish Katiyar wrote:
> Thanks for the reply. You are correct to some extent, though I don't
> agree with that completely, for the simple reason that today ext{3,4}
> are aiming to be enterprise filesystems and it's completely valid
> for the device to have such bugs. Imagine: by the time you realise that
> your important data was not properly written to disk and has been lost,
> the damage has already been done. So I guess either the underlying RAID
> software or the filesystem should have a way of guaranteeing that the
> data is safe.

I suspect what you are asking is "do common Linux filesystems do
checksumming?". To the best of my knowledge the answer is no (although
future filesystems like btrfs ( http://oss.oracle.com/projects/btrfs/ )
will). A lot depends on exactly how this problem manifests itself and
how often it happens, though. In a RAID setup there is a possibility it
would be picked up when the array is scrubbed or resilvered, because the
copies (or the parity) would no longer agree (although what you do once
you find the problem is another issue).
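
To make the RAID point concrete, here is a rough sketch of what a
mirror comparison pass amounts to. It is plain Python working on
in-memory byte strings, not how any real RAID implementation does it;
the name scrub_mirrors and the 4096-byte block size are made up for
illustration.

```python
BLOCK_SIZE = 4096  # assumed block size, purely illustrative


def scrub_mirrors(copy_a, copy_b, block_size=BLOCK_SIZE):
    """Return the indices of blocks that differ between two mirror copies."""
    if len(copy_a) != len(copy_b):
        raise ValueError("mirror copies must be the same length")
    mismatched = []
    for offset in range(0, len(copy_a), block_size):
        block_a = copy_a[offset:offset + block_size]
        block_b = copy_b[offset:offset + block_size]
        if block_a != block_b:
            # We know the copies disagree, but not which one is stale.
            mismatched.append(offset // block_size)
    return mismatched
```

Note that this only tells you the two copies disagree. With nothing
else to go on there is no way to say which copy holds the data that was
actually meant to be written.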

Ultimately this is no different to any other "what if the hardware lies
to you?" question. What if your RAM flips a bit? What if your disk
controller is buggy? What if the CPU gives you bad results? At some point,
if you need that level of assurance, you are going to have to calculate a
checksum, and if the checksum is redundant enough you may be able to not
only detect the error but recover from it too. Doing all this checking
is bound to involve a tradeoff (extra CPU time and space at the least),
though, and it's always handy to have real backups.
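
As a toy illustration of the detect-and-recover idea, here is a sketch
of a two-way mirror that keeps a CRC32 per block apart from the data.
Everything in it (the ChecksummedStore class, the lost_on parameter
used to simulate a dropped write, the 512-byte blocks) is invented for
the example and is not how any existing filesystem does it. The one
detail that matters for lost writes is that the checksum lives
somewhere other than the block it covers: if it were stored inside the
block, a dropped write would leave old data next to an old checksum
that still matches.

```python
import zlib


class ChecksummedStore:
    """Toy two-way mirror with per-block CRC32s kept apart from the data."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        empty = bytes(block_size)
        self.mirrors = [[empty] * num_blocks for _ in range(2)]
        self.checksums = [zlib.crc32(empty)] * num_blocks

    def write(self, index, data, lost_on=None):
        """Write a block; lost_on names a mirror that silently drops it."""
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.checksums[index] = zlib.crc32(data)
        for mirror_id, mirror in enumerate(self.mirrors):
            if mirror_id != lost_on:
                mirror[index] = data

    def read(self, index):
        """Return the block, repairing any mirror that fails its checksum."""
        expected = self.checksums[index]
        for mirror in self.mirrors:
            if zlib.crc32(mirror[index]) == expected:
                good = mirror[index]
                for other in self.mirrors:
                    other[index] = good  # overwrite stale or corrupt copies
                return good
        raise IOError("block %d: no mirror matches the stored checksum" % index)
```

If the write is lost on one mirror, read() still finds the good copy
and repairs the stale one; if both copies are bad, all it can do is
report the error, which is where the backups come in.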

-- 
Sitsofe | http://sucs.org/~sits/

