How does a journaling filesystem work?
How does a journaling filesystem work in the major Linux distros, and what is the difference between Ext2 and Ext3?
Ext2 is the traditional file system for Linux (inode-based: the inodes hold the metadata and point to the actual data blocks), while ext3 is a journaled file system. A
journaled file system is much easier to recover than a standard one, because after a crash you normally do not need a full fsck run. Data integrity is maintained because
updates are written to a log (the journal) before the actual blocks on disk are updated. Each write to the file system is recorded as a transaction, which is flagged once it is complete. After a
crash, completed transactions are replayed from the journal to the file system and any incomplete transactions are discarded. In case of a failure, a journaled file system
thus ensures that the file system is returned to a consistent state.
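To make the log-then-replay idea concrete, here is a minimal in-memory sketch in Python. It is a toy model, not how ext3 is actually implemented: the class ToyJournaledDisk, its methods, and the block contents are all hypothetical names made up for illustration, and a real journal lives on disk and handles ordering, checksums, and much more.

```python
from dataclasses import dataclass, field


@dataclass
class ToyJournaledDisk:
    """Hypothetical in-memory stand-in for a disk with a write-ahead journal."""
    blocks: dict = field(default_factory=dict)   # block number -> data "on disk"
    journal: list = field(default_factory=list)  # append-only log records
    next_txid: int = 1

    def begin(self):
        # Open a new transaction and note it in the journal.
        txid = self.next_txid
        self.next_txid += 1
        self.journal.append(("begin", txid))
        return txid

    def log_write(self, txid, block_no, data):
        # The update goes to the journal first; the real block is untouched.
        self.journal.append(("write", txid, block_no, data))

    def commit(self, txid):
        # The commit record flags the transaction as complete.
        self.journal.append(("commit", txid))

    def recover(self):
        """Replay committed transactions and discard incomplete ones."""
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "write" and rec[1] in committed:
                _, _, block_no, data = rec
                self.blocks[block_no] = data
        # Everything left in the journal is either applied or abandoned.
        self.journal.clear()


disk = ToyJournaledDisk()

# Transaction 1 completes before the crash.
t1 = disk.begin()
disk.log_write(t1, 10, "inode: size=4096")
disk.log_write(t1, 11, "bitmap: block 42 in use")
disk.commit(t1)

# Transaction 2 is interrupted: the simulated crash happens before commit.
t2 = disk.begin()
disk.log_write(t2, 12, "half-finished update")

# On the next mount, recovery only has to read the journal, not scan the disk.
disk.recover()
print(sorted(disk.blocks.items()))
# → [(10, 'inode: size=4096'), (11, 'bitmap: block 42 in use')]
```

Note that recovery only walks the journal, so its cost depends on how much was in flight at the time of the crash rather than on the size of the partition.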
Prior to journaling filesystems, one had to run fsck to resolve file and metadata inconsistencies. Sure, fsck works, but it is too darn slow because it has to check the entire filesystem, and in today's world, with partitions getting larger and larger, it just
doesn't scale anymore. Filesystem logging is, in many ways, very similar to database logging. Databases keep logs so that if information
has not yet been written from cache to the data blocks, there is a way to recover. You can think of journaling filesystems as providing the same function, but at the filesystem level.
This was first published in October 2004