Understanding the Linux File System

Linux, the pioneer of open source software, is one hell of an operating system. It offers near-unlimited customization and rock-solid stability for computing needs both small and large. The Linux file system is also one of the most robust file systems out there.

People who come into contact with the Linux file system often try to relate it to the Windows one they are familiar with. Although both operating systems let the end user interact with and use the computer, they handle many things differently, sometimes slightly and sometimes drastically. This article aims to answer the fundamental questions about the Linux file system with much-needed clarity.

What is a File System?

In computing, a file system is the method by which data is stored on and retrieved from a storage device. The data is stored in the form of files, which often carry file extensions.

What is a File and File Extension?

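A file is a digital object in a file system that stores information about a particular thing (photos, videos, music, etc.) as a sequence of bits, bytes, lines or records, with its meaning defined by the user. A file extension is the suffix after the last dot in a file name (for example, .jpg or .mp3) that hints at the kind of content the file holds.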
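
To make the idea of an extension concrete, here is a minimal Python sketch that splits a file name into its base name and its extension. The helper name split_name and the sample paths are made up for illustration; the point is only that the extension is part of the name, and that a file can have none at all.

```python
from pathlib import PurePosixPath

def split_name(path: str) -> tuple[str, str]:
    """Return (base name, extension) for a path; the extension may be empty."""
    p = PurePosixPath(path)
    return p.stem, p.suffix

# Illustrative paths only; nothing is read from disk.
print(split_name("/home/raj/photo.jpg"))   # ('photo', '.jpg')
print(split_name("/etc/hostname"))         # ('hostname', '')
```

Because only the text of the path is inspected, this works even for files that do not exist.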

The Linux File System:

A popular saying goes: “Everything in Linux is a file; that which is not is a process.” Linux inherits this view from UNIX, where documents, directories and even hardware devices are all presented as files.

Note: The term “file system” is often used to mean either of the following:

  • The part of the directory tree that is located on a single partition or disk. (A partition is a section of a disk that holds a single file system.)
  • The format actually used on a storage device to organize its data, e.g. FAT32, NTFS, HFS+, Ext4, etc.
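
One way to see this in practice is to ask the system what kind of object sits behind a path. The sketch below is a rough illustration using Python's standard os and stat modules; the function name describe is invented here, and the sample paths are assumptions that may not exist on every machine (hence the existence check).

```python
import os
import stat

def describe(path: str) -> str:
    """Name the kind of file system object found at path."""
    mode = os.stat(path).st_mode
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISFIFO(mode):
        return "named pipe"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"

# Example paths; skip any that are absent on this system.
for path in ("/", "/dev/null", "/etc/hostname"):
    if os.path.exists(path):
        print(path, "->", describe(path))
```

On a typical Linux machine, a directory, a device and a text file all answer the same stat call, which is what the saying is getting at.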

The Linux file structure is a tree-like structure. It starts with the root directory, represented by ‘/’, and then expands into sub-directories. Every partition and device is attached somewhere beneath this root directory. The root is a very important directory in Linux, as the kernel needs a root file system to mount at startup.
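
The single-tree idea can be shown with a short Python sketch: whatever path you start from, walking upwards always ends at ‘/’. The function name ancestors and the sample path are illustrative assumptions.

```python
from pathlib import PurePosixPath

def ancestors(path: str) -> list[str]:
    """List every directory above path, ending at the root '/'."""
    return [str(parent) for parent in PurePosixPath(path).parents]

# Illustrative path; it does not need to exist.
print(ancestors("/home/raj/notes.txt"))
# ['/home/raj', '/home', '/']
```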

What is a Mount Point of a Linux File System?

A mount point is the directory in the Linux directory tree at which the file system of a particular device is attached. Any existing directory can serve as a mount point; /mnt is simply the conventional place for temporary mounts. The data on a device cannot be accessed through the directory tree until its file system has been mounted.
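
On Linux, the kernel lists what is currently mounted in /proc/mounts, one file system per line, with the device, the mount point and the file system type as the first three fields. The following is a minimal sketch of reading that layout in Python; the function name parse_mounts and the sample device names are hypothetical, and real systems will show different entries.

```python
import os

def parse_mounts(text: str) -> list[tuple[str, str, str]]:
    """Parse /proc/mounts-style text into (device, mount_point, fs_type) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries

# Hypothetical sample in the /proc/mounts layout.
sample = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/sdb1 /mnt/usb vfat rw,nosuid 0 0\n"
)
for device, mount_point, fs_type in parse_mounts(sample):
    print(f"{device} is mounted at {mount_point} as {fs_type}")

# On a live Linux system, the same parser can read the kernel's own table.
if os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        print(len(parse_mounts(f.read())), "file systems currently mounted")
```

Python's os.path.ismount(path) offers a quicker check when you only need to know whether one directory is a mount point.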

Directories in the Linux File System:

This is another classic case where people accustomed to the Windows way of computing try to map what they see onto what they already know. However, the Linux directory layout follows the conventions set by UNIX itself.

  • Root “/” file system: This file system is mounted by the kernel itself during boot. It is generally small in size and should not be messed with, because it holds the files that are essential for booting and running the system.
  • /usr file system: This directory is created during the installation of Linux itself and holds user programs and their supporting files in sub-directories such as /usr/bin, /usr/include, /usr/lib and /usr/local (for locally installed software).
  • /var file system: This directory contains variable data, i.e. data that changes constantly during normal operation, such as log files and temporary files. Its sub-directories include /var/cache/man (a cache for man pages), /var/games (variable data belonging to games), /var/lib (state files that change while programs run), /var/log (logs from different programs) and /var/tmp (temporary files).
  • /home file system: The contents of this directory differ from host to host. It holds each user’s personal files and user-specific application configuration. Linux creates a directory for each user under /home, named after the user name. For example, a user named raj gets the home directory /home/raj.
  • /proc file system: This is a virtual file system that the kernel creates in main memory (RAM) to expose information about the system, such as CPU information, installed devices and memory usage, reflecting the current state of the kernel. Its entries include /proc/cpuinfo, /proc/devices, /proc/filesystems, /proc/net and /proc/meminfo.
  • /bin: This directory contains the binaries, i.e. common programs that are shared by the system, the system administrator and the users.
  • /sbin: This directory has the system binaries, i.e. programs meant for use by the system and the system administrator.
  • /lib: This directory contains the library files needed by the programs that the system and the users run.
  • /mnt: This directory is the standard mount point for external file systems (CD-ROMs, external HDDs, etc.).
  • /net: This directory is the standard mount point for entire remote file systems.
  • /opt: This directory typically contains extra and third-party software binaries.
  • /etc: This directory contains the most important system configuration files. Its role is similar to that of the Control Panel in Windows.
  • /dev: This directory contains references to the hardware devices and peripherals attached to the system, which are represented as files with special properties.
  • /initrd: On some distributions, this directory contains the information needed for booting and hence should not be messed with.
  • /misc: This directory is kept for miscellaneous purposes.
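
Not every distribution ships every directory in this list, so it can be useful to check what a given machine actually has. The sketch below is one rough way to do that in Python; the names STANDARD_DIRS and survey are invented for this example, and the output will differ from system to system.

```python
import os

# Directory names taken from the list above; not every system has all of them.
STANDARD_DIRS = ["/", "/usr", "/var", "/home", "/proc", "/bin", "/sbin", "/lib",
                 "/mnt", "/net", "/opt", "/etc", "/dev", "/initrd", "/misc"]

def survey(paths: list[str]) -> dict[str, str]:
    """Map each path to 'missing', 'directory', or 'mount point'."""
    result = {}
    for path in paths:
        if not os.path.isdir(path):
            result[path] = "missing"
        elif os.path.ismount(path):
            result[path] = "mount point"
        else:
            result[path] = "directory"
    return result

for path, status in survey(STANDARD_DIRS).items():
    print(f"{path:10} {status}")
```

Directories reported as a mount point have their own file system attached, which is typical for / and /proc and, depending on the setup, for /home or /var as well.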

The Ext Linux File System Format:

Having looked at the Linux directory hierarchy in detail, let’s move on to the file system format in which Linux typically stores its data.

The Ext file system, or Extended file system, was designed specifically for Linux and was inspired by UFS (the Unix File System, not to be confused with Universal Flash Storage). It is currently in its fourth iteration, Ext4, which is the successor to the Ext3 file system.

Features:

  • Gigantic Volume Support: An Ext4 file system can be up to 1 exbibyte (EiB) in size. 1 EiB = 1024 PiB = 1024 x 1024 TiB. The maximum size of an individual file depends on the block size and reaches 16 TiB with the common 4 KiB block size. That kind of capacity should comfortably cover your storage needs!
  • Extents: An extent is a range of contiguous physical blocks on the storage device and it replaces the traditional block mapping scheme used in Ext2 and Ext3 file systems. The benefits are better handling of large files with improved performance as a whole.
  • Backward Compatibility: Ext4 is backward compatible with Ext3 and Ext2, so file systems in those formats can be mounted as Ext4.
  • Persistent Pre-Allocation: Ext4 can pre-allocate disk space for a file, which makes it likely that the file’s blocks are contiguous and so reduces fragmentation.
  • Delayed Allocation: Ext4 uses a technique called allocate-on-flush, i.e. delayed allocation, to improve read and write performance.
  • Greater number of sub-directories: Ext4 allows for 64000 sub-directories within a single directory as opposed to 32000 in Ext3.
  • Journal and Meta-data Checksums: Ext4 uses checksums to verify the integrity of the file system metadata and journal.
  • Faster File System Checking: Unallocated blocks and inodes are marked as such and hence are skipped during file system check-ups to speed up the process.
  • Multi-block Allocator: Ext4 uses a multi-block allocator when writing multiple file blocks, so that the blocks are allocated contiguously on the disk where possible.
  • Improved Timestamps: As Linux usage grows for more critical applications, the accuracy of time measurement becomes more important. Ext4 improves on this aspect by recording timestamps with nanosecond resolution.
  • Transparent Encryption: Ext4 supports transparent encryption of files and directories at the file system level, so data is encrypted at rest and decrypted on access without applications having to change. This is similar in spirit to, but distinct from, the Transparent Data Encryption that Microsoft, IBM and Oracle use to encrypt database files at rest.
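
The size limits in the first bullet follow from simple arithmetic, sketched below in Python. It assumes a 4 KiB block size (a common default, but configurable) and relies on Ext4 addressing blocks on a volume with 48-bit numbers and blocks within a single file with 32-bit numbers; the variable names are chosen for this example only. Strictly speaking, the result is a binary exbibyte (2^60 bytes), which is about 1.15 decimal exabytes.

```python
BLOCK_SIZE = 4 * 1024      # assumed 4 KiB blocks; other block sizes change the results
VOLUME_BLOCK_BITS = 48     # width of a block number on the volume
FILE_BLOCK_BITS = 32       # width of a block number within one file

max_volume_bytes = (2 ** VOLUME_BLOCK_BITS) * BLOCK_SIZE
max_file_bytes = (2 ** FILE_BLOCK_BITS) * BLOCK_SIZE

EIB = 2 ** 60              # bytes in one exbibyte
TIB = 2 ** 40              # bytes in one tebibyte

print(max_volume_bytes // EIB, "EiB maximum volume size")   # 1
print(max_file_bytes // TIB, "TiB maximum file size")       # 16
```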
