Marquette University logo      

 

Reference:

See also Dr. Barnard's notes on file system implementation

Chapter 11 described the public-facing API of file systems

Chapter 12 describes how the API might be implemented
Algorithms and data structures to map the logical file system onto the physical secondary-storage devices

We will discuss
12.1 File-System Structure
12.2 File-System Implementation
12.3 Directory Implementation
12.4 Allocation Methods
12.5 Free-Space Management
12.6 Efficiency and Performance
12.7 Recovery
12.9 NFS

12.1 File-System Structure

Magnetic disks:
Can be rewritten in place
Can access directly any block

Levels, see Figure 12.1

12.2 File-System Implementation

On the disk

  • Boot control block - Information needed to boot the OS from that volume
  • Volume control block - Volume parameters, e.g., number of blocks, size of blocks, free block count, free block pointers, free FCB count, free FCB pointers
  • Structure to organize files, e.g., file names and inode numbers
  • Per-file File Control Block (FCB), e.g., inode with permissions, ownership, size, location of data blocks

In memory

  • Mount table - Each mounted volume
  • Directory structure cache - Information about recently-accessed directories
  • System-wide open-file table - Copy of FCB for each open file
  • Per-process open-file table - Pointer to system-wide table

Details vary widely by system

inode (from Minix code): or go to Minix book > Files > src > inode.h

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
				src/fs/inode.h
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 
20500	/* Inode table.  This table holds inodes that are currently in use.  In some
20501	 * cases they have been opened by an open() or creat() system call, in other
20502	 * cases the file system itself needs the inode for one reason or another,
20503	 * such as to search a directory for a path name.
20504	 * The first part of the struct holds fields that are present on the
20505	 * disk; the second part holds fields not present on the disk.
20506	 * The disk inode part is also declared in "type.h" as 'd1_inode' for V1
20507	 * file systems and 'd2_inode' for V2 file systems.
20508	 */
20509
20510	EXTERN struct inode {
20511	  mode_t i_mode;                /* file type, protection, etc. */
20512	  nlink_t i_nlinks;             /* how many links to this file */
20513	  uid_t i_uid;                  /* user id of the file's owner */
20514	  gid_t i_gid;                  /* group number */
20515	  off_t i_size;                 /* current file size in bytes */
20516	  time_t i_atime;               /* time of last access (V2 only) */
20517	  time_t i_mtime;               /* when was file data last changed */
20518	  time_t i_ctime;               /* when was inode itself changed (V2 only)*/
20519	  zone_t i_zone[V2_NR_TZONES];  /* zone numbers for direct, ind, and dbl ind */
20520
20521	  /* The following items are not present on the disk. */
20522	  dev_t i_dev;                  /* which device is the inode on */
20523	  ino_t i_num;                  /* inode number on its (minor) device */
20524	  int i_count;                  /* # times inode used; 0 means slot is free */
20525	  int i_ndzones;                /* # direct zones (Vx_NR_DZONES) */
20526	  int i_nindirs;                /* # indirect zones per indirect block */
20527	  struct super_block *i_sp;     /* pointer to super block for inode's device */
20528	  char i_dirt;                  /* CLEAN or DIRTY */
20529	  char i_pipe;                  /* set to I_PIPE if pipe */
20530	  char i_mount;                 /* this bit is set if file mounted on */
20531	  char i_seek;                  /* set on LSEEK, cleared on READ/WRITE */
20532	  char i_update;                /* the ATIME, CTIME, and MTIME bits are here */
20533	} inode[NR_INODES];

create(): To create a new file

  • Application program calls logical file system create()
  • Allocated new FCB (where?)
  • Read appropriate directory into memory
    (File system has no concept of "current directory")
  • Update with file name and FCB
  • Writes (what?) back to disk

open():

  • Pass a file name
  • Search system-wide open files table
  • If found, create per-process open file table entry pointing to system-wide open file entry
  • Search directory structure for given file name (parts of directory structure many be cached in memory)
  • When file is found, copy FCB to system-wide open file table, which also tracks how many processes have that file open
  • Create per-process open file table entry pointing to system-wide open file entry
  • open() returns a pointer to the appropriate entry in the per-process open file table
  • All subsequent accesses are through that pointer or file handle
  • Errors?

read() or write():

  • Access through pointer to per-process open file table
  • For sequential access, per-process open file table includes pointer to current location
  • For direct access
  • Errors?

close():

  • Remove per-process open file table entry
  • Decrement process count in system-wide open file table entry
  • When process count == 0, copy any updated file meta data back to disk-based directory structure
  • Remove system-wide open file table entry
  • Should be done "automatically" for every dying process
  • Errors?

12.3 Directory Implementation

Operations include:

  • create
  • read
  • create new file
  • find file
  • delete file
  • remove
  • get/set attributes

Directory attributes include:

  • List of files

Primarily on disk. Cached in memory

12.3.1 Linear list

Directory is

File name Pointer to FCB

 

Create file

  • find file --> none
  • add entry

Delete file

  • find file
  • recover its space
  • "remove" entry
  • Observe: easy to mark as "deleted"

Table as array, mark as available, move entries, linked list

Problem: linear search time for find file

12.3.2 Hash table

As you learned in Data Structures

hash(fileName) --> table index

Chained overflow hash table

12.4 Allocation Methods

Where on the disk are data blocks for a file?

How do we Create(); Write(); Read(); Delete(); recover deleted?

Disk address includes: volume, cylinder, sector, block

For now, consider as a linear array

12.4.1 Contiguous allocation

File is a contiguous set of blocks

Directory (cf. Figure 12.5):

File name Start block Number of blocks

Where? First fit, best fit, worst fit
See 12.5 Free-space management

Advantages:

  • Easy access: Sequential?, direct?, indexed?
  •  

Disadvantages:

  • External fragmentation
  • Initial allocation size
  • File growth

Disk defragmentation?

Variant: Extent - Clusters of blocks

12.4.2 Linked allocation

File is a linked list of blocks, located anywhere

Directory (cf. Figure 12.6):

File name Start block End block

Blocks of file are in a linked list

Where? Anywhere

Create();
Write();

Advantages:

  • Efficient space allocation
  • No external fragmentation
  • No need to compact

Disadvantages:

  • Random access? Sequential?, direct?, indexed?
  • Speed of disk search
  • Reliability - lost or damaged pointers?
  • Overhead of disk space?

Disk defragmentation makes linked allocation behave like contiguous allocation

Variant: Allocate clusters of blocks

Many variants, including File-Allocation Table (FAT) of MS-DOS. See Figure 12.7. Cache FAT?

12.4.3 Indexed allocation

Gather block pointers into a single(?) index block. See Figure 12.8

UNIX inode, see Figure 12.9

Advantages:

  • Supports direct access

Disadvantages:

  • Overhead of disk space? E.g., 1 - 2 block file

Answer?

All of the above?

 

Fig. 11.9 UNIX inode structure

 

12.5 Free-Space Management

Disk equivalent of malloc and free

A new file is composed of blocks

How do we know which disk blocks are free for assignment?

Reuse space from deleted files

Obvious: List of free blocks

12.5.1 Bit vector: One bit for each disk block

Advantages: Simple

Disadvantages: Must cache for performance
Cache size?

12.5.2 Linked list

Use head? Stack
Immediate reuse
Recovery?
FIFO?

12.5.3 Grouping: On first free block, store links to n-1 free blocks. nth is another block of links

12.5.4 Counting: Keep address of free block and number of contiguous free blocks there

12.6 Efficiency and Performance

Disk performance is a major system bottleneck

Solutions include

  • Disk strategies, see Chapter 10
  • Caching, e.g., for sequential file access, use free-behind and read-ahead

12.7 Recovery

Problems:

  • Do in-memory structures agree with on-disk structures?
    E.g., Write-through and write-back memory management strategies
  • Are on-disk structures consistent? E.g., files in directory, locations of file blocks

How? Crash?

12.7.1 Consistency checking

Exercise: Write fsck in UNIX or chkdsk in MS-DOS
What can you check?
What can you fix?

Suggests redundancy?

Error-detecting or error correcting codes?

Cyclic Redundancy Check (CRC)?

12.7.2 Backup and restore

Your hard drive will fail

Back up to tape? CD? other?

Backups: Full, incremental, differential

Schedule?

Recovery?

Disk synchronization

RAID (Redundant Arrays of Independent/Inexpensive Disks), see section 12.7

12.8 NFS (Omit)

Postpone until we talk more about networks.

 

 

 

 
  Marquette University. Be The Difference. Marquette | Corliss |