.. _FAT Page:

===================
The FAT File System
===================

Code Organization
=================

bsd/
	C code ported from FreeBSD.

	fs/msdosfs/
		The FreeBSD FAT driver.  Only minor changes have been made from the FreeBSD code.  The exceptions are msdosfs_vfsops.c and msdosfs_vnops.c.  Most of the original content of these files (BSD hook functions) was removed for the port (although some of the hook functions in kernel_interface.cpp are adapted from these).

	kern/ and libkern/
		Heavily modified and simplified versions of FreeBSD kernel source files.

	sys/ and vm/
		Heavily modified and simplified versions of FreeBSD kernel headers, provided for compatibility.  These are adapted to support this driver specifically and are not meant for general BSD compatibility.

kernel_interface.cpp
	Hook functions.  Code in this file and the other top-level files is adapted from several sources:  the FreeBSD driver, the BFS driver, and the original Haiku FAT driver.

dosfs.h
	Header for shared definitions across all (C and C++) driver code.

support.*
	Supporting functions for C++ driver code.

debug.*
	Adapted from the BFS equivalent.

mkdos.*
	Volume initialization, mostly unchanged from the original Haiku driver.

vcache.*
	The vcache keeps track of inode assignments and ties them to direntry locations.  Mostly unchanged from the original Haiku driver.

fssh_defines.h
	Macros that are not provided by the fs_shell interface.

FAT Data Types
==============

The central FAT-specific structs used in the driver include:

struct msdosfsmount
	The FAT private volume.  Note:  the address provided to the VFS is actually that of the corresponding BSD struct mount.

struct denode
	The FAT private node.  Note:  the address provided to the VFS is actually that of the corresponding BSD struct vnode.

struct direntry
	The directory entry that corresponds to a node, as stored on disk.

struct winentry
	A modified directory entry that stores some or all of a long file name on disk.

BSD Compatibility
=================

The BSD structs and functions that are used in the driver for compatibility include the following.  In each, some member variables were removed for simplicity and others were added.

struct vnode
	The BSD VFS node.  The address provided to the Haiku VFS by publish_vnode and dosfs_read_vnode points to this object.  In FreeBSD, the driver also accesses the vnode (in a separate volume) of the device being mounted.  In the port, the mount hook creates a special struct vnode to fill in for this.  That struct vnode is unique in that its private data is NULL, it has v_type VBLK, and its v_bufobj member is set up.

struct mount
	The VFS volume that corresponds to msdosfsmount.  The address provided to the Haiku VFS by dosfs_mount points to this object.

struct cdev
	Details of the device being mounted.

struct buf
	Analogous to a Haiku block cache block, but with public metadata.  bread() or getblkx() will get a buf, and bwrite() or brelse() will put it.  In the ported driver, this system is a wrapper for the block cache.  It is also set up to use the file cache if a BSD function accesses regular file data, although that doesn't happen under the current implementation.

struct bufobj
	The 'parent' of all struct bufs associated with a device.

Inode Numbers
=============

Since the FAT filesystem doesn't store inode numbers on disk, they must be generated by the driver.  The original Haiku FAT driver generated inode numbers based on the location of the file's directory entry when possible, and assigned "artificial" (arbitrary) numbers when the location-based number was not viable.  In this driver, that framework is carried over, but the math used to generate location-based numbers is that used in FreeBSD.

The driver will attempt to assign a location-based inode number as follows:
	regular files:
		parent is the FAT12/16 root directory:
			index of direntry
		otherwise:
			(cluster containing the direntry - 2) * (bytes per cluster) + (index of direntry) + (max root directory entries)
	directory files:
		(cluster number of directory - 2) * (bytes per cluster) + (max root directory entries)

The index is directory-relative when dealing with the FAT12/16 root directory.  Otherwise it is cluster-relative.
The (max root directory entries) term prevents collisions between FAT12/16 root directory entries and other directory entries.
Note that directory files' inode numbers are based on the location of the "." direntry in the directory's own data, not the location of the directory's direntry in its parent's data.
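
The rules above can be sketched in C as follows.  This only illustrates the arithmetic as it is described in this section and is not the driver's code:  struct fat_geometry and the three function names are hypothetical, and the real driver works from struct msdosfsmount and its own macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified volume geometry (not struct msdosfsmount). */
struct fat_geometry {
	uint32_t bytes_per_cluster;	/* size of one data cluster in bytes */
	uint32_t max_root_entries;	/* capacity of the FAT12/16 root directory */
};

/* Regular file whose parent is the FAT12/16 root directory.
 * The index is relative to the start of the directory. */
static uint64_t
inode_for_root_entry(uint32_t index_in_directory)
{
	return index_in_directory;
}

/* Regular file in any other directory.
 * The index is relative to the cluster that holds the direntry. */
static uint64_t
inode_for_file(const struct fat_geometry* geo, uint32_t direntry_cluster,
	uint32_t index_in_cluster)
{
	assert(direntry_cluster >= 2);	/* data clusters are numbered from 2 */
	return (uint64_t)(direntry_cluster - 2) * geo->bytes_per_cluster
		+ index_in_cluster + geo->max_root_entries;
}

/* Directory file, identified by the cluster that holds its "." direntry. */
static uint64_t
inode_for_directory(const struct fat_geometry* geo, uint32_t directory_cluster)
{
	assert(directory_cluster >= 2);
	return (uint64_t)(directory_cluster - 2) * geo->bytes_per_cluster
		+ geo->max_root_entries;
}
```

For example, with 4096-byte clusters and 512 root directory entries, a file whose direntry is at index 7 of cluster 3 would get 1 * 4096 + 7 + 512 = 4615, and a directory whose data starts at cluster 3 would get 4608.  Under these formulas a directory's number is the same value a file at index 0 of that cluster would receive, which is consistent with the number being tied to the "." direntry.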

If the location-based number is already taken by a moved or deleted node, an artificial number is assigned.  These numbers are assigned sequentially starting with ARTIFICIAL_VNID_BITS, which is set greater than the maximum possible location-based number.
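
A minimal sketch of that fallback, under stated assumptions:  SKETCH_ARTIFICIAL_BASE is a placeholder chosen for illustration (the driver defines the real ARTIFICIAL_VNID_BITS), struct inode_allocator and assign_inode are hypothetical names, and the location_taken flag stands in for whatever lookup the driver performs to see whether a number is already in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder only; not the driver's ARTIFICIAL_VNID_BITS value. */
#define SKETCH_ARTIFICIAL_BASE (UINT64_C(1) << 60)

/* Hypothetical allocator state: the next artificial number to hand out. */
struct inode_allocator {
	uint64_t next_artificial;
};

static void
allocator_init(struct inode_allocator* allocator)
{
	allocator->next_artificial = SKETCH_ARTIFICIAL_BASE;
}

/* Prefer the location-based number; fall back to the next sequential
 * artificial number when the location-based one is already taken. */
static uint64_t
assign_inode(struct inode_allocator* allocator, uint64_t location_based,
	bool location_taken)
{
	/* Location-based numbers must stay below the artificial range. */
	assert(location_based < SKETCH_ARTIFICIAL_BASE);

	if (!location_taken)
		return location_based;
	return allocator->next_artificial++;
}
```

Keeping the artificial range strictly above every possible location-based number means the two kinds of number cannot collide, so a single check against the location-based number is enough to decide which kind to assign.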

The vcache maps inode to location and vice versa.  Unlike the original Haiku driver, all nodes are listed in the vcache, not just those with artificial numbers.  This provides a useful way to check which nodes are currently constructed.

Locking
=======

vnode::v_vnlock is read- or write-locked when vnode or denode member variables are read/written.  In addition, when entries are being added to or removed from a directory, that directory's v_vnlock is write-locked.  Its v_vnlock is also write-locked when msdosfs_lookup_ino is called to find an entry within that directory, because the directory's denode::de_fndoffset and de_fndcnt will be set as part of the output of that function.  If a direntry is being modified in place, the v_vnlock of the entry's node, not the parent's, is locked.

msdosfsmount::pm_fatlock is write-locked during changes to the FAT itself and when data clusters are being allocated.

mount::mnt_mtx is locked in functions that operate at the volume level and in some functions that operate at the node level, but in which locking a single node might not be sufficient.

Caches
======

The file cache is used for regular files.  The block cache is used for directory files, the FAT, and the FAT32 fsinfo sector.

The driver's present use of the block cache to work with directory files is inefficient.  The ported BSD code is designed to read and write directory files in cluster-size blocks.  Because FAT data clusters are offset from the start of the volume by an arbitrary number of sectors (occupied by the FAT etc.), data clusters are liable to be offset from the cluster-size blocks that Haiku's block cache can provide.  When a BSD function needs a cluster-size block, the driver gets multiple 512-byte cached blocks and copies them into another buffer to create a contiguous cluster, and vice versa when writing.
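
The copying described above can be illustrated with a small, self-contained sketch.  It does not use the real block cache API:  struct sketch_volume models the volume as a flat in-memory array of 512-byte sectors, and sketch_get_block, cluster_is_block_aligned, read_cluster and write_cluster are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SECTOR_SIZE 512

/* Hypothetical stand-in for a volume backed by a cache of 512-byte blocks. */
struct sketch_volume {
	uint8_t* sectors;		/* volume contents, one block per sector */
	uint32_t first_data_sector;	/* sector at which cluster 2 begins */
	uint32_t sectors_per_cluster;
};

/* Stand-in for fetching one cached 512-byte block. */
static uint8_t*
sketch_get_block(const struct sketch_volume* vol, uint32_t sector)
{
	return vol->sectors + (size_t)sector * SKETCH_SECTOR_SIZE;
}

/* First sector of a data cluster (data clusters are numbered from 2). */
static uint32_t
cluster_to_sector(const struct sketch_volume* vol, uint32_t cluster)
{
	assert(cluster >= 2);
	return vol->first_data_sector + (cluster - 2) * vol->sectors_per_cluster;
}

/* True if the cluster starts on a cluster-size boundary of the volume. */
static bool
cluster_is_block_aligned(const struct sketch_volume* vol, uint32_t cluster)
{
	return cluster_to_sector(vol, cluster) % vol->sectors_per_cluster == 0;
}

/* Gather the cluster's sectors into one contiguous buffer. */
static void
read_cluster(const struct sketch_volume* vol, uint32_t cluster, uint8_t* out)
{
	uint32_t first = cluster_to_sector(vol, cluster);
	for (uint32_t i = 0; i < vol->sectors_per_cluster; i++) {
		memcpy(out + (size_t)i * SKETCH_SECTOR_SIZE,
			sketch_get_block(vol, first + i), SKETCH_SECTOR_SIZE);
	}
}

/* Scatter a contiguous cluster buffer back into its sectors. */
static void
write_cluster(struct sketch_volume* vol, uint32_t cluster, const uint8_t* in)
{
	uint32_t first = cluster_to_sector(vol, cluster);
	for (uint32_t i = 0; i < vol->sectors_per_cluster; i++) {
		memcpy(sketch_get_block(vol, first + i),
			in + (size_t)i * SKETCH_SECTOR_SIZE, SKETCH_SECTOR_SIZE);
	}
}
```

With 4 sectors per cluster and data starting at sector 3, for instance, cluster 2 spans sectors 3 to 6 and so straddles two cluster-size blocks of the volume, which is the misalignment that forces the extra copy.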

Limitations
===========

In FreeBSD, the FAT driver relies on libiconv for character conversion, and has only limited internal support for non-ASCII characters in the short filename stored in a direntry.  In the port, libiconv is not available (except in the userlandfs module) and the driver can have trouble reading filenames containing characters that are not in OEM code page 850.  This can result in dosfs_walk failing to find an entry with the name reported by dosfs_readdir; in ls this would generate a "No such file or directory" error, while in Tracker the file would simply not appear.  It could also prevent a user from copying a file from another filesystem.

The initialize hook only supports media with a 512-byte sector size.

The driver will refuse to mount a volume larger than 32 gigabytes or a volume with a sector size other than 512 bytes because it hasn't been tested under those conditions.

Tracker's restore command, for items in the trash, does not work on FAT files because it relies on attributes. The user must manually move the file to the desired directory instead.

The volume name is normally stored in the boot sector and in a false directory entry in the root directory.  If the false directory entry was not created when a volume was initialized, the driver will not add one later.  If given a new label to write, the driver will update the boot sector label only.

If a file is truncated while asynchronous IO is in progress, so that the read/write goes beyond the EOF, an error message "PageWriteWrapper: Failed to write page" may be printed to the syslog by the virtual memory system.  The failure occurs in the area that has already been deleted from the file, so in effect there is no data loss.  This error message could probably be avoided if file_cache_set_size were only used when the node is locked, but it seems necessary to unlock the node first in order to eliminate a possible deadlock (producible in the fsx test) in which file_cache_set_size is waiting for VM page events, while dosfs_io is waiting for the node lock.

The fsx test sometimes complains of non-zero data past EOF when it does a mapped read or write.  This appears to happen when fsx has just changed the size of the file, and then checks for non-zero data before the file cache has had time to zero its new last page beyond EOF.

FAT Reference Material
======================

FAT32 File System Specification (December 6, 2000)
	https://download.microsoft.com/download/1/6/1/161ba512-40e2-4cc9-843a-923143f3456c/fatgen103.doc
FAT File System (September 11, 2008)
	https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-2000-server/cc938438(v=technet.10)
How FAT Works (October 8, 2009)
	https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2003/cc776720(v=ws.10)
FAT Type and Cluster Size Depends on Logical Drive Size (November 16, 2006)
	http://web.archive.org/web/20130315020207/http://support.microsoft.com/kb/67321/en-us
