.. _XFS Page:

The XFS File System
===================

This document describes how to test the XFS file system, the XFS file system API for Haiku,
and its current status on Haiku.


Testing XFS File System
-----------------------

There are three ways to test XFS:

-  Using xfs_shell.
-  Using userlandfs.
-  Building a version of Haiku with XFS support and then mounting a file system.

Before doing any of that, we need to create XFS images to test against.

Creating File System Images
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Currently only Linux has full XFS support, so we will use Linux to generate file system images.

First, create an empty sparse image with the command::

   $ dd if=/dev/zero of=fs.img count=0 bs=1 seek=5G

The output will be::

   0+0 records in
   0+0 records out
   0 bytes (0 B) copied, 0.000133533 s, 0.0 kB/s

Note that the image can have any name and size. The command above creates fs.img with a size
of 5 GB; changing the option to ``seek=10G`` creates a 10 GB image instead.

XFS on Linux supports two versions of the on-disk format, V4 and V5.

To create an XFS V5 file system on the sparse image, run::

   $ /sbin/mkfs.xfs fs.img

The output will look similar to::

   meta-data   =fs.img                 isize=512    agcount=4, agsize=65536 blks
               =                       sectsz=512   attr=2, projid32bit=1
               =                       crc=1        finobt=1, sparse=1, rmapbt=0
               =                       reflink=1
   data        =                       bsize=4096   blocks=262144, imaxpct=25
               =                       sunit=0      swidth=0 blks
   naming      =version 2              bsize=4096   ascii-ci=0, ftype=1
   log         =internal log           bsize=4096   blocks=2560, version=2
               =                       sectsz=512   sunit=0 blks, lazy-count=1
   realtime    =none                   extsz=4096   blocks=0, rtextents=0

To create an XFS V4 file system on the sparse image, run::

   $ /sbin/mkfs.xfs -m crc=0 fs.img

The output will look similar to::

    meta-data=fs.img                 isize=256    agcount=4, agsize=327680 blks
             =                       sectsz=512   attr=2, projid32bit=0
    data     =                       bsize=4096   blocks=1310720, imaxpct=25
             =                       sunit=0      swidth=0 blks
    naming   =version 2              bsize=4096   ascii-ci=0
    log      =internal log           bsize=4096   blocks=2560, version=2
             =                       sectsz=512   sunit=0 blks, lazy-count=1
    realtime =none                   extsz=4096   blocks=0, rtextents=0

**The Linux kernel will support older XFS V4 file systems by default until 2025, and
support for the V4 format will be removed entirely in September 2030.**

Now we can mount the file system image on Linux and create entries for testing the Haiku XFS driver.

Test using xfs_shell
^^^^^^^^^^^^^^^^^^^^

The idea of fs_shell is to run the file system code outside of Haiku. It runs as an ordinary
application and provides a simple command line interface to perform various
operations on the file system (list directories, read and display files, etc.).

First we have to compile it::

   jam "<build>xfs_shell"

Then run it::

   jam run ":<build>xfs_shell" fs.img

Here fs.img is the file system image we created on Linux.

Test directly inside Haiku
^^^^^^^^^^^^^^^^^^^^^^^^^^

First build a version of Haiku with XFS support. To do this, add "xfs" to the `image
definition <https://git.haiku-os.org/haiku/tree/build/jam/images/definitions/minimum#n239>`__.

Then compile Haiku as usual and run the resulting system in a virtual machine or on real hardware.

We can then mount an XFS file system from within Haiku using the command::

   mount -t xfs <path to image> <path to mount folder>

For example::

   mount -t xfs /boot/home/Desktop/fs.img /boot/home/Desktop/Testing

Here fs.img is the file system image and Testing is the mount point.

Test using userlandfs
^^^^^^^^^^^^^^^^^^^^^

To be updated


Haiku XFS API
-------------

All the necessary file system hooks, such as xfs_mount(), open_dir() and read_dir(), are
implemented in the **kernel_interface.cpp** file. It acts as an interface between the Haiku kernel
and the XFS file system. Documentation for all necessary file system hooks can be found
`in the API reference <https://www.haiku-os.org/docs/api/fs_modules.html>`_.

When a file system runs under fs_shell it cannot use the system headers; fs_shell compatible
headers exist and must be used whenever XFS is mounted through xfs_shell.
To handle this, the **system_dependencies.h** header file takes care of using the
correct headers whether XFS is mounted through xfs_shell or directly inside Haiku.

XFS stores data on disk in big endian byte order. To convert data into host order,
all classes and data headers have a **SwapEndian()** function. It is better to keep all data
conversions in one place to avoid future problems related to byte order.

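To illustrate the idea, the following is a minimal standalone sketch of such a conversion.
``ExampleHeader``, ``BigEndianToHost()`` and ``ReadExampleHeader()`` are hypothetical names
used only for this illustration; the real headers and the conversion helpers used by the driver
are different.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Convert a raw big endian 32-bit value (as read from disk) to host order.
// This looks at the individual bytes, so it works on any host byte order.
inline uint32_t BigEndianToHost(uint32_t raw)
{
	uint8_t bytes[4];
	std::memcpy(bytes, &raw, sizeof(bytes));
	return (uint32_t(bytes[0]) << 24) | (uint32_t(bytes[1]) << 16)
		| (uint32_t(bytes[2]) << 8) | uint32_t(bytes[3]);
}

// Same conversion for 16-bit fields.
inline uint16_t BigEndianToHost(uint16_t raw)
{
	uint8_t bytes[2];
	std::memcpy(bytes, &raw, sizeof(bytes));
	return uint16_t((uint16_t(bytes[0]) << 8) | bytes[1]);
}

// Hypothetical header, not a real XFS structure.
struct ExampleHeader {
	uint32_t magic;
	uint16_t count;
	uint16_t stale;

	// All conversions for this header live in one place.
	void SwapEndian()
	{
		magic = BigEndianToHost(magic);
		count = BigEndianToHost(count);
		stale = BigEndianToHost(stale);
	}
};

// Copy a header out of a raw disk buffer and convert it to host order.
inline ExampleHeader ReadExampleHeader(const uint8_t* buffer)
{
	assert(buffer != nullptr);
	ExampleHeader header;
	std::memcpy(&header, buffer, sizeof(header));
	header.SwapEndian();
	return header;
}
```
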
XFS SuperBlock starts at on-disk offset 0; the definition of SuperBlock is in the **xfs.h** file.

A Volume is an instance of a file system and is defined in the **Volume.h** file.
The XFS Volume contains the SuperBlock, the file system device and essential functions
like Identify(), Mount() etc.

*  *Identify()* reads the SuperBlock from disk and verifies it.
*  *Mount()* mounts the file system device and publishes the root inode of the file system
   (typically the root inode number for XFS is 128).

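As a rough sketch of what identifying a superblock involves, the code below checks the magic
number at offset 0 and the format version. The function and constant names are hypothetical, and
the field offset and version mask are assumptions taken from the XFS on-disk format
documentation, not from the Haiku sources; the real *Identify()* performs more checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// "XFSB" in ASCII, stored big endian in the first four bytes.
constexpr uint32_t kSuperBlockMagic = 0x58465342;
// Assumed byte offset of the 16-bit version field inside the superblock.
constexpr size_t kVersionOffset = 100;

inline uint32_t ReadBigEndian32(const uint8_t* p)
{
	return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16)
		| (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

inline uint16_t ReadBigEndian16(const uint8_t* p)
{
	return uint16_t((uint16_t(p[0]) << 8) | p[1]);
}

// Returns 4 or 5 if the block looks like an XFS superblock of that
// version, and 0 otherwise.
inline int IdentifySuperBlockSketch(const uint8_t* block, size_t length)
{
	assert(block != nullptr);
	if (length < kVersionOffset + 2)
		return 0;
	if (ReadBigEndian32(block) != kSuperBlockMagic)
		return 0;

	// The low four bits of the version field hold the format version.
	int version = ReadBigEndian16(block + kVersionOffset) & 0x000f;
	return (version == 4 || version == 5) ? version : 0;
}
```
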
XFS uses the TRACE macro to debug the file system. The TRACE, ERROR and ASSERT macros
are defined in **Debug.h**.

To enable TRACE calls, add ``#define TRACE_XFS`` to the Debug.h file; remove it again
to disable them.

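The usual shape of such a switchable trace macro is sketched below. This is only an
illustration of the pattern and is not copied from Debug.h: the output functions and the
``"xfs: "`` prefix are assumptions, and kernel code would not print through ``printf``.

```cpp
#include <cassert>
#include <cstdio>

// Define TRACE_XFS before this point (or pass -DTRACE_XFS) to enable tracing.
#ifdef TRACE_XFS
#	define TRACE(...) std::printf("xfs: " __VA_ARGS__)
#else
	// Expands to nothing, so the arguments are not even evaluated.
#	define TRACE(...) do {} while (0)
#endif

// Errors are always reported, independent of TRACE_XFS.
#define ERROR(...) std::fprintf(stderr, "xfs: " __VA_ARGS__)
#define ASSERT(condition) assert(condition)
```

Because a disabled ``TRACE`` discards its arguments, trace calls must not contain
side effects the code depends on.
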
XFS V5 introduced metadata checksums to ensure the integrity of metadata in the file system.
It uses the CRC32C checksum algorithm. All checksum related functions for XFS are defined in the
**Checksum.h** header file.
It contains the following functions:

*  *xfs_verify_cksum()* to verify the checksum of a buffer.
*  *xfs_update_cksum()* to update the checksum of a buffer.

**XFS stores checksums in little endian byte order, unlike other on-disk data, which is stored
in big endian byte order.**

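The following is a minimal, unoptimised sketch of how such a checksum can be computed, verified
and stored. It assumes the checksum covers the whole buffer with the 4-byte checksum field
treated as zero. The function names and signatures are hypothetical and do not mirror the ones
in Checksum.h, and a real implementation would normally use a table driven or hardware
accelerated CRC32C.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
inline uint32_t Crc32cUpdate(uint32_t crc, const uint8_t* data, size_t length)
{
	for (size_t i = 0; i < length; i++) {
		crc ^= data[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

// Checksum of the buffer with the checksum field itself counted as zero.
inline uint32_t ComputeChecksum(const uint8_t* buffer, size_t length,
	size_t crcOffset)
{
	assert(buffer != nullptr && crcOffset + 4 <= length);
	static const uint8_t zero[4] = {};
	uint32_t crc = 0xFFFFFFFFu;
	crc = Crc32cUpdate(crc, buffer, crcOffset);
	crc = Crc32cUpdate(crc, zero, sizeof(zero));
	crc = Crc32cUpdate(crc, buffer + crcOffset + 4, length - crcOffset - 4);
	return ~crc;
}

// Store the checksum in little endian byte order.
inline void UpdateChecksum(uint8_t* buffer, size_t length, size_t crcOffset)
{
	uint32_t crc = ComputeChecksum(buffer, length, crcOffset);
	for (int i = 0; i < 4; i++)
		buffer[crcOffset + i] = uint8_t(crc >> (8 * i));
}

// Read the stored little endian checksum and compare it.
inline bool VerifyChecksum(const uint8_t* buffer, size_t length,
	size_t crcOffset)
{
	uint32_t stored = 0;
	for (int i = 0; i < 4; i++)
		stored |= uint32_t(buffer[crcOffset + i]) << (8 * i);
	return stored == ComputeChecksum(buffer, length, crcOffset);
}
```
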
XFS V5 introduced many other fields for metadata verification, such as *BlockNo*, *UUID* and *Owner*.
These fields are common to every data header, and so are their checks. To avoid
repeating the same checks for every header, we created a *VerifyHeader* template
function, which is defined in the **VerifyHeader.h** file. This function is used for
all forms of headers for verification purposes.

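The benefit of a template here is that one function can check any header type that exposes the
common fields. The sketch below shows the pattern only; the accessor names, the
``ExpectedHeaderValues`` struct and the example header are hypothetical, and the real
*VerifyHeader* has a different signature and also covers checks not shown here.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Uuid = std::array<uint8_t, 16>;

// Values a header is expected to carry, derived from the mounted volume
// and from the location the block was read from.
struct ExpectedHeaderValues {
	uint32_t magic;
	uint64_t blockNumber;
	uint64_t owner;
	Uuid uuid;
};

// Works for every header type that offers these four accessors.
template<typename Header>
bool VerifyHeaderSketch(const Header& header,
	const ExpectedHeaderValues& expected)
{
	if (header.Magic() != expected.magic)
		return false;
	if (header.BlockNumber() != expected.blockNumber)
		return false;
	if (header.FileSystemUuid() != expected.uuid)
		return false;
	return header.Owner() == expected.owner;
}

// Hypothetical header type, used only to exercise the template.
struct ExampleV5Header {
	uint32_t magic;
	uint64_t blockNumber;
	uint64_t owner;
	Uuid uuid;

	uint32_t Magic() const { return magic; }
	uint64_t BlockNumber() const { return blockNumber; }
	uint64_t Owner() const { return owner; }
	const Uuid& FileSystemUuid() const { return uuid; }
};

// Usage: the same call works unchanged for any other conforming header.
inline void VerifyHeaderExample()
{
	ExpectedHeaderValues expected = {0x12345678, 64, 128, Uuid{}};
	ExampleV5Header header = {0x12345678, 64, 128, Uuid{}};
	assert(VerifyHeaderSketch(header, expected));
}
```
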
Inodes
^^^^^^

XFS inodes come in three versions:

*  Inode V1 & V2. (Version 4 XFS)
*  Inode V3. (Version 5 XFS)

Version 1 inode support is already deprecated in the Linux kernel; Haiku XFS supports it
for reading only. Once we have write support for XFS, we will only support V2 and V3 inodes.

V1 & V2 inodes are 256 bytes while V3 inodes are 512 bytes in size, allowing more data to be
stored directly inside the inode.

**CoreInodeSize()** is a helper function which returns the size of the inode based on the version
of XFS and is used throughout our XFS code.

**DIR_DFORK_PTR** is a macro which expands to a void pointer to the data offset in the inode, which
can be either shortform entries, extents or a B+Tree root node depending on the data format
of the inode (di_format).

Similarly, the **DIR_AFORK_PTR** macro expands to a void pointer to the attribute offset in the inode,
which can be either shortform attributes, attribute extents or a B+Tree node depending on
the attribute format of the inode (di_aformat).

Since the size of inodes differs between versions of XFS, we pass the result of CoreInodeSize()
as a parameter to the DIR_DFORK_PTR and DIR_AFORK_PTR macros so that they return the correct pointer offset.

**di_forkoff** specifies the offset into the inode's literal area where the extended attribute
fork starts. This value is zero until an extended attribute is created.
It is fixed for V1 & V2 inodes, while for V3 inodes it is dynamic,
allowing complete use of the inode's literal area.

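The pointer arithmetic behind the two fork macros can be sketched as follows. The function names
are hypothetical stand-ins for the macros. The core sizes (100 bytes for V1 & V2, 176 bytes for
V3) and the rule that di_forkoff counts units of 8 bytes are assumptions taken from the XFS
on-disk format documentation, not from the Haiku sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed size of the fixed inode core that precedes the literal area.
inline size_t CoreInodeSizeSketch(int inodeVersion)
{
	return inodeVersion == 3 ? 176 : 100;
}

// Bytes left in the inode for the data and attribute forks.
inline size_t LiteralAreaSize(size_t inodeSize, int inodeVersion)
{
	assert(inodeSize > CoreInodeSizeSketch(inodeVersion));
	return inodeSize - CoreInodeSizeSketch(inodeVersion);
}

// The data fork starts right after the inode core.
inline const uint8_t* DataForkPointer(const uint8_t* inode, int inodeVersion)
{
	assert(inode != nullptr);
	return inode + CoreInodeSizeSketch(inodeVersion);
}

// The attribute fork starts forkOffset * 8 bytes into the literal area.
inline const uint8_t* AttributeForkPointer(const uint8_t* inode,
	int inodeVersion, uint8_t forkOffset)
{
	return DataForkPointer(inode, inodeVersion) + (size_t(forkOffset) << 3);
}
```
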
Directories
^^^^^^^^^^^

Depending on the number of entries inside a directory, XFS uses one of five directory formats:

*  Shortform directory.
*  Block directory.
*  Leaf directory.
*  Node directory.
*  B+Tree directory.

The DirectoryIterator class in the **Directory.h** file provides an interface between kernel requests
to open or read a directory and all forms of directories. It first identifies the format of
the entries inside the inode and then serves the request according to the format found.

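The decision made by such an iterator can be sketched as a small classification function. All
names below are hypothetical; the numeric di_format values are assumptions taken from the XFS
on-disk format, and the two extent related inputs stand in for information the real code
derives from the inode's extent maps.

```cpp
#include <cassert>
#include <cstdint>

// Assumed di_format values for the formats a directory can use.
enum InodeFormatSketch : uint8_t {
	kFormatLocal = 1,
	kFormatExtents = 2,
	kFormatBTree = 3
};

enum class DirectoryKind {
	ShortForm, Block, Leaf, Node, BPlusTree, Unsupported
};

// Hypothetical summary of what has been read from the inode so far.
struct DirectoryInfoSketch {
	uint8_t format;				// di_format of the inode
	bool singleDirectoryBlock;	// directory fits into one directory block
	uint64_t lastExtentOffset;	// file offset of the last extent map
	uint64_t leafStartOffset;	// offset at which leaf blocks start
};

inline DirectoryKind ClassifyDirectory(const DirectoryInfoSketch& info)
{
	switch (info.format) {
		case kFormatLocal:
			return DirectoryKind::ShortForm;
		case kFormatExtents:
			if (info.singleDirectoryBlock)
				return DirectoryKind::Block;
			// Leaf directories have their last extent at the leaf offset.
			return info.lastExtentOffset == info.leafStartOffset
				? DirectoryKind::Leaf : DirectoryKind::Node;
		case kFormatBTree:
			return DirectoryKind::BPlusTree;
		default:
			return DirectoryKind::Unsupported;
	}
}

// Usage: an extent based directory whose last extent is the leaf block.
inline void ClassifyDirectoryExample()
{
	DirectoryInfoSketch info = {kFormatExtents, false, 4096, 4096};
	assert(ClassifyDirectory(info) == DirectoryKind::Leaf);
}
```
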
**Shortform directory**

*  When the number of entries inside a directory is small enough that all
   metadata fits inside the inode itself, the directory is known as a shortform directory.
*  A directory is shortform if the format of the inode is *XFS_DINODE_FMT_LOCAL*.
*  The header for shortform entries is located at the data fork pointer inside the inode, which we cast
   directly to *ShortFormHeader*.
*  Since the number of entries is small, we can simply iterate over all entries in the *Lookup()* and
   *GetNext()* functions.

**Block directory**

*  When the number of entries grows so that we can no longer store all directory metadata
   inside the inode, we use extents.
*  A directory is extent based if the format of the inode is *XFS_DINODE_FMT_EXTENTS*.
*  In a block directory, a single directory block holds the data header, leaf header
   and free data header. This simple fact helps us determine whether a given extent format
   inode is a block directory.
*  Since the XFS V4 & V5 data headers differ, we use a virtual class *ExtentDataHeader* which
   acts as an interface between the V4 & V5 data headers. This class only declares pure virtual
   functions and stores no data.
*  *CreateDataHeader* returns a class instance based on the version of XFS mounted.
*  Since we now have a virtual class with a vtable pointer, we need to be very careful with the
   difference between data stored on disk and data inside the class. For example, we can no longer
   use the sizeof() operator on the class to get a size that matches its on-disk size. To handle this,
   helper functions like *SizeOfDataHeader* were created and must be used instead of the sizeof() operator.
*  In the *GetNext()* function we simply iterate over all entries inside the buffer. A found
   entry could be unused, so we need to check that each entry found is a proper entry.
*  In the *Lookup()* function we first generate a hash value of the entry name, then find the
   lower bound of this hash value among the leaf entries to get the address of the entry inside the data.
   Finally, if the entry matches we return B_OK, otherwise we return B_ENTRY_NOT_FOUND.

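The lower bound step of *Lookup()* can be sketched as a binary search over leaf entries sorted by
hash value. The struct and function names are hypothetical, the hash function itself is left out,
and a standard container is used for brevity where the driver works on raw block buffers.
Several names can share one hash value, so every candidate address still needs a name comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical leaf entry: a name hash and the address of the data entry.
struct LeafEntrySketch {
	uint32_t hashval;
	uint32_t address;
};

// Index of the first entry whose hash is >= hash, or entries.size().
inline size_t LowerBound(const std::vector<LeafEntrySketch>& entries,
	uint32_t hash)
{
	// Binary search only works because leaf entries are sorted by hash.
	assert(std::is_sorted(entries.begin(), entries.end(),
		[](const LeafEntrySketch& a, const LeafEntrySketch& b) {
			return a.hashval < b.hashval;
		}));

	size_t left = 0;
	size_t right = entries.size();
	while (left < right) {
		size_t middle = left + (right - left) / 2;
		if (entries[middle].hashval < hash)
			left = middle + 1;
		else
			right = middle;
	}
	return left;
}

// Addresses of all entries with a matching hash. The caller compares the
// name stored at each address and reports "not found" if none matches.
inline std::vector<uint32_t> CandidateAddresses(
	const std::vector<LeafEntrySketch>& entries, uint32_t hash)
{
	std::vector<uint32_t> addresses;
	for (size_t i = LowerBound(entries, hash);
			i < entries.size() && entries[i].hashval == hash; i++) {
		addresses.push_back(entries[i].address);
	}
	return addresses;
}
```
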
**Leaf directory**

*  When the number of entries grows so that we can no longer store all directory metadata inside
   a single directory block, we use the leaf format.
*  In a leaf directory we have multiple directory blocks for data headers and free data headers,
   and a single directory block for the leaf header.
*  To check whether a given extent based inode is of leaf type, we check the offset inside the last
   extent map: if it is equal to *LEAF_STARTOFFSET* the inode is of leaf type, otherwise it is of
   node type.
*  Since the XFS V4 & V5 leaf headers differ, we use a virtual class *ExtentLeafHeader* which acts
   as an interface between the V4 & V5 leaf headers. This class only declares pure virtual functions
   and stores no data.
*  *CreateLeafHeader* returns a class instance based on the version of XFS mounted.
*  Instead of the sizeof() operator on ExtentLeafHeader, we should always use the *SizeOfLeafHeader()* function
   to get the correct on-disk size of the class.
*  The *Lookup()* and *GetNext()* functions are similar to those for block directories, except that we no
   longer work on a single directory block buffer.

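Why sizeof() cannot be trusted on these header classes is shown in the sketch below: a class with
virtual functions carries a hidden vtable pointer, so the object is larger than the data it
mirrors on disk. The class names and the 16-byte layout are hypothetical and only illustrate the
pattern; they are not the driver's actual *ExtentLeafHeader* classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical on-disk layout: plain data, 16 bytes, no virtual functions.
struct OnDiskLeafHeaderSketch {
	uint32_t forward;
	uint32_t backward;
	uint16_t magic;
	uint16_t pad;
	uint16_t count;
	uint16_t stale;
};
static_assert(sizeof(OnDiskLeafHeaderSketch) == 16, "unexpected padding");

// Version independent interface: pure virtual functions only, no data.
class LeafHeaderSketch {
public:
	virtual ~LeafHeaderSketch() {}
	virtual uint16_t Count() const = 0;
	virtual size_t OnDiskSize() const = 0;
};

// One concrete version. sizeof(LeafHeaderV4Sketch) includes the vtable
// pointer and is therefore larger than the on-disk header.
class LeafHeaderV4Sketch : public LeafHeaderSketch {
public:
	uint16_t Count() const override { return fData.count; }
	size_t OnDiskSize() const override
	{
		return sizeof(OnDiskLeafHeaderSketch);
	}

	OnDiskLeafHeaderSketch fData = {};
};

// Helper to use instead of sizeof() when stepping through a disk buffer.
inline size_t SizeOfLeafHeaderSketch(const LeafHeaderSketch* header)
{
	assert(header != nullptr);
	return header->OnDiskSize();
}
```
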
TODO : Document Node and B+Tree based directories.

Files
^^^^^

XFS stores files in two formats:

*  Extent based file.
*  B+Tree based file.

All read support for files is implemented in the *Inode* class in the **Inode.h** file.

When the format inside the inode of a file is *XFS_DINODE_FMT_EXTENTS*, it is an extent based file.
To read all data of the file we simply iterate over all extents, which is very similar to how we
do it for extent based directories.

When the file becomes so large that no more extent maps fit inside the inode, the
format of the file is changed to B+Tree. When the format inside the inode of a file is
*XFS_DINODE_FMT_BTREE*, it is a B+Tree based file. To read all data of the file,
we first read the blocks of the B+Tree to extract the extent maps and then read the extents
to get the file's data.

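Both file formats end up with a list of extent maps, so reading comes down to unpacking those
maps and finding the one that covers a given file block. The sketch below assumes the packed
128-bit extent record described in the XFS on-disk format documentation (1 flag bit, 54 bits of
file offset, 52 bits of start block, 21 bits of block count, stored big endian). The names are
hypothetical and this is not the code from Inode.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ExtentSketch {
	bool unwritten;			// extent is preallocated but not yet written
	uint64_t startOffset;	// first file block covered by the extent
	uint64_t startBlock;	// first file system block of the extent
	uint32_t blockCount;	// number of blocks in the extent
};

inline uint64_t ReadBigEndian64(const uint8_t* bytes)
{
	uint64_t value = 0;
	for (int i = 0; i < 8; i++)
		value = (value << 8) | bytes[i];
	return value;
}

// Unpack one 16-byte on-disk extent record.
inline ExtentSketch UnpackExtent(const uint8_t* record)
{
	assert(record != nullptr);
	const uint64_t high = ReadBigEndian64(record);
	const uint64_t low = ReadBigEndian64(record + 8);

	ExtentSketch extent;
	extent.unwritten = (high >> 63) != 0;
	extent.startOffset = (high >> 9) & ((uint64_t(1) << 54) - 1);
	extent.startBlock = ((high & 0x1ff) << 43) | (low >> 21);
	extent.blockCount = uint32_t(low & ((uint64_t(1) << 21) - 1));
	return extent;
}

// Find the file system block that backs fileBlock. Returns false if no
// extent covers it, which means the block is part of a hole.
inline bool MapFileBlock(const std::vector<ExtentSketch>& extents,
	uint64_t fileBlock, uint64_t& diskBlock)
{
	for (const ExtentSketch& extent : extents) {
		if (fileBlock >= extent.startOffset
			&& fileBlock - extent.startOffset < extent.blockCount) {
			diskBlock = extent.startBlock + (fileBlock - extent.startOffset);
			return true;
		}
	}
	return false;
}
```
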

Current Status of XFS
---------------------

Currently we only have read support for XFS. The sections below briefly summarise read support for all formats.


Directories
^^^^^^^^^^^

**Short-Directory**
   Stable read support for both V4 and V5 inside Haiku.

**Block-Directory**
   Stable read support for both V4 and V5 inside Haiku.

**Leaf-Directory**
   Stable read support for both V4 and V5 inside Haiku.

**Node-Directory**
   Stable read support for both V4 and V5 inside Haiku.

**B+Tree-Directory**
   Unstable read support for both V4 and V5. Because of the large number of reads from disk, the
   entire process is too slow inside Haiku.

Files
^^^^^

**Extent based Files**
   |  *xfs_shell* - stable read support for both V4 and V5.
   |  *Haiku* - unstable, the ``cat`` command doesn't print the entire file and never terminates.

**B+Tree based Files**
   |  *xfs_shell* - stable read support for both V4 and V5.
   |  *Haiku* - unstable, the ``cat`` command doesn't print the entire file and never terminates.

Attributes
^^^^^^^^^^

Currently we have no extended attribute support for XFS.

Symlinks
^^^^^^^^

Currently we have no symlink support for XFS.

XFS V5 exclusive features
^^^^^^^^^^^^^^^^^^^^^^^^^

**MetaData Checksumming**
   Metadata checksums for the superblock, inodes, and data headers are implemented.

**Big Timestamps**
   Currently we have no support.

**Reverse mapping btree**
   Currently we have no support. This data structure is still under construction
   and testing inside the Linux kernel.

**Reference count btree**
   Currently we have no support. This data structure is still under construction
   and testing inside the Linux kernel.

Write Support
^^^^^^^^^^^^^

Currently we have no write support for XFS.


References
----------

The best and only reference for XFS is the latest version of "xfs_filesystem_structure",
written by the Linux XFS developers.

The PDF version of that document can be found
`here <http://ftp.ntu.edu.tw/linux/utils/fs/xfs/docs/xfs_filesystem_structure.pdf>`_.