\documentclass[12pt, a4paper]{article}

\usepackage{amssymb,amsmath,latexsym}
\usepackage[english]{babel}
\usepackage{hhline}

\newcommand{\code}[1]{{\tt #1}}

\newenvironment{nitemize}{
  \newdimen\oldparindent
  \oldparindent=\parindent
  \begin{itemize}
  \itemindent=-\oldparindent
}{
  \end{itemize}
}

%\newcommand{\codeblockspace}{\vspace{12pt}}

% begin/end a code block
\newcommand{\codeblockbegin}{\begin{flushleft}\begin{minipage}{\textwidth}}
\newcommand{\codeblockend}{\end{minipage}\end{flushleft}}


\begin{document}

\sloppy

\title{The Resources Format}
%\date{}
\author{Ingo Weinhold (bonefish@users.sf.net)}

\maketitle
\tableofcontents


\section{Introduction}
\label{introduction}

Resources provide a means to store structured but flat data in files. Unlike
attributes, resources are part of the file contents and thus do not require
special file system handling, but rather a special file format.
On the one hand there are file formats that exclusively contain resources
(resource files); on the other hand there are file formats extended to
additionally contain resources -- namely the ELF and PEF object formats.
In either case the format of the chunk of data that frames the resources
themselves is the same. We call it the resources format.

Section \ref{file-formats} explains how the resources format is embedded in
different file formats. Section \ref{resources-format} discusses the resources
format itself. In section \ref{implementations} we focus on the robustness of
implementations that read and write resources.
The final section comments on the reliability of the information provided
by this document.



\section{File Formats}
\label{file-formats}

In all file formats described in this section the resources are located
at the end of the file. The resources themselves are completely independent
of their location.


\subsection{x86 Resource Files}

x86 resource files introduce the least overhead. The resources start directly
after the magic number identifying the file format:
%
\codeblockbegin
\begin{verbatim}
  const char kX86ResourceFileMagic[4] = { 'R', 'S', 0, 0 };
  const uint32 kX86ResourcesOffset    = 0x00000004;
\end{verbatim}
\codeblockend
%
The resources start at \code{kX86ResourcesOffset}.
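
As an illustration, the check for an x86 resource file can be sketched as
follows. The helper \code{FindX86ResourcesOffset} is hypothetical and not part
of any existing API; it operates on a buffer holding the beginning of the file
and compares all four magic bytes strictly.

```cpp
#include <cstddef>
#include <cstdint>

// Values as given in the text above.
constexpr unsigned char kX86ResourceFileMagic[4] = { 'R', 'S', 0, 0 };
constexpr std::uint32_t kX86ResourcesOffset = 0x00000004;

// Hypothetical helper: returns the offset at which the resources start, or
// -1 if the buffer does not begin with the x86 resource file magic.
constexpr long FindX86ResourcesOffset(const unsigned char* file,
    std::size_t size)
{
    if (size < sizeof(kX86ResourceFileMagic))
        return -1;
    for (std::size_t i = 0; i < sizeof(kX86ResourceFileMagic); i++) {
        if (file[i] != kX86ResourceFileMagic[i])
            return -1;
    }
    return static_cast<long>(kX86ResourcesOffset);
}
```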


\subsection{PPC Resource Files}

PPC resource files begin with a PEF container header, after which the
resources start.
%
\codeblockbegin
\begin{verbatim}
typedef char PefOSType[4];

struct PEFContainerHeader {
    PefOSType tag1;
    PefOSType tag2;
    PefOSType architecture;
    uint32    formatVersion;
    uint32    dateTimeStamp;
    uint32    oldDefVersion;
    uint32    oldImpVersion;
    uint32    currentVersion;
    uint16    sectionCount;
    uint16    instSectionCount;
    uint32    reservedA;
};
\end{verbatim}
\codeblockend
%
\codeblockbegin
\begin{verbatim}
  const char kPEFFileMagic1[4]        = { 'J', 'o', 'y', '!' };
  const char kPPCResourceFileMagic[4] = { 'r', 'e', 's', 'f' };
  const uint32 kPPCResourcesOffset    = 0x00000028;
\end{verbatim}
\codeblockend

\begin{nitemize}
\item{\code{tag1}:
  Must be \code{kPEFFileMagic1}.}
\item{\code{tag2}:
  Must be \code{kPPCResourceFileMagic}.}
\item{All other fields must be set to 0.}
\end{nitemize}

\noindent
The resources start at \code{kPPCResourcesOffset}.
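
The value of \code{kPPCResourcesOffset} matches the size of the header
structure, which can be checked at compile time. The following sketch assumes
fixed-width standard types in place of \code{uint32} and \code{uint16}; the
helper \code{HasPPCResourceFileTags} is hypothetical and checks only the two
tags, not the requirement that all other fields are 0.

```cpp
#include <cstddef>
#include <cstdint>

typedef char PefOSType[4];

struct PEFContainerHeader {
    PefOSType     tag1;
    PefOSType     tag2;
    PefOSType     architecture;
    std::uint32_t formatVersion;
    std::uint32_t dateTimeStamp;
    std::uint32_t oldDefVersion;
    std::uint32_t oldImpVersion;
    std::uint32_t currentVersion;
    std::uint16_t sectionCount;
    std::uint16_t instSectionCount;
    std::uint32_t reservedA;
};

constexpr char kPEFFileMagic1[4]        = { 'J', 'o', 'y', '!' };
constexpr char kPPCResourceFileMagic[4] = { 'r', 'e', 's', 'f' };
constexpr std::uint32_t kPPCResourcesOffset = 0x00000028;

// With natural alignment the structure contains no implicit padding, so its
// size (40 bytes) equals the offset at which the resources start.
static_assert(sizeof(PEFContainerHeader) == kPPCResourcesOffset,
    "PEF container header is expected to be 40 bytes");

// Hypothetical helper: compares tag1 and tag2 with the required magic values.
constexpr bool HasPPCResourceFileTags(const PEFContainerHeader& header)
{
    for (std::size_t i = 0; i < 4; i++) {
        if (header.tag1[i] != kPEFFileMagic1[i]
            || header.tag2[i] != kPPCResourceFileMagic[i])
            return false;
    }
    return true;
}
```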


\subsection{ELF Object Files}

In an ELF file, resources are appended to rather than contained in the
regular data of the file. That is, adding resources to an existing ELF file
will not cause any modification to its data (i.e. ELF header, program header
table, section header table or sections), but will enlarge the file by some
alignment padding and, of course, the resources themselves.

Therefore two values have to be known: the size of the actual ELF file and the
block size to which the resources must be aligned. As ELF files do not contain
a size field, the end of the ELF data has to be deduced. This end offset is
taken to be the maximum of the end offsets of the ELF header, program header
table (if any), section header table (if any), sections and segments.

The block size to which the resources have to be aligned is the maximum of
\code{kELFMinResourceAlignment} and the alignments of the segments in the file.
%
\begin{verbatim}
  const uint32 kELFMinResourceAlignment = 32;
\end{verbatim}
%
The data used for the padding between the end of the actual ELF data and the
beginning of the resources may be arbitrary.
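
The computation can be sketched as follows. Both function names are
hypothetical, and the sketch assumes that the resources start at the first
multiple of the block size that is not less than the end offset of the ELF
data; the text above does not state how the rounding is done in detail.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t kELFMinResourceAlignment = 32;

// Block size: maximum of kELFMinResourceAlignment and all segment alignments.
constexpr std::uint64_t ResourcesAlignment(
    const std::uint64_t* segmentAlignments, std::size_t count)
{
    std::uint64_t alignment = kELFMinResourceAlignment;
    for (std::size_t i = 0; i < count; i++) {
        if (segmentAlignments[i] > alignment)
            alignment = segmentAlignments[i];
    }
    return alignment;
}

// Assumption: round the end of the ELF data up to the next multiple of the
// block size; an already aligned end offset is left unchanged.
constexpr std::uint64_t ResourcesOffset(std::uint64_t elfEnd,
    std::uint64_t alignment)
{
    return (elfEnd + alignment - 1) / alignment * alignment;
}
```

For an object file without segments whose ELF data ends at offset 100, this
yields a resources offset of 128.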


\subsection{PEF Object Files}

Similar to ELF files, the resources are simply appended to the regular data of
a PEF file, but they are not aligned to any value. That is, the resources
start directly after the last PEF section without any padding.
As no field exists that gives the size of the PEF container (the
regular data), it has to be deduced by iterating through the PEF section
headers.



\section{The Resources Format}
\label{resources-format}

This section describes the resources format. After a subsection that outlines
its general layout, subsections discussing the major parts follow.

A general remark regarding the byte ordering: Resources have no standard
endianness, that is, the resources created by little endian and big endian
machines differ. Usually it should be possible to deduce the endianness used
from the type of the file. x86 resource files contain little endian data, PPC
resource files big endian data. The endianness of an ELF file is encoded in
its header.

As there is in fact no good reason to have different resource file formats,
even if they differ only in the format of the header (see section
\ref{file-formats}), it may be decided to use the x86 resource file format
also for big endian machines. In that case the endianness can be deduced from
the first field of the resources header (\code{rh\_resources\_magic}, see
subsection \ref{resources-header}).
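
A minimal sketch of this deduction is given below. It interprets the first
four bytes of the resources in both byte orders and compares them with the
magic value \code{0x444f1000} defined for the resources header. The names
\code{ByteOrder} and \code{DetectResourcesByteOrder} are hypothetical.

```cpp
#include <cstdint>

// Magic value of the resources header (rh_resources_magic).
constexpr std::uint32_t kResourcesHeaderMagic = 0x444f1000;

enum class ByteOrder { kLittleEndian, kBigEndian, kUnknown };

// Hypothetical helper: `magic` points to the first four bytes of the
// resources, exactly as stored in the file.
constexpr ByteOrder DetectResourcesByteOrder(const unsigned char* magic)
{
    const std::uint32_t asLittle = std::uint32_t(magic[0])
        | (std::uint32_t(magic[1]) << 8)
        | (std::uint32_t(magic[2]) << 16)
        | (std::uint32_t(magic[3]) << 24);
    const std::uint32_t asBig = (std::uint32_t(magic[0]) << 24)
        | (std::uint32_t(magic[1]) << 16)
        | (std::uint32_t(magic[2]) << 8)
        | std::uint32_t(magic[3]);

    if (asLittle == kResourcesHeaderMagic)
        return ByteOrder::kLittleEndian;
    if (asBig == kResourcesHeaderMagic)
        return ByteOrder::kBigEndian;
    return ByteOrder::kUnknown;
}
```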


\subsection{Resources Layout}

The layout of the resources in a file is shown in figure
\ref{fig:resources-layout}.

\begin{figure}[h!tb]
  \begin{center}
  \begin{tabular}{|c|c|c|}
  \hline
                & \multicolumn{2}{c|}{resources header}\\
  \hhline{|~==|}
  admin section &               & index section header \\
  \hhline{~~|-|}
                & index section & resource index \\
  \hhline{~~|-|}
                &               & padding \\
  \hhline{:===:}
  \multicolumn{3}{|c|}{unknown section}\\
  \hhline{:===:}
  \multicolumn{3}{|c|}{data section}\\
  \hhline{|~~~|}
  \hhline{:===:}
  \multicolumn{3}{|c|}{info section}\\
  \hline
  \end{tabular}
  \end{center}
  \caption{The Resources Layout.}
  \label{fig:resources-layout}
\end{figure}

\noindent
There are four sections:
%
\begin{itemize}
\item{An administrative section, which comprises the resources header and the
  resource index section. The latter locates all other data in the file.
}
\item{An unknown section, whose purpose is (unsurprisingly) unknown, but which
  seems to be unused, always containing the same data.
}
\item{A data section holding the actual resource data.
}
\item{An info section, which provides additional information for each resource,
  such as type, ID and name.
}
\end{itemize}


\subsection{Resources Header}
\label{resources-header}

The resources header has the following structure:
%
\codeblockbegin
\begin{verbatim}
  struct resources_header {
      uint32 rh_resources_magic;
      uint32 rh_resource_count;
      uint32 rh_index_section_offset;
      uint32 rh_admin_section_size;
      uint32 rh_pad[13];
  };
\end{verbatim}
\codeblockend
%
\codeblockbegin
\begin{verbatim}
  const uint32 kResourcesHeaderMagic          = 0x444f1000;
  const uint32 kResourceIndexSectionOffset    = 0x00000044;
  const uint32 kResourceIndexSectionAlignment = 0x00000600;
\end{verbatim}
\codeblockend

\begin{nitemize}
\item{\code{rh\_resources\_magic}:
  Must be \code{kResourcesHeaderMagic}.
}
\item{\code{rh\_resource\_count}:
  Specifies the number of resources stored in this file. May be 0.
}
\item{\code{rh\_index\_section\_offset}:
  Specifies the offset of the resource index section relative to the beginning
  of the resources. An alternative interpretation may be the size of the
  resources header.
  Must be \code{kResourceIndexSectionOffset}.
}
\item{\code{rh\_admin\_section\_size}:
  Specifies the size of the administrative section.
  Must be \code{kResourceIndexSectionOffset} plus a multiple of
  \code{kResourceIndexSectionAlignment}.
}
\item{\code{rh\_pad}:
  Padding. \code{0x00000000} words.
}
\end{nitemize}
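
The constraints listed above can be collected in a small validation routine.
The following is only a sketch: \code{IsValidResourcesHeader} is a
hypothetical name, the fields are assumed to have been converted to host byte
order already, and the check is as strict as a writer would have to be.

```cpp
#include <cstdint>

struct resources_header {
    std::uint32_t rh_resources_magic;
    std::uint32_t rh_resource_count;
    std::uint32_t rh_index_section_offset;
    std::uint32_t rh_admin_section_size;
    std::uint32_t rh_pad[13];
};

constexpr std::uint32_t kResourcesHeaderMagic          = 0x444f1000;
constexpr std::uint32_t kResourceIndexSectionOffset    = 0x00000044;
constexpr std::uint32_t kResourceIndexSectionAlignment = 0x00000600;

// 17 words of 4 bytes each: the header size equals the index section offset.
static_assert(sizeof(resources_header) == kResourceIndexSectionOffset,
    "resources header is expected to be 0x44 bytes");

// Hypothetical helper: checks every constraint stated in the field list.
constexpr bool IsValidResourcesHeader(const resources_header& header)
{
    if (header.rh_resources_magic != kResourcesHeaderMagic)
        return false;
    if (header.rh_index_section_offset != kResourceIndexSectionOffset)
        return false;
    if (header.rh_admin_section_size < kResourceIndexSectionOffset)
        return false;
    // Admin section size: index section offset plus a multiple of 0x600.
    if ((header.rh_admin_section_size - kResourceIndexSectionOffset)
            % kResourceIndexSectionAlignment != 0)
        return false;
    for (std::uint32_t pad : header.rh_pad) {
        if (pad != 0)
            return false;
    }
    return true;
}
```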


\subsection{Resource Index Section}
\label{resources-index}

The resource index section starts with a header, followed by a table of
\code{resource\_index\_entry} structures that locate the data of each
resource. The section ends with a special padding.

\noindent
The resource index section header has the following structure:
%
\codeblockbegin
\begin{verbatim}
  struct resource_index_section_header {
      uint32 rish_index_section_offset;
      uint32 rish_index_section_size;
      uint32 rish_unused_data1;
      uint32 rish_unknown_section_offset;
      uint32 rish_unknown_section_size;
      uint32 rish_unused_data2[25];
      uint32 rish_info_table_offset;
      uint32 rish_info_table_size;
      uint32 rish_unused_data3;
  };
\end{verbatim}
\codeblockend

\begin{verbatim}
  const uint32 kUnknownResourceSectionSize = 0x00000168;
\end{verbatim}
%
\begin{nitemize}
\item{\code{rish\_index\_section\_offset}:
  Specifies the offset of the resource index section relative to the beginning
  of the resources. An alternative interpretation may be the size of the
  resources header.
  Must be \code{kResourceIndexSectionOffset}.
}
\item{\code{rish\_index\_section\_size}:
  Specifies the size of the resource index section.
  Must be a multiple of \code{kResourceIndexSectionAlignment}.
}
\item{\code{rish\_unused\_data1}:
  Contains special data as described in section \ref{resources-unknown}.
}
\item{\code{rish\_unknown\_section\_offset}:
  Specifies the offset of the unknown section relative to the beginning
  of the resources.
  Must be the same value as given in the resources header for
  \code{rh\_admin\_section\_size}.
}
\item{\code{rish\_unknown\_section\_size}:
  Specifies the size of the unknown section.
  Must be \code{kUnknownResourceSectionSize}.
}
\item{\code{rish\_unused\_data2}:
  Contains special data as described in section \ref{resources-unknown}.
}
\item{\code{rish\_info\_table\_offset}:
  Specifies the offset of the resource info table relative to the beginning
  of the resources.
}
\item{\code{rish\_info\_table\_size}:
  Specifies the size of the resource info table.
}
\item{\code{rish\_unused\_data3}:
  Contains special data as described in section \ref{resources-unknown}.
}
\end{nitemize}

Directly after the header, without padding, follows a table of
\code{resource\_index\_entry} structures. The number of entries in the table
is the number of resources stored in the file, that is, the value specified by
the \code{rh\_resource\_count} member of the resources header. Since the
entries are stored without padding, the size of the table is exactly the
product of the size of \code{resource\_index\_entry} and the number of
resources. If the latter is 0, the table takes no space.
%
\codeblockbegin
\begin{verbatim}
  struct resource_index_entry {
      uint32 rie_offset;
      uint32 rie_size;
      uint32 rie_pad;
  };
\end{verbatim}
\codeblockend
%
\begin{nitemize}
\item{\code{rie\_offset}:
  Specifies the offset of the resource data relative to the beginning
  of the resources.
}
\item{\code{rie\_size}:
  Specifies the size of the resource data.
}
\item{\code{rie\_pad}:
  Padding. Must be \code{0x00000000}.
}
\end{nitemize}

Since the size of the resource index section must be a multiple of
\code{kResourceIndexSectionAlignment}, some padding may be needed at the end
of this section. What this padding looks like is described in section
\ref{resources-unknown}.
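
The resulting section size can be sketched as follows. The header consists of
33 words (132 bytes) and each index entry of 3 words (12 bytes). The sketch
assumes that a writer chooses the smallest multiple of
\code{kResourceIndexSectionAlignment} that holds the header and the table;
this document does not state whether existing tools always pick the smallest
one. The function name \code{IndexSectionSize} is hypothetical.

```cpp
#include <cstdint>

constexpr std::uint32_t kResourceIndexSectionAlignment = 0x00000600;

// Sizes derived from the structure definitions in this section.
constexpr std::uint32_t kIndexSectionHeaderSize = 33 * 4;  // 33 uint32 fields
constexpr std::uint32_t kIndexEntrySize         = 3 * 4;   // 3 uint32 fields

// Assumption: smallest multiple of the alignment that fits header and table.
constexpr std::uint32_t IndexSectionSize(std::uint32_t resourceCount)
{
    const std::uint32_t used
        = kIndexSectionHeaderSize + resourceCount * kIndexEntrySize;
    return (used + kResourceIndexSectionAlignment - 1)
        / kResourceIndexSectionAlignment * kResourceIndexSectionAlignment;
}

// The padding is whatever remains after the header and the table.
constexpr std::uint32_t IndexSectionPadding(std::uint32_t resourceCount)
{
    return IndexSectionSize(resourceCount)
        - kIndexSectionHeaderSize - resourceCount * kIndexEntrySize;
}
```

Under this assumption, up to 117 resources fit into a section of
\code{0x600} bytes, and the 118th resource enlarges it to \code{0xc00} bytes.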


\subsection{Unknown Section}
\label{resources-unknown}

The meaning of this section is unknown. It does not seem to be used at all.
It always contains the same data given by \code{kUnusedResourceDataPattern}:
%
\codeblockbegin
\begin{verbatim}
  const uint32 kUnusedResourceDataPattern[3] = {
      0xffffffff, 0x000003e9, 0x00000000
  };
\end{verbatim}
\codeblockend
%
In section \ref{resources-index} some members have names containing
\code{unused\_data}. These fields contain the same kind of data. To understand
what the value for a certain field of this type is, it may help to imagine
that, before the resources are written to a file, the space they will take is
filled with the pattern specified by \code{kUnusedResourceDataPattern}, and
that only those fields are written that are not unused. Thus the original
pattern shows through at the unused locations.

To be precise: Let \verb|uint32 resources[]| be the resources and
\code{index} the index of an unused field in \code{resources}, then the
following holds:
%
\begin{verbatim}
  resources[index] == kUnusedResourceDataPattern[index % 3];
\end{verbatim}
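
The relation can be turned into two small helpers, sketched below with
hypothetical names. Word indices are assumed to be counted from the beginning
of the resources. As an example, \code{rish\_unused\_data1} lies at byte
offset \code{0x44 + 8}, i.e. at word index 19, so the relation predicts the
value \code{0x000003e9} for it.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t kUnusedResourceDataPattern[3] = {
    0xffffffff, 0x000003e9, 0x00000000
};

// Value expected in an unused word at the given word index, counted from the
// beginning of the resources.
constexpr std::uint32_t ExpectedUnusedWord(std::size_t wordIndex)
{
    return kUnusedResourceDataPattern[wordIndex % 3];
}

// Returns true if all words in [firstWord, firstWord + wordCount) show the
// unused data pattern. `resources` points to word 0 of the resources, with
// the words already converted to host byte order.
constexpr bool IsUnusedData(const std::uint32_t* resources,
    std::size_t firstWord, std::size_t wordCount)
{
    for (std::size_t i = firstWord; i < firstWord + wordCount; i++) {
        if (resources[i] != ExpectedUnusedWord(i))
            return false;
    }
    return true;
}
```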
%


\subsection{Resource Info Table}
\label{resources-infotable}

The resource info table features exactly one entry for each resource.
Such an entry (resource info) specifies the ID and name of a
resource. Consecutive infos for resources of the same type are collected in
a block that starts with a type field.

The following grammar specifies the layout of the resource info table.
Nonterminals start with an upper case letter, terminals with a lower case
letter.
%
\begin{verbatim}
  ResourceInfoTable      ::= [ ResourceBlockList ]
                             ResourceInfoSeparator
                             ResourceInfoTableEnd

  ResourceBlockList      ::= ResourceBlock
                             [ ResourceInfoSeparator
                               ResourceBlockList ]

  ResourceBlock          ::= type ResourceInfoList

  ResourceInfoList       ::= ResourceInfo [ ResourceInfoList ]

  ResourceInfo           ::= id index name_size name

  ResourceInfoSeparator  ::= 0xffffffff 0xffffffff

  ResourceInfoTableEnd   ::= check_sum 0x00000000
\end{verbatim}
%
The relevant structures follow:
%
\codeblockbegin
\begin{verbatim}
  struct resource_info_block {
      type_code     rib_type;
      resource_info rib_info[1];
  };
\end{verbatim}
\codeblockend
%
\begin{nitemize}
\item{\code{rib\_type}:
  Specifies the type of the resources in the block.
}
\item{\code{rib\_info}:
  Is the first resource info of the block. More infos may follow.
}
\end{nitemize}
%
\codeblockbegin
\begin{verbatim}
  struct resource_info {
      int32  ri_id;
      int32  ri_index;
      uint16 ri_name_size;
      char   ri_name[1];
  };
\end{verbatim}
\codeblockend

\begin{verbatim}
  const uint32 kMinResourceInfoSize = 10;
\end{verbatim}
%
\begin{nitemize}
\item{\code{ri\_id}:
  Specifies the ID of the resource.
}
\item{\code{ri\_index}:
  Specifies the index of the resource this resource info refers to.
}
\item{\code{ri\_name\_size}:
  Specifies the size of the resource name. May be 0 -- then the resource does
  not have a name and \code{ri\_name} has a size of 0.
}
\item{\code{ri\_name}:
  Specifies the name of the resource. The name must be null terminated.
  \code{ri\_name\_size} specifies the size of this field (including the
  terminating null). If it is 0, the resource does not have a name and
  \code{ri\_name} is empty, i.e. has size 0.
}
\item{\code{kMinResourceInfoSize}:
  Is the minimal size of a resource info, that is, the size it has if the
  resource does not have a name.
}
\end{nitemize}
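
To illustrate how the grammar and the variable-sized \code{resource\_info}
fit together, the following sketch walks a resource info table and counts its
entries. All function names are hypothetical. The sketch assumes little endian
data, takes the last eight bytes of the table to be the
\code{ResourceInfoTableEnd} without verifying the check sum, and recognizes
the end of a block by the two \code{0xffffffff} words of the separator.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kMinResourceInfoSize = 10;
constexpr std::size_t kSeparatorSize = 8;  // two 0xffffffff words
constexpr std::size_t kTableEndSize  = 8;  // check sum and terminator

constexpr std::uint32_t ReadLE32(const unsigned char* p)
{
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8)
        | (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

constexpr bool IsSeparatorAt(const unsigned char* table, std::size_t pos,
    std::size_t end)
{
    return end - pos >= kSeparatorSize
        && ReadLE32(table + pos) == 0xffffffff
        && ReadLE32(table + pos + 4) == 0xffffffff;
}

// Returns the number of resource infos, or -1 if the table is malformed.
constexpr int CountResourceInfos(const unsigned char* table, std::size_t size)
{
    if (size < kSeparatorSize + kTableEndSize)
        return -1;
    const std::size_t end = size - kTableEndSize;  // infos and separators
    if (end == kSeparatorSize && IsSeparatorAt(table, 0, end))
        return 0;  // table without any block

    std::size_t pos = 0;
    int count = 0;
    while (pos < end) {
        if (end - pos < 4)
            return -1;
        pos += 4;  // skip the type field of the block
        int infosInBlock = 0;
        while (!IsSeparatorAt(table, pos, end)) {
            if (end - pos < kMinResourceInfoSize)
                return -1;
            // ri_name_size follows ri_id and ri_index (8 bytes).
            const std::size_t nameSize = std::size_t(table[pos + 8])
                | (std::size_t(table[pos + 9]) << 8);
            if (end - pos - kMinResourceInfoSize < nameSize)
                return -1;
            pos += kMinResourceInfoSize + nameSize;
            infosInBlock++;
        }
        if (infosInBlock == 0)
            return -1;  // a block needs at least one info
        count += infosInBlock;
        pos += kSeparatorSize;
    }
    return count;
}
```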
%
\codeblockbegin
\begin{verbatim}
  struct resource_info_separator {
      uint32 ris_value1;
      uint32 ris_value2;
  };
\end{verbatim}
\codeblockend
%
\begin{nitemize}
\item{\code{ris\_value1}:
  Specifies the first word of the separator.
  Must be \code{0xffffffff}.
}
\item{\code{ris\_value2}:
  Specifies the second word of the separator.
  Must be \code{0xffffffff}.
}
\end{nitemize}
%
\codeblockbegin
\begin{verbatim}
  struct resource_info_table_end {
      uint32 rite_check_sum;
      uint32 rite_terminator;
  };
\end{verbatim}
\codeblockend
%
\begin{nitemize}
\item{\code{rite\_check\_sum}:
  Contains the check sum for the resource info table. The check sum is
  calculated from all bytes of the resource info table not including
  \code{rite\_check\_sum} and \code{rite\_terminator}. The data are grouped
  into four byte blocks, which are interpreted as big endian unsigned words
  and summed up, ignoring carry. If the number of bytes to be considered is
  not divisible by four, the remaining bytes are interpreted as the lower
  bytes of a big endian unsigned word (the upper byte(s) set to 0).
}
\item{\code{rite\_terminator}:
  Terminates the resource info table.
  Must be \code{0x00000000}.
}
\end{nitemize}
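
The check sum rule can be sketched as follows. The function name
\code{CalculateCheckSum} is hypothetical; \code{data} is assumed to point to
the resource info table and \code{size} to exclude the eight bytes of
\code{rite\_check\_sum} and \code{rite\_terminator}.

```cpp
#include <cstddef>
#include <cstdint>

// Sums the data as big endian 32 bit words, ignoring carry. Unsigned
// arithmetic wraps around modulo 2^32, which discards the carry.
constexpr std::uint32_t CalculateCheckSum(const unsigned char* data,
    std::size_t size)
{
    std::uint32_t sum = 0;
    std::size_t i = 0;
    for (; i + 4 <= size; i += 4) {
        sum += (std::uint32_t(data[i]) << 24)
            | (std::uint32_t(data[i + 1]) << 16)
            | (std::uint32_t(data[i + 2]) << 8)
            | std::uint32_t(data[i + 3]);
    }
    // One to three remaining bytes form the lower bytes of a big endian
    // word whose upper bytes are 0.
    std::uint32_t rest = 0;
    for (; i < size; i++)
        rest = (rest << 8) | data[i];
    return sum + rest;
}
```

For the ten bytes \code{00 00 00 01 ff ff ff ff 12 34} the two full words sum
to 0 (the carry is dropped) and the remaining two bytes contribute
\code{0x1234}.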



\section{Implementations}
\label{implementations}

Code that writes resources should strictly stick to the specification
presented in the preceding sections to achieve maximal compatibility.

Implementations that read resources may tolerate certain deviations, which
for instance occur in several files of the BeOS R5 distribution
and which are handled gracefully by xres and QuickRes. A possibly
incomplete list follows:
%
\begin{itemize}
\item{The third and fourth byte of the x86 resource file magic (the 0 bytes)
  may be arbitrary bytes.}
\item{\code{rh\_resource\_count} may be unreliable. The resource index table
  should be read until its end, which is either marked by the unused data
  pattern (see section \ref{resources-unknown}) or at the latest by the
  beginning of the unknown section.}
\item{The resource info table may contain entries for indices that are out
  of range, i.e. greater than the number of resources implied by the resource
  index table. Those entries should be ignored.}
\item{The resource info table may contain multiple entries for an index.
  Any such entry after the first one should be ignored.}
\item{The resource info table may lack an entry for an index.
  The respective resource should be ignored.}
\item{The resource info table may lack a \code{ResourceInfoTableEnd}
  (see section \ref{resources-infotable}) and thus a check sum. The table
  should be accepted nevertheless. Note that a table containing a wrong check
  sum is {\em not} to be accepted.}
\end{itemize}
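
The second item can be sketched as follows. Instead of trusting
\code{rh\_resource\_count}, the sketch counts index entries until it meets an
entry whose three words all show the unused data pattern, or until it reaches
a given limit such as the beginning of the unknown section. The function name
is hypothetical, and word indices are assumed to be counted from the beginning
of the resources, with all words already converted to host byte order.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t kUnusedResourceDataPattern[3] = {
    0xffffffff, 0x000003e9, 0x00000000
};

// `resources` points to word 0 of the resources. `tableStart` is the word
// index of the first resource_index_entry, `tableLimit` the word index at
// which the table must end at the latest.
constexpr std::size_t CountIndexEntries(const std::uint32_t* resources,
    std::size_t tableStart, std::size_t tableLimit)
{
    std::size_t count = 0;
    for (std::size_t w = tableStart; w + 3 <= tableLimit; w += 3) {
        bool unused = true;
        for (std::size_t k = 0; k < 3; k++) {
            if (resources[w + k] != kUnusedResourceDataPattern[(w + k) % 3])
                unused = false;
        }
        if (unused)
            break;  // the padding of the index section starts here
        count++;
    }
    return count;
}
```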



\section{Status of this Document}
\label{status}

The information contained in this document was obtained by analyzing
files containing resources that were created or modified by tools available
for BeOS R5, namely QuickRes and xres. It is incomplete and may even be
partially wrong where it is based on incorrect assumptions.

\noindent
The following items have a low degree of reliability:
\begin{itemize}
\item{Resources alignment in ELF files: Several tests with linker object files
  have shown that QuickRes aligns their resources offset to 32 bytes.
  For executables, on the other hand, the alignment was always 4096, which is
  the usual memory page size of current x86 architectures and therefore the
  preferred program segment alignment. From these two observations it has been
  deduced that the alignment is the maximum of the segment alignments found
  in the program header table, if present, but at least 32.}
\item{The resources layout: The general layout of the resources is not very
  well understood. The layout presented in figure \ref{fig:resources-layout}
  resulted from the attempt to assign all the fields a reasonable meaning, but
  in fact not even the exact length and meaning of the fields of the resources
  header are clear. The same holds for the resource index section header.}
\item{The unknown section: The contents of the unknown section and of unknown
  fields are based on educated guesses.}
\end{itemize}

\end{document}