Ar (Unix)

From Vero - Wikipedia

ar, short for archiver, is a shell command that maintains multiple files as a single archive file; it is a file archiver. It is often used to create and update the static library files that the link editor or linker uses, and to generate deb format packages for the Debian Linux distribution. It can be used to create archives for any purpose, but has been largely replaced by tar for purposes other than static libraries.<ref name="Static Libraries">Template:Cite web</ref>

Originally developed for Unix, the command is widely available on Unix-based systems, and similar commands are available on other platforms. An implementation is included in GNU Binutils.<ref name="ar(1) – Linux man page">Template:Cite web</ref> In the Linux Standard Base (LSB), the command has been deprecated and is expected to disappear in a future release of that standard. The rationale provided was that "the LSB does not include software development utilities nor does it specify .o and .a file formats."<ref>Linux Standard Base Core Specification, version 4.1, Chapter 15. Commands and Utilities > ar</ref>

File format

Diagram showing an example file structure of a .deb file

The format of a file that results from using ar has never been standardized.<ref name="ar5">Manual page for NET/2 ar file format</ref><ref name="Levine99">Template:Cite book Code: [1][2] Errata: [3]</ref>

The first format appeared in the first edition of Unix<ref>Template:Cite web</ref> and was used through Version 6 Unix.<ref>Template:Cite web</ref><ref name="ar5" /> Version 7 Unix had a modified version of that format,<ref>Template:Cite web</ref><ref name="ar5" /> which was also used in UNIX System III<ref>Template:Cite book</ref> and in UNIX System V on the PDP-11.<ref>Template:Cite book</ref>

A new format was introduced in the first release of System V on processors other than PDP-11s.<ref>Template:Cite book</ref>

Modern archives are, on most systems, based on a common format with two main variants: BSD<ref name="ar5" /> (initially used for a.out files) and UNIX System V Release 2 and later<ref>Template:Cite book</ref> (initially used for COFF files and later for ELF files). The System V variant is also used by GNU<ref>Template:Cite web</ref> and Windows. AIX has its own formats (small<ref>Template:Cite web</ref> and big<ref>Template:Cite web</ref>), as does Coherent; those formats differ significantly from the common format.

Structure

An archive file begins with a header that identifies the file type, followed by a section for each contained file. Each contained file section consists of a header followed by the file content. The headers consist solely of printable ASCII characters and line feeds, so an archive containing only text files is itself a text file.

The content of a contained file begins on an even byte boundary. A newline is inserted between files as padding, if necessary. Nevertheless, the size stored reflects the size excluding padding.<ref>Template:Cite web</ref>
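The even-boundary rule can be illustrated with a little arithmetic. The following is a minimal sketch, not taken from any particular implementation; the function name and parameters are hypothetical.

```python
def next_header_offset(data_offset: int, size: int) -> int:
    """Return the archive offset at which the next member header starts.

    data_offset: offset of the member's first content byte (assumed even).
    size: content size as recorded in the header, which excludes padding.
    """
    end = data_offset + size
    # Round up to the next even offset. The gap, if any, is the single
    # newline byte that is inserted as padding.
    return end + (end & 1)
```

For example, a 5-byte member whose content starts at offset 68 ends at offset 73, so one padding byte is inserted and the next header starts at offset 74.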

Archive header

The first header, a.k.a. file signature, is a magic number that encodes the ASCII string !<arch> followed by a single line feed character (0x0A).
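As a small illustration, the signature can be checked by comparing the first eight bytes of a file. This is a sketch only; the names AR_MAGIC and is_ar_archive are hypothetical.

```python
# The ASCII string "!<arch>" followed by one line feed (0x0A): 8 bytes.
AR_MAGIC = b"!<arch>\n"


def is_ar_archive(data: bytes) -> bool:
    """Return True if data begins with the common ar file signature."""
    return data[:len(AR_MAGIC)] == AR_MAGIC
```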

Contained file header

Each file is preceded by a header that contains information about the file. The common format is as follows. Numeric values are encoded in ASCII and all values are right-padded with spaces (0x20).

Offset  Length  Content                                  Format
0       16      File identifier                          ASCII
16      12      File modification timestamp (seconds)    Decimal
28      6       Owner ID                                 Decimal
34      6       Group ID                                 Decimal
40      8       File mode (type and permission)          Octal
48      10      File size in bytes                       Decimal
58      2       Ending characters                        0x60 0x0A
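The field layout above can be decoded by slicing a 60-byte header at the listed offsets. The sketch below assumes the common format only; the names MemberHeader and parse_member_header are hypothetical, and treating an all-blank numeric field as zero is an assumption made here for robustness, not something the format description states.

```python
from typing import NamedTuple

HEADER_SIZE = 60  # 16 + 12 + 6 + 6 + 8 + 10 + 2


class MemberHeader(NamedTuple):
    name: str    # raw file identifier with trailing spaces removed
    mtime: int   # modification time in seconds
    uid: int
    gid: int
    mode: int    # file type and permission bits
    size: int    # content size in bytes, excluding padding


def _number(field: bytes, base: int = 10) -> int:
    """Decode a space-padded ASCII number (assumption: blank means 0)."""
    text = field.decode("ascii").strip()
    return int(text, base) if text else 0


def parse_member_header(raw: bytes) -> MemberHeader:
    """Decode one contained-file header of the common format."""
    if len(raw) != HEADER_SIZE:
        raise ValueError("a member header is exactly 60 bytes")
    if raw[58:60] != b"\x60\x0a":
        raise ValueError("missing ending characters 0x60 0x0A")
    return MemberHeader(
        name=raw[0:16].decode("ascii").rstrip(" "),
        mtime=_number(raw[16:28]),
        uid=_number(raw[28:34]),
        gid=_number(raw[34:40]),
        mode=_number(raw[40:48], 8),  # the mode is stored in octal
        size=_number(raw[48:58]),
    )
```

The name is returned exactly as stored; variant-specific conventions for the file identifier are not interpreted here.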

Variants

Variants of the command were developed to address issues including:

File name length limitation
The GNU and BSD variants devised different methods of storing long file names.
Global symbol table
Many implementations include a global symbol table (a.k.a. armap, directory or index) for fast linking without needing to scan the whole archive for a symbol. POSIX recognizes this feature, and requires implementations to have an -s option for updating it. Most implementations put it at the first file entry.<ref>Template:Man</ref>
Year 2038 problem
Although the common format itself is not at risk of this problem, because its timestamp is a decimal text field, many implementations are vulnerable to failure in that year.
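The Year 2038 point can be checked with simple arithmetic: the 12-character decimal timestamp field holds values far beyond the signed 32-bit range. The sketch below assumes that a vulnerable implementation is one that keeps the timestamp in a signed 32-bit integer; the names are hypothetical.

```python
FIELD_WIDTH = 12                          # timestamp field length in characters
max_field_value = 10 ** FIELD_WIDTH - 1   # largest decimal value that fits
max_int32 = 2 ** 31 - 1                   # 2147483647, reached on 2038-01-19 UTC


def fits_in_int32(timestamp: int) -> bool:
    """Return True if the timestamp survives storage in a signed 32-bit integer."""
    return -2 ** 31 <= timestamp <= max_int32
```

A timestamp of 2147483648 fits comfortably in the header field but not in a signed 32-bit variable.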

BSD

The BSD implementation stores file names right-padded with ASCII spaces. This causes issues with spaces inside file names. The 4.4BSD implementation stores extended file names by placing the string "#1/" followed by the file name length in the file name field, and storing the real file name at the start of the data section, in front of the file content.<ref name="ar5"/>
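The "#1/" convention can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the data section passed in is assumed to include the embedded name, and stripping trailing NUL bytes from the name is an assumption about padding rather than something described above.

```python
from typing import Tuple


def bsd_member_name(name_field: bytes, data: bytes) -> Tuple[str, bytes]:
    """Return (file name, file content) for a BSD-style member.

    name_field: the 16-byte file identifier from the header.
    data: the member's data section (assumed to start with the long name).
    """
    field = name_field.decode("ascii").rstrip(" ")
    if field.startswith("#1/"):
        name_len = int(field[3:])  # decimal length that follows "#1/"
        # Assumption: the stored name may be NUL-padded; strip the padding.
        name = data[:name_len].rstrip(b"\0").decode("utf-8")
        return name, data[name_len:]
    # Short name: stored directly in the header, padded with spaces.
    return field, data
```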

The BSD implementation traditionally does not handle the building of a global symbol lookup table, and delegates this task to a separate utility, ranlib,<ref>Template:Cite web</ref> which inserts an architecture-specific file named __.SYMDEF as the first archive member.<ref>Template:Cite web</ref> Some descendants put a space and "SORTED" after the name to indicate a sorted version.<ref>Template:Cite web</ref> A 64-bit variant called __.SYMDEF_64 exists on Darwin.

To conform to POSIX, newer BSD implementations support the -s option instead of ranlib. FreeBSD in particular abandoned the SYMDEF table format and adopted the System V style table.<ref>Template:Man</ref>

System V (or GNU)

The System V implementation uses a slash ('/') to mark the end of the file name, which allows spaces in names without resorting to an extended file name. Extended file names are stored together in the data section of a special member named "//", and later headers refer to this record. A header references an extended file name by storing a "/" followed by a decimal offset to the start of the file name in the extended file name data section.<ref>An offset is a number of characters; not a line or item index.</ref> The "//" member itself is simply a list of the long file names, each separated by one or more LF characters. It is usually the second entry of the archive, after the symbol table, which is always the first.
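Name resolution in this variant can be sketched as below. The function name is hypothetical, and the handling of a trailing slash on entries inside the "//" data is an assumption (it is stripped if present).

```python
# Member names with a special meaning in the System V / GNU variant.
SPECIAL_NAMES = ("/", "//", "/SYM64/")


def sysv_member_name(name_field: bytes, long_names: bytes = b"") -> str:
    """Resolve the file name of a System V style member.

    name_field: the 16-byte file identifier from the header.
    long_names: data section of the "//" member, if the archive has one.
    """
    field = name_field.decode("ascii").rstrip(" ")
    if field in SPECIAL_NAMES:
        return field
    if field.startswith("/") and field[1:].isdigit():
        # "/<offset>": a character offset into the long-name data.
        start = int(field[1:])
        end = long_names.find(b"\n", start)
        if end == -1:
            end = len(long_names)
        name = long_names[start:end].decode("utf-8")
    else:
        name = field
    # A slash marks the end of the name; drop it if present.
    return name[:-1] if name.endswith("/") else name
```

Because the end of the name is marked by the slash rather than by padding, a name such as "my file.o" survives intact.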

The System V implementation uses the special file name "/" to denote that the following data entry contains a symbol lookup table, which is used in ar libraries to speed up access. This symbol table is built in three parts, which are recorded together as contiguous data.

  1. A 32-bit big-endian integer, giving the number of entries in the table.
  2. A set of 32-bit big-endian integers, one for each symbol, recording the position within the archive of the header for the file containing this symbol.
  3. A set of zero-terminated strings. Each is a symbol name, and occurs in the same order as the list of positions in part 2.
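The three parts listed above can be decoded with a few lines of code. This is a minimal sketch of the 32-bit layout only; the function name is hypothetical, and symbol names are assumed to be ASCII.

```python
import struct
from typing import List, Tuple


def parse_sysv_symbol_table(data: bytes) -> List[Tuple[str, int]]:
    """Decode the data section of the "/" member.

    Returns (symbol name, offset of the defining member's header) pairs.
    """
    # Part 1: number of entries, 32-bit big-endian.
    (count,) = struct.unpack_from(">I", data, 0)
    # Part 2: one 32-bit big-endian header offset per symbol.
    offsets = struct.unpack_from(">%dI" % count, data, 4)
    # Part 3: zero-terminated names, in the same order as the offsets.
    pos = 4 + 4 * count
    symbols = []
    for offset in offsets:
        end = data.index(b"\0", pos)
        symbols.append((data[pos:end].decode("ascii"), offset))
        pos = end + 1
    return symbols
```

A linker can use the returned offsets to jump straight to the member that defines a symbol instead of scanning every member.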

Some System V systems do not use this format. For operating systems such as HP-UX 11.0, this information is stored in a data structure based on the SOM file format.

The special file "/" is not terminated with a specific sequence; the end is assumed once the last symbol name has been read.

To overcome the 4 GiB file size limit, some systems, such as Solaris 11.2 and GNU, use a variant lookup table in which 64-bit integers replace the 32-bit integers. The string "/SYM64/" is used instead of "/" as the identifier for this table.<ref name="ar64bit">Template:Cite web</ref>

Windows

The Windows (PE/COFF) variant is based on the SysV/GNU variant. The first entry "/" has the same layout as the SysV/GNU symbol table. The second entry is another "/", a Microsoft extension that stores an extended symbol cross-reference table. This one is sorted and uses little-endian integers.<ref name="Levine99"/><ref>Template:Citation</ref> The third entry is the optional "//" long name data as in SysV/GNU.<ref>Template:Cite web</ref>

Thin archive

The GNU binutils and Elfutils implementations have an additional "thin archive" format with the magic number !<thin>. A thin archive contains only a symbol table and references to the files. The file format is essentially a System V format archive in which every file is stored without its data section. Every file name is stored as a "long" file name, and the names are to be resolved as if they were symbolic links.<ref>Template:Cite web</ref>

Examples

The following command creates an archive libclass.a with object files class1.o, class2.o, class3.o:

ar rcs libclass.a class1.o class2.o class3.o

The linker ld can read object code from an archive file. The following example shows how the archive libclass.a (specified as -lclass) is linked with the object code of main.o.

ld main.o -lclass

References

Template:Reflist
