STAR(4L)  Schily´s USER COMMANDS  STAR(4L)
star - tape archive file format
Tar Archives are layered archives. The basic structure is defined by the
POSIX.1-1988 archive format and documented in the BASIC TAR HEADER
DESCRIPTION section below. The higher level structure is defined by the
POSIX.1-2001 extended headers and documented in the EXTENDED TAR (PAX)
HEADER STRUCTURE section below. POSIX.1-2001 extended headers are pseudo
files that contain an unlimited number of extended header keywords and
associated values. The header keywords are documented in the EXTENDED TAR
(PAX) HEADER KEYWORDS section below.
A tar archive consists of a series of 512 byte (TBLOCK) records
that are grouped into blocks of typically 20 records (10240 bytes).
A number of TBLOCK sized records are grouped together into a tape
block for physical I/O operations. Each block of n records is written
with a single write(2) operation. On magnetic tapes, this results in
a single tape record if the tape drive is in variable length
record mode.
The block structure is only visible on blocked tapes. A larger
block size results in higher throughput. Tapes that use a block size larger
than 63 kB however may not be readable on all hardware and need to
be avoided if read compatibility is important. Older tar implementations do
not support block sizes larger than 10240 bytes.
For fast backups, it is recommended to use block sizes of at least
256 kB and to verify throughput and readability on available
hardware.
Note that the terms block and record are frequently
used inconsistently (starting as early as the historical documentation from
1979) and that the term block is typically used for 512 bytes if
the tape block size is not relevant in a specific context.
Physically, a POSIX.1-1988 tar archive consists of a series of fixed
sized blocks of TBLOCK (512) characters. It contains a series of file entries
terminated by a logical end-of-archive marker, which consists of two blocks of
512 bytes of binary zeroes. Each file entry is represented by a header block
that describes the file followed by zero or more blocks with the content of
the file. The length of each file is rounded up to a multiple of 512 bytes.
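The logical end-of-archive marker described above can be detected with a few lines of C. This is a minimal sketch, not code taken from star: the helper names is_zero_block() and is_end_of_archive() are hypothetical, and the archive is assumed to be already loaded into a memory buffer of nblocks blocks.

```c
#include <stddef.h>

#define TBLOCK 512

/* Return 1 if the 512 byte block contains only binary zeroes. */
static int
is_zero_block(const unsigned char *blk)
{
    size_t i;

    for (i = 0; i < TBLOCK; i++) {
        if (blk[i] != 0)
            return (0);
    }
    return (1);
}

/*
 * Return 1 if the two blocks starting at block index idx are both
 * zero filled, i.e. if they form the logical end-of-archive marker.
 * buf is assumed to hold nblocks blocks of TBLOCK bytes each.
 */
static int
is_end_of_archive(const unsigned char *buf, size_t nblocks, size_t idx)
{
    if (idx + 1 >= nblocks)
        return (0);
    return (is_zero_block(buf + idx * TBLOCK) &&
        is_zero_block(buf + (idx + 1) * TBLOCK));
}
```

A reader would typically call such a check before trying to interpret a block as a header block.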
The header block is defined in star.h as follows:
/*
* POSIX.1-1988 field size values and magic.
*/
#define TBLOCK 512
#define NAMSIZ 100
#define PFXSIZ 155
#define TMODLEN 8
#define TUIDLEN 8
#define TGIDLEN 8
#define TSIZLEN 12
#define TMTMLEN 12
#define TCKSLEN 8
#define TMAGIC "ustar" /* ustar magic 5 chars + '\0' */
#define TMAGLEN 6 /* "ustar" including '\0' */
#define TVERSION "00"
#define TVERSLEN 2
#define TUNMLEN 32
#define TGNMLEN 32
#define TDEVLEN 8
/*
* POSIX.1-1988 typeflag values
*/
#define REGTYPE '0' /* Regular File */
#define AREGTYPE '\0' /* Regular File (outdated) */
#define LNKTYPE '1' /* Hard Link */
#define SYMTYPE '2' /* Symbolic Link */
#define CHRTYPE '3' /* Character Special */
#define BLKTYPE '4' /* Block Special */
#define DIRTYPE '5' /* Directory */
#define FIFOTYPE '6' /* FIFO (named pipe) */
#define CONTTYPE '7' /* Contiguous File */
/*
* POSIX.1-2001 typeflag extensions.
* POSIX.1-2001 calls the extended USTAR format PAX although it is
* definitely derived from and based on USTAR. The reason may be that
* POSIX.1-2001 calls the tar program outdated and lists the
* pax program as the successor.
*/
#define LF_GHDR 'g' /* POSIX.1-2001 global extended header */
#define LF_XHDR 'x' /* POSIX.1-2001 extended header */
See section EXTENDED TAR (PAX) HEADER KEYWORDS for more
information about the structure of a POSIX.1-2001 header.
/*
* star/gnu/Sun tar extensions:
*
* Note that the standards committee allows only capital A through
* capital Z for user-defined expansion. This means that defining
* something as, say '8' is a *bad* idea.
*/
#define LF_ACL 'A' /* Solaris Access Control List */
#define LF_DUMPDIR 'D' /* GNU dump dir */
#define LF_EXTATTR 'E' /* Solaris Extended Attribute File */
#define LF_META 'I' /* Inode (metadata only) no file content */
#define LF_LONGLINK 'K' /* NEXT file has a long linkname */
#define LF_LONGNAME 'L' /* NEXT file has a long name */
#define LF_MULTIVOL 'M' /* Continuation file rest to be skipped */
#define LF_NAMES 'N' /* OLD GNU for names > 100 characters */
#define LF_SPARSE 'S' /* This is for sparse files */
#define LF_VOLHDR 'V' /* tape/volume header Ignore on extraction */
#define LF_VU_XHDR 'X' /* POSIX.1-2001 extended (Sun VU version) */
/*
* Definitions for the t_mode field
*/
#define TSUID 04000 /* Set UID on execution */
#define TSGID 02000 /* Set GID on execution */
#define TSVTX 01000 /* On directories, restricted deletion flag */
#define TUREAD 00400 /* Read by owner */
#define TUWRITE 00200 /* Write by owner special */
#define TUEXEC 00100 /* Execute/search by owner */
#define TGREAD 00040 /* Read by group */
#define TGWRITE 00020 /* Write by group */
#define TGEXEC 00010 /* Execute/search by group */
#define TOREAD 00004 /* Read by other */
#define TOWRITE 00002 /* Write by other */
#define TOEXEC 00001 /* Execute/search by other */
#define TALLMODES 07777 /* The low 12 bits */
/*
* This is the ustar (Posix 1003.1) header.
*/
struct header {
char t_name[NAMSIZ]; /* 0 Filename */
char t_mode[8]; /* 100 Permissions */
char t_uid[8]; /* 108 Numerical User ID */
char t_gid[8]; /* 116 Numerical Group ID */
char t_size[12]; /* 124 Filesize */
char t_mtime[12]; /* 136 st_mtime */
char t_chksum[8]; /* 148 Checksum */
char t_typeflag; /* 156 Type of File */
char t_linkname[NAMSIZ]; /* 157 Target of Links */
char t_magic[TMAGLEN]; /* 257 "ustar" */
char t_version[TVERSLEN]; /* 263 Version fixed to 00 */
char t_uname[TUNMLEN]; /* 265 User Name */
char t_gname[TGNMLEN]; /* 297 Group Name */
char t_devmajor[8]; /* 329 Major for devices */
char t_devminor[8]; /* 337 Minor for devices */
char t_prefix[PFXSIZ]; /* 345 Prefix for t_name */
/* 500 End */
char t_mfill[12]; /* 500 Filler up to 512 */
};
/*
* star header specific definitions
*/
#define STMAGIC "tar" /* star magic */
#define STMAGLEN 4 /* "tar" including '\0' */
/*
* This is the new (post Posix 1003.1-1988) xstar header
* defined in 1994.
*
* t_prefix[130] is guaranteed to be '\0' to prevent ustar
* compliant implementations from failing.
* t_mfill & t_xmagic need to be zero for a 100% ustar compliant
* implementation, so setting t_xmagic to
* "tar" should be avoided in the future.
*
* A different method to recognize this format is to verify that
* t_prefix[130] is equal to '\0' and
* t_atime[0]/t_ctime[0] is an octal number and
* t_atime[11] is equal to ' ' and
* t_ctime[11] is equal to ' '.
*
* Note that t_atime[11]/t_ctime[11] may be changed in future.
*/
struct xstar_header {
char t_name[NAMSIZ]; /* 0 Filename */
char t_mode[8]; /* 100 Permissions */
char t_uid[8]; /* 108 Numerical User ID */
char t_gid[8]; /* 116 Numerical Group ID */
char t_size[12]; /* 124 Filesize */
char t_mtime[12]; /* 136 st_mtime */
char t_chksum[8]; /* 148 Checksum */
char t_typeflag; /* 156 Type of File */
char t_linkname[NAMSIZ]; /* 157 Target of Links */
char t_magic[TMAGLEN]; /* 257 "ustar" */
char t_version[TVERSLEN]; /* 263 Version fixed to 00 */
char t_uname[TUNMLEN]; /* 265 User Name */
char t_gname[TGNMLEN]; /* 297 Group Name */
char t_devmajor[8]; /* 329 Major for devices */
char t_devminor[8]; /* 337 Minor for devices */
char t_prefix[131]; /* 345 Prefix for t_name */
char t_atime[12]; /* 476 st_atime */
char t_ctime[12]; /* 488 st_ctime */
char t_mfill[8]; /* 500 Filler up to star magic */
char t_xmagic[4]; /* 508 "tar" */
};
struct sparse {
char t_offset[12];
char t_numbytes[12];
};
#define SPARSE_EXT_HDR 21
struct xstar_ext_header {
struct sparse t_sp[21];
char t_isextended;
};
typedef union hblock {
char dummy[TBLOCK];
long ldummy[TBLOCK/sizeof (long)]; /* force long alignment */
struct header dbuf;
struct xstar_header xstar_dbuf;
struct xstar_ext_header xstar_ext_dbuf;
} TCB;
For maximum portability, all fields that contain character strings
should be limited to use the low 7 bits of a character.
The name, linkname and prefix field contain
character strings. The strings are null terminated except when they use the
full space of 100 characters for the name or linkname field or
155 characters for the prefix field.
If the prefix does not start with a null character, then
prefix and name need to be concatenated by using the
prefix, followed by a slash character followed by the name
field. If a null character appears in name or prefix before
the maximum size is reached, the field in question is terminated. This way
file names up to 256 characters may be archived. The prefix is not
used together with the linkname field, so the maximum length of a
link name is 100 characters.
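The rule for combining prefix and name can be sketched as follows. The function names field_len() and ustar_path() are hypothetical and not part of star.h; the sketch assumes that the caller passes the raw 155 byte prefix field and the raw 100 byte name field and provides an output buffer of at least 257 bytes (155 + 1 + 100 + 1).

```c
#include <stddef.h>
#include <string.h>

#define NAMSIZ 100
#define PFXSIZ 155

/* Length of a field that is null terminated unless it is fully used. */
static size_t
field_len(const char *field, size_t max)
{
    const char *end = memchr(field, '\0', max);

    return (end ? (size_t)(end - field) : max);
}

/*
 * Build the full path name from the prefix and name fields.
 * out needs room for PFXSIZ + 1 + NAMSIZ + 1 bytes.
 * Returns the length of the resulting string.
 */
static size_t
ustar_path(const char *prefix, const char *name, char *out)
{
    size_t plen = field_len(prefix, PFXSIZ);
    size_t nlen = field_len(name, NAMSIZ);
    size_t pos = 0;

    if (plen > 0) {             /* Prefix is in use */
        memcpy(out, prefix, plen);
        pos = plen;
        out[pos++] = '/';
    }
    memcpy(out + pos, name, nlen);
    pos += nlen;
    out[pos] = '\0';
    return (pos);
}
```

With both fields completely filled, the result has 155 + 1 + 100 = 256 characters, which matches the limit given above.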
The fields magic, uname and gname contain
null terminated character strings.
The version field contains the string "00"
without a trailing zero. It cannot be set to different values as
POSIX.1-1988 did not specify a way to handle different version strings.
The typeflag field contains a single character.
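A reader may use the magic and version fields to recognize a POSIX.1-1988 header. The following is a minimal sketch that works on the raw 512 byte header block; the function name is_ustar_header() and the offset macros MAGIC_OFF and VERSION_OFF are hypothetical names that were derived from the offsets in the struct header listing above.

```c
#include <string.h>

#define TMAGIC      "ustar"
#define TMAGLEN     6       /* "ustar" including '\0' */
#define TVERSION    "00"
#define TVERSLEN    2
#define MAGIC_OFF   257     /* Offset of t_magic in the header */
#define VERSION_OFF 263     /* Offset of t_version in the header */

/* Return 1 if hdr carries the ustar magic and the version "00". */
static int
is_ustar_header(const unsigned char *hdr)
{
    return (memcmp(hdr + MAGIC_OFF, TMAGIC, TMAGLEN) == 0 &&
        memcmp(hdr + VERSION_OFF, TVERSION, TVERSLEN) == 0);
}
```

The magic is compared with its trailing null byte, the version without one, as described above.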
All numeric fields contain size-1 leading zero-filled
numbers using octal digits. They are followed by one or more space or null
characters. All recent implementations only use one space or null character
at the end of a numerical field to get maximum space for the octal number.
Star always uses a space character as terminator. Numeric fields with
8 characters may hold up to 7 octal digits (7777777) which results in a
maximum value of 2097151. Numeric fields with 12 characters may hold up to
11 octal digits (77777777777) which results in a maximum value of
8589934591.
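Decoding such an octal field may look like the sketch below. The function name parse_octal() is hypothetical. The sketch only accepts the layout described above (octal digits followed by at least one space or null character) and makes no attempt to handle other variants that may be found in historic archives.

```c
#include <stddef.h>

/*
 * Parse an octal numeric field of the given size (8 or 12).
 * Returns 0 on success and -1 if the field does not hold octal
 * digits that are followed by a space or null character.
 */
static int
parse_octal(const char *field, size_t size, unsigned long long *valp)
{
    unsigned long long val = 0;
    size_t i = 0;

    while (i < size && field[i] >= '0' && field[i] <= '7') {
        val = (val << 3) | (unsigned long long)(field[i] - '0');
        i++;
    }
    /* At least one digit and a terminator inside the field are needed. */
    if (i == 0 || i >= size || (field[i] != ' ' && field[i] != '\0'))
        return (-1);
    *valp = val;
    return (0);
}
```

Parsing the 8 character field "7777777 " yields 2097151 and parsing the 12 character field "77777777777 " yields 8589934591, the two maximum values given above.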
Star implements a vendor specific (and thus non-POSIX)
extension to put bigger numbers into the numeric fields. This is done by
using a base 256 coding. The top bit of the first character in the
appropriate 8 character or 12 character field is set to flag non octal
coding. If base 256 coding is in use, then all remaining characters
are used to code the number. This results in 7 base 256 digits in 8
character fields and in 11 base 256 digits in 12 character fields.
All base 256 numbers are two's complement numbers. A base 256
number in an 8 character field may hold 56 bits, a base 256 number in
a 12 character field may hold 88 bits.
This may be extended to 63 bits for 8 character fields and to 95
bits for 12 character fields. For a negative number, the first character
currently is set to a value of 255 (all 8 bits are set). For a positive
number, the first character currently is set to 128 (the top bit is set and
all other bits are cleared).
The rightmost character in an 8 or 12 character field contains the
least significant base 256 digit.
Recent GNU tar and BSD libarchive versions implement the same
extension.
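The base 256 coding can be decoded as sketched below. This is an illustration written for this text and not the code used by star: the function name parse_base256() is hypothetical, only the two currently defined values for the first character (128 and 255) are accepted, and values that do not fit into 64 bits are rejected although a 12 character field could hold more.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a base 256 numeric field (size is 8 or 12).
 * Returns 0 on success and -1 if the field is not base 256 coded
 * or if the value does not fit into an int64_t.
 */
static int
parse_base256(const unsigned char *field, size_t size, int64_t *valp)
{
    int negative;
    uint64_t val;
    size_t i;

    if (size < 2 || (field[0] & 0x80) == 0)
        return (-1);            /* Octal coding is in use */
    if (field[0] != 0x80 && field[0] != 0xFF)
        return (-1);            /* Not a currently defined flag byte */

    negative = (field[0] == 0xFF);
    val = negative ? UINT64_MAX : 0;    /* Sign extension */

    for (i = 1; i < size; i++) {
        /* Bits shifted out at the top must be pure sign extension. */
        if ((val >> 56) != (negative ? 0xFFu : 0x00u))
            return (-1);
        val = (val << 8) | field[i];
    }
    /* The sign bit of the result must match the sign of the field. */
    if ((val >> 63) != (uint64_t)negative)
        return (-1);
    *valp = negative ? -(int64_t)(~val) - 1 : (int64_t)val;
    return (0);
}
```

A 12 character field that starts with the value 128 and ends in the five characters 2, 0, 0, 0, 0 decodes to 8589934592, which is one more than the largest number that fits into the octal representation.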
While the POSIX standard makes it obvious that the fields
mode, uid, gid, size, chksum,
devmajor and devminor should be treated as unsigned numbers,
there is no such definition for the time field.
The mode field contains 12 bits holding permissions, see above for
the definitions for each of the permission bits.
The uid and gid fields contain the numerical user id
and group id of the file. These fields may use a base 256
encoding.
The size field contains the size of the file in characters.
If the tar header is followed by file data, then the number of 512 byte
blocks of data that follow is computed by (size + 511) / 512. This field
may use a base 256 encoding.
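The rounding rule can be written down as a small helper. The names data_blocks() and padded_size() are hypothetical. The sketch uses a division and a remainder test instead of the literal (size + 511) / 512 expression, which gives the same result but cannot overflow for very large sizes.

```c
#include <stdint.h>

#define TBLOCK 512

/* Number of 512 byte blocks that follow a header for size bytes of data. */
static uint64_t
data_blocks(uint64_t size)
{
    return (size / TBLOCK + (size % TBLOCK != 0));
}

/* Number of bytes to skip in the archive to reach the next header. */
static uint64_t
padded_size(uint64_t size)
{
    return (data_blocks(size) * TBLOCK);
}
```

A file with 513 bytes thus occupies two data blocks, or 1024 bytes, in the archive.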
The mtime field contains the number of seconds since Jan
1st 1970 00:00 UTC as retrieved via stat(2) in st_mtime. If
the mtime field is assumed to be a signed 33 bit number, the latest
representable time is 2106 Feb 7 06:28:15 GMT. This is because POSIX does
not mention whether the time has to be a signed or unsigned number and thus
no more than 32 bits may be used for a positive value if applications care
about portability. This field may use a base 256 encoding.
The chksum field contains a simple checksum over all bytes
of the header. To compute the value, all characters in the header are
treated as unsigned integers and the characters in the chksum field
are treated as if they were all spaces. When the computation starts, the
checksum value is initialized to 0.
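The checksum computation described above may be sketched as follows. The function name ustar_checksum() and the macro CKSUM_OFF are hypothetical; the offset 148 and the length 8 are taken from the struct header listing.

```c
#include <stddef.h>

#define TBLOCK    512
#define TCKSLEN   8
#define CKSUM_OFF 148       /* Offset of t_chksum in the header */

/*
 * Compute the header checksum: the sum of all header bytes taken as
 * unsigned values, with the chksum field itself counted as spaces.
 */
static unsigned long
ustar_checksum(const unsigned char *hdr)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < TBLOCK; i++) {
        if (i >= CKSUM_OFF && i < CKSUM_OFF + TCKSLEN)
            sum += ' ';
        else
            sum += hdr[i];
    }
    return (sum);
}
```

For a header block that only contains null bytes, the result is 8 * 32 = 256, as only the eight assumed space characters contribute to the sum.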
The typeflag field specifies the type of the file that is
archived. If a specific tar implementation does not include support
for a specific typeflag value, this implementation will extract the unknown
file types as if they were plain files. For this reason, the size
field for any file type except directories, hard links, symbolic links,
character special, block specials and FIFOs must always follow the rules for
plain files.
- '0' REGTYPE
- A regular file. If the size field is non zero, then file data
follows the header.
- '\0' AREGTYPE
- For backwards compatibility with pre POSIX.1-1988 tar
implementations, a nul character is also recognized as marker for plain
files. It is not generated by recent tar implementations. If the
size field is non zero, then file data follows the header.
- '1' LNKTYPE
- The file is a hard link to another file. The name of the file that the
file is linked to is in the linkname part of the header. For
tar archives written by pre POSIX.1-1988 implementations, the
size field usually contains the size of the file and needs to be
ignored as no data may follow this header type. For POSIX.1-1988 compliant
archives, the size field needs to be 0. For POSIX.1-2001 compliant
archives, the size field may be non zero, indicating that file data
is included in the archive.
- '2' SYMTYPE
- The file is a symbolic link to another file. The name of the file that the
file is linked to is in the linkname part of the header. The
size field needs to be 0. No file data may follow the header.
- '3' CHRTYPE
- A character special file. The fields devmajor and devminor
contain information that defines the device id of the file. The meaning of
the size field is unspecified by the POSIX standard. No file data
may follow the header.
- '4' BLKTYPE
- A block special file. The fields devmajor and devminor
contain information that defines the device id of the file. The meaning of
the size field is unspecified by the POSIX standard. No file data
may follow the header.
- '5' DIRTYPE
- A directory or sub directory. Old (pre POSIX.1-1988) tar
implementations did use the same typeflag value as for plain files
and added a slash to the end of the name field. If the size
field is non zero then it indicates the maximum size in characters the
system may allocate for this directory. If the size field is 0,
then the system shall not limit the size of the directory. On operating
systems where the disk allocation is not done on a per directory basis, the
size field is ignored on extraction. No file data may follow the
header.
- '6' FIFOTYPE
- A named pipe. The meaning of the size field is unspecified by the POSIX
standard. The size field must be ignored on extraction. No file
data may follow the header.
- '7' CONTTYPE
- A contiguous file. This is a file that gives special performance
attributes. Operating systems that don't support this file type extract
this file type as plain files. If the size field is non zero, then
file data follows the header.
- 'g' GLOBAL POSIX.1-2001 HEADER
- With POSIX.1-2001 pax archives, this type defines a global extended
header. The size is always non zero and denotes the sum of the
length fields in the extended header data. The data that follows the
header is in the pax extended header format. The extended header
records in this header type affect all following files in the archive
unless they are overwritten by new values. See EXTENDED TAR (PAX)
HEADER FORMAT section below.
- 'x' EXTENDED POSIX.1-2001 HEADER
- With POSIX.1-2001 pax archives, this type defines an extended header. The
size is always non zero and denotes the sum of the length fields in
the extended header data. The data that follows the header is in the
pax extended header format. The extended header records in this
header type only affect the following file in the archive. See EXTENDED
TAR (PAX) HEADER FORMAT section below.
- 'A' - 'Z'
- Reserved for vendor specific implementations.
- 'A'
- A Solaris ACL entry as used by the tar implementation from Sun. The
size is always non zero and denotes the length of the data that
follows the header. Star currently is not able to handle this
header type. As the ACL data used by this format does not include
the numerical user and group ids, this format is not recommended for
archival.
- 'D'
- A GNU dump directory. This header type is not created by star and
handled like a POSIX type '5' directory during an extract operation, so
the data content is ignored by star. The size field denotes
the length of the data that follows the header.
- 'E'
- A Solaris Extended Attribute File that is used to archive NFSv4
type extended attributes. The size field denotes the length of the
data that follows the header. Star currently is not able to handle
this header type.
- 'I'
- An inode metadata entry. This header type is used by star to
archive inode metadata only. To archive more inode metadata than
possible with a POSIX.1-1988 tar header, a header with type
'I' is usually preceded by an 'x' header. It is used with
incremental backups. The size field holds the length of the file.
No file data follows this header.
- 'K'
- A long link name. Star is able to read and write this type of
header with the star, xstar and gnutar formats. With
the xustar and exustar formats, star prefers to store
long link names using the POSIX.1-2001 method. The size is always
non zero and denotes the length of the long link target name including the
trailing null byte. The link name is in the data that follows the
header.
- 'L'
- A long file name. Star is able to read and write this type of
header with the star, xstar and gnutar formats. With
the xustar and exustar formats, star prefers to store
long file names using the POSIX.1-2001 method. The size is always
non zero and denotes the length of the long file name including the
trailing null byte. The file name is in the data that follows the
header.
- 'M'
- A multi volume continuation entry. It is used by star to tell the
extraction program via the size field when the next regular archive
header will follow. This makes it possible to start extracting multi volume
archives with a volume number greater than one. It is used by GNU tar to
verify multi volume continuation volumes. Other fields in the GNU multi volume
continuation header are a result of a GNU tar misconception and cannot
be used in a reliable tar implementation. If the size field is non
zero, the data following the header is skipped by star if the volume
that starts with it is mounted as the first volume. This header is ignored
if the volume that starts with it is mounted as a continuation volume.
Instead, the following data is used as the continuation of the file that
is currently being extracted.
- 'N'
- An outdated linktype used by old GNU tar versions to store long file
names. This type is unsupported by star.
- 'S'
- A sparse file. This header type is used by star and GNU tar.
A sparse header is used instead of a plain file header to denote a sparse
file that follows. The header is directly followed by a list of sparse hole
descriptors, which in turn is followed by the compacted file data. With star
formats, the size field holds a size that represents the sum of the
sparse hole descriptors plus the size of the compacted file data. This
allows other tar implementations to correctly skip to the next
tar header. With GNU tar, up to 4 sparse hole descriptors fit into
the sparse header. Additional hole descriptors are not needed if the file
has fewer than 4 holes. With GNU tar, the size field breaks general
tar header rules if more than 4 sparse hole descriptors are
used and is meaningless because the size of the additional sparse hole
descriptors used by GNU tar does not count and cannot be determined before
parsing all sparse hole descriptors.
- 'V'
- A volume header. The name field is used to hold the volume name.
Star uses the atime field to hold the volume number in case
there is no POSIX.1-2001 extended header. This header type is used by
star and GNU tar. If the size field is non zero the
data following the header is skipped by star.
- 'X'
- A vendor unique variant of the POSIX.1-2001 extended header type. It was
implemented by Sun many years before the POSIX.1-2001 standard was
approved and the POSIX.1-2001 tar extensions are based on this Sun
tar extension. See also the typeflag 'x' header type. Star
is able to read and write this type of header.
The devmajor and devminor fields contain the
numerical major() and minor() information that defines the
device id of the file from the member st_rdev in struct stat.
These fields may use a base 256 encoding.
Block type                     Description
Ustar Header [typeflag='g']    Global Extended Header
Global Extended Data
Ustar Header [typeflag='x']    Extended Header
Extended Data
Ustar header [typeflag='0']    File with Extended Header
Data for File #1
Ustar header [typeflag='0']    File without Extended Header
Data for File #2
Block of binary zeroes         First EOF Block
Block of binary zeroes         Second EOF Block
The data block that follows a tar archive header with typeflag
'g' or 'x' contains one or more records in the following format:
"%d %s=%s\n", <length>,
<keyword>, <value>
Each record starts with a decimal length field. The length
includes the total size of a record including the length field itself and
the trailing new line.
The keyword may not include an equal sign. All keywords
beginning with lower case letters and digits are reserved for future use by
the POSIX standard.
If the value field is of zero length, it deletes any header field
of the same name that is in effect from the same extended header or from a
previous global header.
Null characters do not delimit any value. The data used for
value is only limited by its implicit length.
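As the length field counts its own digits, a writer has to find a length that is consistent with itself. The sketch below shows one way to do this. The function name pax_record() is hypothetical, and for simplicity the value is assumed to be a null terminated string although real values may contain null characters.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format one extended header record "<length> <keyword>=<value>\n".
 * Returns the record length or -1 if out is too small.
 */
static int
pax_record(char *out, size_t outsize, const char *keyword, const char *value)
{
    /* Everything except the length digits: ' ' keyword '=' value '\n' */
    size_t base = strlen(keyword) + strlen(value) + 3;
    size_t len = base + 1;      /* Start by assuming one digit */
    size_t digits;
    size_t n;

    /* Repeat until the digit count is consistent with the total length. */
    for (;;) {
        digits = 1;
        for (n = len; n >= 10; n /= 10)
            digits++;
        if (base + digits == len)
            break;
        len = base + digits;
    }
    if (len + 1 > outsize)      /* Room for the terminating null byte */
        return (-1);
    snprintf(out, outsize, "%zu %s=%s\n", len, keyword, value);
    return ((int)len);
}
```

For the keyword path and the value a, this results in the 9 character record "9 path=a\n". A record whose other parts add up to 9 characters needs a two digit length field and thus has a total length of 11.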
POSIX.1-2001 extended pax header keywords. All numerical values are
represented as decimal strings. All texts are represented as UTF-8 or an
unspecified binary format (see hdrcharset keyword) that is expected to
be understood by the receiving system:
- atime
- The time from st_atime in sub second granularity. Star
currently supports a nanosecond granularity.
- charset
- The name of the character set used to encode the data in the following
file(s).
The following values are supported for charset:
- ISO-IR 646 1990
- ISO/IEC 646:1990
- ISO-IR 8859 1 1998
- ISO/IEC 8859-1:1998
- ISO-IR 8859 2 1998
- ISO/IEC 8859-2:1998
- ISO-IR 8859 3 1998
- ISO/IEC 8859-3:1998
- ISO-IR 8859 4 1998
- ISO/IEC 8859-4:1998
- ISO-IR 8859 5 1998
- ISO/IEC 8859-5:1998
- ISO-IR 8859 6 1998
- ISO/IEC 8859-6:1998
- ISO-IR 8859 7 1998
- ISO/IEC 8859-7:1998
- ISO-IR 8859 8 1998
- ISO/IEC 8859-8:1998
- ISO-IR 8859 9 1998
- ISO/IEC 8859-9:1998
- ISO-IR 8859 10 1998
- ISO/IEC 8859-10:1998
- ISO-IR 8859 11 1998
- ISO/IEC 8859-11:1998
- ISO-IR 8859 12 1998
- ISO/IEC 8859-12:1998
- ISO-IR 8859 13 1998
- ISO/IEC 8859-13:1998
- ISO-IR 8859 14 1998
- ISO/IEC 8859-14:1998
- ISO-IR 8859 15 1998
- ISO/IEC 8859-15:1998
- ISO-IR 10646 2000
- ISO/IEC 10646:2000
- ISO-IR 10646 2000 UTF-8
- ISO/IEC 10646, UTF-8 encoding
- BINARY
- None
This keyword is currently ignored by star.
- comment
- Any number of characters that should be treated as comment.
Star ignores the comment as documented by the POSIX standard.
- ctime
- The time from st_ctime in sub second granularity. Star
currently supports a nanosecond granularity.
- gid
- The group ID of the group that owns the file. The argument is a decimal
number. This field is used if the group ID of a file is greater than
2097151 (octal 7777777).
- gname
- The group name keyword for the following file(s) is created if the group
name does not fit into 32 characters or cannot be expressed in 7-Bit
ASCII. It is coded in UTF-8 or (if the hdrcharset keyword is
present) coded to fit the charset value.
- hdrcharset
- The name of the character set used to encode the data for the
gname, linkpath, path and uname fields in the
POSIX.1-2001 extended header records and for the gname,
uname and path parts in the vendor specific extended header
records SCHILY.acl.ace, SCHILY.acl.access,
SCHILY.acl.default and SCHILY.dir.
The following values are supported for hdrcharset:
- ISO-IR 10646 2000 UTF-8
- ISO/IEC 10646, UTF-8 encoding
- BINARY
- None
If the binary encoding is selected, the encoding is the same as
used by the creating system and it is assumed that the receiving system is
able to use the values in that encoding.
- linkpath
- The linkpath keyword is created if the linkpath is longer than 100
characters or cannot be expressed in 7-Bit ASCII. It is coded in UTF-8 or
(if the hdrcharset keyword is present) coded to fit the charset
value.
- mtime
- The time from st_mtime in sub second granularity. Star
currently supports a nanosecond granularity.
- path
- The path keyword is created if the path does not fit into 100
characters + 155 characters prefix or cannot be expressed in 7-Bit ASCII.
It is coded in UTF-8 or (if the hdrcharset keyword is present)
coded to fit the charset value.
- realtime.any
- The keywords prefixed by realtime. are reserved for future
standardization.
- security.any
- The keywords prefixed by security. are reserved for future
standardization.
- size
- The size of the file as decimal number if the file size is greater than
8589934591 (octal 77777777777). The size keyword may not refer to
the real file size but is related to the size of the file in the archive.
See also SCHILY.realsize for more information.
- uid
- The user ID of the user that owns the file. The argument is a decimal
number. This field is used if the user ID of a file is greater than 2097151
(octal 7777777).
- uname
- The user name keyword for the following file(s) is created if the user
name does not fit into 32 characters or cannot be expressed in 7-Bit
ASCII. It is coded in UTF-8 or (if the hdrcharset keyword is
present) coded to fit the charset value.
- VENDOR.keyword
- Any keyword that starts with a vendor name in capital letters is reserved
for vendor specific extensions by the standard. Star uses many of
these vendor specific extensions. See below for more information.
Star uses its own vendor specific extensions. The SCHILY vendor
specific extended pax header keywords are:
- SCHILY.acl.ace
- The NFSv4 ACL for a file.
Since no official backup format for the NFSv4 ACL standard has
been defined, star uses the vendor defined attributes
SCHILY.acl.ace for storing the NFSv4 ACL entries.
Previous versions of star used a format for the ACL
text that is the format created by the function acl_totext()
from libsec on Solaris, using the call:
acl_totext(aclp, \
ACL_COMPACT_FMT | ACL_APPEND_ID | ACL_SID_FMT);
The flags have the following meaning:
- ACL_COMPACT_FMT
- Create the compact version of the ACL text representation.
- ACL_APPEND_ID
- Append uid or gid for additional user or group entries.
- ACL_SID_FMT
- Use the usersid or groupsid format for entries related to an
ephemeral uid or gid. The raw sid format will only be
used when the "id" cannot be resolved to a Windows name.
This is an example of the format used for SCHILY.acl.ace (a
space has been inserted after the equal sign and lines are broken [marked
with '\' ] for readability):
SCHILY.acl.ace= user:lisa:rwx-----------:-------:allow:502, \
group:toolies:rwx-----------:-------:allow:102, \
owner@:--x-----------:-------:deny, \
owner@:rw-p---A-W-Co-:-------:allow, \
group@:-wxp----------:-------:deny, \
group@:r-------------:-------:allow, \
everyone@:-wxp---A-W-Co-:-------:deny, \
everyone@:r-----a-R-c--s:-------:allow
The numerical user and group identifiers are essential when
restoring a system completely from a backup, as initially the
name-to-identifier mappings may not be available, and then file ownership
restoration would not work.
Newer versions of star use a highly compact variant of the format
mentioned above that avoids the '-' characters in the text. The example
below is using lines broken the same way as in the previous example.
SCHILY.acl.ace= user:lisa:rwx::allow:502, \
group:toolies:rwx::allow:102, \
owner@:x::deny, \
owner@:rwpAWCo::allow, \
group@:wxp::deny, \
group@:r::allow, \
everyone@:wxpAWCo::deny, \
everyone@:raRcs::allow
This highly compact format is understood by acl_fromtext()
in libsec from Solaris and by the corresponding code from FreeBSD. It
is created by removing the '-' characters from the normal compact
format.
The advantage of the highly compact format is that it typically
avoids the need to make the extended header data larger than 512 bytes.
In addition to the documented entry formats, a compatible
implementation needs to be able to understand the long ace format, if
it appears in extended tar headers. The long format for the ACL text is the
format created by the function acl_totext() from libsec on
Solaris, using the call:
acl_totext(aclp, ACL_APPEND_ID | ACL_SID_FMT);
As the archive format that is used for backing up access control
lists is compatible with the pax archive format, archives created
that way can be restored by star or a POSIX.1-2001 compliant
pax. Note that programs that do not implement compatibility to the
star extensions will ignore the ACL information.
- SCHILY.acl.access
- The withdrawn POSIX draft ACL for a file.
Since no official backup format for the withdrawn POSIX draft
access control lists has been defined, star uses the vendor
defined attributes SCHILY.acl.access and
SCHILY.acl.default for storing the ACL and Default
ACL of a file, respectively. The access control lists are stored in
the short text form as defined in the withdrawn POSIX 1003.1e draft
standard 17.
Note that the POSIX 1003.1e draft was withdrawn in 1997
but some operating systems still support it with some filesystems.
To each named user ACL entry a fourth colon separated
field, containing the user identifier (UID) of the associated
user, is appended. To each named group entry a fourth colon separated
field containing the group identifier (GID) of the associated
group is appended. (POSIX 1003.1e draft standard 17 allows adding fields
to ACL entries.)
If the user name or group name field is numeric
because the related user has no entry in the passwd/group
database at the time the archive is created, the additional numeric
field may be omitted.
This is an example of the format used for
SCHILY.acl.access (a space has been inserted after the equal sign
and lines are broken [marked with '\' ] for readability; the additional
fields are the trailing numeric ids 502 and 102):
SCHILY.acl.access= user::rwx,user:lisa:r-x:502, \
group::r-x,group:toolies:rwx:102, \
mask::rwx,other::r--x
If and only if the user ID 502 and group ID 102
have no passwd/group entry, our example acl entry looks this way:
SCHILY.acl.access= user::rwx,user:502:r-x, \
group::r-x,group:102:rwx:, \
mask::rwx,other::r--x
The added numerical user and group identifiers are essential
when restoring a system completely from a backup, as initially the
name-to-identifier mappings may not be available, and then file
ownership restoration would not work.
When the archive is unpacked and the ACL entries for
the files are restored, first the additional numeric fields are removed
and an attempt is made to restore the resulting ACL data. If that
fails, the numeric fields are extracted and the related user name
and group name fields are replaced by the numeric fields, before
the ACL restore is retried.
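The first step of this procedure, removing the additional numeric fields, may be sketched as follows. The function name acl_strip_ids() is hypothetical and the code is not taken from star. It assumes a comma separated ACL text without embedded white space and simply drops everything from the third colon of an entry up to the end of that entry.

```c
/*
 * Copy the ACL text from in to out and drop the fourth colon
 * separated field of every entry. out needs strlen(in) + 1 bytes.
 */
static void
acl_strip_ids(const char *in, char *out)
{
    int colons = 0;     /* Colons seen in the current entry */
    int skip = 0;       /* Nonzero while inside a fourth field */

    for (; *in != '\0'; in++) {
        if (*in == ',') {           /* The next entry starts */
            colons = 0;
            skip = 0;
        } else if (*in == ':') {
            if (++colons >= 3)      /* Third colon starts field four */
                skip = 1;
        }
        if (!skip)
            *out++ = *in;
    }
    *out = '\0';
}
```

With this sketch, the entry user:lisa:r-x:502 becomes user:lisa:r-x, while entries such as user::rwx that have no fourth field are copied unchanged.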
As the archive format that is used for backing up access
control lists is compatible with the pax archive format, archives
created that way can be restored by star or a POSIX.1-2001
compliant pax. Note that programs other than star will
ignore the ACL information.
- SCHILY.acl.default
- The default ACL for a file. See SCHILY.acl.access for more
information.
This is an example of the format used for
SCHILY.acl.default (a space has been inserted after the equal
sign and lines are broken [marked with '\' ] for readability; the additional
field is the trailing numeric id 502):
SCHILY.acl.default= user::rwx,user:lisa:r-x:502, \
group::r-x,mask::r-x,other::r-x
- SCHILY.acl.type
- The ACL type used for coding access control lists.
The following ACL types are possible:
- POSIX draft
- ACLs as defined in the withdrawn POSIX 1003.1e draft standard 17.
- NFSv4
- ACLs as used by NFSv4, NTFS and ZFS.
Note that the SCHILY.acl.type keyword is currently not
generated by star. Star however accepts this keyword if it appears in
extended tar headers. The ACL type is determined from which keyword is
present to hold the ACL text.
- SCHILY.ddev
- The device ids for the names used in the SCHILY.dir dump directory list,
taken from st_dev of each file, as decimal numbers. The SCHILY.ddev
keyword is followed by a space separated list of device id numbers. Each
corresponds exactly to a name in the list found in SCHILY.dir. If a
specific device id number is repeated, a comma (,) without a following
space may be used to denote that the current device id number is identical
to the previous number. This keyword is used in dump mode. This
keyword is not yet implemented. It will be implemented if
star gains support for incremental dumps that span more than one
filesystem.
The value is a signed int. An implementation should be able to
handle at least 64 bit values. Note that the value is signed because
POSIX does not specify more than that the type should be an int.
- SCHILY.dev
- The device id from st_dev of the file as decimal number. This
keyword is used in dump mode.
The value is a signed int. An implementation should be able to
handle at least 64 bit values. Note that the value is signed because
POSIX does not specify more than that the type should be an int.
- SCHILY.devmajor
- The device major number of the file (from st_rdev) if it is a
character or block special file. The argument is a decimal number. This
field is used if the device major of the file is greater than 2097151
(octal 7777777).
The value is a signed int. An implementation should be able to
handle at least 64 bit values. Note that the value is signed because
POSIX does not specify more than that the type should be an int.
- SCHILY.devminor
- The device minor number of the file (from st_rdev) if it is a
character or block special file. The argument is a decimal number. This
field is used if the device minor of the file is greater than 2097151
(octal 7777777).
The value is a signed int. An implementation should be able to
handle at least 64 bit values. Note that the value is signed because
POSIX does not specify more than that the type should be an int.
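As a minimal sketch of the rule given for SCHILY.devmajor and
SCHILY.devminor, the following Python fragment decides which of the two
keywords are needed for a device. The function name and the dictionary
return type are hypothetical; only the limit 2097151 (octal 7777777) is
taken from the text above.

```python
MAX_BASIC_FIELD = 0o7777777  # 2097151, the limit named in the text


def device_keywords(major, minor):
    """Return the extended header keywords needed for a device number pair."""
    keywords = {}
    if major > MAX_BASIC_FIELD:
        keywords["SCHILY.devmajor"] = str(major)   # decimal text
    if minor > MAX_BASIC_FIELD:
        keywords["SCHILY.devminor"] = str(minor)   # decimal text
    return keywords


print(device_keywords(8, 1))
print(device_keywords(8, 2097152))
```

Small device numbers need no extended keyword at all, so the first call
returns an empty dictionary.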
- SCHILY.devminorbits
- The number of minor bits used in the device id from st_dev as
decimal number.
The value is mainly needed for SunOS where the number of minor
bits in st_dev depends on whether a program is run in 32 or 64 bit mode.
There is no support for platforms that do not have the minor part of the
device id in a contiguous set of bits (e.g. FreeBSD).
- SCHILY.dino
- The inode numbers for the names used in the SCHILY.dir dump directory
list, taken from st_ino of each file, as decimal numbers. The
SCHILY.dino keyword is followed by a space separated list of inode
numbers. Each corresponds exactly to a name in the list found in
SCHILY.dir. This keyword is used in dump mode.
The values are unsigned int. An implementation should be able
to handle at least 64 bit unsigned values.
- SCHILY.dir
- A list of filenames (the content) for the current directory. The names are
coded in UTF-8. Each file name is prefixed by a single character that is
used as a flag. Each file name is terminated by a null character. The null
character is directly followed by the flag character for the next file name
in case the list is not terminated by the current file name. The flag
character must not be a null character. By default, a ^A (octal 001) is
used. The following flags are defined:
- \000
- This is the list terminator character - the second null byte, see
below.
- \001 (^A)
- The default flag that is used in case the dump dir features have
not been active or in case that the file type is unknown.
- \002 (^B)
- The related file is a FIFO special (named pipe).
- \003 (^C)
- The related file is a character special.
- \004 (^D)
- Reserved, used e.g. by XENIX multiplexed character special.
- \005 (^E)
- The related file is a directory.
- \006 (^F)
- Reserved, used e.g. by XENIX named file.
- \007 (^G)
- The related file is a block special.
- \010 (^H)
- Reserved, used e.g. by XENIX multiplexed block special.
- \011 (^I)
- The related file is a regular file.
- \012 (^J)
- The related file is a contiguous file.
- \013 (^K)
- The related file is a symbolic link.
- \014 (^L)
- Reserved, used e.g. by Solaris shadow inode.
- \015 (^M)
- The related file is a socket.
- \016 (^N)
- The related file is a Solaris DOOR.
- \017 (^O)
- The related file is a BSD whiteout entry.
- \020 (^P)
- Reserved, used e.g. by UNOS eventcount.
- Y
- A non directory file that is in the current (incremental) dump.
- N
- A non directory file that is not in the current (incremental) dump.
- D
- A directory that is in the current (incremental) dump.
- d
- A directory that is not in the current (incremental) dump.
The list is terminated by two successive null bytes. The first is
the null byte for the last file name. The second null byte is at the
position where a flag character would be expected; it acts as a list
terminator. The length tag for the SCHILY.dir data includes both null
bytes.
If a dump mode has been selected that writes compact complete
directory information to the beginning of the archive, the flag character
may contain values different from ^A. Star implementations at least
up to star-1.5.1 do not use the feature to tag entries and use the
default entry \001 (^A) for all files. Tar implementations that want to read
archives that use the SCHILY.dir keyword shall not rely on values
other than \000 (^@) or \001 (^A).
In 2016, with star-1.5.3, the values from \002 to \020 were
introduced as a result of a libfind update that uses the struct
dirent member d_type where available.
This keyword is used in dump mode.
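The layout of the SCHILY.dir value (flag character, file name, null
byte, and a final second null byte) can be decoded as in the following
Python sketch. The function name is hypothetical and the code is meant as
an illustration of the description above, not as star's own parser.

```python
def parse_dir_list(data):
    """Decode a SCHILY.dir value into a list of (flag, name) pairs."""
    entries = []
    pos = 0
    while pos < len(data):
        flag = data[pos]
        if flag == 0:
            # A null byte where a flag is expected terminates the list.
            return entries
        end = data.index(b"\0", pos + 1)   # null byte that ends the name
        entries.append((flag, data[pos + 1:end].decode("utf-8")))
        pos = end + 1
    raise ValueError("SCHILY.dir list is not terminated by two null bytes")


value = b"\x01file.txt\0\x05sub\0\0"
for flag, name in parse_dir_list(value):
    print(oct(flag), name)
```

A reader that follows the compatibility advice above would treat every
flag other than \001 as informational only.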
- SCHILY.fflags
- A textual version of the BSD or Linux extended file flags.
The following flags are defined by star, the bold names are
the names that are generated and the other names are accepted on input
as well:
- arch
- set the archived flag (super-user only).
- archived
- Alias for arch.
- compressed
- set the compressed flag (Mac OS only).
- ucompressed
- Alias for compressed.
- hidden
- Set the file is hidden flag.
- uhidden
- Alias for hidden.
- nodump
- set the nodump flag (owner or super-user).
- offline
- Set the file is offline flag.
- uoffline
- Alias for offline.
- opaque
- set the opaque flag (owner or super-user).
- rdonly
- Set the readonly flag.
- urdonly
- Alias for rdonly.
- readonly
- Alias for rdonly.
- reparse
- Set the reparse flag.
- ureparse
- Alias for reparse.
- sappnd
- set the system append-only flag (super-user only).
- sappend
- Alias for sappnd.
- schg
- set the system immutable flag (super-user only).
- schange
- Alias for schg.
- simmutable
- Alias for schg.
- sparse
- Set the sparse flag.
- usparse
- Alias for sparse.
- sunlnk
- set the system undeletable flag (super-user only).
- sunlink
- Alias for sunlnk.
- system
- Set the system flag.
- usystem
- Alias for system.
- uappnd
- set the user append-only flag (owner or super-user).
- uappend
- Alias for uappnd.
- uchg
- set the user immutable flag (owner or super-user).
- uchange
- Alias for uchg.
- uimmutable
- Alias for uchg.
- uunlnk
- set the user undeletable flag (owner or super-user).
- uunlink
- Alias for uunlnk.
The following flags are only available on Linux:
- compress
- Set the Linux compress flag (owner or super-user).
- dirsync
- Set the Linux dirsync flag (owner or super-user) that causes
synchronous writes for directories.
- journal-data
- Set the Linux journal data flag (super-user only).
- noatime
- Set the Linux no access time flag (owner or super-user).
- nocow
- Set the Linux no copy on write flag (owner or super-user).
- notail
- Set the Linux no tail merging flag (owner or super-user).
- projinherit
- Set the Linux project inherit flag (owner or super-user).
- secdel
- Set the Linux secure deletion (purge before delete) flag (owner or
super-user).
- sync
- Set the Linux sync flag (owner or super-user).
- topdir
- Set the Linux top of directory hierarchies flag (owner or
super-user).
- undel
- Set the Linux allow unrm flag (owner or super-user).
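The alias handling for SCHILY.fflags can be illustrated with a small
Python sketch that maps every accepted alias to the name it stands for.
The alias table is copied from the list above. Two points are
assumptions of this sketch and are not stated in the text: that several
flags are joined by commas in one value, and the function name itself.

```python
# Alias -> primary name, as listed for SCHILY.fflags.
ALIASES = {
    "archived": "arch", "ucompressed": "compressed", "uhidden": "hidden",
    "uoffline": "offline", "urdonly": "rdonly", "readonly": "rdonly",
    "ureparse": "reparse", "sappend": "sappnd", "schange": "schg",
    "simmutable": "schg", "usparse": "sparse", "sunlink": "sunlnk",
    "usystem": "system", "uappend": "uappnd", "uchange": "uchg",
    "uimmutable": "uchg", "uunlink": "uunlnk",
}


def canonical_fflags(text):
    """Return the flag names of a value with all aliases resolved.

    Assumes a comma separated list, which the text above does not specify.
    """
    return [ALIASES.get(name, name) for name in text.split(",") if name]


print(canonical_fflags("uchange,nodump,readonly"))
```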
- SCHILY.filetype
- A textual version of the real file type of the file. The following names
are used:
- unallocated
- An unknown file type that may be a result of an unlink(2) operation.
This should never happen.
- regular
- A regular file.
- contiguous
- A contiguous file. On operating systems or file systems that don't support
this file type, it is handled like a regular file.
- symlink
- A symbolic link to any file type.
- directory
- A directory.
- character special
- A character special file.
- block special
- A block special file.
- fifo
- A named pipe.
- socket
- A UNIX domain socket.
- mpx character special
- A multiplexed character special file.
- mpx block special
- A multiplexed block special file.
- XENIX nsem
- A XENIX named semaphore.
- XENIX nshd
- XENIX shared data.
- door
- A Solaris door.
- eventcount
- A UNOS event count.
- whiteout
- A BSD whiteout directory entry.
- sparse
- A sparse regular file.
- volheader
- A volume header.
- unknown/bad
- Any other unknown file type. This should never happen.
- SCHILY.fsdevmajor
- The device major number of the file (from st_dev). This keyword is
used in dump mode. The argument is a decimal number.
The value is a signed int. An implementation should be able to
handle at least 64 bit values. Note that the value is signed because
POSIX does not specify more than that the type should be an int.
- SCHILY.fsdevminor
- The device minor number of the file (from st_dev). This keyword is
used in dump mode. The argument is a decimal number.
The value is a signed int. An implementation should be able to
handle at least 64 bit values. Note that the value is signed because
POSIX does not specify more than that the type should be an int.
- SCHILY.ino
- The inode number from st_ino of the file as decimal number. This
keyword is used in dump mode.
The value is an unsigned int. An implementation should be able
to handle at least 64 bit unsigned values.
- SCHILY.nlink
- The link count of the file as decimal number. This keyword is used in
dump mode.
The value is an unsigned int. An implementation should be able
to handle at least 32 bit unsigned values.
- SCHILY.offset
- The offset value for a multi volume continuation header. This
keyword is used with multi volume continuation headers. Multi volume
continuation headers make it possible to start reading a multi volume
archive past the first volume.
SCHILY.offset specifies the byte offset within a file
that was split across volumes as a result of a multi volume media change
operation.
The value is an unsigned int. An implementation should be able
to handle at least 64 bit unsigned values.
- SCHILY.realsize
- The real size of the file as decimal number. This keyword is used if the
real size of the file differs from the visible size of the file in the
archive. The real file size differs from the size in the archive if the
file type is sparse or if the file is a continuation file on a
multi volume archive. If the SCHILY.realsize keyword is
needed, it must appear after the size keyword if a size
keyword is also present.
As sparse files allocate less space on tape than a regular
file and as a continued file that started on a previous volume only
holds parts of the file, the SCHILY.realsize keyword holds a
bigger number than the size keyword.
The value is an unsigned int. An implementation should be able
to handle at least 64 bit unsigned values.
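The ordering rule for size and SCHILY.realsize can be checked while
reading the records of an extended header, as in the following Python
sketch. It assumes the POSIX.1-2001 record layout ("length keyword=value"
followed by a newline, where length counts the whole record). The
function names are hypothetical.

```python
def parse_pax_records(data):
    """Yield (keyword, value) pairs from the content of an extended header."""
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])               # counts the whole record
        record = data[space + 1:pos + length - 1]   # drop trailing newline
        keyword, _, value = record.partition(b"=")
        yield keyword.decode("utf-8"), value.decode("utf-8")
        pos += length


def sizes(data):
    """Return (archived_size, real_size) and enforce the keyword order."""
    archived = real = None
    for keyword, value in parse_pax_records(data):
        if keyword == "size":
            if real is not None:
                raise ValueError("size must not follow SCHILY.realsize")
            archived = int(value)
        elif keyword == "SCHILY.realsize":
            real = int(value)
    return archived, real


header = b"13 size=4096\n27 SCHILY.realsize=1048576\n"
print(sizes(header))
```

In this example the archive holds 4096 bytes of a file whose real size is
1048576 bytes, as would be the case for a sparse file.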
- SCHILY.tarfiletype
- The following additional file types are used in
SCHILY.tarfiletype:
- hardlink
- A hard link to any file type.
- dumpdir
- A directory with dump entries
- multivol continuation
- A multi volume continuation for any file type.
- meta
- A meta entry (inode meta data only) for any file type.
- SCHILY.xattr.attr
- A POSIX.1-2001 coded version of the Linux extended file attributes. Linux
extended file attributes are name/value pairs. Every attribute name
results in a SCHILY.xattr.name tag and the value of the
extended attribute is used as the value of the POSIX.1-2001 header tag.
Note that this way of coding is not portable across platforms, even though
it is compatible with the implementation on Mac OS X. A version for
BSD may be created but NFSv4 includes far more features with
extended attribute files than Linux does.
A future version of star will implement a method similar
to the one that the tar program on Solaris currently uses. When this
implementation is ready, the SCHILY.xattr.name feature may
be removed in favor of a truly portable implementation that supports
Solaris also.
The following star vendor unique extensions may only appear in
'g'lobal extended pax headers:
- SCHILY.archtype
- The textual version of the archive type used. The textual values used for
SCHILY.archtype are the same names that are used in the star
command line options to set up a specific archive type.
The following values may currently appear in a global extended
header:
- xustar
- 'xstar' format without "tar" signature at header offset
508.
- exustar
- 'xustar' format variant that always includes x-headers and g-headers.
A complete tar implementation must be prepared to handle all
archive type names as documented in star(1).
In order to allow archive type recognition from this keyword, the
minimum tape block size must be 2x512 bytes (1024 bytes) and the
SCHILY.archtype keyword needs to be in the first 512 bytes of the
content of the first 'g'lobal pax header. Then the first tape block
may be scanned to recognize the archive type.
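The recognition step can be sketched in Python as follows. The sketch
assumes that the first 512 bytes of the tape block are the header of the
'g'lobal pax header, with the type flag at byte offset 156 as in the
basic tar header, and that the next 512 bytes are its content. The
function name is hypothetical, and the record length prefix is not
validated here.

```python
def archtype_from_first_block(block):
    """Return the SCHILY.archtype value found in the first 1024 bytes."""
    if len(block) < 1024:
        return None                      # tape block too small to scan
    header, content = block[:512], block[512:1024]
    if header[156:157] != b"g":          # not a global extended header
        return None
    marker = b" SCHILY.archtype="
    start = content.find(marker)
    if start < 0:
        return None
    start += len(marker)
    end = content.find(b"\n", start)
    if end < 0:
        return None
    return content[start:end].decode("ascii")


hdr = bytearray(512)
hdr[156:157] = b"g"
body = b"27 SCHILY.archtype=exustar\n".ljust(512, b"\0")
print(archtype_from_first_block(bytes(hdr) + body))
```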
- SCHILY.release
- The textual version of the star version string and the platform
name where this star has been compiled. The same text appears when
calling star -version.
Other implementations may use a version string that does not
start with the text star.
- SCHILY.volhdr.blockoff
- This keyword is used for multi volume archives. It represents the offset
within the whole archive expressed in 512 byte units.
The value is an unsigned int with a valid range between 1 and
infinity. An implementation should be able to handle at least 64 bit
unsigned values.
- SCHILY.volhdr.blocksize
- The tape blocksize expressed in 512 byte units that was used when writing
the archive.
The value is an unsigned int with a valid range between 1 and
infinity. An implementation should be able to handle at least 31 bit
unsigned values.
- SCHILY.volhdr.cwd
- This keyword is used in dump mode. It is only emitted in case the
fs-name= option of star was used to overwrite the
SCHILY.volhdr.filesys value. If SCHILY.volhdr.cwd is
present, it contains the real backup working directory.
Overwriting SCHILY.volhdr.filesys is needed when
backups are run on file system snapshots rather than on the real file
system.
- SCHILY.volhdr.device
- This keyword is used in dump mode. It represents the name of the device
that holds the file system data. For disk based file systems, this is the
device name of the mounted device.
This keyword is optional. It helps to correctly identify the
file system from which this dump has been made.
- SCHILY.volhdr.dumpdate
- This keyword is used in dump mode. It represents the time at which the
current dump started.
- SCHILY.volhdr.dumplevel
- This keyword is used in dump mode. It represents the level of the current
dump. Dump levels are small numbers, the lowest possible number is 0. Dump
level 0 represents a full backup. Dump level 1 represents a backup that
contains all changes that did occur since the last level 0 dump. Dump
level 2 represents a backup that contains all changes that did occur since
the last level 1 dump. Star does not specify a maximum allowed dump
level but you should try to keep the numbers less than 100.
The value is an unsigned int with a valid range between 0 and
at least 100.
- SCHILY.volhdr.dumptype
- This keyword is used in dump mode. If the dump is a complete dump of a
file system (i.e. no files are excluded via command line), then the
argument is the text full, else the argument is the text
partial.
- SCHILY.volhdr.filesys
- This keyword is used in dump mode. It represents the top level directory
for the file system from which this dump has been made. If the dump
represents a dump that has an associated level, then this directory
needs to be identical to the root directory of this file system which is
the mount point.
- SCHILY.volhdr.hostname
- This keyword is used in dump mode. The value is retrieved from
gethostname(3) or uname(2).
- SCHILY.volhdr.label
- The textual volume label. The volume label must be identical within a set
of multi volume archives.
- SCHILY.volhdr.refdate
- This keyword is used in dump mode if the current dump is an incremental
dump with a level > 0. It represents the time at which the related dump
started.
- SCHILY.volhdr.reflevel
- This keyword is used in dump mode if the current dump is an incremental
dump with a level > 0. It represents the level of the related dump. The
related dump is the last dump with a level that is lower than the level of
this dump. If a dump with the level of the current dump -1 exists, then
this is the related dump level. Otherwise, the dump level is decremented
until a valid dump level is found in the dump database.
The value is an unsigned int with a valid range between 0 and
at least 100.
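The search for the related dump level can be written as a short loop.
In the following Python sketch the function name is hypothetical and
known_levels is an assumed stand-in for the levels recorded in the dump
database.

```python
def reference_level(level, known_levels):
    """Return the level of the related dump, or None if there is none."""
    candidate = level - 1
    while candidate >= 0:
        if candidate in known_levels:
            return candidate
        candidate -= 1       # keep decrementing until a dump is found
    return None


print(reference_level(3, {0, 1}))
```

With dumps of level 0 and 1 on record, a level 3 dump refers to the
level 1 dump because no level 2 dump exists.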
- SCHILY.volhdr.tapesize
- This keyword is used for multi volume archives and may be used to verify
the volume size on read back. It represents the tape size expressed in 512
byte units. This keyword is set in multi volume mode if the size of the
tape was not autodetected but set from a command line option.
The value is an unsigned int with a valid range between 1 and
infinity. An implementation should be able to handle at least 64 bit
unsigned values.
- SCHILY.volhdr.volume
- This keyword is used for multi volume archives. It represents the volume
number within a volume set. The number used for the first volume is 1.
The value is an unsigned int with a valid range between 1 and
infinity. An implementation should be able to handle at least 31 bit
unsigned values.
Multi volume archives always use volume headers. Starting with the second
volume, there is a multi volume header that helps to skip the rest of the file
that was split at the end of the previous volume.
Star is able to work with arbitrary unknown volume sizes by
detecting the end of the current media via a write() call that
returns 0. A fixed media size is used, when the option
tsize=# has been specified.
Since star uses a fifo for optimizing the I/O,
except when called with the option -no-fifo, it is not possible to
know the name of the file that is split at a volume limit nor to know the
offset in that file. Unless POSIX.1-2001 extensions are used,
star does not verify whether a follow up volume is the right follow
up volume. For this reason, it is recommended to create multi volume
archives only with archive formats that support POSIX.1-2001
extensions.
The following POSIX.1-2001 extensions are used together
with multi volume archives:
- SCHILY.volhdr.label
- The volume label is used to help to identify a set of volumes.
- SCHILY.volhdr.dumpdate
- The start of the dump with nanosecond precision is used to identify the
correct follow up volume.
- SCHILY.volhdr.volume
- The volume number counts starting with 1 and is used to identify the
correct follow up volume.
- SCHILY.volhdr.blockoff
- The number of blocks read with all previous volumes.
- SCHILY.volhdr.tapesize
- The tape size in case that the tsize=# option was used.
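How these keywords may be combined to recognize the correct follow up
volume is sketched below in Python. This is not star's implementation.
The function name, the dictionaries of keyword values and the sample
values are hypothetical. The volume number is looked up under
SCHILY.volhdr.volume, the name used in the keyword list, and a missing
SCHILY.volhdr.blockoff on the previous volume is treated as 0, which is
an assumption of this sketch.

```python
def is_follow_up(prev, cur, blocks_in_prev):
    """Check whether cur is the volume that follows prev.

    prev and cur map volume header keywords to their text values,
    blocks_in_prev is the number of 512 byte blocks read from prev.
    """
    same_set = (
        cur.get("SCHILY.volhdr.label") == prev.get("SCHILY.volhdr.label")
        and cur.get("SCHILY.volhdr.dumpdate") == prev.get("SCHILY.volhdr.dumpdate")
    )
    next_volume = (
        int(cur["SCHILY.volhdr.volume"]) == int(prev["SCHILY.volhdr.volume"]) + 1
    )
    expected_off = int(prev.get("SCHILY.volhdr.blockoff", "0")) + blocks_in_prev
    right_offset = int(cur["SCHILY.volhdr.blockoff"]) == expected_off
    return same_set and next_volume and right_offset


vol1 = {"SCHILY.volhdr.label": "weekly",
        "SCHILY.volhdr.dumpdate": "1700000000.123456789",
        "SCHILY.volhdr.volume": "1"}
vol2 = dict(vol1, **{"SCHILY.volhdr.volume": "2",
                     "SCHILY.volhdr.blockoff": "2000"})
print(is_follow_up(vol1, vol2, 2000))
```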
spax(1), suntar(1), scpio(1), tar(1),
cpio(1), pax(1), star_sym(1), tartest(1),
star(1)
A tar command appeared in Seventh Edition Unix, which was released in
January, 1979. It replaced the tp program from Fourth Edition Unix
which replaced the tap program from First Edition Unix.
Star was first created in 1982 to extract tapes on a UNIX
clone (UNOS) that had no tar command. In 1985 the first fully
functional version was released as mtar.
When the old star format extensions were introduced in
1985, it was renamed to star (Schily tar). In 1994, POSIX 1003.1-1988
extensions were added and star was renamed to star (Standard
tar).
Joerg Schilling
D-13353 Berlin
Germany
Mail bugs and suggestions to:
joerg@schily.net