|
NAME
sgmls - a validating SGML parser

An SGML System Conforming to
SYNOPSIS
sgmls [ -deglprsuv ] [ -cfile ] [ -iname ] [ -mfile ] [ filename... ]

DESCRIPTION
Sgmls parses and validates the SGML document entity in filename... and prints on the standard output a simple ASCII representation of its Element Structure Information Set. (This is the information set which a structure-controlled conforming SGML application should act upon.) Note that the document entity may be spread amongst several files; for example, the SGML declaration, document type declaration and document instance set could each be in a separate file. If no filenames are specified, then sgmls will read the document entity from the standard input. A filename of - can also be used to refer to the standard input.

The following options are available:
-iname
Pretend that <!ENTITY % name "INCLUDE"> occurs at the start of the document type declaration subset in the SGML document entity. Since repeated definitions of an entity are ignored, this definition will take precedence over any other definitions of this entity in the document type declaration. Multiple -i options are allowed. If the SGML declaration replaces the reserved name INCLUDE then the new reserved name will be the replacement text of the entity. Typically the document type declaration will contain <!ENTITY % name "IGNORE"> and will use %name; in the status keyword specification of a marked section declaration. In this case the effect of the option will be to cause the marked section not to be ignored.
Entity Manager
An external entity resides in one or more files. The entity manager component of sgmls maps a sequence of files into an entity in three sequential stages:
A system identifier is interpreted as a list of filenames separated by colons. A filename of - can be used to refer to the standard input. If a system identifier is not specified, then the entity manager can generate one using catalog entry files in the format defined in the SGML Open Draft Technical Resolution on Entity Management. A catalog entry file contains a sequence of entries in one of the following four forms:
The last two forms are extensions to the SGML Open format. The delimiters can be omitted from the sysid provided it does not contain any white space. Comments are allowed between parameters delimited by -- as in SGML.

The environment variable SGML_CATALOG_FILES contains a colon-separated list of catalog entry files. These will be searched after any catalog entry files specified using the -m option. If this environment variable is not set, then a system dependent list of catalog entry files will be used. A match in a catalog entry file for a PUBLIC entry will take precedence over a match in the same file for an ENTITY or DOCTYPE entry. A filename in a system identifier in a catalog entry file is interpreted relative to the directory containing the catalog entry file.

If no match can be found in a catalog entry file, then the entity manager will attempt to generate a filename using the public identifier (if there is one) and other information available to it. Notation identifiers are not subject to this treatment. This process is controlled by the environment variable SGML_PATH; this contains a colon-separated list of filename templates. A filename template is a filename that may contain substitution fields; a substitution field is a % character followed by a single letter that indicates the value of the substitution. The value of a substitution can either be a string or it can be null. The entity manager transforms the list of filename templates into a list of filenames by substituting for each substitution field and discarding any template that contained a substitution field whose value was null. It then uses the first resulting filename that exists and is readable.

Substitution values are transformed before being used for substitution: firstly, any names that were subject to upper case substitution are folded to lower case; secondly, space characters are mapped to underscores and slashes are mapped to percents. The value of the %S field is not transformed.
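The template expansion described above can be sketched in Python as follows. This is only an illustration of the rules as stated, not the code sgmls uses. The function names are made up, and the field letters N and P in the usage note are placeholders, since %S is the only field this section names. How a % that is not followed by a letter is treated is also an assumption (it is kept literally).

```python
import os


def transform_value(value, fold_case=False):
    """Transform a substitution value as described in the text.

    fold_case should be True for names that were subject to upper
    case substitution. Do not call this for the %S field, whose
    value is used untransformed.
    """
    if fold_case:
        value = value.lower()
    # Spaces become underscores, slashes become percent signs.
    return value.replace(" ", "_").replace("/", "%")


def expand_templates(sgml_path, values):
    """Expand a colon-separated template list into candidate filenames.

    values maps a field letter to a string, or to None for a null
    value. A template that uses a null (or unknown) field is discarded.
    """
    candidates = []
    for template in sgml_path.split(":"):
        parts, i, usable = [], 0, True
        while i < len(template):
            ch = template[i]
            if ch == "%" and i + 1 < len(template):
                value = values.get(template[i + 1])
                if value is None:
                    usable = False
                    break
                parts.append(value)
                i += 2
            else:
                parts.append(ch)  # ordinary character (assumption: a trailing % is literal)
                i += 1
        if usable:
            candidates.append("".join(parts))
    return candidates


def first_readable(candidates):
    """Return the first candidate that exists and is readable, else None."""
    return next((f for f in candidates if os.access(f, os.R_OK)), None)
```

For example, with the placeholder fields N and P, `expand_templates("/a/%N.sgm:/b/%P:%S", {"N": "myent", "P": None, "S": "x/y"})` drops the second template because its field is null and keeps the other two.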
The values of substitution fields are as follows:
                                          With public identifier
                              No public   Device        Device
                              identifier  independent   dependent
Data or subdocument entity    nsd         pns           vns
General SGML text entity      gml         pge           vge
Parameter entity              spe         ppe           vpe
Document type definition      dtd         pdt           vdt
Link process definition       lpd         plp           vlp

The device dependent version is selected if the public text class allows a public text display version but no public text display version was specified.
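As a small illustration, the table and the selection rule above can be written as a lookup. The dictionary keys, the function name and the parameter names are hypothetical; only the three-letter values and the rule itself come from the text.

```python
# Columns: (no public identifier, device independent, device dependent)
FIELD_VALUES = {
    "data_or_subdoc": ("nsd", "pns", "vns"),
    "general_text": ("gml", "pge", "vge"),
    "parameter": ("spe", "ppe", "vpe"),
    "doctype": ("dtd", "pdt", "vdt"),
    "link_process": ("lpd", "plp", "vlp"),
}


def field_value(kind, has_public_id, class_allows_display_version=False,
                display_version=None):
    """Pick the table entry for an entity of the given kind."""
    no_public, device_independent, device_dependent = FIELD_VALUES[kind]
    if not has_public_id:
        return no_public
    # Device dependent: the public text class allows a display
    # version, but the identifier did not specify one.
    if class_allows_display_version and display_version is None:
        return device_dependent
    return device_independent
```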
The value of the following substitution fields will be null unless a valid formal public identifier was supplied.
Normally if the external identifier for an entity includes a
system identifier, the entity manager will use the specified system
identifier and not attempt to generate one. If, however, SGML_PATH
uses the %S field, then the entity manager will first search for a
matching entry in the catalog entry files. If a match is found, then this
will be used instead of the specified system identifier. Otherwise, if the
specified system identifier does not contain any colons, the entity manager
will use SGML_PATH to generate a filename. Otherwise the entity
manager will use the specified system identifier.
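The order of these decisions can be summarised in a short Python sketch. It is a reading of the prose in this section, not the actual implementation: `catalog_lookup` and `generate_from_path` are hypothetical callables standing in for the catalog search and the SGML_PATH template expansion, each returning a filename or None. What happens when template expansion finds nothing is not stated in the text, so the sketch simply returns that result.

```python
def choose_system_identifier(specified, sgml_path, catalog_lookup,
                             generate_from_path):
    """Decide which system identifier or filename to use for an entity.

    specified is the system identifier from the external identifier,
    or None if there was none. sgml_path is the value of SGML_PATH.
    """
    if specified is None:
        # No system identifier: try the catalogs, then SGML_PATH.
        match = catalog_lookup()
        return match if match is not None else generate_from_path()

    if "%S" not in sgml_path:
        # Normal case: the specified system identifier is used as is.
        return specified

    # SGML_PATH uses %S: a catalog match overrides the specified value.
    match = catalog_lookup()
    if match is not None:
        return match
    if ":" not in specified:
        # No colons: generate a filename from the SGML_PATH templates.
        return generate_from_path()
    return specified
```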
System declaration
The system declaration for sgmls is as follows:

    SYSTEM "ISO 8879:1986"
                        CHARSET
    BASESET  "ISO 646-1983//CHARSET
              International Reference Version (IRV)//ESC 2/5 4/0"
    DESCSET  0 128 0
    CAPACITY PUBLIC  "ISO 8879:1986//CAPACITY Reference//EN"
                        FEATURES
    MINIMIZE DATATAG NO  OMITTAG  YES    RANK     NO   SHORTTAG YES
    LINK     SIMPLE  NO  IMPLICIT NO     EXPLICIT NO
    OTHER    CONCUR  NO  SUBDOC   YES 1  FORMAL   YES
    SCOPE    DOCUMENT
    SYNTAX   PUBLIC  "ISO 8879:1986//SYNTAX Reference//EN"
    SYNTAX   PUBLIC  "ISO 8879:1986//SYNTAX Core//EN"
                        VALIDATE
             GENERAL YES MODEL    YES    EXCLUDE  YES  CAPACITY YES
             NONSGML YES SGML     YES    FORMAL   YES
                        SDIF
             PACK    NO  UNPACK   NO

Exceeding a capacity limit will be ignored unless the -c option is given. The memory usage of sgmls is not a function of the capacity points used by a document; however, sgmls can handle capacities significantly greater than the reference capacity set.

In some environments, higher values may be supported for the SUBDOC parameter.

Documents that do not use optional features are also supported. For example, if FORMAL NO is specified in the SGML declaration, public identifiers will not be required to be valid formal public identifiers.

Certain parts of the concrete syntax may be changed:

The shunned character numbers can be changed.

Eight bit characters can be assigned to LCNMSTRT, UCNMSTRT, LCNMCHAR and UCNMCHAR.

Uppercase substitution can be performed or not performed both for entity names and for other names.

Either short reference delimiters assigned by the reference delimiter set or no short reference delimiters are supported.

The reserved names can be changed.
The quantity set can be increased within certain limits subject to there being sufficient memory available. The upper limit on NAMELEN is 239. The upper limits on ATTCNT, ATTSPLEN, BSEQLEN, ENTLVL, LITLEN, PILEN, TAGLEN, and TAGLVL are more than thirty times greater than the reference limits. The upper limit on GRPCNT, GRPGTCNT, and GRPLVL is 253. NORMSEP cannot be changed. DTAGLEN and DTEMPLEN are irrelevant since sgmls does not support the DATATAG feature.

SGML declaration
The SGML declaration may be omitted; in that case the following declaration will be implied:

    <!SGML "ISO 8879:1986"
                        CHARSET
    BASESET  "ISO 646-1983//CHARSET
              International Reference Version (IRV)//ESC 2/5 4/0"
    DESCSET    0  9 UNUSED
               9  2  9
              11  2 UNUSED
              13  1 13
              14 18 UNUSED
              32 95 32
             127  1 UNUSED
    CAPACITY PUBLIC  "ISO 8879:1986//CAPACITY Reference//EN"
    SCOPE    DOCUMENT
    SYNTAX   PUBLIC  "ISO 8879:1986//SYNTAX Reference//EN"
                        FEATURES
    MINIMIZE DATATAG NO  OMITTAG  YES           RANK     NO   SHORTTAG YES
    LINK     SIMPLE  NO  IMPLICIT NO            EXPLICIT NO
    OTHER    CONCUR  NO  SUBDOC   YES 99999999  FORMAL   YES
                        APPINFO NONE>

with the exception that characters 128 through 254 will be assigned to DATACHAR.

Sgmls identifies base character sets using the designating sequence in the public identifier. The following designating sequences are recognized:

    Designating        ISO           Minimum    Number
    Escape             Registration  Character  of          Description
    Sequence           Number        Number     Characters
    ESC 2/5 4/0        -               0        128         full set of ISO 646 IRV
    ESC 2/8 4/0        2              33         94         G0 set of ISO 646 IRV
    ESC 2/8 4/2        6              33         94         G0 set of ASCII
    ESC 2/13 4/1       100            32         96         G1 set of ISO 8859-1
    ESC 2/1 4/0        1               0         32         C0 set of ISO 646
    ESC 2/2 4/3        77              0         32         C1 set of ISO 6429
    ESC 2/5 2/15 3/0   -               0        256         the system character set

When one of the G0 sets is used as a base set, the characters SPACE and DELETE are treated as occurring at positions 32 and 127 respectively; although these characters are not part of the character sets designated by the escape sequences, this mimics the behaviour of ISO 2022 with respect to these code positions.

Output format
The output is a series of lines. Lines can be arbitrarily long. Each line consists of an initial command character and one or more arguments. Arguments are separated by a single space, but when a command takes a fixed number of arguments the last argument can contain spaces. There is no space between the command character and the first argument. Arguments can contain the following escape sequences.
A record start character will be represented by \012. Most applications will need to ignore \012 and translate \n into newline. The possible command characters and arguments are as follows:
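A consumer of this output might handle the line structure and the two escapes mentioned above along the following lines. This Python sketch is illustrative only: the function names are made up, and the command character X in the usage note is a placeholder rather than a documented command. Any escape sequence other than \012 and \n is passed through unchanged, because the full list of escapes is not reproduced here.

```python
import re


def split_line(line, nargs=None):
    """Split an output line into (command character, argument list).

    nargs is the fixed number of arguments the command takes, if any;
    in that case the last argument may itself contain spaces.
    """
    command, rest = line[0], line[1:]
    if nargs is None:
        return command, rest.split(" ")
    return command, rest.split(" ", nargs - 1)


def decode_argument(arg):
    """Drop \\012 (record start) and turn \\n into a newline."""
    def replace(match):
        body = match.group(1)
        if body == "012":
            return ""
        if body == "n":
            return "\n"
        return match.group(0)  # other escapes: leave untouched

    # Match a backslash plus either "012" or one following character,
    # so a backslash pair followed by "n" is not misread as \n.
    return re.sub(r"\\(012|.)", replace, arg)
```

For instance, `split_line("Xfoo bar baz qux", nargs=3)` keeps the spaces in the final argument, and `decode_argument` applied to the text `a\012b\nc` yields `ab`, a newline, then `c`.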
BUGS
Some non-SGML characters in literals are counted as two characters for the purposes of quantity and capacity calculations.

SEE ALSO
The SGML Handbook, Charles F. Goldfarb
ISO 8879 (Standard Generalized Markup Language), International Organization for Standardization

ORIGIN
ARCSGML was written by Charles F. Goldfarb.
Sgmls was derived from ARCSGML by James Clark (jjc@jclark.com), to whom bugs should be reported.
|