SNOBOL4IO(1)                CSNOBOL4 Manual                SNOBOL4IO(1)
snobol4io - SNOBOL4 file I/O
Macro SNOBOL4 originally depended on FORTRAN libraries, unit numbers and
FORMATs for input and output. CSNOBOL4 uses the C stdio(3)
library instead, but unit numbers (INTEGERs between 1 and 256) and record
lengths remain embedded in the Macro SNOBOL4 code.
Output on a closed unit generates a fatal ``Output error''; see
snobol4error(1).
The following variable/unit/file associations exist by
default:
Variable  Unit  Association
INPUT     5     standard input (input)
OUTPUT    6     standard output (output)
TERMINAL  7     standard error (output)
TERMINAL  8     /dev/tty (input)
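As a minimal sketch of the default associations listed above, the following program copies standard input to standard output and reports a line count on standard error. The variable names N and LINE and the labels are illustrative only.

```snobol4
* Copy stdin (INPUT, unit 5) to stdout (OUTPUT, unit 6), then
* report the number of lines on stderr (TERMINAL, unit 7).
        N = 0
LOOP    LINE = INPUT                       :F(DONE)
        OUTPUT = LINE
        N = N + 1                          :(LOOP)
DONE    TERMINAL = N ' lines copied'
END
```

The read of INPUT fails at end of file, which is what ends the loop.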
Input and output filenames can be supplied to the INPUT() and
OUTPUT() functions via an optional fourth argument.
- filename - (hyphen)
- A filename of ``-'' is interpreted as stdin on INPUT() and stdout on
OUTPUT().
- sub-process I/O using PIPE and Pseudo-terminals
- If the filename begins with a single vertical bar (|), the
remainder is used as a shell command whose stdin (in the case of
OUTPUT()) or stdout (in the case of INPUT()) will be
connected to the file variable via a pipe. If a pipe is opened by INPUT()
in ``update'' mode, the connection will be bi-directional (on
systems with socketpair and Unix-domain sockets). See below for how to
associate a variable for I/O in both directions.
- If the filename begins with two vertical bars (||) the remainder is
used as a shell command executed with stdin, stdout and stderr attached to
the slave side of a pseudo-terminal (pty), if the system C library
contains the forkpty(3) routine. Use of a pty is necessary when the
program to be invoked cannot be run without a ``terminal'' for I/O. See
below on how to properly associate the I/O variable.
- magic paths /dev/stdin, /dev/stdout, and
/dev/stderr
- /dev/stdin, /dev/stdout, and /dev/stderr refer to the
current process's standard input, standard output and standard error I/O
streams respectively regardless of whether those special filenames
exist on your system.
- magic path /dev/fd/n
- /dev/fd/n uses fdopen(3) to open a new I/O stream
associated with file descriptor number n, regardless of whether the
special device entries exist.
- magic paths /tcp/hostname/service, /udp/hostname/service,
and /tls/hostname/service
- /tcp/hostname/service can be used to open a connection
to a TCP server. /udp/hostname/service behaves
similarly for UDP. /tls/hostname/service opens a TLS
over TCP connection (NOTE! does not attempt to verify the certificate unless
the "verify" option is used, and even then does not handle SNI or SAN).
The path can be followed by a number of different slash-separated options:
broadcast  Allow broadcast address (UDP only).
dontroute  Enables routing bypass for outgoing messages.
keepalive  Enables TCP connection keep alive messages.
nodelay    Send TCP data without waiting.
oobinline  Enables reception of out-of-band data in band.
priv       Bind local port number under 1024 (if allowed).
reuseaddr  Allow quick reuse of local addresses.
verify     Attempt to verify server TLS certificate.
- magic pathname /dev/tmpfile
- /dev/tmpfile opens an anonymous temporary file for reading and
writing, see tmpfile(3).
- /dev/null and /dev/tty
- On non-POSIX systems /dev/null and /dev/tty are magical, and
refer to the null device, and the user's terminal/console,
respectively.
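A pipe filename, as described above, can be tried with a short program such as the following. This is only a sketch: the shell command ls, the variable names, and the labels are arbitrary choices for illustration, and the null third argument simply means that no options are given.

```snobol4
* Read the output of a shell command through a pipe and copy it
* to standard output. IO_FINDUNIT() picks a free unit number.
        UNIT = IO_FINDUNIT()
        INPUT(.LS, UNIT, '', '|ls')        :F(NOPIPE)
LOOP    OUTPUT = LS                        :S(LOOP)
        ENDFILE(UNIT)                      :(END)
NOPIPE  TERMINAL = 'cannot associate pipe'
END
```

Each reference to LS reads one line from the command; the loop ends when the read fails at end of file.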
Originally the third argument specified record length for INPUT(), or a
FORTRAN FORMAT for OUTPUT().
CSNOBOL4 interprets it as a string of single-letter options; commas
are ignored. Some options affect only the I/O variable named in the first
argument; others affect any variable associated with the unit number in the
second argument.
- digits
- A span of digits will set the input record length for the named I/O
variable. This controls the maximum string that will be returned for
regular text I/O, and the number of bytes returned for binary I/O. Record
length is per-variable association; multiple variables may be associated
with the same unit, but with different record lengths. The default record
length for input is 1024. Lines longer than the record length will be
silently truncated. Since CSNOBOL4 2.2, record length is only honored
for binary I/O, and all characters up to a newline (ASCII Line Feed) are
interpreted as a single line.
- A
- For OUTPUT() the unit will be opened for append access (the option is
ignored by INPUT()). All writes will occur at the end of the file at the
time of the write, regardless of the file position before the write.
- B
- The unit will be opened for binary access. On input, newline characters
have no special meaning; the number of bytes transferred depends on record
length (see above). On output, no newline is appended.
- For terminal devices, all input from this unit will be done without
special processing for line editing or EOF; the number of characters
returned depends on the record length. Characters which deliver signals
(including interrupt, kill, and suspend) are still processed. Units (with
different fds) opened on the same terminal device operate independently;
some can use binary mode, while others operate in text mode.
- C
- Character at a time I/O. A synonym for B,1.
- E
- Set the "close on exec" flag for the underlying file descriptor.
Depends on support by the C library fopen(3) call for 'e' in the
mode string for regular files. Honored for sockets regardless (but not on
Windows).
- J
- Read and write compressed data in .xz format, using liblzma, as
written by xz(1). If a digit 0 through 9 immediately follows the
option, it will be interpreted as the compression level to use when
writing. It's claimed that level zero is "sometimes faster than gzip
-9 while compressing much better". The default compression level is
6; larger numbers will require more than 16 MiB of memory to decompress,
and are useful only when compressing files bigger than 8 MiB (level
7), 16 MiB (level 8), and 32 MiB (level 9). Matches the tar(1)
command line option. Added in CSNOBOL4 2.2.
- j
- Read and write compressed data in .bz2 format, using libbz2,
as created by bzip2(1). If a digit 1 through 9 immediately follows
the option, it will be interpreted as the compression level to use when
writing. Matches the tar(1) command line option. Added in
CSNOBOL4 2.2.
- K
- If an input line is longer than the input record length, return the line
in multiple reads (breaK up the line) instead of discarding the extra
characters. Added in CSNOBOL4 2.0. Obsolete in CSNOBOL4 2.2.
- T
- Terminal mode. Writes are performed ``unbuffered'' (see below), and no
newline characters are added. On input newline characters are returned.
Terminal mode affects only the referenced unit, and does not require
opening a new file descriptor (i.e., by using a magic pathname):
OUTPUT(.TT, 8, "T", "-"). Terminal mode is
useful for outputting prompts in interactive programs.
- Q
- Quiet mode. Turns off input echo on terminals. Affects only input on this
file descriptor.
- U
- Update mode. The unit is opened for both input and output. Example of
associating a variable for I/O in both directions:
        unit = IO_FINDUNIT()
        INPUT(.name, unit, 'U', 'filepath')
        OUTPUT(.name, unit)
- This is useful when filepath is /dev/fd/n, where
n is a file descriptor number returned by SERV_LISTEN(), or when
filepath specifies a pipe (|command) or
pseudo-terminal (||command).
- The above sequence is also useful when combined with fixed record
length, binary mode and the SET() function for I/O to preexisting
files. Performing OUTPUT() first will create a regular file if it
does not exist, but will also truncate a preexisting file!
- W
- Unbuffered mode. Each output variable assignment causes an immediate I/O
transfer to occur by direct read(2) or write(2) system
calls, rather than collecting the data in a buffer for efficiency.
- X
- Open fails if file exists (meaningless for /dev/fd/n).
Depends on support by the C library fopen(3) call for 'x' in the
mode string. Added in CSNOBOL4 2.1, where it was ignored for
sockets. In CSNOBOL4 2.2 it also applies to sockets, where it means that
local socket address reuse is not allowed.
- Z
- Reserved for .Z (compress(1)) style compression?!
- z
- Read and write compressed data in .gz format using zlib(3),
as created by gzip(1). If a digit 0 through 9 immediately follows
the option, it will be interpreted as the compression level to use when
writing. Matches the tar(1) command line option. Added in
CSNOBOL4 2.2.
- SERV_LISTEN(), SET(), SSET()
- see snobol4func(1).
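The update-mode sequence shown under the U option can be exercised without touching any named file by using /dev/tmpfile. The following is a sketch under that assumption; the variable name TMP, the labels, and the two sample lines are illustrative only.

```snobol4
* Associate one variable for both input and output on an
* anonymous temporary file, write two lines, rewind, read back.
        UNIT = IO_FINDUNIT()
        INPUT(.TMP, UNIT, 'U', '/dev/tmpfile')     :F(FAIL)
        OUTPUT(.TMP, UNIT)                         :F(FAIL)
* Assignments to TMP write lines to the unit.
        TMP = 'first line'
        TMP = 'second line'
* Return to the start of the file before reading.
        REWIND(UNIT)
* References to TMP now read lines until end of file.
LOOP    OUTPUT = TMP                               :S(LOOP)
        ENDFILE(UNIT)                              :(END)
FAIL    TERMINAL = 'cannot open temporary file'
END
```

INPUT() is called first, as in the sequence above, so that the open is done in update mode rather than by a plain OUTPUT() call.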
The Macro SNOBOL4 and POSIX I/O architectures have subtleties which interact,
and are explained here:
- Variable association
- Input and output is done by reading or writing variables associated with a
unit number for I/O.
- Input (maximum) record lengths are associated with each variable
association!
- Unit number
- Multiple variables can be associated with the same unit number using the
INPUT() and OUTPUT() functions.
- Each unit number refers to a stdio(3) stream (except on broken
systems like Windows, where socket handles are incompatible with file
handles, and all network I/O is performed ``unbuffered'').
- Sequential named files can be associated with an I/O unit when the
-r option is given on the command line! REWIND() should
return to after the program END label!
- ``Standard I/O'' Stream
- snobol4(1) performs MOST I/O through ``Standard
Input/Output'' streams. Multiple units can be associated with the same
stdio stream (FILE struct) using magic pathnames (``-'' and
/dev/std{in,out,err}). Buffering is performed by the stdio
layer.
- Operating System file descriptor
- More than one stdio stream can be associated with the same O/S ``fd'' (by
opening magic pathname ``/dev/fd/n'').
- Each POSIX ``fd'' has a file position pointer, changed by reading, writing
and the REWIND(), SET() and SSET() functions.
- Normally terminal device ``special files'' have one set of mode
settings, but CSNOBOL4 associates (saves and restores) different terminal
settings (echo and the number of characters returned on read) based on fd
numbers.
- Operating System open file object
- More than one ``fd'' slot can be associated with the same ``open file''
object, either in multiple forks, or by dup(2) of the same fd. This
is often the case for stdin, stdout and stderr.
- Open file objects have flags which affect all associated fds, including
input, output and append modes.
- Operating System named file
- Independent opens of the same named ``regular'' file will have different
open file objects, and thus have independent access modes and file
positions.
- Terminal devices normally have one set of ``line discipline'' mode
settings, but CSNOBOL4 maintains different settings for each file
descriptor (see above).
This page was cut and pasted from various parts of the original
snobol4(1) man page, and still needs review and cleanup.
All extensions should be annotated with the version they appeared
in (and what other implementations they're compatible with or inspired by).
Record lengths.
Unit numbers.
snobol4(1), snobol4ezio(3)