|
NAME
iconv - charset conversion function

SYNOPSIS
#include <iconv.h>

size_t
iconv(iconv_t cd, const char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft);

DESCRIPTION
The iconv() function converts the sequence of characters from one charset, in the array specified by inbuf, into a sequence of corresponding characters in another charset, in the array specified by outbuf. The charsets are those specified in the iconv_open() call that returned the conversion descriptor, cd. The inbuf argument points to a variable that points to the first character in the input buffer and inbytesleft indicates the number of bytes to the end of the buffer to be converted. The outbuf argument points to a variable that points to the first available byte in the output buffer and outbytesleft indicates the number of the available bytes to the end of the buffer.

For state-dependent encodings, the conversion descriptor cd is placed into its initial shift state by a call for which inbuf is a null pointer, or for which inbuf points to a null pointer. When iconv() is called in this way, and if outbuf is not a null pointer or a pointer to a null pointer, and outbytesleft points to a positive value, iconv() will place, into the output buffer, the byte sequence to change the output buffer to its initial shift state. If the output buffer is not large enough to hold the entire reset sequence, iconv() will fail and set errno to E2BIG. Subsequent calls with inbuf as other than a null pointer or a pointer to a null pointer cause the conversion to take place from the current state of the conversion descriptor.

If a sequence of input bytes does not form a valid character in the specified charset, conversion stops after the previous successfully converted character. If the input buffer ends with an incomplete character or shift sequence, conversion stops after the previous successfully converted bytes.
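
As an illustration of the calling sequence described so far, the following is a minimal sketch of a single conversion followed by the reset call with a null inbuf. The helper name convert_once() is hypothetical and not part of the library. The sketch assumes the char ** prototype for inbuf used by some implementations; with the const char ** prototype shown in SYNOPSIS, the local variable in would be declared as const char * instead.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libiconv): converts srclen bytes at src
 * from charset `from` to charset `to`, writing at most dstcap bytes to dst.
 * Returns the number of bytes written, or (size_t)-1 with errno set.
 */
size_t
convert_once(const char *to, const char *from,
    const char *src, size_t srclen, char *dst, size_t dstcap)
{
	iconv_t cd = iconv_open(to, from);
	if (cd == (iconv_t)-1)
		return (size_t)-1;

	char *in = (char *)src;		/* input bytes are only read */
	size_t inleft = srclen;
	char *out = dst;
	size_t outleft = dstcap;

	/* Convert the input; iconv() advances in/out and updates the counts. */
	size_t rc = iconv(cd, &in, &inleft, &out, &outleft);

	/*
	 * A null inbuf asks for the sequence that returns the output to its
	 * initial shift state (nothing is written for stateless encodings).
	 */
	if (rc != (size_t)-1)
		rc = iconv(cd, NULL, NULL, &out, &outleft);

	int saved = errno;		/* iconv_close() may change errno */
	iconv_close(cd);
	errno = saved;

	return rc == (size_t)-1 ? (size_t)-1 : dstcap - outleft;
}
```

A caller would pass a destination buffer it considers large enough and check the result against (size_t)-1 before using the converted bytes.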
If the output buffer is not large enough to hold the entire converted input, conversion stops just prior to the input bytes that would cause the output buffer to overflow.

The variable pointed to by inbuf is updated to point to the byte following the last byte successfully used in the conversion. The value pointed to by inbytesleft is decremented to reflect the number of bytes still not converted in the input buffer. The variable pointed to by outbuf is updated to point to the byte following the last byte of converted output data. The value pointed to by outbytesleft is decremented to reflect the number of bytes still available in the output buffer. For state-dependent encodings, the conversion descriptor is updated to reflect the shift state in effect at the end of the last successfully converted byte sequence.

If iconv() encounters a character in the input buffer that is legal, but for which an identical character does not exist in the target charset, iconv() performs an implementation-defined conversion on this character.

RETURN VALUES
The iconv() function updates the variables pointed to by the arguments to reflect the extent of the conversion and returns the number of non-identical conversions performed. If the entire string in the input buffer is converted, the value pointed to by inbytesleft will be 0. If the input conversion is stopped due to any conditions mentioned above, the value pointed to by inbytesleft will be non-zero and errno is set to indicate the condition. If an error occurs iconv() returns (size_t) -1 and sets errno to indicate the error.

ERRORS
The iconv() function will fail if:
The iconv() function may fail if:
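
The E2BIG condition described in DESCRIPTION is recoverable, because iconv() leaves the input pointer and byte counts positioned at the point where conversion stopped. The following is a minimal sketch of one way to handle it, by doubling a heap buffer and calling iconv() again. The helper name convert_alloc() and the initial capacity of 16 bytes are illustrative assumptions, and the char ** type for the input pointer is assumed as in some implementations (use const char * with the prototype shown in SYNOPSIS).

```c
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of libiconv): converts srclen bytes at src
 * using an already opened descriptor cd. Returns a malloc'd buffer holding
 * the converted bytes and stores their count in *outlen, or returns NULL
 * on failure. The caller frees the result.
 */
char *
convert_alloc(iconv_t cd, const char *src, size_t srclen, size_t *outlen)
{
	size_t cap = 16;		/* deliberately small to force E2BIG */
	size_t used = 0;
	char *buf = malloc(cap);
	if (buf == NULL)
		return NULL;

	char *in = (char *)src;		/* input bytes are only read */
	size_t inleft = srclen;

	while (inleft > 0) {
		char *out = buf + used;
		size_t outleft = cap - used;

		size_t rc = iconv(cd, &in, &inleft, &out, &outleft);
		used = cap - outleft;	/* keep whatever was converted */
		if (rc != (size_t)-1)
			break;		/* all input consumed */

		if (errno != E2BIG) {	/* invalid or truncated input, etc. */
			free(buf);
			return NULL;
		}

		/* Output buffer full: grow it and resume where iconv() stopped. */
		char *tmp = realloc(buf, cap * 2);
		if (tmp == NULL) {
			free(buf);
			return NULL;
		}
		buf = tmp;
		cap *= 2;
	}

	*outlen = used;
	return buf;
}
```

Doubling the buffer keeps the number of retries logarithmic in the output size. A streaming caller could instead write out the filled buffer and reuse it.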
APPLICATION USAGE
The inbuf argument indirectly points to the memory area which contains the conversion input data. The outbuf argument indirectly points to the memory area which is to contain the result of the conversion. The objects indirectly pointed to by inbuf and outbuf are not restricted to containing data that is directly representable in the ISO C language char data type. The type of inbuf and outbuf, char **, does not imply that the objects pointed to are interpreted as null-terminated C strings or arrays of characters. Any interpretation of a byte sequence that represents a character in a given character set encoding scheme is done internally within the codeset converters. For example, the area pointed to indirectly by inbuf and/or outbuf can contain all zero octets that are not interpreted as string terminators but as coded character data according to the respective codeset encoding scheme. The type of the data (char, short int, long int, and so on) read or stored in the objects is not specified, but may be inferred for both the input and output data by the converters determined by the from_charset and to_charset arguments of iconv_open().

Regardless of the data type inferred by the converter, the size of the remaining space in both input and output objects (the inbytesleft and outbytesleft arguments) is always measured in bytes.

IMPLEMENTATION DETAILS
Conversions between different charsets are done via the UCS-4 universal character set. Conversions between the same charset (e.g. when two different aliases of the same charset are used) are done by direct copying from the input buffer to the output one.

The libiconv library itself usually contains only a small set of (built-in) charsets. Tables for conversion between UCS-4 and particular charsets are mapped to memory from binary table files, or C methods are loaded dynamically from shared modules:
Any CCS table or CES module can be built into the library at compilation time. A CCS or CES charset can have zero or more aliases (alternative names), which are listed in the charset.aliases file located in the same directory as the CCS tables. The library maps the aliases file to memory to find canonical charset names.

If iconv() encounters a character in the input buffer that is legal, but for which an identical character does not exist in the target charset, iconv() replaces the source character with the '_' (underscore) character and tries to convert it into the target charset. If there is no underscore character in the target charset, no bytes are written to the target buffer for the source character. In any case, iconv() increments the number of non-identical conversions performed (the value being returned as the function result).

FILES
SEE ALSO
iconv(1), iconv_close(3), iconv_open(3)
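
To illustrate the APPLICATION USAGE notes about zero octets and byte counts, the following minimal sketch converts UTF-8 text to UTF-16LE. The helper name to_utf16le() is hypothetical, the charset names "UTF-8" and "UTF-16LE" are assumed to be available in the installed library, and the char ** type for the input pointer is assumed as in some implementations (use const char * with the prototype shown in SYNOPSIS).

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libiconv): converts len bytes of UTF-8
 * at utf8 to UTF-16LE in dst. dstcap and the return value are byte counts,
 * not counts of 16-bit units. Returns (size_t)-1 on failure.
 */
size_t
to_utf16le(const char *utf8, size_t len, unsigned char *dst, size_t dstcap)
{
	iconv_t cd = iconv_open("UTF-16LE", "UTF-8");
	if (cd == (iconv_t)-1)
		return (size_t)-1;

	char *in = (char *)utf8;	/* input bytes are only read */
	size_t inleft = len;		/* explicit length, no terminator needed */
	char *out = (char *)dst;	/* output is raw bytes, not a C string */
	size_t outleft = dstcap;

	size_t rc = iconv(cd, &in, &inleft, &out, &outleft);
	iconv_close(cd);

	return rc == (size_t)-1 ? (size_t)-1 : dstcap - outleft;
}
```

For ASCII input every second output byte is zero, so the result must be handled with its explicit byte length rather than with string functions such as strlen().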
|