KHTTP_PARSE(3)    FreeBSD Library Functions Manual    KHTTP_PARSE(3)
   
 
khttp_parse, khttp_parsex - parse a CGI instance for kcgi
#include <sys/types.h>
#include <stdarg.h>
#include <stdint.h>
#include <kcgi.h>
enum kcgi_err
khttp_parse(struct kreq *req, const struct kvalid *keys, size_t keysz,
    const char *const *pages, size_t pagesz, size_t defpage);

enum kcgi_err
khttp_parsex(struct kreq *req, const struct kmimemap *suffixes,
    const char *const *mimes, size_t mimesz, const struct kvalid *keys,
    size_t keysz, const char *const *pages, size_t pagesz,
    size_t defmime, size_t defpage, void *arg,
    void (*argfree)(void *arg), unsigned int debugging,
    const struct kopts *opts);
extern const char *const kmimetypes[KMIME__MAX];
extern const char *const khttps[KHTTP__MAX];
extern const char *const kschemes[KSCHEME__MAX];
extern const char *const kmethods[KMETHOD__MAX];
extern const struct kmimemap ksuffixmap[];
extern const char *const ksuffixes[KMIME__MAX];
The
    khttp_parse()
    and khttp_parsex() functions parse and validate
    input and the HTTP environment (compression, paths, MIME types, and so on).
    They are the central functions in the
    kcgi(3)
    library, parsing and validating key-value form (query string, message body,
    cookie) data and opaque message bodies. 
They must be matched by
    khttp_free(3)
    if and only if the return value is KCGI_OK.
    Otherwise, resources are internally freed. 
The collective arguments are as follows: 
  - arg
 
  - A pointer to private application data. It is not touched unless
      argfree is provided.
 
  - argfree
 
  - Function invoked with arg by the child process
      starting to parse untrusted network data. This makes sure that no
      unnecessary data is leaked into the child.
 
  - debugging
 
  - This bit-field enables debugging of the underlying parse and/or write
      routines. It may have KREQ_DEBUG_WRITE set for writes and
      KREQ_DEBUG_READ_BODY set for the pre-parsed body. Debugging messages
      sent to kutil_info(3) consist of the process ID followed by "-tx" or
      "-rx" for writing or reading, a colon and space, then the logged data.
      A newline flushes the existing line, as does reaching 80 characters.
      If a line is flushed at 80 characters and not at a newline, an
      ellipsis follows the line. The total logged bytes are emitted at the
      end of all reads or writes.
  - defmime
 
  - If no MIME type is specified (that is, there's no suffix to the page
      request), use this index in the mimes array.
 
  - defpage
 
  - If no page was specified (e.g., the default landing page), this is
      provided as the requested page index.
 
  - keys
 
  - An optional array of input and validation fields or NULL.
  - keysz
 
  - The number of elements in keys.
 
  - mimesz
 
  - The number of elements in mimes. Also the MIME index
      used if no MIME type was matched. This differs from
      defmime, which is used if there is no MIME suffix at
      all.
 
  - mimes
 
  - An array of MIME types (e.g., “text/html”), mapped into a
      MIME index during MIME body parsing. This relates both to pages and input
      fields with a body type. Any array should include at least text/plain,
      as this is the default content type for MIME documents.
  - opts
 
  - Tunable options regarding socket buffer sizes and so on. If set to
      NULL, meaningful defaults are used.
  - pages
 
  - An array of recognised pathnames. When pathnames are parsed, they're
      matched to indices in this array.
 
  - pagesz
 
  - The number of pages in pages. Also used if the
      requested page was not in pages.
 
  - req
 
  - This structure is cleared and filled with input fields and HTTP context
      parsed from the CGI environment. It is the main structure carried around
      in a
      kcgi(3)
      application.
 
  - suffixes
 
  - Define the MIME type (suffix) mapping.
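
The difference between defmime and mimesz described above can be sketched in plain C. This is only an illustration, not the library's code: struct suffix_entry and resolve_mime() are hypothetical names standing in for a suffix map and for the lookup, and the exact-match comparison is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a suffix map. */
struct suffix_entry {
	const char *name; /* suffix without the dot; NULL ends the map */
	size_t mime;      /* index into the mimes array */
};

/*
 * Resolve a request suffix to a MIME index.
 * No suffix at all yields defmime; a suffix that is present but not in
 * the map yields mimesz.  Exact matching is an assumption of this sketch.
 */
size_t
resolve_mime(const char *suffix, const struct suffix_entry *map,
    size_t defmime, size_t mimesz)
{
	const struct suffix_entry *e;

	if (suffix == NULL || *suffix == '\0')
		return defmime;
	for (e = map; e->name != NULL; e++)
		if (strcmp(e->name, suffix) == 0)
			return e->mime;
	return mimesz;
}
```

An application can therefore tell "no suffix" apart from "unknown suffix" by comparing the result against mimesz.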
 
 
The first form,
    khttp_parse(),
    is for applications using the system-recognised MIME types. This should work
    well enough for most applications. It is equivalent to invoking the second
    form, khttp_parsex(), as follows: 
khttp_parsex(req, ksuffixmap,
  kmimetypes, KMIME__MAX, keys, keysz,
  pages, pagesz, KMIME_TEXT_HTML,
  defpage, NULL, NULL, 0, NULL); 
 
A struct kreq object is filled in by
    khttp_parse() and
    khttp_parsex(). It consists of the following
  fields: 
  - void *arg
 
  - Private application data. This is set during khttp_parse().
  - enum kauth auth
 
  - Type of “managed” HTTP authorisation performed by the web
      server according to the AUTH_TYPE header variable, if any. This is
      KAUTH_DIGEST for the AUTH_TYPE of "digest", KAUTH_BASIC for "basic",
      KAUTH_BEARER for "bearer", KAUTH_UNKNOWN for other values of
      AUTH_TYPE, or KAUTH_NONE if AUTH_TYPE is not set. See the
      rawauth field for raw (i.e., not processed by the
      web server) authorisation requests.
  - struct kpair **cookiemap
 
  - An array of keysz singly linked lists of elements of
      the cookies array. If cookie->key is equal to one
      of the entries of keys and cookie->state is
      KPAIR_VALID or KPAIR_UNCHECKED, the cookie is added to the list
      cookiemap[cookie->keypos].
      Empty lists are NULL. If a list contains more than
      one cookie, cookie->next
      points to the next cookie. For the last cookie in a list,
      cookie->next is NULL.
  - struct kpair **cookienmap
 
  - Similar to cookiemap, except that it contains the
      cookies where cookie->state is KPAIR_INVALID.
  - struct kpair *cookies
 
  - Key-value pairs read from request cookies found in the
      HTTP_COOKIE header variable, or
      NULL if cookiesz is 0. See
      fields for key-value pairs from the request query
      string or message body.
  - size_t cookiesz
 
  - The size of the cookies array.
 
  - struct kpair **fieldmap
 
  - Similar to cookiemap, except that the lists contain
      elements of the fields array.
 
  - struct kpair **fieldnmap
 
  - Similar to fieldmap, except that it contains the
      fields where field->state is KPAIR_INVALID.
  - struct kpair *fields
 
  - Key-value pairs read from the QUERY_STRING header
      variable and from the message body, or NULL if
      fieldsz is 0. See cookies
      for key-value pairs from request cookies.
  - size_t fieldsz
 
  - The number of elements in the fields array.
 
  - char *fullpath
 
  - The full requested path as contained in the
      PATH_INFO header variable. For example, requesting
      "https://bsd.lv/app.cgi/dir/file.html?q=v", where
      "app.cgi" is the CGI program, this value would be
      /dir/file.html. It is not guaranteed to start with
      a slash and it may be an empty string.
  - char *host
 
  - The host name received in the HTTP_HOST header
      variable. When using name-based virtual hosting, this is typically the
      virtual host name specified by the client in the HTTP request, and it
      should not be confused with the canonical DNS name of the host running the
      web server. For example, a request to
      "https://bsd.lv/app.cgi/file" would have a host of
      "bsd.lv". If HTTP_HOST is not defined,
      host is set to "localhost".
  - struct kdata *kdata
 
  - Internal data. Should not be touched.
 
  - const struct kvalid *keys
 
  - Value passed to khttp_parse().
  - size_t keysz
 
  - Value passed to khttp_parse().
  - enum kmethod method
 
  - The KMETHOD_ACL, KMETHOD_CONNECT, KMETHOD_COPY, KMETHOD_DELETE,
      KMETHOD_GET, KMETHOD_HEAD, KMETHOD_LOCK, KMETHOD_MKCALENDAR,
      KMETHOD_MKCOL, KMETHOD_MOVE, KMETHOD_OPTIONS, KMETHOD_POST,
      KMETHOD_PROPFIND, KMETHOD_PROPPATCH, KMETHOD_PUT, KMETHOD_REPORT,
      KMETHOD_TRACE, or KMETHOD_UNLOCK submission method obtained from the
      REQUEST_METHOD header variable. If an unknown method was requested,
      KMETHOD__MAX is used. If no method was specified, the default is
      KMETHOD_GET.
    Applications will usually accept only KMETHOD_GET and KMETHOD_POST, so
        be sure to emit a KHTTP_405 status for undesired methods.
   
  - size_t mime
 
  - The MIME type of the requested file as determined by its
      suffix matched to the suffixes map passed to
      khttp_parsex() or the default ksuffixmap if using
      khttp_parse(). If no suffix is specified, this is the
      defmime value passed to khttp_parsex() or
      KMIME_TEXT_HTML if using khttp_parse(). If the suffix is
      specified but not known, it is the mimesz value passed to
      khttp_parsex() or KMIME__MAX if using khttp_parse().
  - size_t page
 
  - The page index found by looking up pagename in the
      pages array. If pagename is
      not found in pages, pagesz is
      used; if pagename is empty,
      defpage is used.
 
  - char *pagename
 
  - The first component of fullpath or an empty string
      if there is none. It is compared to the elements of the
      pages array to determine which
      page it corresponds to. For example, for a
      fullpath of "/dir/file.html" this
      component corresponds to dir. For
      "/file.html", it's file.
 
  - char *path
 
  - The middle part of fullpath, after stripping
      pagename/ at the beginning and
      .suffix at the end, or an empty string if there is
      none. For example, if the fullpath is
      bar/baz.html, this component is
      baz.
 
  - char *pname
 
  - The script name received in the SCRIPT_NAME header
      variable. For example, for a request to a CGI program
      /var/www/cgi-bin/app.cgi mapped by the web server
      from "https://bsd.lv/app.cgi/file", this would be
      app.cgi. This may not reflect a file system entity
      and it may be an empty string.
  - uint16_t port
 
  - The server's receiving TCP port according to the
      SERVER_PORT header variable, or 80 if that is not
      defined or an invalid number.
  - struct khttpauth rawauth
 
  - The raw authorisation request according to the
      HTTP_AUTHORIZATION header variable passed by the
      web server. This is only set if the web server is not managing
      authorisation itself.
  - char *remote
 
  - The string form of the client's IPv4 or IPv6 address taken from the
      REMOTE_ADDR header variable, or
      "127.0.0.1" if that is not defined. The address format of the
      string is not checked.
  - struct khead *reqmap[KREQU__MAX]
  - Mapping of enum krequ enumeration values to
      reqs parsed from the input stream.
 
  - struct khead *reqs
 
  - List of all HTTP request headers, known via enum
      krequ and not known, parsed from the input stream, or
      NULL if reqsz is 0.
  - size_t reqsz
 
  - Number of request headers in reqs.
 
  - enum kscheme scheme
 
  - The access scheme according to the HTTPS header
      variable, either KSCHEME_HTTPS if
      HTTPS is set and equal to the string
      "on" or KSCHEME_HTTP otherwise.
  - char *suffix
 
  - The suffix part of the last component of fullpath or
      an empty string if there is none. For example, if the
      fullpath is /bar/baz.html,
      this component is html. See the
      mime field for the MIME type parsed from the
    suffix.
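
The relationship between fullpath, pagename, path, suffix and page described above can be sketched as follows. This is a simplified illustration and not the library's implementation: struct path_parts, split_fullpath() and lookup_page() are hypothetical names, the fixed buffer sizes are arbitrary, and exact string comparison against pages is an assumption.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the three path-derived strings. */
struct path_parts {
	char pagename[256];
	char path[256];
	char suffix[256];
};

/* Split a PATH_INFO-style string into page name, middle path and suffix. */
void
split_fullpath(const char *fullpath, struct path_parts *out)
{
	char buf[256];
	char *dot, *slash;

	memset(out, 0, sizeof(*out));
	if (*fullpath == '/')
		fullpath++;
	snprintf(buf, sizeof(buf), "%s", fullpath);

	/* The suffix follows the last dot of the last path component. */
	dot = strrchr(buf, '.');
	if (dot != NULL && strchr(dot, '/') == NULL) {
		snprintf(out->suffix, sizeof(out->suffix), "%s", dot + 1);
		*dot = '\0';
	}

	/* The page name is the first component; the remainder is the path. */
	slash = strchr(buf, '/');
	if (slash != NULL) {
		*slash = '\0';
		snprintf(out->path, sizeof(out->path), "%s", slash + 1);
	}
	snprintf(out->pagename, sizeof(out->pagename), "%s", buf);
}

/* Map a page name to an index: defpage if empty, pagesz if not found. */
size_t
lookup_page(const char *pagename, const char *const *pages,
    size_t pagesz, size_t defpage)
{
	size_t i;

	if (*pagename == '\0')
		return defpage;
	for (i = 0; i < pagesz; i++)
		if (strcmp(pages[i], pagename) == 0)
			return i;
	return pagesz;
}
```

Because an unknown page name maps to pagesz, an application that switches on the page index needs a case for that value, typically one that answers with a 404 status.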
 
 
The application may optionally define
    keys provided to
    khttp_parse()
    and khttp_parsex() as an array of
    struct kvalid. This structure is central to the
    validation of input data. It consists of the following fields: 
  - const char *name
 
  - The field name, i.e., how it appears in the HTML form input name. This
      cannot be NULL. If the field name is an empty
      string and the HTTP message consists of an opaque body (and not key-value
      pairs), then that field will be used to validate the HTTP message body.
      This is useful for KMETHOD_PUT style requests.
  - int (*)(struct kpair *) valid
 
  - A validation function returning non-zero if parsing and validation succeed
      or 0 otherwise. If it is NULL, then no validation
      is performed, the data is considered as valid, and it is bucketed into
      cookiemap or fieldmap as such.
    User-defined valid functions usually set
        the type and parsed fields
        in the key-value pair. When working with binary data or with a key that
        can take different data types, it is acceptable for a validation
        function to set the type to
        KPAIR__MAX and for the application to ignore the
        parsed field and to work directly with
        val and valsz. 
    The validation function is allowed to allocate new memory for
        val: if the val pointer
        changes during validation, the memory pointed to after validation will
        be freed with
        free(3)
        after the data is passed out of the sandbox. 
    These functions are invoked from within a system-specific
        sandbox that may not allow some system calls, for example opening files
        or sockets. In other words, validation functions should only do pure
        computation. 
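
A validation function in the style described above might look like the following sketch. So that the block compiles on its own, it declares minimal stand-ins for enum kpairtype and struct kpair containing only the fields used; a real application would include <kcgi.h> instead. The function name valid_percent() and the 0 to 100 range are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for illustration only; real code includes <kcgi.h>. */
enum kpairtype { KPAIR_INTEGER, KPAIR_STRING, KPAIR_DOUBLE, KPAIR__MAX };

struct kpair {
	char *val;
	size_t valsz;
	enum kpairtype type;
	union parsed { int64_t i; const char *s; double d; } parsed;
};

/* Accept a decimal integer in the range 0..100; pure computation only. */
int
valid_percent(struct kpair *kp)
{
	char *end;
	long long v;

	/* Reject empty values and values with embedded NUL bytes. */
	if (kp->valsz == 0 || strlen(kp->val) != kp->valsz)
		return 0;

	errno = 0;
	v = strtoll(kp->val, &end, 10);
	if (errno != 0 || end == kp->val || *end != '\0')
		return 0;
	if (v < 0 || v > 100)
		return 0;

	/* Record the parsed result for the application. */
	kp->type = KPAIR_INTEGER;
	kp->parsed.i = v;
	return 1;
}
```

The function touches only its argument and calls no file or network routines, which keeps it compatible with the sandbox restrictions mentioned above.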
   
 
The struct kpair
    structure presents the user with fields parsed from input and (possibly)
    matched to the keys variable passed to
    khttp_parse()
    and khttp_parsex(). It is also passed to the
    validation function to be filled in. In this case, the MIME-related fields
    are already filled in and may be examined to determine the method of
    validation. This is useful when validating opaque message bodies. 
  - char *ctype
 
  - The value's MIME content type (e.g., image/jpeg),
      or an empty string if not defined.
  - size_t ctypepos
 
  - If ctype is not an empty string, it is
      looked up in the mimes parameter passed to
      khttp_parsex() or kmimetypes
      if using khttp_parse(). If found, it is set to the
      appropriate index. Otherwise, it's mimesz.
  - char *file
 
  - The value's MIME source filename or an empty string if not defined.
 
  - char *key
 
  - The NUL-terminated key (input) name. If the HTTP message body is opaque
      (e.g., KMETHOD_PUT), then an empty-string key is
      cooked up. The key may contain an arbitrary sequence of non-NUL bytes,
      even non-ASCII bytes, control characters, and shell metacharacters.
  - size_t keypos
 
  - If found in the keys array passed to
      khttp_parse(), the index of the matching key.
      Otherwise keysz.
  - struct kpair *next
 
  - In a cookie or field map, next points to the next
      parsed key-value pair with the same key name. This
      occurs most often in HTML checkbox forms, where many fields may have the
      same name.
 
  - union parsed parsed
 
  - The parsed, validated value. This may be an integer
      i, for a 64-bit signed integer; a string
      s, for a NUL-terminated character string; or a
      double d, for a double-precision floating-point
      number. This is intentionally basic because the resulting data must be
      reliably passed from the parsing context back into the web
      application.
 
  - enum kpairstate state
 
  - The validation state: KPAIR_VALID if the pair was
      successfully validated by a validation function,
      KPAIR_INVALID if a validation function was invoked
      but failed, or KPAIR_UNCHECKED if no validation
      function is defined for this key.
  - enum kpairtype type
 
  - If parsed, the type of data in parsed, otherwise
      KPAIR__MAX.
  - char *val
 
  - The (input) value, which may contain an arbitrary sequence of bytes, even
      NUL bytes, non-ASCII bytes, control characters, and shell metacharacters.
      The byte following the end of the array,
      val[valsz], is always
      guaranteed to be NUL. The validation function may modify the contents. For
      example, for integer numbers and e-mail addresses, trailing whitespace may
      be replaced with NUL bytes.
 
  - size_t valsz
 
  - The length of the val buffer in bytes. It is not a
      string length.
 
  - char *xcode
 
  - The value's MIME content transfer encoding (e.g.,
      base64), or an empty string if not defined.
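
Two points above lend themselves to a short sketch: pairs with the same key are chained through next, and val must be handled by valsz rather than as a C string. The struct below is a stand-in holding only the fields used, so the block compiles without <kcgi.h>; bucket_count() and bucket_has_value() are hypothetical helper names.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for illustration only; real code includes <kcgi.h>. */
struct kpair {
	const char *key;
	const char *val;
	size_t valsz;
	struct kpair *next;
};

/* Count the pairs in one map bucket; an empty bucket is NULL. */
size_t
bucket_count(const struct kpair *head)
{
	size_t n = 0;

	for (; head != NULL; head = head->next)
		n++;
	return n;
}

/*
 * Return non-zero if any pair in the bucket holds exactly these bytes.
 * Values are compared by length, since val may contain NUL bytes.
 */
int
bucket_has_value(const struct kpair *head, const void *v, size_t vsz)
{
	for (; head != NULL; head = head->next)
		if (head->valsz == vsz && memcmp(head->val, v, vsz) == 0)
			return 1;
	return 0;
}
```

In an application, the head passed to these helpers would be a single bucket such as fieldmap[i] for a key index i; that usage is inferred from the field descriptions above.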
 
The struct khttpauth structure holds
    authorisation data if passed by the server. The specific fields are as
    follows. 
  - enum kauth type
 
  - If no data was passed by the server, the type value
      is KAUTH_NONE. Otherwise it's
      KAUTH_BASIC, KAUTH_BEARER,
      or KAUTH_DIGEST.
      KAUTH_UNKNOWN signals that the authorisation type
      was not recognised.
  - int authorised
 
  - For KAUTH_BASIC, KAUTH_BEARER, or
      KAUTH_DIGEST authorisation, this field indicates
      whether all required values were specified for the application to perform
      authorisation.
  - char *digest
 
  - An MD5 digest of the REQUEST_METHOD, SCRIPT_NAME, and PATH_INFO
      header variables and the request body. It is not a NUL-terminated string,
      but an array of exactly MD5_DIGEST_LENGTH bytes.
      Only filled in when HTTP_AUTHORIZATION is
      "digest" and authorised is non-zero.
      Otherwise, it remains NULL. Used in
      khttpdigest_validatehash(3).
  - d
 
  - An anonymous union containing parsed fields per type:
      struct khttpbasic basic for
      KAUTH_BASIC or KAUTH_BEARER, or struct
      khttpdigest digest for KAUTH_DIGEST.
 
If the field for an HTTP authorisation request is
    KAUTH_BASIC or KAUTH_BEARER,
    it will consist of the following for its parsed entities in its
    struct khttpbasic structure: 
  - response
 
  - The hashed and encoded response string for
      KAUTH_BASIC, or an opaque string for KAUTH_BEARER.
 
If the field for an HTTP authorisation request is
    KAUTH_DIGEST, it will consist of the following in
    its struct khttpdigest structure: 
  - alg
 
  - The encoding algorithm, parsed from the possible
      MD5 or MD5-Sess values.
  - qop
 
  - The quality of protection algorithm, which may be unspecified,
      Auth, or Auth-Int.
  - user
 
  - The user coordinating the request.
 
  - uri
 
  - The URI for which the request is designated. (This must match the request
      URI).
 
  - realm
 
  - The request realm.
 
  - nonce
 
  - The server-generated nonce value.
 
  - cnonce
 
  - The (optional) client-generated nonce value.
 
  - response
 
  - The hashed and encoded response string, which entangles fields depending
      on algorithm and quality of protection.
 
  - count
 
  - The (optional) cnonce counter.
 
  - opaque
 
  - The (optional) opaque string requested by the server.
 
 
The struct kopts structure consists of
    tunables for network performance. You probably don't want to use these
    unless you really know what you're doing! 
  - sndbufsz
 
  - The size of the output buffer. The output buffer is a heap-allocated
      region into which writes (via
      khttp_write(3)
      and
      khttp_head(3))
      are buffered instead of being flushed directly to the wire. The buffer is
      flushed when it is full, when the HTTP headers are flushed, and when
      khttp_free(3)
      is invoked. If the buffer size is zero, writes are flushed immediately to
      the wire. If the buffer size is less than zero, it is filled with a
      meaningful default.
 
 
Lastly, the struct khead structure holds
    parsed HTTP headers. 
  - key
 
  - Holds the HTTP header name. This is not the CGI header name (e.g.,
      HTTP_COOKIE), but the reconstituted HTTP name (e.g., Cookie).
  - val
 
  - The opaque header value, which may be an empty string.
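
The idea of a reconstituted HTTP name can be illustrated with a small conversion routine. This is a guess at the general shape of such a mapping and not the library's code: cgi_to_http_name() is a hypothetical helper, and it only handles variables carrying the "HTTP_" prefix (CGI variables such as CONTENT_TYPE, which lack the prefix, are rejected).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert a CGI variable name such as "HTTP_ACCEPT_ENCODING" into an
 * HTTP-style header name such as "Accept-Encoding".
 * Returns 0 if the name lacks the "HTTP_" prefix or the output buffer
 * is too small, and non-zero on success.
 */
int
cgi_to_http_name(const char *cgi, char *out, size_t outsz)
{
	size_t i;
	int start = 1; /* non-zero at the first letter of each word */

	if (strncmp(cgi, "HTTP_", 5) != 0)
		return 0;
	cgi += 5;
	if (strlen(cgi) + 1 > outsz)
		return 0;

	for (i = 0; cgi[i] != '\0'; i++) {
		unsigned char c = (unsigned char)cgi[i];

		if (c == '_') {
			out[i] = '-';
			start = 1;
		} else {
			out[i] = (char)(start ? toupper(c) : tolower(c));
			start = 0;
		}
	}
	out[i] = '\0';
	return 1;
}
```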
 
 
A number of variables are defined in
    <kcgi.h> to simplify
    invocations of the
    khttp_parse()
    family. Applications are strongly encouraged to use these variables (and
    associated enumerations) in khttp_parse() instead of
    overriding them with hand-rolled sets in
    khttp_parsex().
  - kmimetypes
 
  - Indexed list of common MIME types, for example, “text/html”
      and “application/json”. Corresponds to enum
      kmime.
 
  - khttps
 
  - Indexed list of HTTP status code and identifier, for example, “200
      OK”. Corresponds to enum khttp.
 
  - kschemes
 
  - Indexed list of URL schemes, for example, “https” or
      “ftp”. Corresponds to enum
    kscheme.
 
  - kmethods
 
  - Indexed list of HTTP methods, for example, “GET” and
      “POST”. Corresponds to enum
    kmethod.
 
  - ksuffixmap
 
  - Map of MIME types defined in enum kmime to possible
      suffixes. This array is terminated with a MIME type of
      KMIME__MAX and name NULL.
  - ksuffixes
 
  - Indexed list of canonical suffixes for MIME types corresponding to
      enum kmime. This may be a
      NULL pointer for types that have no canonical
      suffix, for example, “application/octet-stream”.
 
khttp_parse() and
    khttp_parsex() return an error code: 
  KCGI_OK 
  - Success (not an error).
 
  KCGI_ENOMEM 
  - Memory failure. This can occur in many places: spawning a child,
      allocating memory, creating sockets, etc.
 
  KCGI_ENFILE 
  - Could not allocate file descriptors.
 
  KCGI_EAGAIN 
  - Could not spawn a child.
 
  KCGI_FORM 
  - Malformed data between parent and child whilst parsing an HTTP request.
      (Internal system error.)
 
  KCGI_SYSTEM 
  - Opaque operating system error.
 
 
On failure, the calling application should terminate as
    soon as possible. Applications should
    not try to write an
    HTTP 505 error or similar, but allow the web server to handle the empty CGI
    response on its own. 
The khttp_parse() and
    khttp_parsex() functions were written by
    Kristaps Dzonsons
    <kristaps@bsd.lv>. 
 
 