
Appendix D The Libdico Library

D.1 Strategies

Editor’s note:

The information in this node may be obsolete or otherwise inaccurate. This message will disappear once this node is revised.

struct dico_strategy {
    char *name;
    char *descr;
    dico_select_t sel;
    void *closure;
    int is_default;
};
Function: dico_strategy_t dico_strategy_dup (const dico_strategy_t strat)
Function: dico_strategy_t dico_strategy_find (const char *name)
Function: int dico_strategy_add (const dico_strategy_t strat)
Function: dico_iterator_t dico_strategy_iterator (void)
Function: void dico_strategy_iterate (dico_list_iterator_t itr, void *data)
Function: size_t dico_strategy_count (void)
Function: int dico_set_default_strategy (const char *name)
Function: const dico_strategy_t dico_get_default_strategy (void)
Function: int dico_strategy_is_default_p (dico_strategy_t strat)

D.2 argcv

Editor’s note:

The information in this node may be obsolete or otherwise inaccurate. This message will disappear once this node is revised.

Enumeration: dico_argcv_quoting_style
Variable: enum dico_argcv_quoting_style dico_argcv_quoting_style
Function: int dico_argcv_get (const char *command, const char *delim, const char *cmnt, int *argc, char ***argv)
Function: int dico_argcv_get_n (const char *command, int len, const char *delim, const char *cmnt, int *argc, char ***argv)
Function: int dico_argcv_get_np (const char *command, int len, const char *delim, const char *cmnt, int flags, int *argc, char ***argv, char **endp)
Function: int dico_argcv_string (int argc, const char **argv, char **string)
Function: void dico_argcv_free (int argc, char **argv)
Function: void dico_argv_free (char **argv)
Function: int dico_argcv_unquote_char (int c)
Function: int dico_argcv_quote_char (int c)
Function: size_t dico_argcv_quoted_length (const char *str, int *quote)
Function: void dico_argcv_unquote_copy (char *dst, const char *src, size_t n)
Function: void dico_argcv_quote_copy (char *dst, const char *src)
Function: void dico_argcv_remove (int *argc, char ***argv, int (*sel) (const char *, void *), void *data)

D.3 Lists

Editor’s note:

The information in this node may be obsolete or otherwise inaccurate. This message will disappear once this node is revised.

Type: dico_list_t
Type: dico_iterator_t
Function Type: dico_list_iterator_t
typedef int (*dico_list_iterator_t)(void *item, void *data);
Function Type: dico_list_comp_t
typedef int (*dico_list_comp_t)(const void *, const void *);
Function: dico_list_t dico_list_create (void)
Function: void dico_list_destroy (dico_list_t *list, dico_list_iterator_t free, void *data)
Function: void dico_list_iterate (dico_list_t list, dico_list_iterator_t itr, void *data)
Function: void * dico_list_item (dico_list_t list, size_t n)
Function: size_t dico_list_count (dico_list_t list)
Function: int dico_list_append (dico_list_t list, void *data)
Function: int dico_list_prepend (dico_list_t list, void *data)
Function: int dico_list_push (dico_list_t list, void *data)
Function: int dico_list_insert_sorted (dico_list_t list, void *data, dico_list_comp_t cmp)
Function: dico_list_t dico_list_intersect (dico_list_t a, dico_list_t b, dico_list_comp_t cmp)
Function: int dico_list_intersect_p (dico_list_t a, dico_list_t b, dico_list_comp_t cmp)
Function: void * dico_list_pop (dico_list_t list)
Function: void * dico_list_locate (dico_list_t list, void *data, dico_list_comp_t cmp)
Function: void * dico_list_remove (dico_list_t list, void *data, dico_list_comp_t cmp)
Function: void * dico_iterator_current (dico_iterator_t itr)
Function: dico_iterator_t dico_iterator_create (dico_list_t list)
Function: void dico_iterator_destroy (dico_iterator_t *pitr)
Function: void * dico_iterator_first (dico_iterator_t itr)
Function: void * dico_iterator_next (dico_iterator_t itr)
Function: void * dico_iterator_remove_current (dico_iterator_t itr)
Function: void dico_iterator_set_data (dico_iterator_t itr, void *data)

D.4 Associative lists

Editor’s note:

The information in this node may be obsolete or otherwise inaccurate. This message will disappear once this node is revised.

struct dico_assoc {
    char *key;
    char *value;
};
Type: dico_assoc_list_t
Function: dico_assoc_list_t dico_assoc_create (void)
Function: dico_assoc_list_t dico_assoc_dup(dico_assoc_list_t src)
Function: void dico_assoc_destroy (dico_assoc_list_t *passoc)
Function: int dico_assoc_clear (dico_assoc_list_t assoc)
Function: int dico_assoc_add (dico_assoc_list_t assoc, const char *key, const char *value)
Function: int dico_assoc_append(dico_assoc_list_t assoc, const char *key, const char *value)
Function: const char * dico_assoc_find_n( dico_assoc_list_t assoc, const char *key, size_t n)
Function: const char * dico_assoc_find ( dico_assoc_list_t assoc, const char *key)
Function: void dico_assoc_remove_n( dico_assoc_list_t assoc, const char *key, size_t n)
Function: void dico_assoc_remove ( dico_assoc_list_t assoc, const char *key)
Function: size_t dico_assoc_count(dico_assoc_list_t assoc)
Function: dico_iterator_t dico_assoc_iterator( dico_assoc_list_t assoc)

D.5 Diagnostics Functions

Editor’s note:

The information in this node may be obsolete or otherwise inaccurate. This message will disappear once this node is revised.

L_DEBUG
L_INFO
L_NOTICE
L_WARN
L_ERR
L_CRIT
L_ALERT
L_EMERG
Variable: const char * dico_program_name
Variable: const char * dico_invocation_name
Function: void dico_set_program_name (char *name)
Function Type: void dico_log_printer_t ( int lvl, int exitcode, int errcode, const char *fmt, va_list ap)
Function: void dico_set_log_printer (dico_log_printer_t prt)
Function: void dico_vlog ( int lvl, int errcode, const char *fmt, va_list ap)
Function: void dico_log (int lvl, int errcode, const char *fmt, ...)
Function: void dico_die (int exitcode, int lvl, int errcode, char *fmt, ...)
Function: int dico_str_to_diag_level (const char *str)
Function: dico_stream_t dico_log_stream_create (int level)

D.6 Filter

Define: FILTER_ENCODE
Define: FILTER_DECODE
Function Type: filter_xcode_t
typedef int (*filter_xcode_t) (const char *, size_t,
                               char *, size_t, size_t *, size_t, size_t *);
Function: dico_stream_t filter_stream_create ( dico_stream_t str, size_t min_level, size_t max_line_length, filter_xcode_t xcode, int mode)
Function: dico_stream_t dico_codec_stream_create ( const char *encoding, int mode, dico_stream_t transport)
Function: dico_stream_t dico_base64_stream_create ( dico_stream_t str, int mode)
Function: dico_stream_t dico_qp_stream_create ( dico_stream_t str, int mode)
Function: int dico_base64_input (char c)
Function: int dico_base64_decode ( const char *iptr, size_t isize, char *optr, size_t osize, size_t *pnbytes, size_t line_max, size_t *pline_len)
Function: int dico_base64_encode ( const char *iptr, size_t isize, char *optr, size_t osize, size_t *pnbytes, size_t line_max, size_t *pline_len)
Function: int dico_qp_decode ( const char *iptr, size_t isize, char *optr, size_t osize, size_t *pnbytes, size_t line_max, size_t *pline_len)
Function: int dico_qp_encode ( const char *iptr, size_t isize, char *optr, size_t osize, size_t *pnbytes, size_t line_max, size_t *pline_len)

D.7 parseopt

Editor’s note:

The information in this node may be obsolete or otherwise inaccurate. This message will disappear once this node is revised.

DICO_PARSEOPT_PARSE_ARGV0
DICO_PARSEOPT_PERMUTE
Enumeration: dico_opt_type
dico_opt_null
dico_opt_bool
dico_opt_bitmask
dico_opt_bitmask_rev
dico_opt_long
dico_opt_string
dico_opt_enum
dico_opt_const
dico_opt_const_string
struct: dico_option
struct dico_option {
    const char *name;
    size_t len;
    enum dico_opt_type type;
    void *data;
    union {
        long value;
        const char **enumstr;
    } v;
    int (*func) (struct dico_option *, const char *);
};
Macro: DICO_OPTSTR name
Function: int dico_parseopt (struct dico_option *opt, int argc, char **argv, int flags, int *index)

D.8 stream

Editor’s note:

The information in this node may be obsolete or otherwise inaccurate. This message will disappear once this node is revised.

Function: int dico_stream_create (dico_stream_t *pstream, int flags, void *data)
DICO_STREAM_READ
DICO_STREAM_WRITE
DICO_STREAM_SEEK
Function: int dico_stream_open (dico_stream_t stream)
Function: void dico_stream_set_open ( dico_stream_t stream, int (*openfn) (void *, int))
Function: void dico_stream_set_seek ( dico_stream_t stream, int (*fun_seek) (void *, off_t, int, off_t *))
Function: void dico_stream_set_size ( dico_stream_t stream, int (*sizefn) (void *, off_t *))
Function: void dico_stream_set_read ( dico_stream_t stream, int (*readfn) (void *, char *, size_t, size_t *))
Function: void dico_stream_set_write ( dico_stream_t stream, int (*writefn) (void *, const char *, size_t, size_t *))
Function: void dico_stream_set_flush ( dico_stream_t stream, int (*flushfn) (void *))
Function: void dico_stream_set_close ( dico_stream_t stream, int (*closefn) (void *))
Function: void dico_stream_set_destroy ( dico_stream_t stream, int (*destroyfn) (void *))
Function: void dico_stream_set_ioctl ( dico_stream_t stream, int (*ctl) (void *, int, void *))
Function: void dico_stream_set_error_string ( dico_stream_t stream, const char *(*error_string) (void *, int))
Function: int dico_stream_set_buffer ( dico_stream_t stream, enum dico_buffer_type type, size_t size)
Enumeration: dico_buffer_type
dico_buffer_none
dico_buffer_line
dico_buffer_full
Function: off_t dico_stream_seek ( dico_stream_t stream, off_t offset, int whence)
DICO_SEEK_SET
DICO_SEEK_CUR
DICO_SEEK_END
Function: int dico_stream_size (dico_stream_t stream, off_t *psize)
Function: int dico_stream_read_unbuffered ( dico_stream_t stream, void *buf, size_t size, size_t *pread)
Function: int dico_stream_write_unbuffered ( dico_stream_t stream, const void *buf, size_t size, size_t *pwrite)
Function: int dico_stream_read ( dico_stream_t stream, void *buf, size_t size, size_t *pread)
Function: int dico_stream_readln ( dico_stream_t stream, char *buf, size_t size, size_t *pread)
Function: int dico_stream_getdelim ( dico_stream_t stream, char **pbuf, size_t *psize, int delim, size_t *pread)
Function: int dico_stream_getline ( dico_stream_t stream, char **pbuf, size_t *psize, size_t *pread)
Function: int dico_stream_write ( dico_stream_t stream, const void *buf, size_t size)
Function: int dico_stream_writeln ( dico_stream_t stream, const char *buf, size_t size)
Function: int dico_stream_ioctl ( dico_stream_t stream, int code, void *ptr)
Function: const char * dico_stream_strerror ( dico_stream_t stream, int rc)
Function: int dico_stream_last_error (dico_stream_t stream)
Function: void dico_stream_clearerr (dico_stream_t stream)
Function: int dico_stream_eof (dico_stream_t stream)
Function: int dico_stream_flush (dico_stream_t stream)
Function: int dico_stream_close (dico_stream_t stream)
Function: void dico_stream_destroy (dico_stream_t *stream)
Function: off_t dico_stream_bytes_in (dico_stream_t stream)
Function: off_t dico_stream_bytes_out (dico_stream_t stream)

D.9 url

Editor’s note:

The information in this node may be obsolete or otherwise inaccurate. This message will disappear once this node is revised.

struct: dico_url
#define DICO_REQUEST_DEFINE 0
#define DICO_REQUEST_MATCH 1

struct dico_request {
    int type;
    char *word;
    char *database;
    char *strategy;
    unsigned long n;
};

struct dico_url {
    char *string;
    char *proto;
    char *host;
    int port;
    char *path;
    char *user;
    char *passwd;
    dico_assoc_list_t args;
    struct dico_request req;
};
Pointer: dico_url_t
Function: int dico_url_parse (dico_url_t *purl, const char *str)
Function: void dico_url_destroy (dico_url_t *purl)
Function: const char * dico_url_get_arg ( dico_url_t url, const char *argname)
Function: char * dico_url_full_path (dico_url_t url)

D.10 UTF-8

This section describes functions for handling UTF-8 strings. A UTF-8 character can be represented either as a multi-byte character or a wide character.

A multi-byte character is a char * pointing to one or more bytes representing the UTF-8 character.

A wide character is an unsigned value identifying the character.

In the discussion below, a sequence of one or more multi-byte characters is called a multi-byte string. Multi-byte strings terminate with a single ‘nul’ (0) character.

A sequence of one or more wide characters is called a wide character string. Such strings terminate with a single 0 value.

D.10.1 Character sizes

Function: size_t utf8_char_width (const unsigned char *cp)

Returns length in bytes of the UTF-8 character representation pointed to by cp.

Function: size_t utf8_strlen (const char *str)

Returns number of UTF-8 characters (not bytes) in str.

Function: size_t utf8_wc_strlen (const unsigned *s)

Returns number of wide characters in the wide character string s.
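
To illustrate what the first two functions compute: the width of a UTF-8 character can be read off its lead byte, and the character count follows by stepping over the string one width at a time. The sketch below is self-contained and is not the libdico implementation. The names sketch_char_width and sketch_strlen are hypothetical, and the code assumes well-formed input of at most four bytes per character.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libdico: width in bytes of the
   UTF-8 character whose lead byte is at cp.  Assumes well-formed
   input; a stray continuation byte is counted as one byte. */
static size_t
sketch_char_width(const unsigned char *cp)
{
    if (*cp < 0x80)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((*cp & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((*cp & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((*cp & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 1;                   /* malformed byte: skip it */
}

/* Hypothetical helper: number of characters (not bytes) in str. */
static size_t
sketch_strlen(const char *str)
{
    const unsigned char *p = (const unsigned char *) str;
    size_t n = 0;

    while (*p) {
        size_t w = sketch_char_width(p);

        /* Never step past the terminator of a truncated sequence. */
        while (w-- && *p)
            p++;
        n++;
    }
    return n;
}
```

With this sketch, a string holding two 2-byte characters and one ASCII letter occupies five bytes but counts as three characters.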

D.10.2 Iterating over UTF-8 strings

struct: utf8_iterator

A data type for iterating over a string of UTF-8 characters. Defined as:

struct utf8_iterator {
    char *string;
    char *curptr;
    unsigned curwidth;
};

When iterating over characters in string, curptr points to the current character, and curwidth holds its length in bytes.

Function: int utf8_iter_isascii (struct utf8_iterator itr)

Returns ‘true’ if itr points to an ASCII character.

Function: int utf8_iter_end_p (struct utf8_iterator *itr)

Returns ‘true’ if itr has reached the end of the input string.

Function: int utf8_iter_first (struct utf8_iterator *itr, unsigned char *str)

Initializes itr for iterating over the string str. On success, positions itr.curptr to the first character of the input string, sets itr.curwidth to the length of that character in bytes, and returns ‘0’. If str is an empty string, returns ‘1’.

Function: int utf8_iter_next (struct utf8_iterator *itr)

Positions itr.curptr to the next character from the input string. Sets itr.curwidth to the length of that character in bytes.
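
The iteration protocol described above can be sketched with a stand-in structure that carries the same three fields. Everything in this block is hypothetical (struct sketch_iterator, the sketch_iter_* functions and lead_width); it is not libdico code. In particular, the return value of the "next" step is an assumption, since this manual does not document it, and the input is assumed to be well-formed UTF-8.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct utf8_iterator. */
struct sketch_iterator {
    const char *string;     /* start of the input string */
    const char *curptr;     /* current character */
    unsigned curwidth;      /* its length in bytes */
};

/* Width of the character at p, derived from its lead byte. */
static unsigned
lead_width(const char *p)
{
    unsigned char c = (unsigned char) *p;

    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;               /* malformed byte: treat as one byte */
}

/* True when the iterator stands on the terminating nul. */
static int
sketch_iter_end_p(const struct sketch_iterator *itr)
{
    return *itr->curptr == 0;
}

/* Returns 0 on success, 1 if str is empty. */
static int
sketch_iter_first(struct sketch_iterator *itr, const char *str)
{
    itr->string = itr->curptr = str;
    if (*str == 0) {
        itr->curwidth = 0;
        return 1;
    }
    itr->curwidth = lead_width(str);
    return 0;
}

/* Returns 0 on success, 1 at end of string (assumed convention). */
static int
sketch_iter_next(struct sketch_iterator *itr)
{
    if (sketch_iter_end_p(itr))
        return 1;
    itr->curptr += itr->curwidth;
    if (*itr->curptr == 0) {
        itr->curwidth = 0;
        return 1;
    }
    itr->curwidth = lead_width(itr->curptr);
    return 0;
}

/* Usage: count the characters that need more than one byte. */
static size_t
sketch_count_non_ascii(const char *str)
{
    struct sketch_iterator itr;
    size_t n = 0;
    int rc;

    for (rc = sketch_iter_first(&itr, str); rc == 0;
         rc = sketch_iter_next(&itr))
        if (itr.curwidth > 1)
            n++;
    return n;
}
```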

D.10.3 Conversions

The following functions convert between the two string representations.

Function: int utf8_mbtowc_internal (void *data, int (*read) (void*), unsigned int *pwc)

Internal function for converting a single UTF-8 character to a corresponding wide character representation. The character to convert is obtained by calling the function pointed to by read with data as its only argument. If that call returns a non-positive value, the function sets errno to ‘ENODATA’ and returns -1.

Function: int utf8_mbtowc (unsigned int *pwc, const char *r, size_t len)

Converts first len characters from the multi-byte string r to wide character representation. On success, returns 0 and stores the result in pwc. The result pointer is allocated using malloc(3).

On error (invalid multi-byte sequence encountered), returns -1 and sets errno to ‘EILSEQ’.

Function: int utf8_wctomb (unsigned char *r, unsigned int wc)

Stores the UTF-8 representation of the Unicode character wc in r[0..5]. Returns the number of bytes stored. If wc is out of range, returns -1 and sets errno to ‘EILSEQ’.
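
The wide-to-multi-byte direction amounts to spreading the bits of the code point over a lead byte and a number of continuation bytes. The following is a minimal sketch under stated assumptions, not the libdico implementation: sketch_wctomb is a hypothetical name, and the sketch is restricted to the modern Unicode range U+0000..U+10FFFF, so it writes at most four bytes and rejects surrogates, whereas the description above allows for up to six.

```c
#include <errno.h>

/* Hypothetical sketch: store the UTF-8 encoding of wc in r and return
   the number of bytes written, or -1 with errno set to EILSEQ. */
static int
sketch_wctomb(unsigned char *r, unsigned int wc)
{
    if (wc < 0x80) {
        r[0] = (unsigned char) wc;
        return 1;
    }
    if (wc < 0x800) {
        r[0] = (unsigned char) (0xC0 | (wc >> 6));
        r[1] = (unsigned char) (0x80 | (wc & 0x3F));
        return 2;
    }
    if (wc >= 0xD800 && wc <= 0xDFFF) {
        errno = EILSEQ;         /* surrogates are not characters */
        return -1;
    }
    if (wc < 0x10000) {
        r[0] = (unsigned char) (0xE0 | (wc >> 12));
        r[1] = (unsigned char) (0x80 | ((wc >> 6) & 0x3F));
        r[2] = (unsigned char) (0x80 | (wc & 0x3F));
        return 3;
    }
    if (wc <= 0x10FFFF) {
        r[0] = (unsigned char) (0xF0 | (wc >> 18));
        r[1] = (unsigned char) (0x80 | ((wc >> 12) & 0x3F));
        r[2] = (unsigned char) (0x80 | ((wc >> 6) & 0x3F));
        r[3] = (unsigned char) (0x80 | (wc & 0x3F));
        return 4;
    }
    errno = EILSEQ;             /* beyond the range this sketch handles */
    return -1;
}
```

For example, the euro sign U+20AC comes out as the three bytes 0xE2 0x82 0xAC.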

Function: int utf8_wc_to_mbstr (const unsigned *word, size_t wordlen, char **retptr)

Converts first wordlen characters of the wide character string word to multi-byte representation. The result is returned in retptr. It is allocated using malloc(3).

Returns 0 on success. On error, returns -1 and sets errno to one of the following values:

ENOMEM

Not enough memory to allocate the return buffer.

EILSEQ

An invalid wide character is encountered.

Function: int utf8_mbstr_to_wc (const char *str, unsigned **wptr, size_t *plen)

Converts a multi-byte string from str to its wide character representation.

The result is returned in wptr. It is allocated using malloc(3).

Returns 0 on success. On error, returns -1 and sets errno to one of the following values:

ENOMEM

Not enough memory to allocate the return buffer.

EILSEQ

An invalid wide character is encountered.

Function: int utf8_mbstr_to_norm_wc (const char *str, unsigned **wptr, size_t *plen)

Converts a multi-byte string from str to its wide character representation, replacing each run of one or more whitespace characters with a single space character (ASCII 32).

The result is returned in wptr. It is allocated using malloc(3).

Returns 0 on success. On error, returns -1 and sets errno to one of the following values:

ENOMEM

Not enough memory to allocate the return buffer.

EILSEQ

An invalid wide character is encountered.
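
The whitespace normalization mentioned above can be pictured as an in-place pass over a wide character string. The block below is a hedged sketch, not libdico code: sketch_is_space and sketch_normalize_ws are hypothetical names, only the six ASCII whitespace characters are treated as whitespace, and no trimming of leading or trailing whitespace is attempted, since this manual does not say whether the library does either.

```c
#include <stddef.h>

/* Assumption: whitespace means space, \t, \n, \v, \f and \r. */
static int
sketch_is_space(unsigned wc)
{
    return wc == ' ' || (wc >= '\t' && wc <= '\r');
}

/* Hypothetical helper: collapse each run of whitespace in the
   0-terminated wide string ws into one space (ASCII 32), in place.
   Returns the new length in wide characters. */
static size_t
sketch_normalize_ws(unsigned *ws)
{
    size_t in = 0, out = 0;

    while (ws[in]) {
        if (sketch_is_space(ws[in])) {
            ws[out++] = ' ';
            while (sketch_is_space(ws[in]))
                in++;           /* skip the rest of the run */
        } else
            ws[out++] = ws[in++];
    }
    ws[out] = 0;
    return out;
}
```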

D.10.4 Comparing UTF-8 strings

Function: int utf8_symcmp (unsigned char *a, unsigned char *b)

Compares first UTF-8 characters from a and b.

Function: int utf8_symcasecmp (unsigned char *a, unsigned char *b)

Compares first UTF-8 characters from a and b, ignoring the case.

Function: int utf8_strcasecmp (unsigned char *a, unsigned char *b)

Compares the two UTF-8 strings a and b, ignoring the case of the characters.

Function: int utf8_strncasecmp (unsigned char *a, unsigned char *b, size_t maxlen)

Compares at most maxlen first characters from the two UTF-8 strings a and b, ignoring the case of the characters.

Function: int utf8_wc_strcmp (const unsigned *a, const unsigned *b)

Compares the two wide character strings a and b.

Function: int utf8_wc_strncmp (const unsigned *a, const unsigned *b, size_t n)

Compares at most n first characters from the wide character strings a and b.

Function: int utf8_wc_strcasecmp (const unsigned *a, const unsigned *b)

Compares the two wide character strings a and b, ignoring the case of the characters.

Function: int utf8_wc_strncasecmp (const unsigned *a, const unsigned *b, size_t n)

Compares at most first n characters of the two wide character strings a and b, ignoring the case.

D.10.5 Character lookups

Function: unsigned * utf8_wc_strchr (const unsigned *str, unsigned chr)

Returns a pointer to the first occurrence of wide character chr in string str, or ‘NULL’, if no such character is encountered.

Function: unsigned * utf8_wc_strchr_ci (const unsigned *str, unsigned chr)

Returns a pointer to the first occurrence of wide character chr (case-insensitive) in string str, or ‘NULL’, if no such character is encountered.

Function: const unsigned * utf8_wc_strstr (const unsigned *text, const unsigned *pattern)

Finds the first occurrence of pattern in text. Returns a pointer to the beginning of pattern in text. Returns NULL if no occurrence was found.

D.10.6 Functions for converting UTF-8 characters

Function: unsigned utf8_wc_toupper (unsigned wc)

Converts wide character wc to upper case, if possible. Returns wc, if it cannot be converted.

Function: int utf8_toupper (char *s, size_t len)

Converts first len bytes of the UTF-8 string s to upper case, if possible.

Function: unsigned utf8_wc_tolower (unsigned wc)

Converts wide character wc to lower case, if possible. Returns wc, if it cannot be converted.

Function: int utf8_tolower (char *s, size_t len)

Converts first len bytes of the UTF-8 string s to lower case, if possible.

Function: void utf8_wc_strupper (unsigned *str)

Converts each character from the wide character string str to uppercase, if applicable.

Function: void utf8_wc_strlower (unsigned *str)

Converts each character from the wide character string str to lowercase, if applicable.

D.10.7 Additional functions

Function: unsigned * utf8_wc_strdup (const unsigned *s)

Returns a pointer to a new wide character string which is a duplicate of the string s. Memory for the new string is obtained with malloc(3), and can be freed with free(3).

Function: unsigned * utf8_wc_quote (const unsigned *s)

Quotes occurrences of backslash and double-quote in s by prefixing each of them with a backslash. The return value is allocated using malloc(3).

Function: int utf8_quote (const char *str, char **sptr)

Quotes occurrences of backslash and double-quote in str by prefixing each of them with a backslash. On success, stores the result (allocated with malloc(3)) in sptr and returns 0. On error, returns -1 and sets errno to one of the following:

ENOMEM

Not enough memory to allocate the return buffer.

EILSEQ

An invalid wide character is encountered.
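
The quoting rule shared by the two functions above is simple enough to sketch on a plain byte string. Backslash and double-quote are ASCII, and no byte of a multi-byte UTF-8 sequence has an ASCII value, so a byte-wise pass is safe on valid UTF-8. The name sketch_quote is hypothetical and the code is not taken from libdico; unlike the library function, it performs no UTF-8 validation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: copy str into a malloc'ed buffer, prefixing
   every backslash and double-quote with a backslash.  Stores the
   buffer in *sptr and returns 0, or returns -1 if malloc fails. */
static int
sketch_quote(const char *str, char **sptr)
{
    size_t extra = 0;
    const char *p;
    char *out, *q;

    /* First pass: count the characters that need a backslash. */
    for (p = str; *p; p++)
        if (*p == '\\' || *p == '"')
            extra++;

    out = malloc(strlen(str) + extra + 1);
    if (!out)
        return -1;

    /* Second pass: copy, inserting the backslashes. */
    for (p = str, q = out; *p; p++) {
        if (*p == '\\' || *p == '"')
            *q++ = '\\';
        *q++ = *p;
    }
    *q = 0;
    *sptr = out;
    return 0;
}
```

The caller owns the returned buffer and releases it with free(3).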

Function: size_t utf8_wc_hash_string (const unsigned *ws, size_t n)

Compute a hash code of ws for a symbol table of n buckets.

Function: int dico_levenshtein_distance (const char *a, const char *b, int flags)

Computes the Levenshtein distance between UTF-8 strings a and b. The flags argument is either 0 or a bitwise OR of one or more of the flags listed below:

0

Default - compute Levenshtein distance, treating both arguments literally.

DICO_LEV_NORM

Treat runs of one or more whitespace characters as a single space character (ASCII 32).

DICO_LEV_DAMERAU

Compute Damerau-Levenshtein distance. This distance takes into account transpositions.
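
For readers unfamiliar with the two distances, here is a minimal dynamic-programming sketch. It is not the libdico implementation and makes several assumptions: it compares bytes rather than UTF-8 characters, the flag SKETCH_LEV_DAMERAU and the function names are hypothetical, and the transposition step implements the restricted variant usually called optimal string alignment, which may differ from what the library computes.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_LEV_DAMERAU 0x1  /* hypothetical flag for this sketch */

static int
min3(int a, int b, int c)
{
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Hypothetical sketch: edit distance between byte strings a and b.
   Returns the distance, or -1 if memory cannot be allocated. */
static int
sketch_levenshtein(const char *a, const char *b, int flags)
{
    size_t la = strlen(a), lb = strlen(b), i, j;
    size_t cols = lb + 1;
    int *d = malloc((la + 1) * cols * sizeof(d[0]));
    int result;

    if (!d)
        return -1;
    /* d[i][j] is the distance between the first i bytes of a
       and the first j bytes of b. */
    for (i = 0; i <= la; i++)
        d[i * cols] = (int) i;
    for (j = 0; j <= lb; j++)
        d[j] = (int) j;

    for (i = 1; i <= la; i++) {
        for (j = 1; j <= lb; j++) {
            int cost = a[i - 1] != b[j - 1];
            int v = min3(d[(i - 1) * cols + j] + 1,          /* deletion */
                         d[i * cols + j - 1] + 1,            /* insertion */
                         d[(i - 1) * cols + j - 1] + cost);  /* substitution */

            if ((flags & SKETCH_LEV_DAMERAU) && i > 1 && j > 1
                && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                int t = d[(i - 2) * cols + j - 2] + 1;       /* transposition */
                if (t < v)
                    v = t;
            }
            d[i * cols + j] = v;
        }
    }
    result = d[la * cols + lb];
    free(d);
    return result;
}
```

The difference between the two modes shows on a swapped pair of letters: without the flag the swap costs two substitutions, with it a single transposition.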

Function: int dico_soundex (const char *word, char code[DICO_SOUNDEX_SIZE])

Computes the Soundex code for the given word. The code is stored in code. Returns 0 on success, -1 if word is not a valid UTF-8 string.

Define: DICO_SOUNDEX_SIZE

This macro definition expands to the size of Soundex code buffer, including the terminal zero.

Note that dico_soundex silently ignores all characters except Latin letters.
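
As background, the classical (American) Soundex scheme keeps the first letter of the word and encodes the following consonants as up to three digits, padding with zeros. The sketch below follows that classical scheme and is not taken from libdico, whose exact rules may differ. The names sketch_soundex and SKETCH_SOUNDEX_SIZE are hypothetical, the buffer size of five bytes is an assumption, and the code works on ASCII letters only, skipping every other byte.

```c
#include <stddef.h>

#define SKETCH_SOUNDEX_SIZE 5   /* assumed: four code characters + nul */

/* Hypothetical sketch of classical Soundex.  Bytes other than the
   ASCII letters A-Z and a-z are silently ignored. */
static void
sketch_soundex(const char *word, char code[SKETCH_SOUNDEX_SIZE])
{
    /* Digit for each letter A..Z; '0' marks letters that get no digit. */
    static const char digits[] = "01230120022455012623010202";
    size_t n = 0;
    char prev = 0;

    for (; *word && n < SKETCH_SOUNDEX_SIZE - 1; word++) {
        unsigned char c = (unsigned char) *word;
        char d;

        if (c >= 'a' && c <= 'z')
            c -= 'a' - 'A';             /* fold to upper case */
        if (c < 'A' || c > 'Z')
            continue;                   /* not a Latin letter */
        d = digits[c - 'A'];
        if (n == 0)
            code[n++] = (char) c;       /* first letter is kept as is */
        else if (d != '0' && d != prev)
            code[n++] = d;              /* new consonant group */
        if (c != 'H' && c != 'W')
            prev = d;                   /* H and W do not split a group */
    }
    while (n < SKETCH_SOUNDEX_SIZE - 1)
        code[n++] = '0';                /* pad short codes with zeros */
    code[n] = 0;
}
```

Under these rules "Robert" and "Rupert" both map to R163, which is the property that makes Soundex useful for approximate matching.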

D.11 util

Editor’s note:

The information in this node may be obsolete or otherwise inaccurate. This message will disappear once this node is revised.

Function: char * dico_full_file_name (const char *dir, const char *file)
Function: size_t dico_trim_nl (char *buf)
Function: size_t dico_trim_ws (char *buf)

D.12 xlat

Editor’s note:

The information in this node may be obsolete or otherwise inaccurate. This message will disappear once this node is revised.

struct: xlat_tab
struct xlat_tab {
    char *string;
    int num;
};
Function: int xlat_string (struct xlat_tab *tab, const char *string, size_t len, int flags, int *result)
Function: int xlat_c_string (struct xlat_tab *tab, const char *string, int flags, int *result)
XLAT_ICASE
