utf8 (GNU Dico Manual)

D.10 UTF-8

This section describes functions for handling UTF-8 strings. A UTF-8 character can be represented either as a multibyte character or as a wide character.

A multibyte character is a char * pointing to the sequence of one or more bytes that represents the UTF-8 character.

A wide character is an unsigned value identifying the character.
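
The relation between the two representations can be illustrated with a minimal sketch. It assumes that the unsigned value is the Unicode code point of the character. The function sketch_mb_to_wc is a hypothetical helper written for this illustration; it is not one of the library's functions, and it does not reject overlong encodings or surrogate values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Dico API.
   Decode the UTF-8 character starting at MB and store the resulting
   value (assumed to be the Unicode code point) in *WC.
   Return the number of bytes consumed, or 0 if the sequence is malformed. */
size_t
sketch_mb_to_wc(const char *mb, unsigned *wc)
{
    const unsigned char *p = (const unsigned char *) mb;
    size_t len, i;
    unsigned c;

    assert(mb != NULL && wc != NULL);

    /* The lead byte determines the length of the sequence. */
    if (p[0] < 0x80) {
        c = p[0];
        len = 1;
    } else if ((p[0] & 0xE0) == 0xC0) {
        c = p[0] & 0x1F;
        len = 2;
    } else if ((p[0] & 0xF0) == 0xE0) {
        c = p[0] & 0x0F;
        len = 3;
    } else if ((p[0] & 0xF8) == 0xF0) {
        c = p[0] & 0x07;
        len = 4;
    } else
        return 0;

    /* Each continuation byte has the form 10xxxxxx and adds 6 bits. */
    for (i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (p[i] & 0x3F);
    }

    *wc = c;
    return len;
}
```

For instance, given the two-byte multibyte character "\xC3\xA9", the sketch stores the value 0xE9 in *wc and returns 2.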

In the discussion below, a sequence of one or more multibyte characters is called a multibyte string. Multibyte strings terminate with a single ‘nul’ (0) character.

A sequence of one or more wide characters is called a wide character string. Such strings terminate with a single 0 value.
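
The difference between the two kinds of strings shows up when measuring their length. The following is a minimal sketch, assuming well-formed UTF-8 input and wide characters stored as unsigned values. The names sketch_mb_strlen and sketch_wc_strlen are hypothetical and are not the library's functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Dico API.
   Count the characters in the nul-terminated multibyte string S.
   Continuation bytes (10xxxxxx) do not start a new character,
   so only the remaining bytes are counted.  Assumes valid UTF-8. */
size_t
sketch_mb_strlen(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Hypothetical helper, not part of the Dico API.
   Count the characters in the wide character string W,
   which terminates with a single 0 value. */
size_t
sketch_wc_strlen(const unsigned *w)
{
    size_t n = 0;

    assert(w != NULL);
    while (w[n] != 0)
        n++;
    return n;
}
```

In a multibyte string the number of characters may be smaller than the number of bytes, whereas in a wide character string each element holds exactly one character.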

This document was generated on September 4, 2020 using makeinfo.

Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.
