6.5 – UTF-8 Support

This library provides basic support for UTF-8 encoding. It provides all its functions inside the table utf8. This library does not provide any support for Unicode other than the handling of the encoding. Any operation that needs the meaning of a character, such as character classification, is outside its scope.

Unless stated otherwise, all functions that expect a byte position as a parameter assume that the given position is either the start of a byte sequence or one plus the length of the subject string. As in the string library, negative indices count from the end of the string.
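
As a small sketch of the byte-position convention above: the sample string is only an illustration, written with the \u{XXX} escape so that it does not depend on the encoding of the source file. Positions are byte offsets, so a position that lands inside a multi-byte sequence is not a valid starting point.

```lua
-- "h", "é" (U+00E9, two bytes), "l", "l", "o"
local s = "h\u{E9}llo"

print(#s)                           -- 6 (bytes, not characters)
print(utf8.codepoint(s, 2))         -- 233: byte 2 starts the sequence for "é"
print(utf8.codepoint(s, -1))        -- 111: negative index counts from the end ("o")

-- Byte 3 is a continuation byte, not the start of a sequence,
-- so the call raises an error; pcall reports false plus a message.
print(pcall(utf8.codepoint, s, 3))
```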

utf8.codepoint()

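utf8.codepoint (s [, i [, j]])

Returns the code points (as integers) from all characters in s that start between byte positions i and j (both included). The default for i is 1 and for j is i. It raises an error if it meets any invalid byte sequence.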
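
A minimal usage sketch for utf8.codepoint, assuming an illustrative three-character string; with no range it looks only at the first character, and with a range it returns one integer per character that starts inside it.

```lua
-- "a" (1 byte), "é" U+00E9 (2 bytes), "€" U+20AC (3 bytes)
local s = "a\u{E9}\u{20AC}"

print(utf8.codepoint(s))          -- 97 (i defaults to 1, j defaults to i)
print(utf8.codepoint(s, 2))       -- 233
print(utf8.codepoint(s, 1, -1))   -- 97  233  8364 (one result per character)
```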

utf8.len()

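utf8.len (s [, i [, j]])

Returns the number of UTF-8 characters in string s that start between byte positions i and j (both inclusive). The default for i is 1 and for j is -1. If it finds any invalid byte sequence, it returns a false value plus the position of the first invalid byte.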
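
The following sketch contrasts the byte length given by the # operator with the character count given by utf8.len, and shows the two-value result for invalid input. The strings are illustrative; the stray \xFF byte is chosen because it can never appear in UTF-8.

```lua
local s = "a\u{E9}\u{20AC}"      -- 3 characters stored in 6 bytes

print(#s)                        -- 6
print(utf8.len(s))               -- 3
print(utf8.len(s, 2))            -- 2: only characters starting at byte 2 or later

-- Invalid input does not raise an error: the first result is a false
-- value and the second is the byte position of the offending byte.
local n, badpos = utf8.len("ab\xFFcd")
print(n, badpos)                 -- false value, then 3
```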

utf8.char()

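utf8.char (···)

Receives zero or more integers, converts each one to its corresponding UTF-8 byte sequence, and returns a string with the concatenation of all these sequences.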
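
A short sketch of utf8.char, using arbitrary sample code points: each integer argument becomes one UTF-8 byte sequence, and the sequences are concatenated in argument order.

```lua
print(utf8.char(72, 105))        -- Hi

local euro = utf8.char(0x20AC)   -- U+20AC EURO SIGN
print(#euro)                     -- 3 (encoded as bytes E2 82 AC)

print(utf8.char() == "")         -- true: no arguments gives the empty string
```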

utf8.offset()

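utf8.offset (s, n [, i])

Returns the position (in bytes) where the encoding of the n-th character of s (counting from position i) starts. A negative n gets characters before position i. The default for i is 1 when n is non-negative and #s + 1 otherwise, so that utf8.offset(s, -n) gets the offset of the n-th character from the end of the string. If the specified character is neither in the subject nor right after its end, the function returns nil.

As a special case, when n is 0 the function returns the start of the encoding of the character that contains the i-th byte of s.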
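
A sketch of utf8.offset on an illustrative four-character string whose characters start at bytes 1, 2, 4, and 7. The last call relies on the n = 0 case, where the function steps back to the start of the character containing byte i.

```lua
-- "a" at byte 1, "é" at bytes 2-3, "€" at bytes 4-6, "z" at byte 7
local s = "a\u{E9}\u{20AC}z"

print(utf8.offset(s, 3))      -- 4: the third character starts at byte 4
print(utf8.offset(s, -1))     -- 7: the last character
print(utf8.offset(s, -2))     -- 4: the second character from the end
print(utf8.offset(s, 5))      -- 8: right after the end of the string (#s + 1)
print(utf8.offset(s, 6))      -- nil: beyond that there is nothing to point at
print(utf8.offset(s, 0, 5))   -- 4: byte 5 lies inside the character starting at 4
```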

utf8.charpattern

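utf8.charpattern

The pattern (a string, not a function) "[\0-\x7F\xC2-\xF4][\x80-\xBF]*" (see §6.4.1), which matches exactly one UTF-8 byte sequence, assuming that the subject is a valid UTF-8 string.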
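
Because utf8.charpattern is an ordinary pattern string, it can be handed to the string library. The sketch below, on an illustrative string assumed to be valid UTF-8, uses string.gmatch to split the string into characters; the table name chars is only a local helper for the example.

```lua
local s = "a\u{E9}\u{20AC}"
local chars = {}

-- Each match is exactly one UTF-8 byte sequence (valid input assumed).
for ch in s:gmatch(utf8.charpattern) do
  chars[#chars + 1] = ch
  print(ch, #ch)    -- the character and its length in bytes
end

print(#chars)       -- 3
```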

utf8.codes()

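utf8.codes (s)

Returns values so that the construction

     for p, c in utf8.codes(s) do body end

will iterate over all characters in string s, with p being the position (in bytes) and c the code point of each character. It raises an error if it meets any invalid byte sequence.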
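
As a concrete sketch of the loop form, again on an illustrative string: each iteration yields the byte position where a character starts and that character's code point.

```lua
local s = "a\u{E9}\u{20AC}"

for p, c in utf8.codes(s) do
  -- p is a byte position, c is an integer code point
  print(p, c, utf8.char(c))
end
-- Positions and code points visited, in order:
--   1  97
--   2  233
--   4  8364
```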
