6.5 – UTF-8 Support

This library provides basic support for UTF-8 encoding. It provides all its functions inside the table utf8. This library does not provide any support for Unicode other than the handling of the encoding. Any operation that needs the meaning of a character, such as character classification, is outside its scope.

Unless stated otherwise, all functions that expect a byte position as a parameter assume that the given position is either the start of a byte sequence or one plus the length of the subject string. As in the string library, negative indices count from the end of the string.
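
The position conventions above can be illustrated with a minimal sketch, assuming Lua 5.3 or later (needed for the `\u{...}` escape) and a made-up sample string. A negative index is resolved from the end of the string, while a position that points into the middle of a multi-byte sequence is rejected by functions that require a character start.

```lua
-- Sample string (an assumption for illustration): "héllo".
-- Byte layout: h = 1, é = 2-3, l = 4, l = 5, o = 6.
local s = "h\u{E9}llo"

-- Negative positions count from the end, as in the string library.
print(utf8.codepoint(s, -1))        --> 111 (the code point of "o")

-- Byte 3 is the second byte of "é", not the start of a sequence,
-- so utf8.offset raises an error; pcall turns it into a false status.
local ok = pcall(utf8.offset, s, 1, 3)
print(ok)                           --> false
```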

utf8.char()

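utf8.char (···)

Receives zero or more integers, converts each one to its corresponding UTF-8 byte sequence, and returns a string with the concatenation of all these sequences.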
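
As a small sketch of utf8.char, the snippet below builds a string from three code points chosen only for illustration. The expected bytes are written out with `\x` escapes so the encoding of each character can be checked by eye.

```lua
-- Code points chosen for illustration: "H" (U+0048), "ä" (U+00E4), "€" (U+20AC).
local s = utf8.char(72, 228, 8364)

-- One, two, and three bytes respectively.
assert(s == "H\xC3\xA4\xE2\x82\xAC")
print(#s)                    --> 6

-- With no arguments the result is the empty string.
assert(utf8.char() == "")
```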

2025-01-10 15:47:30
utf8.len()

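utf8.len (s [, i [, j]])

Returns the number of UTF-8 characters in string s that start between positions i and j (both inclusive). The default for i is 1 and for j is -1. If it finds any invalid byte sequence, returns a false value plus the position of the first invalid byte.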
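
The difference between byte length and character count can be seen in a short sketch. The sample strings are assumptions for illustration; the second one contains a byte (0xFF) that can never appear in UTF-8.

```lua
-- "héllo": 5 characters stored in 6 bytes (é takes bytes 2-3).
local s = "h\u{E9}llo"
print(#s)                   --> 6 (bytes)
print(utf8.len(s))          --> 5 (characters)
print(utf8.len(s, 4))       --> 3 (characters starting at byte 4 or later)

-- An invalid sequence yields a false value plus the offending position.
local n, pos = utf8.len("ab\xFFcd")
print(n, pos)               --> nil   3
```

Because the first result is a false value on failure, a check such as `if not utf8.len(s) then ... end` is one way to test whether a string is valid UTF-8.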

utf8.codepoint()

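utf8.codepoint (s [, i [, j]])

Returns the code points (as integers) from all characters in s that start between byte positions i and j (both inclusive). The default for i is 1 and for j is i. It raises an error if it meets any invalid byte sequence.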
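
A brief sketch of utf8.codepoint with its default and explicit ranges follows. The sample string is an assumption for illustration, and the positions passed are byte positions, not character indices.

```lua
-- "héllo": h = byte 1, é = bytes 2-3, l = 4, l = 5, o = 6.
local s = "h\u{E9}llo"

print(utf8.codepoint(s))          --> 104 (defaults: i = 1, j = i)
print(utf8.codepoint(s, 2))       --> 233 (the character starting at byte 2)
print(utf8.codepoint(s, 1, -1))   --> 104 233 108 108 111

-- Invalid input raises an error rather than returning a false value.
print(pcall(utf8.codepoint, "\xFF"))   -- false, followed by an error message
```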

utf8.charpattern

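utf8.charpattern

The pattern (a string, not a function) "[\0-\x7F\xC2-\xF4][\x80-\xBF]*" (see §6.4.1), which matches exactly one UTF-8 byte sequence, assuming that the subject is a valid UTF-8 string.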
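
Since utf8.charpattern is an ordinary pattern string, it can be passed to the string library. The sketch below, using a made-up sample string, splits a string into its characters with string.gmatch.

```lua
-- Sample string (an assumption): "a", "é" (2 bytes), "€" (3 bytes).
local s = "a\u{E9}\u{20AC}"

local chars = {}
for ch in s:gmatch(utf8.charpattern) do
  chars[#chars + 1] = ch      -- each match is one whole byte sequence
end

print(#chars)                           --> 3
print(#chars[1], #chars[2], #chars[3])  --> 1 2 3 (byte lengths)
```

The pattern does not validate its input, so on a string that is not valid UTF-8 the pieces it yields may not be real characters.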

utf8.offset()

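utf8.offset (s, n [, i])

Returns the position (in bytes) where the encoding of the n-th character of s (counting from position i) starts. A negative n gets characters before position i. The default for i is 1 when n is non-negative and #s + 1 otherwise, so that utf8.offset(s, -n) gets the offset of the n-th character from the end of the string. If the specified character is neither in the subject nor right after its end, the function returns nil.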
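
utf8.offset translates a character index into a byte position, which is what string.sub and the other string functions expect. A minimal sketch with a made-up sample string:

```lua
-- "héllo": h = byte 1, é = bytes 2-3, l = 4, l = 5, o = 6.
local s = "h\u{E9}llo"

print(utf8.offset(s, 3))     --> 4   (third character starts at byte 4)
print(utf8.offset(s, -1))    --> 6   (last character starts at byte 6)
print(utf8.offset(s, 6))     --> 7   (right after the end, i.e. #s + 1)
print(utf8.offset(s, 7))     --> nil (no such character)

-- Byte positions can be fed straight into string.sub.
local last = s:sub(utf8.offset(s, -1))
print(last)                  --> o
```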

utf8.codes()

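utf8.codes (s)

Returns values so that the construction for p, c in utf8.codes(s) do body end will iterate over all characters in string s, with p being the position (in bytes) and c the code point of each character. It raises an error if it meets any invalid byte sequence.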
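
The loop form can be sketched as follows. The sample string and the two collecting tables are assumptions for illustration; the point is that p advances by bytes, so it skips a value wherever a character takes more than one byte.

```lua
-- "héllo": h = byte 1, é = bytes 2-3, l = 4, l = 5, o = 6.
local s = "h\u{E9}llo"

local positions, codes = {}, {}
for p, c in utf8.codes(s) do
  positions[#positions + 1] = p
  codes[#codes + 1] = c
end

print(table.concat(positions, " "))   --> 1 2 4 5 6
print(table.concat(codes, " "))       --> 104 233 108 108 111
```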
