Type:
Module
Constants:
NORMALIZATION_FORMS : [:c, :kc, :d, :kd]

A list of all available normalization forms. See www.unicode.org/reports/tr15/tr15-29.html for more information about normalization.
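The four form names appear to correspond to the standard normalization forms NFC, NFKC, NFD and NFKD. The sketch below illustrates what those forms do using Ruby's built-in `String#unicode_normalize` rather than this module's own methods. The `FORM_TO_STDLIB` mapping is an assumption made for illustration and is not part of the documented API.

```ruby
# Assumed mapping from the short form names listed above to the symbols
# accepted by Ruby's built-in String#unicode_normalize (hypothetical helper).
FORM_TO_STDLIB = { c: :nfc, kc: :nfkc, d: :nfd, kd: :nfkd }.freeze

composed   = "\u00E9"   # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

# Canonical decomposition splits the precomposed character apart.
nfd = composed.unicode_normalize(FORM_TO_STDLIB[:d])

# Canonical composition joins base letter and combining mark again.
nfc = decomposed.unicode_normalize(FORM_TO_STDLIB[:c])

# Compatibility forms additionally fold characters such as the "fi" ligature.
nfkc = "\uFB01".unicode_normalize(FORM_TO_STDLIB[:kc])

puts nfd.codepoints.inspect
puts nfc.codepoints.inspect
puts nfkc
```

The canonical forms (`:c`, `:d`) preserve the visual identity of the text, while the compatibility forms (`:kc`, `:kd`) may change it, so the latter are usually reserved for searching and comparison.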

UNICODE_VERSION : '6.3.0'

The Unicode version supported by the implementation.

HANGUL_SBASE : 0xAC00

Hangul character boundaries and properties

HANGUL_LBASE : 0x1100
HANGUL_VBASE : 0x1161
HANGUL_TBASE : 0x11A7
HANGUL_LCOUNT : 19
HANGUL_VCOUNT : 21
HANGUL_TCOUNT : 28
HANGUL_NCOUNT : HANGUL_VCOUNT * HANGUL_TCOUNT
HANGUL_SCOUNT : 11172
HANGUL_SLAST : HANGUL_SBASE + HANGUL_SCOUNT
HANGUL_JAMO_FIRST : 0x1100
HANGUL_JAMO_LAST : 0x11FF
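The Hangul constants above match the ones used in the arithmetic decomposition of precomposed Hangul syllables described in the Unicode Standard. The following is a minimal sketch of that arithmetic, with the constants copied locally so it runs on its own. The method name `decompose_hangul` is hypothetical and is not claimed to be how this module implements decomposition.

```ruby
# Constants copied from the list above.
HANGUL_SBASE  = 0xAC00
HANGUL_LBASE  = 0x1100
HANGUL_VBASE  = 0x1161
HANGUL_TBASE  = 0x11A7
HANGUL_LCOUNT = 19
HANGUL_VCOUNT = 21
HANGUL_TCOUNT = 28
HANGUL_NCOUNT = HANGUL_VCOUNT * HANGUL_TCOUNT
HANGUL_SCOUNT = 11172

# Hypothetical helper: split a precomposed Hangul syllable into its
# leading consonant (L), vowel (V) and optional trailing consonant (T).
# Code points outside the syllable block are returned unchanged.
def decompose_hangul(codepoint)
  s_index = codepoint - HANGUL_SBASE
  return [codepoint] unless (0...HANGUL_SCOUNT).cover?(s_index)

  l = HANGUL_LBASE + s_index / HANGUL_NCOUNT
  v = HANGUL_VBASE + (s_index % HANGUL_NCOUNT) / HANGUL_TCOUNT
  t = HANGUL_TBASE + s_index % HANGUL_TCOUNT

  # A trailing index of zero means the syllable has no final consonant.
  t == HANGUL_TBASE ? [l, v] : [l, v, t]
end

puts decompose_hangul(0xD55C).map { |cp| format("U+%04X", cp) }.join(" ")
```

Note that `HANGUL_SLAST` as defined above is `HANGUL_SBASE + HANGUL_SCOUNT`, which is one past the last syllable, so it works as an exclusive upper bound.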
WHITESPACE : [
  (0x0009..0x000D).to_a, # White_Space # Cc [5] <control-0009>..<control-000D>
  0x0020, # White_Space # Zs SPACE
  0x0085, # White_Space # Cc <control-0085>
  0x00A0, # White_Space # Zs NO-BREAK SPACE
  0x1680, # White_Space # Zs OGHAM SPACE MARK
  (0x2000..0x200A).to_a, # White_Space # Zs [11] EN QUAD..HAIR SPACE
  0x2028, # White_Space # Zl LINE SEPARATOR
  0x2029, # White_Space # Zp PARAGRAPH SEPARATOR
  0x202F, # White_Space # Zs NARROW NO-BREAK SPACE
  0x205F, # White_Space # Zs MEDIUM MATHEMATICAL SPACE
  0x3000, # White_Space # Zs IDEOGRAPHIC SPACE
].flatten.freeze

All Unicode whitespace code points.

LEADERS_AND_TRAILERS : WHITESPACE + [65279]

The BOM (byte order mark, U+FEFF, decimal 65279) can also be treated as whitespace: it is a non-rendering character used to distinguish little-endian from big-endian byte order. Byte order is not an issue in UTF-8, so the BOM must be ignored.

TRAILERS_PAT : /(#{codepoints_to_pattern(LEADERS_AND_TRAILERS)})+\Z/u
LEADERS_PAT : /\A(#{codepoints_to_pattern(LEADERS_AND_TRAILERS)})+/u
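The two patterns above anchor a run of leader/trailer code points to the start or end of a string. The sketch below shows how such patterns could be built and used to strip a string. The `codepoints_to_pattern` body here is a hypothetical stand-in (the source only shows the call, not the definition), the code point list is a shortened subset of `WHITESPACE` plus the BOM, and `strip_leaders_and_trailers` is an illustrative name.

```ruby
# Shortened subset of the WHITESPACE list above plus the BOM (65279 == 0xFEFF).
LEADERS_AND_TRAILERS = [0x0009, 0x000A, 0x0020, 0x00A0, 0x3000, 0xFEFF].freeze

# Hypothetical stand-in for codepoints_to_pattern: turns each code point
# into a one-character UTF-8 string and builds an alternation from them.
def codepoints_to_pattern(codepoints)
  Regexp.union(codepoints.map { |cp| [cp].pack("U") })
end

LEADERS_PAT  = /\A(?:#{codepoints_to_pattern(LEADERS_AND_TRAILERS)})+/
TRAILERS_PAT = /(?:#{codepoints_to_pattern(LEADERS_AND_TRAILERS)})+\Z/

# Illustrative helper: remove leading and trailing whitespace/BOM runs.
def strip_leaders_and_trailers(string)
  string.sub(LEADERS_PAT, "").sub(TRAILERS_PAT, "")
end

puts strip_leaders_and_trailers("\uFEFF \thello world\u3000\n").inspect
```

Unlike `String#strip`, this approach also removes the BOM and non-ASCII spaces such as IDEOGRAPHIC SPACE, while leaving whitespace inside the string untouched.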
swapcase
  • References/Ruby on Rails/Rails/Classes/ActiveSupport/ActiveSupport::Multibyte/ActiveSupport::Multibyte::Unicode

swapcase(string) Instance Public methods

2025-01-10 15:47:30
load
  • References/Ruby on Rails/Rails/Classes/ActiveSupport/ActiveSupport::Multibyte/ActiveSupport::Multibyte::Unicode/ActiveSupport::Multibyte::Unicode::UnicodeDatabase

load() Instance Public methods Loads the

filename
  • References/Ruby on Rails/Rails/Classes/ActiveSupport/ActiveSupport::Multibyte/ActiveSupport::Multibyte::Unicode/ActiveSupport::Multibyte::Unicode::UnicodeDatabase

filename() Class Public methods Returns the filename for the data file for this

upcase
  • References/Ruby on Rails/Rails/Classes/ActiveSupport/ActiveSupport::Multibyte/ActiveSupport::Multibyte::Unicode

upcase(string) Instance Public methods

pack_graphemes
  • References/Ruby on Rails/Rails/Classes/ActiveSupport/ActiveSupport::Multibyte/ActiveSupport::Multibyte::Unicode

pack_graphemes(unpacked) Instance Public methods Reverse operation of unpack_graphemes

decompose
  • References/Ruby on Rails/Rails/Classes/ActiveSupport/ActiveSupport::Multibyte/ActiveSupport::Multibyte::Unicode

decompose(type, codepoints) Instance Public methods Decompose composed characters

dirname
  • References/Ruby on Rails/Rails/Classes/ActiveSupport/ActiveSupport::Multibyte/ActiveSupport::Multibyte::Unicode/ActiveSupport::Multibyte::Unicode::UnicodeDatabase

dirname() Class Public methods Returns the directory in which the data files

swapcase_mapping
  • References/Ruby on Rails/Rails/Classes/ActiveSupport/ActiveSupport::Multibyte/ActiveSupport::Multibyte::Unicode/ActiveSupport::Multibyte::Unicode::Codepoint

swapcase_mapping() Instance Public methods

reorder_characters
  • References/Ruby on Rails/Rails/Classes/ActiveSupport/ActiveSupport::Multibyte/ActiveSupport::Multibyte::Unicode

reorder_characters(codepoints) Instance Public methods Re-order codepoints so

tidy_bytes
  • References/Ruby on Rails/Rails/Classes/ActiveSupport/ActiveSupport::Multibyte/ActiveSupport::Multibyte::Unicode

tidy_bytes(string, force = false) Instance Public methods Replaces all ISO-8859-1
