Encoding::Converter.new(source_encoding, destination_encoding, opt)
Encoding::Converter.new(convpath)
possible options elements:
hash form:
  :invalid => nil            # raise error on invalid byte sequence (default)
  :invalid => :replace       # replace invalid byte sequence
  :undef => nil              # raise error on undefined conversion (default)
  :undef => :replace         # replace undefined conversion
  :replace => string         # replacement string ("?" or "\uFFFD" if not specified)
  :newline => :universal     # decorator for converting CRLF and CR to LF
  :newline => :crlf          # decorator for converting LF to CRLF
  :newline => :cr            # decorator for converting LF to CR
  :universal_newline => true # decorator for converting CRLF and CR to LF
  :crlf_newline => true      # decorator for converting LF to CRLF
  :cr_newline => true        # decorator for converting LF to CR
  :xml => :text              # escape as XML CharData.
  :xml => :attr              # escape as XML AttValue
integer form:
  Encoding::Converter::INVALID_REPLACE
  Encoding::Converter::UNDEF_REPLACE
  Encoding::Converter::UNDEF_HEX_CHARREF
  Encoding::Converter::UNIVERSAL_NEWLINE_DECORATOR
  Encoding::Converter::CRLF_NEWLINE_DECORATOR
  Encoding::Converter::CR_NEWLINE_DECORATOR
  Encoding::Converter::XML_TEXT_DECORATOR
  Encoding::Converter::XML_ATTR_CONTENT_DECORATOR
  Encoding::Converter::XML_ATTR_QUOTE_DECORATOR
::new creates an instance of Encoding::Converter.
source_encoding and destination_encoding should each be a string or an Encoding object.
opt should be nil, a hash or an integer.
convpath should be an array. convpath may contain
- two-element arrays which contain encodings or encoding names, or
- strings representing decorator names.
::new optionally takes an option. The option should be a hash or an integer. The option hash can contain :invalid => nil, etc. The option integer should be the bitwise OR of constants such as Encoding::Converter::INVALID_REPLACE, etc.
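As a minimal sketch of the two option forms, the snippet below builds one converter with the hash form (written as keyword arguments) and one with the integer form, then checks that they behave the same. The encodings and the sample string are illustrative assumptions, not taken from the reference text.

```ruby
# Hash form (written as keyword arguments): replace characters that
# cannot be represented in the destination encoding.
ec_hash = Encoding::Converter.new("UTF-8", "ISO-8859-1", undef: :replace)

# Integer form: the same request expressed with a constant.
ec_int = Encoding::Converter.new("UTF-8", "ISO-8859-1",
                                 Encoding::Converter::UNDEF_REPLACE)

src = "caf\u00E9 \u3042" # U+3042 has no ISO-8859-1 equivalent

# convert feeds the input; finish flushes anything still buffered.
out_hash = ec_hash.convert(src) + ec_hash.finish
out_int  = ec_int.convert(src) + ec_int.finish

p out_hash.bytes      #=> [99, 97, 102, 233, 32, 63]  (U+3042 became "?")
p out_hash == out_int #=> true
```

Several integer constants can be combined with `|`, for example `INVALID_REPLACE | UNDEF_REPLACE`.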
- :invalid => nil
  Raise an error on an invalid byte sequence. This is the default behavior.
- :invalid => :replace
  Replace an invalid byte sequence with the replacement string.
- :undef => nil
  Raise an error if a character in source_encoding is not defined in destination_encoding. This is the default behavior.
- :undef => :replace
  Replace a character that is undefined in destination_encoding with the replacement string.
- :replace => string
  Specify the replacement string. If not specified, "\uFFFD" is used for Unicode encodings and "?" for others.
- :universal_newline => true
  Convert CRLF and CR to LF.
- :crlf_newline => true
  Convert LF to CRLF.
- :cr_newline => true
  Convert LF to CR.
- :xml => :text
  Escape as XML CharData. This form can be used as HTML 4.0 #PCDATA.
  - '&' -> '&amp;'
  - '<' -> '&lt;'
  - '>' -> '&gt;'
  - characters undefined in destination_encoding -> hexadecimal CharRef such as &#xHH;
- :xml => :attr
  Escape as XML AttValue. The converted result is quoted as "...". This form can be used as an HTML 4.0 attribute value.
  - '&' -> '&amp;'
  - '<' -> '&lt;'
  - '>' -> '&gt;'
  - '"' -> '&quot;'
  - characters undefined in destination_encoding -> hexadecimal CharRef such as &#xHH;
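The options above can be tried out with a short sketch like the following. The helper `convert_all` is hypothetical (it is not part of Ruby); it simply calls `convert` and then `finish` so that buffered output, such as the closing quote added by :xml => :attr, is flushed. The encodings and sample strings are assumptions chosen for illustration.

```ruby
# Hypothetical helper (not part of Ruby): feed the whole string, then flush.
def convert_all(ec, str)
  ec.convert(str) + ec.finish
end

# Custom replacement string for characters undefined in the destination.
custom  = Encoding::Converter.new("UTF-8", "US-ASCII",
                                  undef: :replace, replace: "[?]")
# Newline decorator: CRLF and CR become LF.
newline = Encoding::Converter.new("UTF-16BE", "UTF-8", universal_newline: true)
# XML escaping as CharData and as a quoted attribute value.
text    = Encoding::Converter.new("UTF-8", "US-ASCII", xml: :text)
attr    = Encoding::Converter.new("UTF-8", "US-ASCII", xml: :attr)

replaced   = convert_all(custom, "x\u3042y")
normalized = convert_all(newline, "a\r\nb\rc\n".encode("UTF-16BE"))
as_text    = convert_all(text, "a < b & \u3042")
as_attr    = convert_all(attr, 'say "hi" & <go>')

p replaced   #=> "x[?]y"
p normalized #=> "a\nb\nc\n"
p as_text    #=> "a &lt; b &amp; &#x3042;"
p as_attr    #=> "\"say &quot;hi&quot; &amp; &lt;go&gt;\""
```

Each converter here is used once and then finished; a finished converter should not be fed further input.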
Examples:
  # UTF-16BE to UTF-8
  ec = Encoding::Converter.new("UTF-16BE", "UTF-8")

  # Usually, decorators such as newline conversion are inserted last.
  ec = Encoding::Converter.new("UTF-16BE", "UTF-8", :universal_newline => true)
  p ec.convpath #=> [[#<Encoding:UTF-16BE>, #<Encoding:UTF-8>],
                #    "universal_newline"]

  # But, if the last encoding is ASCII incompatible,
  # decorators are inserted before the last conversion.
  ec = Encoding::Converter.new("UTF-8", "UTF-16BE", :crlf_newline => true)
  p ec.convpath #=> ["crlf_newline",
                #    [#<Encoding:UTF-8>, #<Encoding:UTF-16BE>]]

  # Conversion path can be specified directly.
  ec = Encoding::Converter.new(["universal_newline", ["EUC-JP", "UTF-8"], ["UTF-8", "UTF-16BE"]])
  p ec.convpath #=> ["universal_newline",
                #    [#<Encoding:EUC-JP>, #<Encoding:UTF-8>],
                #    [#<Encoding:UTF-8>, #<Encoding:UTF-16BE>]]