Encoding::Converter.new(source_encoding, destination_encoding, opt)
Encoding::Converter.new(convpath)
possible options elements:
hash form:
:invalid => nil # raise error on invalid byte sequence (default)
:invalid => :replace # replace invalid byte sequence
:undef => nil # raise error on undefined conversion (default)
:undef => :replace # replace undefined conversion
:replace => string # replacement string ("?" or "\uFFFD" if not specified)
:newline => :universal # decorator for converting CRLF and CR to LF
:newline => :crlf # decorator for converting LF to CRLF
:newline => :cr # decorator for converting LF to CR
:universal_newline => true # decorator for converting CRLF and CR to LF
:crlf_newline => true # decorator for converting LF to CRLF
:cr_newline => true # decorator for converting LF to CR
:xml => :text # escape as XML CharData.
:xml => :attr # escape as XML AttValue
integer form:
Encoding::Converter::INVALID_REPLACE
Encoding::Converter::UNDEF_REPLACE
Encoding::Converter::UNDEF_HEX_CHARREF
Encoding::Converter::UNIVERSAL_NEWLINE_DECORATOR
Encoding::Converter::CRLF_NEWLINE_DECORATOR
Encoding::Converter::CR_NEWLINE_DECORATOR
Encoding::Converter::XML_TEXT_DECORATOR
Encoding::Converter::XML_ATTR_CONTENT_DECORATOR
Encoding::Converter::XML_ATTR_QUOTE_DECORATOR
::new creates an instance of Encoding::Converter.
source_encoding and destination_encoding should each be a string or an Encoding object.
opt should be nil, a hash or an integer.
convpath should be an array. convpath may contain
- two-element arrays which contain encodings or encoding names, or
- strings representing decorator names.
::new optionally takes an option, which should be a hash or an integer. The option hash can contain entries such as :invalid => nil. The option integer should be the bitwise OR of constants such as Encoding::Converter::INVALID_REPLACE.
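As a rough illustration of the two option forms, the sketch below builds one converter from a hash and one from an integer and checks that they behave the same. The sample string and the variable names are assumptions made for this example, and Encoding::Converter#convert is a separate method not described in this section.

```ruby
# Hash form: replace invalid byte sequences and undefined characters.
by_hash = Encoding::Converter.new("UTF-8", "ISO-8859-1",
                                  invalid: :replace, undef: :replace)

# Integer form: the same request expressed as a bitwise OR of constants.
flags = Encoding::Converter::INVALID_REPLACE |
        Encoding::Converter::UNDEF_REPLACE
by_int = Encoding::Converter.new("UTF-8", "ISO-8859-1", flags)

# U+3042 has no ISO-8859-1 equivalent, so it becomes the default "?".
sample = "caf\u00E9 \u3042"
hash_result = by_hash.convert(sample)
int_result  = by_int.convert(sample)

p hash_result.bytes == int_result.bytes  #=> true
```

The hash form is usually easier to read. The integer form is convenient when the flags are computed or passed around as a single value.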
- :invalid => nil
  Raise an error on an invalid byte sequence. This is the default behavior.
- :invalid => :replace
  Replace an invalid byte sequence with the replacement string.
- :undef => nil
  Raise an error if a character in source_encoding is not defined in destination_encoding. This is the default behavior.
- :undef => :replace
  Replace a character that is undefined in destination_encoding with the replacement string.
- :replace => string
  Specify the replacement string. If not specified, "\uFFFD" is used for Unicode encodings and "?" for others.
- :universal_newline => true
  Convert CRLF and CR to LF.
- :crlf_newline => true
  Convert LF to CRLF.
- :cr_newline => true
  Convert LF to CR.
- :xml => :text
  Escape as XML CharData. This form can be used as HTML 4.0 #PCDATA.
  - '&' -> '&amp;'
  - '<' -> '&lt;'
  - '>' -> '&gt;'
  - undefined characters in destination_encoding -> hexadecimal CharRef such as &#xHH;
- :xml => :attr
  Escape as XML AttValue. The converted result is quoted as "...". This form can be used as an HTML 4.0 attribute value.
  - '&' -> '&amp;'
  - '<' -> '&lt;'
  - '>' -> '&gt;'
  - '"' -> '&quot;'
  - undefined characters in destination_encoding -> hexadecimal CharRef such as &#xHH;
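The replacement and XML options listed above can be tried out with a short sketch like the following. The input strings and variable names are made up for illustration, and Encoding::Converter#convert and #finish are separate methods not described in this section; #finish is called in the :attr case on the assumption that the closing quote is only emitted when the conversion is finished.

```ruby
# :undef => :replace with a custom :replace string.
ec = Encoding::Converter.new("UTF-8", "US-ASCII", undef: :replace, replace: "*")
replaced = ec.convert("a\u3042b")
p replaced  #=> "a*b"

# :xml => :text escapes markup characters and turns characters that
# US-ASCII cannot represent into hexadecimal character references.
ec = Encoding::Converter.new("UTF-8", "US-ASCII", xml: :text)
text = ec.convert("a < b & \u3042")
p text      #=> "a &lt; b &amp; &#x3042;"

# :xml => :attr additionally escapes double quotes and wraps the
# whole result in double quotes.
ec = Encoding::Converter.new("UTF-8", "US-ASCII", xml: :attr)
attr = ec.convert('say "hi"') + ec.finish
p attr      #=> "\"say &quot;hi&quot;\""
```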
Examples:
# UTF-16BE to UTF-8
ec = Encoding::Converter.new("UTF-16BE", "UTF-8")

# Usually, decorators such as newline conversion are inserted last.
ec = Encoding::Converter.new("UTF-16BE", "UTF-8", :universal_newline => true)
p ec.convpath #=> [[#<Encoding:UTF-16BE>, #<Encoding:UTF-8>],
              #    "universal_newline"]

# But, if the last encoding is ASCII incompatible,
# decorators are inserted before the last conversion.
ec = Encoding::Converter.new("UTF-8", "UTF-16BE", :crlf_newline => true)
p ec.convpath #=> ["crlf_newline",
              #    [#<Encoding:UTF-8>, #<Encoding:UTF-16BE>]]

# Conversion path can be specified directly.
ec = Encoding::Converter.new(["universal_newline", ["EUC-JP", "UTF-8"], ["UTF-8", "UTF-16BE"]])
p ec.convpath #=> ["universal_newline",
              #    [#<Encoding:EUC-JP>, #<Encoding:UTF-8>],
              #    [#<Encoding:UTF-8>, #<Encoding:UTF-16BE>]]
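The examples above only inspect the conversion path. As a hedged follow-up, this sketch runs data through two of those converters to show the effect of the newline decorators. The input strings are invented for the example, and Encoding::Converter#convert and #finish are separate methods not described in this section.

```ruby
# UTF-16BE to UTF-8, normalizing CRLF and CR to LF on the way.
ec = Encoding::Converter.new("UTF-16BE", "UTF-8", universal_newline: true)
src = "a\r\nb\rc\n".encode("UTF-16BE")
out = ec.convert(src) + ec.finish
p out        #=> "a\nb\nc\n"

# UTF-8 to UTF-16BE, expanding LF to CRLF. The decorator sits before
# the final conversion, so the CR it adds is encoded as UTF-16BE too.
ec = Encoding::Converter.new("UTF-8", "UTF-16BE", crlf_newline: true)
crlf = ec.convert("a\n") + ec.finish
p crlf.bytes #=> [0, 97, 0, 13, 0, 10]
```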