The version of the installed library.
A FieldInfo Struct contains details about a field's position in the data source it was read from. CSV will pass this Struct to some blocks that make decisions based on field structure. See CSV.convert_fields() for an example.
index
  The zero-based index of the field in its row.
line
  The line of the data source this row is from.
header
  The header for the column, when available.
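As a rough illustration of how a block can use this Struct, the sketch below passes a two-argument lambda as a converter, so it receives both the field and its FieldInfo. The sample data and the `by_header` name are made up for this example.

```ruby
require "csv"

data = "name,qty\nwidget,3\ngadget,7\n"

# A converter that takes two arguments is handed the field plus a
# CSV::FieldInfo Struct (index, line, header) describing its position.
by_header = lambda do |field, info|
  # Only convert fields that sit under the "qty" header.
  info.header == "qty" ? Integer(field) : field
end

table = CSV.parse(data, headers: true, converters: [by_header])
quantities = table.map { |row| row["qty"] }
names = table.map { |row| row["name"] }
```

Here only the "qty" column is converted to Integers; the "name" column is returned untouched because the lambda hands the field back unchanged.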
A Regexp used to find and convert some common Date formats.
A Regexp used to find and convert some common DateTime formats.
The encoding used by all converters.
This Hash holds the built-in converters of CSV that can be accessed by name. You can select Converters with #convert or through the options Hash passed to ::new.
:integer
  Converts any field Integer() accepts.
:float
  Converts any field Float() accepts.
:numeric
  A combination of :integer and :float.
:date
  Converts any field Date.parse accepts.
:date_time
  Converts any field DateTime.parse accepts.
:all
  All built-in converters. A combination of :date_time and :numeric.
All built-in converters transcode field data to UTF-8 before attempting a conversion. If your data cannot be transcoded to UTF-8 the conversion will fail and the field will remain unchanged.
This Hash is intentionally left unfrozen and users should feel free to add values to it that can be accessed by all CSV objects.
To add a combo field, the value should be an Array of names. Combo fields can be nested with other combo fields.
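A minimal sketch of both ideas, selecting a built-in converter by name and registering a custom one plus a combo. The names `:blank_to_nil` and `:clean_numeric` are hypothetical and exist only for this example; note that adding them changes the Hash for every CSV object in the process.

```ruby
require "csv"

# Select a built-in converter by name through the options.
rows = CSV.parse("1,2.5,hello\n", converters: :numeric)

# Hypothetical custom converter: turn whitespace-only fields into nil.
CSV::Converters[:blank_to_nil] = lambda do |field|
  field.to_s.strip.empty? ? nil : field
end

# A combo entry is an Array of names; it may itself name other combos
# (:numeric is the built-in combination of :integer and :float).
CSV::Converters[:clean_numeric] = [:blank_to_nil, :numeric]

cleaned = CSV.parse("7, ,x\n", converters: :clean_numeric)
```

With these definitions, `rows.first` holds an Integer, a Float, and an unconverted String, and `cleaned.first` holds 7, nil, and "x".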
This Hash holds the built-in header converters of CSV that can be accessed by name. You can select HeaderConverters with #header_convert or through the options Hash passed to ::new.
:downcase
  Calls downcase() on the header String.
:symbol
  The header String is downcased, spaces are replaced with underscores, non-word characters are dropped, and finally to_sym() is called.
All built-in header converters transcode header data to UTF-8 before attempting a conversion. If your data cannot be transcoded to UTF-8 the conversion will fail and the header will remain unchanged.
This Hash is intentionally left unfrozen and users should feel free to add values to it that can be accessed by all CSV objects.
To add a combo field, the value should be an Array of names. Combo fields can be nested with other combo fields.
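For illustration, the following sketch applies the :symbol header converter through the options. The sample headers are invented for this example.

```ruby
require "csv"

data = "First Name,Zip-Code\nAda,12345\n"

# :symbol downcases each header, replaces spaces with underscores,
# drops non-word characters, and calls to_sym on the result.
table = CSV.parse(data, headers: true, header_converters: :symbol)

headers = table.headers
first_name = table[0][:first_name]
```

The headers come back as `:first_name` and `:zipcode`, and rows are then indexed by those Symbols rather than by the original Strings.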
The options used when no overrides are given by calling code. They are:
:col_sep
  ","
:row_sep
  :auto
:quote_char
  '"'
:field_size_limit
  nil
:converters
  nil
:unconverted_fields
  nil
:headers
  false
:return_headers
  false
:header_converters
  nil
:skip_blanks
  false
:force_quotes
  false
:skip_lines
  nil
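Any of these defaults can be overridden per call. The sketch below, using invented sample data, overrides :col_sep, :headers, and :skip_blanks when parsing, and :force_quotes when generating.

```ruby
require "csv"

data = "id;name\n1;Ada\n\n2;Grace\n"

# Override the separator, treat the first row as headers,
# and drop the blank line in the middle of the data.
rows = CSV.parse(data, col_sep: ";", headers: true, skip_blanks: true).map(&:to_h)

# Quote every field on output instead of only those that need it.
line = CSV.generate_line(["a", 1], force_quotes: true)
```

Options that are not mentioned in a call keep the default values listed above.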
This class provides a complete interface to CSV files and data. It offers tools to enable you to read and write to and from Strings or IO objects, as needed.
Reading
From a File
A Line at a Time
CSV.foreach("path/to/file.csv") do |row|
  # use row here...
end
All at Once
arr_of_arrs = CSV.read("path/to/file.csv")
From a String
A Line at a Time
CSV.parse("CSV,data,String") do |row|
  # use row here...
end
All at Once
arr_of_arrs = CSV.parse("CSV,data,String")
Writing
To a File
CSV.open("path/to/file.csv", "wb") do |csv|
  csv << ["row", "of", "CSV", "data"]
  csv << ["another", "row"]
  # ...
end
To a String
csv_string = CSV.generate do |csv|
  csv << ["row", "of", "CSV", "data"]
  csv << ["another", "row"]
  # ...
end
Convert a Single Line
csv_string = ["CSV", "data"].to_csv   # to CSV
csv_array  = "CSV,String".parse_csv   # from CSV
Shortcut Interface
CSV             { |csv_out| csv_out << %w{my data here} }  # to $stdout
CSV(csv = "")   { |csv_str| csv_str << %w{my data here} }  # to a String
CSV($stderr)    { |csv_err| csv_err << %w{my data here} }  # to $stderr
CSV($stdin)     { |csv_in|  csv_in.each { |row| p row } }  # from $stdin
Advanced Usage
Wrap an IO Object
csv = CSV.new(io, options)
# ... read (with gets() or each()) from and write (with <<) to csv here ...
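As a concrete sketch of wrapping an IO object, the example below uses StringIO objects in place of real files; the `writer` and `reader` names and the sample rows are assumptions made for illustration.

```ruby
require "csv"
require "stringio"

# Write rows to an in-memory IO with <<.
out = StringIO.new
writer = CSV.new(out)
writer << ["id", "name"]
writer << [1, "Ada"]
written = out.string

# Read them back by wrapping a second IO over the same data.
reader = CSV.new(StringIO.new(written), headers: true)
first = reader.gets          # the first data row, as a CSV::Row
name = first["name"]
```

The same pattern applies to any object that behaves like an IO, such as an open File or a socket.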
CSV and Character Encodings (M17n or Multilingualization)
This new CSV parser is m17n savvy. The parser works in the Encoding of the IO or String object being read from or written to. Your data is never transcoded (unless you ask Ruby to transcode it for you) and will literally be parsed in the Encoding it is in. Thus CSV will return Arrays or Rows of Strings in the Encoding of your data. This is accomplished by transcoding the parser itself into your Encoding.
Some transcoding must take place, of course, to accomplish this multi-encoding support. For example, :col_sep, :row_sep, and :quote_char must be transcoded to match your data. Hopefully this makes the entire process feel transparent, since CSV's defaults should just magically work for your data. However, you can set these values manually in the target Encoding to avoid the translation.
It's also important to note that while all of CSV's core parser is now Encoding agnostic, some features are not. For example, the built-in converters will try to transcode data to UTF-8 before making conversions. Again, you can provide custom converters that are aware of your Encodings to avoid this translation. It's just too hard for me to support native conversions in all of Ruby's Encodings.
Anyway, the practical side of this is simple: make sure IO and String objects passed into CSV have the proper Encoding set and everything should just work. CSV methods that allow you to open IO objects (CSV::foreach(), ::open, ::read, and ::readlines) do allow you to specify the Encoding.
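To make this concrete, here is a small sketch that parses a String tagged as ISO-8859-1 and checks the Encoding of the fields it gets back. The sample text is invented, and no converters are used, so nothing is transcoded to UTF-8 along the way.

```ruby
require "csv"

# Build a String whose Encoding is ISO-8859-1 rather than UTF-8.
latin1 = "café,1\n".encode("ISO-8859-1")

# The fields are returned in the Encoding of the data that was parsed.
row = CSV.parse_line(latin1)
encodings = row.map(&:encoding)

# Transcode explicitly only when you actually need another Encoding.
utf8_first = row[0].encode("UTF-8")
```

Both fields report ISO-8859-1 as their Encoding, which matches the behavior described above.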
One minor exception comes when generating CSV into a String with an Encoding that is not ASCII compatible. There's no existing data for CSV to use to prepare itself and thus you will probably need to manually specify the desired Encoding for most of those cases. It will try to guess using the fields in a row of output though, when using ::generate_line or Array#to_csv().
I try to point out any other Encoding issues in the documentation of methods as they come up.
This has been tested to the best of my ability with all non-“dummy” Encodings Ruby ships with. However, it is brave new code and may have some bugs. Please feel free to report any issues you find with it.