File Formats

When reading or writing rows (:array) or records (:hash), IOStreams converts each line to or from the file’s tabular format. The formats covered on this page are :csv, :json, :psv, and :fixed.

Format inference

The format is inferred from the file name when it contains a recognized extension:

IOStreams.path("sample.csv").each(:hash) { |record| p record }
IOStreams.path("sample.json").each(:hash) { |record| p record }
IOStreams.path("sample.psv").each(:hash) { |record| p record }

The format extension can appear anywhere in the file name, so sample.csv.gz and sample.json.pgp are recognized as CSV and JSON respectively.

When the file name does not contain a recognized format extension, the format defaults to :csv.
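The inference rule described above can be sketched in plain Ruby. This is an illustration only, not the IOStreams implementation: the helper name infer_format and the FORMAT_EXTENSIONS list (limited to the extensions used on this page) are assumptions made for the example.

```ruby
# Hypothetical helper, not part of IOStreams: infers a format from any
# recognized extension in the file name, falling back to a default.
FORMAT_EXTENSIONS = %w[csv json psv].freeze # assumed list, taken from this page's examples

def infer_format(file_name, default: :csv)
  # Skip the base name, then look for the first recognized extension.
  extensions = File.basename(file_name).split(".").drop(1).map(&:downcase)
  match = extensions.find { |ext| FORMAT_EXTENSIONS.include?(ext) }
  match ? match.to_sym : default
end

p infer_format("sample.csv.gz")   # prints :csv
p infer_format("sample.json.pgp") # prints :json
p infer_format("sample_data")     # prints :csv (the default)
```

Scanning every extension rather than only the last one is what lets compressed or encrypted file names such as sample.csv.gz still resolve to a tabular format.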

Specifying the format

When the file name cannot be used to infer the format, set it explicitly with format:

path = IOStreams.path("sample_data")
path.format(:json)
path.each(:hash) { |record| p record }

format can be chained with the other path methods:

IOStreams.path("sample_data").format(:json).each(:hash) { |record| p record }

Format options

Format-specific options are supplied with format_options and passed to the parser for the chosen format. The :fixed format requires its file layout to be supplied this way, as shown in the next section. The other formats do not currently take any options.

Fixed width files

Fixed width files have no delimiters; each column is identified by its position within the line. Since the layout cannot be inferred from the file, supply it using format_options:

path = IOStreams.path("sample_data")
path.format(:fixed)
path.format_options(
  layout: [
    {size: 23, key: "name"},
    {size: 40, key: "address"},
    {size: 5,  key: "zip"}
  ]
)
path.each(:hash) { |record| p record }
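To make the role of the layout concrete, here is a minimal plain-Ruby sketch of slicing one line by column size. It is not the IOStreams parser: the method name parse_fixed is hypothetical, and stripping trailing padding from each value is an assumption of this sketch.

```ruby
# Hypothetical sketch, not the IOStreams parser: slices a fixed width line
# into a hash using the same layout shape as the example above.
LAYOUT = [
  {size: 23, key: "name"},
  {size: 40, key: "address"},
  {size: 5,  key: "zip"}
].freeze

def parse_fixed(line, layout)
  offset = 0
  layout.each_with_object({}) do |column, record|
    # Take `size` characters starting at the running offset.
    value = line[offset, column[:size]].to_s
    record[column[:key]] = value.rstrip # assumption: trailing padding is stripped
    offset += column[:size]
  end
end

# Build a sample 68 character line (23 + 40 + 5) and parse it back.
line = "Jack Jones".ljust(23) + "Somewhere".ljust(40) + "12345"
record = parse_fixed(line, LAYOUT)
puts record["name"]    # prints Jack Jones
puts record["address"] # prints Somewhere
puts record["zip"]     # prints 12345
```

Every value comes back as a string here, because the sketch applies no type conversion.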

Writing a fixed width file uses the same layout to render each record:

path = IOStreams.path("sample_data")
path.format(:fixed)
path.format_options(
  layout: [
    {size: 23, key: "name"},
    {size: 40, key: "address"},
    {size: 5,  key: "zip"}
  ]
)
path.writer(:hash) do |io|
  io << {"name" => "Jack Jones", "address" => "Somewhere", "zip" => 12345}
end

Note: The keys in the hashes being written must match the layout :key values exactly, including whether they are strings or symbols.
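The following plain-Ruby sketch shows why an exact key match matters when rendering. It is not the IOStreams renderer: render_fixed is a hypothetical name, and left-justifying, space-padding, and cutting values to the column size are assumptions of this sketch. It raises on a missing key purely to make a mismatch visible; how IOStreams itself reacts is not covered here.

```ruby
# Hypothetical sketch, not the IOStreams renderer: renders one record as a
# fixed width line using the same layout shape as the example above.
LAYOUT = [
  {size: 23, key: "name"},
  {size: 40, key: "address"},
  {size: 5,  key: "zip"}
].freeze

def render_fixed(record, layout)
  layout.map do |column|
    # fetch raises KeyError when the key is absent, for example when the
    # record uses :name but the layout key is "name".
    value = record.fetch(column[:key]).to_s
    # Assumption: left-justify, pad with spaces, and cut to the column size.
    value.ljust(column[:size])[0, column[:size]]
  end.join
end

line = render_fixed({"name" => "Jack Jones", "address" => "Somewhere", "zip" => 12345}, LAYOUT)
puts line.length # prints 68 (23 + 40 + 5)

begin
  render_fixed({name: "Jack Jones", address: "Somewhere", zip: 12345}, LAYOUT)
rescue KeyError => e
  puts "Key mismatch: #{e.message}"
end
```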

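Layout column definitions, as used in the examples above: size is the width of the column, and key is the name under which the column's value is read or written.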

Header options

When reading or writing records (:hash), the header row is controlled by options passed to each or writer, such as columns in the examples below:

Example of reading a headerless CSV file:

path = IOStreams.path("no_header.csv")
path.each(:hash, columns: ["name", "address", "zip"]) do |record|
  p record
end
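Conceptually, supplying columns for a headerless file means pairing each row's values with the given names. A minimal plain-Ruby sketch of that pairing follows. It is not IOStreams code: rows_to_records is a hypothetical helper, the sample line is made up, and the naive split on commas ignores CSV quoting.

```ruby
# Hypothetical sketch, not IOStreams code: pairs supplied column names with
# the values of each headerless row.
def rows_to_records(lines, columns)
  lines.map do |line|
    values = line.chomp.split(",") # naive split: does not handle quoted fields
    columns.zip(values).to_h
  end
end

columns = ["name", "address", "zip"]
lines   = ["Jack Jones,Somewhere,12345\n"] # made-up sample data

records = rows_to_records(lines, columns)
puts records.first["name"] # prints Jack Jones
puts records.first["zip"]  # prints 12345
```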

Example of writing only specific columns in a fixed order:

path = IOStreams.path("sample.csv")
path.writer(:hash, columns: ["name", "zip"]) do |io|
  io << {"name" => "Jack Jones", "address" => "Somewhere", "zip" => 12345}
end
path.read
# => "name,zip\nJack Jones,12345\n"

Note: Column names are converted to strings, and the keys in the hashes being written may be strings or symbols.
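The column selection and key handling described above can be sketched in plain Ruby. This is an illustration, not the IOStreams implementation: render_csv is a hypothetical helper, and it joins values with commas without any CSV quoting.

```ruby
# Hypothetical sketch, not IOStreams code: writes only the requested columns,
# in the requested order, accepting string or symbol keys.
def render_csv(records, columns)
  header = columns.map(&:to_s) # column names are converted to strings
  lines = records.map do |record|
    row = record.transform_keys(&:to_s) # accept string or symbol keys
    header.map { |name| row[name] }.join(",") # naive join: no quoting or escaping
  end
  ([header.join(",")] + lines).join("\n") + "\n"
end

record = {"name" => "Jack Jones", "address" => "Somewhere", "zip" => 12345}
print render_csv([record], ["name", "zip"])
# prints:
# name,zip
# Jack Jones,12345
```

Because both the column names and the record keys are normalized to strings, passing [:name, :zip] or a record with symbol keys produces the same output in this sketch.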