Text::CSV::Hashify(3) User Contributed Perl Documentation Text::CSV::Hashify(3)

Text::CSV::Hashify - Turn a CSV file into a Perl hash

This document refers to version 0.11 of Text::CSV::Hashify. This version was released May 22 2018.

    # Simple functional interface
    use Text::CSV::Hashify;
    $hash_ref = hashify('/path/to/file.csv', 'primary_key');

    # Object-oriented interface
    use Text::CSV::Hashify;
    $obj = Text::CSV::Hashify->new( {
        file        => '/path/to/file.csv',
        format      => 'hoh', # hash of hashes, which is default
        key         => 'id',  # needed except when format is 'aoh'
        max_rows    => 20,    # number of records to read; defaults to all
        ... # other key-value pairs as appropriate from Text::CSV
    } );

    # all records requested
    $hash_ref       = $obj->all;

    # arrayref of fields input
    $fields_ref     = $obj->fields;

    # hashref of specified record
    $record_ref     = $obj->record('value_of_key');

    # value of one field in one record
    $datum          = $obj->datum('value_of_key', 'field');

    # arrayref of all unique keys seen
    $keys_ref       = $obj->keys;

The Comma-Separated-Value ('CSV') format is the most common way to store spreadsheets or the output of relational database queries in plain-text format. However, since commas (or other designated field-separator characters) may be embedded within data entries, the parsing of delimited records is non-trivial. Fortunately, in Perl this parsing is well handled by CPAN distribution Text::CSV <http://search.cpan.org/dist/Text-CSV/>. This permits us to address more specific data manipulation problems by building modules on top of Text::CSV.

Note: In this document we will use CSV as a catch-all for tab-delimited files, pipe-delimited files, and so forth. Please refer to the documentation for Text::CSV to learn how to handle field separator characters other than the comma.

Text::CSV::Hashify is designed for the case where you simply want to turn a CSV file into a Perl hash. In particular, it is designed for the case where:

  • the CSV file's first record is a list of fields in the ancestral database table; and
  • one field (column) functions as a primary key, i.e., each record's entry in that field is non-null and is distinct from every other record's entry therein.

Text::CSV::Hashify turns that kind of CSV file into one big hash of hashes. Elements of this hash are keyed on the entries in the designated primary key field, and the value for each element is a hash reference of all the data in a particular database record (including the primary key field and its value).

Text::CSV::Hashify can take gzip-compressed (.gz) files as input as well as uncompressed files.

You may, however, encounter cases where a CSV file's header row contains the list of database fields but no field is capable of serving as a primary key, i.e., there is no field in which the entry for that field in any record is guaranteed to be distinct from the entries in that field for all other records.

In this case, while an individual record can be turned into a hash, the CSV file as a whole cannot accurately be turned into a hash of hashes. As a fallback, Text::CSV::Hashify can, upon request, turn this into an array of hashes. In this case, you will not be able to look up a particular record by its primary key. You will instead have to know its index position within the array (which is equivalent to knowing its record number in the original CSV file minus 1).
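The difference between the two shapes can be sketched with plain Perl data structures. The records below are made-up sample data, not output from any real file, and the variable names are chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical data standing in for a parsed CSV file.

# Hash of hashes: possible because 'id' is distinct in every record.
# Each inner hash also carries the primary key field and its value.
my %hoh = (
    1 => { id => 1, name => 'Alice', city => 'Boston'  },
    2 => { id => 2, name => 'Bob',   city => 'Chicago' },
);

# Array of hashes: the fallback when no field is distinct in every record.
my @aoh = (
    { name => 'Alice', city => 'Boston'  },
    { name => 'Alice', city => 'Chicago' },
);

# Look up a record by primary key versus by index position.
my $city_by_key   = $hoh{2}{city};
my $city_by_index = $aoh[1]{city};

print "$city_by_key $city_by_index\n";
```

With the hash of hashes you only need to know a record's key; with the array of hashes you need to know where the record sits in the array.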

Text::CSV::Hashify provides two interfaces: one functional, one object-oriented.

Use the functional interface when all you want is to turn a CSV file with a primary key field into a hash of hashes.

Use the object-oriented interface for any more sophisticated manipulation of the CSV file. This includes:

  • Text::CSV options

    Access to any of the options available to Text::CSV, such as use of a separator character other than a comma. Note: Much of the time you will not need any of the Text::CSV options. Text::CSV::Hashify is focused on reading CSV files, whereas Text::CSV is focused on both reading and writing CSV files. Some Text::CSV options, such as "eol", are unlikely to be needed when using Text::CSV::Hashify. Hence, you should be very selective in your use of Text::CSV options.

  • Limit number of records

    Selection of a limited number of records from the CSV file, rather than slurping the whole file into your in-memory hash.

  • Array of hash references format

    Probably better than the default hash of hash references format when the CSV file has no field able to serve as a primary key.

  • Metadata

    Access to the list of fields, the list of all primary key values, the values in an individual record, or the value of an individual field in an individual record.

Note: On the recommendation of the authors/maintainers of Text::CSV, Text::CSV::Hashify will internally always set Text::CSV's "binary => 1" option.
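As a minimal sketch of passing a Text::CSV option through the object-oriented interface, the example below reads a semicolon-delimited file by supplying Text::CSV's "sep_char" option. The temporary file and its contents are invented for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Text::CSV::Hashify;

# Sample semicolon-delimited file; its name and contents are made up.
my ( $fh, $file ) = tempfile( SUFFIX => '.csv', UNLINK => 1 );
print {$fh} "id;name;city\n", "1;Alice;Boston\n", "2;Bob;Chicago\n";
close $fh or die "Cannot close $file: $!";

my $obj = Text::CSV::Hashify->new( {
    file     => $file,
    key      => 'id',
    sep_char => ';',    # Text::CSV option, handed on to the parser
} );

my $all = $obj->all;
print join( ',', sort keys %{$all} ), "\n";
```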

Text::CSV::Hashify by default exports one function: "hashify()".

    $hash_ref = hashify('/path/to/file.csv', 'primary_key');

or

    $hash_ref = hashify('/path/to/file.csv.gz', 'primary_key');

The function takes two arguments: the path to the CSV file and the name of the field in that file which serves as primary key. If the path to the input file ends in .gz, it is assumed to be compressed by gzip. If the file name ends in .psv (or .psv.gz), the separator character is assumed to be a pipe ("|"). If the file name ends in .tsv (or .tsv.gz), the separator character is assumed to be a tab ("\t"). Otherwise, the separator character will be assumed to be a comma (",").

Returns a reference to a hash of hash references.
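The following sketch exercises the functional interface on a pipe-delimited file, relying on the .psv extension to select the separator as described above. The directory, file name and records are assumptions made up for the example.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use Text::CSV::Hashify;

# Write a small sample file; the name 'people.psv' is hypothetical.
my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'people.psv' );
open my $out, '>', $file or die "Cannot write $file: $!";
print {$out} "id|name|city\n", "1|Alice|Boston\n", "2|Bob|Chicago\n";
close $out or die "Cannot close $file: $!";

# The .psv extension tells hashify() to split on the pipe character.
my $hash_ref = hashify( $file, 'id' );

print $hash_ref->{2}{name}, "\n";
```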

  • Purpose

    Text::CSV::Hashify constructor.

  • Arguments

        $obj = Text::CSV::Hashify->new( {
            file        => '/path/to/file.csv',
            format      => 'hoh', # hash of hashes, which is default
            key         => 'id',  # needed except when format is 'aoh'
            max_rows    => 20,    # number of records to read; defaults to all
            ... # other key-value pairs as appropriate from Text::CSV
        } );
        

    Single hash reference. Required element is:

  • "file"

    String: path to CSV file serving as input. If the path to the input file ends in .gz, it is assumed to be compressed by gzip.

    Element usually needed:

  • "key"

    String: name of field in CSV file serving as unique key. Needed except when optional element "format" is "aoh".

    Optional elements are:

  • "format"

    String: possible values are "hoh" and "aoh". Defaults to "hoh" (hash of hashes). In the "hoh" case, "new()" will fail if the same value is encountered in more than one record's entry in the "key" column. So if you know in advance that no field in your data holds a distinct value in every record, explicitly select "format => 'aoh'".

  • "max_rows"

    Number: provide this if you do not wish to populate the hash with all data records from the CSV file. (Will have no effect if the number provided is greater than or equal to the number of data records in the CSV file.)

  • Any option available to Text::CSV

    See documentation for either Text::CSV or Text::CSV_XS, but see discussion of "Text::CSV options" above.

  • Return Value

    Text::CSV::Hashify object.
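A short sketch of how "max_rows" and the duplicate-key failure interact, assuming a made-up file whose third data record repeats a key. The call that is expected to fail is wrapped in "eval" so the script can continue.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Text::CSV::Hashify;

# Sample file (invented): the key '2' appears in two data records.
my ( $fh, $file ) = tempfile( SUFFIX => '.csv', UNLINK => 1 );
print {$fh} "id,name\n", "1,Alice\n", "2,Bob\n", "2,Carol\n";
close $fh or die "Cannot close $file: $!";

# Reading only the first two data records stops before the duplicate.
my $limited = Text::CSV::Hashify->new( {
    file     => $file,
    key      => 'id',
    max_rows => 2,
} );
my $count = scalar keys %{ $limited->all };

# Reading every record in the default 'hoh' format meets the duplicate,
# so new() dies; eval traps the exception.
my $full = eval { Text::CSV::Hashify->new( { file => $file, key => 'id' } ) };
my $died = defined $full ? 0 : 1;

print "records read: $count; full read failed: $died\n";
```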

  • Purpose

    Get a representation of all data found in a CSV input file.

  • Arguments

        $hash_ref   = $obj->all; # when format is default or 'hoh'
        $array_ref  = $obj->all; # when format is 'aoh'
        
  • Return Value

    Reference representing all data records in the CSV input file. In the default case, or if you have specifically requested "format => 'hoh'", the return value is a hash reference. When you have requested "format => 'aoh'", the return value is an array reference.

  • Comment

    In the default ("hoh") case, the return value is equivalent to that of "hashify()".
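For the "aoh" case, a sketch along these lines shows how the array reference might be consumed when no field is unique. The sample file and the filtering on the "name" field are assumptions for illustration only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Text::CSV::Hashify;

# Sample file (invented) in which no field is unique across records.
my ( $fh, $file ) = tempfile( SUFFIX => '.csv', UNLINK => 1 );
print {$fh} "name,city\n", "Alice,Boston\n", "Alice,Chicago\n", "Bob,Boston\n";
close $fh or die "Cannot close $file: $!";

# No 'key' element is needed when the format is 'aoh'.
my $obj = Text::CSV::Hashify->new( { file => $file, format => 'aoh' } );
my $array_ref = $obj->all;

# Without a primary key, records are found by scanning the array.
my @alices = grep { $_->{name} eq 'Alice' } @{$array_ref};
my @cities = sort map { $_->{city} } @alices;

print "@cities\n";
```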

  • Purpose

    Get a list of the fields in the CSV source.

  • Arguments

        $fields_ref = $obj->fields;
        
  • Return Value

    Array reference.

  • Comment

    If any field names are duplicate, you will not get this far, as "new()" would have died.
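One possible use of "fields()", sketched under the assumption of a small invented file, is to confirm that a column exists before asking for data from it.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Text::CSV::Hashify;

# Sample file; its contents are made up for the example.
my ( $fh, $file ) = tempfile( SUFFIX => '.csv', UNLINK => 1 );
print {$fh} "id,name,city\n", "1,Alice,Boston\n", "2,Bob,Chicago\n";
close $fh or die "Cannot close $file: $!";

my $obj = Text::CSV::Hashify->new( { file => $file, key => 'id' } );
my $fields_ref = $obj->fields;

# Build a lookup table of field names taken from the header row.
my %is_field = map { $_ => 1 } @{$fields_ref};

print "fields: @{$fields_ref}\n";
print "has 'city' column\n" if $is_field{city};
```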

  • Purpose

    Get a hash representing one record in the CSV input file.

  • Arguments

        $record_ref = $obj->record('value_of_key');
        

    One argument. In the default case ("format => 'hoh'"), this argument is the value in the record in the column serving as unique key.

    In the "format => 'aoh'" case, this will be index position of the data record in the array. (The header row will be at index 0.)

  • Return Value

    Hash reference.

  • Purpose

    Get value of one field in one record.

  • Arguments

        $datum = $obj->datum('value_of_key', 'field');
        

    List of two arguments: the value in the record in the column serving as unique key; the name of the field.

  • Return Value

    Scalar.
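The two lookups, "record()" and "datum()", can be compared side by side in a sketch like the following, which assumes the default "hoh" format and an invented sample file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Text::CSV::Hashify;

# Sample file; its contents are made up for the example.
my ( $fh, $file ) = tempfile( SUFFIX => '.csv', UNLINK => 1 );
print {$fh} "id,name,city\n", "1,Alice,Boston\n", "2,Bob,Chicago\n";
close $fh or die "Cannot close $file: $!";

my $obj = Text::CSV::Hashify->new( { file => $file, key => 'id' } );

# Whole record for the row whose 'id' is 2, as a hash reference.
my $record_ref = $obj->record('2');

# A single field from that same row, as a scalar.
my $datum = $obj->datum( '2', 'city' );

print "$record_ref->{name} lives in $datum\n";
```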

  • Purpose

    Get a list of all unique keys found in the input file.

  • Arguments

        $keys_ref = $obj->keys;
        
  • Return Value

    Array reference.

  • Comment

    If you have selected "format => 'aoh'" in the options to "new()", the "keys" method is inappropriate and will cause your program to die.
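A sketch of "keys()" in both formats, assuming an invented sample file. The call on the "aoh" object is wrapped in "eval" because, as noted above, it is expected to die.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Text::CSV::Hashify;

# Sample file; its contents are made up for the example.
my ( $fh, $file ) = tempfile( SUFFIX => '.csv', UNLINK => 1 );
print {$fh} "id,name\n", "3,Carol\n", "1,Alice\n", "2,Bob\n";
close $fh or die "Cannot close $file: $!";

# Default 'hoh' format: keys() returns the values seen in the key column.
my $hoh_obj  = Text::CSV::Hashify->new( { file => $file, key => 'id' } );
my $keys_ref = $hoh_obj->keys;
my @sorted   = sort { $a <=> $b } @{$keys_ref};

# 'aoh' format: keys() is inappropriate, so trap the exception.
my $aoh_obj = Text::CSV::Hashify->new( { file => $file, format => 'aoh' } );
my $ok = eval { $aoh_obj->keys; 1 };

print "keys: @sorted\n";
print "keys() died for 'aoh'\n" unless $ok;
```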

    James E Keenan
    CPAN ID: jkeenan
    jkeenan@cpan.org
    http://thenceforward.net/perl/modules/Text-CSV-Hashify

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.

Copyright 2012-2018, James E Keenan. All rights reserved.

There are no bug reports outstanding on Text::CSV::Hashify as of the most recent CPAN upload date of this distribution.

To report any bugs or make any feature requests, please send mail to "bug-Text-CSV-Hashify@rt.cpan.org" or use the web interface at <http://rt.cpan.org>.

Thanks to Christine Shieh for serving as the alpha consumer of this library's output.

The Text-CSV and Text-CSV_XS distributions underlie Text-CSV-Hashify and provide all of its file-parsing functionality. Where possible, install both. That will enable you to process a file with a single, shared interface but have access to the faster processing speeds of XS where available.

Like Text-CSV-Hashify, Text-CSV-Slurp slurps an entire CSV file into memory, but stores it as an array of hashes instead.

This distribution inspired the "max_rows" option to "new()".
2018-05-22 perl v5.32.1
