NAME

FlatFile - Manipulate flat-file databases


SYNOPSIS

  # Direct use
  use UPenn::ISC::FlatFile;
  my $password = UPenn::ISC::FlatFile->new(FILE => $filename, 
                   FIELDS => [qw(username password uid gid gecos home shell)],
                   MODE => "+<",  # "<" for read-only access
                   RECSEP => "\n", FIELDSEP => ":");
  my ($mjd) = $password->lookup(username => "mjd");
  print "mjd: ", $mjd->uid, "\n";
  # Look up all records for which function returns true
  sub is_chen { $_{gecos} =~ /\bChen$/ }
  my @chens = $password->c_lookup(\&is_chen);
  for (@chens) { $_->set_shell("/bin/false") }
  $mjd->delete;  # delete MJD from file
  $password->flush;  # write out changes to file
  # Subclass:
  package PasswordFile;
  our @ISA = 'UPenn::ISC::FlatFile';
  our @FIELDS = qw(username password uid gid gecos home shell);
  our $RECSEP = "\n";
  our $FIELDSEP = ":";
  our $MODE = "<";
  our $FILE = "/etc/passwd";
  # Main program uses subclass:
  package main;
  my $password = PasswordFile->new;
  ... the rest as above ...


DESCRIPTION

FlatFile is a class for manipulating flat-file (plain text) databases. One first opens the database, obtaining a database object. Queries may be performed on the database object, yielding record objects, which can in turn be queried to retrieve information from the database. If the database is writable, the record objects can be updated, and the updates written back to the file.

The module locks the file while operating on it, and all updates are done atomically.

Subclasses of this module can be created to represent specific files, such as the Unix password file or the Account Management db.list file.


Methods

$db = UPenn::ISC::FlatFile->new(FILE => $filename, FIELDS => [...], ...);

The new method opens the database. At least two arguments are required: the FILE argument that gives the path at which the data can be accessed, and the FIELDS argument that names the fields, in order.

By default, the file will be opened for reading only. To override this, supply a MODE argument whose value is a mode string like the one given as the second argument to the Perl built-in open function. For read-write access, you should probably use MODE => "+<". To modify the database, you will need permission to write the data file itself and the directory in which it resides.

The file will be assumed to contain ``records'' that are divided into ``fields''. By default, records are assumed to be terminated with a newline character; to override this, use RECSEP => $separator. Fields are assumed to be separated by whitespace; to override, use FIELDSEP => $pattern. $pattern may be a compiled regex object or a literal string. If it is a compiled regex, you must also supply an example separator string with FIELDSEPSTR; that string will be used when writing out records. For example, for the Unix password file, whose fields are separated by colons, use:

        FIELDSEP => ":"

but for a file whose fields are separated by one or more space characters, use:

        FIELDSEP => qr/ +/,  FIELDSEPSTR => "  "

The FIELDSEPSTR argument tells the module to use two spaces between fields when writing out new records.
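The reason both arguments are needed can be sketched in plain Perl, without the module: a pattern is enough to split a line, but writing a line requires one concrete separator string. This is only an illustration of the idea, not the module's own code; parse_record and format_record are hypothetical helper names.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only; not part of FlatFile.
my $FIELDSEP    = qr/ +/;   # pattern used when reading a record
my $FIELDSEPSTR = "  ";     # literal string used when writing a record

sub parse_record {
    my ($line) = @_;
    # The -1 limit keeps trailing empty fields instead of dropping them.
    return split $FIELDSEP, $line, -1;
}

sub format_record {
    my (@fields) = @_;
    # A regex cannot be printed as a separator, so a literal string is used.
    return join $FIELDSEPSTR, @fields;
}

my @fields = parse_record("alice   1001 staff");
print format_record(@fields), "\n";
```

Note that a record read with irregular spacing is written back with exactly two spaces between fields, so a round trip normalizes the separators.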

When changes are written to the disk, the module first copies the modified data to a temporary file, then atomically replaces the old file with the temporary file. To specify a temporary filename, use TMPFILE => $filename. Otherwise, it will default to the name of the main file with ".tmp" appended.
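The write-then-rename technique described above can be sketched with core Perl alone. This is a minimal illustration under stated assumptions, not the module's implementation: atomic_rewrite is a hypothetical helper, and it omits the locking that the module performs.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only; not FlatFile's own code.
sub atomic_rewrite {
    my ($file, $tmpfile, @records) = @_;

    # Same default as described above: main file name plus ".tmp".
    $tmpfile = "$file.tmp" unless defined $tmpfile;

    # Write the complete new contents to the temporary file first.
    open my $out, ">", $tmpfile or die "Cannot open $tmpfile: $!";
    print {$out} @records       or die "Cannot write $tmpfile: $!";
    close $out                  or die "Cannot close $tmpfile: $!";

    # On POSIX systems rename() replaces the target in one step,
    # provided both names are on the same filesystem.
    rename $tmpfile, $file      or die "Cannot rename $tmpfile to $file: $!";
    return 1;
}
```

Because rename changes a directory entry, this approach needs write permission on the directory that holds the file, which matches the permission requirement stated for MODE above.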

Record objects will be allocated in dynamically generated classes named UPenn::ISC::FlatFile::Rec::A, UPenn::ISC::FlatFile::Rec::B, and so on, which inherit from UPenn::ISC::FlatFile::Rec. To override this choice of class, supply a class name with RECBASECLASS => $classname.

The data file will be opened and locked using the UPenn::ISC::Lock module. To override this choice, use LOCK_FACTORY => $factory where $factory is a factory object that supports the UPenn::ISC::Lock interface, or the name of a class that supports that interface.

$db->lookup($field, $value)

Returns an array of all records in the database for which the field $field contains the value $value. For information about record objects, see Record objects below.

Field contents are always compared stringwise. For numeric or other comparisons, use c_lookup instead.

The behavior in scalar context is undefined.
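The difference between stringwise and numeric comparison can be seen in a small self-contained sketch. The records here are plain hashes standing in for real record objects, and match_string and match_number are hypothetical helpers, not module methods.

```perl
use strict;
use warnings;

# Stand-in records as plain hashes; real record objects come from the module.
my @records = (
    { username => "root",  uid => "0"   },
    { username => "admin", uid => "00"  },
    { username => "mjd",   uid => "119" },
);

# Hypothetical helpers, not part of FlatFile.
sub match_string {
    my ($field, $value, @recs) = @_;
    return grep { $_->{$field} eq $value } @recs;   # stringwise, like lookup
}

sub match_number {
    my ($field, $value, @recs) = @_;
    return grep { $_->{$field} == $value } @recs;   # numeric
}

my @by_string = match_string(uid => "0", @records);
my @by_number = match_number(uid => 0,   @records);
printf "eq matched %d, == matched %d\n", scalar @by_string, scalar @by_number;
```

The string "00" is not equal to "0" as a string but is equal to it as a number. With the module itself, the numeric test would presumably be written as a predicate, for example $db->c_lookup(sub { $_{uid} == 0 }).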

$db->c_lookup($predicate)

Returns an array of all records in the database for which the predicate function $predicate returns true. For information about record objects, see Record objects below.

The predicate function will be called repeatedly, once for each record in the database.

Each record will be passed to the predicate function as a hash, with field names as the hash keys and record data as the hash values. The global variable %_ will also be initialized to contain the current record hash. For example, if $db is the Unix password file, then we can search for people named ``Chen'' like this:

        sub is_chen {
          my %data = @_;
          $data{gecos} =~ /\bChen$/;
        }
        @chens = $db->c_lookup(\&is_chen);

Or, using the %_ variable, like this:

        sub is_chen { $_{gecos} =~ /\bChen$/ }
        @chens = $db->c_lookup(\&is_chen);

The behavior in scalar context is undefined.
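To make the two calling conventions concrete, here is one way a c_lookup-like function could hand each record to its predicate. It is a sketch over plain hash references, not the module's implementation; select_records is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for c_lookup, operating on a list of hash references.
sub select_records {
    my ($predicate, @records) = @_;
    my @matches;
    for my $rec (@records) {
        # Expose the current record as the global %_ for this call only.
        local %_ = %$rec;
        # Also pass the same data as a flat key/value argument list.
        push @matches, $rec if $predicate->(%$rec);
    }
    return @matches;
}

my @people = (
    { username => "achen",  gecos => "Amy Chen"  },
    { username => "bsmith", gecos => "Bob Smith" },
);

my @chens = select_records(sub { $_{gecos} =~ /\bChen$/ }, @people);
print "$_->{username}\n" for @chens;
```

Using local restores the previous contents of %_ after each iteration, so the predicate cannot leak record data into the caller.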

$db->rec_count

Returns the number of records in the database.

my $record = $db->nextrec

Get the next record from the database and return a record object representing it. Each call to nextrec returns a different record. Returns an undefined value when there are no more records left.

For information about record objects, see Record objects below.

To rewind the database so that nextrec will start at the beginning, use the rewind method.

The following code will scan all the records in the database:

        $db->rewind;
        while (my $rec = $db->nextrec) {
          ... do something with $rec...
        }

$db->append(@data)


Create a new record and add it to the database. New records are not necessarily written out until the flush method is called. The new records will be added at the end of the file.

@data is a complete set of data values for the new record, in the appropriate order. It is a fatal error to pass too many or too few values.

$db->delete_rec($record)

Delete a record from the database. $record should be a record object, returned from a previous call to lookup, nextrec, or some similar function. The record will be removed from the disk file when the flush method is called.

Returns true on success, false on failure.

$db->flush

Adding new records and deleting or modifying old records are performed in memory only, until flush is called. At that point, the program will copy the original data file, making all requested modifications, and then atomically replace the original file with the new copy.

Returns true on success, false if the update was not performed.

flush is also called automatically when the program exits.

$db->has_field($fieldname)

Returns true if the database contains a field with the specified name.


Record objects

Certain methods return ``record objects'', each of which represents a single record. The data can be accessed and the database can be modified by operating on these record objects.

Each object supports a series of accessor methods that are named after the fields in the database. If the database contains a field ``color'', for example, record objects resulting from queries on that database will support a get_color method to retrieve the color value from a record, and a synonymous color method that does exactly the same thing. If the database was opened for writing, the record objects will also support a set_color method to modify the color in a record. The effects of the set_* methods will be propagated to the file when the database is flushed.
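Per-field accessors of this kind are commonly generated by installing closures into a class's symbol table. The following is a minimal sketch of that general technique, not the module's actual code; make_accessors and the Demo::Rec class are hypothetical, and records are modeled as plain blessed hashes.

```perl
use strict;
use warnings;

# Hypothetical helper: install get_X, X, and set_X methods into $class.
sub make_accessors {
    my ($class, @fields) = @_;
    no strict 'refs';                      # needed for symbol-table assignment
    for my $f (@fields) {
        my $get = sub { $_[0]{$f} };
        *{"${class}::get_$f"} = $get;
        *{"${class}::$f"}     = $get;      # synonym for get_$f
        *{"${class}::set_$f"} = sub { $_[0]{$f} = $_[1] };
    }
    return $class;
}

make_accessors("Demo::Rec", qw(color size));
my $rec = bless { color => "red", size => 3 }, "Demo::Rec";
$rec->set_color("blue");
print $rec->color, " ", $rec->get_size, "\n";
```

Generating a separate class per database keeps the accessors for one file's fields from appearing on records that belong to another file.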

Other methods follow.

$record->fields

Returns a list of the fields in the object, in order.

$record->db

Returns the database object from which the record was originally selected. This example shows how one might modify a record and then write the change to disk, even if the original database object was unavailable:

        $employee->set_salary(1.06 * $employee->salary);
        $employee->db->flush;

%hash = $record->as_hash

Returns a hash containing all the data in the record. The keys in the hash are the field names, and the corresponding values are the record data.

@data = $record->as_array

Return the record data values only.

$line = $record->as_string

Return the record data in the same form that it appeared in the original file. For example, if the record were selected from the Unix password file, this might return the string "root:x:0:0:Porpoise Super-User:/:/sbin/sh".
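The relationship between the three views (string, array, hash) can be sketched in plain Perl for a colon-separated record. The helpers line_to_array and array_to_hash are hypothetical and only illustrate the data shapes; they are not the module's methods.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.
sub line_to_array {
    my ($line, $sep) = @_;
    # \Q...\E quotes the separator so it is matched literally.
    return split /\Q$sep\E/, $line, -1;
}

sub array_to_hash {
    my ($fields, $data) = @_;
    my %h;
    @h{@$fields} = @$data;     # hash slice pairs each field name with its value
    return %h;
}

my @fields = qw(username password uid gid gecos home shell);
my $line   = "root:x:0:0:Porpoise Super-User:/:/sbin/sh";

my @data = line_to_array($line, ":");
my %rec  = array_to_hash(\@fields, \@data);
print "$rec{username} uses $rec{shell}\n";
```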

$record->delete

Delete this record from its associated database. It will be removed from the disk file the next time the database object is flushed.

AUTHOR

Mark Jason Dominus (mjd@isc.upenn.edu)

  $Id: FlatFile.pm,v 1.9 2006/06/12 18:42:57 mjd Exp $
  $Revision: 1.9 $