NAME

Genezzo::Row::RSTab.pm - Row Source TABle tied hash class.

SYNOPSIS

    use Genezzo::Row::RSTab;

    # see Tablespace.pm -- implementation and usage is tightly tied
    # to genezzo engine...

    # make a factory for rsfile
    my $fac2 = make_fac2('Genezzo::Row::RSFile');

    my %args = (
        factory => $fac2,
        # need tablename, bufcache, etc...
        tablename => ...
        tso => ...
        bufcache => ...
    );

    my %td_hash;
    $tie_val = tie %td_hash, 'Genezzo::Row::RSTab', %args;

    # pushhash style
    my @rowarr = ("this is a test", "and this is too");
    my $newkey = $tie_val->HPush(\@rowarr);

    # update the row just pushed, using the tied hash declared above
    @rowarr = ("update this entry", "and this is too");
    $td_hash{$newkey} = \@rowarr;

    my $getcount = $tie_val->HCount();

DESCRIPTION

RSTab is a hierarchical pushhash (see Genezzo::PushHash::hph) class that stores perl arrays as rows in a table, writing them into a block (byte buffer) via Genezzo::Row::RSFile and Genezzo::Block::RDBlock.

ARGUMENTS
CONCEPTS

Logically, a table is made of rows, and rows are vectors of columns. Physically (at least from an OS implementation viewpoint), a table is made up of blocks stored in files. The RSTab hierarchical pushhash (hph) uses an RSFile factory, though it could be constructed as an hph of arbitrary depth. The basic HPush mechanism takes an array, flattens it into a string, and pushes the string into one of the underlying blocks.

While the RSTab API is primarily intended as a row-based interface, it has some extensions to directly manipulate the underlying blocks. These extensions are useful for building specialized index mechanisms (see Genezzo::Index) like B-trees, or for supporting rows that span multiple blocks.

Basic PushHash

You can use RSTab as a persistent hash of arrays of scalars if you like. The arrays and scalars can be of arbitrary length (as long as they fit in your datafiles).

SQL DBI-style interface

RSTab is designed to efficiently support prepare/execute/fetch operations against tables. What distinguishes this API from a standard hash is that the "prepare" operation generates a custom, stateful iterator that understands filters and range selection. A filter is simply a predicate which is applied to every row -- rows which pass are returned to the caller, and rows which fail are "filtered out". Range selection is similar, but works with start and stop keys -- the iterator returns only the rows whose values fall within that range. In general, range selection is driven off a separate indexing mechanism that positions the fetch to retrieve the range efficiently, rather than fetching all rows and filtering out those outside the range.

HPHRowBlk - Row and Block operations

HPHRowBlk is a special pushhash subclass with certain direct block manipulation methods. One very useful function is HSuck, which provides support for rows that span multiple blocks.
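To make the idea of a row spanning several blocks concrete, here is a minimal Perl sketch. It does not use the Genezzo API and makes no claim about how HSuck is actually implemented: make_block and suck_row are hypothetical helpers, and a "block" is modeled as a plain hash with a fixed byte capacity. The sketch only shows a flattened row being stored piecewise, with each block taking as many bytes as it has room for.

```perl
use strict;
use warnings;

# Hypothetical illustration only: these helpers are NOT part of Genezzo.
# A "block" is modeled as a hash with a fixed byte capacity, a count of
# bytes used, and the list of row pieces stored in it.
sub make_block {
    my ($capacity) = @_;
    return { capacity => $capacity, used => 0, pieces => [] };
}

# Offer a flattened row to each block in turn. Every block takes as many
# bytes as it has free; the remainder moves on to the next block.
# Returns the number of blocks that received a piece, or undef if the
# blocks ran out of space before the whole row was stored. (A real
# implementation would also have to undo the partial writes on failure;
# this sketch does not.)
sub suck_row {
    my ($blocks, $row) = @_;
    my $offset  = 0;
    my $touched = 0;
    for my $blk (@$blocks) {
        last if $offset >= length($row);
        my $free = $blk->{capacity} - $blk->{used};
        next if $free <= 0;
        my $piece = substr($row, $offset, $free);
        push @{ $blk->{pieces} }, $piece;
        $blk->{used} += length($piece);
        $offset      += length($piece);
        $touched++;
    }
    return $offset >= length($row) ? $touched : undef;
}

# Three 8-byte blocks and an 18-byte row: no single block can hold it.
my @blocks  = map { make_block(8) } 1 .. 3;
my $row     = "this is a test row";
my $count   = suck_row(\@blocks, $row);
my $rebuilt = join '', map { @{ $_->{pieces} } } @blocks;

# prints: pieces: 3, intact: yes
print "pieces: $count, intact: ", ($rebuilt eq $row ? "yes" : "no"), "\n";
```

Concatenating the pieces in block order reconstructs the original row, which is the property any multi-block row scheme has to preserve.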
While the standard HPush fails if a row exceeds the space in a single block, the HSuck API lets the underlying blocks consume the rows in pieces -- each block "sucks up" as much of the row as it can. The RSTab HPush is re-implemented on top of HSuck to support large rows.

Counting, Estimation, Approximation

RSTab has some support for count estimation, inspired by some of Peter Haas' work ("Sequential Sampling Procedures for Query Size Estimation", ACM SIGMOD 1992; "Online Aggregation" (with J. Hellerstein and H. Wang), ACM SIGMOD 1997; "Ripple Joins for Online Aggregation" (with J. Hellerstein), ACM SIGMOD 1999). It could use support for confidence intervals, so drop me a line if you understand the Central Limit Theorem and the Hoeffding and Chebyshev inequalities. Knowledge of change-points and time-series is also a plus.

FUNCTIONS

RSTab supports all standard hph hierarchical pushhash operations, with the extension that it manipulates arrays of scalars, not individual scalars.

EXPORT

LIMITATIONS

various

TODO
AUTHOR

Jeffrey I. Cohen, jcohen@genezzo.com

SEE ALSO

Genezzo::PushHash::HPHRowBlk, Genezzo::PushHash::hph, Genezzo::PushHash::PushHash, Genezzo::Tablespace, Genezzo::Row::RSFile, Genezzo::Row::RSBlock, Genezzo::Block::RDBlock, Genezzo::BufCa::BCFile, Genezzo::BufCa::BufCaElt, perl(1).

Copyright (c) 2003, 2004, 2005 Jeffrey I Cohen. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

Address bug reports and comments to: jcohen@genezzo.com

For more information, please visit the Genezzo homepage at <http://www.genezzo.com>