Tree::Simple(3)    User Contributed Perl Documentation    Tree::Simple(3)
Tree::Simple - A simple tree object
    use Tree::Simple;

    # make a tree root
    my $tree = Tree::Simple->new("0", Tree::Simple->ROOT);

    # explicitly add a child to it
    $tree->addChild(Tree::Simple->new("1"));

    # specify the parent when creating
    # an instance and it adds the child implicitly
    my $sub_tree = Tree::Simple->new("2", $tree);

    # chain method calls
    $tree->getChild(0)->addChild(Tree::Simple->new("1.1"));

    # add more than one child at a time
    $sub_tree->addChildren(
        Tree::Simple->new("2.1"),
        Tree::Simple->new("2.2")
    );

    # add siblings
    $sub_tree->addSibling(Tree::Simple->new("3"));

    # insert a child at a specified index
    $sub_tree->insertChild(1, Tree::Simple->new("2.1a"));

    # clean up circular references
    $tree->DESTROY();
Alternately, to avoid calling Tree::Simple->new(...) just to
add a node:
    use Tree::Simple;
    use Data::TreeDumper; # Provides DumpTree().

    # ---------------

    my($root) = Tree::Simple->new('Root', Tree::Simple->ROOT);

    $root->generateChild('Child 1.0');
    $root->generateChild('Child 2.0');
    $root->getChild(0)->generateChild('Grandchild 1.1');

    print DumpTree($root);

    $root->DESTROY;
This module is a fully object-oriented implementation of a simple n-ary tree.
It is built upon the concept of parent-child relationships, so every
Tree::Simple object has both a parent and a set of children (who
themselves may have children, and so on). Every Tree::Simple object
also has siblings, as they are just the children of their immediate parent.
It can be used to model hierarchical information such as a
file-system, the organizational structure of a company, an object
inheritance hierarchy, versioned files from a version control system or even
an abstract syntax tree for use in a parser. It makes no assumptions as to
your intended usage, but instead simply provides the structure and means of
accessing and traversing said structure.
This module uses exceptions and a minimal Design By Contract
style. All method arguments are required unless the documentation says
otherwise; if a required argument is not defined, an exception will
usually be thrown. Many arguments are also required to be of a specific
type. For instance, the $parent argument to the
constructor must be a Tree::Simple object or an object derived
from Tree::Simple, otherwise an exception is thrown. This may seem
harsh to some, but it allows me to have confidence that my code works
as I intend, and lets you enjoy the same level of confidence when using
this module. Note, however, that this module does not use any Exception or
Error module; the exceptions are just strings thrown with
"die".
I consider this module to be production stable. It is based on a
module which has been in use on a few production systems for approx. 2 years
now with no issues. The only differences are that the code has been cleaned up
a bit, comments have been added and thorough tests written for its public release.
I am confident it behaves as I would expect it to, and is (as far as I know)
bug-free. I have not stress-tested it under extreme duress, but I do not
intend for it to be used in that type of situation. If this module
cannot keep up with your Tree needs, I suggest switching to one of the
modules listed in the "OTHER TREE MODULES" section below.
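As a rough illustration of the string exceptions described in this section, the following sketch traps one with eval. It relies only on the documented rule that a $parent which is not a Tree::Simple object is rejected; the exact wording of the message is not assumed.

```perl
use strict;
use warnings;
use Tree::Simple;

# Passing a plain string as $parent violates the contract,
# so the constructor is expected to die with a string message.
my $tree  = eval { Tree::Simple->new('node', 'not a tree') };
my $error = $@;

# $error is an ordinary string, not an exception object
print "Caught: $error" if $error;
```

Because the exceptions are plain strings, eval and $@ are all that is needed to handle them; no exception class has to be loaded.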
- ROOT
- This class constant serves as a placeholder for the root of our tree. If a
tree does not have a parent, then it is considered a root.
- new ($node, $parent)
- The constructor accepts two arguments: a $node
value and an optional $parent. The
$node value can be any scalar value (which
includes references and objects). The optional
$parent value must be a Tree::Simple
object, or an object derived from Tree::Simple. Setting this value
implies that your new tree is a child of the parent tree, and therefore
adds it to the children of that parent. If the
$parent is not specified then its value defaults
to ROOT.
- setNodeValue ($node_value)
- This sets the node value to the scalar
$node_value. An exception is thrown if
$node_value is not defined.
- setUID ($uid)
- This allows you to set your own unique ID for this specific Tree::Simple
object. A default value derived from the hex address of the object is
provided for you, so use of this method is entirely optional. It is the
responsibility of the user to ensure the value is unique; all that is
tested by this method is that $uid is a true value
(evaluates to true in a boolean context). For more information about
the Tree::Simple UID see the "getUID"
method.
- addChild ($tree)
- This method accepts only Tree::Simple objects or objects derived
from Tree::Simple; an exception is thrown otherwise. This method
will append the given $tree to the end of the
children list, and set up the correct parent-child relationships. This
method returns its invocant so that method calls can
be chained, such as:
    my $tree = Tree::Simple->new("root")->addChild(Tree::Simple->new("child one"));
Or the more complex:
    my $tree = Tree::Simple->new("root")->addChild(
        Tree::Simple->new("1.0")->addChild(
            Tree::Simple->new("1.0.1")
        )
    );
- generateChild ($scalar)
- This method accepts a scalar and calls
addChild(Tree::Simple->new($scalar)), purely to save you the effort of
needing to use
"Tree::Simple->new(...)" as the
parameter.
- addChildren (@trees)
- This method accepts an array of Tree::Simple objects, and adds them
to the children list. Like "addChild"
this method will return its invocant to allow for method call
chaining.
- insertChild ($index, $tree)
- This method accepts a numeric $index and a
Tree::Simple object ($tree), and inserts
the $tree into the children list at the specified
$index. This results in the shifting down of all
children after the $index. The
$index is checked to be sure it is within the bounds of
the child list; if it is out of bounds, an exception is thrown. The
$tree argument is verified to be a
Tree::Simple or Tree::Simple derived object; if this
condition fails, an exception is thrown.
- insertChildren ($index, @trees)
- This method functions much as insertChild does, but instead of inserting a
single Tree::Simple, it inserts an array of Tree::Simple
objects. It too bounds checks the value of $index
and type checks the objects in @trees just as
"insertChild" does.
- removeChild ($child | $index)
- Accepts two different arguments. If given a Tree::Simple object
($child), this method finds that specific
$child by comparing it with all the other children
until it finds a match, at which point the $child
is removed. If no match is found, an exception is thrown. If a
non-Tree::Simple object is given as the
$child argument, an exception is thrown.
This method also accepts a numeric
$index and removes the child found at that index
within the list of children. The $index is
bounds checked; if this check fails, an exception is thrown.
When a child is removed, it results in the shifting up of all
children after it, and the removed child is returned. The removed child
is properly disconnected from the tree and all its references to its old
parent are removed. However, in order to properly clean up any circular
references the removed child might have, it is advised to call the
"DESTROY" method. See the
"CIRCULAR REFERENCES" section for more information.
- addSibling ($tree)
- addSiblings (@trees)
- insertSibling ($index, $tree)
- insertSiblings ($index, @trees)
- The "addSibling",
"addSiblings",
"insertSibling" and
"insertSiblings" methods pass along
their arguments to the "addChild",
"addChildren",
"insertChild" and
"insertChildren" methods of their parent
object respectively. This eliminates the need to overload these methods in
subclasses which may have specialized versions of the *Child(ren) methods.
The one exception is that if an attempt is made to add or insert siblings
to the ROOT of the tree then an exception is thrown.
NOTE: There is no
"removeSibling" method as I felt it was
probably a bad idea. The same effect can be achieved by manual upwards
traversal.
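The mutator and sibling methods described above might be combined as in the following sketch. The node values and variable names are made up for illustration; the expected output in the comments follows from the documented shifting behaviour.

```perl
use strict;
use warnings;
use Tree::Simple;

my $root = Tree::Simple->new('root');
$root->addChildren(map { Tree::Simple->new($_) } qw(a c));

# insert 'b' at index 1, shifting 'c' down
$root->insertChild(1, Tree::Simple->new('b'));

# addSibling() is forwarded to addChild() on the parent ($root)
$root->getChild(0)->addSibling(Tree::Simple->new('d'));
my @before = map { $_->getNodeValue } $root->getAllChildren;
print "@before\n";                           # a b c d

# remove by index; the removed child is returned
my $removed = $root->removeChild(0);
print $removed->getNodeValue, "\n";          # a

# remove by object; 'c' has shifted up to index 1
my $target = $root->getChild(1);
$root->removeChild($target);
my @after = map { $_->getNodeValue } $root->getAllChildren;
print "@after\n";                            # b d

# the root has no parent to forward to, so adding a sibling throws
my $ok = eval { $root->addSibling(Tree::Simple->new('orphan')); 1 };
print "Caught: $@" unless $ok;

# clean up circular references
$_->DESTROY for $removed, $target, $root;
```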
- getNodeValue
- This returns the value stored in the node field of the object.
- getUID
- This returns the unique ID associated with this particular tree. This can
be custom set using the "setUID" method,
or you can just use the default. The default is the hex-address extracted
from the stringified Tree::Simple object. This may not be a
universally unique identifier, but it should be adequate for at
least the current instance of your perl interpreter. If you need a UUID,
one can be generated with an outside module (there are
many to choose from on CPAN) and the
"setUID" method (see above).
- getChild ($index)
- This returns the child (a Tree::Simple object) found at the
specified $index. Note that we do use standard
zero-based array indexing.
- getAllChildren
- This returns an array of all the children (all Tree::Simple
objects). It will return an array reference in scalar context.
- getSibling ($index)
- getAllSiblings
- Much like "addSibling" and
"addSiblings", these two methods simply
call "getChild" and
"getAllChildren" on the parent of the
invocant.
See also "getSiblingCount".
Warning: This method includes the invocant, so it is not
really all siblings but rather all children of the parent!
- getSiblingCount
- Returns 0 if the invocant is the root node. Otherwise returns the count of
siblings, which excludes the invocant.
See also "getAllSiblings".
Warning: This differs from the number of items returned by
getAllSiblings() just above, which for some
reason includes the invocant. I cannot change getAllSiblings()
now for a module first released in 2004.
- getDepth
- Returns a number representing the depth of the invocant within the
hierarchy of Tree::Simple objects.
NOTE: A "ROOT" tree
has the depth of -1. This is because Tree::Simple assumes that a root
node will usually not contain data, but just be an anchor for the
data-containing branches. This may not be intuitive in all cases, so I
mention it here.
- getParent
- Returns the parent of the invocant, which could be either ROOT or a
Tree::Simple object.
- getHeight
- Returns a number representing the length of the longest path from the
current tree to the furthest leaf node.
- getWidth
- Returns a number representing the breadth of the current tree;
basically, it is a count of all the leaf nodes.
- getChildCount
- Returns the number of children the invocant contains.
- getIndex
- Returns the index of this tree within its sibling list. Returns -1 if the
tree is the root.
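To make the accessor conventions above concrete (a root depth of -1, zero-based indexes, and the difference between "getAllSiblings" and "getSiblingCount"), here is a small sketch. The file-system-like node values are invented for the example. Height and width are printed without stated values, since they depend on how the module counts a single leaf.

```perl
use strict;
use warnings;
use Tree::Simple;

my $fs  = Tree::Simple->new('/');
my $usr = Tree::Simple->new('usr', $fs);
my $etc = Tree::Simple->new('etc', $fs);
my $bin = Tree::Simple->new('bin', $usr);
my $lib = Tree::Simple->new('lib', $usr);

for my $node ($fs, $usr, $bin, $etc) {
    printf "%-3s depth=%2d index=%2d children=%d\n",
        $node->getNodeValue, $node->getDepth,
        $node->getIndex,     $node->getChildCount;
}
# /   depth=-1 index=-1 children=2
# usr depth= 0 index= 0 children=2
# bin depth= 1 index= 0 children=0
# etc depth= 0 index= 1 children=0

# getAllSiblings() includes the invocant, getSiblingCount() does not
my @all_siblings  = $bin->getAllSiblings;
my $sibling_count = $bin->getSiblingCount;
print scalar(@all_siblings), " vs $sibling_count\n";    # 2 vs 1

# height and width of the whole tree
print "height=", $fs->getHeight, " width=", $fs->getWidth, "\n";

$fs->DESTROY;
```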
- isLeaf
- Returns true (1) if the invocant does not have any children, false (0)
otherwise.
- isRoot
- Returns true (1) if the invocant has a "parent" of ROOT,
returns false (0) otherwise.
- isFirstChild
- Returns 0 if the invocant is the root node.
Returns 1 if the invocant is the first child in the parental
list of children. Otherwise returns 0.
- isLastChild
- Returns 0 if the invocant is the root node.
Returns 1 if the invocant is the last child in the parental
list of children. Otherwise returns 0.
- traverse ($func, ?$postfunc)
- This method accepts two arguments: a mandatory
$func and an optional
$postfunc. If the argument
$func is not defined then an exception is thrown.
If $func or $postfunc is
not in fact a CODE reference then an exception is thrown. The function
$func is then applied recursively to all the
children of the invocant, until either every descendant has been visited or $func returns
'ABORT'. If given, the function
$postfunc will be applied to each child after the
children of the child have been traversed.
Here is an example of a traversal function that will print out
the hierarchy as a tab-indented list.
    $tree->traverse(sub {
        my ($_tree) = @_;
        my $tag = $_tree->getNodeValue();
        print (("\t" x $_tree->getDepth()), $tag, "\n");
        return 'ABORT' if 'foo' eq $tag;
    });
Here is an example of a traversal function that will print out
the hierarchy in an XML-style format.
    $tree->traverse(sub {
        my ($_tree) = @_;
        print ((' ' x $_tree->getDepth()),
            '<', $_tree->getNodeValue(),'>',"\n");
    },
    sub {
        my ($_tree) = @_;
        print ((' ' x $_tree->getDepth()),
            '</', $_tree->getNodeValue(),'>',"\n");
    });
Note that aborting traverse is not recommended when using
$postfunc, because the post-function will not be
called for any nodes after aborting, which might lead to less than
predictable results.
- size
- Returns the total number of nodes in the current tree and all its
sub-trees.
- height
- This method has also been deprecated in favor of the
"getHeight" method above. It remains as
an alias to "getHeight" for backwards
compatibility.
NOTE: This is also no longer a recursive method which
gets its value on demand, but a value stored in the Tree::Simple
object itself, hopefully making it much more efficient and usable.
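The visiting order of "traverse" and the counting rule of "size" can be checked with a sketch like the one below. The node values are illustrative. Note that $func is applied to the children of the invocant, so the invocant itself does not appear in either list, while "size" does count it.

```perl
use strict;
use warnings;
use Tree::Simple;

my $root = Tree::Simple->new('root');
my $left = Tree::Simple->new('left', $root);
Tree::Simple->new('left.1', $left);
Tree::Simple->new('left.2', $left);
Tree::Simple->new('right', $root);

my (@pre, @post);
$root->traverse(
    sub { push @pre,  $_[0]->getNodeValue },   # called before a node's children
    sub { push @post, $_[0]->getNodeValue },   # called after a node's children
);

print "pre:  @pre\n";     # pre:  left left.1 left.2 right
print "post: @post\n";    # post: left.1 left.2 left right

my $size = $root->size;
print "size: $size\n";    # size: 5

$root->DESTROY;
```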
- accept ($visitor)
- It accepts either a Tree::Simple::Visitor object (which includes
classes derived
from Tree::Simple::Visitor), or an object that has the
"visit" method available
(tested with
"$visitor->can('visit')"). If these
qualifications are not met,
an exception will be thrown. We then run the Visitor
"visit" method giving the
current tree as its argument.
I have also created a number of Visitor objects and packaged
them into the Tree::Simple::VisitorFactory.
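Since any object with a "visit" method is accepted, a visitor does not have to inherit from Tree::Simple::Visitor. The sketch below uses a hypothetical My::LeafCollector class (not part of any distribution) that gathers the values of all leaf nodes.

```perl
use strict;
use warnings;
use Tree::Simple;

package My::LeafCollector;    # hypothetical visitor class, for illustration only

sub new { my $class = shift; return bless { leaves => [] }, $class }

# accept() calls visit() with the tree it was invoked on
sub visit {
    my ($self, $tree) = @_;
    $tree->traverse(sub {
        my ($node) = @_;
        push @{ $self->{leaves} }, $node->getNodeValue if $node->isLeaf;
        return;    # never return 'ABORT', so every node is visited
    });
}

sub leaves { return @{ $_[0]{leaves} } }

package main;

my $root = Tree::Simple->new('root');
my $left = Tree::Simple->new('left', $root);
Tree::Simple->new('left.1', $left);
Tree::Simple->new('left.2', $left);
Tree::Simple->new('right', $root);

my $visitor = My::LeafCollector->new;
$root->accept($visitor);

my @leaves = $visitor->leaves;
print join(', ', @leaves), "\n";    # left.1, left.2, right

$root->DESTROY;
```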
Cloning a tree can be an extremely expensive operation for large trees, so we
provide two options for cloning, a deep clone and a shallow clone.
When a Tree::Simple object is cloned, the node is deep-copied in
the following manner. If we find a normal scalar value (non-reference), we
simply copy it. If we find an object, we attempt to call
"clone" on it, otherwise we just copy the
reference (since we assume the object does not want to be cloned). If we
find a SCALAR or REF reference we copy the value contained within it. If we
find a HASH or ARRAY reference we copy the reference and recursively copy
all the elements within it (following these exact guidelines). We also do
our best to assure that circular references are cloned only once and
connections restored correctly. This cloning will not be able to copy CODE,
RegExp and GLOB references, as they are pretty much impossible to clone. We
also do not handle "tied" objects, and
they will simply be copied as plain references, and not
re-"tied".
- clone
- The clone method does a full deep-copy clone of the object, calling
"clone" recursively on all its children.
This does not call "clone" on the parent
tree however. Doing this would result in a slowly degenerating spiral of
recursive death, so it is not recommended and therefore not implemented.
What happens is that the clone of the tree instance that
"clone" is called upon is
detached from the tree and becomes a root node; all of the cloned
children are then attached as children of that tree. I personally think
this is more intuitive; having the cloning crawl back up the
tree is not what I think most people would expect.
- cloneShallow
- This method is an alternate option to the plain
"clone" method. This method allows the
cloning of a single Tree::Simple object while retaining connections
to the rest of the tree/hierarchy.
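The difference between the two cloning methods might be explored as follows. This is a sketch with made-up node values; the hash reference is used as a node value only to show that "clone" copies the node data rather than sharing it. The shallow-clone line is printed without a stated result, as it simply reports whatever parent connection the copy retains.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);
use Tree::Simple;

my $root   = Tree::Simple->new('root');
my $branch = Tree::Simple->new({ name => 'branch' }, $root);
Tree::Simple->new('leaf', $branch);

# deep clone: the copy gets its own node value and its own children
my $deep = $branch->clone;
$deep->getNodeValue->{name} = 'copy';
print $branch->getNodeValue->{name}, "\n";          # branch (original untouched)
print $deep->isRoot ? "root\n" : "attached\n";      # root

my $same_child = refaddr($deep->getChild(0)) == refaddr($branch->getChild(0));
print $same_child ? "shared child\n" : "separate child\n";    # separate child

# shallow clone: only this one object is copied, and it is described
# above as keeping its connections to the rest of the hierarchy
my $shallow = $branch->cloneShallow;
print "shallow parent: ", $shallow->getParent->getNodeValue, "\n";

$_->DESTROY for $deep, $root;
```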
- DESTROY
- To avoid memory leaks through uncleaned-up circular references, we
implement the "DESTROY" method. This
method will attempt to call "DESTROY" on
each of its children (if it has any). This will result in a cascade of
calls to "DESTROY" on down the tree. It
also cleans up its parental relations.
Because of perl's reference counting scheme and how that
interacts with circular references, if you want an object to be properly
reaped you should manually call
"DESTROY". This is especially
necessary if your object has any children. See the section on
"CIRCULAR REFERENCES" for more information.
- fixDepth
- Tree::Simple will manage the depth field for you using this method. You
should never need to call it on your own; however, if you ever did need to,
here it is. Running this method will traverse all the sub-trees of
the invocant, correcting the depth as it goes.
- fixHeight
- Tree::Simple will manage the height field for you using this method. You
should never need to call it on your own; however, if you ever did need to,
here it is. Running this method will correct the heights of the current
tree and all its ancestors' heights too.
- fixWidth
- Tree::Simple will manage the width field for you using this method. You
should never need to call it on your own; however, if you ever did need to,
here it is. Running this method will correct the widths of the current
tree and all its ancestors' widths too.
I would not normally document private methods, but in case you need to subclass
Tree::Simple, here they are.
- _init ($node, $parent, $children)
- This method is here largely to facilitate subclassing. This method is
called by new to initialize the object, where new has the primary
responsibility of creating the instance.
- _setParent ($parent)
- This method sets up the parental relationship. It is for internal use
only.
- _setHeight ($child)
- This method will set the height field based upon the height of the given
$child.
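As one possible use of these hooks, a subclass could override "_init" to add its own argument checks before handing over to the parent class. The My::StringTree class below is hypothetical, and the sketch assumes only what is stated above: that new creates the instance and then calls "_init" with $node, $parent and $children.

```perl
use strict;
use warnings;

package My::StringTree;    # hypothetical subclass, for illustration only
use parent 'Tree::Simple';

# Reject reference node values, then let Tree::Simple do the real set-up.
sub _init {
    my ($self, $node, $parent, $children) = @_;
    die "node value must be a plain string\n" if ref $node;
    return $self->SUPER::_init($node, $parent, $children);
}

package main;

my $root = My::StringTree->new('root');
My::StringTree->new('child', $root);
my $child_class = ref $root->getChild(0);
print "$child_class\n";                      # My::StringTree

my $bad = eval { My::StringTree->new({ not => 'a string' }) };
print "Caught: $@" unless defined $bad;      # Caught: node value must be a plain string

$root->DESTROY;
```

Keeping the check in "_init" rather than in new leaves instance creation to Tree::Simple, which matches the division of responsibility described above.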
I have revised the model by which Tree::Simple deals with circular references.
In the past all circular references had to be manually destroyed by calling
DESTROY. The call to DESTROY would then call DESTROY on all the children, and
therefore cascade down the tree. This however was not always what was needed,
nor what made sense, so I have now revised the model to handle things in what
I feel is a more consistent and sane way.
Circular references are now managed with the simple idea that the
parent makes the decisions for the child. This means that child-to-parent
references are weak, while parent-to-child references are strong. So if a
parent is destroyed it will force all the children to detach from it,
however, if a child is destroyed it will not be detached from the
parent.
By default, you are still required to call DESTROY in order for things to
happen. However, I have now added the option to use weak references, which
alleviates the need for the manual call to DESTROY and allows Tree::Simple to
manage this automatically. This is accomplished with a compile-time setting
like this:
    use Tree::Simple 'use_weak_refs';
From that point on Tree::Simple will use weak references to
allow reference counting to clean things up properly.
For those who are unfamiliar with weak references, and how they
affect the reference counts, here is a simple illustration. First is the
normal model that Tree::Simple uses:
    +---------------+
    | Tree::Simple1 |<---------------------+
    +---------------+                      |
    |        parent |                      |
    |      children |-+                    |
    +---------------+ |                    |
                      |                    |
                      |  +---------------+ |
                      +->| Tree::Simple2 | |
                         +---------------+ |
                         |        parent |-+
                         |      children |
                         +---------------+
Here, Tree::Simple1 has a reference count of 2 (one for the
original variable it is assigned to, and one for the parent reference in
Tree::Simple2), and Tree::Simple2 has a reference count of 1 (for the child
reference in Tree::Simple1).
Now, with weak references:
    +---------------+
    | Tree::Simple1 |.......................
    +---------------+                      :
    |        parent |                      :
    |      children |-+                    : <--[ weak reference ]
    +---------------+ |                    :
                      |                    :
                      |  +---------------+ :
                      +->| Tree::Simple2 | :
                         +---------------+ :
                         |        parent |..
                         |      children |
                         +---------------+
Now Tree::Simple1 has a reference count of 1 (for the variable it
is assigned to) and 1 weakened reference (for the parent reference in
Tree::Simple2). And Tree::Simple2 has a reference count of 1, just as
before.
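The effect of the weak child-to-parent link can be reproduced without Tree::Simple at all. The sketch below uses plain hashes as stand-ins for the two objects in the diagrams and the core Scalar::Util module; it illustrates the reference-counting idea and is not the module's internal code.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $watch;       # weak observer, used only to see whether the parent is freed
my $is_weak;

{
    # stand-ins for Tree::Simple1 and Tree::Simple2 in the diagrams
    my $parent = { name => 'Tree::Simple1', children => [] };
    my $child  = { name => 'Tree::Simple2', parent => $parent, children => [] };

    push @{ $parent->{children} }, $child;    # strong parent-to-child reference
    weaken($child->{parent});                 # weak child-to-parent reference

    $is_weak = isweak($child->{parent});
    print $is_weak ? "weak\n" : "strong\n";   # weak

    $watch = $parent;
    weaken($watch);
}

# The only strong reference to the parent was the lexical variable, so
# leaving the block frees the parent, and with it the child.
print defined $watch ? "leaked\n" : "reaped\n";    # reaped
```

Without the weaken call on the parent link, the two hashes would keep each other alive and the last line would print "leaked".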
There are no bugs that I am aware of. The code is pretty thoroughly tested (see "CODE
COVERAGE" below) and is based on a (non-publicly released) module which
I had used in production systems for about 3 years without incident. Of
course, if you find a bug, let me know, and I will be sure to fix it.
I use Devel::Cover to test the code coverage of my tests. Below is the
Devel::Cover report on the test suite.
    ---------------------------- ------ ------ ------ ------ ------ ------ ------
    File                           stmt branch   cond    sub    pod   time  total
    ---------------------------- ------ ------ ------ ------ ------ ------ ------
    Tree/Simple.pm                 99.6   96.0   92.3  100.0   97.0   95.5   98.0
    Tree/Simple/Visitor.pm        100.0   96.2   88.2  100.0  100.0    4.5   97.7
    ---------------------------- ------ ------ ------ ------ ------ ------ ------
    Total                          99.7   96.1   91.1  100.0   97.6  100.0   97.9
    ---------------------------- ------ ------ ------ ------ ------ ------ ------
I have written a number of other modules which use or augment this module. They
are described below and are available on CPAN.
- Tree::Parser - A module for parsing formatted files into Tree::Simple
hierarchies
- Tree::Simple::View - For viewing Tree::Simple hierarchies in various
output formats
- Tree::Simple::VisitorFactory - Useful Visitor objects for Tree::Simple
objects
- Tree::Binary - If you are looking for a binary tree, check this one
out
Also, the author of Data::TreeDumper and I have worked together to
make sure that Tree::Simple and his module work well together. If you
need a quick and handy way to dump out a Tree::Simple hierarchy, this module
does an excellent job (and plenty more as well).
I have also recently stumbled upon some packaged distributions of
Tree::Simple for the various Unix flavors. Here are some links:
- FreeBSD Port -
<http://www.freshports.org/devel/p5-Tree-Simple/>
- Debian Package -
<http://packages.debian.org/unstable/perl/libtree-simple-perl>
- Linux RPM - <http://rpmpan.sourceforge.net/Tree.html>
There are a few other Tree modules out there; here is a quick comparison between
Tree::Simple and them. Obviously I am biased, so take what I say with a
grain of salt, and keep in mind that I wrote Tree::Simple because I could
not find a Tree module that suited my needs. If Tree::Simple does not
fit your needs, I recommend looking at these modules. Please note that I am
only listing Tree::* modules I am familiar with here; if you think I have
missed a module, please let me know. I have also seen a few tree-ish modules
outside of the Tree::* namespace, but most of them are part of another
distribution (HTML::Tree, Pod::Tree, etc) and are likely
specialized in purpose.
- Tree::DAG_Node
- This module seems pretty stable and very robust with a lot of
functionality. But it only comes with 1 sophisticated test,
t/cut.and.paste.subtrees.t. While I am sure the author tested his code, I
would feel better if I was able to see that. The module is approx. 3000
lines with POD, and 1,500 without the POD. The sheer depth and detail of
the documentation and the ratio of code to documentation is impressive,
and not to be taken lightly. But given that it is a well known fact that
the likelihood of bugs increases alongside the size of the code, I do not
feel comfortable with large modules like this which have so few tests.
All this said, I am not a huge fan of the API either, I prefer
the gender neutral approach in Tree::Simple to the
mother/daughter style of Tree::DAG_Node. I also feel very
strongly that Tree::DAG_Node is trying to do much more than makes
sense in a single module, and is offering too many ways to do the same
or similar things.
However, of all the Tree::* modules out there,
Tree::DAG_Node seems to be one of the favorites, so it may be
worth investigating.
- Tree::MultiNode
- I am not very familiar with this module, however, I have heard some good
reviews of it, so I thought it deserved mention here. I believe it is
based upon C++ code found in the book Algorithms in C++ by Robert
Sedgewick. It uses a number of interesting ideas, such as a ::Handle object
to traverse the tree with (similar to Visitors, but it also seems to be
kind of like a cursor). However, like Tree::DAG_Node, it is
somewhat lacking in tests and has only 6 tests in its suite. It also has
one glaring bug, which is that there is currently no way to remove a child
node.
- Tree::Nary
- It is a (somewhat) direct translation of the N-ary tree from the GLIB
library, and the API is based on that. GLIB is a C library, which means
this is a very C-ish API. That does not appeal to me, it might to you, to
each their own.
This module is similar in intent to Tree::Simple. It
implements a tree with n branches and has polymorphic node
containers. It implements many of the same methods as
Tree::Simple and a few others on top of that, but being based on
a C library, is not very OO. In most of the method calls the
$self argument is not used and the second
argument $node is. Tree::Simple is a much
more OO module than Tree::Nary, so while they are similar in
functionality they greatly differ in implementation style.
- Tree
- This module is pretty old, it has not been updated since Oct. 31, 1999 and
is still on version 0.01. It also seems to be (from the limited
documentation) a binary and a balanced binary tree, Tree::Simple is
an n-ary tree, and makes no attempt to balance anything.
- Tree::Ternary
- This module is older than Tree, last update was Sept. 24th, 1999.
It seems to be a special purpose tree, for storing and accessing strings,
not general purpose like Tree::Simple.
- Tree::Ternary_XS
- This module is an XS implementation of the above tree type.
- Tree::Trie
- This too is a specialized tree type. It sounds similar to
Tree::Ternary, but it is much newer (latest release in 2003). It seems
specialized for the lookup and retrieval of information like a hash.
- Tree::M
- Is a wrapper for a C++ library, whereas Tree::Simple is pure-perl.
It also seems to be a more specialized implementation of a tree, therefore
not really the same as Tree::Simple.
- Tree::Fat
- Is a wrapper around a C library, again Tree::Simple is pure-perl.
The author describes FAT-trees as a combination of a Tree and an array. It
looks like a pretty mean and lean module, and good if you need speed and
are implementing a custom data-store of some kind. The author points out
too that the module is designed for embedding and there is no default
embedding, so you cannot really use it "out of the box".
- Thanks to Nadim Ibn Hamouda El Khemir for making Data::TreeDumper work
with Tree::Simple.
- Thanks to Brett Nuske for his idea for the "getUID" and
"setUID" methods.
- Thanks to whoever submitted the memory leak bug to RT (#7512).
- Thanks to Mark Thomas for his insight into how to best handle the
height and width properties without unnecessary
recursion.
- Thanks to Mark Lawrence for the &traverse post-func patch, tests and
docs.
Stevan Little, <stevan@iinteractive.com>
Rob Kinyon, <rob@iinteractive.com>
Ron Savage <ron@savage.net.au> has taken over maintenance as
of V 1.19.
<https://github.com/ronsavage/Tree-Simple>.
Copyright 2004-2006 by Infinity Interactive, Inc.
<http://www.iinteractive.com>
This library is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.