NAME
GraphViz::Data::Structure - Visualise data structures

SYNOPSIS
    use GraphViz::Data::Structure;
    my $gvds = GraphViz::Data::Structure->new($data_structure);
    print $gvds->graph()->as_png;

DESCRIPTION
This module makes it easy to visualise data structures, even recursive or circular ones.

It is provided as an alternative to GraphViz::Data::Grapher. Differences:
REPRESENTING DATA STRUCTURES AS GRAPHS
"GraphViz::Data::Structure" tries to draw data structure diagrams with a minimum of complexity and a maximum of elegance. To this end, the following design choices were made:
ALGORITHM
The algorithm is a standard recursive depth-first treewalk: we determine how the current node should be added to the current graph, add it, and then call ourselves recursively to determine how all nodes below this one should be visualized.

Edges are added after the subnodes are added to the graph.

Items "within" the current subnode (array and hash elements which are not references) are rendered inside a cell in the aggregate corresponding to their position. References are represented by an edge linking the appropriate position in the aggregate to the appropriate subnode.

This code does its data-structure unwrapping in a manner very similar to that used by "dumpvar.pl", the code used by the debugger to display data structures as text. The initial structure treewalk was written in isolation; the "dumpvar.pl" code was integrated only after it was recognized that there was more to life than hashes, arrays, and scalars.

The "dumpvar.pl" code to decode globs and code references was used almost as-is. Code was added to attempt to spot references to array or hash elements, but this code still does not work as desired. Array and hash element references still appear to be scalars to the current algorithm.

GLOBAL SETTINGS
"GraphViz::Data::Structure::Debug"
Set this to a true value to turn on some debugging messages output to STDERR. Defaults to false, and should probably be left that way unless you're reworking init().

    # Turn on GraphViz::Data::Structure debugging.
    $GraphViz::Data::Structure::Debug = 1;

CLASS METHODS
"new()"
This is the constructor. It takes one mandatory argument, which is the data structure to be visualised. It returns a "GraphViz::Data::Structure" object, the name of the top node, and a list defining the 'to' port for this top node (an empty list if there is no 'to' port).

    # Graph a data structure, creating a GraphViz object.
    # The new GraphViz::Data::Structure object, the name of
    # the top node in the structure, and the "in" port are returned.
    my ($gvds, $top_name, @port) = GraphViz::Data::Structure->new($structure);
    # Passing a filename makes as_png() write the image to that file.
    $gvds->graph()->as_png("my.png");

If you so desire, you can use the returned information to join other graphs up to the top of the graph contained in this object by calling "graph()" to extract the "GraphViz" object and calling other "GraphViz" primitives on that object. Most of the time you'll only care about the "GraphViz::Data::Structure" object and not the additional info.

Optional parameters
You can specify any, none, or all of the following optional keyword parameters:

"add()"
"add()", called as a class method, simply calls "new()", supporting all of the "new()" parameters as usual.

    # Create a graph (replicates the new() call). Parameters default.
    my ($gvds, $top_name, @ports) = GraphViz::Data::Structure->add($structure);

INSTANCE METHODS
"graph()"
"graph()" returns a "GraphViz" object, loaded with the nodes and edges corresponding to any data structure passed in via "new()" and/or "add()". You can make any of the standard "GraphViz" calls to this object.

Methods include "as_ps", "as_hpgl", "as_pcl", "as_mif", "as_pic", "as_gd", "as_gd2", "as_gif", "as_jpeg", "as_png", "as_wbmp", "as_ismap", "as_imap", "as_vrml", "as_vtx", "as_mp", "as_fig", "as_svg". See the "GraphViz" documentation for more information. The most common methods are:

    # Print out a PNG-format file
    print $gvds->graph->as_png();

    # Print out a PostScript-format file
    print $gvds->graph->as_ps();

    # Print out a dot file, in "canonical" form:
    print $gvds->graph->as_canon();

"was_null()"
"was_null()" checks to ensure that your data structure didn't generate a graph that was too complex for "dot" to handle. Directly self-referential structures (e.g., "@a = (1,\@a,3)") seem to be the only offenders in this area; if your structure isn't directly self-referential -- by far the most likely situation -- you won't need to use "was_null()" at all.

"was_null()" forces a "dot" run to get the "canonical" form of the graph back, which can be computationally expensive; avoid it if possible.

"add()"
"add()", called as an instance method, simply adds new nodes and edges (corresponding to a new data structure) to an existing "GraphViz::Data::Structure" object.

You can specify the "Fuzz", "Label", and "Depth" arguments, just as you would for "new()". You cannot specify "GraphViz", "Orientation", or any of the "GraphViz" parameters that are used to create a "GraphViz" object; "add()" uses the pre-existing "GraphViz" object in the "GraphViz::Data::Structure" object to add new nodes.
    # Create a graph (replicates the new() call).
    my ($gvds, $top_name, @ports) = GraphViz::Data::Structure->add($structure);

    # Add a second structure; nodes will be merged as necessary.
    # The variables already exist, so they are not redeclared with "my".
    ($gvds, $top_name, @ports) = $gvds->add($other_structure);

DOT INPUT - LAYOUT DETAILS
Port strings and "shape=record" nodes are the key to visualizing the data structures in a readable way. The examples in the "dot" documentation are some help, but a certain amount of experimentation was needed to determine exactly how the port strings needed to be set up so that the desired layout was achieved.

Port strings do two things: they determine where edges come in and where they go out, and they allow you to position items relative to one another inside a node. This conflation of function makes creating port strings that Do What You Want a little more difficult.

A little study of port strings seems to indicate that just alternating items will cause them to be laid out horizontally, while putting them in braces and alternating seems to yield a vertical layout:

    # Horizontal port string for 1 2 3:
    $ports = "1|2|3";

    # Vertical port string for 1 2 3:
    $ports = "{1}|{2}|{3}";

This works fine for very simple sets of boxes in a line (which, from studying the examples, seems to be the principal thing that the original "GraphViz" implementors used). Anything more complicated (such as getting paired sets of boxes to all line up smartly) takes a bit of extra work.

SCALARS
Scalars are represented either by plaintext nodes (for non-reference values) or record nodes (for references); they don't need ports, because we'll be linking at most one edge out, and there's only one "thingy" to link to in a scalar. However, we do have to deal with blessed scalars as well, which need to have both their class name and value in the node, but need to look different than arrays.

If a scalar's value is a reference, we add a record-style node and link it to the value.
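The plaintext-versus-record choice for unblessed scalars can be sketched as follows. This is only an illustration, not the module's own code: the helper name "scalar_node_attrs", the "needs_edge" flag, and the "undef" label are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of GraphViz::Data::Structure):
# choose node attributes for the value held by an unblessed scalar.
sub scalar_node_attrs {
    my ($value) = @_;

    # A reference gets a record-style node; the caller would then
    # add an edge from this node to the node for the referent.
    return (shape => 'record', needs_edge => 1) if ref $value;

    # A plain value gets a plaintext node labelled with the value.
    # Showing undef as the text "undef" is an assumption of this sketch.
    my $label = defined $value ? "$value" : 'undef';
    return (shape => 'plaintext', needs_edge => 0, label => $label);
}

my %plain = scalar_node_attrs(42);
my %ref   = scalar_node_attrs(\"some string");
print "$plain{shape} $ref{shape}\n";
```

The sketch deliberately leaves the blessed case out.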
If the scalar is blessed, we put the class name and the scalar's value both in the same node by constructing a multi-line string with the class name on top, tagged appropriately, and the value on the bottom.

ARRAYS
Arrays have to be handled four different ways:
Unblessed arrays should (ideally) simply be rows of boxes, with either values or edges in each box. We can set up port strings for this fairly easily:

    # Array assumed to contain (1,\$x,"s").
    $ports = "<port1>1|<port2>|<port3>s";

This gives a nice row of boxes, with all the cells lined up nicely in either horizontal or vertical orientations. We don't need extra fiddling with the port string to get them to look right.

Things become a bit more complex for blessed arrays, though, because we want to include the class name as well in the record. We want to make sure that the class name itself isn't confused with any of the data items, so it needs to be off in a box by itself, parallel to the boxes defining the array. This means laying out a box the length of the whole array above the boxes defining the array in a horizontal layout, and a box the height of the whole array to the left of the boxes defining the array in a vertical layout. Fortunately (again), the same basic port string works in both orientations.

    # Object is an array blessed into class "Foo", containing (1,\$x,"s").
    # Horizontal:
    $ports = "{<port0>Foo|{{<port1>1}|{<port2>}|{<port3>s}}}";

Note that we use the alternating braced items to get the array to lay out at 90 degrees from the box containing the class name. This particular string was arrived at after a fair amount of twiddling in "dotty" and seems to be the simplest port layout that works.

Empty arrays, if they're unblessed, are just shown as a "[]" plaintext node. If they're blessed, we set up a record that looks sort of like a two-element array, but contains the classname, notes that it's an array, and shows that it's empty explicitly.

HASHES
Hashes are similar to arrays, with the twist that we need to have two parallel sets of boxes which correspond to the keys and values. In addition, we have the same four cases we did for arrays:
Unblessed hashes should (ideally) simply be pairs of rows of boxes - one for key, one for value - with either values or edges in each "value" box. Setting up port strings for this is a bit more difficult.

    # Hash assumed to contain (A=>1,B=>\$x,C=>"s").
    # Horizontal:
    $hports = "{<port1>A|<port2>1}|{<port3>B|<port4>}|{<port5>C|<port6>s}";

    # Vertical:
    $vports = "{<port1>A|<port3>B|<port5>C}|{<port2>1|<port4>|<port6>s}";

Switching from horizontal to vertical requires us to separate the keys from the values.

Adding a class name presents some problems. "dot" is not absolutely symmetric when it comes to parsing complex port strings; in some cases, it carefully lines up all the edges of boxes internal to a record; other times it doesn't. Rather than continue to try to kludge around this, it seemed the better part of valor to simply accept what it would do prettily and ignore the rest.

In laying out blessed hashes (and following our self-imposed standard), we can either have
Anything else both significantly increases the complexity of the interface ("let's see, arrays should be horizontal, and hashes should be horizontal with names on top, so I code ... uh ...") and, well, doesn't work very well. So we stick with these two basic layouts and keep it pretty and simple.

    # Object is a hash blessed into class "Foo".
    # Hash assumed to contain (A=>1,B=>\$x,C=>"s").
    # Horizontal, name on top:
    $hports = "{<port0>Foo|{{<port1>A|<port2>1}|{<port3>B|<port4>}|{<port5>C|<port6>s}}}";

    # Vertical, name on left:
    $vports = "<port0>Foo|{<port1>A|<port3>B|<port5>C}|{<port2>1|<port4>|<port6>s}";

Note that we also have to change how we add the braces to the keys and values when switching where the name is, in addition to separating or associating the keys and values as needed.

The good thing is that once this is all worked out, no one else has to care anymore. It just works and looks nice.

GLOBS
Globs, from the layout point of view, look pretty much like blessed hashes. The only exception is that if there's nothing in the glob, we want to display it just as a plaintext node.

In the interest of coding as little as possible, we just reuse the hash code. We construct a tiny pair of wrapper methods which add the necessary information to the parameter list and then call the common module.

CODE references
"CODE" references are the simplest. We just say that they're code, and add on the class name if they're blessed. The mainline code's done all the nasty work of actually figuring out the code ref's name, so we don't have to worry any further.

HANDING TEXT TO DOT
"dot" is a C program and therefore can get extremely upset (as in segfault upset) about text that is too long. In addition, it will become very testy if the text contains characters which it considers significant in constructing labels and the like.

It is necessary to clean up and shorten any text that "dot" will be expected to put into a node. The "_dot_escape" method is used to do this.
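As a rough illustration of the kind of cleanup described above, here is a minimal sketch. It is not the module's "_dot_escape" method: the function name "dot_escape_sketch", the default limit of 40 characters, the "..." truncation marker, and the exact set of escaped characters are all assumptions chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the cleanup step; NOT the real _dot_escape.
# $max_len is an assumed length limit with an arbitrary default of 40.
sub dot_escape_sketch {
    my ($text, $max_len) = @_;
    $max_len = 40 unless defined $max_len;

    # Shorten first, so the limit applies to the original characters
    # rather than to the escaped form.
    if (length($text) > $max_len) {
        $text = substr($text, 0, $max_len) . '...';
    }

    # Backslash-escape characters that are significant in record labels
    # or in double-quoted dot strings. The backslash is in the class too,
    # so existing backslashes are doubled in the same pass.
    $text =~ s/([\\{}|<>"])/\\$1/g;

    # Turn real newlines into the two-character sequence backslash-n.
    # This runs after the escape pass so the new backslash is not doubled.
    $text =~ s/\n/\\n/g;

    return $text;
}

print dot_escape_sketch('a|b {c} <d>'), "\n";
print dot_escape_sketch('x' x 100, 10), "\n";
```

Truncating before escaping keeps the limit predictable, at the cost that the escaped text handed to "dot" can be somewhat longer than the limit.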
Note that the limit on strings is actually not very large; setting a really big "Fuzz" will probably make "dot" segfault when it tries to draw your graph.

BUGS
Cannot catch pointers to individual array or hash elements yet and display the containing items, even though it tries.

BUGS EXPOSED IN DOT
Data structures which point directly to themselves will cause "dot" to discard all input in some cases. There's currently no fix for this; you can call the "was_null()" method for now, which will tell you the graph was null and let you decide what to do.

It isn't possible (in current releases of "dot") to code a record label which contains no text (e.g.: "{<port1>}"); this generates a zero-width box. This has been worked around by placing a single period in places where nothing at all would have been preferable. The "graphviz" developers have developed a patch for "dot" that corrects the problem, but it is not yet in a released version, though it is in CVS.

OTHER DOT CONSIDERATIONS
The "record" type is officially deprecated, and it would probably be a good idea to convert the labels to HTML format. The current implementation has been updated to work with "dot 2.40.1"; there's no guarantee that future versions won't break the "record" type again.

AUTHOR
Joe McMahon <mcmahon@ibiblio.org>

COPYRIGHT
Copyright (C) 2001-2002, Joe McMahon

This module is free software; you can redistribute it or modify it under the same terms as Perl itself.