Graph Visualization Library v1.20

David Duke

This page describes a library of filters for visualizing graphs, built as an extension to VTK. It builds on two earlier attempts to extend VTK with support for more general information visualization, a data spaces model, and an earlier graph library.

The library and "readme" file are available for download through links at the end of the page.

Recent Changes

Introduction

Graphs are a general type of data organization that arise in a range of application areas, including various branches of bioinformatics, file and web site management, transport and communication systems, and finite-state simulation.

There are a number of bespoke tools that can be used to visualize graphs; this library however is designed as an extension to an existing visualization system, VTK. Advantages of using a library rather than a stand-along application include:

Examples produced using this library are shown below, beginning with simple images of trees and a graph. Click on any image to bring up a larger (600x600) version.

Basic Layouts

Reingold-Tilford layout Radial layout Two pictures showing 2D tree layout of the same dataset, the tree structure of a file system. The tree contains 47600 nodes. Edge colouring is based on the Strahler metric, and nodes are shown by white glyphs. On the left, Reingold-Tilford layout has been used; on the right, a radial layout.
Cone-tree layout 3D Graph layout These pictures show the use of 3D layout; on the left, the tree from the previous two images is laid out using a cone-tree organization, and on the right, a general graph (of a finite-state protocol simulation) is shown. The graph layout uses the cone tree approach to draw an underlying spanning tree, after which non-tree edges are added. This particular algorithm was first developed by D. Auber from LaBRI, Universite Bordeaux, and published in his thesis.

Larger datasets

A recent application of the library has been to visualize the results of data mining on web logs. This application is being developed with Amir Youssefi from the Department of Computer Science at Rensselaer Polytechnic Insitute, New York. Some initial images from this work are shown below.

Data mining: 400K nodes, 600K edges m_2 These images have been obtained by visualizing output from a data mining system working on an extract from the access log of a website. The image on the right is a zoom-in on detail just below and to the right-of-center of the left. Colour here reflects graph Strahler values, and the yellow lines on the right correspond to connections between areas that are structurally more complicated in the graph.
Data mining: 400K nodes, 600K edges m_2 The image on the left, and the enlargement of the upper-right area shown on the right, show "bands" of edges running between parts of the graph. A tree layout has been applied to a spanning graph extracted from the data. Bands like these are places where the layout algorithm has had to locate closely-related subtrees at different parts of the overall layout because of back-edges in the graph.

Further research is required here; edge groups may point to places where the logical structure of a site does not match the way that users are navigating within a site. The graphs used above are comparatively small, with 10K nodes and nearly 12K edges. As graphs become larger, and in particular if the edges become denser, drawing graphs as nodes and individual edges can result in images in which structure is lost within detail. We are working on representations and methods for visualizing much larger graphs; some initial results on a graph with around 400K nodes and 620K edges are shown below.

Data mining: 400K nodes, 600K edges close-up of small data mining result This dataset has been visualized using techniques that combine graph visualization (layout of the nodes) with tools for flow and surface visualization. Spatial distribution of the nodes is shown via an isosurface, while edges are shown by elongated hedgehogs that convey a sense of the edge direction without completely blocking the view of the overall data distribution. In the close-up on the right, bundles of edges can again be seen (appearing as blue rectangles).

Again, more work is still needed on interaction techniques and filters to extract such structure, and empirical evaluation is needed to support arguments about the utility of these methods.

Interaction

A number of filters are provided for generating subgraphs. One novel use of subgraphs is to generate a simplified representation of the overall graph that can be used during interaction. Rather than redraw a 50K node graph each frame, it is more efficient to redraw a much smaller graph, but one that still provides an impression of the space occupied by the full dataset.

Static representation Representation during interaction On the left, the 46K node tree is shown as it is statically. During interaction, the subtree shown on the right is drawn, using polygons to approximate clusters of edges. The subtree is extracted from the dataset using the Strahler metric; a filter included in the library produces the fan-like drawing of the trees.

Downloads

The class library is available for download, along with Tcl examples and sample data. Two versions of the library are available; version 1.1 is built with VTK4.3, while version 1.2 uses VTK4.4, at least it matches the announced conversion of the API to use doubles. A number of filters are defined in 1.2 that do not exist in 1.1, if someone really needs one of these filters and can not update their VTK installation to 4.4, please contact me.

For more information please contact me; see my home page for contact information. Finally, if you use this library, please let me know what are you doing with it, and any suggestions for improvements.


Home page.