
The gViz Project

Research Overview

Grid-enabling of Existing Visualization Systems




In this phase of the project we selected two representative visualization systems, and explored how they could evolve to exploit the new technologies emerging in Grid computing and in Web services.  The first system is IRIS Explorer, developed by NAG Ltd, a partner in the project.  IRIS Explorer is representative of a class of dataflow visualization systems; the challenge in this study was to allow the dataflow to extend in a secure manner across Grid computing resources.  The second system is pV3, developed at MIT but used industrially within the UK by institutions such as Rolls-Royce.  The challenge here was to incorporate Web services as a replacement for message passing between the desktop visualization and the remote data source.

IRIS Explorer and Grid-enabled Dataflow Visualization

Figure 1 illustrates the concept of dataflow visualization as a pipeline of data input, data selection, data transformation to geometry, and geometry rendering.  IRIS Explorer follows this model, providing a set of pre-defined modules that a user can connect together into a network, or map.  The system is open in the sense that a user can encapsulate their own code as a module, and this allows for example simulation code to be included as part of the pipeline (again illustrated in Figure 1).
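The pipeline idea can be sketched in a few lines of Python. This is only an illustrative sketch and not IRIS Explorer code: the function names (read_data, select, to_geometry, render, run_map) and the sample values are assumptions made up for the example. Each function stands in for a module, and the "map" is simply the ordered list of modules through which the data flows.

```python
from typing import Callable, List, Optional

# A "module" is modelled here as a function from input data to output data.
Module = Callable[[object], object]


def read_data(_unused: object) -> List[float]:
    """Data input: a stand-in for a reader or a simulation module."""
    return [0.2, 1.5, 3.1, 0.7, 2.4]


def select(values: List[float]) -> List[float]:
    """Data selection: keep only the samples at or above a threshold."""
    return [v for v in values if v >= 1.0]


def to_geometry(values: List[float]) -> List[tuple]:
    """Transformation to geometry: turn each value into an (x, y) point."""
    return [(i, v) for i, v in enumerate(values)]


def render(points: List[tuple]) -> str:
    """Rendering: a text 'image' listing the points."""
    return " ".join(f"({x},{y:.1f})" for x, y in points)


def run_map(modules: List[Module], source: Optional[object] = None) -> object:
    """Push data through the modules in the order they are connected."""
    data = source
    for module in modules:
        data = module(data)
    return data


image = run_map([read_data, select, to_geometry, render])
print(image)
```

Because every stage has the same shape, a user-supplied stage (such as a simulation) can be dropped into the list in place of read_data without changing the rest of the map.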

The conventional use of IRIS Explorer has all modules executing on the workstation or PC at the user’s desk.  For large simulations, however, it becomes extremely important to execute the simulation module on a remote server.  The early design of IRIS Explorer allowed for distributed working of this nature, but was developed before the era of Grid computing with its emphasis on security, authorisation, authentication and resource discovery. 

We have developed a Grid-enabled version of IRIS Explorer that uses Grid middleware to run securely in a distributed processing environment.  The primary instance of IRIS Explorer runs as before on the desktop, providing a library of modules and a workspace in which these modules can be connected into a dataflow pipeline.  However, the e-scientist can now also launch multiple instances of IRIS Explorer on remote Grid resources.  Authentication for these remote instances is handled either by the ssh utility or by the Globus Toolkit (version 2 or 3).  Each remote instance provides a further library of modules, displayed on the desktop alongside the local library, from which the e-scientist can select modules for inclusion in the workspace as part of the processing pipeline.  These modules form part of the single visualization application, but execute remotely.  Figure 2 illustrates an environmental crisis demonstrator in which the dispersion of a toxic pollutant is simulated, with its progress steered by a wind arrow on the desktop.  The simulation module executes remotely (indicated by the green bar at the foot of the module panel) to exploit the most powerful Grid compute resource available in the crisis.

Figure 1 - The Dataflow Pipeline

Figure 2 - Grid-enabled IRIS Explorer

As an additional strand to this work, an outreach activity has been led by the industrial partner NAG Ltd.  A number of oral presentations on the project were given at a variety of venues ranging from academic conferences to commercial site visits.  As part of their contribution to the project, NAG donated IRIS Explorer licences to the UK e-Science activity, for the lifetime of the gViz project.  This allowed IRIS Explorer to be used in any project within this activity (including those based at the national centre, in the regional centres and on systems like the White Rose Grid).  In addition, NAG offered technical support via telephone and email to e-Science users of IRIS Explorer.  Around a dozen projects made use of this offer.

pV3 and Web Services for Remote Visualization


In contrast to the dataflow model used within IRIS Explorer, pV3 uses a data-parallel model in which the application data is distributed amongst the nodes of a distributed-memory parallel system, such as a PC cluster, which has become a very popular choice for cost-effective parallel computation.  On each node, pV3 computes the corresponding pieces of the isosurfaces, streamlines, etc. that form the visualization.  These "extracts" are then sent via a "concentrator" on a front-end node to the remote desktop, where they are rendered to form the final image.  Interactivity and computational steering are achieved by sending messages from the desktop to the cluster to determine how the extracts are generated and, if appropriate, to alter parameters of the running computation.
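The data-parallel pattern described above can be sketched as follows. This is a toy illustration under stated assumptions, not the pV3 API: the names partition, node_extract, concentrator and visualize are hypothetical, the "field" is a one-dimensional list rather than a CFD mesh, and the nodes are simulated by a loop in a single process. Each node finds where its own piece of the field crosses an isovalue (its extract), the concentrator merges the extracts, and a change of isovalue plays the role of a steering message from the desktop.

```python
import math
from typing import List, Tuple


def partition(field: List[float], n_nodes: int) -> List[Tuple[int, List[float]]]:
    """Split the field into contiguous chunks, one per node.

    Each chunk carries one extra sample of overlap so that a crossing
    on a chunk boundary is seen by exactly one node.
    """
    size = math.ceil(len(field) / n_nodes)
    return [(start, field[start:start + size + 1])
            for start in range(0, len(field), size)]


def node_extract(start: int, chunk: List[float], isovalue: float) -> List[float]:
    """Work done on one node: locate isovalue crossings in the local chunk.

    Samples lying exactly on the isovalue are ignored in this sketch.
    """
    crossings = []
    for i in range(len(chunk) - 1):
        a, b = chunk[i], chunk[i + 1]
        if (a - isovalue) * (b - isovalue) < 0:
            # Linear interpolation, expressed in global coordinates.
            crossings.append(start + i + (isovalue - a) / (b - a))
    return crossings


def concentrator(extracts: List[List[float]]) -> List[float]:
    """Front-end node: merge per-node extracts for the desktop."""
    return sorted(x for extract in extracts for x in extract)


def visualize(field: List[float], n_nodes: int, isovalue: float) -> List[float]:
    """One round trip: nodes compute extracts, the concentrator merges them."""
    extracts = [node_extract(start, chunk, isovalue)
                for start, chunk in partition(field, n_nodes)]
    return concentrator(extracts)


field = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0]
first = visualize(field, n_nodes=2, isovalue=1.5)
steered = visualize(field, n_nodes=2, isovalue=0.5)  # "steering": new isovalue
print(first, steered)
```

Only the extracts travel to the desktop, not the field itself, which is what keeps the traffic between cluster and desktop small relative to the size of the data.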


The original pV3 implementation used PVM (Parallel Virtual Machine) message-passing for communication between the cluster front-end node and the desktop machine.  However, this places stringent restrictions on its use, essentially requiring all processes to operate under the same userid.  This is inappropriate for collaborations between different organisations, whether academic or industrial.  The aim of the project was to replace the PVM communication with SOAP messaging, which has the added benefit of coping easily with firewalls.  Since pV3 is written mainly in Fortran and C, we chose gSOAP, a C/C++ Web services toolkit, which proved very easy to use and has a number of advanced features that improved performance.


Our implementation used gSOAP messaging to emulate the PVM messaging used within pV3.  With efficient multithreading, and the "keepalive" and gzip compression options in gSOAP, the performance is comparable to the original PVM implementation.
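To make the idea of emulating tagged message passing over a request/response transport concrete, here is a minimal sketch in Python. It does not use gSOAP or PVM and is not the project's code: the Endpoint class, the send function and the JSON message layout are assumptions invented for the illustration, and a direct method call stands in for the HTTP request that a Web service would receive. Only the gzip compression of each message body corresponds to an option named in the text.

```python
import gzip
import json
from typing import Any, List, Optional, Tuple


class Endpoint:
    """Hypothetical mailbox that receives tagged messages as compressed requests."""

    def __init__(self) -> None:
        self.inbox: List[Tuple[int, Any]] = []

    def handle_request(self, body: bytes) -> bytes:
        """Stand-in for a Web service operation: unpack, queue, acknowledge."""
        message = json.loads(gzip.decompress(body))
        self.inbox.append((message["tag"], message["data"]))
        return gzip.compress(json.dumps({"ok": True}).encode("utf-8"))

    def recv(self, tag: int) -> Optional[Any]:
        """Return the oldest queued message with this tag, or None if there is none."""
        for i, (queued_tag, data) in enumerate(self.inbox):
            if queued_tag == tag:
                del self.inbox[i]
                return data
        return None


def send(endpoint: Endpoint, tag: int, data: Any) -> bool:
    """Pack and compress a tagged message, then deliver it as one request."""
    body = gzip.compress(json.dumps({"tag": tag, "data": data}).encode("utf-8"))
    reply = endpoint.handle_request(body)  # would be an HTTP POST in practice
    return json.loads(gzip.decompress(reply))["ok"]


cluster = Endpoint()
send(cluster, tag=1, data={"isovalue": 0.5})  # a steering message
send(cluster, tag=2, data=[1.5, 4.5])         # some other payload
steer = cluster.recv(1)
print(steer)
```

Keeping the send and receive calls tag-based means the code on either side of the link can stay close to its message-passing form, while the transport underneath becomes ordinary request/response traffic.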

Critique - Benefits and Limitations

There are tangible benefits from this initial phase of work:

The enhancements to IRIS Explorer will form the key advance in the next release of the system from NAG (version 5.4, due for release in early 2005). Existing users of the system will thus be able to migrate with minimal disruption to a system that exploits Grid technologies.

pV3 is being used at Rolls-Royce to view data from the Hydra CFD code (originally developed at Oxford University and now the main production CFD code for the whole corporation) generated on Intel/Linux PC clusters supplied by Streamline Computing.

However, there are limitations:

IRIS Explorer is only one of many dataflow visualization systems – can we find an abstraction for dataflow visualization that will be independent of any particular system?

The binding to hardware resources is explicit in the IRIS Explorer map – in the spirit of Grid computing, can we separate the design of the visualization from the specific resources used to execute the visualization?

The data formats of IRIS Explorer and pV3 are system-specific – can we define an approach to facilitate the interchange of data between systems?

It is convenient to include the simulation as an IRIS Explorer module, but the simulation can then only be active for the lifetime of one session, which is quite unsatisfactory for long-running simulations.  In pV3 the separation of simulation and visualization allows the user to connect to and disconnect from a running simulation as and when they wish, but only using the pV3 viewer.  Can we generalise this notion to provide for the linking of simulation and visualization processes?

The remaining sections explain how these limitations have been addressed.  The section on Abstractions discusses solutions to the first three points, and the section on Linking Simulations and Visualizations addresses the final point.




December 2004