
The gViz Project
Research Overview
Grid-enabling of Existing Visualization Systems
In this
phase of the project we selected
two representative visualization systems, and explored how they could
evolve to
exploit the new technologies emerging in Grid computing and in Web
services. The first system is IRIS
Explorer, developed by NAG Ltd, a partner in the project.
IRIS Explorer is representative of a class of
dataflow visualization systems; the challenge in this study was to
allow the
dataflow to extend in a secure manner across Grid computing resources. The second system is pV3, developed at MIT but
used industrially within the UK
by institutions such as Rolls-Royce. The
challenge here was to incorporate Web services as a replacement for
message
passing between the desktop visualization and the remote data source.
IRIS Explorer and Grid-enabled Dataflow Visualization
Figure 1 illustrates the concept of
dataflow visualization as a pipeline of data input, data selection,
data
transformation to geometry, and geometry rendering. IRIS Explorer
follows this model, providing a
set of pre-defined modules that a user can connect together into a
network, or map.
The system is open in the sense that a user can encapsulate their own
code as a module, and this allows for example simulation code to be
included as
part of the pipeline (again illustrated in Figure 1).
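
The pipeline of Figure 1 can be sketched in a few lines of Python. This is only a minimal illustration of the dataflow idea, not IRIS Explorer code: the module names (`read_data`, `select`, `to_geometry`, `render`) and the `run_map` helper are hypothetical, and a real map is a network of modules rather than the simple chain assumed here.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-ins for the four stages of Figure 1.
# None of these names come from IRIS Explorer itself.
Module = Callable[[object], object]


def read_data(_: object) -> List[float]:
    """Data input: produce a small 1-D scalar field."""
    return [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]


def select(values: Sequence[float]) -> List[float]:
    """Data selection: keep samples at or above a threshold."""
    return [v for v in values if v >= 1.0]


def to_geometry(values: Sequence[float]) -> List[Tuple[int, float]]:
    """Transformation to geometry: turn samples into (x, y) points."""
    return [(i, v) for i, v in enumerate(values)]


def render(points: Sequence[Tuple[int, float]]) -> str:
    """Rendering: produce a textual 'image' of the points."""
    return " ".join(f"({x},{y})" for x, y in points)


def run_map(modules: Sequence[Module], source: object = None) -> object:
    """Fire each module in order, passing its output downstream."""
    data = source
    for module in modules:
        data = module(data)
    return data


# A user-written module (for example a simulation) would simply be
# another callable placed at the head of this list.
pipeline = [read_data, select, to_geometry, render]
image = run_map(pipeline)
print(image)
```

Because each stage only sees the output of the previous one, any stage could in principle run on a different machine, which is the property the Grid-enabled version exploits.
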
The conventional use of IRIS Explorer has
all modules executing on the workstation or PC at the user’s
desk. For large simulations, however, it becomes
extremely important to execute the simulation module on a remote
server. The early design of IRIS Explorer allowed for
distributed working of this nature, but was developed before the era of
Grid
computing with its emphasis on security, authorisation, authentication
and
resource discovery.
We have developed a Grid-enabled version of IRIS Explorer that uses Grid
middleware to run securely in a distributed processing environment. The
primary instance of IRIS Explorer runs as
before on the desktop, providing a library of modules and a workspace
in which
these modules can be connected into a dataflow pipeline. However, the
e-scientist can now call up multiple instances of IRIS Explorer, running
on remote Grid resources. Authentication to allow this is handled either
by the ssh utility or by the Globus Toolkit (version 2 or 3). The
e-scientist is then provided with further libraries of
modules, one
library for each of the remote instances but all displayed on their
desktop,
from which they can select modules for inclusion in the workspace as
part of
the processing pipeline. These modules form part of the single
visualization
application, but execute remotely. Figure 2 illustrates an environmental
crisis demonstrator in which the dispersion of a toxic pollutant is
simulated, its progress being steered from the desktop by a wind arrow.
The simulation module executes remotely (indicated by the green bar at
the foot of the module panel) to exploit the most powerful Grid compute
resource available in the crisis.

Figure 1 - The Dataflow Pipeline
Figure 2 - Grid-enabled IRIS Explorer
As an
additional strand to this work, an outreach activity has been led by
the
industrial partner NAG Ltd. A number of
oral presentations on the project were given at a variety of venues
ranging
from academic conferences to commercial site visits.
As part of their contribution to the project,
NAG donated IRIS Explorer licences to the UK e-Science activity, for
the
lifetime of the gViz project. This
allowed IRIS Explorer to be used in any project within this activity
(including
those based at the national centre, in the regional centres and on
systems like
the White Rose Grid). In addition, NAG
offered technical support via telephone and email to e-Science users of
IRIS
Explorer. Around a dozen projects made
use of this offer.
pV3 and Web Services for Remote Visualization
In contrast to the dataflow model used within IRIS Explorer, pV3 uses a
data-parallel model in which the application data is distributed amongst
the nodes of a distributed-memory parallel system, such as a PC cluster,
which has become a very popular choice for cost-effective parallel
computation. On each node, pV3 computes the
corresponding
pieces of isosurface, streamlines, etc. which form the
visualization.
These "extracts" are then sent via a "concentrator" on a
front-end node to the remote desktop where they are rendered to form
the final
image. Interactivity and computational steering are achieved by sending
messages from the desktop to the cluster to determine how the extracts
are generated, and if appropriate to alter parameters of the running
computation.
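
The extract-and-concentrate pattern described above can be sketched as follows. This is an illustrative single-process model, not pV3 code: the function names are hypothetical, the "nodes" are simulated by a loop, and a 1-D field with isovalue crossings stands in for real isosurface extraction. It assumes the field has at least two samples.

```python
from typing import List, Sequence, Tuple


def partition(field: Sequence[float], n_nodes: int) -> List[Tuple[int, List[float]]]:
    """Split the field into per-node chunks that share one boundary sample."""
    n_cells = len(field) - 1          # a cell lies between two samples
    size = -(-n_cells // n_nodes)     # ceiling division
    chunks = []
    for start in range(0, n_cells, size):
        end = min(start + size, n_cells)
        chunks.append((start, list(field[start:end + 1])))
    return chunks


def compute_extract(offset: int, chunk: Sequence[float], iso: float) -> List[float]:
    """Runs on one node: find where the local data crosses the isovalue."""
    crossings = []
    for i in range(len(chunk) - 1):
        a, b = chunk[i], chunk[i + 1]
        if (a - iso) * (b - iso) < 0:  # strict sign change inside the cell
            crossings.append(offset + i + (iso - a) / (b - a))
    return crossings


def concentrate(extracts: Sequence[Sequence[float]]) -> List[float]:
    """Runs on the front-end node: merge per-node extracts for the desktop."""
    return sorted(x for extract in extracts for x in extract)


def visualize(field: Sequence[float], n_nodes: int, iso: float) -> List[float]:
    """One round trip: every node computes its extract, then they are merged."""
    extracts = [compute_extract(offset, chunk, iso)
                for offset, chunk in partition(field, n_nodes)]
    return concentrate(extracts)


field = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0, 2.0, 3.0]
print(visualize(field, n_nodes=2, iso=1.5))  # initial view
print(visualize(field, n_nodes=2, iso=0.5))  # after a steering message changes the isovalue
```

Only the small extracts travel to the desktop, never the full field, and the result does not depend on how many nodes the data is spread over.
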
The original pV3 implementation used PVM (Parallel Virtual Machine)
message-passing
to perform the communication between the cluster front-end node and the
desktop
machine. However, this places stringent restrictions on its use,
essentially requiring all processes to operate under the same
userid.
This is inappropriate for collaborations between different
organisations,
whether academic or industrial. The aim of the project was to
replace
this with SOAP message-passing, which has the added benefit of coping
easily with
firewalls. Since pV3 is written mainly in Fortran and C, we chose to use
gSOAP, a C/C++ Web services toolkit, which proved to be very easy to use
and had a number of advanced features that improved performance.
Our implementation used gSOAP messaging to emulate the PVM messaging
used
within pV3. With efficient multithreading, and the "keepalive"
and gzip compression options in gSOAP, the performance is comparable to
the
original PVM implementation.
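
To give a feel for why compression matters when extracts travel as SOAP, the sketch below wraps a list of vertices in a SOAP 1.1 envelope and compresses it with gzip. It uses only the Python standard library as a stand-in and is not the project's gSOAP code: the `sendExtract` operation and `vertex` element are hypothetical names, and the vertex data is synthetic.

```python
import gzip
import xml.etree.ElementTree as ET
from typing import List, Sequence, Tuple

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace
Vertex = Tuple[float, float, float]


def build_envelope(points: Sequence[Vertex]) -> bytes:
    """Serialise an extract as a SOAP envelope (hypothetical message layout)."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    extract = ET.SubElement(body, "sendExtract")  # hypothetical operation name
    for x, y, z in points:
        ET.SubElement(extract, "vertex", x=str(x), y=str(y), z=str(z))
    return ET.tostring(envelope, encoding="utf-8")


def parse_envelope(message: bytes) -> List[Vertex]:
    """Recover the vertices on the receiving (desktop) side."""
    root = ET.fromstring(message)
    return [(float(v.get("x")), float(v.get("y")), float(v.get("z")))
            for v in root.iter("vertex")]


# Synthetic extract: 500 vertices.
points = [(i * 0.1, i * 0.2, i * 0.3) for i in range(500)]

raw = build_envelope(points)
packed = gzip.compress(raw)                       # what would cross the network
restored = parse_envelope(gzip.decompress(packed))

print(f"uncompressed: {len(raw)} bytes, gzip: {len(packed)} bytes")
```

XML markup is highly repetitive, so the compressed message is much smaller than the raw envelope, while the payload survives the round trip unchanged.
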
Critique - Benefits and Limitations
There are tangible benefits from
this initial
phase of work:
- The enhancements to IRIS Explorer will form the key advance in the next
  release of the system from NAG (version 5.4, due for release in early
  2005). Existing users of the system will thus be able to migrate with
  minimal disruption to a system that exploits Grid technologies.
- pV3 is being used at Rolls-Royce to view data from the Hydra CFD code
  (originally developed at Oxford University and now the main production
  CFD code for the whole corporation) generated on Intel/Linux PC
  clusters supplied by Streamline Computing.
However, there are limitations:
- IRIS Explorer is only one of many dataflow visualization systems – can
  we find an abstraction for dataflow visualization that will be
  independent of any particular system?
- The binding to hardware resources is explicit in the IRIS Explorer map
  – in the spirit of Grid computing, can we separate the design of the
  visualization from the specific resources used to execute the
  visualization?
- The data formats of IRIS Explorer and pV3 are system-specific – can we
  define an approach to facilitate the interchange of data between
  systems?
- There is convenience in including the simulation as an IRIS Explorer
  module, but it can then only be active for the lifetime of one
  session, which is quite unsatisfactory for long-running simulations.
  In pV3 the separation of simulation and visualization allows the user
  to connect in and out of a simulation as and when they wish, but only
  using the pV3 viewer. Can we generalise this notion to provide for the
  linking of simulation and visualization processes?
We now explain how these
limitations have
been successfully addressed. In the section on Abstractions, we discuss
solutions to the first three points, and in the section on Linking Simulations and
Visualizations, we address the final point.
December 2004