The gViz Project
Research Overview
Abstractions for Visualization
In this part of the project, the aim was “to study the potential of XML for visualization in the context of Grid computing, and to demonstrate its effectiveness through prototype implementations”. Four advances have been made: a new layered reference model for visualization; an XML application and pilot software; new work towards an ontology for visualization; and a proposal for handling the myriad data formats for visualization.
A Layered Reference Model for Visualization
The reference model is described in three layers: a conceptual layer, where the visualization is described in terms of abstract processes independent of any software or physical resources with which it might eventually be realised; a logical layer, which binds in software resources; and a physical layer, which binds in computing resources from the environment (e.g. Grid, but the model is not restricted to Grid resources). At each layer, constraints may be specified which restrict the relationship to the next layer, for example constraints which limit the possible software bindings, or constraints which limit the binding to physical resources, e.g. “these two entities have to be co-located”. The model was used in the presentation of a State of the Art Report on Distributed and Collaborative Visualization and was one thread that led us to recognise the need for work in the ontology area.
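
The three layers and their constraints can be pictured with a small Python sketch. This is only an illustration of the idea, not the project's software: the class names, the fields `allowed_software` and `colocate_with`, and the host names are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class ConceptualProcess:
    """Conceptual layer: an abstract process, independent of software or hardware."""
    name: str
    # Hypothetical constraint on the next layer: software permitted to realise
    # this process. An empty set means "no restriction".
    allowed_software: Set[str] = field(default_factory=set)


@dataclass
class LogicalModule:
    """Logical layer: a conceptual process bound to a software resource."""
    process: ConceptualProcess
    software: str
    # Hypothetical constraint on the next layer: names of processes that
    # must be placed on the same host as this one.
    colocate_with: Set[str] = field(default_factory=set)


@dataclass
class PhysicalPlacement:
    """Physical layer: a logical module bound to a computing resource."""
    module: LogicalModule
    host: str


def bind_software(process: ConceptualProcess, software: str) -> LogicalModule:
    """Bind software to a process, honouring the conceptual-layer constraint."""
    if process.allowed_software and software not in process.allowed_software:
        raise ValueError(f"{software!r} is not permitted for {process.name!r}")
    return LogicalModule(process, software)


def check_colocation(placements: List[PhysicalPlacement]) -> List[Tuple[str, str]]:
    """Return (module, partner) pairs whose co-location constraint is violated."""
    host_of = {p.module.process.name: p.host for p in placements}
    violations = []
    for p in placements:
        for partner in sorted(p.module.colocate_with):
            if host_of.get(partner) != p.host:
                violations.append((p.module.process.name, partner))
    return violations
```

Keeping the constraints on the upper layer and checking them when the next binding is made mirrors the description above: the model records what is allowed, while the choice of a particular binding is left to whatever performs the allocation.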
A Language for Dataflow Visualization
We have developed an XML application to capture a description of the visualization application at any of the levels in the reference model, though it was initially developed for the logical level. The design of this language, skML, has been reported in the literature; the key points to note here are:
Visualization applications are represented as graphs. We call the nodes “modules” and the edges “links”. The names of nodes define their function, e.g. isosurface, and parameter elements define the parameters of the module. Modules have named input and output ports, and a link is a connection between two ports.
Annotations describe other properties of nodes and links, using RDF. The content of the annotation can use arbitrary XML applications, for example to express resource constraints or additional information about a module (e.g. that it is an IRIS Explorer module). We have not developed our own vocabularies for resource constraint annotations, recognising that other activities, such as those within GGF, are doing this.
Collaborative working can be described by annotating nodes and links with “roles”, for example teacher, student; or geoscientist, meteorologist, etc. Roles may be instantiated for particular users.
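
To make the graph structure concrete, the following Python sketch builds a small dataflow description as XML and checks that every link joins two declared ports. It is not the skML schema: the element and attribute names (`pipeline`, `module`, `port`, `param`, `role`, `link`, `from`, `to`) are assumptions chosen for illustration, and the RDF annotation mechanism is reduced to a plain `role` element.

```python
import xml.etree.ElementTree as ET


def build_pipeline() -> ET.Element:
    """Build a two-module graph; all element names here are hypothetical."""
    root = ET.Element("pipeline")

    reader = ET.SubElement(root, "module", {"id": "m1", "name": "reader"})
    ET.SubElement(reader, "port", {"name": "out", "direction": "output"})

    iso = ET.SubElement(root, "module", {"id": "m2", "name": "isosurface"})
    ET.SubElement(iso, "port", {"name": "in", "direction": "input"})
    ET.SubElement(iso, "port", {"name": "out", "direction": "output"})
    # A parameter element carries a module parameter.
    ET.SubElement(iso, "param", {"name": "threshold"}).text = "0.5"
    # Stand-in for a role annotation (the source uses RDF for annotations).
    ET.SubElement(iso, "role").text = "teacher"

    # A link is a connection between two named ports.
    ET.SubElement(root, "link", {"from": "m1.out", "to": "m2.in"})
    return root


def links_of(root: ET.Element):
    """Return every link as a (source port, destination port) pair."""
    return [(link.get("from"), link.get("to")) for link in root.findall("link")]


def dangling_links(root: ET.Element):
    """Return links that refer to a port no module declares."""
    ports = {
        f"{module.get('id')}.{port.get('name')}"
        for module in root.findall("module")
        for port in module.findall("port")
    }
    return [pair for pair in links_of(root) if pair[0] not in ports or pair[1] not in ports]


if __name__ == "__main__":
    print(ET.tostring(build_pipeline(), encoding="unicode"))
```

Addressing ports as `module-id.port-name` is one simple convention for this sketch; a real vocabulary could equally reference ports through separate attributes.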
We recognised four consequences of this approach:
skML at the conceptual level can be used to capture the visualization designer’s intent and can in principle be transformed into a skML document at the logical level to realise that intent.
skML at the logical and conceptual levels can be used to record constraints on resource allocation. We have given a mechanism to record constraints, but we have not concerned ourselves with how these constraints might be resolved.
skML at the physical level annotated with descriptions of the resources used for a particular problem is part of the provenance of the problem. It captures how a particular visualization of a particular dataset was generated in terms of software and hardware resources.
Resource allocation to a problem can change over time. There are XML applications that deal with time-dependent behaviour, e.g. SMIL and the related animation elements in SVG, but we have not explored the consequences of this in this project.
The project organised a workshop on Visualization for e-Science, held at NeSC in January 2003. Our own work and that of other projects recognised the need to establish an ontology for visualization. The motivation from gViz was the need for a way to express what modules do at each of the levels in the reference model, for reasons of provenance, resource allocation and mapping. But there are good reasons other than these for ontology work, and such work, to be really useful, must have broad support in the community. A workshop on this specific topic was organised at NeSC in April 2004 and made good initial progress. The results of the workshop were presented in a poster session at the IEEE Visualization conference and in a Birds-of-a-Feather (BOF) session. As a result of the BOF, more activities are being planned.
A major goal of e-science is to enable the best use of existing data sources for scientific investigation and to maximise re-use. The diversity of formats of data sources and of visualization systems has led to special-purpose and project-specific solutions. Although there are mature initiatives to produce self-describing datasets (HDF5 and netCDF), these have not gained universal acceptance, and work on data representation already uses, and will continue to use, application-oriented data initiatives based on XML. E-science virtual organisations (VOs) in the future are likely to be loosely coupled groups of investigators who will bring diverse knowledge of data sources and tools to a collaborative investigation. Thus a solution is needed to convert an N*M problem (a separate converter for every pairing of N data sources with M visualization systems) to an N+M one (one description per data source and one per system), and to provide a framework for conversion based on descriptions provided previously by an expert in a data source and an expert in a visualization system.
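
The difference between the two counts is easy to see with a few lines of Python. The figures used here (10 data sources, 6 visualization systems) are made-up numbers for illustration only.

```python
def pairwise_converters(n_sources: int, m_systems: int) -> int:
    """One dedicated converter for every (data source, system) pairing."""
    return n_sources * m_systems


def descriptions_needed(n_sources: int, m_systems: int) -> int:
    """One description per data source plus one per visualization system."""
    return n_sources + m_systems


if __name__ == "__main__":
    n, m = 10, 6  # illustrative values, not taken from the project
    print(pairwise_converters(n, m))   # prints 60
    print(descriptions_needed(n, m))   # prints 16
    # Adding one new data source costs m new converters in the pairwise
    # scheme, but only a single new description in the descriptive scheme.
    print(pairwise_converters(n + 1, m) - pairwise_converters(n, m))  # prints 6
    print(descriptions_needed(n + 1, m) - descriptions_needed(n, m))  # prints 1
```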
A solution that has been investigated is not to convert all the data to a standardised intermediate language, but to use a descriptive approach, based on XML, to describe the data source and the capabilities of a visualization system (or indeed more generally of a data analysis or data mining system). Although still at an early stage, this has the potential to make new matches between data format and visualization system. It is notable that the descriptive approach is also being taken by such initiatives as BinX for binary data and DFDL, a GGF initiative for describing datasets. This study was presented at the AHM 04 meeting.
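
A minimal sketch of how such descriptions might be matched is given below. It assumes, purely for illustration, that a data source is described by its structure, element type and rank, and that a system advertises the structures, types and maximum rank it accepts. These fields, and the source and system names, are hypothetical and are not taken from the project's XML descriptions, BinX or DFDL.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class SourceDescription:
    """Hypothetical description written once by an expert in the data source."""
    name: str
    structure: str      # e.g. "regular-grid" or "scattered"
    element_type: str   # e.g. "float32"
    rank: int           # number of dimensions


@dataclass(frozen=True)
class SystemCapability:
    """Hypothetical description written once by an expert in the system."""
    name: str
    structures: FrozenSet[str]
    element_types: FrozenSet[str]
    max_rank: int


def can_visualize(source: SourceDescription, system: SystemCapability) -> bool:
    """A source matches a system if every described property is supported."""
    return (
        source.structure in system.structures
        and source.element_type in system.element_types
        and source.rank <= system.max_rank
    )


def find_matches(
    sources: List[SourceDescription], systems: List[SystemCapability]
) -> List[Tuple[str, str]]:
    """Compare the N + M descriptions to discover all workable pairings."""
    return [(s.name, v.name) for s in sources for v in systems if can_visualize(s, v)]


SOURCES = [
    SourceDescription("ocean_model", "regular-grid", "float32", 3),
    SourceDescription("survey_points", "scattered", "float64", 2),
]
SYSTEMS = [
    SystemCapability("SystemA", frozenset({"regular-grid"}),
                     frozenset({"float32", "float64"}), 3),
    SystemCapability("SystemB", frozenset({"regular-grid", "scattered"}),
                     frozenset({"float64"}), 2),
]

if __name__ == "__main__":
    print(find_matches(SOURCES, SYSTEMS))
```

Because each description is written independently, a newly described system can be paired with existing data sources without anyone writing a converter for that specific combination.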