Abstract

This paper explores the way in which data visualization systems, in particular modular visualization environments, can be used over the World Wide Web. The conventional approach is for the publisher of the data to be responsible also for creating the visualization, and posting it as an image on the Web. This leaves the viewer in a passive role, with no opportunity to analyse the data in any way. We look at different scenarios that occur as we transfer more responsibility for the creation of the visualization to the viewer, allowing visualization to be used for analysis as well as presentation. We have implemented one particular scenario, where the publisher mounts the raw data on the Web, and the viewer accesses this data through a modular visualization environment - in this case IRIS Explorer. The visualization system is hosted by the publisher, but its fine control is the responsibility of the viewer. The picture is returned to the viewer as VRML, for exploration via a VRML viewer such as Webspace. We have applied this to air quality data which is posted on the Web hourly: through our system, the viewer selects what data to look at (i.e. species of pollutant, location, time period) and how to look at it - at any time and from anywhere on the Web.

1. Introduction

The World Wide Web has revolutionised the way we access information. Sources of data can remain with the originator, or publisher, and the receiver, or viewer, of the data can access it conveniently over the Internet, using a Web browser such as Mosaic or Netscape. This model of working allows the publisher to keep the information continuously up to date, so the viewer gets data that is current at the time of query. An obvious attraction of the Web is that the data may be in pictorial form: images, for example, have been a central part of the Web from its inception. Indeed, the scientific community has been quick to exploit the potential of the Web for marketing and promotion, and research groups are now able to present their work as multimedia documents, with images and animations of their results. However, this is surely only the start. These images and video clips are passive, a posteriori views of the research, in which the visualization is created by the publisher of the data and the viewer merely looks at the pictures as though leafing through a book. The medium of the Web offers far greater opportunities. There is scope for active participation by the viewer in the way data is visualized. Indeed, we need to provide this if we are to use visualization on the Web as an analysis tool, rather than just as a tool for the presentation of results.

In this paper, we look at some of the possibilities that are available. We begin in section 2 with a very simple reference model in which there are two ‘players’: the publisher of the data, and the viewer of the data. Different scenarios arise according to who has the responsibility for creating the visualization: the conventional Web model gives the publisher responsibility, but it is possible to allocate responsibility to the viewer, or there can be a shared responsibility. Section 3 describes the implementation using IRIS Explorer of the shared responsibility case: the publisher posts the data, and provides a visualization framework, but delegates precise control to the viewer. In section 4 we apply this technology to the visual monitoring of environmental information, allowing interested parties to view pollution data at any time, at any place. In section 5 we conclude with thoughts on other applications of these ideas.

2. A Reference Model for Visualization over the Web

2.1 Visualization Reference Model

The reference model which has underpinned much of modern scientific visualization is that proposed by Upson et al [11] and Haber and McNabb [6]. This model sees the visualization process as a pipeline, in which a source of data is fed in, and successively filtered, mapped and rendered to create a final image. The filter process selects the data of interest - perhaps a cross-section; the map process creates an abstract geometrical representation of the data - perhaps a contour map; and the render process takes the 3D geometry from the map process, and applies lighting, shading and projection to create an image. This pipeline is shown in Figure 1 - it has of course proved the implementation model for the family of dataflow visualization systems, or Modular Visualization Environments (MVEs) that includes AVS, IRIS Explorer, IBM Data Explorer and Khoros (see [3] for an overview).
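The filter-map-render pipeline can be sketched as three composable stages. The toy data and function names below are illustrative only, and belong to no particular MVE; the point is that each stage consumes the output of the previous one, from raw data through to a final image.

```python
# Toy sketch of the filter -> map -> render pipeline (illustrative names only).

def filter_stage(data, row):
    """Select the data of interest - here, one cross-section (row) of a 2D grid."""
    return data[row]

def map_stage(values, threshold):
    """Create an abstract geometric representation - here, the crossing points
    of a single contour level, found by linear interpolation between samples."""
    crossings = []
    for i in range(len(values) - 1):
        a, b = values[i], values[i + 1]
        if (a - threshold) * (b - threshold) < 0:  # contour passes between samples
            t = (threshold - a) / (b - a)
            crossings.append(i + t)
    return crossings

def render_stage(geometry, width):
    """Project the geometry to a final 'image' - here, a one-line ASCII plot."""
    image = [' '] * width
    for x in geometry:
        image[int(x)] = '*'
    return ''.join(image)

grid = [[0, 2, 4, 6, 4, 2, 0],
        [1, 3, 5, 7, 5, 3, 1]]
picture = render_stage(map_stage(filter_stage(grid, 0), threshold=3.0), width=8)
```

In an MVE, each stage is a module and the composition is drawn as a dataflow network rather than written as nested function calls.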

Figure 1 : Visualization Pipeline

This model can help us understand visualization over the Web: the data clearly begins with the publisher; the image clearly ends with the viewer; but different scenarios occur when we consider who has responsibility for the intermediate processes.

Figure 2 : Scenario 1. Getting images across the WWW

2.2 Scenario 1 - Publisher creates the visualization as an image

Figure 3 : Scenario 2. Getting 3D objects across the WWW

This is the status quo on the Web. The person with the information creates a visual representation and posts it as an image, or as a video sequence, on the Web. The viewer can study this, either directly from the browser, or via a helper application such as xv or an mpeg player. This scenario is shown in Figure 2, with the interpretation in terms of the reference model alongside.

Already the limitations of this approach are being criticised: images are often large, and when they arrive they are ‘dead’ - the viewer has little opportunity to manipulate them. This leads naturally to the next scenario.

2.3 Scenario 2 - Publisher creates the visualization as a 3D model

There is much current interest in VRML, which allows the publisher to create a 3D model rather than an image; the viewer can now render the model as they wish, with the chance to walk through it in 3D space. This is a significant advance (though only paralleling the development of computer graphics itself in the 1960-1970 era). This scenario is illustrated in Figure 3. The publisher provides VRML, and the viewer looks at it using a browser such as Webspace.

These two scenarios allow the viewer to see the results, largely as they are seen by the publisher. Admittedly, in the VRML case, the viewer has much more freedom to explore - but the abstract model of the data has been fixed in both cases by the information provider.

2.4 Scenario 3 - The viewer creates the visualization

A different scenario occurs when the viewer has the responsibility for the visualization. Figure 4 shows the publisher posting the raw data on the Web: the viewer accesses the data and uses a visualization system, say an MVE such as IRIS Explorer, as the helper application to investigate the data. In terms of the visualization reference model, the entire pipeline moves to the viewer.

This approach has a number of advantages:

and some disadvantages:

There are further problems. There is no standard for visualization - different packages use different data models. Do we agree on one visualization system for the Web and ask data originators to provide data in the format required by that system - for example, the IRIS Explorer lattice and pyramid datatypes, or the AVS field datatype? A possible solution is to allow a number of data formats as secondary MIME types - this is the strategy being followed by molecular chemists, who have proposed a Chemistry MIME type [12]. This approach accepts that a range of formats are in common use, pdb for example, and includes them as secondary types.
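The secondary-MIME-type strategy amounts to a mapping from file extension to type name, which the browser uses to launch the right helper application. The sketch below illustrates the idea; only chemical/x-pdb is drawn from the chemistry proposal, and the visualization type names are invented for illustration.

```python
# Hypothetical mapping from file extension to a secondary MIME type, in the
# style of the chemical MIME proposal. Only 'pdb' follows that proposal;
# the visualization entries are invented for illustration.
MIME_TYPES = {
    'pdb': 'chemical/x-pdb',                  # Protein Data Bank (chemistry proposal)
    'lat': 'application/x-explorer-lattice',  # IRIS Explorer lattice (hypothetical)
    'fld': 'application/x-avs-field',         # AVS field (hypothetical)
}

def content_type(filename, default='application/octet-stream'):
    """Return the MIME type a server would send for this data file."""
    extension = filename.rsplit('.', 1)[-1].lower()
    return MIME_TYPES.get(extension, default)
```

A server consulting such a table would send `application/x-explorer-lattice` for `ozone.lat`, and the browser, configured with the matching helper, would hand the file to the visualization system.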

One system using this MIME mechanism is FASTtrek from NASA Ames Research Center [4]. Here a platform-dependent (SGI) visualization system is started as a helper application whenever an object of the particular MIME type is encountered. This system allows local visualization of data provided in the appropriate format, using a number of possible techniques. Added to this is the capability to run collaborative sessions, and to play back visualizations created by others by means of a script. Thus, in terms of the reference model, FASTtrek corresponds essentially to Figure 4, but the scripts (provided by the publisher) set up an initial framework for the visualization that executes in the viewer’s environment; once this framework is set up, the viewer may then take control. The system still suffers from the previously mentioned disadvantages of this scenario.

Figure 4 : Scenario 3. Visualizing data from across the WWW

2.5 Scenario 4 - Publisher creates visualization framework, viewer chooses the options

The previous scenarios have given responsibility for the creation of the visualization either to the publisher (scenarios 1 and 2) or to the viewer (scenario 3). In both cases there are drawbacks: leaving it to the publisher is very inflexible, and inhibits the viewer; yet giving entire responsibility to the viewer imposes serious requirements on the computing power and expertise available at the receiving end. Of course this is just another facet of the ongoing debate on whether computer processing should be centralised and results delivered as a network service - or whether processing power should be distributed to the end-user.

There is a compromise solution, offering in some senses the best of both worlds, in which publisher and viewer share the responsibility. We suppose the publisher provides the processing power and the basic visualization framework appropriate for the data concerned, but we give the viewer responsibility for the fine control of the options within that framework. Figure 5 shows this scenario: the pipeline runs on a server associated with the publisher, but control of that pipeline resides with the viewer. Note that the render responsibility resides entirely with the viewer (as in scenario 2), with VRML being delivered by the publisher.

Thus the viewer of the data controls the visualization. Notice that the complexity of the visualization software remains with the publisher - the user interface can be a simple form in which options are selected and passed to the server. Indeed, in the next section we describe an implementation of this scenario in terms of IRIS Explorer.

Figure 5 : Scenario 4. Requesting visualization across the WWW

An example of this scenario was created by Ang et al [1], but centred around a custom volume visualization package called VIS. This was very closely coupled to their extended Mosaic WWW browser and other integrated tools for volume visualization. It allows the user to specify a viewpoint and a number of parameters that are sent to the server, which renders the scene and returns the image to the browser. This system covers a specific field of visualization, and relies on the user having the extended browser. With respect to the reference model, this places the boundary between the render process and the image, with the user having some control over the map and render processes.

3. A Visualization Web Server

3.1 Visualization Server Architecture

Our aim then is to use the model of scenario 4 to develop a system architecture for a visualization web server. The data and the visualization framework will reside with the publisher - we shall assume now that this framework is to be provided by one of the family of MVEs. IRIS Explorer [5] was used in this implementation, but the same system architecture would apply to almost any MVE. The viewer will drive the visualization from a standard Web browser - their visualization requests will be delivered to the server, the 3D abstract model created as directed, and this model returned to the viewer as VRML.

 

Figure 6 : Visualization Web Server

3.2 Implementation Using IRIS Explorer

First, a form is created by the publisher containing the parameters that may be set by the viewer. Obviously this form will be particular to the data service being offered. Once the viewer has set the values they require, the form is submitted and the Web server passes the information to a Common Gateway Interface (CGI) script which runs at the publisher’s site. This script contacts the visualization server and passes on the values selected by the viewer. The server is then able to generate the appropriate scripting commands to drive the visualization system to create the requested visualization. In the case of IRIS Explorer, this involves generating Skm commands. These can set parameters of modules in an Explorer dataflow network, or map, that has been created previously by the publisher. Indeed, it is possible to allow simple configuration of the map ‘on-the-fly’ by including switch modules whose values can be set by the viewer through the form interface. The final output from the IRIS Explorer map is geometry data, which is passed back to the CGI script. The script then processes this geometry data to build a VRML scene, which is returned to the viewer. The viewer may subsequently alter the values and re-submit the form to investigate the data further.
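A minimal sketch of the server side of this exchange follows. The command strings and the module and parameter names are invented for illustration (real Skm syntax differs), and the geometry-to-VRML step is reduced to wrapping pre-computed coordinates; the overall shape - form values in, scripting commands to the MVE, VRML 1.0 out - is the point.

```python
# Sketch of the CGI stage: form values -> scripting commands for the MVE,
# then geometry -> a minimal VRML 1.0 scene. Command syntax and module
# names ("ReadData", "Surface") are invented for illustration.

def form_to_commands(form):
    """Translate the viewer's form selections into scripting commands that set
    module parameters in the publisher's pre-built dataflow network."""
    return [
        '(param "ReadData" "species" "%s")' % form['species'],
        '(param "ReadData" "site" "%s")' % form['site'],
        '(param "Surface" "style" "%s")' % form['style'],
        '(fire "ReadData")',  # re-execute the network from the reader onwards
    ]

def geometry_to_vrml(points):
    """Wrap a list of (x, y, z) tuples as a minimal VRML 1.0 point set."""
    coords = ',\n    '.join('%g %g %g' % p for p in points)
    return ('#VRML V1.0 ascii\n'
            'Separator {\n'
            '  Coordinate3 { point [\n    %s\n  ] }\n'
            '  PointSet { numPoints %d }\n'
            '}\n' % (coords, len(points)))

request = {'species': 'O3', 'site': 'London', 'style': 'surface'}
commands = form_to_commands(request)
scene = geometry_to_vrml([(0, 0, 0), (1, 2, 0), (2, 1, 0)])
```

In the real system the commands are sent to the running IRIS Explorer session, and the geometry arriving back is the full output of the map rather than a fixed list of points.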

IRIS Explorer is an attractive vehicle for this architecture, for a number of reasons:

Figure 6 shows this architecture with its realisation in terms of IRIS Explorer.

4. Application to Environmental Data

Pollution data is of interest to us all - to the general public and especially those with illnesses triggered by poor air quality; and to the scientists who are seeking to understand the causes of pollution, and predict its future levels. In the UK, the National Atmospheric Emission Inventory (NAEI) is funded by the Department of the Environment to estimate and monitor emissions of a wide range of air pollutants. Measurements are made at a number of selected sites located throughout the UK.

The pollutants monitored include those which have a global impact - the ‘greenhouse’ gases: carbon dioxide, methane and nitrous oxide; and those which have a more local effect, such as photochemical pollution (ground level ozone) and acid deposition (acid rain): sulphur dioxide, nitrogen oxides, carbon monoxide and ammonia; plus metals such as lead, and other pollutants such as pesticides.

The data is sent from the observation points to the National Environmental Technology Centre at Culham, in Oxfordshire, which makes the collected data publicly available on an hourly basis on its Web site [13]. This provides an important archive of environmental data. It is largely in numeric form. However, some of the data has been processed into simple graphs, and these are made available as GIF images. For example, Figure 7 shows the concentration of Benzene recorded in Liverpool over a 6 week period (Provisional Data). In terms of our reference model, this corresponds to scenario 1 (Figure 2) - the visualization is created by the publisher, and the viewer plays only a passive role.

Figure 7 : Current data display method. (Provisional Data.)

The requirements are surely much greater. Certainly a visualization of the data is central. But the viewer - whether general public or scientist - wants the freedom to select the data to be visualized, and also how it is to be visualized. Moreover, they want to be able to look at today’s data - not wait a week until the next set of images is created.

The technology outlined in the previous section allows us to achieve this. We have built a prototype pollution visualizer which makes the NAEI data available in pictorial form - at any time, and from anywhere on the Web. The user points their browser at a Web page that hosts the ‘Air Quality Data Visualization’ service, and enters via a forms interface the species of pollutant, the location and the time period of interest. These selections allow different correlations to be examined: one species at a series of locations over a period of time; or several species at the same location.

The visualization can be a simple 1D plot (as provided before, but now with the user in control of data selection) - or more valuably, a 2D plot of concentrations against time and location or species. Once the selection is made, the relevant dataset is retrieved, and IRIS Explorer fired up with a map which will create an appropriate view of the data. The resulting geometry is output as VRML, and passed back to the user as described in the previous section.
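Assembling the 2D plot amounts to pivoting a flat list of hourly readings into a grid of concentration against time and site. The record format below, (site, hour, value), is invented for illustration; None marks a missing reading, which matters for the interpolation step discussed later.

```python
# Sketch of assembling the 2D plot data: hourly readings become a grid of
# concentration against time (rows) and site (columns). The (site, hour, value)
# record format is invented for illustration; None marks a missing reading.

def build_grid(readings, sites, hours):
    """Pivot a flat list of (site, hour, value) records into a hours x sites grid."""
    index = {(site, hour): value for site, hour, value in readings}
    return [[index.get((site, hour)) for site in sites] for hour in hours]

readings = [
    ('London', 0, 12.0), ('London', 1, 14.5),
    ('Leeds', 0, 9.0),                      # Leeds reading for hour 1 is missing
]
grid = build_grid(readings, sites=['London', 'Leeds'], hours=[0, 1])
```

Selecting several species at one site instead of one species at several sites only changes which field of the record indexes the columns.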

Some example output is shown here, where the user has selected, via the forms interface, Ozone data from London covering the 25-day period from June 1st, 0 hours, to June 25th, 23 hours, with a 1-day interval on the x-axis. Two views are possible: a block chart (Figure 8), with pillars indicating the discrete data readings; or a smooth surface view (Figure 9), which often gives a better visualization of trends in the data. The 2D plots are returned as 3D geometry, whose view can be manipulated by the user - to look from behind, for example.

Figure 8 : Histogram View of Ozone from 3 sites in London. (Provisional Data)

A problem occurs with missing data values. It is important to show these as missing, but also important that they do not distort the view of the data that is available. Our solution has been to assign a value by interpolation (using the NAG Library implementation - E01SAF / E01SBF - of the Renka and Cline scattered data modelling algorithm [9]) to any location with missing data: this gives a consistent geometry. At the rendering stage, the missing values are highlighted by assigning transparency to the surface in the areas of missing data. In this way, the visualization of the smooth surface is not unduly affected by the missing data.

Figure 9 : Surface View of Ozone from 3 sites in London. (Provisional Data)

5. Conclusions and Future Work

We have described a number of scenarios in which visualization and the Web can interact. These lead to architectures which basically differ according to who has the responsibility for creating the visualization:

We have implemented an instance of the third architecture, using IRIS Explorer as the base visualization system hosted by the publisher, driven by controls operated by the viewer across the Web.

We have described one application of the technology - to environmental data. However it is clear that this is a generic approach which can be applied to a variety of public information databases. The visualization is simply a filter placed between the raw data and the viewer. There are obvious applications to economic data, to meteorological data, to transport data, and so on - virtually any data posted on the Web could be accompanied by a visualization filter in this way.

A by-product of this work is that we have provided a very simple interface to a visualization system. Thus the system has value for novice users, and in the teaching of visualization. Students can be shielded from the intricacies of dataflow networks, and left to focus on the different visualization techniques [2].

In our specific example, we used the NAG Renka and Cline interpolation software to fill in missing values. More generally, interpolation is a fundamental operation in visualization - typically we look at a model of the data rather than the data itself, and it is interpolation (explicit or implicit) that gives us the model. In a visualization web service, one might usefully provide a range of interpolation methods, so that a user could experiment with different techniques. Good reviews of interpolation for scientific visualization based on scattered data are [7,8].

The present form-based interface is simple, but limiting. A more advanced way of providing the user interface would be to use Java [14], a platform-independent programming language. Java programs, called applets, can be pulled across the Web and executed locally within the user’s browser. This would allow us to extend the essentially ‘dead’ forms interface into an active interface that processes information provided by the user locally, and sends the visualization server only the information required for the visualization. This would greatly reduce the number of interactions between browser and server needed to drive the visualization. A further step is to develop an ‘intelligent’ interface, which infers from a description of the data the type of visualization that might be appropriate (see, for example, the thesis of Stead [10]).

So far, we have assumed that the data and the visualization framework are closely coupled, both the responsibility of the publisher. However, the ideas of the previous paragraph allow us to envisage a separation. The publisher provides only the data and a description of it in a meta-language, and posts it at a URL; the viewer selects the data and its description by pointing their browser at it; this is passed to a general WWW visualization server - somewhere on the Web - which takes the data and returns the visualization. We now have three players: the publisher, the viewer, and the intelligent visualization system.

Finally, the Web offers great opportunities for collaborative computing. We have only described a very simple form of collaboration here: the publisher posts the data, for one or more viewers to pick up and study individually. The potential is much greater:

Other work on collaborative visualization can be embedded within these architectures - for example, the COVISA project, which has developed a collaborative version of IRIS Explorer, could be integrated within scenario 3 above.

Acknowledgements

We would like to thank a number of people who have helped us in this work with data, inspiration and funding! In particular, we wish to thank Brian Kelly, of the Netskills project at the University of Newcastle; Joel Smith of the University of Leeds for his help with the environmental data; Trevor Davies and Geoff Broughton of AEA Technology; John Tubby of Leeds Environment Department; and Jeremy Walton of NAG Ltd.

This work was carried out under the EPSRC-funded COVISA project at the University of Leeds, which is looking at different aspects of collaborative visualization.

References

[1] Ang C S, D C Martin and M D Boyle, "Integrated Control of Distributed Volume Visualization Through the World Wide Web", Proceedings IEEE Visualization 94 Conference, October 1994

[2] K W Brodlie, J D Wood and H Wright, "Scientific visualization: Some novel approaches to learning" Proceedings of ACM SIGCSE/SIGCUE Conference , Barcelona, 1996

[3] G Cameron, "Modular Visualization Environments: Past, Present and Future", Computer Graphics, Vol 29, Number 2, pp3-4, 1995.

[4] Clucas J, "Interactive Visualization of Computational Fluid Dynamics Using Mosaic", Proceedings Second International WWW Conference ‘94, Chicago, October 1994

[5] D. Foulser. "IRIS Explorer: A Framework for Investigation". Computer Graphics, 29(2):13-16, 1995.

[6] R B Haber and D A McNabb, "Visualization idioms: A conceptual model for scientific visualization systems", in B Shriver, G M Neilson and L Rosenblum, editors, Visualization in Scientific Computing, pp74-93. IEEE, 1990.

[7] G. M. Nielson, T. Foley, B. Hamann and D. Lane, "Visualization and modeling of scattered multivariate data", Computer Graphics and Applications, May 1991, pp. 47-55.

[8] G. M. Nielson, "Scattered Data Modeling", Computer Graphics and Applications, January 1993, pp. 60-70.

[9] R J Renka and A K Cline, "A triangle-based C1 interpolation method", Rocky Mountain Journal of Mathematics, Volume 14, Number 1, 1984.

[10] G A Stead, "Simplifying the visualization process", PhD thesis, University of Leeds, 1995

[11] C Upson et al, "The Application Visualization System: A computational environment for scientific visualization". IEEE Computer Graphics and Applications, Vol 9, Number 4, pp30-42, 1989.

[12] WWW, "Chemical applications of the web", http://chem.leeds.ac.uk/novel.html.

[13] WWW, Air quality information home page, http://www.aeat.co.uk/netcen/airqual/welcome.html.

[14] WWW, "Java", http://www.sun.com/sun-on-net/java/java.html