Quick Reference Guide

MPI debugging using dbx

Debugging parallel MPI programs on Maxima

As a long-time user of cvd on the SGI machines, I was used to having easy access to multiple programs loaded into a debugger. With dbx things were not that simple, because launching the job through the mprun command meant the program could no longer simply be loaded into the debugger.

I was using Maxima, and trying to track down the cause of the message:

 Note: IEEE floating-point exception flags raised: 
    Inexact;  Division by Zero;  
 Note: IEEE floating-point exception flags raised: 
    Inexact;  Division by Zero;  Invalid Operation; 

from a multiprocessor job. Thankfully I found a way to run the MPI job under the debugger and track the cause down to a particular routine, which also explained why the errors differed between processors.

My method was as follows. It requires two windows to be open on the machine in question, and combines approaches from the Sun debugging guide and a guide to MPI debugging on SGI using dbx.

Window 1                                                    Window 2
Compile with -g and -ftrap=%all,no%underflow,no%inexact     % dbx
% mprun -np 4 ./prog
<^Z> straight away
% ps -ef | grep prog
                                                            (dbx) attach -p <pid> ./prog
                                                            Reading prog
% fg
                                                            (dbx) catch fpe
                                                            (dbx) cont
Program will run until trap or termination                  Trap will be detailed
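
The output of ps -ef | grep prog also lists the grep command itself and the mprun launcher, so picking out the right <pid> by eye can be fiddly. The following is a minimal sketch of one way to narrow the list, assuming pgrep is available on the machine; attach_cmds is a hypothetical helper name, not part of dbx or mprun. It prints one attach line, in the form used above, for each running copy of the program.

```shell
#!/bin/sh
# Sketch only: print one dbx attach command per running copy of a program.
# "attach_cmds" is a hypothetical helper, not something dbx or mprun provides.
attach_cmds() {
    name=$1
    # pgrep -x matches the process name exactly, so the grep and mprun
    # processes that "ps -ef | grep prog" would also show are left out.
    for pid in $(pgrep -x "$name"); do
        echo "attach -p $pid ./$name"
    done
}

# Run in Window 1 after suspending the job; prints nothing if no copy is running.
attach_cmds prog
```

Each printed line can then be typed at the (dbx) prompt in Window 2 to attach to the process of interest.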

Using this method, the floating-point exceptions selected for trapping at compile time can be found and, hopefully, eliminated. If the program manages to avoid them all, then the previous message at termination becomes:

 Note: IEEE floating-point exception traps enabled: 
    overflow;  division by zero;  invalid operation;