TotalView

From HPC Wiki

TotalView is a debugger designed for large-scale HPC applications. It supports debugging serial as well as parallel applications. Supported parallel programming models include OpenMP, MPI, OpenACC and CUDA.
This page briefly describes how to debug serial and parallel programs written in C, C++ or Fortran 90/95 using the TotalView debugger.

Debugging Serial Programs

Preparation

Before debugging, the program needs to be compiled with debug information (-g flag) and without any optimization (-O0 flag).
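
As a minimal sketch of this preparation step, the following small C program can serve as a debugging target. The file name demo.c, the function mean, the macro TV_DEMO_MAIN and the choice of GCC as compiler are assumptions made for illustration only; other compilers generally accept -g and -O0 as well.

```c
/* demo.c - hypothetical debugging target (file and function names are
 * illustrative). Example build with GCC, assumed here as the compiler:
 *   gcc -g -O0 -DTV_DEMO_MAIN demo.c -o a.out
 */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Returns the arithmetic mean of the first n entries of v. */
double mean(const double *v, size_t n)
{
    double sum = 0.0;

    assert(v != NULL && n > 0);
    for (size_t i = 0; i < n; ++i) {
        sum += v[i];            /* a convenient line for a first breakpoint */
    }
    return sum / (double)n;
}

#ifdef TV_DEMO_MAIN             /* main() is compiled only when requested */
int main(void)
{
    double data[] = { 1.0, 2.0, 3.0, 4.0 };

    printf("mean = %f\n", mean(data, 4));
    return 0;
}
#endif
```

The main function is guarded by a macro so that mean can also be linked into a separate test driver; for an interactive debugging session, build with -DTV_DEMO_MAIN as shown in the header comment.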

Starting TotalView

There are three ways to start debugging your program:

  • by starting TotalView with your program as a parameter
    $ totalview a.out [ -a options ]
    
  • by starting your program first and then attaching TotalView to it. In this case, just start TotalView without arguments:
    $ totalview
    
    You will be asked what you would like to debug. Choose the option A running program (attach) and select your already running program.
  • by analyzing the core dump after your program crashed using
    $ totalview a.out <corefile>
    

If the program requires startup parameters such as runtime arguments, environment variables or standard I/O settings, these can be set in the Process → Startup Parameters... menu.
After starting your program, TotalView opens the Process Window, which consists of the parts summarized in the table below:

Source Pane       displays the program's source code
Stack Trace Pane  displays the call stack
Stack Frame Pane  displays all variables associated with the selected stack routine
Tabbed Pane       displays all breakpoints, action points and evaluation points in the Action Points subpane
                  displays all (MPI) processes in the Processes subpane
                  displays all threads of the current process in the Threads subpane
Status Bar        displays the status of the current process and thread
Toolbar           contains all action buttons

Starting, Stopping and Restarting a program

A program can be started by clicking the Go button and stopped by clicking the Halt button in the Toolbar. If a breakpoint is set, the program will also stop upon reaching that breakpoint. It is also possible to select a line in the Source Pane and let the program run until it reaches that line by clicking the Run To button in the Toolbar. You can also execute the program line by line: the Step button steps into a function call, whereas the Next button steps over function calls. To leave the current function, use the Out button in the Toolbar. A program can be restarted with the Restart button.

Printing variables

The Stack Frame Pane shows the values of simple variables. You can also look up a specific variable using the View → Lookup Variable command. This brings up a new window in which you enter the name of the variable you are interested in. The value of the variable, its address and its type are then shown in the Variable Window. Alternatively, you can dive (middle-click) on a variable to open the Variable Window.
If an array is examined in the Variable Window, the Slice and Filter fields can be used to show only a subset of its entries. For example, Slice: 1:10:2 shows every second entry from index 1 up to index 10, and Filter: > 30 shows only array entries with a value larger than 30. Arrays can also be visualized using the Tools → Visualize command in the Variable Window, as long as the array is one- or two-dimensional.
If you are examining a structure variable that contains other structures, you can dive into the hierarchy of structures and navigate through it using the left and right arrow buttons in the top right corner of the Variable Window.
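
To try these features, a small program with an array and a nested structure is sufficient. The sketch below is purely illustrative: the file name vars.c, the types point and particle, the function fill and the macro TV_DEMO_MAIN are assumed names, and GCC is assumed as the compiler.

```c
/* vars.c - hypothetical example for inspecting variables.
 * Example build (GCC assumed):
 *   gcc -g -O0 -DTV_DEMO_MAIN vars.c -o a.out
 */
#include <assert.h>
#include <stdio.h>

#define N 12

struct point {
    double x, y;
};

struct particle {
    int id;
    struct point pos;           /* nested structure: dive to descend a level */
    struct point vel;
};

/* Fills a[i] = 5 * i, so that a = {0, 5, 10, ..., 5 * (n - 1)}. */
void fill(int *a, int n)
{
    assert(a != NULL);
    for (int i = 0; i < n; ++i) {
        a[i] = 5 * i;
    }
}

#ifdef TV_DEMO_MAIN             /* main() is compiled only when requested */
int main(void)
{
    int a[N];
    struct particle p = { 7, { 1.0, 2.0 }, { 0.5, -0.5 } };

    fill(a, N);
    /* Set a breakpoint on the next line, then dive on a and on p. */
    printf("a[%d] = %d, particle %d at (%f, %f)\n",
           N - 1, a[N - 1], p.id, p.pos.x, p.pos.y);
    return 0;
}
#endif
```

With the program stopped at the marked line, a slice of 1:10:2 on a should select the entries at indices 1, 3, 5, 7 and 9 (values 5, 15, 25, 35 and 45), while a filter of > 30 should leave only the entries 35 to 55. Diving on p.pos or p.vel descends into the nested structure.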

Action Points: Breakpoints, Evaluation Points, Watchpoints

A breakpoint can easily be set by clicking on a boxed line number of an executable statement in the Source Pane. To remove a breakpoint just click on the corresponding boxed line number again.
After creating a breakpoint, an Evaluation Point can be created by right-clicking on the STOP sign and selecting Properties → Evaluate. Such an Evaluation Point can be used to temporarily add code to the program. Some basic examples are shown in the table below:

Additional print statement (Fortran write not accepted)  printf("x = %f\n", x/20)
Conditional breakpoint                                   if(i == 20) $stop
Stop after every 20 executions                           $count 20
Jump to program line 78                                  goto $78
Visualize an array                                       $visualize a
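
The expressions in the table refer to program variables named i, x and a. A hypothetical target program in which these names exist could look like the sketch below; the file name evalpoint.c, the function accumulate and the macro TV_DEMO_MAIN are assumptions for illustration, as is the use of GCC.

```c
/* evalpoint.c - hypothetical target for trying evaluation points.
 * Example build (GCC assumed):
 *   gcc -g -O0 -DTV_DEMO_MAIN evalpoint.c -o a.out
 */
#include <assert.h>
#include <stdio.h>

#define N 100

/* Stores the running sum 0 + 1 + ... + i in a[i]; returns the final sum. */
double accumulate(double *a, int n)
{
    double x = 0.0;

    assert(a != NULL);
    for (int i = 0; i < n; ++i) {
        x += (double)i;
        a[i] = x;               /* candidate line for an evaluation point */
    }
    return x;
}

#ifdef TV_DEMO_MAIN             /* main() is compiled only when requested */
int main(void)
{
    double a[N];
    double x = accumulate(a, N);

    printf("final x = %f, a[20] = %f\n", x, a[20]);
    return 0;
}
#endif
```

With an evaluation point on the marked line, the expression if(i == 20) $stop from the table should halt the program in the iteration where i equals 20. At that point x has already been updated to 210, the sum of the integers from 0 to 20.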

A watchpoint can be used to monitor the value of a variable. Each time the value at the memory location of the variable changes, the program is stopped. Watchpoints can be created in the Variable Window using the Tools → Watchpoint command.
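
Watchpoints are most useful when a variable is modified indirectly, for example through a pointer, so that the modifying statement is hard to locate by reading the code. The following sketch shows such a situation; the file name watch.c, the type state, the functions bump and run and the macro TV_DEMO_MAIN are hypothetical names, and GCC is assumed as the compiler.

```c
/* watch.c - hypothetical target for trying a watchpoint.
 * Example build (GCC assumed):
 *   gcc -g -O0 -DTV_DEMO_MAIN watch.c -o a.out
 */
#include <assert.h>
#include <stdio.h>

struct state {
    int counter;                /* the member to put a watchpoint on */
    int limit;
};

/* Changes the target only through a pointer, hiding the write from callers. */
void bump(int *p, int by)
{
    assert(p != NULL);
    *p += by;
}

/* Bumps the counter by i + 1 for every i in [0, limit) divisible by 3. */
int run(struct state *s)
{
    assert(s != NULL);
    for (int i = 0; i < s->limit; ++i) {
        if (i % 3 == 0) {
            bump(&s->counter, i + 1);
        }
    }
    return s->counter;
}

#ifdef TV_DEMO_MAIN             /* main() is compiled only when requested */
int main(void)
{
    struct state s = { 0, 10 };

    printf("counter = %d\n", run(&s));
    return 0;
}
#endif
```

With a breakpoint at the call to run in main and a watchpoint on s.counter, continuing the program should stop it inside bump each time the counter changes, which happens four times here (for i = 0, 3, 6 and 9).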

Memory Debugging

TotalView offers several memory debugging features. Dynamically allocated memory can be guarded: if a memory access outside the boundaries of an allocated block occurs, the program will stop. Memory can also be hoarded to avoid program crashes if the program accesses a memory block that has already been freed. TotalView can also detect memory leaks.
Memory debugging needs to be enabled before starting the debugging process using the Debug → Enable memory debugging command. After that, you can set a breakpoint and let the program run into it. The memory debugging window can be displayed via the Debug → Open MemoryScape menu entry. In order to detect memory leaks, select the Memory Reports → Leak Detection tab and choose either Source report or Backtrace report to get a list of leaking memory blocks.
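
A small program with a deliberate leak can be used to try the leak detection workflow described above. The sketch below is illustrative only: the file name leak.c, the functions sum_leaky and sum_clean and the macro TV_DEMO_MAIN are assumed names, and GCC is assumed as the compiler.

```c
/* leak.c - hypothetical target with a deliberate memory leak.
 * Example build (GCC assumed):
 *   gcc -g -O0 -DTV_DEMO_MAIN leak.c -o a.out
 */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Sums 0 + 1 + ... + (n - 1) via a heap buffer that is never freed. */
long sum_leaky(int n)
{
    long total = 0;
    long *buf;

    assert(n > 0);
    buf = malloc((size_t)n * sizeof *buf);
    if (buf == NULL) {
        return -1;              /* allocation failed */
    }
    for (int i = 0; i < n; ++i) {
        buf[i] = i;
        total += buf[i];
    }
    return total;               /* deliberate leak: free(buf) is missing */
}

/* Same computation, but the buffer is released before returning. */
long sum_clean(int n)
{
    long total = 0;
    long *buf;

    assert(n > 0);
    buf = malloc((size_t)n * sizeof *buf);
    if (buf == NULL) {
        return -1;              /* allocation failed */
    }
    for (int i = 0; i < n; ++i) {
        buf[i] = i;
        total += buf[i];
    }
    free(buf);
    return total;
}

#ifdef TV_DEMO_MAIN             /* main() is compiled only when requested */
int main(void)
{
    long leaked = 0;

    for (int k = 0; k < 3; ++k) {
        leaked += sum_leaky(10);        /* each call leaks one block */
    }
    printf("leaky = %ld, clean = %ld\n", leaked, sum_clean(10));
    return 0;                   /* breakpoint here, then open the leak report */
}
#endif
```

With memory debugging enabled and the program stopped at the final return statement, the leak report would be expected to list the three blocks allocated in sum_leaky, while the allocation in sum_clean should not appear.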

ReplayEngine

Another interesting feature of TotalView is the ReplayEngine.