Benchmarking & Scaling Tutorial/Automated Benchmarking JUBE

From HPC Wiki

Revision as of 19:55, 24 June 2025

Tutorial
Title: Benchmarking & Scaling
Provider: HPC.NRW

Contact: tutorials@hpc.nrw
Type: Online
Topic Area: Performance Analysis
License: CC-BY-SA
Syllabus

1. Introduction & Theory
2. Interactive Manual Benchmarking
3. Automated Benchmarking using a Job Script
4. Automated Benchmarking using JUBE
5. Plotting & Interpreting Results

Introduction

The Jülich Benchmarking Environment (JUBE) is an application that helps you automate your workflow for system and application benchmarking.

JUBE allows you to define different steps of your workflow with dependencies between them.

One key advantage of using JUBE over manually running an application in different configurations in a job script is that the individual run configurations are automatically split into separate workpackages, each with its own run directory, while common files and directories (input files, preprocessing, etc.) can easily be integrated into the workflow.

Furthermore, application output (such as the runtime of the application) can easily be parsed and printed as CSV or as a human-readable table.

Writing a minimal configuration

JUBE executes each workpackage (a step with one concrete configuration) in its own sandbox, so the benchmark configuration must specify a fileset that either copies or links the required files into the run directory. Each parameter has a separator (',' by default) that is used to split the parameter string into tokens, and each token becomes part of a separate configuration. In this example, the comma-separated list of 15 values for the parameter tasks results in 15 workpackages, each with one specific value of tasks.

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="GROMACS" outpath="bench_run">
    <comment>A minimal JUBE config to run our GROMACS example</comment>

    <!-- Configuration -->
    <parameterset name="execute_pset">
      <parameter name="tasks">1,2,4,8,12,18,24,30,36,42,48,54,60,66,72</parameter>
      <!-- you could also compute the list using Python
         <parameter name="tasks" mode="python">
           ",".join([str(x) for x in range(1,73) if (x % 6 == 0 and x > 10) or (x % 4 == 0 and x < 10) or x == 2 or x==1 ])
         </parameter>
      -->
    </parameterset>

    <!-- Input files -->
    <fileset name="gromacs_files">
      <link>MD_5NM_WATER.tpr</link> <!-- link input file (mdrun -deffnm MD_5NM_WATER reads MD_5NM_WATER.tpr) -->
    </fileset>

    <!-- Operation -->
    <step name="run">
      <use>execute_pset</use>  <!-- use parameterset -->
      <use>gromacs_files</use> <!-- use fileset -->
      <do>srun -n $tasks gmx_mpi -quiet mdrun -deffnm MD_5NM_WATER -nsteps 10000 -ntomp 1 -pin on</do> <!-- start GROMACS -->
    </step>
  </benchmark>
</jube>

This configuration already creates a separate directory for each measurement, so temporary files written by the application cannot interfere across measurements.
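The expansion of the tasks parameter into workpackages can be checked with a few lines of Python. This is only a sketch of the tokenization described above, not JUBE's implementation; the helper name expand_parameter is hypothetical. It also confirms that the Python expression from the commented-out alternative yields the same list as the literal one.

```python
# Sketch: how a separated parameter string expands into workpackage values.
# expand_parameter is a hypothetical helper, not part of JUBE.

def expand_parameter(value, separator=","):
    """Split a parameter string into its individual tokens."""
    return [token.strip() for token in value.split(separator)]

# Literal list from the configuration
literal = "1,2,4,8,12,18,24,30,36,42,48,54,60,66,72"

# Python expression from the commented-out alternative
computed = ",".join(
    str(x) for x in range(1, 73)
    if (x % 6 == 0 and x > 10) or (x % 4 == 0 and x < 10) or x == 2 or x == 1
)

tasks = expand_parameter(literal)
print(len(tasks))            # 15, one workpackage per value
print(computed == literal)   # True, both definitions agree
```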

Parsing output

GROMACS writes its performance numbers to stderr. Because extracting such values from program output is a core part of benchmarking, JUBE provides infrastructure to parse output files, store the extracted values, and print them later in result tables.

    <patternset name="gromacs_output_patterns">
        <!-- process and thread counts carry no unit -->
        <pattern name="gromacs_num_procs">Using ${jube_pat_int} MPI proc.*</pattern>
        <pattern name="gromacs_num_threads">Using ${jube_pat_int} OpenMP thread.*</pattern>
        <!-- "Time:" line: core time, then wall time -->
        <pattern name="gromacs_core_time" unit="s">Time:\s*${jube_pat_fp}</pattern>
        <pattern name="gromacs_wall_time" unit="s">Time:\s*${jube_pat_nfp}\s*${jube_pat_fp}</pattern>
        <!-- "Performance:" line: ns/day, then hours/ns -->
        <pattern name="gromacs_core_perf" unit="ns/day">Performance:\s*${jube_pat_fp}</pattern>
        <pattern name="gromacs_wall_perf" unit="hours/ns">Performance:\s*${jube_pat_nfp}\s*${jube_pat_fp}</pattern>
    </patternset>

Pattern matching is done line by line with regular expressions. JUBE provides predefined variables such as ${jube_pat_int} and ${jube_pat_fp}, which contain the regular expression for an integer and a floating-point number, respectively; ${jube_pat_nfp} matches a floating-point number without capturing it, which makes it possible to skip the first number on a line and extract the second. The defined patterns are then used in a so-called analyser, which connects the patterns to the files they are applied to.

    <analyser name="gromacs_analyser">
        <analyse step="run">
            <file use="gromacs_output_patterns">stderr</file>
        </analyse>
    </analyser>
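The line-based matching performed by the patterns and the analyser can be approximated in plain Python. The following is a minimal sketch under stated assumptions: PAT_FP and PAT_NFP are simplified stand-ins for ${jube_pat_fp} and ${jube_pat_nfp} (not necessarily JUBE's exact definitions), the function analyse is hypothetical, and the sample text is hand-written in the layout of the GROMACS timing summary.

```python
import re

# Simplified stand-ins for JUBE's predefined patterns (assumptions):
# a capturing and a non-capturing floating-point number.
PAT_FP = r"([+-]?\d*\.?\d+(?:[eE][-+]?\d+)?)"
PAT_NFP = r"(?:[+-]?\d*\.?\d+(?:[eE][-+]?\d+)?)"

patterns = {
    "gromacs_core_time": r"Time:\s*" + PAT_FP,
    "gromacs_wall_time": r"Time:\s*" + PAT_NFP + r"\s*" + PAT_FP,
}

# Hand-written sample in the layout of the GROMACS timing summary
sample_stderr = """\
               Core t (s)   Wall t (s)        (%)
       Time:       46.942       23.471      200.0
"""

def analyse(text, patterns):
    """Apply every pattern to every line and keep the captured value."""
    results = {}
    for line in text.splitlines():
        for name, regex in patterns.items():
            match = re.search(regex, line)
            if match:
                results[name] = float(match.group(1))
    return results

values = analyse(sample_stderr, patterns)
print(values)
```

The non-capturing variant is what lets the wall-time pattern skip the first number after Time: and capture only the second.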

Finally, result tables can be defined with columns referencing any defined parameter or pattern.

    <result>
        <use>gromacs_analyser</use>
        <table name="gromacs_run" style="pretty">
            <column title="wp">jube_wp_id</column>
            <column>tasks</column>
            <column>gromacs_core_time</column>
            <column>gromacs_wall_time</column>
            <column>gromacs_core_perf</column>
            <column>gromacs_wall_perf</column>
        </table>
    </result>

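Analysing the run and printing the result table produces the following output. The row for 66 tasks is empty because none of the patterns matched in the output of that workpackage. The two performance columns repeat the values of the time columns, which shows that the performance patterns in this run matched the Time: line of the GROMACS output; patterns anchored on the Performance: line report ns/day and hours/ns instead.

$ jube result -a bench_run --id <jube_run_id>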
gromacs_run:
| wp | tasks | gromacs_core_time[s] | gromacs_wall_time[s] | gromacs_core_perf[ns/day] | gromacs_wall_perf[hours/ns] |
|----|-------|----------------------|----------------------|---------------------------|-----------------------------|
|  0 |     1 |               44.366 |               44.366 |                    44.366 |                      44.366 |
|  1 |     2 |               46.942 |               23.471 |                    46.942 |                      23.471 |
|  2 |     4 |               49.548 |               12.387 |                    49.548 |                      12.387 |
|  3 |     8 |               52.969 |                6.621 |                    52.969 |                       6.621 |
|  4 |    12 |               59.370 |                4.948 |                    59.370 |                       4.948 |
|  5 |    18 |               66.097 |                3.672 |                    66.097 |                       3.672 |
|  6 |    24 |               76.391 |                3.183 |                    76.391 |                       3.183 |
|  7 |    30 |               89.233 |                2.975 |                    89.233 |                       2.975 |
|  8 |    36 |               91.187 |                2.533 |                    91.187 |                       2.533 |
|  9 |    42 |               99.743 |                2.375 |                    99.743 |                       2.375 |
| 10 |    48 |              183.114 |                3.815 |                   183.114 |                       3.815 |
| 11 |    54 |              121.728 |                2.255 |                   121.728 |                       2.255 |
| 12 |    60 |              199.882 |                3.332 |                   199.882 |                       3.332 |
| 13 |    66 |                      |                      |                           |                             |
| 14 |    72 |              116.555 |                1.619 |                   116.555 |                       1.619 |
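
The wall times in this table are enough for a first look at the scaling behaviour. As a rough sketch, assuming the 1-task run as the baseline, speedup and parallel efficiency can be computed as follows; the function name scaling is hypothetical and the values are copied from the gromacs_wall_time column.

```python
# Wall times in seconds copied from the gromacs_wall_time column above;
# the run with 66 tasks produced no value and is therefore left out.
wall_time = {
    1: 44.366, 2: 23.471, 4: 12.387, 8: 6.621, 12: 4.948,
    18: 3.672, 24: 3.183, 30: 2.975, 36: 2.533, 42: 2.375,
    48: 3.815, 54: 2.255, 60: 3.332, 72: 1.619,
}

def scaling(wall_time):
    """Return {tasks: (speedup, efficiency)} relative to the 1-task run."""
    t1 = wall_time[1]
    return {n: (t1 / t, t1 / t / n) for n, t in sorted(wall_time.items())}

for n, (speedup, efficiency) in scaling(wall_time).items():
    print(f"{n:3d} tasks: speedup {speedup:6.2f}, efficiency {efficiency:5.2f}")
```

Runs whose wall time is higher than that of a smaller task count (here 48 and 60 tasks) show up as a drop in speedup and are worth repeating before drawing conclusions.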


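Further information

* Reproducible HPC Workflows using JUBE (https://mahermanns.github.io/jube-novice/index.html)
* JUBE Documentation (https://apps.fz-juelich.de/jsc/jube/jube2/docu/index.html)
* Video: Automated Benchmarking with JUBE (https://www.youtube.com/watch?v=CTexZWKhF0I)
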
Next: Plotting and Interpreting Results

Previous: Automated Benchmarking using a Job Script