Difference between revisions of "Performance Engineering"

From HPC Wiki

Revision as of 09:48, 16 January 2019

Introduction

HPC is driven by high application performance requirements, and there are many options for improving the performance of an application code. In the following, it is assumed that a given algorithm is executed on a given HPC system.

The following factors influence the performance:

  • Implementation of the algorithm (programming language, optimisation techniques)
  • Compiler used and compiler options
  • Machine and operating system configuration
  • Runtime setup (pinning and resource allocation)

Generic iterative procedure for performance engineering

The following steps are required for a minimum performance engineering process:

  • Define a relevant test case which reflects production behavior
  • Acquire a runtime profile to determine in which parts of the code the processing time is spent
  • For all code parts (hot spots) of the runtime profile perform:
    • Static code analysis
    • Instrumentation based hardware performance counter profiling
    • Application benchmarking (thread and data set scaling)
  • Based on the data acquired by the above activities, narrow down performance issues
  • Improve performance by changing runtime setup or implementation

Those steps need to be repeated multiple times until the required or a good enough performance is reached. After an optimisation, steps must be taken to ensure that the optimised variants are actually used and take effect in regular production.
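As an illustration of the runtime profile step, the following is a minimal sketch that assumes a Python code and the standard library profiler cProfile. The functions application, cheap_setup and inner_kernel are hypothetical stand-ins for a real test case; compiled HPC codes would use other profiling tools, but the idea of ranking code parts by the time spent in them is the same.

```python
import cProfile
import pstats


def inner_kernel(n):
    """Hypothetical hot spot: a deliberately expensive loop."""
    total = 0.0
    for i in range(n):
        total += (i % 7) * 0.5
    return total


def cheap_setup(n):
    """Hypothetical inexpensive part of the application."""
    return list(range(n // 100))


def application(n):
    """Hypothetical test case standing in for a production run."""
    cheap_setup(n)
    return inner_kernel(n)


def hot_spots(func, *args, top=3):
    """Profile func(*args) and return the `top` functions ranked by own time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) to
    # (primitive calls, calls, own time, cumulative time, callers).
    ranked = sorted(stats.stats.items(), key=lambda kv: kv[1][2], reverse=True)
    return [(key[2], value[2]) for key, value in ranked[:top]]


if __name__ == "__main__":
    for name, own_time in hot_spots(application, 200_000):
        print(f"{name:40s} {own_time:.6f} s")
```

The function at the top of the ranking is the first candidate for the static analysis, counter profiling and benchmarking steps listed above.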

To carry out the above procedure, several special skills beyond standard software engineering are required:

  • Perform application benchmarking
  • Create a runtime profile
  • Create a performance profile

Those skills are documented in separate articles and will be assumed in the following.
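To give a rough idea of what application benchmarking with data set scaling involves, here is a minimal sketch in Python. The kernel function and the chosen sizes are hypothetical; only data set scaling is shown, and thread scaling would additionally vary the number of threads or processes.

```python
import time


def kernel(data):
    """Hypothetical kernel under test: a simple sum of squares."""
    return sum(x * x for x in data)


def benchmark(func, sizes, repeats=3):
    """Return {size: best wall-clock time in seconds} for each data set size."""
    results = {}
    for n in sizes:
        data = [float(i) for i in range(n)]
        best = float("inf")
        # Repeat the measurement and keep the minimum to reduce noise.
        for _ in range(repeats):
            start = time.perf_counter()
            func(data)
            best = min(best, time.perf_counter() - start)
        results[n] = best
    return results


if __name__ == "__main__":
    for n, seconds in benchmark(kernel, [1_000, 10_000, 100_000]).items():
        print(f"n={n:>7d}  time={seconds:.6f} s  per element={seconds / n:.3e} s")
```

Plotting the time per element against the data set size is one way to see where the behaviour of the code changes, for example once the working set no longer fits into a cache level.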

Strategies for performance analysis

After a benchmark case has been defined and application benchmarking and performance profiling have been carried out, interpreting and analysing the results is the first difficult task in any performance engineering effort. While there is no silver bullet for performance analysis, several strategies provide guidelines for different levels of expertise. Note that in complicated cases the software developer carrying out the process needs a certain level of experience to succeed. It is therefore recommended to consult an experienced HPC consultant at the local HPC center if the simpler approaches yield no progress.

Three approaches are described in more detail:

  • Threshold based performance analysis process, based on the proven EU COE POP project approach, for a rough initial performance analysis that is also suited to beginners
  • Performance pattern based process for more complicated cases, targeted at experienced software developers
  • An instruction count based approach, applicable to the special case of instruction based codes on SIMD architectures