Difference between revisions of "Performance Engineering"

From HPC Wiki

Revision as of 09:50, 16 January 2019

Introduction

HPC is driven by demanding application performance requirements, and there are many ways to improve the performance of an application code. The following assumes that a given algorithm is executed on a given HPC system.

The following factors influence the performance:

  • Implementation of the algorithm (programming language, optimisation techniques)
  • Compiler used and compiler options
  • Machine and operating system configuration
  • Runtime setup (pinning and resource allocation)

Generic iterative procedure for performance engineering

A minimal performance engineering process requires the following steps:

  • Define a relevant test case which reflects production behavior
  • Acquire a runtime profile to determine in which parts of the code the processing time is spent
  • For each hot spot identified in the runtime profile, perform:
    • Static code analysis
    • Instrumentation-based hardware performance counter profiling
    • Application benchmarking (thread and data set scaling)
  • Based on the data acquired by the above activities, narrow down the performance issues
  • Improve performance by changing the runtime setup or the implementation

These steps are repeated until the required, or a good enough, performance is reached. After an optimisation, it must be verified that the optimised variant is actually used and takes effect in regular production.

Carrying out the above procedure requires several skills beyond standard software engineering:

  • Perform application benchmarking
  • Create a runtime profile
  • Create a performance profile

These skills are documented in separate articles and are assumed in the following.
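As a rough illustration of what data set scaling in application benchmarking means, the following sketch times a simple kernel for increasing problem sizes and reports the best of several repetitions. The kernel triad and the helper names are hypothetical, and a real HPC benchmark would use a compiled kernel and also vary the thread count, which this sketch does not attempt.

```python
import time


def triad(b, c, s):
    """Hypothetical benchmark kernel: a[i] = b[i] + s * c[i]."""
    return [bi + s * ci for bi, ci in zip(b, c)]


def best_time(func, repeats=5):
    """Run func several times and return the shortest wall-clock time.

    Taking the minimum reduces the influence of system noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def data_set_scaling(sizes, repeats=5):
    """Return (size, seconds, elements per second) for each data set size."""
    results = []
    for n in sizes:
        b = [1.0] * n
        c = [2.0] * n
        seconds = best_time(lambda: triad(b, c, 3.0), repeats)
        results.append((n, seconds, n / seconds))
    return results


if __name__ == "__main__":
    for n, seconds, rate in data_set_scaling([10_000, 100_000, 1_000_000]):
        print(f"n={n:>9d}  time={seconds:.6f} s  rate={rate:.3e} elements/s")
```

Plotting the rate over the data set size typically reveals at which sizes the working set no longer fits into a given cache level.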

Strategies for performance analysis

After defining a benchmark case and carrying out application benchmarking and performance profiling, interpreting and analysing the results is the first difficult task in any performance engineering effort. While there is no silver bullet for performance analysis, several strategies provide guidelines for different levels of expertise. Note that in complicated cases the software developer carrying out the process needs a certain level of experience to succeed. It is therefore recommended to consult an experienced HPC consultant at the local HPC center if the simpler approaches yield no progress.

Three approaches are described in more detail:

  • Threshold-based performance analysis process, based on the proven EU CoE POP project approach, for a rough initial performance analysis; also suited for beginners
  • Performance-pattern-based process for more complicated cases, targeted at experienced software developers
  • An instruction-count-based approach, applicable to the special case of instruction-based codes on SIMD architectures
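To give an impression of the threshold-based approach, the following is a minimal sketch that computes three efficiency metrics from per-process useful computation times and flags those below a threshold. It assumes the commonly cited POP definitions (load balance as average over maximum useful time, communication efficiency as maximum useful time over total runtime, parallel efficiency as their product). The function names, the input numbers and the 80% threshold are assumptions made for illustration and are not taken from this article.

```python
def pop_style_metrics(useful_times, runtime):
    """Compute efficiency metrics from per-process useful computation
    times (seconds) and the total runtime (seconds).

    Assumed definitions: see the text above; not an official implementation."""
    average = sum(useful_times) / len(useful_times)
    maximum = max(useful_times)
    load_balance = average / maximum
    communication_efficiency = maximum / runtime
    return {
        "load_balance": load_balance,
        "communication_efficiency": communication_efficiency,
        "parallel_efficiency": load_balance * communication_efficiency,
    }


def below_threshold(metrics, threshold=0.8):
    """Return the names of all metrics below the threshold, sorted.

    The default of 0.8 is an assumed value for this sketch."""
    return sorted(name for name, value in metrics.items() if value < threshold)


if __name__ == "__main__":
    # Hypothetical measurement: four processes, 9 s total runtime.
    metrics = pop_style_metrics([8.0, 6.0, 4.0, 6.0], runtime=9.0)
    for name, value in metrics.items():
        print(f"{name:26s} {value:.2f}")
    print("needs attention:", below_threshold(metrics))
```

With these made-up numbers the load balance falls below the threshold while the communication efficiency does not, which would point the analysis towards the work distribution first.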