Relevant test case

[[Category:HPC-User]]
[[Category:HPC-Developer]]

A relevant test case is a combination of data set, application and parameters which reflects production behaviour, or at least allows assumptions (to be proven) about the real operating point. The definition of a relevant test case is essential for performance engineering, both when developing an application and when using an (even black-boxed) application efficiently. Typically this requires (at least basic) knowledge of the algorithms used in the application, a prediction of the required size of the production jobs, and of course of which features of the software will be used.
  
In a first approximation, a real production test case (or a set of them) is of course a relevant test case for itself. However, such test cases are typically large, long-running and unwieldy or even impossible to analyse, so a ''reduced'' relevant test case is needed. Typical ways to obtain a reduced test case are to take a real data set and shrink it (e.g. lower the grid resolution), to crop the execution after a handful of iterations (before convergence is reached), or to omit computation parts with known behaviour.
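As an illustration, the following sketch shows how such reduction knobs could look for a hypothetical Jacobi-style solver (all names and sizes here are made up): the grid resolution and the iteration cap are plain parameters, so the reduced case exercises exactly the same code path as a full production run.

<syntaxhighlight lang="python">
import numpy as np

def jacobi(n, max_iters, tol=1e-8):
    """Toy 2D Jacobi solver; stands in for a real production kernel."""
    grid = np.zeros((n, n))
    grid[0, :] = 1.0                      # simple fixed boundary condition
    for it in range(max_iters):
        new = grid.copy()
        new[1:-1, 1:-1] = 0.25 * (grid[:-2, 1:-1] + grid[2:, 1:-1] +
                                  grid[1:-1, :-2] + grid[1:-1, 2:])
        if np.max(np.abs(new - grid)) < tol:
            return new, it                # converged early
        grid = new
    return grid, max_iters

# Full production-like configuration: large grid, run until convergence.
# result, iters = jacobi(n=4096, max_iters=100_000)

# Reduced relevant test case: same code path, smaller grid,
# cropped after a handful of iterations (well before convergence).
result, iters = jacobi(n=256, max_iters=50)
print("reduced case stopped after", iters, "iterations")
</syntaxhighlight>

Cropping the iteration count preserves the per-iteration behaviour (kernel mix, memory access pattern) while keeping the run short enough to profile; only convergence-related effects are lost.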
  
 
* The same software path as in production must be used. Needless to say, if production uses compute kernel A, a data set that only exercises kernel B is ''not'' relevant.
* The hotspots of the production runs must also be hotspots in the reduced relevant test case (rule of thumb: about half of the overall execution time should be spent in hotspots; see the profiling sketch below the list). This in turn leads to the rule: do not downsize too much.
* Is the computation still numerically stable with the reduced data set?
* Know how the runtime of your application scales with the data set size: linearly, logarithmically, polynomially (O(n²), O(n³)?), or exponentially? (In the latter case do not worry about scalability: you will not be able to compute interesting data sets until the end of the world anyway.) A timing sketch for estimating the scaling exponent is given below the list.
* If you achieved any results (e.g. a scalability improvement) on a reduced test case, confirm them with at least selective tests on a full-size test case.
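For the hotspot rule above, a minimal sketch of how the check could look, using Python's standard cProfile module; the kernel and setup functions are purely hypothetical stand-ins for a real application:

<syntaxhighlight lang="python">
import cProfile
import pstats

def compute_kernel(n):
    """Stand-in for the production hotspot (hypothetical)."""
    s = 0.0
    for i in range(n):
        s += (i % 7) * 0.5
    return s

def setup_phase(n):
    """Stand-in for I/O and setup work, which should stay cheap."""
    return sum(float(i) for i in range(n))

def run_reduced_case():
    return setup_phase(200_000) + compute_kernel(2_000_000)

profiler = cProfile.Profile()
profiler.enable()
run_reduced_case()
profiler.disable()

# The functions at the top of this listing should be the same hotspots as in
# a production run and should account for roughly half of the total time;
# otherwise the test case has probably been downsized too much.
pstats.Stats(profiler).sort_stats("tottime").print_stats(5)
</syntaxhighlight>

For compiled HPC codes one would use the profiler of choice on the cluster instead; the point is only to compare the time fraction of the hotspots between the reduced and the full-size case.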

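For the scalability question above, a rough estimate of the scaling exponent can be obtained by timing a few problem sizes and fitting a line in log-log space. The sketch below uses an illustrative dense matrix multiplication as stand-in workload and assumes NumPy is available:

<syntaxhighlight lang="python">
import time
import numpy as np

def kernel(n):
    """Illustrative workload: dense matrix multiply, roughly O(n^3) in n."""
    a = np.random.rand(n, n)
    return a @ a

sizes = [128, 256, 512, 1024]
times = []
for n in sizes:
    t0 = time.perf_counter()
    kernel(n)
    times.append(time.perf_counter() - t0)

# Fit log(t) = p * log(n) + c; the slope p approximates the scaling exponent.
p, c = np.polyfit(np.log(sizes), np.log(times), 1)
print(f"estimated scaling exponent: {p:.2f}")
</syntaxhighlight>

The fitted exponent is only a first orientation, but it tells you whether doubling the problem size roughly doubles, quadruples or explodes the runtime, and therefore how far results from the reduced case can be extrapolated.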