Benchmarking & Scaling Tutorial/Introduction
Scalability
Users who start running applications on an HPC system often assume that the more resources (compute nodes / cores) they use, the faster their code will run, i.e. they expect linear scaling. Unfortunately, this is not the case for most applications. How a program's runtime changes with the amount of resources is referred to as scalability. For parallel programs, a limiting factor is given by [Amdahl's law](https://en.wikipedia.org/wiki/Amdahl%27s_law). It takes into account the fact that only part of the work in your code can be done in parallel, so the speedup is ultimately limited by the sequential part of the program.
Speedup and Efficiency
We assume that the total execution time $T$ of a program is composed of
- $t_s$, a part of the code which can only run in serial
- $t_p$, a part of the code which can be parallelized
- $t_c$, parallel overheads due to, e.g., communication
The execution time of a serial code would then be
$$T(1) = t_s + t_p$$
The time for a parallel code, where the work is perfectly divided among $N$ processors, would be given by
$$T(N) = t_s + \frac{t_p}{N} + t_c$$
Here $t_p/N$ is the time spent in the parallelizable part, which shrinks as more CPUs are used. The total **speedup** $S$ is defined as the ratio of the sequential to the parallel runtime:
$$S(N) = \frac{T(1)}{T(N)} = \frac{t_s + t_p}{t_s + \frac{t_p}{N} + t_c}$$
The **efficiency** $E$ is the speedup per processor, i.e.
$$E(N) = \frac{S(N)}{N}$$
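The definitions above can be sketched in a few lines of Python. This is only an illustrative model: the function and parameter names are made up for this example, and the overhead is assumed to be a constant, which real communication costs rarely are.

```python
def runtime(t_serial, t_parallel, n_procs, t_overhead=0.0):
    """Model runtime on n_procs processors:
    serial part + evenly divided parallel part + parallel overhead."""
    return t_serial + t_parallel / n_procs + t_overhead


def speedup(t_serial, t_parallel, n_procs, t_overhead=0.0):
    """Ratio of the serial runtime to the parallel runtime."""
    t_one = t_serial + t_parallel  # a serial run has no parallel overhead
    return t_one / runtime(t_serial, t_parallel, n_procs, t_overhead)


def efficiency(t_serial, t_parallel, n_procs, t_overhead=0.0):
    """Speedup per processor."""
    return speedup(t_serial, t_parallel, n_procs, t_overhead) / n_procs


if __name__ == "__main__":
    # Hypothetical timings in seconds: 10 s serial, 90 s parallelizable,
    # 1 s overhead, run on 10 processors.
    print("speedup:   ", speedup(10.0, 90.0, 10, t_overhead=1.0))
    print("efficiency:", efficiency(10.0, 90.0, 10, t_overhead=1.0))
```

With these made-up timings the parallel run takes 20 s instead of 100 s, so ten processors yield a speedup of only 5 and an efficiency of 0.5.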
Amdahl's Law
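Knowing that the parallel overhead is non-negative, $t_c \geq 0$, and writing $s = t_s / (t_s + t_p)$ for the serial fraction of the code, we can rewrite the speedup as
$$S(N) \leq \frac{1}{s + \frac{1 - s}{N}} \leq \frac{1}{s}$$
This places an upper limit on the **strong scalability**, i.e. how quickly we can solve a problem of fixed size by increasing the number of processors $N$. It is known as *Amdahl's Law*.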
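To get a feeling for this limit, the following Python sketch evaluates the Amdahl bound for a range of processor counts. The function name and the serial fraction of 5% are assumptions chosen purely for illustration, and overheads are ignored, so measured speedups will typically be lower.

```python
def amdahl_speedup(serial_fraction, n_procs):
    """Upper bound on the speedup for a given serial fraction (0..1)
    and processor count, neglecting all parallel overheads."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie between 0 and 1")
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)


if __name__ == "__main__":
    s = 0.05  # hypothetical: 5% of the runtime cannot be parallelized
    for n in (1, 2, 4, 8, 16, 64, 256, 1024):
        bound = amdahl_speedup(s, n)
        print(f"N = {n:5d}  speedup <= {bound:6.2f}  efficiency <= {bound / n:5.2f}")
    print(f"limit for N -> infinity: {1.0 / s:.1f}")
```

Even with only 5% serial code, the speedup can never exceed 20, no matter how many processors are used, and the efficiency drops steadily as processors are added.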