ThreadSanitizer

From HPC Wiki
Revision as of 11:06, 4 July 2024 by Joachim-jenke-0d01@rwth-aachen.de (talk | contribs) (Created page with "ThreadSanitizer is a Sanitizer developed in the LLVM project that detects data races and other threading errors. It consists of a compiler instrumentation module and a run-tim...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

ThreadSanitizer is a Sanitizer developed in the LLVM project that detects data races and other threading errors. It consists of a compiler instrumentation module and a run-time library. Typical slowdown introduced by ThreadSanitizer is about 5x-15x. Typical memory overhead introduced by ThreadSanitizer is about 2x-5x. ThreadSanitizer is available for C/C++ codes with clang/clang++ and for C, C++ and Fortran codes with icx/icpx/ifx

Detecting Data Race Bugs in C/C++ Programs with ThreadSanitizer

A data race in C and C++ occurs when two or more threads access the same memory without synchronization and at least one of the threads writes to the memory. If a data race can occur in a program execution, the program has **undefined behavior**. ThreadSanitizer detects data races in the execution of multi-threaded applications.

The following C++ program has a data race for unsynchronized writing to a in line 5:

#include <thread>
#include <iostream>

void foo(int *a, int *b) {
  *a += *b + 1;
}

int main() {
  int a = 2, b = 3;
  std::thread first(foo, &a, &b);
  foo(&a, &b);
  first.join();
  printf("%i, %i\n", a, b);
  return 0;
}


To detect the data race, compile the code with clang using ThreadSanitizer:

$ clang++ -fsanitize=thread -g -fno-omit-frame-pointer example.cc

Or with the Intel® oneAPI DPC++/C++ compiler:

$ icpx -fsanitize=thread -g -fno-omit-frame-pointer example.cc

During execution of the resulting application ThreadSanitizer reports a data race between two writing accesses in line 5 and column 8 which is the write two `a`:

 $ ./a.out
 ==================
 WARNING: ThreadSanitizer: data race (pid=36417)
  Write of size 4 at 0x7fffffffd868 by main thread:
    #0 foo(int*, int*) example.cc:5:8 (a.out+0xe0b30) (BuildId: 8c833f4f068264a91d1da2f1b2623e500f258605)
    #1 main example.cc:11:3 (a.out+0xe0be6) (BuildId: 8c833f4f068264a91d1da2f1b2623e500f258605)
 
  Previous write of size 4 at 0x7fffffffd868 by thread T1:
    #0 foo(int*, int*) example.cc:5:8 (a.out+0xe0b30) (BuildId: 8c833f4f068264a91d1da2f1b2623e500f258605)
    #1 void std::__invoke_impl<void, void (*)(int*, int*), int*, int*>(std::__invoke_other, void (*&&)(int*, int*), int*&&, int*&&) /usr/include/c++/13/bits/invoke.h:61:14 (a.out+0xe1842) (BuildId: 8c833f4f068264a91d1da2f1b2623e500f258605)
    #2 std::__invoke_result<void (*)(int*, int*), int*, int*>::type std::__invoke<void (*)(int*, int*), int*, int*>(void (*&&)(int*, int*), int*&&, int*&&) /usr/include/c++/13/bits/invoke.h:96:14 (a.out+0xe16d5) (BuildId: 8c833f4f068264a91d1da2f1b2623e500f258605)
    #3 void std::thread::_Invoker<std::tuple<void (*)(int*, int*), int*, int*>>::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) /usr/include/c++/13/bits/std_thread.h:292:13 (a.out+0xe1663) (BuildId: 8c833f4f068264a91d1da2f1b2623e500f258605)
    #4 std::thread::_Invoker<std::tuple<void (*)(int*, int*), int*, int*>>::operator()() /usr/include/c++/13/bits/std_thread.h:299:11 (a.out+0xe15e5) (BuildId: 8c833f4f068264a91d1da2f1b2623e500f258605)
    #5 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(int*, int*), int*, int*>>>::_M_run() /usr/include/c++/13/bits/std_thread.h:244:13 (a.out+0xe1229) (BuildId: 8c833f4f068264a91d1da2f1b2623e500f258605)
    #6 execute_native_thread_routine src/libstdc++-v3/src/c++11/thread.cc:104:18 (libstdc++.so.6+0xeabb3) (BuildId: 40b9b0d17fdeebfb57331304da2b7f85e1396ef2)
 
  Location is stack of main thread.
 
  Location is global '??' at 0x7ffffffdd000 ([stack]+0x20868)
 
  Thread T1 (tid=36419, finished) created by main thread at:
    #0 pthread_create <null> (a.out+0x6063f) (BuildId: 8c833f4f068264a91d1da2f1b2623e500f258605)
    #1 __gthread_create /build/gcc-14-OQFzmN/gcc-14-14-20240412/build/x86_64-linux-gnu/libstdc++-v3/include/x86_64-linux-gnu/bits/gthr-default.h:676:35 (libstdc++.so.6+0xeac88) (BuildId: 40b9b0d17fdeebfb57331304da2b7f85e1396ef2)
    #2 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State>>, void (*)()) src/libstdc++-v3/src/c++11/thread.cc:172:37 (libstdc++.so.6+0xeac88)
    #3 main example.cc:10:15 (a.out+0xe0bd4) (BuildId: 8c833f4f068264a91d1da2f1b2623e500f258605)
 
 SUMMARY: ThreadSanitizer: data race example.cc:5:8 in foo(int*, int*)
 ==================
 10, 3
 ThreadSanitizer: reported 1 warnings


Detecting Data Race Bugs in OpenMP Programs with Archer

For precise data race detection in OpenMP applications, the additional Archer library is necessary. The library is distributed with LLVM since version 10 and with the Intel® oneAPI DPC++/C++ Compiler since version 2024.0.

The following OpenMP program has a data race for unsynchronized reading i and j:

#include <stdio.h>
#include <stdlib.h>

int fib(int n) {
  int i, j;
  if (n < 2) {
    return n;
  } else {
#pragma omp task shared(i) if (n - 1 > 2)
    { i = fib(n - 1); }
#pragma omp task shared(j) if (n - 2 > 2)
    { j = fib(n - 2); }
    int ret = i + j;
#pragma omp taskwait
    return ret;
  }
}

int main(int argc, char **argv) {
  int n = 5;
  if (argc > 1)
    n = atoi(argv[1]);
#pragma omp parallel sections
  { printf("fib(%i) = %i\n", n, fib(n)); }
  return 0;
}


To detect the data race, compile the code with clang using ThreadSanitizer:

$ clang -fopenmp -fsanitize=thread -g -fno-omit-frame-pointer omp-fib-race.c

Or with latest Intel oneAPI DPC++/C++ compiler:

$ icx -qopenmp -fsanitize=thread -g -fno-omit-frame-pointer omp-fib-race.c

During execution of the resulting application ThreadSanitizer reports two data races for `i` and `j`:

$ export TSAN_OPTIONS='ignore_noninstrumented_modules=1'
$ OMP_NUM_THREADS=4 ./a.out 
==================
WARNING: ThreadSanitizer: data race (pid=42433)
  Write of size 4 at 0x7fffffffd210 by thread T1:
    #0 .omp_outlined..1 omp-fib-race.c:12:7 (a.out+0xdf12b) (BuildId: 018c00fd8abe0d714ed83289d445e3a851e16112)
    #1 .omp_task_entry..3 omp-fib-race.c:11:1 (a.out+0xdf12b)
    #2 __kmp_invoke_task openmp/runtime/src/kmp_tasking.cpp:1913:9 (libomp.so.5+0x68c82) (BuildId: 4a9c6f4352f802a2f83811673d52fd888dd18b5c)

  Previous read of size 4 at 0x7fffffffd210 by main thread:
    #0 fib omp-fib-race.c:13:19 (a.out+0xdee4f) (BuildId: 018c00fd8abe0d714ed83289d445e3a851e16112)
    #1 main.omp_outlined_debug__ omp-fib-race.c:24:33 (a.out+0xdf343) (BuildId: 018c00fd8abe0d714ed83289d445e3a851e16112)
    #2 main.omp_outlined omp-fib-race.c:23:1 (a.out+0xdf3d5) (BuildId: 018c00fd8abe0d714ed83289d445e3a851e16112)
    #3 __kmp_invoke_microtask openmp/runtime/src/z_Linux_asm.S:1198 (libomp.so.5+0xe1722) (BuildId: 4a9c6f4352f802a2f83811673d52fd888dd18b5c)
    #4 main omp-fib-race.c:23:1 (a.out+0xdf1e4) (BuildId: 018c00fd8abe0d714ed83289d445e3a851e16112)

  Location is stack of main thread.

  Location is global '??' at 0x7ffffffdd000 ([stack]+0x20210)

  Thread T1 (tid=42435, running) created by main thread at:
    #0 pthread_create <null> (a.out+0x6067f) (BuildId: 018c00fd8abe0d714ed83289d445e3a851e16112)
    #1 __kmp_create_worker openmp/runtime/src/z_Linux_util.cpp:833:7 (libomp.so.5+0xb9546) (BuildId: 4a9c6f4352f802a2f83811673d52fd888dd18b5c)

SUMMARY: ThreadSanitizer: data race omp-fib-race.c:12:7 in .omp_outlined..1
==================
==================
WARNING: ThreadSanitizer: data race (pid=42433)
  Write of size 4 at 0x7ffff27ff4c4 by thread T2:
    #0 .omp_outlined. omp-fib-race.c:10:7 (a.out+0xdefcb) (BuildId: 018c00fd8abe0d714ed83289d445e3a851e16112)
    #1 .omp_task_entry. omp-fib-race.c:9:1 (a.out+0xdefcb)
    #2 __kmp_invoke_task openmp/runtime/src/kmp_tasking.cpp:1913:9 (libomp.so.5+0x68c82) (BuildId: 4a9c6f4352f802a2f83811673d52fd888dd18b5c)

  Previous read of size 4 at 0x7ffff27ff4c4 by thread T3:
    #0 fib omp-fib-race.c:13:15 (a.out+0xdee40) (BuildId: 018c00fd8abe0d714ed83289d445e3a851e16112)
    #1 .omp_outlined. omp-fib-race.c:10:9 (a.out+0xdefaf) (BuildId: 018c00fd8abe0d714ed83289d445e3a851e16112)
    #2 .omp_task_entry. omp-fib-race.c:9:1 (a.out+0xdefaf)
    #3 __kmp_invoke_task openmp/runtime/src/kmp_tasking.cpp:1913:9 (libomp.so.5+0x68c82) (BuildId: 4a9c6f4352f802a2f83811673d52fd888dd18b5c)

  Location is stack of thread T3. 

  Thread T2 (tid=42436, running) created by main thread at:
    #0 pthread_create <null> (a.out+0x6067f) (BuildId: 018c00fd8abe0d714ed83289d445e3a851e16112)
    #1 __kmp_create_worker openmp/runtime/src/z_Linux_util.cpp:833:7 (libomp.so.5+0xb9546) (BuildId: 4a9c6f4352f802a2f83811673d52fd888dd18b5c)

  Thread T3 (tid=42437, running) created by main thread at:
    #0 pthread_create <null> (a.out+0x6067f) (BuildId: 018c00fd8abe0d714ed83289d445e3a851e16112)
    #1 __kmp_create_worker openmp/runtime/src/z_Linux_util.cpp:833:7 (libomp.so.5+0xb9546) (BuildId: 4a9c6f4352f802a2f83811673d52fd888dd18b5c)

SUMMARY: ThreadSanitizer: data race omp-fib-race.c:10:7 in .omp_outlined.
==================

On Ubuntu systems, libarcher cannot be found, so that setting another environmental variables is necessary (the exact path depends on the version of LLVM installed):

$ export OMP_TOOL_LIBRARIES=/usr/lib/llvm-18/lib/libarcher.so 

The Intel oneAPI DPC++/C++ does not automatically load libarcher, so export:

$ export OMP_TOOL_LIBRARIES=libarcher.so

To verify that Archer is loaded and active for a specific execution, additionally export:

$ export ARCHER_OPTIONS=verbose=1 

This lead to the following output during execution:

Archer detected OpenMP application with TSan, supplying OpenMP synchronization semantics