I'm not certain, but I think your _start and _stop macros may rely on wrap-around in signed two's-complement arithmetic. To be safe from weird optimisations based on undefined behaviour, I'd suggest adding -fwrapv to the compile options. In my opinion that should be the default, since that's how almost all other languages work.
Thanks for the comments, I will go with pthread for now, knowing Open MPI as a fallback if needed.
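On the wrap-around point at the top of this post: the _start and _stop macros themselves are not shown here, so the following is only a sketch of the general idea, with a hypothetical helper name. If a timer difference is computed on unsigned integers, wrap-around is defined behaviour in C++ and gives the right answer without needing -fwrapv.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, NOT the _start/_stop macros from the thread.
// Returns the number of ticks between two readings of a free-running
// 32-bit counter. Unsigned subtraction is defined modulo 2^32, so the
// result is still correct if the counter wrapped once between readings.
inline std::uint32_t elapsed_ticks(std::uint32_t start, std::uint32_t stop) {
    return stop - start;
}
```

The same subtraction on signed int would overflow when the counter wraps, which is undefined behaviour unless the code is built with -fwrapv.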
I created threaded initialization of a 20,000×20,000 matrix in small test code:
https://github.com/Hermann-SW/RR/blob/m ... /20K.D.cpp
If compiling without optimization, reported runtimes in microseconds show the desired speedups.
Code:
pi@raspberrypi5:~/RR/tsp/pthread $ g++ -Wall -Wextra -pedantic 20K.D.cpp
pi@raspberrypi5:~/RR/tsp/pthread $
I did this on a Pi 5, so the number of threads is between 1 and 4.
No explicit allocation of threads to a specific core yet.
Code:
pi@raspberrypi5:~/RR/tsp/pthread $ ./a.out 1
4860146us
pi@raspberrypi5:~/RR/tsp/pthread $ ./a.out 2
2450638us
pi@raspberrypi5:~/RR/tsp/pthread $ ./a.out 3
1642796us
pi@raspberrypi5:~/RR/tsp/pthread $ ./a.out 4
1244822us
pi@raspberrypi5:~/RR/tsp/pthread $ freq
min=cur=3000000=max
pi@raspberrypi5:~/RR/tsp/pthread $
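To put numbers on "desired speedups", here is a small sketch that reads the run output above as 1 to 4 threads taking 4860146us, 2450638us, 1642796us and 1244822us. That reading is an assumption, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Speedup of an n-thread run relative to the single-thread run.
double speedup(double t1_us, double tn_us) { return t1_us / tn_us; }

// Parallel efficiency: speedup divided by the number of threads.
double efficiency(double t1_us, double tn_us, int threads) {
    return speedup(t1_us, tn_us) / threads;
}

// Print speedup and efficiency for the four timings quoted in the post.
void print_speedups() {
    const std::vector<double> us = {4860146, 2450638, 1642796, 1244822};
    for (std::size_t i = 0; i < us.size(); ++i) {
        const int threads = static_cast<int>(i) + 1;
        std::printf("%d thread(s): speedup %.2f, efficiency %.2f\n",
                    threads, speedup(us[0], us[i]),
                    efficiency(us[0], us[i], threads));
    }
}
```

Calling print_speedups() from main prints one line per thread count. An efficiency close to 1 means the work scales almost linearly with the number of threads.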
P.S:
With a small earlier commit
https://github.com/Hermann-SW/RR/commit ... 6e60fc0791
the time for running "init_dist()" is now reported. That will allow verifying the intended speedups.
In scientific computing the usual practice is to initialise the threads at the start of the program, assign them to cores and never create any more threads later. For this reason, I would move the _start down in the code so the creation time for the threads is not included in the timings.
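A minimal sketch of that pattern, assuming Linux with glibc (pthread_setaffinity_np is a non-portable extension). The names Task, worker and timed_init are hypothetical, and the fill value i + j is a placeholder rather than what init_dist() computes. All threads are created and pinned first, then a barrier releases them, and the clock is read only after that release.

```cpp
#include <pthread.h>
#include <sched.h>
#include <unistd.h>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

struct Task {                 // hypothetical per-thread work description
    int id, nthreads, n;
    float* a;                 // n*n matrix, row-major
    pthread_barrier_t* go;    // released once every thread exists
    pthread_barrier_t* done;  // reached when a thread has finished its rows
};

static void* worker(void* p) {
    Task* t = static_cast<Task*>(p);
    pthread_barrier_wait(t->go);  // block until main and all workers are ready
    const long long lo = 1LL * t->id * t->n / t->nthreads;
    const long long hi = 1LL * (t->id + 1) * t->n / t->nthreads;
    for (long long i = lo; i < hi; ++i)
        for (int j = 0; j < t->n; ++j)
            t->a[static_cast<std::size_t>(i) * t->n + j] =
                static_cast<float>(i + j);  // placeholder value
    pthread_barrier_wait(t->done);
    return nullptr;
}

// Returns the microseconds spent in the work phase, excluding thread creation.
long long timed_init(std::vector<float>& a, int n, int nthreads) {
    pthread_barrier_t go, done;
    pthread_barrier_init(&go, nullptr, nthreads + 1);
    pthread_barrier_init(&done, nullptr, nthreads + 1);
    long ncpu = sysconf(_SC_NPROCESSORS_ONLN);
    if (ncpu < 1) ncpu = 1;
    std::vector<pthread_t> th(nthreads);
    std::vector<Task> tasks(nthreads);
    for (int k = 0; k < nthreads; ++k) {
        tasks[k] = {k, nthreads, n, a.data(), &go, &done};
        pthread_create(&th[k], nullptr, worker, &tasks[k]);
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(static_cast<int>(k % ncpu), &set);
        // Best effort: the return value is ignored, so a refused pin is harmless.
        pthread_setaffinity_np(th[k], sizeof set, &set);
    }
    pthread_barrier_wait(&go);  // returns once all workers have been created
    const auto t0 = std::chrono::steady_clock::now();  // the "_start" point
    pthread_barrier_wait(&done);
    const auto t1 = std::chrono::steady_clock::now();  // the "_stop" point
    for (pthread_t& t : th) pthread_join(t, nullptr);
    pthread_barrier_destroy(&go);
    pthread_barrier_destroy(&done);
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}
```

In a longer-running program the workers would loop and wait on the barrier again for the next piece of work instead of returning, so no further threads are created after start-up.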
To clarify what I meant to say earlier, OpenMP and Open MPI are two different things.
- OpenMP is a standard for multiprocessing built into the gcc and clang compilers and used to make a multi-threaded program.
- Open MPI is an implementation of the Message Passing Interface used for cluster computing. It is not built in, and there are other libraries for the same standard.
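To illustrate the OpenMP side of that distinction, here is a hedged sketch of a matrix initialisation where a single pragma asks the compiler to split the outer loop across threads. The function name and the fill value i + j are placeholders, not taken from the linked code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill an n x n row-major matrix. When built with -fopenmp, the pragma
// splits the outer loop across threads. Without -fopenmp the pragma is
// ignored and the loop runs serially with the same result.
void init_matrix(std::vector<float>& a, int n) {
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            a[static_cast<std::size_t>(i) * n + j] =
                static_cast<float>(i + j);  // placeholder value
}
```

With g++ this would be compiled with -fopenmp, and the thread count can be chosen at run time through the OMP_NUM_THREADS environment variable. An Open MPI program looks quite different: it is started as several separate processes, typically with mpirun, and those processes exchange data through explicit MPI calls.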
Statistics: Posted by ejolson — Tue Aug 19, 2025 11:26 pm