Channel: Raspberry Pi Forums

Graphics programming • Re: Performance of rotating video by 90/270 degree on Pi3

Thanks for the VPU insights!
There is one other trick you can use.
[...]
You should then find your code takes no longer than a vld/vst memcpy, as any other instructions are run while waiting for the loads to arrive.
I managed to implement a 64x64 transpose as four individual 32x32 transpose steps, each with its data preloaded one step ahead. By my crude measurements it transposes at around 350 MB/s. (Is there a better way to measure this than timing the executecode mailbox call from the outside?) I'm not sure whether that's good or not, but removing the transpose-related instructions and leaving only the ld/st now makes no difference. Nice. Thanks!
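For anyone following along, here is a rough plain-C sketch of the decomposition only: a 64x64 transpose done as four 32x32 tile transposes, where the tile at block position (by, bx) lands at (bx, by) in the output. This is not the VPU code from the post. The names `transpose_tile` and `transpose64` are made up for illustration, one byte per element is an assumption, and the one-step-ahead preloading is not modelled at all, since that depends on the VPU's load/store instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes: a 64x64 matrix of bytes, split into 32x32 tiles. */
enum { N = 64, TILE = 32 };

/* Transpose one TILE x TILE block.
 * src and dst point at the top-left element of the source and
 * destination tiles; both matrices have a row stride of N bytes. */
static void transpose_tile(const uint8_t *src, uint8_t *dst)
{
    for (size_t r = 0; r < TILE; r++)
        for (size_t c = 0; c < TILE; c++)
            dst[c * N + r] = src[r * N + c];
}

/* Transpose the full N x N matrix as four tile steps.
 * The tile whose top-left is at (row by, column bx) in src is
 * written transposed to (row bx, column by) in dst. */
static void transpose64(const uint8_t *src, uint8_t *dst)
{
    for (size_t by = 0; by < N; by += TILE)
        for (size_t bx = 0; bx < N; bx += TILE)
            transpose_tile(src + by * N + bx, dst + bx * N + by);
}
```

In a pipelined version, the load for the next tile would presumably be issued before the current tile's shuffles, which is the part that hides the transpose cost behind the memory traffic.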

Statistics: Posted by dividuum — Mon Feb 24, 2025 7:38 pm


