Friday, 22 April 2022
Zeta v099 - FP32 - dummy benchmarks
I implemented a dummy 10x12 mailbox move generator using floating-point (FP32) arithmetic with 32 parallel GPU threads. Unoptimized, it gets only a 2x speedup compared to the bitboard version with 64 GPU threads. That is too little, so it does not pay off for me to explore that branch any further.
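For anyone not familiar with the term: a 10x12 mailbox board embeds the 8x8 board in a 120-cell array with a border of sentinel cells, so an off-board step is caught by a single array lookup. Below is a minimal sketch in plain Python with integer indices. It is not the FP32 GPU code from Zeta, it only generates moves to empty squares (no captures, no other pieces' rules), and all names (`empty_board`, `square`, `knight_targets`, `rook_targets`) are made up for illustration.

```python
# 10x12 mailbox sketch: 120 cells, the 8x8 board sits inside a sentinel border.
OFF, EMPTY = -1, 0

def empty_board():
    """Return a 120-cell board: OFF on the border, EMPTY on the 64 real squares."""
    board = [OFF] * 120
    for rank in range(8):
        for file in range(8):
            board[21 + rank * 10 + file] = EMPTY
    return board

def square(name):
    """Map an algebraic name such as 'e4' to its mailbox index (a1 = 21, h8 = 98)."""
    file = ord(name[0]) - ord("a")
    rank = int(name[1]) - 1
    return 21 + rank * 10 + file

# Step offsets in mailbox indexing: one rank = 10 cells, one file = 1 cell.
KNIGHT_OFFSETS = (-21, -19, -12, -8, 8, 12, 19, 21)
ROOK_OFFSETS = (-10, -1, 1, 10)

def knight_targets(board, src):
    """Empty squares a knight on src can jump to; the border rejects off-board jumps."""
    return [src + d for d in KNIGHT_OFFSETS if board[src + d] == EMPTY]

def rook_targets(board, src):
    """Empty squares a rook on src can slide to; each ray stops at the first non-empty cell."""
    targets = []
    for d in ROOK_OFFSETS:
        dst = src + d
        while board[dst] == EMPTY:
            targets.append(dst)
            dst += d
    return targets

if __name__ == "__main__":
    board = empty_board()
    print(knight_targets(board, square("a1")))
    print(len(rook_targets(board, square("a1"))))
```

The two-cell border above and below the board is what makes the knight offsets safe: even from a corner square, `src + d` stays inside the 120-cell array.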
So I am stuck at ~100 Knps per worker with Zeta v099, with up to 320 workers on current GPU architectures.