I implemented ABDADA as parallel game tree search in an iterative way in Zeta, and it scales well, but I did it with some atomic operations on global memory what will cause less optimal NPS scaling with more workers, so here, yet another Lazy SMP derivate - RMO:
With <= 64 workers it can not compete with ABDADA, but with >= 128 there is not such a big difference, at least on my time to depth benchmarks.
RMO - basics:
- communication of TT move and TT scores between workers via Shared Hash Table
- all workers follow the same Iterative Deepening loop with same depth
- worker 0 performs a normal search with move ordering etc.
- workers >0 perform a normal search for oldest son first
- workers >0 randomize the move order up to QSearch for other siblings
- as soon as one worker finishes its search, all worker should terminate the current ID iteration and start the next one
RMO - exceptions for workers>0:
There are some cases you may not want to randomize the move order:
- TT move
- IID move
- Counter-Heuristics move
- Killer-Heuristics move
and some cases where you may want to perform a normal search:
- Null move
I have tested to randomize captures with an higher priority than quiet moves, but have no results that justify to do so.
In general you can also combine RMO with an classic parallel search, to let the first n workers search a classic way and the others via RMO. Maybe such an approach will make more sense with a workers count of > 128.