Zeta Chess

Zeta - v099 revisited

It works. Zeta v099 plays decent chess with a classic parallel AlphaBeta approach, and I am convinced that with some further work it could reach more than 3000 CCRL Elo on a high-end GPU.

But the obvious problem is that it lacks nps throughput per worker: the single-thread performance is too low, and even with a better parallel search there is not much to gain on massively parallel systems with more than 128 workers.

So to be able to beat the top 10 chess engines out there, the nps throughput per worker must be increased ten or twenty fold...

During early development I tried a design based on a LIFO-stack parallel search. It had the best nps throughput of all my designs, but I was not able to implement AlphaBeta pruning efficiently, so the speed gain was lost again during pruning.

If I had to start over, and make another Zeta version, I would try the LIFO-Stack based parallel search again...

Zeta v099m

Zeta v099m released as source and Linux/Windows 64 bit binary:

https://github.com/smatovic/Zeta/releases

Alternative downloads:

https://zeta-chess.app26.de/downloads/

Please consult the README file or the --help option before running the engine.

From the changelog:

Zeta (099m) alpha; urgency=medium

* patch for ABDADA parallel search
* disabled RMO parallel search
* removed max device memory limitation
* mods in time control
* cleanups
*
* Zeta 099m on Nvidia V100, 160 workers, ~ 13.5 Mnps
* Zeta 099m on Nvidia V100, 1 worker, ~ 85 Knps

-- Srdja Matovic 13 Jul 2019

Here are some nps and search scaling results...

################################################################################
# Zeta 099m, startposition, depth 12, best of 4 runs, Nvidia V100:
# tt1: 2048 MB, tt2: 1536 MB
#
### workers #nps          #nps speedup   #time in s   #ttd speedup   #relative ttd
### 1       86827         1.000000       156.586000   1.000000       1.000000 
### 2       180282        2.076336       55.749000    2.808768       2.808768 
### 4       356910        4.110588       35.564000    4.402936       1.567568 
### 8       704741        8.116611       19.637000    7.974029       1.811071 
### 16      1385758       15.959989      14.583000    10.737571      1.346568 
### 32      2786039       32.087242      11.124000    14.076411      1.310949 
### 64      5460849       62.893443      8.838000     17.717357      1.258656 
### 128     10235993      117.889516     7.377000     21.226244      1.198048 
### 160     11639290      134.051505     7.202000     21.742016      1.024299 
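
The speedup columns can be re-derived from the raw worker, nps and time values. Below is a minimal Python sketch, assuming that "nps speedup" is the nps relative to the 1-worker run, "ttd speedup" is the 1-worker time to depth divided by the n-worker time, and "relative ttd" is the time of the previous row divided by the time of the current row. These definitions are inferred from the numbers, not taken from the Zeta source, and the name `scaling_columns` is made up for illustration.

```python
# Raw measurements copied from the table above: (workers, nps, time in s).
RUNS = [
    (1, 86827, 156.586),
    (2, 180282, 55.749),
    (4, 356910, 35.564),
    (8, 704741, 19.637),
    (16, 1385758, 14.583),
    (32, 2786039, 11.124),
    (64, 5460849, 8.838),
    (128, 10235993, 7.377),
    (160, 11639290, 7.202),
]


def scaling_columns(runs):
    """Return (workers, nps_speedup, ttd_speedup, relative_ttd) per row.

    Assumed definitions (inferred from the numbers, not from the Zeta source):
      nps_speedup  = nps(n) / nps(1)
      ttd_speedup  = time(1) / time(n)
      relative_ttd = time(previous row) / time(n), 1.0 for the first row
    """
    _, base_nps, base_time = runs[0]
    rows = []
    prev_time = base_time
    for workers, nps, seconds in runs:
        rows.append(
            (workers, nps / base_nps, base_time / seconds, prev_time / seconds)
        )
        prev_time = seconds
    return rows


if __name__ == "__main__":
    for workers, nps_su, ttd_su, rel in scaling_columns(RUNS):
        print(f"{workers:<8}{nps_su:<15.6f}{ttd_su:<15.6f}{rel:.6f}")
```

Under these definitions the gap between the two speedup columns is easy to see: the nps column keeps growing almost linearly with the worker count, while the time-to-depth column flattens out.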

Zeta v099l

Zeta v099k did not scale well on Nvidia Pascal and Turing gpus, so I wrote a patch to fix this issue, and released Zeta v099l:

https://github.com/smatovic/Zeta/releases

On Pascal it now runs 4 workers per Compute Unit, and on Turing 2 workers per Compute Unit, during guessconfigx.

According to Nvidia papers, Turing should have 16-wide SIMD units, with four units per Compute Unit, but based on my tests I can only speculate that the integer units are 32 wide, not 16, with two of them per Compute Unit.

Benchmarks on other systems showed again that some Windows versions enforce a GPU timeout, so you may want to apply this registry update on your Windows machine:

https://zeta-chess.app26.de/downloads/SetWindowsGPUTimeoutTo20s.reg

Download, double-click, and reboot to increase the GPU timeout from 2 to 20 seconds.

If you want to run an SMP benchmark on your GPU, I suggest increasing the GPU timeout to 400 seconds:

https://zeta-chess.app26.de/downloads/SetWindowsGPUTimeoutTo400s.reg

Neural Networks on GPU

Currently there is much going on with neural networks for chess. With Giraffe, AlphaZero, and its open source adaptation LC0 (Leela Chess Zero), it was shown that, with enough horsepower, artificial neural networks are competitive in computer chess.

Currently LC0 uses an MCTS (Monte-Carlo Tree Search) approach, with the GPU as neural network accelerator for position evaluation.

My own experiments showed that AlphaBeta search is superior to MCTS, but current GPU architectures suffer from host-device latency, so you have to bundle tasks into batches that are executed in one run on the GPU, which does not fit well with the serial nature of AlphaBeta.
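
The batching trade-off can be illustrated with a simple cost model. Below is a minimal Python sketch, assuming that every GPU call pays a fixed host-device latency once and then a constant compute cost for each position in the batch. Both numbers in the example are hypothetical placeholders, not measurements, and `cost_per_position` is a made-up helper name.

```python
def cost_per_position(batch_size, latency_us, compute_us_per_position):
    """Average time per evaluated position when one GPU call handles a batch.

    Simplified model (an assumption): each call pays the host-device latency
    once, plus a constant compute cost for every position in the batch.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    total_us = latency_us + batch_size * compute_us_per_position
    return total_us / batch_size


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the shape of the curve.
    LATENCY_US = 10.0
    COMPUTE_US = 1.0
    for batch_size in (1, 8, 64, 512):
        avg = cost_per_position(batch_size, LATENCY_US, COMPUTE_US)
        print(f"batch {batch_size:>3}: {avg:.3f} us per position")
```

In this model a strictly serial search, which wants each evaluation before it picks the next node, corresponds to a batch size of 1, where the latency term dominates the cost per position.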

With upcoming GPGPU architectures (or ANN accelerators) with lower latency, AlphaBeta ANN engines might become possible...

Bye bye 8800 GT

Hmm, my GPU workstation broke down, so I decommissioned my Maschina. Time to say bye bye to my workhorse from 2008, the Nvidia 8800 GT

Nvidia 8800 GT

 

and the Asus Crosshair Formula II

Asus Crosshair Formula II

 

and the 8 GB GeIL Black Dragon Memory

GeIL Black Dragon Memory

 

not to forget, the water-cooled 'Beast' from 2015, AMD Fury X with 4096 cores, 8 TFLOPS and 4 GB HBM

The Beast - AMD Fury X

...we had a lot of fun, may you find a new, worthy owner on eBay.
 

Zeta - Source Code and Binaries online

I fixed some issues in Zeta Dva and Zeta; source code and binaries are online again:

https://github.com/smatovic/ZetaDva/releases

https://github.com/smatovic/Zeta/releases

Please consult the README file or the --help option before running the Zeta engine on GPU.

I lost the source of Zeta Vintage, and an attempt at a rewrite in C showed again that the 6502 processor should really be programmed in assembly, so a rewrite in 6502 assembly is still on my bucket list...

https://github.com/smatovic/ZetaVintage

Alternative downloads:

https://zeta-chess.app26.de/downloads/

 
