DTM50 tablebases : format discussion - extension from syzygy

Post by kronsteen »

DTM50 work is making progress. Galen and I are currently checking our 3-5 men DTM50 EGTs against each other, and no bug has been found so far. We are therefore close to a complete and fully verified 3-5 men DTM50 set, with 6-men expected to follow later (Galen's generator already has some 6-men capability; see his blog at http://galen.metapath.org/egtb50/).

The complete 3-5 men DTM50 set looks substantial enough to release to the chess community, which raises the question of the release format.
As explained before, a single position can have several DTM50 values depending on the value of the ply counter (PC), so DTM50 tablebases do not store a single value per position but a string of values. There are also relationships between DTM50 and the existing DTM and DTZ50 tablebases that must be taken into consideration.

I've laid out some ideas about a future DTM50 release format (see the draft in the zip file below), and they come down to the following concept:

- create a package of 4 files per ending that fully supports several metrics (WDL50+, DTZ50+, DTM, DTM50), giving end users a wide variety of choices;
- build this package on top of the existing syzygy WDL50 / DTZ50 tablebases by creating two supplementary files: an "adm" file containing absolute DTM information (similar to standard DTM EGTs such as Nalimov, but working with WDL), and an "m50" file containing keys pointing to strings that describe the delayed wins relative to DTM, according to PC relative to DTZ50.

This sounds a bit technical, but it becomes clearer when looking at the examples given for the kpk ending in the attached zip file:
DTM50_format.zip (1.83 MiB)
(see codage.txt for the codes used in the different files)

kpk.wdl contains win/draw/loss information
kpk.z50 contains DTZ50 values (in plies)
kpk.adm contains DTM values (in full moves)
kpk.m50 contains the DTM50 information relative to DTM and DTZ50 (only positions with non-empty DTM50 information are listed). DTM and DTZ50 values are repeated for convenience.

See for example in kpk.m50 the position Ka2 Pb2 / Kd3 (wtm):
DTZ50=1, meaning that the position is winning if PC is no more than 100-1 = 99 plies (in this example 1.b3 wins for White, so no value of PC allows a 50-move rule draw)
DTM=19, meaning that the position is a mate in 19 when playing without the 50-move rule
DTM50key is 8, and DTM50 string number 8 (see the end of the kpk.m50 file) is 6 4 / 4 5, meaning that the win is delayed if PC gets high enough:
- if PC is <= 99-6 = 93, the position is still winning in 19 moves (= DTM)
- if PC is > 93 but <= 99-4 = 95, the win is delayed by 4 moves, i.e. the position becomes a win in 23
- if PC is > 95 but <= 99, the win is delayed by 5 moves, i.e. the position becomes a win in 24
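
The decoding of the kpk example above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the reading of a DTM50 string as a list of "offset delay" pairs separated by "/" is inferred from this single example rather than taken from codage.txt. It also only covers winning positions.

```python
def parse_dtm50_string(text):
    """Parse e.g. "6 4 / 4 5" into [(6, 4), (4, 5)] (assumed pair layout)."""
    pairs = []
    for chunk in text.split("/"):
        if chunk.strip():
            offset, delay = (int(tok) for tok in chunk.split())
            pairs.append((offset, delay))
    return pairs


def dtm50_for_pc(dtm, dtz50, pairs, pc):
    """Return the mate distance (in moves) for ply counter pc, or None if
    the win can no longer be forced under the 50-move rule."""
    limit = 100 - dtz50          # highest PC at which the position still wins
    if pc > limit:
        return None
    delay = 0
    # Pairs are assumed to be listed by decreasing offset, so the last
    # threshold that pc exceeds gives the delay that applies.
    for offset, extra in pairs:
        if pc > limit - offset:
            delay = extra
    return dtm + delay


pairs = parse_dtm50_string("6 4 / 4 5")
for pc in (0, 93, 94, 95, 96, 99):
    print(pc, dtm50_for_pc(19, 1, pairs, pc))
```

For Ka2 Pb2 / Kd3 (DTM=19, DTZ50=1) this reproduces the three cases listed above: 19 moves up to PC=93, 23 moves for PC 94-95, and 24 moves for PC 96-99.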

As can be seen in kpk.m50, there are only 46 different DTM50 strings (including the empty string) for the kpk ending.

I’ve also tried a more complex ending, krkp. In this ending there are 7840 different DTM50 strings, but fortunately, and just as I had hoped, the most frequent strings cover the vast majority of positions. When sorted by decreasing frequency, the first 5 DTM50 strings (including the don't care string and the empty string) cover 90% of positions, the first 23 cover 95%, the first 137 cover 98% and the first 425 (still less than 6% of all the strings) cover 99%. This is very promising, as applying Huffman coding to such a skewed distribution should lead to excellent compression ratios.
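
To get a rough feel for what Huffman coding could achieve on the krkp keys, here is a small sketch. The only real inputs are the coverage figures quoted above; the assumption that frequencies are uniform inside each coverage band is mine and is certainly too crude (the real distribution inside the top 5 strings is probably far more skewed), so the result is an order of magnitude, not a measurement.

```python
import heapq
from itertools import count


def huffman_code_lengths(weights):
    """Return the Huffman code length in bits for each symbol weight."""
    if len(weights) == 1:
        return [1]
    tie = count()  # tie-breaker so the heap never compares the lists
    heap = [(w, next(tie), [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:        # every symbol under the merged node
            lengths[i] += 1      # moves one level deeper in the tree
        heapq.heappush(heap, (w1 + w2, next(tie), s1 + s2))
    return lengths


# (number of strings, share of positions) per band, from the krkp figures:
# 5 -> 90%, 23 -> 95%, 137 -> 98%, 425 -> 99%, 7840 -> 100%.
bands = [(5, 0.90), (18, 0.05), (114, 0.03), (288, 0.01), (7415, 0.01)]

weights = []
for n_strings, share in bands:
    # ASSUMPTION: uniform frequency inside each band.
    weights += [share / n_strings] * n_strings

lengths = huffman_code_lengths(weights)
avg_bits = sum(w * l for w, l in zip(weights, lengths))
print(len(weights), "strings, average key length:", round(avg_bits, 2), "bits")
```

Under these assumptions the average key length stays well under the 13 bits that a fixed-width index into 7840 strings would need.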
