EGTB compression (H. G. Muller tablebase...)

Endgame analysis using tablebases, EGTB generation, exchange, sharing, discussions, etc.
dann corbit
Posts: 16
Joined: Fri Jan 25, 2008 9:10 am

EGTB compression (H. G. Muller tablebase...)

Post by dann corbit »


Given the following uncompressed H. G. Muller tablebase file:
01/24/2008 11:25 PM 268,439,552 KBB_KN

From here:
http://www.quicklz.com/
I ran a compression benchmark against the file, which shows compression ratio and speed of both compression and decompression:

C:\COMPRE~1\bench\Release>bench KBB_KN
LZF 3.1 VER : 22490 KB (28.8%) 150.2 MB/s 289.0 MB/s.
FastLZ 0.1.0 1: 22751 KB (29.1%) 177.6 MB/s 329.5 MB/s.
FastLZ 0.1.0 2: 22507 KB (28.8%) 141.1 MB/s 325.5 MB/s.
LZO 1X 2.02 : 20575 KB (26.3%) 144.5 MB/s 361.7 MB/s.
QuickLZ 1.4.0 4 : 18339 KB (23.5%) 35.0 MB/s 429.1 MB/s.

The QuickLZ 1.4.0 compressor created an 18 MB file that decompresses at 429 MB/s.


The size certainly compares well with Nalimov files:
Directory of C:\arena\Engines\Nalimov

06/04/2003 11:00 PM 17,795,185 kbbkn.nbb.emd
06/04/2003 11:00 PM 13,167,012 kbbkn.nbw.emd

I bring this up because it is so cotton-picking hard to get ahold of Eugene for permissions.

I was wondering: which other EGTB formats are available as open source? I believe the Scorpio format is published.

So we have:

H.G.Muller
D.Shawul
E.Nalimov

with source code for EGTB available. Are there any others?
jkominek
Posts: 150
Joined: Mon Dec 04, 2006 9:02 am
Location: Pittsburgh, PA

Re: EGTB compression (H. G. Muller tablebase...)

Post by jkominek »

Hi Dann,

Thanks for pointing out QuickLZ (a library I hadn't heard of before) and running some tests. My standard disclaimer applies: unless it is block-indexable, the compression library is not suitable for the large DTM and DTC tablebases. It may be suitable for bitbases and the smaller tables that can be held entirely in an amount of memory typical on today's machines.
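
To spell out what I mean by block-indexable: the file is compressed in fixed-size blocks with a table of block offsets, so probing one position means decompressing one block rather than the whole table. A rough sketch follows; the block size, struct layout and the decompress_block() call are all illustrative stand-ins of mine, not taken from QuickLZ or any other particular library.

/* Illustrative only: compress the table in fixed-size blocks and keep a
 * table of compressed-block offsets, so that probing one position means
 * decompressing one block, not the whole file.  decompress_block() is a
 * stand-in for whichever codec is plugged in (QuickLZ, LZO, ...). */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define BLOCK_SIZE 8192              /* uncompressed bytes per block */

typedef struct {
    FILE     *f;                     /* compressed tablebase file             */
    uint64_t *block_offset;          /* num_blocks + 1 entries (end sentinel) */
    uint32_t  num_blocks;
} tb_file;

size_t decompress_block(const void *in, size_t in_len, void *out); /* codec */

/* Return the byte (e.g. a DTM value) stored at tablebase index idx, or -1. */
int tb_probe(tb_file *tb, uint64_t idx)
{
    uint64_t block  = idx / BLOCK_SIZE;
    uint64_t within = idx % BLOCK_SIZE;
    if (block >= tb->num_blocks)
        return -1;

    uint64_t start = tb->block_offset[block];
    size_t   clen  = (size_t)(tb->block_offset[block + 1] - start);

    unsigned char  ubuf[BLOCK_SIZE];
    unsigned char *cbuf = malloc(clen);
    if (cbuf == NULL)
        return -1;

    /* A 64-bit seek (fseeko/_fseeki64) is needed for multi-GB files;
       plain fseek is used here only to keep the sketch short. */
    fseek(tb->f, (long)start, SEEK_SET);
    if (fread(cbuf, 1, clen, tb->f) != clen) {
        free(cbuf);
        return -1;
    }
    decompress_block(cbuf, clen, ubuf);
    free(cbuf);

    return ubuf[within];
}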

You raise quite an interesting point regarding Nalimov tablebases, namely the dwindling supply of personal permission. I might be able to help out. As you'll notice in other threads, I've been reworking the tbgen code (in an on-again, off-again manner), with the immediate goal of generating the missing 5-1 tables and the long-range goal of tackling 7- and 8-man DTM tablebases. When released, they'll come accompanied by a "permission granted to use for eternity" statement.

But, I want to make you aware of some issues.

1. The indexing is different in that it is simpler, but less compact. The tables consume approximately the same amount of storage when compressed, but when uncompressed they take up 10% extra space in memory.

2. The indexing is different with regard to piece mapping. Specifically, pawns are moved from the "inner axis" to the "outer axis". Say the position is in the set kqrnkbp. In the usual approach the king-pair forms the outermost multiplier, followed by the other pieces in canonical order: q r n - b p. What this means is that if one of the kings moves, the index changes by a huge amount, while if the pawn moves forward one row the index increases by only 8. Index locality thus favors black's "little guys". In my modification, moving a pawn changes the index by a huge factor, the kings by a medium factor, and so on (see the sketch after this list).

There are two motivating reasons. The first is that child positions bunch together more tightly, with only the small number of pawn moves going "way yonder". The larger reason relates to generation, not access: pawn slicing. As Muller points out, distinct pawn positions create logically distinct tablebases, with dependencies that can be enumerated. This aspect is important for 7-man generation, where the larger configurations contain 2.2 tera-positions. In my indexing scheme, all the indexes that belong to a pawn configuration form one contiguous block. In practice this has the benefit that different pawn slices write to separate files; when finished, they are simply concatenated together to form a complete tablebase. This is not possible with Nalimov indexing (or with any of the others, as far as I am aware), and is my one major departure from previous work.

3. I don't care so much how practical DTM tablebases are during game-tree search, whereas Nalimov appears to have been quite concerned about that. The value of tablebases during live play is hotly contested among engine authors, with most concluding that the benefit is minor to negligible. Bitbases are the more appropriate choice for embedding in a search. Generating derivative bitbases would be my concession to that audience.
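
To make point 2 concrete, here is a toy version of the two orderings for kqrnkbp. It is only a sketch of the idea: the real generator reduces the king-pair by symmetry and restricts the pawn to its 48 legal squares, neither of which is shown here.

/* Toy comparison of the two index orderings for kqrnkbp.  kk is a
 * symmetry-reduced king-pair code in [0, kk_range); all other arguments
 * are plain 0..63 squares.  Real code would restrict the pawn to its
 * 48 legal squares. */
#include <stdint.h>

#define NSQ 64ULL

/* Usual ordering: king-pair outermost, pawn innermost.  A pawn push
 * changes the index by 8; a king move changes it by a huge multiple. */
uint64_t index_pawn_inner(uint64_t kk, unsigned q, unsigned r,
                          unsigned n, unsigned b, unsigned p)
{
    return ((((kk * NSQ + q) * NSQ + r) * NSQ + n) * NSQ + b) * NSQ + p;
}

/* Modified ordering: pawn outermost.  Every position sharing one pawn
 * square falls in one contiguous slice of the index range, so each pawn
 * slice can be generated into its own file and the files concatenated
 * afterwards into the complete table. */
uint64_t index_pawn_outer(uint64_t kk, unsigned q, unsigned r,
                          unsigned n, unsigned b, unsigned p,
                          uint64_t kk_range)
{
    uint64_t slice_size = kk_range * NSQ * NSQ * NSQ * NSQ;
    uint64_t within     = (((kk * NSQ + q) * NSQ + r) * NSQ + n) * NSQ + b;
    return (uint64_t)p * slice_size + within;
}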

Until your posting I had not been keen on the idea of releasing, in an alternate form, another 1.2 TB of data that duplicates information already available (and a big enough pain to collect as it is). Who would love me for that?

john
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Location: Mishima, Japan

Re: EGTB compression (H. G. Muller tablebase...)

Post by Kirill Kryukov »

jkominek wrote: Until your posting I had not been keen on the idea of releasing, in an alternate form, another 1.2 TB of data that duplicates information already available (and a big enough pain to collect as it is). Who would love me for that?
I would, if that other 1.2 TB were accompanied by free generator code and free probing code. You probably know that the Nalimov code is not free, and it's close to impossible to contact him these days.

Though while you are at it, you might think about doing DTC or DTZ tablebases. While they are not useful for finding the longest checkmates, they should be more compact than DTM tables. Wouldn't it be nice if that 1.2 TB became 0.5 TB?

We are still lacking a complete and free 6-man EGTB solution, which should include the generator, probing code, and tablebase files.

By the way, making the probing interface compatible with Nalimov's would be a nice feature, as it would make it easier for engine writers to incorporate it.
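
Something along these lines is what I have in mind; the names below are just placeholders of mine, not Nalimov's actual exported functions. The engine side only ever sees an init call and a probe call, so the back end could be swapped freely.

/* Placeholder probing interface (the names are mine, not Nalimov's actual
 * exported functions).  The point is that the engine only ever sees an
 * init call and a probe call, so the back end can be swapped freely. */
#include <stdbool.h>

typedef struct { int stub; } Position;   /* stand-in for the engine's board */

/* Scan 'path' for tablebase files and size the cache; stubbed out here. */
int egtb_init(const char *path, int cache_mb)
{
    (void)path; (void)cache_mb;
    return 0;                            /* number of men covered */
}

/* On a hit, write the score (DTM or WDL) into *score and return true;
 * on a miss return false so the search falls back to normal evaluation. */
bool egtb_probe(const Position *pos, int *score)
{
    (void)pos; (void)score;
    return false;                        /* stub: no tables loaded */
}

/* Typical call site inside an engine's search: */
int probe_or_eval(const Position *pos, int static_eval)
{
    int tb_score;
    if (egtb_probe(pos, &tb_score))
        return tb_score;
    return static_eval;
}
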
jkominek
Posts: 150
Joined: Mon Dec 04, 2006 9:02 am
Location: Pittsburgh, PA

Re: EGTB compression (H. G. Muller tablebase...)

Post by jkominek »

Kirill Kryukov wrote: I would.
Thanks! :)
Kirill Kryukov wrote: Though while you are at it, you might think about doing DTC or DTZ tablebases.
Guy Haw. has made a similar suggestion. I'm particularly intrigued by the prospect of DTZ50 data, so that "this is the truth" may be accompanied by "this is what's legal". Yakov Konoval is quite far ahead on the DTC/DTZ front, but I haven't seen any announcements in a year and a half now, and wonder if that effort has stalled. He might simply be quiet.

john

P.S. Dann - Whatever happened to your connx FTP site?
jshriver
Posts: 298
Joined: Tue Jan 31, 2006 5:59 am
Location: Toledo, OH, USA

Re: EGTB compression (H. G. Muller tablebase...)

Post by jshriver »

Kirill Kryukov wrote: I would, if that other 1.2 TB were accompanied by free generator code and free probing code. You probably know that the Nalimov code is not free, and it's close to impossible to contact him these days.
I have to agree with Kirill. I have very much respect for Nalimov, but he seems to have been out of the scene for many years now, and even getting permission to include his code in an engine, in order to use the freely available dataset, seems rare.

While I'm grateful for the available dataset, I would love to see a BSD-licensed or public-domain codebase. From a developer's point of view I'm grateful he is offering code and data for free, but the "must contact the owner for approval" requirement can be a bottleneck for those getting into the field.

I had never heard about Mr. Muller's tablebases, but I am anxious to hear more about them. What is the license? I'd be willing to dedicate CPU time to EGTB generation if given the code (Linux).
guyhaw
Posts: 489
Joined: Sat Jan 21, 2006 10:43 am
Location: Reading, UK

Re: EGTB compression (H. G. Muller tablebase...)

Post by guyhaw »

Eugene didn't respond to emails about releasing the last 16 6-man DTM EGTs. In the end, it took the simple device of a 'phone call by Nelson F to move things on. Maybe one can get allergic to email.
Marc B championed the creation of some DTZ50 EGTs, given the absence of EGTs to the DTR and DTZR metrics. He was right: the DTR/DTZR EGTs still haven't been created - a project waiting for a skilful programmer.

DTR EGTs tell you how far off you are from definitely avoiding a draw-claim under a k-move rule, for all k. They say that you can win from the position under a dtr-move rule but not under a (dtr-1)-move rule.
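
Put as a formula (just my restatement of the sentence above, nothing more):

\[
  \mathrm{DTR}(p) \;=\; \min \{\, k : \text{the win can still be forced from } p \text{ under a } k\text{-move rule} \,\}
\]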