CCRL Discussion Board

Posted: **Wed Aug 09, 2006 6:58 am**

It seems that, after everything that has been going on, Mr. Nalimov himself has released the missing EGTBs. Mr. Hernandez has received them, so now an interesting analysis begins.

I'm working with him now to see if there are any differences.
I'm crossing my fingers, hoping, and praying that they are the same for the sake of the chess community.

If possible, I'm going to try to collect both sets and compare them directly. For now, simple md5sums will have to suffice.

The first thing that comes to mind is that, even though the data might be the same, there might be differences in how the files were split.

While that might be an easy fix, it's a first step in making sure the data is identical.

As I learn more I'll gladly post here. Comments are appreciated.
As of right now however, the sets that I received are the only ones available.

Mr. Hernandez has noted he plans to send them to Kirill.

-Josh

Posted: **Wed Aug 09, 2006 8:49 am**

Hi,

I think we should stop distributing the "16 missing sets" until it is clear that they are congruent with the files generated by Mr. Nalimov. But there is not much hope that the split of the big files is identical. We will get a mismatch of files once the first file of Nalimov's original sets is published. What do you think about this?

Posted: **Wed Aug 09, 2006 9:49 am**

Hello Thomas,

the file splitting for Nalimov tablebases is absolutely unique.
Please have a look at the thread "EGBT file splitting".
The uncompressed files are broken every 2^31 bytes
and then compressed individually. You can even get the
splitting program by Marc in that thread.
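
A minimal sketch of the splitting rule described here, assuming only what the post states: the uncompressed file is cut at every 2^31 bytes. The function names and the `.partN` suffix are hypothetical, this is not Marc's program, and the per-part compression step is left out.

```python
CHUNK = 2**31  # split size in bytes, as stated in the post


def split_ranges(total_size, chunk=CHUNK):
    """Return (start, end) byte ranges, end exclusive, one per part."""
    return [(start, min(start + chunk, total_size))
            for start in range(0, total_size, chunk)]


def split_file(path, chunk=CHUNK, buf_size=1 << 20):
    """Write path.part0, path.part1, ... each holding at most `chunk` bytes.

    The '.partN' naming is illustrative only, not the real tablebase naming.
    """
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            remaining = chunk
            data = src.read(min(buf_size, remaining))
            if not data:
                break  # source exhausted, no empty trailing part
            out_path = f"{path}.part{index}"
            with open(out_path, "wb") as dst:
                while data:
                    dst.write(data)
                    remaining -= len(data)
                    if remaining == 0:
                        break  # this part is full
                    data = src.read(min(buf_size, remaining))
            parts.append(out_path)
            index += 1
    return parts
```

Because the cut points depend only on the uncompressed size, two people splitting the same uncompressed data this way should end up with the same part boundaries.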

Greetings, Martin Kreuzer

Posted: **Wed Aug 09, 2006 11:50 am**

Martin Kreuzer wrote:
Hello Thomas,

the file splitting for Nalimov tablebases is absolutely unique.
Please have a look at the thread "EGBT file splitting".
The uncompressed files are broken every 2^31 bytes
and then compressed individually. You can even get the
splitting program by Marc in that thread.

Greetings, Martin Kreuzer

The biggest ambiguity is whether 8 or 16 bits are used to store tablebase values. This is hard-coded in the index code, and is the reason why the index code often needs to be updated when new files are released. I have used the index code available in late 2004. If this was changed later, some endings could be affected. The most likely ending that may be different is kbnkpp, which uses 16 bits per entry, even though 8 bits are sufficient. It is relatively straightforward to convert from 16 to 8 bits in this case.
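
As a rough illustration of such a 16-to-8-bit conversion, here is a sketch under loudly stated assumptions: entries are treated as little-endian signed 16-bit integers whose values all fit in a signed byte. The real on-disk encoding, including how special values such as 'broken' are represented, is not described in this thread and may need an explicit mapping. The function name is hypothetical.

```python
import struct


def narrow_16_to_8(raw16):
    """Convert 16-bit entries to 8-bit entries, one byte per entry.

    Assumption: entries are little-endian signed 16-bit integers. Any
    value outside the signed 8-bit range raises instead of being
    silently truncated.
    """
    if len(raw16) % 2:
        raise ValueError("16-bit table must have an even byte length")
    count = len(raw16) // 2
    values = struct.unpack(f"<{count}h", raw16)
    for i, value in enumerate(values):
        if not -128 <= value <= 127:
            raise ValueError(f"entry {i} = {value} does not fit in 8 bits")
    return struct.pack(f"{count}b", *values)
```

This works on an in-memory buffer; a real table would be processed in blocks of an even number of bytes.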

-Marc

Posted: **Thu Aug 10, 2006 4:40 am**

Nelson has sent me the TBS files for the last 16 sets:

Posted: **Thu Aug 10, 2006 6:30 am**

My understanding so far is that:

1) [via MB] the EGT parts will be split at identical '2^31' points and are self-contained, so parts from the two versions should be able to mix freely
2) [via MB and NH] there is an 8-bit/16-bit issue affecting at least KQPKBP & KQPKRP
MB converted from FEG in line with a 'late 2004' version of Eugene's access code: EN may have made some improvements (to 8-bit) since then
3) On the MD5sum-checking front, there are two issues. First, the MD5sums for EN's version of the last 16 DTM EGTs have not been computed by Eugene, or yet by Nelson. Secondly, MB advises that 'broken positions' may have their value set to 'broken' or to something that improves compression further, as this does not affect the chessic value of the files.

I would like to see a file-by-file comparison, on the basis of MD5sums, between EN's and MB's versions of the last 16 EN DTM EGT files. Obviously, agreement at this level gives the best confidence that no issues will arise, and any disagreements give us the earliest opportunity to learn from them.
g
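
A file-by-file MD5 comparison of this kind could be sketched as follows, assuming each version sits in its own flat directory and matching files share a name. The function names are hypothetical.

```python
import hashlib
import os


def md5_of_file(path, buf_size=1 << 20):
    """Return the hex MD5 digest of a file, read in blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(buf_size), b""):
            digest.update(block)
    return digest.hexdigest()


def compare_sets(dir_a, dir_b):
    """Map each file name to 'match', 'differ', 'only in A' or 'only in B'."""
    names_a = set(os.listdir(dir_a))
    names_b = set(os.listdir(dir_b))
    report = {}
    for name in sorted(names_a | names_b):
        if name not in names_b:
            report[name] = "only in A"
        elif name not in names_a:
            report[name] = "only in B"
        else:
            same = (md5_of_file(os.path.join(dir_a, name))
                    == md5_of_file(os.path.join(dir_b, name)))
            report[name] = "match" if same else "differ"
    return report
```

A 'differ' result only says the bytes are not identical. Given point 3 above, two files could still carry the same chessic values if broken positions were filled differently.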
