5-1 EGTB generation revisited

guido · Post by **guido** » Sat Dec 02, 2006 3:45 pm

Martin Kreuzer wrote:Hi Derek,

thank you very much for your offer to help with 5-1 EGTB generation.
Possible targets of your computation could be:

(1) kqrnnk, kqbnnk, and then (using the endgames kqqnnk and kqnnnk
which I already posted) also kqnnpk.

(2) krbnnk, and then krnnpk (using kqrnnk from (1) and the endgames
krrnnk [to be posted today] and krnnnk [already posted]).

These five computations should be doable in less than 1 week CPU time
(1/2 to 1 day for each).

If you want a "challenge", try kbbbpk. (I was not able to finish this one.
It takes several days. The necessary subendgames are already posted.)

Greetings, Martin

Hi Martin,

As you know, I'm doing the same work with my generator.

At present I have generated (and verified) all the 5-1 pawnless TBs and some TBs with one pawn (kqqqpk,
krrrpk, kbbbpk, knnnpk, kqqrpk, kqqbpk and kqqnpk). Until now the longest CPU time spent was 121335 s for kqqnpk.
For kbbbpk I spent only (?) 61956 s.
I think that for my program the worst 5-1 case will be krbnpk, because it has the maximum size and a long distance to mate compared with the other 5-1 TBs.

I don't understand why FEG doesn't treat identical men as interchangeable in order to reduce the size of the TBs during the computation as well. The CPU time for indexing these pieces is very small, while the advantage of having smaller files is very big.

Ciao
Guido

Martin Kreuzer · Post by **Martin Kreuzer** » Mon Dec 04, 2006 3:54 pm

Dear Guido,

at present I have generated 38 of the 70 EGTB with 5-1 men,
including kqqppk with two pawns. Since I had trouble uploading to
Kirill's ftp area, my uploads are somewhat behind.

I had so far two endgames which caused trouble: kbbbpk and
knnnpk. After taking two weeks for generating the positions,
knnnpk is now finally finishing. I will launch a similar "brute force"
attack on kbbbpk immediately afterwards.

It is very good that you are generating these EGTB, too. This will enable
us to compare the results. We can make statistics a la the *.tbs files
of the Nalimov EGTB. Moreover, we can check randomly chosen positions
and compare the "mate in ..." announcements.

I looked on your web page and did not see an explicit description of the
way you number the endgames. Are you also generating some illegal positions and then marking them as "broken" (or similar)?
How much space per position do you allocate? Is your algorithm
easily describable?

I think it would be a nice project to create C (or C++) code for the
various numbering schemes. Also Nalimov's code could use a
major "clean-up". As far as I have seen, little research has been done
into the question of which numbering scheme is most efficient when
one compares results _after_ compression.

Cordial greetings and best wishes,
Martin

Martin Kreuzer · Post by **Martin Kreuzer** » Wed Dec 06, 2006 10:06 am

Hi all,

now I have computed and uploaded eight new 5-1 EGTB:

kqqppk, kqrrnk, kqrrbk, kqrbpk, kqrbbk, kqqnpk, kqrrpk, and krrbnk.

(The last two are still under transmission, which will be finished later today.) All EGTB are complete with stats files and md5sums. The total number of EGTB computed now is 38. (Two more have been computed already and will be uploaded later today or tomorrow.)

Since the computation has passed the HALFTIME mark, let me give a brief
overview:

The 38 EGTB take up about 19 GB of space. Most could be computed in 1/2 - 2 days. Two EGTB turned out to be much harder: kbbbpk and knnnpk. I have no idea why. The computation of knnnpk is finished now and took over 2 weeks. I have started kbbbpk. There was no big additional problem to compute endgames with pawns. I have also computed one endgame with two pawns (kqqppk) and it was not more difficult either.

Some jobs for the next stage of the project would be:

1) Verify the computation, both internally using the verification capability of FEG.EXE and externally using Guido's program.

2) Create statistics files in the style of the *.tbs files of Nalimov EGTB, using the command FEG -LLL and/or the *.log statistics files.

3) Examine the numbering scheme used in the EGTB and write a
program which converts them to Nalimov format.

Any help with these tasks would be most welcome. For the time being, I will concentrate on completing the computation of the 70 EGTB.

Greetings, Martin Kreuzer

guyhaw · Post by **guyhaw** » Thu Dec 07, 2006 12:51 pm

Would be interested to review FEG statistics of 5-1 DTM EGTs, hoping that FEG's correctness problems are a thing of the past.
Marc B did explain to me the difference between Nalimov and FEG stats once, so I'm reaching into my memory.
Basically FEG/Nalimov stats for P-ful endgames are easier to reconcile than for non-P-ful endgames. That suggests to me that the reduction-for-symmetries is treated differently in FEG and in Nalimov stats. Anyway, I'm not clear where and how the symmetry-reduction takes place in FEG stats.
However, I do think that the 'cycle number' is effectively plies-to-mate: if there are counterexamples to that belief, please let us know.
Would be interested in exemplar maxDTM positions, with supporting DTM-minimaxing lines - which will at least be short.
g

Martin Kreuzer · Post by **Martin Kreuzer** » Thu Dec 07, 2006 6:19 pm

Hi all,

today I have uploaded two new EGTB: knnnpk and krrbpk .
They are complete with stats files and md5sums.

Guy: the *.log stats files are in the directory /stats on egtb51.
I would be most interested in "ply to mate" statistics which do not take symmetries into account and simply enumerate positions.

Greetings,
Martin Kreuzer

guyhaw · Post by **guyhaw** » Thu Dec 07, 2006 6:29 pm

I think you are assuming I know where the 5-1 info can be downloaded from: not so I'm afraid.
I won't be d'loading the EGTs themselves but am interested in the two types of stats files per EGT.
You can, if you wish, get in touch by email via http://www.tinyurl.com/law6k to send a zip of stats files, post them here, or tell me where/how I can d'load them.
Thanks - Guy

guido · Post by **guido** » Thu Dec 07, 2006 10:49 pm

Hi Martin,

My present situation is: 43 5-1 men TBs done and checked, and 27 still to do.
Unfortunately, the remaining TBs include the ones that take gafs longest to generate, i.e. kqrbpk, kqrnpk, kqbnpk and krbnpk. Each of these TBs will take about 4 days: 3 for generation and 1 for checking.

Your problem with kbbbpk and knnnpk is very strange. If FEG doesn't take the identical pieces into account, these endgames should have the maximum size and the maximum number of cycles. But that seems to me an insufficient explanation, because gafs, which is generally much slower, spends about 60000 s of elapsed time in both cases.
Multiplying this time by 6 = 3!, it becomes 360000 s, equivalent to 4 d + 4 h, much less than two weeks.
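
The arithmetic behind this estimate can be checked in a few lines of Python. This is only a sketch of the calculation in the text; the variable names are made up for illustration.

```python
from math import factorial

GAFS_ELAPSED_S = 60_000            # approximate gafs elapsed time quoted above
like_men_penalty = factorial(3)    # three identical men: 3! = 6 times as many positions

scaled_s = GAFS_ELAPSED_S * like_men_penalty
days, rest = divmod(scaled_s, 86_400)   # 86,400 seconds per day
hours = rest // 3_600

print(scaled_s, days, hours)       # 360000 4 4
```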

In order to compare the stats I wrote a small program which converts my stats output to Nalimov's format, but my count of (pseudo)legal positions will always be lower than Nalimov's. In my stats the number of broken positions is not printed, but it is easily obtained as the difference: total positions - calculated positions.
There is no problem extracting positions in FEN format as well, on the basis of the result or the combined results of White and Black.
Version 1.41 of my program has some bugs (not in the generation!) when printing is requested in English notation. In the new version 1.42, not yet on my site, these bugs will be eliminated and the FEN format for the ASCII output will be added.

The numbering scheme used by gafs is based on the following sequence, each of which represents a sub-index of the global index:

Kings (462 without pawns or 1806 with pawns) treated in the program by tables
White pawn(s)
Black pawn(s)
White queen(s), rook(s), bishop(s), knight(s)
Black queen(s), rook(s), bishop(s), knight(s)

Pawns start from the 48 possible squares on the chessboard, without taking the positions of the kings into account. If a square is occupied by both a King and a pawn, the corresponding byte in the TB is set to illegal (= broken in Nalimov's terminology). If you are interested in this subject, see on my site the function calcola(void) in the source of the program gafsdim, where this point has been taken into consideration to calculate the number of possible positions more accurately (unfortunately the comments are in Italian).

The subsequent pieces are placed taking into account the squares already occupied by kings and pawns.
A group of n identical pawns or pieces is treated together to form one sub-index, reducing the corresponding space in the TBs by the factor n!.

Example: kqqnpk = 1806 * 48 * (61*60)/2 *59 = 9,359,703,360.
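
The example above can be reproduced with a short Python sketch. It is not taken from gafs or gafsdim: the function name `table_size` and its arguments are hypothetical, and it assumes that a second group of pawns would exclude the squares taken by the first group, which the description does not state explicitly.

```python
from math import comb

def table_size(pawn_groups, piece_groups):
    """Number of index slots for one side to move.

    pawn_groups / piece_groups: counts of identical men per group, in
    indexing order, e.g. kqqnpk -> pawn_groups=[1], piece_groups=[2, 1].
    """
    total_pawns = sum(pawn_groups)
    size = 1806 if total_pawns else 462   # king pairs, as quoted in the text

    pawn_squares = 48                     # pawns ignore the kings' squares
    for n in pawn_groups:
        size *= comb(pawn_squares, n)     # n identical pawns: divide by n!
        pawn_squares -= n                 # assumption: later pawns avoid earlier ones

    free = 64 - 2 - total_pawns           # pieces avoid kings and pawns
    for n in piece_groups:
        size *= comb(free, n)             # n identical pieces: divide by n!
        free -= n
    return size

print(table_size([1], [2, 1]))            # kqqnpk -> 9359703360
```

For kqqnpk the factors are 1806, 48, C(61, 2) = 1830 and 59, matching the product in the example.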

As a consequence of this organization my TBs contain positions marked as illegal (broken) for the following reasons:

- Opponent King in check.
- A King and a pawn on the same square.
- Positions equivalent by symmetry with respect to the main diagonal (a1-h8). This happens when both Kings are on this diagonal.
- Triple, quadruple, etc. (> 2) checks and some types of impossible double check.
- Unreachable positions. This check is made by executing one backmove and is optional, but it costs too much CPU time and is incompatible with method 3. I don't normally use this option.

Gafs starts using 8 bits per position, but when the number of moves becomes > 126, the program automatically adds one carry bit to each position. The carry bits are written at the end of the files. If one bit is not sufficient, 2 or more bits are added. I checked this algorithm successfully only for one bit, during the generation of kppkp. I don't know if the program will run correctly when 2 or more bits must be added. I'm practically sure that some bugs will exist.
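
One way to picture "8 bits per position plus carry bits at the end of the file" is the sketch below. It is purely illustrative: the real gafs file layout, the bit order inside the carry area and the meaning of the stored byte are not described in this thread, so `read_value` and all of its conventions are assumptions.

```python
def read_value(data: bytes, n_positions: int, index: int, carry_planes: int = 1) -> int:
    """Combine the base byte of a position with its carry bit(s).

    Assumed layout: n_positions bytes, followed by one packed bit plane
    per carry bit (8 positions per byte, least significant bit first).
    """
    value = data[index]
    plane_bytes = (n_positions + 7) // 8
    for plane in range(carry_planes):
        plane_start = n_positions + plane * plane_bytes
        packed = data[plane_start + index // 8]
        bit = (packed >> (index % 8)) & 1
        value |= bit << (8 + plane)       # carry bit extends the 8-bit value
    return value

# Ten positions whose base bytes are 0..9, with the carry bit set only for position 3.
data = bytes(range(10)) + bytes([0b00001000, 0])
print(read_value(data, 10, 3), read_value(data, 10, 4))   # 259 4
```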

The algorithms used are conceptually easy and have been described by Nalimov in a thread on CCC and by Aarontay for method 1, and by Wu and Beal for method 3 as applied to Chinese chess endgames. In method 2 I use the direct move when looking for losses and the backmove for finding wins.
The problems arise in the practical implementation, where captures and promotions must be taken into account, and when the TBs are larger than the RAM, so that it is necessary to use the disk for storing the TBs during generation, swapping data between RAM and disk.
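
The split described for method 2 (forward moves to confirm losses, backmoves to find wins) can be illustrated on a toy game graph. The sketch below is not gafs code: `solve`, the `successors` dictionary and the position labels are hypothetical, and captures, promotions and disk swapping are ignored.

```python
from collections import defaultdict

def solve(successors, mated):
    """Retrograde analysis on an abstract game graph.

    successors: position -> list of positions reachable in one move
    mated: positions in which the side to move is checkmated
    Returns position -> ("win" | "loss", plies to mate) for decided positions.
    """
    predecessors = defaultdict(list)
    for pos, nexts in successors.items():
        for nxt in nexts:
            predecessors[nxt].append(pos)

    value = {pos: ("loss", 0) for pos in mated}
    frontier = set(mated)
    ply = 0
    while frontier:
        ply += 1
        found = set()
        if ply % 2 == 1:
            # Backmove: every undecided predecessor of a fresh loss is a win.
            for pos in frontier:
                for pred in predecessors[pos]:
                    if pred not in value:
                        value[pred] = ("win", ply)
                        found.add(pred)
        else:
            # Direct move: an undecided position is lost if all its moves reach wins.
            for pos, nexts in successors.items():
                if pos in value or not nexts:   # no moves and not mated: stalemate
                    continue
                if all(value.get(n, ("?", 0))[0] == "win" for n in nexts):
                    value[pos] = ("loss", ply)
                    found.add(pos)
        frontier = found
    return value

toy = {"M": [], "W1": ["M", "D"], "L2": ["W1"], "W3": ["L2", "D"],
       "D": ["D2"], "D2": ["D"]}
result = solve(toy, mated={"M"})
print(result)
```

Positions "D" and "D2" only move to each other, so they are never assigned a value and count as draws.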

Endgames with pawns are not more difficult than the others for my program, but they are in general bigger and require the prior generation of all the TBs reached by promotion or capture.

Creating a C program for the different numbering schemes is not easy. The FEG format is not known, while the Nalimov format I remember being unable to decipher.

About the space occupied by compressed files, unfortunately I have no experience; I use compression only to transfer files. In fact the library gafslib for using my TBs works only with uncompressed files, with two possibilities:

- loading the whole file in RAM.
- reading a single result on the disk (randomly obviously!)

I think that if the number of accesses to the TBs is relatively limited, the second mode could be sufficiently fast, especially since results obtained from the TBs could be kept in the hash table.

My best wishes
Guido

Martin Kreuzer · Post by **Martin Kreuzer** » Fri Dec 08, 2006 9:54 am

Dear Guy,

attached I am posting the FEG stats files of the 40 EGTB I computed so far. There are 3 files for each endgame. The most interesting one should be the *.log file. The description of the format is in FEG.TXT which I posted above.
The EGTB themselves are on Kirill's ftp account, whose URL and login info you can also see further up in this thread.

Best wishes and kind regards,
Martin Kreuzer

clocks · Post by **clocks** » Fri Dec 08, 2006 4:57 pm

I have the necessary disk space now; I just got it all set up the other night.

Please let me know which files I should generate and where to upload the stats. I would like to commit to a handful at a time.

Derek

guyhaw · Post by **guyhaw** » Sun Dec 10, 2006 1:38 am

For P-ful endgames, 'LOG/LOF' (column 5) = 2 precisely, corresponding to the a-h symmetry being the only symmetry.
For P-less endgames, 'LOG/LOF' varies widely in the range [4, 8] which rather surprises me: I expected it to be consistently around ~7.6.
I think one has to divide both the LOF and LOG column-5 figures for the 'like men effect', i.e. dividing by 2, 4, 6 or 24 as appropriate.
I would be interested in exemplar wtm and btm maxDTM (1-0 of course) positions: the stats look like the following I believe ...

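(after 'Like Men')

| Endgame | # distinct pos. 1-0, wtm | # distinct pos. 1-0, btm | maxDTM 1-0, wtm | maxDTM 1-0, btm |
|---|---:|---:|---:|---:|
| KBBBBK | 80 | 8 | 13 | 16 |
| KBBBNK | 2 | 48 | 22 | 33 |
| KBBNNK | 2 | 10 | 25 | 34 |
| KBNNNK | 2 | 6 | 22 | 34 |
| KNNNNK | 1,139 | 507 | 18 | 21 |
| KQBBBK | 14,709 | 8 | 6 | 16 |
| KQNNNK | 6 | 507 | 8 | 21 |
| KQQBBK | 91,048 | 396,494 | 4 | 6 |
| KQQBNK | 405,459 | 11,414 | 4 | 7 |
| KQQNNK | 375,519 | 592 | 4 | 8 |
| KQQQBK | 9 | 38,647 | 4 | 4 |
| KQQQNK | 58 | 79,205 | 4 | 4 |
| KQQQQK | 8,442 | 245,486 | 3 | 3 |
| KQQQRK | 358,536 | 382 | 3 | 4 |
| KQQRBK | 7,611 | 39,801 | 4 | 5 |
| KQQRNK | 15,997 | 671,632 | 4 | 5 |
| KQQRRK | 17 | 71,713 | 4 | 4 |
| KQRBBK | 29,134 | 2,190 | 5 | 12 |
| KQRBNK | 234,359 | 56 | 5 | 29 |
| KQRRBK | 16 | 6 | 5 | 10 |
| KQRRNK | 82 | 10 | 5 | 10 |
| KQRRRK | 1,609 | 922 | 4 | 5 |
| KRBBBK | 1,130 | 8 | 11 | 16 |
| KRNNNK | 12 | 507 | 16 | 21 |
| KRRBBK | 27 | 2,190 | 10 | 12 |
| KRRBNK | 1 | 56 | 10 | 29 |
| KRRNNK | 2 | 5 | 10 | 15 |
| KRRRBK | 37 | 6 | 6 | 10 |
| KRRRNK | 43,711 | 10 | 5 | 10 |
| KRRRRK | 23,817 | 922 | 4 | 5 |
| KNNNPK | 8 | 646 | 25 | 28 |
| KQQBPK | 30 | 5 | 5 | 9 |
| KQQNPK | 472 | 477 | 5 | 9 |
| KQQPPK | 2,386 | 111,292 | 5 | 9 |
| KQQQPK | 8,065 | 899,301 | 4 | 4 |
| KQQRPK | 410,425 | 63 | 4 | 7 |
| KQRBPK | 3,844 | 66 | 6 | 15 |
| KQRRPK | 6,957 | 15 | 5 | 14 |
| KRRBPK | 70 | 66 | 11 | 15 |
| KRRRPK | 2 | 15 | 8 | 14 |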

The positions are 'distinct positions', i.e. not convertible into each other by rotations/reflections of the board, or by switching two like men.
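
The 'like men' divisor mentioned above (2, 4, 6 or 24) is the product of the factorials of the counts of identical men. A minimal Python sketch, assuming 5-1 endgame names of the form used in the table (white king first, bare black king last); the function name `like_men_factor` is made up for illustration:

```python
from collections import Counter
from math import factorial, prod

def like_men_factor(endgame: str) -> int:
    """Number of ways identical white men can be swapped, e.g. KQQNNK -> 2! * 2! = 4."""
    white_men = endgame[1:-1]          # drop the two kings (assumes a 5-1 name)
    return prod(factorial(count) for count in Counter(white_men).values())

for name in ("KQRBNK", "KQQRBK", "KQQNNK", "KQQQRK", "KBBBBK"):
    print(name, like_men_factor(name))   # 1, 2, 4, 6, 24
```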
g ... http://www.tinyurl.com/law6k

Martin Kreuzer · Post by **Martin Kreuzer** » Mon Dec 11, 2006 4:40 pm

Hi all,

thanks to everybody for the wonderful contributions.

Guido: This is an excellent intro to your numbering scheme. I will try to download the code you indicated and study it! Do you have a link to
Nalimov's description of his numbering scheme on CCC? (I could not find it right away.)

Derek: The files I am suggesting are still the same (in this order):
kqrnnk, kqbnnk, kqnnpk, krbnnk, krnnpk.
When you are done computing, it would be nice to have the results on egtb51 (address above) if Kirill agrees. There is a directory /stats for the stats files.

Guy: Thanks a lot for these statistics. I will try to use feg -lll and see
whether I succeed in finding some of the maxDTM positions you are looking for.

I am already in the process of uploading further 5-1 EGTB. I will report when I am finished.

Greetings,
Martin Kreuzer

clocks · Post by **clocks** » Mon Dec 11, 2006 6:39 pm

I will commit to doing those. I am generating them right now; I have to do some preliminary ones first, but will get those done quickly.

Will let you know when I am done, and upload them to wherever you want!

Derek

vb4 · Post by **vb4** » Tue Dec 12, 2006 3:41 am

Hello Guido,

I sent you an email but figured you may get my message faster here. I was wondering if you could send me just the stats files for all 43 (5-1) EGTBs that you have done up till now.

Appreciate it,

Les

guido · Post by **guido** » Wed Dec 13, 2006 1:37 pm

Hi Guy and all,

I have some different results from my TB generator compared with those you reported, which do not seem explainable by the different legality tests that I do, even if that is not impossible.

I have attached 44 stats files in ASCII in two different formats: the first (in stv.zip) obtained directly from my program, and the second (in stn.zip) in the same format as Nalimov's stats, obtained by transforming the first.
The symbols for the men used in the file names are Italian/Spanish (R = King, D = Queen, T = Rook, A = Bishop, C = Knight).

For instance, for KBBBBK (i.e. RAAAAR) my program obtains 80 29 13 14 while FEG gives 80 8 13 16.

This means that for FEG there are 8 btm losses for Black (wins for White) in 16 moves, while for me the max DTM is reached by 29 positions in 14 moves.

The possible conclusions are:

- All the losses in 16 and 15 moves (if any exist) are illegal.
- One of the two programs has an error (probably mine).

So I ask whether it is possible to extract the 8 btm losses, to see if they are illegal, or to check the correct number of moves. There are many other cases to explain, with different numbers of positions and moves between the two generators.

For Martin Kreuzer

The program gafsdim was written to compute the number of pseudolegal chess positions for different numbers of men, up to 32. The number obtained for the total chess positions is generally greater than that reported by others.
In the computation I used my indexing scheme, but to improve the computation I added an option for eliminating the double occupation of a square by a Pawn and a King. This reduces the number by a significant percentage (about 20-30 %).
For Nalimov's indexing scheme, I think Guy can give you more information than I can.

Ciao a tutti
Guido

guido · Post by **guido** » Wed Dec 13, 2006 11:04 pm

I saw that the attachments in my preceding message, containing the stats of 44 5-1 men TBs, are not downloadable.
I will try to send them again.
Guido

Martin Kreuzer · Post by **Martin Kreuzer** » Thu Dec 14, 2006 7:11 am

Hi all,

now I have computed and uploaded 5 new EGTB to egtb51:

kqbbnk, kqbbpk, krbbnk, krrnpk, and krrppk.

They are complete with stats files and md5sums, as usual. The total
count is 45 now.

Guido: Thank you for your stats files. I have experimented with
FEG -LL and it extracts these positions nicely. I will do this for kbbbbk soon
and more systematically over the Christmas break.

Ciao, cordial greetings,
Martin

guyhaw · Post by **guyhaw** » Thu Dec 14, 2006 8:39 am

Two thoughts:

1) it has to be remembered that it may be the loser who makes the 'conversion' to a losing endgame, being forced to capture or promote.

2) If Guido has made any tests (extra to those made by FEG) about illegality of positions, that is likely to affect the move-counts and maxDTMs. A position excluded as 'illegal' in KBBBK will affect the statistics for KBBBBK.
g

guyhaw · Post by **guyhaw** » Thu Dec 14, 2006 10:24 am

Guido,
I guess you mean you have a version of your stats in 'Nalimov style format': Nalimov never produced any 5-1 DTM EGT stats.
Note that KBBBK has (Nalimov provided) maxDTM =16 (wtm, 2 positions) and maxDTM = 19 (btm, 45 positions). So it is quite likely that Black can immediately capture into one of the two KBBBK maxDTM positions from KBBBBK. So I suspect that the FEG data is correct.
It is precisely because there is no agreement on eliminating 'unreachable but not really obviously illegal' positions that no attempt has been made to do it in the past. Statistics should be reproducible in alternative technologies.
g

guido · Post by **guido** » Thu Dec 14, 2006 4:47 pm

Hi Guy,

the files in "Nalimov's format" are exactly what you say. It wasn't my intention to create ambiguity. They are my statistics converted to the format used by Nalimov. I did this for people who want to compare the results of Nalimov's tbgen with those of gafs. But as EN didn't generate the 5-1 TBs, this would be possible only with FEG statistics transposed into 'Nalimov style format'. Translating my stats into FEG format would have been a little more difficult.
In order to avoid any confusion I gave completely different names and suffixes to my stats in Nalimov style format, as the names of my files don't contain 'k'. In any case, if this can create problems I will ask Kirill to remove stn.zip from the CCRL. Please give me your opinion and advice.

For KBBBK gafs gives the following results:

maxDTM =16 (wtm, 2 positions) and maxDTM = 19 (btm, 44 positions)

but this doesn't guarantee that KBBBBK has the same maxDTMs. In fact, if the 44 positions in KBBBK in general have a capture as the first move of the bK (I checked only some cases), then in KBBBBK the bK would have to make two successive captures! Possible but not probable.
For this reason it would be interesting to extract the positions in KBBBBK with the maxDTM to see if they are actually legal. I don't know if this is possible for FEG files by means of a program. If I have these positions and they are clearly legal, I can see what my program gives as the result for them and also play these endgames out, choosing the best moves at each step. My stats come from an error-free verification phase, which should be the same for tbgen and FEG as well; it consists of checking the consistency of all the results.

About the differences in the number of legal positions, the endgame KBBBBK can be considered typical, because I eliminate all the double (and obviously triple and quadruple) checks as illegal. The same happens for KNNNNK, while for KRRRRK and KQQQQK this must be done carefully, because a promotion can produce a double check in certain positions.

I chose to eliminate some types of illegal positions in order to reduce the CPU time.
It could be interesting to have a program that solves positions with the TBs of different authors, but I think it is difficult to do.

Ciao
Guido

guyhaw · Post by **guyhaw** » Thu Dec 14, 2006 5:31 pm

For KBBBK maxDTM 1-0 positions, I am seeing these stats:
(gafs) wtm 2 @ 16, and btm 44 @ 19
(EN) wtm 2 @ 16, and btm 45 @ 19
The difference could be the '2 Ks on a1-h8' issue, or your pursuit of less obviously unreachable positions. There are 'single checks' that are unreachable too, e.g. wBs on a1, b2 and c3 - and bK on d4 ... and I think there is no obvious place to stop in this pursuit of unreachables - a good reason not to start in the first place.

For KBBBBK maxDTM 1-0 positions, I am seeing these stats:
(gafs) wtm 80 @ 13, and btm 29 @ 14
(FEG) wtm 80 @ 13, and btm 13 @ 16
It is possible, but we don't know, that the '13' btm positions at DTM=16 convert into the 2 wtm KBBBK DTM=16 positions.

An optimal wtm KBBBK position is 8/8/8/3B4/3B4/3k4/3B4/3K4 w, and Black would have had to be in an unreachable double-check if it had captured a B on its present square on the previous move. Maybe the other wtm maxDTM position is open to the same criticism, so maybe the btm KBBBBK positions at DTM=16/15 have been eliminated. If you instrument gafs to tell you about eliminated positions, you will be able to find out.
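
The double-check claim for this position can be checked mechanically. The sketch below is an illustration only (the helper names are hypothetical and it handles just kings and bishops): it puts a white bishop back on the bK's square, tries every square the bK could have captured from, skips squares that are occupied or adjacent to the wK, and counts the bishops giving check on each remaining square.

```python
def parse_fen_board(fen):
    """Return {(file, rank): piece} from the board field of a FEN string."""
    board = {}
    for i, row in enumerate(fen.split()[0].split("/")):
        rank, file = 7 - i, 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)          # run of empty squares
            else:
                board[(file, rank)] = ch
                file += 1
    return board

def bishop_attacks(bishop, target, occupied):
    """True if a bishop on `bishop` attacks `target` with no man in between."""
    df, dr = target[0] - bishop[0], target[1] - bishop[1]
    if df == 0 or abs(df) != abs(dr):
        return False
    step_f, step_r = df // abs(df), dr // abs(dr)
    f, r = bishop[0] + step_f, bishop[1] + step_r
    while (f, r) != target:
        if (f, r) in occupied:
            return False
        f, r = f + step_f, r + step_r
    return True

def capture_predecessors(fen):
    """Map each possible origin square of a bK capture to its number of checkers."""
    board = parse_fen_board(fen)
    bk = next(s for s, p in board.items() if p == "k")
    wk = next(s for s, p in board.items() if p == "K")
    # Un-capture: a white bishop stood on the bK's present square.
    bishops = [s for s, p in board.items() if p == "B"] + [bk]
    occupied = set(bishops) | {wk}
    result = {}
    for df in (-1, 0, 1):
        for dr in (-1, 0, 1):
            src = (bk[0] + df, bk[1] + dr)
            if (df, dr) == (0, 0) or not (0 <= src[0] < 8 and 0 <= src[1] < 8):
                continue
            if src in board:
                continue  # square already holds a white man
            if max(abs(src[0] - wk[0]), abs(src[1] - wk[1])) <= 1:
                continue  # kings would be adjacent
            name = "abcdefgh"[src[0]] + str(src[1] + 1)
            result[name] = sum(bishop_attacks(b, src, occupied) for b in bishops)
    return result

print(capture_predecessors("8/8/8/3B4/3B4/3k4/3B4/3K4 w"))
# {'c3': 2, 'c4': 2, 'e3': 2, 'e4': 2}
```

Every remaining origin square has the bK attacked by two bishops, which is the double check referred to above.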
g

guyhaw · Post by **guyhaw** » Fri Dec 15, 2006 10:55 am

Equal or fewer positions if they agree on maxDTM, and GAF's maxDTM is less than or equal to FEG's.
There are some endgames where the GAF count of positions is exactly half of FEG's, which rather causes me to wonder why that is.
Season's greetings to all - g

guido · Post by **guido** » Fri Dec 15, 2006 9:51 pm

guyhaw wrote:For KBBBK maxDTM 1-0 positions, I am seeing these stats:
(gafs) wtm 2 @ 16, and btm 44 @ 19
(EN) wtm 2 @ 16, and btm 45 @ 19
The difference could be the '2 Ks on a1-h8' issue, or your pursuit of less obviously unreachable positions. There are 'single checks' that are unreachable too, e.g. wBs on a1, b2 and c3 - and bK on d4 ... and I think there is no obvious place to stop in this pursuit of unreachables - a good reason not to start in the first place.

For KBBBBK maxDTM 1-0 positions, I am seeing these stats:
(gafs) wtm 80 @ 13, and btm 29 @ 14
(FEG) wtm 80 @ 13, and btm 13 @ 16
It is possible, but we don't know, that the '13' btm positions at DTM=16 convert into the 2 wtm KBBBK DTM=16 positions.

An optimal wtm KBBBK position is 8/8/8/3B4/3B4/3k4/3B4/3K4 w, and Black would have to have been in an unreachable double-check if it had captured a B where it is on the previous move. Maybe the other wtm maxDTM position has the same criticism, so maybe the btm KBBBBK positions at DTM=16/15 have been eliminated. If you instrument gafs to tell you about eliminated positions, you will be able to find out.
g

Hi Guy,

when I started some years ago there were only a few generators and little information about them, so I decided to mark as illegal as many positions as possible, within the limits of the CPU time needed to decide legality, without taking into account the possibility of comparing results with other generators.

Your example (e.g. wBs on a1, b2 and c3 - and bK on d4) is IMHO wrong, because this position is legal if the preceding white move was, for instance, Bd2xc3.

About the second point, I said that it was possible but not probable, if we suppose that the first two moves of the bK in KBBBBK btm must be captures of two wBs. For this the bK must initially be close to at least 3 bishops. The black king captures one wB with its first move and must then be so close to the other two wBs that one of them can still be captured after any white move. But this is impossible, as 2 of the 3 wBs are necessarily on squares of the same colour and can easily defend each other.

About the third point you are right, because only two backmoves can show that such a position is illegal.
It would be easy to add a new option to gafs for listing only the illegal positions, but I don't know if it would be useful because the list would be very long for 5-6 men TBs.
At present gafs can print all the legal positions with their results and, analogously, all the positions (legal + illegal) with their results. So by comparing the two files it is possible to extract the illegal ones. This is practical for 3 and 4 men endgames. I tried something with KPK to see which positions are illegal under the backmove test (for instance, I found that all positions where the bK is in check from a pawn on rank 2 are illegal), and in this particular endgame I think, but I'm not sure, that this test together with the usual criteria should exclude all the illegal positions.
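
Comparing the two listings amounts to a set difference. The Python sketch below assumes a line format that is not taken from gafs: each line is a FEN board field, the side to move and a result token, separated by spaces. The sample lines and the result tokens R1 and R2 are made-up placeholders, not real output.

```python
def position_key(line):
    """Identify a position by its first two fields (board and side to move)."""
    return " ".join(line.split()[:2])

def illegal_positions(all_lines, legal_lines):
    """Return the lines of the full listing whose position is not in the legal listing."""
    legal = {position_key(line) for line in legal_lines if line.strip()}
    return [line.rstrip("\n") for line in all_lines
            if line.strip() and position_key(line) not in legal]

# Hypothetical KPK sample: in the first position the bK on b3 is in check
# from a pawn on a2, the kind of position the backmove test rejects.
all_lines = [
    "8/8/8/8/8/1k6/P7/K7 b R1",
    "8/8/8/8/8/2k5/P7/K7 b R2",
]
legal_lines = [
    "8/8/8/8/8/2k5/P7/K7 b R2",
]
print(illegal_positions(all_lines, legal_lines))
# ['8/8/8/8/8/1k6/P7/K7 b R1']
```

With real files the two arguments can simply be open file objects, since the function only iterates over lines.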

Ciao
Guido

P.S. Can you indicate exactly which endgames give the most difficult-to-explain differences in maxDTM between FEG and gafs? We can investigate these endgames in detail if the number of positions is not too high.

guyhaw · Post by **guyhaw** » Sun Dec 17, 2006 9:29 am

I've attached a zipped .xls spreadsheet with the FEG and GAF 5-1 stats. Some other data there too.
The LOG/LOF ratio of FEG's stats intrigues, as does the comparison of FEG and GAF results after less-obvious unreachable positions are removed from the GAF results.
Some examples of maxDTM positions would be useful - to add to another spreadsheet.
Would also be interesting to see if GAF produces the same stats as Nalimov and/or FEG when running without advanced unreachability-checking. Nalimov and FEG stats line up easily enough on P-ful endgames.
My error re the 'Bishops in a line' unreachability: it wasn't. Never trust my chessic intuition - that's why I leave this kind of thing to MB etc.
g

guido · Post by **guido** » Sun Dec 17, 2006 11:47 pm

guyhaw wrote:I've attached a zipped .xls spreadsheet with the FEG and GAF 5-1 stats. Some other data there too.
The LOG/LOF ratio of FEG's stats intrigues, as does the comparison of FEG and GAF results after less-obvious unreachable positions are removed from the GAF results.
Some examples of maxDTM positions would be useful - to add to another spreadsheet.
Would also be interesting to see if GAF produces the same stats as Nalimov and/or FEG when running without advanced unreachability-checking. Nalimov and FEG stats line up easily enough on P-ful endgames.
My error re the 'Bishops in a line' unreachability: it wasn't. Never trust my chessic intuition - that's why I leave this kind of thing to MB etc.
g

Hi Guy,

thank you very much for the interesting attachment.
In comparing the FEG and gafs stats I noticed a small error in the KQRRBK endgame: the btm maxDTM for gafs is 7 and not 14. In fact, gafs can only ever find maxDTM values less than or equal to the corresponding FEG values. Moreover, if the maxDTM is the same as FEG's, the number of positions must always be less than or equal to FEG's number.
It is interesting to note that the wtm maxDTMs are always the same for FEG and gafs, as was to be expected because obviously the wK is never in check. So this is also a good point in favour of the correctness of my TBs.
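
These expectations can be turned into a small sanity check on a stats table. The Python sketch below is only an illustration: the function name and the data layout (a count and a maxDTM per side to move) are assumptions, while the figures are the KBBBK and KBBBBK numbers quoted earlier in this thread.

```python
def compare_stats(gafs, other):
    """Return a list of rule violations; each argument maps 'wtm'/'btm' to (count, maxDTM)."""
    problems = []
    # The wtm maxDTM is expected to be identical in both generators.
    if gafs["wtm"][1] != other["wtm"][1]:
        problems.append("wtm maxDTM differs")
    for side in ("wtm", "btm"):
        g_count, g_dtm = gafs[side]
        o_count, o_dtm = other[side]
        if g_dtm > o_dtm:
            problems.append(f"{side}: gafs maxDTM {g_dtm} exceeds {o_dtm}")
        elif g_dtm == o_dtm and g_count > o_count:
            problems.append(f"{side}: gafs count {g_count} exceeds {o_count}")
    return problems

# Figures quoted in this thread (count @ maxDTM).
kbbbk_gafs = {"wtm": (2, 16), "btm": (44, 19)}
kbbbk_en = {"wtm": (2, 16), "btm": (45, 19)}
kbbbbk_gafs = {"wtm": (80, 13), "btm": (29, 14)}
kbbbbk_feg = {"wtm": (80, 13), "btm": (13, 16)}

print(compare_stats(kbbbk_gafs, kbbbk_en))      # []
print(compare_stats(kbbbbk_gafs, kbbbbk_feg))   # []
```

A spreadsheet slip such as a gafs maxDTM entered higher than the FEG value would show up as a non-empty list.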

I ran a test with the legality test on checks removed (but not the symmetry on the main diagonal) and repeated the calculation of KBBBBK.
The results are consistent with FEG, because the btm maxDTM is now 16 and there are 8 positions as in FEG.
For losses in 15 moves there are 4 positions. All these positions (8 + 4) are illegal because of a double check by two bishops. Therefore gafs seems to be in agreement with FEG.

I attached the ASCII file kbbbbk.txt containing the new statistics of the endgame and the list of the positions in FEN format.

Ciao
Guido

P.S. I see that you didn't have any problem with my old attachments, which had the Unix and not the Windows EOL. The new attachments should have the Windows EOL.

guyhaw · Post by **guyhaw** » Mon Dec 18, 2006 12:32 am

... guido, for pointing out my finger-slip. Must have copied a cell down and forgotten to change the content - dangerous practice

All looks good for the correctness of your GAF EGTs. But do you have an independent way of verifying your EGTs?
Other maxDTM positions (at 'FEG' maxDTMs) will be interesting: no hurry.
g
