7-men EGTB Bounty

Endgame analysis using tablebases, EGTB generation, exchange, sharing, discussions, etc..
kronsteen
Posts: 88
Joined: Fri Aug 01, 2008 11:20 am
Sign-up code: 0
Location: France

Re: 7-men EGTB Bounty

Post by kronsteen »

According to what has been already discussed, I would like to submit some first-hand text in order to help starting building a project.
Maybe it would also be the opportunity (if not too ambitious) to try to raise "standard" terminology about won positions forced into draw by 50-move rule, provided no one already exists (I've no idea of this). My opinion is terminology may make it relatively clear that those positions are nothing else and nothing more than theoretical draws, strictly chessically speaking.

Feel free to use it and correct anything you want. Apologies if you find any flaws.
Hope this helps. k


*******************

1. Format

Four new different formats for tablebases are introduced :

WDL50 format
DTZ50 format
DOS format
MDTZ format

These are explained as follows.

All Tablebases are built on the assumption that evaluated positions come from a zeroing move, i.e 50-move rule clock is set to zero.
All Tablebases make the assumption that neither side on evaluated positions has any castling rights, even if initial rook+king position is compatible of castling
All Tablebases make the assumption that evaluated positions contain no opportunity for en-passant pawn capture for any side, even if pawn position is compatible (but en-passant rule naturally must be considered in subsequent play)

1.1 WDL50 Tablebases (Win-Draw-Loss, 50 move rule on)

Tablebase on WDL50 format will provide information for every legal position about whether it is :
- a win, i.e moving side is able to force a win within a line of play which is not drawn under 50-move rule
- a loss, i.e moving side is unable to escape defeat within a line of play which is not drawn under 50-move rule
- a draw, which is every position being neither a win nor a loss.


1.2 DTZ50 Tablebases (Distance To next Zeroing move, 50 move rule on)

A DTZ50 Tablebase is a WDL50 tablebase containing metric information about won/lost positions.
DTZ50 Tablebase provides, for every winning/losing position, the smallest number of HALF-moves within which winning side can force either mate or a zeroing move (pawn move, promotion, capture) leading to a still won position.
DTZ50 Tablebases contain no quantitative information about drawn positions, whatever their type (genuine draw or marginal draw, see below)


1.3 DOS Tablebases (Denial-Ordinary-Save)

Tablebase on DOS format will provide information for every legal position about whether it is :
- a denial (50-move rule denies opportunity to win for moving side), which is a winnable position for moving side off 50-move rule, but unwinnable under 50-move rule
- a save (50-move rule gives opportunity for moving side to save himself from loss), which is a position from which moving side cannot avoid defeat off 50-move rule, but can do it under 50-move rule
- an ordinary position, which is neither of these.

Drawn ordinary positions are genuine draws : on these positions, both sides can avoid defeat but neither can force mate, regardless of the 50-move rule.
Denials and saves are called "marginals" (or marginal positions).

DOS Tablebases are to be added to WDL Tablebases (but separately).


1.4 MDTZ Tablebases (Marginal position Distance To next Zeroing move)

MDTZ Tablebases provide metric information about marginal positions.
For every marginal position, MDTZ gives the smallest number of half-moves within which winning side can force either mate or a zeroing move leading to a still winning position off 50-move rule (i.e a true win or a denial).
MDTZ Tablebases contain no quantitative information about ordinary positions.

2. Project

Project is to build following tables :

Task 1 : 3 to 6 men WDL50
Task 2 : 3 to 6 men DTZ50
Task 3 : 5-6 men DOS (DOS tablebases for 3-4 men are empty)
Task 4 : 5-6 men MDTZ
Task 5 : 7-men WDL50
Task 6 : 7-men DTZ50 on some endings of interest
Task 7 : 7-men DOS
Task 8 : 7-men MDTZ on some endings of interest

T1 needs no previous data. It is immediate if T2 is already achieved.
T2 needs no previous data. x-men DTZ50 generation needs at least complete WDL50 tables for 3 to x-1 men.
T3 is immediate if T1 is already achieved, via comparison with existing DTM nalimov tablebases.
T4 shall be performed after T3 in order to run algorithm only on marginal positions
T5 needs T1 and only T1
T6 needs T1, and T5 on every daughter 7-men ending (daughter = reachable ending after one or several pawn promotions)
T7 needs T5
T8 needs T3, and T7 on every daughter 7-men ending.

All types of 3-6 men endings are to be considered (21 22 31 32 41 33 42 51). For 7-men, only 43 and 52 may be considered.
User avatar
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

Re: 7-men EGTB Bounty

Post by Kirill Kryukov »

kronsteen wrote:Four new different formats for tablebases are introduced :

WDL50 format
DTZ50 format
DOS format
MDTZ format

These are explained as follows.

All Tablebases are built on the assumption that evaluated positions come from a zeroing move, i.e 50-move rule clock is set to zero.
All Tablebases make the assumption that neither side on evaluated positions has any castling rights, even if initial rook+king position is compatible of castling
Please note that separate tables for positions with castling rights can be constructed and added later, if the "composers" will need it. I agree that it's better to not include castling in the formal requirement for the first step of 7-men tables generation, but it may be a welcome future addition.
kronsteen wrote:All Tablebases make the assumption that evaluated positions contain no opportunity for en-passant pawn capture for any side, even if pawn position is compatible (but en-passant rule naturally must be considered in subsequent play)
Here I have to disagree. En-passant is a very important part of many endgames. Without it the tablebases simply can not be trusted in many endgames with pawns. I think that en-passant is a must and should be in the requirement for a 7-men generator.
kronsteen wrote:1.1 WDL50 Tablebases (Win-Draw-Loss, 50 move rule on)

Tablebase on WDL50 format will provide information for every legal position about whether it is :
- a win, i.e moving side is able to force a win within a line of play which is not drawn under 50-move rule
- a loss, i.e moving side is unable to escape defeat within a line of play which is not drawn under 50-move rule
- a draw, which is every position being neither a win nor a loss.


1.2 DTZ50 Tablebases (Distance To next Zeroing move, 50 move rule on)

A DTZ50 Tablebase is a WDL50 tablebase containing metric information about won/lost positions.
DTZ50 Tablebase provides, for every winning/losing position, the smallest number of HALF-moves within which winning side can force either mate or a zeroing move (pawn move, promotion, capture) leading to a still won position.
DTZ50 Tablebases contain no quantitative information about drawn positions, whatever their type (genuine draw or marginal draw, see below)

1.3 DOS Tablebases (Denial-Ordinary-Save)

Tablebase on DOS format will provide information for every legal position about whether it is :
- a denial (50-move rule denies opportunity to win for moving side), which is a winnable position for moving side off 50-move rule, but unwinnable under 50-move rule
- a save (50-move rule gives opportunity for moving side to save himself from loss), which is a position from which moving side cannot avoid defeat off 50-move rule, but can do it under 50-move rule
- an ordinary position, which is neither of these.

Drawn ordinary positions are genuine draws : on these positions, both sides can avoid defeat but neither can force mate, regardless of the 50-move rule.
Denials and saves are called "marginals" (or marginal positions).

DOS Tablebases are to be added to WDL Tablebases (but separately).
Is "DOS" the same with what I called WDL(>50)? I don't like the name "DOS" because DOS is used for many other things already.
kronsteen wrote:1.4 MDTZ Tablebases (Marginal position Distance To next Zeroing move)

MDTZ Tablebases provide metric information about marginal positions.
For every marginal position, MDTZ gives the smallest number of half-moves within which winning side can force either mate or a zeroing move leading to a still winning position off 50-move rule (i.e a true win or a denial).
MDTZ Tablebases contain no quantitative information about ordinary positions.
Is this the same with what I called DTZ(>50)?
kronsteen wrote:2. Project

Project is to build following tables :

Task 1 : 3 to 6 men WDL50
Task 2 : 3 to 6 men DTZ50
Task 3 : 5-6 men DOS (DOS tablebases for 3-4 men are empty)
Task 4 : 5-6 men MDTZ
Task 5 : 7-men WDL50
Task 6 : 7-men DTZ50 on some endings of interest
Task 7 : 7-men DOS
Task 8 : 7-men MDTZ on some endings of interest
I think that 3-to-7 men WDL50 computation can progress regardless of any other tables. Building any DTZ50, WDL(>50) and DTZ(>50) will be up to those who need them. I don't think we can effectively plan the computation order. Every table will be generated by some volunteers running the generator on their own computers, so it will be their choice what to generate next.
kronsteen wrote:T1 needs no previous data. It is immediate if T2 is already achieved.
T2 needs no previous data. x-men DTZ50 generation needs at least complete WDL50 tables for 3 to x-1 men.
T3 is immediate if T1 is already achieved, via comparison with existing DTM nalimov tablebases.
T4 shall be performed after T3 in order to run algorithm only on marginal positions
T5 needs T1 and only T1
T6 needs T1, and T5 on every daughter 7-men ending (daughter = reachable ending after one or several pawn promotions)
T7 needs T5
T8 needs T3, and T7 on every daughter 7-men ending.

All types of 3-6 men endings are to be considered (21 22 31 32 41 33 42 51). For 7-men, only 43 and 52 may be considered.
I think that the generator certainly has to be able to handle 61 and 61p tables, even if no one decides to compute them.
kronsteen
Posts: 88
Joined: Fri Aug 01, 2008 11:20 am
Sign-up code: 0
Location: France

Re: 7-men EGTB Bounty

Post by kronsteen »

Kirill Kryukov wrote:kronsteen wrote:
All Tablebases make the assumption that evaluated positions contain no opportunity for en-passant pawn capture for any side, even if pawn position is compatible (but en-passant rule naturally must be considered in subsequent play)

Here I have to disagree. En-passant is a very important part of many endgames. Without it the tablebases simply can not be trusted in many endgames with pawns. I think that en-passant is a must and should be in the requirement for a 7-men generator.
I never considered cancelling en-passant rule. I thought only that on initial positions containing one or more en-passant potential rights it might be difficult for programmers to take every possibility into account. But I'm not a programmer myself.

Kirill Kryukov wrote:kronsteen wrote:
2. Project

Project is to build following tables :

Task 1 : 3 to 6 men WDL50
Task 2 : 3 to 6 men DTZ50
Task 3 : 5-6 men DOS (DOS tablebases for 3-4 men are empty)
Task 4 : 5-6 men MDTZ
Task 5 : 7-men WDL50
Task 6 : 7-men DTZ50 on some endings of interest
Task 7 : 7-men DOS
Task 8 : 7-men MDTZ on some endings of interest

I think that 3-to-7 men WDL50 computation can progress regardless of any other tables. Building any DTZ50, WDL(>50) and DTZ(>50) will be up to those who need them. I don't think we can effectively plan the computation order. Every table will be generated by some volunteers running the generator on their own computers, so it will be their choice what to generate next.
Yes I didn't want to suggest any special order. Wasn't so clear. Sorry for the ambiguity.
Kirill Kryukov wrote:Is "DOS" the same with what I called WDL(>50)? I don't like the name "DOS" because DOS is used for many other things already.
Kirill Kryukov wrote:Is this the same with what I called DTZ(>50)?
It is. Idea of separating [WDL(>50)+DTZ(>50)] from [WDL50+DTZ50] feels good.
User avatar
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

Re: 7-men EGTB Bounty

Post by Kirill Kryukov »

kronsteen, some detailed write up like this will be very useful when we will get closer to deciding the requirements. Thanks for your work! I will also write a summary when we are closer to final version of the requirements. My writing is usually brief and technical, so some more friendly text will be a big help when talking to people. :-)
kronsteen wrote:I never considered cancelling en-passant rule. I thought only that on initial positions containing one or more en-passant potential rights it might be difficult for programmers to take every possibility into account. But I'm not a programmer myself.
Can a correct en-passant-aware tablebase be generated without considering en-passant rights in root? (Or without storing the positions with the en-passant right). I don't think so. Please correct me if I am wrong.
kronsteen
Posts: 88
Joined: Fri Aug 01, 2008 11:20 am
Sign-up code: 0
Location: France

Re: 7-men EGTB Bounty

Post by kronsteen »

Kirill Kryukov wrote:Can a correct en-passant-aware tablebase be generated without considering en-passant rights in root? (Or without storing the positions with the en-passant right). I don't think so. Please correct me if I am wrong.
Ah yes. You're correct on both. I missed second argument, positions with en-passant rights must absolutely be stored somewhere, of course. Asks for more complexity, but has already been overcome by existing TBs :) .
jkominek
Posts: 150
Joined: Mon Dec 04, 2006 9:02 am
Sign-up code: 0
Location: Pittsburgh, PA

Re: 7-men EGTB Bounty

Post by jkominek »

kronsteen wrote:
Kirill Kryukov wrote:Can a correct en-passant-aware tablebase be generated without considering en-passant rights in root? (Or without storing the positions with the en-passant right). I don't think so. Please correct me if I am wrong.
Ah yes. You're correct on both. I missed second argument, positions with en-passant rights must absolutely be stored somewhere, of course. Asks for more complexity, but has already been overcome by existing TBs :) .
EP adds complexity, but I wouldn't go so far as to say it is "overcome". Maybe it is in the closed work of Diepeveen and de Koning, but no not totally in Nalimov. There are no routines for handling 3 vs 2 pawns or 4 pawns vs 1. Which denies the summit tables kpppkpp and kppppkp.

A tough call for the evaluation group. Is the award contingent on demonstrating proper handling of all various opposing pawn forces, when this cannot be realized with a complete build of the 7-man tables? They might have to resort to code review and the notion of staggered payout.

john
syzygy
Posts: 166
Joined: Sun Aug 03, 2008 4:02 pm

Re: 7-men EGTB Bounty

Post by syzygy »

jkominek wrote:
syzygy wrote:For the file format itself it shouldn't matter a lot, but it may complicate parts of the generator and increase the size and/or number of intermediate files. So I'd put it under "difficulty of writing code" rather than the (final) file format.
It certainly is that. I also argue that, as a consequence, validation becomes more tricky. There are no existing tables of exactly the same kind to cross-check against. I've been expecting guyhaw to weigh in on this matter with a wavy-pointy finger, but he so far has not.
I would not expect getting a wavy-pointy finger for suggesting to compute something which has not been computed before.

There are standard ways of verifying internal consistency of tablebases. Those can be trivially adapted to DTZ50+. DTZ50+ tables can also be verified at either WDL- or WDL50-level with other tables. Comparing full stats between tables is tricky anyway unless both tables were generated with the same indexing function. I don't believe an a priori choice for Nalimov's way of indexing is necessarily a good choice.
jkominek wrote:
syzygy wrote:Btw, my notation DTR(k) (for those that are still with me ;)) should be changed into DTZ(k). Actually, it also possible to talk about DTC(k) and DTM(k).
DTZ(k): pairs (min{i : p is won under k_i-move rule}, distance to zeroing move)
...
Kronsteen's table = DTZ(50<inf).
It's not typical for me to discourse negatively ... but, I'm sorry, your notation sucks. For all positions p in the chess space X, DTZ(k) is a function of the independent variable k, where k is any positive integer, but for the purpose of finiteness we can limit it to be lower than say 2^15.
Please read what I write carefully. I'll repeat my definition once more:
k does not stand for an integer; (k) stands for a (possibly infinite) sequence 0 < k_1 < k_2 < k_3 < ... < k_n. The DTZ(k)-metric for a position p is given by the pair: (min{i : p is won under k_i-move rule}, distance to zeroing move)
The ordering on this metric is lexicographic.

With this definition, it suddenly becomes clear how various metrics are related to each other:
DTZ = DTZ(inf)
DTZ50 = DTZ(50)
DTZ50+ = DTZ(50 < inf)
DTR = DTZ(1 < 2 < 3 < 4 < 5 < ...)

I'd like to hear your opinion of my notation again, but only after you've taken the time to study it.
Last edited by syzygy on Thu Aug 07, 2008 10:35 pm, edited 1 time in total.
syzygy
Posts: 166
Joined: Sun Aug 03, 2008 4:02 pm

Re: 7-men EGTB Bounty

Post by syzygy »

Kirill Kryukov wrote:WDL50 should an order of 100 times less than DTZ50.
WDL50 will be significantly smaller than DTZ50, but I wonder where you get the factor 100 from.

I think for my 5-piece tables the factor is about 6. On 7-piece the factor would probably be higher. Hard to say, maybe 15, but not 100. (Actually, on 61 tables the factor will be far exceeding 100, as a 61 WDL table will resemble an "all win"-table, especially if you set stalemate positions to don't care. These tables can be compressed into "nothing", which is pretty good. Most 52 WDL-tables will be really small as well, but the 43 tables are a different story.)

About DTZ/DTZ50/DTZ50+: I realize that you are looking at this from a different perspective. What you want is a set of minimum requirements that will still make you happy. From that point of view I certainly understand a preference for DTZ50.

However, just in case someone with the required skills and a huge amount of spare time stands up and starts coding a 7-piece generator, it would at least be nice if that person is aware of the possibility of using the DTZ50+-metric.

Previously I had not been aware that DTZ50 and DTZ were reconcilable in a single table without having to significantly sacrifice on space efficiency. I'm surely not the first to realize that it can be done (e.g. in the mean time I have found a paper by guyhaw that briefly mentions the possibility of DTZ50 "extended with" DTR), but its practical feasibilty is not entirely trivial and can be easily overlooked.

Having a DTZ50-generator would be great. Having a DTZ50+-generator would just be greater. How much greater depends on personal preference. (I personally prefer DTZ over DTZ50, but I agree that DTZ turns the 50-move rule into a nagging problem.)
The pro-50-move-rule camp will keep complaining about the bloat in the code and tables.
Well, I'm afraid that nobody (as in 99.99%) will understand the code or the tables anyway. Most people believe that a tablebase is a file with a list of all board positions and for each position the best move in text format, and they wonder why 10-piece tables still have not been computed. That's the way it is.
syzygy
Posts: 166
Joined: Sun Aug 03, 2008 4:02 pm

Re: 7-men EGTB Bounty

Post by syzygy »

kronsteen wrote:DTZ50 Tablebase provides, for every winning/losing position, the smallest number of HALF-moves within which winning side can force either mate or a zeroing move (pawn move, promotion, capture) leading to a still won position.
It is sufficient to store the number of FULL-moves. It will matter significantly in compressed size and it is precise enough to correctly deal with the 50-move rule.
Kirill Kryukov wrote:Can a correct en-passant-aware tablebase be generated without considering en-passant rights in root? (Or without storing the positions with the en-passant right). I don't think so. Please correct me if I am wrong.
Yes it can. None of my tables store en-passant positions, yet my probing code will always return the correct result. The trick is to do a one-ply search when probing an en-passant position (and this can easily be hidden in the probing code).

Also remember that tables with pawns will be sliced into pawn-slices during the generation. Within a slice there are no en-passant positions. Storing or not storing en-passant positions only affects the code for glueing the slices together and whatever the choice, en-passant will affect that code anyway. In my experience (e.g. on PPPPvP and PPPvPP in suicide chess), not storing the en-passant positions is easy enough (and it saves some space).
User avatar
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

Re: 7-men EGTB Bounty

Post by Kirill Kryukov »

jkominek wrote:A tough call for the evaluation group. Is the award contingent on demonstrating proper handling of all various opposing pawn forces, when this cannot be realized with a complete build of the 7-man tables? They might have to resort to code review and the notion of staggered payout.

john
One possible idea. If the generator can produce WDLk tables, we may try to generate WDL2 or WDL3 tables for the whole 7-men. This will allow us to test the en-passant in kpppkpp and kppppkp. Some test suite for this can be created, I think. Generating and storing WDL3 should be much easier than WDL50. Not sure if it's easy enough for evaluation board to do, or if it's easy to create a test suite for checking the en-passant correctness using just WDL3. WDL5 should be totally enough though.
User avatar
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

Re: 7-men EGTB Bounty

Post by Kirill Kryukov »

syzygy wrote:
Kirill Kryukov wrote:Can a correct en-passant-aware tablebase be generated without considering en-passant rights in root? (Or without storing the positions with the en-passant right). I don't think so. Please correct me if I am wrong.
Yes it can. None of my tables store en-passant positions, yet my probing code will always return the correct result. The trick is to do a one-ply search when probing an en-passant position (and this can easily be hidden in the probing code).
Do you also use one-ply search during generation, each time you need a value for an EP position?
User avatar
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

Re: 7-men EGTB Bounty

Post by Kirill Kryukov »

syzygy wrote:
Kirill Kryukov wrote:WDL50 should an order of 100 times less than DTZ50.
WDL50 will be significantly smaller than DTZ50, but I wonder where you get the factor 100 from.
Not sure, probably from Vincent.
syzygy wrote:I think for my 5-piece tables the factor is about 6. On 7-piece the factor would probably be higher. Hard to say, maybe 15, but not 100. (Actually, on 61 tables the factor will be far exceeding 100, as a 61 WDL table will resemble an "all win"-table, especially if you set stalemate positions to don't care. These tables can be compressed into "nothing", which is pretty good. Most 52 WDL-tables will be really small as well, but the 43 tables are a different story.)

About DTZ/DTZ50/DTZ50+: I realize that you are looking at this from a different perspective. What you want is a set of minimum requirements that will still make you happy. From that point of view I certainly understand a preference for DTZ50.
What I want is a minimum requirement that will make everyone happy. :-)
syzygy wrote:However, just in case someone with the required skills and a huge amount of spare time stands up and starts coding a 7-piece generator, it would at least be nice if that person is aware of the possibility of using the DTZ50+-metric.
My head starts aching when I even begin to think about such a table. Note that I too wrote some generators for chess variants.

If the metric is simpler, more people may compete, which possibly may produce a technically better generator. Verifying the tablebases is much easier with the simpler metric. Note that there are numerous other issues (indexing, compression, en-passant, etc) which already need non-trivial solutions.

One most important point for splitting the DTZ50 and DTZ(>50) tables into different files is: There are two distinct camps. One camp (players) are those that need only DTZ50 and WDL50 tables, fastest possible access and smallest possible files. Another camp (composers and theorists) wants DTZ and WDL. Note that there are much more players than composers and theorists together (although theorists seem to prevail on this forum). So it makes sense to create a minimum tables that both camp can use, and additional tables that only the second camp will use.

Combining both metrics into a single format makes things unreasonably more complicated, and is not optimal for the players camp (majority).
syzygy wrote:Previously I had not been aware that DTZ50 and DTZ were reconcilable in a single table without having to significantly sacrifice on space efficiency. I'm surely not the first to realize that it can be done (e.g. in the mean time I have found a paper by guyhaw that briefly mentions the possibility of DTZ50 "extended with" DTR), but its practical feasibilty is not entirely trivial and can be easily overlooked.

Having a DTZ50-generator would be great. Having a DTZ50+-generator would just be greater. How much greater depends on personal preference. (I personally prefer DTZ over DTZ50, but I agree that DTZ turns the 50-move rule into a nagging problem.)
The pro-50-move-rule camp will keep complaining about the bloat in the code and tables.
Well, I'm afraid that nobody (as in 99.99%) will understand the code or the tables anyway. Most people believe that a tablebase is a file with a list of all board positions and for each position the best move in text format, and they wonder why 10-piece tables still have not been computed. That's the way it is.
Yes, but we understand the code and the tables. So it's our responsibility to come up with the solution that is in everyone's best interest. Also those 0.01% that do understand the code and tables, are those that will post articles and discuss it in forums, forming public opinion about it.
User avatar
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

Re: 7-men EGTB Bounty

Post by Kirill Kryukov »

By the way.. If I was designing this generator for my interest, I would certainly request it to handle DTM on arbitrary MxN sized boards. But I won't even try to suggest this as a requirement, because I realize that my personal interest may not be the same with everyone's. (Although it would be still terrific to have such a general generator). Please think about what's practical and what's important for majority of people when discussing the hopefully next big EGTB standard, especially for 7-men tables. :-)
kronsteen
Posts: 88
Joined: Fri Aug 01, 2008 11:20 am
Sign-up code: 0
Location: France

Re: 7-men EGTB Bounty

Post by kronsteen »

Kirill Kryukov wrote:Is "DOS" the same with what I called WDL(>50)? I don't like the name "DOS" because DOS is used for many other things already.
May be DUS, then ? U = Unaffected position by 50-move rule. Or anything else. IMHO, it’s up to the community to find suitable terminology, if no one already exists. Suitable means understandable by most players/composers, and therefore eligible for widespread use.
Main point is that “DUS” tables are not WDL tables, they are something else. This is to be explained to “people”, and notion is not completely trivial, even if on this forum almost everyone can catch it. According to this, I don’t recommend keeping WDL trigram (in a notation such as WDL>50) because it might be potentially misleading. DTZ>50 appellation is not misleading, but one might think about a name that makes even clearer that these tablebases are something really special.
jkominek
Posts: 150
Joined: Mon Dec 04, 2006 9:02 am
Sign-up code: 0
Location: Pittsburgh, PA

Re: 7-men EGTB Bounty

Post by jkominek »

syzygy wrote: Please read what I write carefully. I'll repeat my definition once more:
k does not stand for an integer; (k) stands for a (possibly infinite) sequence 0 < k_1 < k_2 < k_3 < ... < k_n. The DTZ(k)-metric for a position p is given by the pair: (min{i : p is won under k_i-move rule}, distance to zeroing move)
The ordering on this metric is lexicographic.

With this definition, it suddenly becomes clear how various metrics are related to each other:
DTZ = DTZ(inf)
DTZ50 = DTZ(50)
DTZ50+ = DTZ(50 < inf)
DTR = DTZ(1 < 2 < 3 < 4 < 5 < ...)

I'd like to hear your opinion of my notation again, but only after you've taken the time to study it.
The ideas you are expressing are nice - helpful even - but the notation you've invented is non-standard, and therefore deserving of complaint. Some concerns:

1. The range of your functions are two-tuples, from the space Z^2. You can define that, that's fine. Yet this is a departure from all previous uses of the acronyms DTZ, DTZ50 and so on, which are scalar functions. In aliasing these terms you're introducing ambiguity in naming, and asking us to keep in mind that your DTZ is not the DTZ tables of Ken Thompson or Marc B. To get his from yours, one has to project onto the second dimension. In the context of a single posting that's not terrible, because you're defining your terms. You are though tempting the proliferation of confusion. "DTZsyzygy" grants ample fame and alerts your readers to something new under the sun.

2. Using round parentheses declares to the world that you are defining a function. You can't stuff the mouth of a function willy-nilly; it has to be consistent with the domain. F(50 < inf) and F(1 < 2 < 3 < 4 < 5 < ...) are functions of Boolean expressions, which is something completely different. Expressing constraints on the argument is handled outside. As in
  • DTZsyzygy(p;k) where
    p is the independent variable and is from the domain of all positions
    k is the constraining parameter, and "stands for a (possibly infinite) sequence 0 < k_1 < k_2 < k_3 < ... < k_n ..."
Placing k_1 < k_2 < k_3 ... inside F() is maybe tolerable as a quick shorthand for what you want to express, but is mathematically incorrect.

3. There is a subtlety combining
  • 0 < k_1 < k_2 < k_3 < ... < k_n ...
    with
    min{i : p is won under k_i-move rule}
To illustrate with your third example, DTZ(50 < inf), the constraints are
  • i is an element of {1,2} s.t. k_1=50 and k_2=inf, which satisfies 0 < k_1 < k_2.
The subtle part: the min{i} operation maps onto the subscript values 1 and 2, as opposed to 50 and inf. Nothing wrong with that per se, but notice that your DTZ(inf) always maps onto (1, no-rule distance) and DTZ(50) always maps onto (1, 50-move rule distance). To keep the information complete the inverse i->k_i has to hang around somewhere. In real tables interpreting the bit patterns is typically hard-coded in, in one and only one interpretation. One set of tables could be swapped out for another and the code would be caught unawares.

That said, I do believe you deserve thanks for offering a way of unifying various metrics. And, your proposed innovation of a dual-metric file format is worthy of consideration, even if it does not (yet anyway) persuade all.

john
jkominek
Posts: 150
Joined: Mon Dec 04, 2006 9:02 am
Sign-up code: 0
Location: Pittsburgh, PA

Re: 7-men EGTB Bounty

Post by jkominek »

Kirill Kryukov wrote:
syzygy wrote:
Kirill Kryukov wrote:WDL50 should an order of 100 times less than DTZ50.
WDL50 will be significantly smaller than DTZ50, but I wonder where you get the factor 100 from.
Not sure, probably from Vincent.
It would be great if you could get a list of actual sizes. I'm with syzygy on this one: I expect a factor of less than 10 (unfortunately). Just projecting from 5-man tables though.
Kirill Kryukov wrote:Combining both metrics into a single format makes things unreasonably more complicated, and is not optimal for the players camp (majority).
But what if -- I hope this is not an unrealistic supposition -- syzygy kicks in 250,000 euros and becomes the patron sponsor of your bounty? Would he get his request if he foots the bill? Have you considered auctioning the format to vastly increase the prize fund? Jusk asking, just asking.

john
syzygy
Posts: 166
Joined: Sun Aug 03, 2008 4:02 pm

Re: 7-men EGTB Bounty

Post by syzygy »

Kirill Kryukov wrote:
syzygy wrote:
Kirill Kryukov wrote:Can a correct en-passant-aware tablebase be generated without considering en-passant rights in root? (Or without storing the positions with the en-passant right). I don't think so. Please correct me if I am wrong.
Yes it can. None of my tables store en-passant positions, yet my probing code will always return the correct result. The trick is to do a one-ply search when probing an en-passant position (and this can easily be hidden in the probing code).
Do you also use one-ply search during generation, each time you need a value for an EP position?
Yes. The first step in the generation of a pawn slice is a round of captures and pawn moves, including moves that move into an ep position. When probing the value of those positions you have to do a little bit extra. But as a bonus you can forget about specifically generating the en passant positions and storing those somewhere in the file format.

(Actually a full ply is not needed as long as you can determine that the ep position without ep rights is not a stalemate position. If you know that, it's sufficient to probe the ep captures and the position itself instead of a real ply.)
What I want is a minimum requirement that will make everyone happy. :-)
Making everyone happy is definitely too high a requirement :-)
Combining both metrics into a single format makes things unreasonably more complicated, and is not optimal for the players camp (majority).
It's not really a combination of both metrics in a single format, but a single metric in a single format that delivers all the advantages of both metrics (obey 50-move rule, but also know how to play cursed positions). Apart from an additional step during generation it has the complexity of DTZ. However, I do agree that DTZ50 is simpler than DTZ, mainly (only) because it puts a limit on the possible values of positions in the intermediate files used for generation.

You mention the players camp: are players not interested in knowing how to win a "drawn" position against a non-optimal opponent?

What use do typical players make of endgame tablebases? Computer analysis of endgames?
syzygy
Posts: 166
Joined: Sun Aug 03, 2008 4:02 pm

Re: 7-men EGTB Bounty

Post by syzygy »

jkominek wrote:1. The range of your functions are two-tuples, from the space Z^2. You can define that, that's fine. Yet this is a departure from all previous uses of the acronyms DTZ, DTZ50 and so on, which are scalar functions. In aliasing these terms you're introducing ambiguity in naming, and asking us to keep in mind that your DTZ is not the DTZ tables of Ken Thompson or Marc B. To get his from yours, one has to project onto the second dimension. In the context of a single posting that's not terrible, because you're defining your terms. You are though tempting the proliferation of confusion.
I first called them DTR (which already has 2-tuples, at least if you want a table usable for playing). Then I realized it "is" really a generalized DTZ, indeed in the 2nd dimension. In fact, by changing the definition of the second coordinate you get variations of DTM, DTC and even WDL (the latter by leaving out the 2nd dimension).

A table based on DTZ(inf) is identical to Thompson's DTZ. I agree it would not be helpful to refer to that table DTZ(inf) outside the context of the generalized approach.
Placing k_1 < k_2 < k_3 ... inside F() is maybe tolerable as a quick shorthand for what you want to express, but is mathematically incorrect.
I'd say at most "conventionally" incorrect ;). A subscript would have been more in line with standard convention: DTZ_(k), or even DTZ_(k_i)_i.

You're right about the subtlety (in fact I subtly shifted the indices in my own definition, I noticed later). Basically what I'm trying to do with the metric's definition is to pin down an ordering on all positions. A generator takes this ordering and generates a tablebase out of it. In this sense the actual value pairs are not so important. (Of course in practice you'll want to be able to get some kind of "number of moves" out of a table.)

The value pairs can always be mapped to a single value, since the 2nd coordinate in (i, n) is bounded by k_i. That's why a new file format or a new compression algorithm are not needed. Of course, the resulting values may not fit in a byte which can be complicating, but this is not different from DTZ (yes, it is different from DTZ50).

For DTZ50+ (= DTZ_(50 < inf)) the mapping would be:
(1, n) - > n
(2, n) -> 50+n
The required range should be comparable to that of DTZ (probably somewhat larger).

The resulting values allow for an easy interpretation. What you get is the table that Kronsteen was thinking of way earlier in this thread.
Arpad Rusz
Posts: 93
Joined: Mon Mar 27, 2006 5:33 pm
Sign-up code: 0
Location: Romania/Hungary
Contact:

Re: 7-men EGTB Bounty

Post by Arpad Rusz »

As a composer, I only need the WDL information. The depth to win is not important for me.
So I basically want only the WDL tables, not the DTZ.
User avatar
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

Re: 7-men EGTB Bounty

Post by Kirill Kryukov »

kronsteen wrote:May be DUS, then ? U = Unaffected position by 50-move rule. Or anything else. IMHO, it’s up to the community to find suitable terminology, if no one already exists. Suitable means understandable by most players/composers, and therefore eligible for widespread use.
Main point is that “DUS” tables are not WDL tables, they are something else. This is to be explained to “people”, and notion is not completely trivial, even if on this forum almost everyone can catch it. According to this, I don’t recommend keeping WDL trigram (in a notation such as WDL>50) because it might be potentially misleading. DTZ>50 appellation is not misleading, but one might think about a name that makes even clearer that these tablebases are something really special.
A new terminology should be introduced when it improves clarity or convenience. I feel the opposite about "DUS", "DOS" etc.. Yet another acronym to remember, while being essentially a WDL. File format is the same, indexing is the same. Just the interpretation of the values differs slightly. Something like WDL(<=50) and WDL(>50) is a consistent and self-explanatory way to call them. I don't think something without "WDL" in the name comes close in convenience, but this may be just me. Though we may also need a version without '<=>' characters for using in file names.
User avatar
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

Re: 7-men EGTB Bounty

Post by Kirill Kryukov »

jkominek wrote:It would be great if you could get a list of actual sizes. I'm with syzygy on this one: I expect a factor of less than 10 (unfortunately). Just projecting from 5-man tables though.
7-men WDL should contain longer runs of identical values than 5-men, that's why I think it should be much more than 10 times difference from DTZ. Let's hope we'll live to see it. :-)
jkominek wrote:
Kirill Kryukov wrote:Combining both metrics into a single format makes things unreasonably more complicated, and is not optimal for the players camp (majority).
But what if -- I hope this is not an unrealistic supposition -- syzygy kicks in 250,000 euros and becomes the patron sponsor of your bounty? Would he get his request if he foots the bill? Have you considered auctioning the format to vastly increase the prize fund? Jusk asking, just asking.
What if I had 250'000 euros and wanted to see a DTM MxN generator with fairy pieces? Or anything else... I don't see what this all has to do with community bounty. If someone has the funds, then he is of course free to set his rules and requirements as he likes. I'll be the first to thank him, if his project is any useful (like syzygy's metric would be).

Seeing the absence (in year 2008) of any trace of useful 7-men chess generator, I want to open a way for community to make it appear sooner. It may happen than someone (H.G.M.) will release a useful generator, making this bounty initiative obsolete. We have to consider this possibility. For example, if syzygy will release a generator for his metric, then most of people will be satisfied and there will be very little interest in different (slightly imporved) ETGB format. However, so far it did not happen, so at this point we still have chance to review many possibilities and choose the optimal one.

Optimal (IMHO) means useful for everyone, but optimized for a larger part of the potential user base.

My doubtz about syzygy's format are:
1. How much his tables will be larger than WDL50?
2. How much the access will be slower than WDL50?
If the answer is "tiny difference" for both questions, then I don't see any problem. If the answer is something like "10% different" or more (for any of the questions), then I would prefer a simple WDL50 for players + additional WDL(>50) for those who need it. Note that the difference should be probably measured not on the whole 7-men set, but on the most useful tables, which will be accessed more often.
User avatar
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

Re: 7-men EGTB Bounty

Post by Kirill Kryukov »

syzygy wrote:You mention the players camp: are players not interested in knowing how to win a "drawn" position against a non-optimal opponent?
The main application of EGTB for players is not playing an tablebase ending, it's knowing to which endgame they should and should not exchange. After you are in the tablebase endgame, the access speed does not matter because you only need a few of probes per move. So at this point they can utilize WDL(>50) tables if they want to try to win a drawn ending and if they choose to have those tables on their disk. This does not matter. (Though, please note that this option is available to them in my plan, by additional WDL(>50) tables). Playing an EGTB position is of secondary importance to the players. The primary importance is to be able to probe tablebases in search. Already in 8-men, 9-men, or 15-men positions. If they can probe fast, they can probe more deep in search tree, so they can play late middlegame better. Here the access speed is of great importance. Any other use of tablebases does not need any particular access speed compared to this one.[/quote]
syzygy wrote:What use do typical players make of endgame tablebases? Computer analysis of endgames?
Analysis of late middlegame. Freestyle, advanced chess, and correspondance games.

I forgot one more camp - engine testers, operators, etc.. (like myself). Computer chess hobbyists in general. They play engine-engine matches, run engines on chess servers, etc. Not a particularly small camp too. They need fastest possible access just like the players.
User avatar
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

Re: 7-men EGTB Bounty

Post by Kirill Kryukov »

Arpad Rusz wrote:As a composer, I only need the WDL information. The depth to win is not important for me.
So I basically want only the WDL tables, not the DTZ.
WDL will be computed much earlier than DTZ. Both WDL and DTZ information is going to be available to the users in my proposed "quad-metric" plan.
User avatar
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

Re: 7-men EGTB Bounty

Post by Kirill Kryukov »

syzygy wrote:
Kirill Kryukov wrote:Do you also use one-ply search during generation, each time you need a value for an EP position?
Yes. The first step in the generation of a pawn slice is a round of captures and pawn moves, including moves that move into an ep position. When probing the value of those positions you have to do a little bit extra. But as a bonus you can forget about specifically generating the en passant positions and storing those somewhere in the file format.
Cool. Yes, not needing to store EP postions is a big bonus!
syzygy
Posts: 166
Joined: Sun Aug 03, 2008 4:02 pm

Re: 7-men EGTB Bounty

Post by syzygy »

My doubtz about syzygy's format are:
1. How much his tables will be larger than WDL50?
2. How much the access will be slower than WDL50?
If the answer is "tiny difference" for both questions, then I don't see any problem. If the answer is something like "10% different" or more (for any of the questions), then I would prefer a simple WDL50 for players + additional WDL(>50) for those who need it. Note that the difference should be probably measured not on the whole 7-men set, but on the most useful tables, which will be accessed more often.
In my estimation DTZ50+ tables should not be more than 1% larger. In any case I don't see how it could be close to 10%. The most useful tables will be tables with several pawns. Those will normally have quite low distance to zero values and should hardly be affected, if at all. (Note that DTZ50+ compared to DTZ50 does not affect positions that are won under the 50-move rule, but that would convert to "cursed positions" when using the DTZ-metric. DTZ50+ should be more efficient than taking a DTZ-table and storing in addition the delta with DTZ50, or the other way around.)

Btw, I wouldn't call it a "format". The precise file format is to a large extent independent of the metric used. Choices made in picking a file format will have a huge impact on table size and access times (much more than the choice between DTZ50, DTZ50+ or DTZ).
Post Reply