7-man EGTB Bounty Reborn - Metric Discussion

Endgame analysis using tablebases, EGTB generation, exchange, sharing, discussions, etc..
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan
Contact:

7-man EGTB Bounty Reborn - Metric Discussion

Post by Kirill Kryukov »

A set of supported metrics is the key characteristic of the endgame tables and generator. What we want to achieve by requiring a good set of metrics is:

1. Make sure every use case is covered reasonably well with the metrics we chose.
2. Avoid making the requirements too hard.
3. Avoid requiring too many metrics, so as not to spread the generation efforts too thin.

The old discussion (from 2008) was mainly focusing on metric choice, so it's a good idea to read it if you want to contribute.

As far as I understand, at least three metrics are totally essential:

DTM - Composers need this and nothing less.

DTZ50 - Practical metric for supporting real chess playing and analysis.

WDL50 - Fast and compact metric, compatible with DTZ50. An interesting property is that it can be used as a shortcut to DTZ50, as a DTZ50 table can be built using WDL50 sub-tables.

IMHO, the other metrics that were proposed are either not essential (DTC), or too complex for normal people to understand why they should be in the requirements of the community-funded project (DTR).

If you have comments or suggestions, please share!
Arpad Rusz
Posts: 93
Joined: Mon Mar 27, 2006 5:33 pm
Sign-up code: 0
Location: Romania/Hungary
Contact:

Re: 7-men EGTB Bounty Reborn - Metric Discussion

Post by Arpad Rusz »

Kirill Kryukov wrote:DTM - Composers need this and nothing less.
Study composers need less than DTM! We actually want the following information about a position:
1. Whether it is a win/draw/loss (WDL).
2. In a drawn position: is the drawing move unique?
3. In a won position: is the winning move unique (except for time-losing moves)?

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by Kirill Kryukov »

Arpad Rusz wrote:
Kirill Kryukov wrote:DTM - Composers need this and nothing less.
Study composers need less than DTM! We actually want the following informations about a position:
1.If it is a win/draw/lose (WDL).
2.In a drawn position: is the drawing move unique?
3.In a win position: is the winning move unique (except time losing moves)?
Great. WDL is much easier to build than DTM. Also, WDL nicely answers the first two questions, but not the last one.

Somehow I still have the impression that someone was advocating for DTM, perhaps not for composing needs.

EDIT: I added WDL to the specs draft.
koistinen
Posts: 92
Joined: Fri May 02, 2008 7:59 pm
Sign-up code: 0
Location: Stockholm

Re: 7-men EGTB Bounty Reborn - Metric Discussion

Post by koistinen »

With the clean code and OSI-approved license requirements it seems likely that those who want a different metric would be able to adapt the code to their needs. I'd like to see any of the metrics WDL50-DTM be allowed. (That is: any one of the metrics mentioned in this thread, if computed, should count as a solution.)
But with people putting up bounties, they might want to specify what metric the bounty requires. That would be ok too. Then if someone solved for only one metric, the other bounties would remain for those who want to adapt the code to compute the remaining metrics.
DTM is needed for composing problems but not for studies.

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by Kirill Kryukov »

koistinen wrote:With the clean code and OSI-approved license requirements it seems likely that those who want a different metric would be able to adapt the code to their needs. I'd like to see any of the metrics WDL50-DTM be allowed. (That is: any one of the metrics mentioned in this thread, if computed, should count as a solution.)
But with people putting up bounties, they might want to specify what metric the bounty requires. That would be ok too. Then if someone solved for only one metric, the other bounties would remain for those who want to adapt the code to compute the remaining metrics.
DTM is needed for composing problems but not for studies.
The current choice of metrics results from several simple assumptions (right or wrong):

1. While adapting a generator to a new metric is not hard in concept, some challenges may arise from the limitations of current-day hardware. It's desirable that the more significant of those challenges are solved (or mostly solved) by the original team of developers. (The original developers are supposed to be more motivated, more skilled and more familiar with the code than the community.)

2. DTM is the most demanding in terms of RAM, IO, storage space and computation time. Therefore, as a significant challenge, it should be included in the requirements.

3. DTZ is less computationally heavy than DTM. The challenge with DTZ comes from a) properly handling the pawn slices and conversions (not required with DTM), and b) handling some corner cases not present in DTM, for example properly handling the situation when the losing side is forced to make a conversion or a pawn move. This could be trivial, but in that case it won't be an obstacle for the original developers. If not trivial, then all the more reason to leave it to the original developers. Additionally, the enormous practical utility of DTZ tables makes it a high-priority target, at least as far as I understand.

4. DTC is half-way between DTM and DTZ, so it should be trivial to implement when DTM and DTZ are already there. So I don't think it should be in the requirements (the original developers will most likely implement DTC just for completeness).

5. WDL50 and DTZ50 should be relatively trivial to implement if WDL and DTZ are already available. From this viewpoint these metrics don't have to be in the requirements. The reason I'd still have them in the specs is that these two metrics will be very popular in the "players" camp. Perhaps these could be a sub-project with its own sub-bounty.

6. WDL is essential because its implementation can be very different from distance-based metrics. Many optimizations, particularly in the access code, slicing, compression, etc., should count as a significant challenge. Also, this is probably the first metric (this or WDL50) that will be used for mass computation, so it's desirable to have a solid implementation from the start.

7. I don't have a very good understanding of DTM50. I'm not sure anyone needs such a metric, and I'm also not sure how hard it is to implement from a working DTM. Same for DTC50.

8. Experimental theoretical concepts, such as DTR (which was never implemented before as far as I'm aware), are best left for research projects and out of the community bounty project. Although it's easy to add them as sub-projects with sub-bounties, if there is any interest.

Summing up: while I agree that the specs should be as simple as possible, the specs should (IMO) include at least the basic set of three metrics: WDL, DTM, DTZ. Some other metrics (DTC, DTZ50, WDL50) can be sub-projects with their own sub-bounties, or in the base specs; I have no strong feeling for or against. As for DTM50 and DTC50, I don't have a clear idea, and so far I don't see them as essential for the specs.

Does any of this make sense?

An alternative plan would be to have three separate independent sub-projects (WDL, DTM and DTZ). Some reasons against this separation: 1. It's not good if half of the community receives its desired solution (DTM or DTZ) and the other half is left behind. 2. DTZ and WDL are interchangeable as sub-endgame tables, so it's very practical to hope that both DTZ and WDL generation are designed and tested by the same people.

All IMHO, opinions and discussion welcome.
kronsteen
Posts: 88
Joined: Fri Aug 01, 2008 11:20 am
Sign-up code: 0
Location: France

Re: 7-men EGTB Bounty Reborn - Metric Discussion

Post by kronsteen »

Kirill Kryukov wrote:The challenge with DTZ comes from a) Properly handling the pawn slices and conversions (not required with DTM). b) Handling some corner cases not present in DTM
DTM is a very nice metric to be sure, but beware of setting the bar too high by selecting a difficult metric on an inherently ambitious project - more on this to come
Kirill Kryukov wrote:DTC is half-way between DTM and DTZ, so it should be trivial to implement when DTM and DTZ are already there. So I don't think it should be in the requirements (the original developers will most likely implement DTC just for completeness).
I would include DTC in the requirements for 2 reasons : 1) a modular code capable of DTM+DTZ shall trivially handle DTC and 2) upon WDL global sets DTC and DTZ can be easily computed, and DTC is a little better than DTZ for practical use : remember that every time a winning pawn move is available, DTZ only says “push the pawn, that wins” like WDL does, when DTC says “winning conversion forced in x moves“, a stronger message for sure.
Kirill Kryukov wrote:WDL50 and DTZ50, should be relatively trivial to implement if WDL and DTZ are already available. From this viewpoint these metrics don't have to be in the requirements.
I’m not so sure of this.

DTZ50+WDL50 generating code can be seen as a useful add-on, but having it quickly available may be desirable to compute 3-6 men WDL50/DTZ50 tables that don’t exist today. These tables are very valuable for themselves and are required for any subsequent 7-men attempts.
Kirill Kryukov wrote:WDL is essential because its implementation can be very different from distance-based metrics. Many optimizations, particularly in the access code, slicing, compression, etc., should count as a significant challenge. Also, this is probably the first metric (this or WDL50) that will be used for mass computation, so it's desirable to have a solid implementation from the start.
Fully agreed
Kirill Kryukov wrote: I don't have a very good understanding of DTM50. I'm not sure anyone needs such a metric, also not sure how hard is to implement it from a working DTM. Same for DTC50.
DTM50 gives distance to mate with respect to the 50 move rule. It is the final and ultimate answer for the game played with the 50 move rule, especially interesting for some particular endings such as kqpkq or knnkp. DTM50 has a demonstrated utility : in a post somewhere in this forum somebody asked a question about a knnkp position requiring DTM50 analysis - and it has not been answered yet. But getting it requires overcoming a major difficulty : introducing the move counter (0-99 plies) as a supplementary variable. For example, a position with DTM=10 for mc=0 can still have DTM=10 for mc<90 but DTM=20 for 90<=mc<97 (the winning side having to delay the win by hurrying a zeroing move before the 50 move limit), and DTM=inf (i.e. draw) for mc>=97 (the losing side being able to force 3 non-zeroing plies to reach a 50 move rule draw). So DTM50 will have to treat 100x more positions, but will compress much better than DTM, so final DTM50 tables may be only 5-10 times larger than DTM ones.
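The move-counter example above can be made concrete with a small sketch. It tabulates the value of that one hypothetical position for every move-counter value 0-99 and run-length encodes the result, which hints at why a table with 100x more entries might still compress well. The function names and the use of `math.inf` for a draw are illustrative assumptions, not taken from any existing generator.

```python
import math
from itertools import groupby

def dtm50_example(mc):
    """DTM50 of the example position as a function of the move counter (in plies)."""
    if mc < 90:
        return 10        # the normal shortest mate still fits under the rule
    if mc < 97:
        return 20        # winner must rush a zeroing move first, mate takes longer
    return math.inf      # defender can force a 50-move-rule draw

def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

column = [dtm50_example(mc) for mc in range(100)]
runs = run_length_encode(column)
print(runs)
```

For this position the 100 entries collapse into three runs, so most of the extra dimension is redundant. Real positions will not all be this regular, so this only illustrates the direction of the argument, not the 5-10x estimate itself.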

For practical purposes, DTM+DTZ50 is almost as good as DTM50. Albeit non-DTM optimal, DTZ50 gives winning paths robust against 50-move rule if the user is worried by that, and DTZ50 also works when the move counter is not zero.

With such features, DTM50 definitely deserves a try on 3-4-5 men first, then on a few well chosen 6-men. I believe that a DTM50 generator can’t be derived from a DTM one, but is not especially harder to design, and I would be glad to see a working one. But don’t even think of it for 7-men. DTM50 is a nice subject, but off-topic there.

DTC50 is almost as hard as DTM50, and far less powerful, so I wouldn't consider it more than a derivative product of DTM50, for the sake of completeness.

Metric discussion

Post by kronsteen »

Metric choice is a fundamental question and has to be undertaken with careful reasoning. One may handle the problem by addressing the following questions :

1- What are the different metrics and their characteristics (chessical characteristics, soft & hard constraints for computing & storing them)

2- What are the different categories of end users, their needs & wishes

3- What key elements to consider for metric selection & TB building strategy

4- What may a “reference design” of the project look like, taking into account the elements


Since these discussions can be lengthy, I’ll write first ideas in separate posts (see below)

1- What are the different metrics and their characteristics

Post by kronsteen »

All metrics potentially suitable for 7-men are listed below. Some metrics such as DTM50 or DTR are not mentioned : these are “sophisticated metrics”, definitely worth a try on 3-4-5-(6) men but not on 7-men where metric simplicity and lightweight are absolutely essential.

DTM
Gives distance to mate

DTM is the only metric that never gives doubtful advice, as it targets the only true goal of the game. With any other DTX metric (X = any event other than mate), there are always cases where going for X can be seen as unwise as it makes the win longer & harder thereafter. DTM ignores the 50 move rule, which is perfect for composers (the 50 move rule is ignored in chess problems), but may sometimes be a liability in real games. Adding DTZ50 solves this problem (see below).

DTM is a little harder to compute than simpler metrics (DTC/DTZ/WDL) and is larger in size. With a compression ratio similar to Nalimov, here are the final sizes one might expect (give or take 25% for all estimates) :

43 : 15 TB (up to 200 GB for 1 single ending)
43p : 100 TB (up to 600 GB for 1 single ending)
52 : 7 TB
52p : 45 TB
Total : ~170 TB
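As a quick check of the arithmetic, the per-class estimates above can be summed and the stated "give or take 25%" margin applied. The figures are the ones quoted in the list; the variable names are only illustrative.

```python
# DTM size estimates per material class, in TB, as quoted in the list above.
dtm_estimates_tb = {"43": 15, "43p": 100, "52": 7, "52p": 45}

total = sum(dtm_estimates_tb.values())
low, high = total * 0.75, total * 1.25   # "give or take 25%"

print(f"total ~{total} TB, plausible range {low:.0f}-{high:.0f} TB")
```

The sum is 167 TB, which the post rounds to ~170 TB.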

Better compression techniques can improve this somewhat (for example the more recent Gaviota compression scheme gives 6.5 GB for 3-4-5 men instead of 7.5 GB in Nalimov format), but an unbreakable lower limit exists somewhere: one can hope to cut the above figures by 20-30%, but not much more.

Also importantly, computing DTM for a given ending requires DTM for every subending resulting from captures and promotions. Because of this, kpppkpp and kppppkp are at the top of the tree, and computing them requires the whole set to have been computed first.
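The dependency rule can be sketched in a few lines: from a material signature, list the endings reachable by one capture, one promotion, or one capturing promotion. This is a simplified illustration under stated assumptions. The signature format (`kpppkpp` style), the canonical ordering and the function names are hypothetical choices made here, and a real generator would also follow these dependencies transitively.

```python
PIECE_ORDER = "kqrbnp"  # assumed ordering of pieces inside a signature

def split_sides(signature):
    """Split e.g. 'krppkrp' into ('krpp', 'krp') at the second king."""
    second_king = signature.index("k", 1)
    return signature[:second_king], signature[second_king:]

def canonical(side_a, side_b):
    """Sort each side and put the side with more (or stronger) men first."""
    def sort_side(side):
        return "".join(sorted(side, key=PIECE_ORDER.index))
    def rank(side):
        return (-len(side), [PIECE_ORDER.index(c) for c in side])
    a, b = sort_side(side_a), sort_side(side_b)
    if rank(b) < rank(a):
        a, b = b, a
    return a + b

def remove_one(side, piece):
    i = side.index(piece)
    return side[:i] + side[i + 1:]

def direct_subendings(signature):
    """Endings reachable in one move that changes the material."""
    white, black = split_sides(signature)
    result = set()
    for mover, other in ((white, black), (black, white)):
        victims = set(other) - {"k"}
        # Plain captures: any non-king piece of the other side disappears.
        for victim in victims:
            result.add(canonical(mover, remove_one(other, victim)))
        if "p" in mover:
            base = remove_one(mover, "p")
            for promo in "qrbn":
                promoted = base + promo
                result.add(canonical(promoted, other))
                # Capturing promotion: the victim cannot be a pawn,
                # since pawns never stand on the promotion rank.
                for victim in victims - {"p"}:
                    result.add(canonical(promoted, remove_one(other, victim)))
    return result

print(sorted(direct_subendings("kpk")))
```

Applying `direct_subendings` repeatedly until no new signature appears would give the full set of tables an ending depends on, which is why the all-pawn endings sit at the top of the tree.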


DTC
Gives distance to conversion (next promotion or capture with the win still possible after that)

DTC can sometimes lead to silly-looking results such as pieces unnecessarily unprotected or sacrificed, but most of the time DTC will show the path to force a capture or a pawn promotion, after which the win is generally easy. It should therefore cover most of the needs of composers and players. Like DTM, DTC ignores the 50-move rule.

DTC is a little easier to compute than DTM but not by much. It is smaller in size than DTM (about 50% less, 90 TB for all 7-men). More importantly, computing DTC for a given ending needs only WDL for all its subendings. This opens the possibility of having DTC only for the most interesting endings, less interesting ones being kept in WDL format, allowing big savings in storage space.

DTZ
Gives distance to next zeroing move (a zeroing move is a capture, a promotion, or a pawn move)

DTZ has similar defects as DTC and in addition can advise ill-suited (albeit still winning) pawn pushes or symmetrically ignore threatening adverse pawn pushes. Nevertheless, DTZ like any other DTx metric will give sure paths to win and will therefore also cover most of the needs of composers and players. DTZ ignores 50 move rule, but it is possible to build special DTZ tables to take this rule into account (DTZ50 tables, see below)

DTZ is a little easier to compute than DTM and DTC, and also more compact (about 25% less than DTC). Like with DTC, DTZ can be computed for most interesting endings only, other ones being left in WDL format.
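The difference between the events that DTC and DTZ count towards can be written down directly from the two definitions above. The `Move` record below is a hypothetical minimal structure introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Move:
    piece: str                       # 'p', 'n', 'b', 'r', 'q' or 'k'
    is_capture: bool = False
    promotion: Optional[str] = None  # piece promoted to, if any

def is_conversion(move):
    """DTC event: a capture or a promotion (the material changes)."""
    return move.is_capture or move.promotion is not None

def is_zeroing(move):
    """DTZ event: a conversion or any pawn move (the 50-move counter resets)."""
    return is_conversion(move) or move.piece == "p"

quiet_pawn_push = Move("p")
rook_takes = Move("r", is_capture=True)
knight_shuffle = Move("n")

for m in (quiet_pawn_push, rook_takes, knight_shuffle):
    print(m, is_conversion(m), is_zeroing(m))
```

A quiet pawn push is the one case where the two predicates disagree. A DTZ table treats that push as the goal, while a DTC table keeps counting towards the next capture or promotion.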


DTZ50
Gives distance to next zeroing move, but playing with 50 move rule, like the real game does. If a position is winnable but only through playing sequences breaking 50-move rule, DTZ50 considers it as drawn, which is the true mathematical result. DTZ50 also solves all positions with the move counter > 0. Such features are unique to DTZ50. DTZ50 is especially strong when used in conjunction with DTM, as the user has the choice between the shortest (DTM) and the safest (DTZ50) route to win, according to what he seeks.

DTZ50 ability to give robust winning lines against 50-move rule is an important feature for chess players. But it is not suitable for composers as 50-move rule is ignored in chess problems. Also, DTZ50 gives no clue about very long wins off the 50-move rule like the “winning” nature of krbknn, the 262-move mate in krnknn, the 545-move mate in kqnkrbn….

DTZ50 computing and storing requirements are almost the same as DTZ. Two important elements must be considered :
- Building DTZ50 tables requires WDL50 (and not WDL) tables for all subendings. This will require first building WDL50 (or DTZ50) tables for 3-6 men as they don’t exist today
- If a DTZ table exists for a given ending, in many cases the DTZ50 will have very little difference with it and sometimes no difference at all. This gives special interest of the DTZ+DTZ50 duo as one of the tables can be considered as a small add-on to the other and it takes almost no supplementary space to store both tables instead of one. This is another advantage of DTZ over DTC or DTM.
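The "small add-on" idea can be sketched as storing only the entries where the two tables disagree. This is a toy model under assumptions made here: tables are plain lists indexed by position number, and 0 stands for a draw. The post does not specify an actual storage format.

```python
DRAW = 0  # toy encoding chosen for this sketch

def make_addon(dtz, dtz50):
    """Keep only the entries where DTZ50 differs from DTZ (index -> DTZ50 value)."""
    return {i: v50 for i, (v, v50) in enumerate(zip(dtz, dtz50)) if v != v50}

def apply_addon(dtz, addon):
    """Rebuild the DTZ50 table from the DTZ table plus the sparse add-on."""
    return [addon.get(i, v) for i, v in enumerate(dtz)]

# Toy tables: position 2 needs 120 plies without a zeroing move,
# so it is a draw once the 50-move rule is applied.
dtz = [3, 12, 120, 7, 0]
dtz50 = [3, 12, DRAW, 7, 0]

addon = make_addon(dtz, dtz50)
print(addon)
```

When few positions are affected by the rule, the add-on stays tiny compared with a second full table, which is the storage argument made above.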

WDL
Gives won/drawn/lost status of any position. Doesn’t contain any distance information, to mate or any other forced event. WDL TBs are frequently named as “bitbases”.

WDL can’t show winning lines in general : it can tell if a move is winning but not whether the move makes progress or delays the win. Winning lines must be engineered by human or conventional engine analysis, WDL providing support for this by faultlessly detecting difficult and easy to overlook winning or defending lines. WDL is therefore of general use for players and composers, even if it provides weaker information than DTx tables.

Computing WDL tables is quite similar to computing DTZ tables. The main advantage of WDL tables is their unequalled compactness : 4 times smaller than DTZ, 10 times smaller than DTM. The final expected sizes of complete 7-men sets in WDL are as follows :

43 : 1.4 TB
43p : 7.6 TB
52 : 700 GB
52p : 3.8 TB
Total : 13.5 TB

Biggest single-ending tables will be about 50 GB.
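The WDL total and its ratio to the DTM estimate can be checked with the figures quoted in this post. Nothing here is new data, and the variable names are only illustrative.

```python
# WDL size estimates per material class, in TB (700 GB written as 0.7 TB).
wdl_tb = {"43": 1.4, "43p": 7.6, "52": 0.7, "52p": 3.8}
wdl_total = round(sum(wdl_tb.values()), 1)

# DTM estimates for the same classes, as quoted earlier in the post.
dtm_total = 15 + 100 + 7 + 45

ratio = dtm_total / wdl_total
print(f"WDL total {wdl_total} TB, DTM/WDL ratio {ratio:.1f}")
```

The ratio comes out a little above 12, in line with the rough "10 times" figure given for WDL versus DTM.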

As said above, WDL sets can later be very easily and progressively upgraded to DTZ/C for endings of more interest.


WDL50
Gives won/drawn/lost status of any position, but with 50 move rule taken into account. Considers as draws winning positions requiring > 50 non-zeroing consecutive moves to win, like DTZ50 does.

WDL50 used alone is of less interest than WDL. It is useless for chess composers. Its main interest lies in its use as preliminary data to build DTZ50 tables. Like with DTZ, a complete set of WDL50 tables will be an ideal foundation upon which DTZ50 tables can be gradually built with possibility to take the most important ones first. Also noteworthy is the fact that WDL50 tables have very little difference with WDL tables and can be treated as add-ons, inexpensive to store.

2- What are the different end users, their needs/wishes

Post by kronsteen »

This question has to be collectively answered, of course, but one can try to make a first general picture. Other opinions are welcome !

Users of TBs are :

- “Real play community” : chess players, chess engine developers and users, endgame theorists,
- problemists,
- EGTB enthusiasts.

Real play community wants hard solutions for endgames most commonly found in real games. Statistics already exist up to 6-men. For 7-men, maybe statistics exist too (if not they can be easily built), but anyway the results will certainly be close to the following :
- one ending far more common than any other : krppkrp
- 10 endings very common too, 4 vs 3 with 3+ pawns on the board and material difference of 1. These are : kmppkmp, (m = minor piece), kmppkrp, kqppkqp, kpppkmp, kpppkpp.
- other endings less common but also interesting. I would take in there 4 vs 3 endings with 2+ pawns and material difference of 1 or 3, krrpkrr, and 5 vs 2 top of list (krpppkr, kmpppkr, krpppkq).
- other endings, either very exceptional or of no practical interest. Pawnless endings fall into this category, as well as heavily unbalanced endings (material difference of 5 or more), and almost all 5 vs 2 endings.

About metrics, DTM and DTZ50 are best and have their own advantages. Having both of them (or even DTM50 !) is a must, but should only one be selected (apart from computing / storage issues), it shall be DTM as there are many 50-move rule detractors who want DTM and don’t want to hear of DTZ50. DTC or DTZ are less good but are very usable too. WDL is weaker but still very helpful (for instance it will allow unbreakable defence in otherwise painful situations such as krkrb)

Problemists are interested in virtually every ending, but for them TB prioritization lies elsewhere : what problems can’t be checked/solved by conventional engines ? One can give two categories :
- long problems, with a main winning line of (say) 15+moves.
- problems whose end position status (won/drawn) is not so clear and requires check-up from a WDL TB.

These features indicate that problemists are mainly interested in TBs for balanced endings (material difference = 1 or 3, maybe 5). About the metric, WDL can’t show winning lines but is probably very helpful for check-ups. DTZ or DTC can help more with problems with complex variations where it is not always obvious if a winning move makes progress or delays the win. The ultimate choice is of course DTM, which gives every answer to problemists as they ignore the 50-move rule, and especially gives instant validation for “mate in x” type problems.


EGTB enthusiasts have their own wishes of course, but subjects of interest may be :
- long winning lines and length records such as mate in 545 in kqnkrbn
- definite answers about endings whose status is unclear today, with a lot of surprises to come such as the general win status of kqnkrbn,
- owning, using and sharing whole sets.

About metrics, DTM is probably first choice, followed by DTC/DTZ, followed by WDL. DTZ50/WDL50 have their own particular interest lying in the identification of 50-move rule hits (i.e winning positions where the defender can force a 50-move rule draw) and influence of 50-move rule on the play of very special endings such as kbbknn or knnkp.
Last edited by kronsteen on Wed Apr 20, 2011 5:19 pm, edited 1 time in total.

3- What key elements to consider

Post by kronsteen »

This can be subject to discussion, of course, so I give just a first vision. Undoubtedly it will evolve with other ones.

First key element : The necessity to take a low road, using very simple metrics

Let’s admit it : 7-men is a gigantic task. To get a clear view of this, we should compare it with what the 6-men project was 10 years ago. Thanks to remarkable efforts by Eugene Nalimov and other pioneers, 6-men was solved and distributed in about 5 years, and one can say that the project was at the upper limit of what the small community of EGTB hobbyists was able to undertake. The full 7-men project is about 150x bigger than 6-men, and the tables are 50x larger. Is current computing hardware (RAM capacity, CPU and I/O speeds) 50x better and HDD capacity 150x larger than they were 10 years ago ? No. So building a full set of 7-men in DTM is probably beyond our current capabilities, one main issue being the HDD capacity needs (remember, a full set of 7-men DTM, even 43+43p only, will be > 100 TB). We could still aim at it, but one can guess that 5 or even 10 years will have to pass before really serious attempts at mass generation can be made. If we start straight away, we will be able to build the first tables, but what will we do when our tiny 2 TB disks are full ? And how will we manage the fact that computing an ending such as krnpkqp will need read access to 24 subtables of 600 GB each ?

For these reasons, it is probably wiser to target WDL for a first mass generation attempt, in order to remove as many feasibility problems as possible. The final size (~ 13 TB for the 43/43p/52/52p whole set) sounds close to what the 1.2 TB Nalimov 6-men set was for HDDs 10 years ago. Big, ambitious to be sure, but realistic, unlike the 170 TB DTM whole set. WDL will be the ideal foundation upon which DTZ or DTC tables for the most interesting endings will be easily built : that’s why it is surely essential to include DTZ (or DTC) in the specifications of the generator. It will also be useful for possible later WDL50/DTZ50 tables that will be stored as add-ons in an inexpensive way. And with a versatile code it won’t impede first DTM generation attempts for those who really want to.

Second key element : targeting fast and valuable intermediate results, and having a global incremental approach

It is not realistic to see the project as a multi-year effort with all the reward coming only in the end. Especially as we want external support for the project, it will be essential for our holy oracle to have quickly and regularly something to say about what people are really interested in.

Let’s see the past 7-men first tries (krrnkrr, kqbnkqb, krbnkqn, krrpkrr and the like). We’ve built some immediately accessible TBs, and unearthed some astonishing results. Then… we’re still at this point, where we’re in fact standing in front of a big desert. Next endings of interest contain several pawns, and the dependency tree forces us to build and store dozens of uninteresting endings first, which is not realistically achievable today in DTM.

One of the main issues of the global project is how to cross this desert. I see a very good target beyond the desert towards which efforts could be concentrated : krppkrp.

The real play community (players, engine developers and users, theorists) is vastly larger than the problemist and EGTB enthusiast communities. This community is waiting anxiously for krppkrp, and global interest will probably rise a lot when the oracle begins to give krppkrp revelations.

krppkrp has 74 subendings, this is big but only 1/7 of the global 43+43p project (525 endings). One could even try as a first intermediate result the krppkrp* tablebase, built upon the assumption that pawns promote only to queens before a capture occurs. This close-to-perfect table needs only 5 subendings (kqqrkqr, kqqrkrp*, kqrpkqr*, kqrpkrp*, krppkqr*). It could be released first to raise early interest with a note that its accuracy is not perfect and will be upgraded later by including underpromotions (maybe only knight promotions first), after which the true definitive result will be there.

After krppkrp, other endings of interest should be targeted by selecting first the endings with a high ratio of frequency in real games to number of subendings still to be built. It could also be conceivable that players submit their wishes about what TBs they want built first. This way, intermediate results will be constantly released and there will be no more long periods with no results, which would risk interest vanishing.
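The selection rule in the last paragraph amounts to sorting candidate endings by a simple ratio. The sketch below uses placeholder endings and made-up numbers purely to show the mechanics. None of the frequencies are real statistics, and the function name is an assumption.

```python
def priority(frequency, remaining_subendings):
    """Frequency in real games divided by the number of subendings still to build.

    An ending whose subendings are all built (0 remaining) is treated as
    needing one unit of work, namely the ending itself.
    """
    return frequency / max(remaining_subendings, 1)

# Placeholder candidates: name -> (made-up frequency, made-up remaining subendings).
candidates = {
    "ending_A": (0.30, 74),
    "ending_B": (0.05, 5),
    "ending_C": (0.10, 40),
}

build_order = sorted(candidates, key=lambda name: priority(*candidates[name]), reverse=True)
print(build_order)
```

With these placeholder numbers the rare but cheap `ending_B` comes first. Recomputing the order after each finished table would let shared subendings lower the cost of the remaining candidates.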

What may a “reference design” of the project look like

Post by kronsteen »

Generator compatible with all of the following metrics : DTM, DTC, DTZ, WDL, DTZ50, WDL50. This modularity should not be too hard a requirement. DTZ50/WDL50 might be seen as a possible later add-on but it would be handy to get fast availability of such a code in order to build 3-6men DTZ50 tables as they don’t exist yet.

Metric choice for mass 7-men generation : WDL+DTZ (or maybe DTC). Generating code is the same for both, most endings will be stored in WDL, only a few very interesting endings may be stored in DTZ/C. This will be harmless as the WDL stored tables will allow immediate DTZ/C recomputability for any ending.

Development :

Focused on 43/43p only (52/52p shall be delayed as a later effort)
Goal 1 : a small set of well chosen “demonstrative” pawnless endings stored in DTZ or DTC. krbnkqn is an excellent choice as it probably holds the DTZ/C record (517 moves) and has maximal complexity (no piece duplication), a wanted feature to demonstrate code capability
Goal 2 : The krppkrp* “demonstrator” tablebase
Goal 3 : The krppkrp true tablebase
Goal 4 : Successive building of endings of interest chosen among most interesting types for real games, kpppkpp coming last to complete the whole set

End result : Full WDL 43+43p set, with a small selection of most interesting endings in DTZ (or DTC) format, including at least krbnkqn, krppkrp, kmppkmp, (m = minor piece), kmppkrp, kqppkqp, kpppkmp, kpppkpp. Global size estimated at 10 TB first, to be inflated by progressive addition of DTZ/C tables for endings of interest. Some members of the community shall own the whole WDL set in order to be able to build immediately any DTZ/C table on request.

Reaching this point will be a big achievement, after that the time will have come to discuss another goal : DTM 7-men, WDL+DTZ/C 52/52p, 6-7men WDL50+DTZ50…

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by Kirill Kryukov »

Wow, that's a lot of text. Thanks for the enthusiasm, kronsteen, particularly for the explanation of the metrics! If I reply to every point, this thread will probably explode, so I'll try to comment very briefly.
kronsteen wrote:Generator compatible with all of the following metrics : DTM, DTC, DTZ, WDL, DTZ50, WDL50. This modularity should be not too hard a requirement. DTZ50/WDL50 might be seen as a possible later add-on but it would be handy to get fast availability of such a code in order to build 3-6men DTZ50 tables as they don’t exist yet.
So, your proposal is to add DTC and DTZ, compared to the current specs. I have no strong feeling against it, because these metrics are trivial to have when already having DTM and DTZ50. My only concerns are 1) With 6 required metrics the specs are no longer minimalistic. 2) This increases the work for the verification team to verify that the submitted generator complies with the specs. Also the verification time is increased. But perhaps the added sense of "completeness" outweighs the cons, and maybe with these metrics the project is attractive to a larger audience.

We seem to agree that there are camps of users with different needs. One thing that became clear from the previous discussions is that the pro-50-move-rule and against-50-move-rule camps (which we called "players" and "composers" respectively) are both significant, and both unlikely to change their minds. If we are to succeed, the project should be attractive to both camps. So I think the mass generation will go in parallel in both WDL and WDL50. So having both of these metrics in the basic specs is the only way out that I can see. The good thing is that technically WDL and WDL50 are very similar to implement (unlike DTM vs DTM50).

In general, it will be up to the community to choose which tables are generated next. This is the whole point - the community plays the key role in all phases of this project: Now it drafts the specs and motivates the developers. When the generator is complete, the community directly participates in building the tables, and in distributing them. When the tables are generated, the community hosts them, and provides remote probing access. The infrastructure for all those tasks will have to be designed, but it's doable.

EDIT: The infrastructure thread has been started.
kronsteen
Posts: 88
Joined: Fri Aug 01, 2008 11:20 am
Sign-up code: 0
Location: France

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by kronsteen »

Kirill Kryukov wrote:So I think the mass generation will go in parallel in both WDL and WDL50. So having both of these metrics in the basic specs is the only way out that I can see. The good thing is that technically WDL and WDL50 are very similar to implement (unlike DTM vs DTM50).
Beware, reading this I would like to be sure that there’s no misunderstanding. It is trivial to give WDL50/DTZ50 capability to a WDL/DTZ generating code (just have an option to stop at 50th iteration instead of running until no new winning position is found), but generally both tables cannot conveniently be generated in a single run. WDL50 is not an intermediate result stored at 50th iteration of a WDL run, because it uses WDL50 and not WDL subtables as entries.
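To make the "stop at the Nth iteration" point concrete, here is a minimal sketch of a retrograde loop with an optional iteration cap. It runs on an abstract toy graph, not on chess positions; the names `solve`, `moves`, `lost_seeds` and `max_plies` are made up for this illustration. In a real generator the seeds would include results probed from subtables of the same metric, which is exactly why a capped WDL run is not automatically a WDL50 table.

```python
from typing import Dict, List, Optional, Set


def solve(moves: Dict[str, List[str]],
          lost_seeds: Set[str],
          max_plies: Optional[int] = None) -> Dict[str, int]:
    """Toy retrograde analysis on an abstract game graph.

    moves[p] lists the positions reachable from p in one ply.
    lost_seeds are positions already known to be lost for the side to move
    (checkmates in this toy; a real generator would also seed from subtables).
    Returns depth in plies for every decided position: even depth means the
    side to move loses, odd depth means it wins. Positions missing from the
    result are draws. max_plies=None runs until nothing new is found.
    """
    depth = {p: 0 for p in lost_seeds}
    ply = 0
    while max_plies is None or ply < max_plies:
        ply += 1
        new: Dict[str, int] = {}
        for pos, succ in moves.items():
            if pos in depth or not succ:
                continue  # already decided, or no legal move (toy stalemate)
            if ply % 2 == 1:
                # Win: some move reaches a position lost in ply-1.
                if any(depth.get(s) == ply - 1 for s in succ):
                    new[pos] = ply
            else:
                # Loss: every move reaches a position already known as won.
                if all(s in depth and depth[s] % 2 == 1 for s in succ):
                    new[pos] = ply
        if not new:
            break  # no new decided positions, the run has converged
        depth.update(new)
    return depth


# A forced line a -> b -> c -> d -> e, where e is checkmate.
chain = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"], "e": []}
full = solve(chain, {"e"})                 # {'e': 0, 'd': 1, 'c': 2, 'b': 3, 'a': 4}
capped = solve(chain, {"e"}, max_plies=2)  # {'e': 0, 'd': 1, 'c': 2}
```

With the cap, "a" and "b" stay undecided and would be stored as draws. The only code difference between the two runs is the loop bound; the seeds passed in are what must differ between a WDL and a WDL50 build.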

I feel that trying to generate WDL + WDL50 in parallel may not be so wise, as it will increase running times in an already crowded project. Adding WDL50 to WDL is marginal in storage space, but not in computation time. Some optimizations are conceivable, but basically most of the time a second run will be required to get WDL50.

Probably better to go on with WDL+DTZ only and see WDL50+DTZ50 as a follow-on project, living its own life and beginning with 3-6 men table generation (adding 3-6men DTZ50 tables to the existing DTM for free release would be nice, wouldn’t it ?). This doesn’t decrease the desirability of a multi-metric capable generating code, of course, for all the good reasons already given.
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by Kirill Kryukov »

No misunderstanding. When I mentioned simultaneous WDL + WDL50 generation, I was assuming two totally independent processes. However, now that I think about it, perhaps generation can be sped up if the same table is already available in a counterpart metric. But this is not a very important optimization, and possibly error-prone.

Regarding the idea of everyone focusing on just WDL and not WDL50 (or vice versa), I'm not sure. I think there is no necessity to impose any order on the community, and just let everyone enjoy building the tables in his favorite metric. Eventually the whole 7-piece solution should be complete in both WDL and WDL50. It may take a year or two, but the important point is that we'll get there.

DTZ, DTZ50 and DTM - these will be the follow-up projects, although, again, everyone will have the choice to do them earlier if they can't wait. More choices - more fun, this is how I see it. Assuming the infrastructure is good enough to handle all this mess.
kronsteen
Posts: 88
Joined: Fri Aug 01, 2008 11:20 am
Sign-up code: 0
Location: France

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by kronsteen »

Kirill Kryukov wrote:No misunderstanding. When I mentioned simultaneous WDL + WDL50 generation, I was assuming two totally independent processes. However, now that I think about it, perhaps generation can be sped up if the same table is already available in a counterpart metric. But this is not a very important optimization, and possibly error-prone.
Such is also my thought. Endings unaffected by the 50-move rule can be easily identified, and WDL+WDL50 generation can be handled in a single run for endings having all their subendings unaffected. Having a DTZ (or DTZ50) requirement will force pawn slicing in pawnful endings, allowing the above scheme to work. But it’s probably not worth the effort, as there are probably not many 7-men endings satisfying the condition.
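The single-run condition can be sketched as a walk over the subending graph. This is only an illustration under stated assumptions: "unaffected" is taken here to mean that the longest DTZ in the ending and in everything below it stays within a ply limit (100 by default, adjustable), the ending names are placeholders and the DTZ figures are invented, not measured values.

```python
from typing import Dict, List


def unaffected(ending: str,
               max_dtz_plies: Dict[str, int],
               subendings: Dict[str, List[str]],
               limit_plies: int = 100) -> bool:
    """True if neither the ending nor any ending below it exceeds the limit."""
    if max_dtz_plies[ending] > limit_plies:
        return False
    return all(unaffected(s, max_dtz_plies, subendings, limit_plies)
               for s in subendings.get(ending, []))


def single_run_candidates(max_dtz_plies: Dict[str, int],
                          subendings: Dict[str, List[str]],
                          limit_plies: int = 100) -> List[str]:
    """Endings whose subendings are all unaffected.

    For these, WDL and WDL50 start from identical subtable values, so one run
    could in principle yield both tables. The ending itself may be affected.
    """
    return sorted(
        e for e in max_dtz_plies
        if all(unaffected(s, max_dtz_plies, subendings, limit_plies)
               for s in subendings.get(e, []))
    )


# Placeholder endings with invented maximum DTZ values (in plies).
MAX_DTZ = {"X": 40, "Y": 120, "Z": 30, "W": 60}
SUBS = {"Y": ["X"], "Z": ["Y"], "W": ["X"]}

candidates = single_run_candidates(MAX_DTZ, SUBS)  # ['W', 'X', 'Y']
```

"Z" drops out because its subending "Y" is affected, even though "Z" itself is short. "Y" stays in, since only its subendings matter for the seeds.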
Kirill Kryukov wrote:Regarding the idea of everyone focusing on just WDL and not WDL50 (or vice versa), I'm not sure. I think there is no necessity to impose any order on the community, and just let everyone enjoy building the tables in his favorite metric. Eventually the whole 7-piece solution should be complete in both WDL and WDL50. It may take a year or two, but the important point is that we'll get there.
Maybe then we should think of an early effort to build and release 3-6men WDL50/DTZ50 tables (this is a necessary first step, anyway – is code readily available?). Then we’ll be able to gauge more accurately the overall interest in those. DTZ50 is the optimal metric for engine Elo improvement in games played under the 50-move rule, and there will be a time when DTZ50-armed engines will earn draws in positions assumed to be won by engines running on DTM…
Kirill Kryukov wrote:DTZ, DTZ50 and DTM - these will be the follow-up projects, although, again, everyone will have the choice to do them earlier if they can't wait. More choices - more fun, this is how I see it. Assuming the infrastructure is good enough to handle all this mess.
Basically, EGTBs form a two-dimensional space, with positions and metrics. To gain new knowledge, one can concentrate the effort in one direction, exploring more complex endings with the simplest metrics (7-men WDL(50)) or more complex metrics on the simplest endings (3-5 men DTM50/DTR). One can also make a “diagonal” push (all 6-men DTZ50, 7-men DTM samples), in every case ending complexity x metric complexity being the limiting factor. Infrastructure should ideally be capable of welcoming and capitalizing on every effort made anywhere in the technically feasible territory.
tralala
Posts: 3
Joined: Mon Apr 25, 2011 10:55 pm
Sign-up code: 10159

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by tralala »

Hi,

I think kronsteen explained it very well and it is obvious that WDL (or bitbases) would be the first goal towards a full set of 7men EGTBs. Therefore I would concentrate the effort only on this metric and once that is achieved (or being achieved) one can target other metrics.

Am I right that a complete set of 7men bitbases won't help the creation of DTM-Bases?

regards Joachim Rang (Fruitchess)
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by Kirill Kryukov »

kronsteen wrote:
Kirill Kryukov wrote:Regarding the idea of everyone focusing on just WDL and not WDL50 (or vice versa), I'm not sure. I think there is no necessity to impose any order on the community, and just let everyone enjoy building the tables in his favorite metric. Eventually the whole 7-piece solution should be complete in both WDL and WDL50. It may take a year or two, but the important point is that we'll get there.
Maybe then we should think of an early effort to build and release 3-6men WDL50/DTZ50 tables (this is a necessary first step, anyway – is code readily available?). Then we’ll be able to gauge more accurately the overall interest in those. DTZ50 is the optimal metric for engine Elo improvement in games played under the 50-move rule, and there will be a time when DTZ50-armed engines will earn draws in positions assumed to be won by engines running on DTM…
I'm not aware of any such code. Discussing an early effort is nice, but I'm not sure what this even means, practically. Creating a sub-bounty for a 6-piece WDL generator? Note that the current bounty is not particularly huge even undivided. I think if we want to make this project happen, we should concentrate on one ambitious goal - a 3-to-7 piece generator. Such a goal will have a better chance of raising any funding.

Another point is: I'm not actually sure that this will allow us to estimate the relative interest in WDL versus WDL50 in the community. For most people 6-piece chess is solved, as far as they care. It's at 7 pieces that the wider community will become interested and join the party.
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by Kirill Kryukov »

kronsteen wrote:Basically, EGTBs form a two-dimensional space, with positions and metrics. To gain new knowledge, one can concentrate the effort in one direction, exploring more complex endings with the simplest metrics (7-men WDL(50)) or more complex metrics on the simplest endings (3-5 men DTM50/DTR). One can also make a “diagonal” push (all 6-men DTZ50, 7-men DTM samples), in every case ending complexity x metric complexity being the limiting factor. Infrastructure should ideally be capable of welcoming and capitalizing on every effort made anywhere in the technically feasible territory.
tralala wrote:I think kronsteen explained it very well and it is obvious that WDL (or bitbases) would be the first goal towards a full set of 7men EGTBs. Therefore I would concentrate the effort only on this metric and once that is achieved (or being achieved) one can target other metrics.
I think there is a slight confusion here. I agree that having a consistent and rational set of goals is good. I agree that focusing efforts in one well-chosen direction is good. That having a plan and target is good. This all only works for a single person. It won't work for the community (at least not for this community), because the community is very diverse. One person can plan the efforts, and then do the efforts. Community (as a whole) can't. You can try to direct the community efforts gently, but any attempts to just force your strategy on the community will probably fail.

What we want to achieve is to give the community tools and infrastructure, to make progress towards 7-piece tables fast, easy and enjoyable. Some will want to do WDL, some WDL50 and there is no way around this. Trying to force everyone to do WDL will result in those who prefer WDL50 leaving the project, and vice versa. Then, some may want to do DTM from the start and nothing else.

We need to develop a strategy that will allow everyone to participate, and to have fun, regardless of their metric choice. At least the way that I see it.
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by Kirill Kryukov »

I added DTZ and DTC into the requirement specs draft.

DTZ - for symmetry with DTZ50. It allows skipping uninteresting endgames by solving them in WDL, and solving only the interesting ones in DTZ. DTZ50/WDL50 allows this under the 50-move rule, so the DTZ/WDL combination will make the same strategy possible for those who don't want to consider the 50-move rule.

DTC - as it will allow comparison with the results of Yakov Konoval and Marc Bourzutschky. Also DTC is trivial to implement when both DTM and DTZ are done, so it should not be a big burden on the developers.
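One way to see why DTC comes almost for free once DTM and DTZ exist: the three metrics differ mainly in which event ends the count. The sketch below uses the commonly cited definitions (DTM counts to mate, DTC to a capture or promotion, DTZ to a capture or pawn move); it is not taken from the specs draft, and the `Move` class and `ends_count` function are hypothetical names for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """Just the move properties the depth metrics care about."""
    is_capture: bool = False
    is_pawn_move: bool = False
    is_promotion: bool = False
    gives_mate: bool = False


def ends_count(metric: str, move: Move) -> bool:
    """Does this move end the depth count under the given metric?"""
    if move.gives_mate:
        return True  # mate ends the count in every depth metric
    if metric == "DTM":
        return False  # only mate counts
    if metric == "DTC":
        return move.is_capture or move.is_promotion  # conversion
    if metric in ("DTZ", "DTZ50"):
        return move.is_capture or move.is_pawn_move  # zeroing move
    raise ValueError(f"unknown depth metric: {metric}")


quiet_pawn_push = Move(is_pawn_move=True)
capture = Move(is_capture=True)
```

A quiet pawn push ends the count for DTZ but not for DTC, while a capture ends it for both, so a generator that already handles the DTZ and DTM cases mostly needs a different predicate for DTC.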
kronsteen
Posts: 88
Joined: Fri Aug 01, 2008 11:20 am
Sign-up code: 0
Location: France

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by kronsteen »

tralala wrote:Am I right that a complete set of 7men bitbases won't help the creation of DTM-Bases?
You’re right. Building a DTM tablebase needs having DTM for all subendings, bitbases are not enough.
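A small sketch of why a bitbase is not enough. Suppose two captures both lead into a won subending: a WDL probe reports the same result for both, while a DTM probe tells them apart. The move names and depths below are invented for illustration, and `best_capture` is a hypothetical helper, not part of any existing generator.

```python
from typing import Dict, Tuple


def best_capture(sub_dtm_plies: Dict[str, int]) -> Tuple[str, int]:
    """Pick the capture that mates fastest.

    sub_dtm_plies maps each winning capture to the DTM (in plies) of the
    resulting subending position, where the opponent is to move and losing.
    The depth of the current position is one ply more than the best of these.
    """
    move, sub_depth = min(sub_dtm_plies.items(), key=lambda kv: kv[1])
    return move, sub_depth + 1


# Invented probe results for two captures into the same subending.
wdl_probe = {"Rxa7": "loss", "Rxh7": "loss"}  # bitbase: no way to rank them
dtm_probe = {"Rxa7": 20, "Rxh7": 4}           # DTM subtable: depths differ

choice = best_capture(dtm_probe)  # ('Rxh7', 5)
```

The DTM value stored for the 7-men position depends on the number 4, which the bitbase simply does not contain.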
Kirill Kryukov wrote:I'm not aware of any such code. Discussing an early effort is nice, but I'm not sure what this even means, practically. Creating a sub-bounty for a 6-piece WDL generator?
No, I know that some DTZ50 6-men tables already exist today (www.chess.jaet.org) and wondered if the generating code could be made available to generate more tables, or if a fast effort to build a DTZ50 code from an existing one such as Gaviota could be made. This is not required, of course, if a multimetric 3-7 men code is achieved quickly, but who knows ?
Kirill Kryukov wrote:I think there is a slight confusion here. I agree that having a consistent and rational set of goals is good. I agree that focusing efforts in one well-chosen direction is good. That having a plan and target is good. This all only works for a single person. It won't work for the community (at least not for this community), because the community is very diverse. One person can plan the efforts, and then do the efforts. Community (as a whole) can't. You can try to direct the community efforts gently, but any attempts to just force your strategy on the community will probably fail.

What we want to achieve is to give the community tools and infrastructure, to make progress towards 7-piece tables fast, easy and enjoyable. Some will want to do WDL, some WDL50 and there is no way around this. Trying to force everyone to do WDL will result in those who prefer WDL50 leaving the project, and vice versa. Then, some may want to do DTM from the start and nothing else.

We need to develop a strategy that will allow everyone to participate, and to have fun, regardless of their metric choice. At least the way that I see it.
Of course there is no point in trying to push people in directions they don’t want. But releasing code and letting uncoordinated people do what they want for themselves probably isn’t best either. This is because most of the imaginable goals are simply out of reach for a single person. I mentioned krppkrp before, this is an already identified goal (see this thread) and I don’t see someone alone harvesting its 74 subendings. One can hope to find enough people interested in multi-metric mass generation and nothing else, but this will probably be a very lengthy effort and earlier intermediate results will certainly be desirable.

The solution to this problem is clearly to provide information and support in order to help people coordinate their efforts towards goals they can define collectively. I see at least two possible tools about this :

- Having somewhere a “pool of wishes” where everybody can submit what he wants built first, find out if his interests are shared by others, and what goals make the most sense for collective efforts

- Having an information page giving some clues about feasibility and efforts required to build some specific 7-men endings, subsets or complete sets.
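As a rough idea of what the “pool of wishes” could compute, here is a minimal tally that counts one vote per user for each (ending, metric) pair and ranks the pairs. The record layout, the function name `rank_wishes` and the sample entries are all assumptions for illustration; nothing like this exists yet.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Wish = Tuple[str, str, str]  # (user, ending, metric)


def rank_wishes(wishes: Iterable[Wish]) -> List[Tuple[Tuple[str, str], int]]:
    """Rank (ending, metric) pairs by the number of distinct users asking."""
    unique = set(wishes)  # a user repeating the same wish counts once
    counts = Counter((ending, metric) for _user, ending, metric in unique)
    # Most wanted first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


sample = [
    ("user1", "krppkrp", "WDL"),
    ("user2", "krppkrp", "WDL"),
    ("user2", "krppkrp", "WDL"),     # duplicate submission, ignored
    ("user3", "kqqpkqq", "DTZ50"),
]
ranking = rank_wishes(sample)
# [(('krppkrp', 'WDL'), 2), (('kqqpkqq', 'DTZ50'), 1)]
```

The second tool, the feasibility page, would need real size and time estimates per ending, which this sketch does not attempt.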
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan

Re: 7-man EGTB Bounty Reborn - Metric Discussion

Post by Kirill Kryukov »

kronsteen wrote:Of course there is no point in trying to push people in directions they don’t want. But releasing code and letting uncoordinated people do what they want for themselves probably isn’t best either.
Agreed. I've been advocating for coordinating the efforts for a few years now, in both sharing and computing the tablebases. (This is one of the reasons why this EGTB forum exists here).
kronsteen wrote:This is because most of the imaginable goals are simply out of reach for a single person. I mentioned krppkrp before, this is an already identified goal (see this thread) and I don’t see someone alone harvesting its 74 subendings. One can hope to find enough people interested in multi-metric mass generation and nothing else, but this will probably be a very lengthy effort and earlier intermediate results will certainly be desirable.

The solution to this problem is clearly to provide information and support in order to help people coordinate their efforts towards goals they can define collectively. I see at least two possible tools about this :

- Having somewhere a “pool of wishes” where everybody can submit what he wants built first, find out if his interests are shared by others, and what goals make the most sense for collective efforts

- Having an information page giving some clues about feasibility and efforts required to build some specific 7-men endings, subsets or complete sets.
Yes, these are some of the problems to be addressed by the infrastructure sub-project. On some primitive level these tasks can be helped by just exchanging messages in a forum like this one, but eventually some specialized database of tables should probably be made.