New Bitbase format

Codeman · Post by **Codeman** » Sun Mar 16, 2008 7:54 am

Hallo!

In the past months I have been writing a bitbase generator. Currently it only supports pawn-less endgames and I haven't implemented the optimization of n-like men, which I want to do in the upcoming days. I tried generating some 5men endgames and it produces the right number of indexed positions (compared to Nalimov). The code is written in a very general way, so that it would even generate n-men bitbases. The main feature of this format is the space advantage. The final file only takes 1 bit per position and during generation it takes 2 bits. Furthermore it seems to be very well compressible. The downside of the bitbase generator is the time it takes to generate the files. I would estimate it to take maybe a factor 5-10 longer than the usual DTM egtb (depending on the type of endgame). This is because DTM takes 3 times more space while generation and so can use the stored information more efficiently.

The only real problem I encountered yesterday when trying to generate my first 6-men tablebase. It would take the program 1474 mb ram to generate it (462*62*61*60*59* 2/8) My computer wasn't anymore able to hold it in ram, so I wanted to ask here, whether you know of any efficient algorithms to combine harddrive and ram to still ensure a relatively fast generation.

Secondly I wanted to know if any of the experts here would be so kind and have a look at the code and check whether any major improvements can be made. (I myself am only a hobby programmer interested in the principles of chess-programming and therefore not aware of many little speed-up tricks)

And finally I want to donate the code to the community and here I have some questions as well. One often reads in this board that there are licensing problems when using e.g Nalimov files. Could you give me some advice under which license to publish my code (as I am not a lawyer either)

So if you are interested, please contact me. I am not going to make the source files public to anyone and just put it on my webspace, as I want to minimize the risk of someone corrupting it.

Edmund

Post by **Kirill Kryukov** » Sun Mar 16, 2008 12:37 pm

Hi Edmund! Great to hear about your progress! Looking forward to use your generator!

About the license, I think any OSI-approved license should be fine. Personally I like zlib/libpng license, because it is short enough so someone might actually read it.

jkominek · Post by **jkominek** » Mon Mar 17, 2008 3:58 am

A timely subject, for as it happens I have a new bitbase format to propose. I call it a "wldr bitbase", short for win/loss/draw/rule. The idea is that both regular draws and drawn-by-the-50-move-rule draws are represented. It sprang from two sources, a) all the recent discussion about DTZ and DTR tablebases, and b) Kirill's wish for a way to search into 6-man tables efficiently from positions having more than 6 men.

Storing two distinct types of draws offers the search a fundamental freedom. It can be a scofflaw chess purist, or a rule-abiding citizen of FIDE.

In addition, it is satisfying to finally make good use of the 4th bit pattern. In 2's complement interpretation:

Code: Select all

 1  = 01 -> win
 0  = 00 -> normal draw (plus broken positions)
-1 = 11 -> loss
-2 = 10 -> draw by rule

If a table has no draw-by-rule positions, it reverts identically to a normal wld bitbase. Also, if you want to use it in a normal wld way, a draw value is determined by checking the least significant bit. Any bit patterning will do, but for simplicity and backward compatibility I favor this arrangement.

It never hurts to ask "why bother at all?". In the lingo of patent law, is it new and useful? There are free Scorpio bitbases up to 5 men, and Shredderfolk are nearing the completion of 6-man btibases. In Edmund's case, win/non-win bitbases are the smallest useful thing, and are not currently available. That's reason enough to pursue.

In my case, it the prospect of efficiently searching from 7 and 8-man positions motivates the development of 4-valued bitbases. And if the result can't both be True and Pure, at least it can be True or Pure.

john

jkominek · Post by **jkominek** » Mon Mar 17, 2008 4:16 am

Codeman wrote:In the past months I have been writing a bitbase generator. ... The code is written in a very general way, so that it would even generate n-men bitbases. The main feature of this format is the space advantage. The final file only takes 1 bit per position and during generation it takes 2 bits.

Congratulations on your progress! Space-efficiency is very nice, the generality you speak of is interesting.

The only real problem I encountered yesterday when trying to generate my first 6-men tablebase. It would take the program 1474 mb ram to generate it (462*62*61*60*59* 2/8) My computer wasn't anymore able to hold it in ram, so I wanted to ask here, whether you know of any efficient algorithms to combine harddrive and ram to still ensure a relatively fast generation.

In increasing order of programming difficulty, you have 4 choices.

1. buy more memory and install it in your computer
2. buy a shared memory supercomputer from Cray or SGI
3. implement memory-mapped files
4. implement a custom disk-to-memory controller

mmap'd files allow you to program as if your hard drive where just a big chunk of regular RAM (albeit very slow RAM). Use of it is supposed to yield identical results, as if it were all real memory, but I have my doubts. I prefer to use it only in read-only mode, and to handle writing of file blocks explicitly. But that's more effort, which brings along its own bucket of bugs.

Secondly I wanted to know if any of the experts here would be so kind and have a look at the code and check whether any major improvements can be made. (I myself am only a hobby programmer interested in the principles of chess-programming and therefore not aware of many little speed-up tricks)

I'm willing to take a look, from a perspective that suites my specific interest. I want a way of verifying my tbgen2 work - by writing or using a program that derives from an independent code base. By the way, we are all hobbyists here. Maybe Eiko Bleicher has reason to think otherwise, but there isn't much money in tablebase work.

And finally I want to donate the code to the community and here I have some questions as well.

Please take Kirill's reply to mean "Welcome. You've come to the right place."

john

jkominek · Post by **jkominek** » Mon Mar 17, 2008 4:22 am

In addition, it is satisfying to finally make good use of the 4th bit pattern.

A forgot to describe a practical detail. To create these things my plan is not to configure a generator to make wldr bitbases, but to down-convert existing DTM tablebases. And then apply a second pass with a DTZ50 tablebase, flipping the bits of draws by rule. I've made arrangements with John Tomplin to receive a copy of his non-Nalimov files (in exchange for filling him out with what he is missing).

john

Codeman · Post by **Codeman** » Mon Mar 17, 2008 8:10 pm

Kirill Kryukov wrote:Hi Edmund! Great to hear about your progress! Looking forward to use your generator!

About the license, I think any OSI-approved license should be fine. Personally I like zlib/libpng license, because it is short enough so someone might actually read it.

Thank you Kirill, I have taken your proposed license now.

Codeman · Post by **Codeman** » Mon Mar 17, 2008 8:20 pm

jkominek wrote:
The only real problem I encountered yesterday when trying to generate my first 6-men tablebase. It would take the program 1474 mb ram to generate it (462*62*61*60*59* 2/8) My computer wasn't anymore able to hold it in ram, so I wanted to ask here, whether you know of any efficient algorithms to combine harddrive and ram to still ensure a relatively fast generation.
In increasing order of programming difficulty, you have 4 choices.

1. buy more memory and install it in your computer
2. buy a shared memory supercomputer from Cray or SGI
3. implement memory-mapped files
4. implement a custom disk-to-memory controller

mmap'd files allow you to program as if your hard drive where just a big chunk of regular RAM (albeit very slow RAM). Use of it is supposed to yield identical results, as if it were all real memory, but I have my doubts. I prefer to use it only in read-only mode, and to handle writing of file blocks explicitly. But that's more effort, which brings along its own bucket of bugs.

sounds all interesting, but I think I will go for the memory-mapped files. Only then will it probably be possible to attempt generating 7-men at one point (currently I could not afford 85gb ram, but that amount of hard-disk is more likely)

jkominek wrote:
Secondly I wanted to know if any of the experts here would be so kind and have a look at the code and check whether any major improvements can be made. (I myself am only a hobby programmer interested in the principles of chess-programming and therefore not aware of many little speed-up tricks)
I'm willing to take a look, from a perspective that suites my specific interest. I want a way of verifying my tbgen2 work - by writing or using a program that derives from an independent code base. By the way, we are all hobbyists here. Maybe Eiko Bleicher has reason to think otherwise, but there isn't much money in tablebase work.

I have just sent you a pm with the source enclosed. I hope it is of some use to you. Maybe you even have some ideas to improve it.

Edmund

Arpad Rusz · Post by **Arpad Rusz** » Tue Mar 25, 2008 1:48 pm

The idea to use the 4th bit pattern is very nice. As I am a study composer, I don't really care of the 50 move draw rule. So I'm one of those chess purist jkominek has talked about. But I don't need the exact depth of a win either. A simple win/draw/lose information and the fact that in a position there is an unique winning/drawing move or not is enough.
So I was wondering if the win/draw/lose and unique/not unique move information can be stored in 2 bits. It seems that - with a little trick - it can be stored:

00 - unique move draws
01 - unique move wins
10 - lost
11 - draw or win (and there is no unique move)

When a position has 11, then an one ply search is needed to find the draw/win information:
- if all moves from that position lead to 10 positions then it is a win
- if one of the moves leads to a 00 position then it is a draw
- if at least one of the moves leads to a 11 position then it is a draw

A further development would be to use instead of the unique moves the "effective unique moves"- see the following old thread:
http://kirr.homeunix.org/chess/discussi ... ?f=6&t=824

Then the Effective Unique Move Sequences (EUMS) could be found easily. Those could be starting points for new studies. A software that uses this new kind of Unique Move Bitbases and has the features of Wilhelm and CQL, would be any composers dream...

jkominek · Post by **jkominek** » Wed Mar 26, 2008 5:59 pm

Arpad Rusz wrote:So I was wondering if the win/draw/lose and unique/not unique move information can be stored in 2 bits. It seems that - with a little trick - it can be stored:

00 - unique move draws
01 - unique move wins
10 - lost
11 - draw or win (and there is no unique move)

This is an interesting alternative that I hadn't thought of. The question in my mind is whether there is simplicity in search that outweighs the hassle of creating a derivative tablebase set. I don't know.

A further development would be to use instead of the unique moves the "effective unique moves"- see the following old thread: http://kirr.homeunix.org/chess/discussi ... ?f=6&t=824

I recall this thread now. In skimming it, the basic idea is that you want to ignore cycles from consideration of unique moves, is that right? The discussion seemed fairly esoteric at the time.

Guyhaw said to be "writing-up the algorithm now". Don't know what became of that initiative. A year and a half has passed.

jk

Arpad Rusz · Post by **Arpad Rusz** » Thu Mar 27, 2008 1:50 pm

jkominek wrote:The question in my mind is whether there is simplicity in search that outweighs the hassle of creating a derivative tablebase set. I don't know.

As far as I know, Wilhelm is the only software which has the feature to search for unique moves in the Nalimov databases. For example, you can search for positions where the only winning move is an underpromotion or a knight move to the corner of the board. The problem is that the more complex searches in the 6-man databases could take hours to finish because every move has to be checked if it is unique or not. Usually only 1 position from a 1000 has an unique winning or drawing move! So you can imagine the gain from the new type of bitbases.

jkominek wrote:the basic idea is that you want to ignore cycles from consideration of unique moves, is that right?

Yes, that's the basic idea. The white moves from a chess study are usually unique and the "main line" and the "variations" are UMSs (Unique Move Sequences). The same is true for black in the "tries". But sometimes there is an alternative move which also wins but the other side can force the return to the previous position. It is universally accepted that in this case there is no "dual" in the study. So that's why the Effective Unique Move Sequences (EUMS) are important.

jkominek · Post by **jkominek** » Sat Mar 29, 2008 3:05 am

Arpad Rusz wrote: It is universally accepted that in this case there is no "dual" in the study. So that's why the Effective Unique Move Sequences (EUMS) are important.

I'm unfamiliar with that term. What is meant by a "dual"?

Also, do you have an example of an EUMS with a cycle to be avoided? Perhaps there is a study on your web site that you can point to.

john

Arpad Rusz · Post by **Arpad Rusz** » Sat Mar 29, 2008 11:45 am

"Dual" - double solution in a chess study or a problem. A chess composition with a dual is worthless (it is "cooked"

).

In the following position the Nalimov database gives two winning moves(1.Kb4 #64 and 1.Qb3 #66):
8/8/8/8/Q1P3k1/K3q3/8/8

But the second move leads to a cycle: after 1.Qb3 black can force the return to the previous position with 1...Qa7+! 2.Qa4(the only move which is still winning) 2...Qe3+. This means that 1.Kb4 is the only effective move which wins. So in this case there is no dual because the only way to win is by playing Kb4.

The second position is a study by Kling and Horwitz (1851):

7q/4K1k1/7p/6p1/8/8/8/7Q

White to move and win

The solution: 1.Qa1+ Kh7 2.Qb1+ Kg7 3.Qb2+ Kh7 4.Qc2+ Kg7 5.Qc3+ Kh7 6.Qd3+ Kg7 7.Qd4+ Kh7 8.Qe4+ Kg7 9.Qe5+ Kh7 10.Qf5+ Kg7 11.Qf7#
But is this solution without duals? As white can win only by checks, the first two white moves are unique. But in the third move there are two ways to give check to the black king: 3.Qb2+ and 3.Qa1+. Of course the second move leads only to a cycle and it is not a dual! For the same reason, there are no duals later. So the study is not "cooked", it is "sound"!

jkominek · Post by **jkominek** » Sun Mar 30, 2008 5:09 pm

Arpad - Thank you for the examples! That helps greatly to fix the idea.

john

jkominek · Post by **jkominek** » Mon Mar 31, 2008 3:50 pm

Arpad -

By the way, what program did you use to generate your board diagrams? And what are those mysterious numbers at the bottom? In your examples it looks like white and black piece counts, but I've seen other diagrams with three numbers (e.g. "3 + 4 + 2") so it must be something else.

john

Arpad Rusz · Post by **Arpad Rusz** » Mon Mar 31, 2008 9:13 pm

I didn't use any program to generate those diagrams! This is a general feature of the CCRL Discussion Board. You can post a diagram using HTML-like tags with "diagram":

[diagra m]FEN[/diagra m],

where FEN must be a position in Forsyth notation (for the initial position FEN is rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR).
The simplest way to generate the FEN of a position is using Position Setup in ChessBase (or Fritz, Shredder...) and then "Copy FEN".

The numbers are indeed white and black piece counts. The third number can be the piece count for neutral pieces in some fairy problems...

If you need a tool to create chess diagrams I would recomand DiagTransfer, which is an excellent freeware:

http://alain.blaisot.free.fr/DiagTransf ... h/home.htm

All diagrams from my blog are created with DiagTransfer.
(Recomandation: always use "Save as Picture..."!)

guyhaw · Post by **guyhaw** » Sat Apr 12, 2008 12:29 am

[/quote]
Guyhaw said to be "writing-up the algorithm now". Don't know what became of that initiative. A year and a half has passed.

jk[/quote]

Thanks for the reminder. My ideas about writing up the algorithm rather spiralled into a study of principles for writing up algorithms and 'what is a good pseudo-code'. Looking back on it, in the past I made rather a poor attempt to explain how to generate DTR and DTZR EGTs, including getting the recursion formula wrong

.

[ Incidentally, it is not true - as I have said from time to time - that DTZR >= DTZ. Black only captures to lose when forced to with the DTZ metric, but chooses to do so when it makes for a longer, later phase in the play. ]

I decided to write up the algorithm in a layered way, showing that a 'first feasible' algorithm exists [effectiveness] and then showing how to make it run more quickly [efficiency]. Then I switched priorities and wrote up "Gentlemen, Stop Your Engines!" first instead - a proposal for rating players via the moves they make rather than the game results they achieve.

g

CCRL Discussion Board

New Bitbase format

New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format

Re: New Bitbase format