Further compression

Endgame analysis using tablebases, EGTB generation, exchange, sharing, discussions, etc.

Further compression

Postby EricLang » Fri Nov 07, 2008 10:20 am

Ok, thanks to your great answers in my other posts I got my tablebases working!

A short description:
-Positions with wtm and btm are stored in one file.
-Duplicate positions are stored (in the case of non-identical pieces), but I think this is not very much.
-Use of the unique triangle.
-Use of precomputed legal king combinations.
-Use of a "same piece" ordering algorithm (p1 > p2 > p3 > p4).
-Bit-packing.
-Data compression in blocks of about 8192 bytes (depending on the bits needed per position).
-At the end of the file is the index with the positions of all compressed blocks, just an array of integers (which is data-compressed too).
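For illustration, a minimal sketch of how a reader might use that trailing index to find a position's compressed block; the names and the fixed positions-per-block assumption are mine, not the actual file format:

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical layout: the array of integers at the end of the file,
// once decompressed, gives the byte offset of every compressed block.
struct TableFile {
    std::vector<uint32_t> block_offsets;  // offset of each compressed block
    uint32_t positions_per_block;         // assumed fixed here for simplicity
};

// Find which compressed block holds a position index and where it starts.
std::pair<uint32_t, uint32_t> locate(const TableFile& f, uint64_t pos_index) {
    uint32_t block = static_cast<uint32_t>(pos_index / f.positions_per_block);
    return {block, f.block_offsets[block]};
}
```

Only the block containing the requested position then needs to be decompressed.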

Now, for example, my KQRK file is 760 KB (Nalimov's two files total about 500 KB).
It has:
- 4,620,288 positions
- 284,256 duplicates
- 720,435 invalid positions

I do not want to make the files as small as possible (I'm not trying to be a Size Champion), but if more compression is possible in an easy way, I would like to do that.

Is there a smart way to eliminate a large part of invalid positions?
More ideas?

Thanks, Eric
EricLang
 
Posts: 33
Joined: Wed Sep 03, 2008 5:31 am

Re: Further compression

Postby koistinen » Fri Nov 07, 2008 1:36 pm

One way to avoid storing illegal positions is to store positions in blocks and filter out the illegal ones before compression.
When decompressing a block, use the same filtering function to put them back in.

Would you publish your code for others to use? (under some free software licence)
koistinen
 
Posts: 80
Joined: Fri May 02, 2008 7:59 pm
Location: Stockholm

Re: Further compression

Postby EricLang » Fri Nov 07, 2008 1:49 pm

That seems like a good idea. I will have a try at that.

The code needs some cleanup, but I'm certainly planning to put everything online somewhere as freeware when it's finished (I guess within half a year).
EricLang
 
Posts: 33
Joined: Wed Sep 03, 2008 5:31 am

Re: Further compression

Postby EricLang » Fri Nov 07, 2008 4:48 pm

Hmmm... I can't think of anything other than a sort of run-length pre-compression for that idea of yours. Is that how you would handle it, Koistinen?
EricLang
 
Posts: 33
Joined: Wed Sep 03, 2008 5:31 am

Re: Further compression

Postby EricLang » Fri Nov 07, 2008 9:27 pm

In the end, the bit-packing was not successful: it "confuses" the bytes, so plain data compression alone should be enough.
My KRQK file is now 663 KB, coming closer to Nalimov's (500 KB).
KQQK is 379 KB; Nalimov's is about 320 KB.
I think I need one last space optimization, and then I will be satisfied.
EricLang
 
Posts: 33
Joined: Wed Sep 03, 2008 5:31 am

Re: Further compression

Postby Codeman » Fri Nov 07, 2008 10:50 pm

One more index optimization Nalimov uses is that he doesn't index unblockable checks against the side to move.

E.g. KRK, white to move, black king on e4:
instead of allowing 63 remaining squares for the white rook, he observes that no matter where the other pieces are standing, the rook may never be on d4, f4, e3 or e5 (white to move while black is already in check is illegal, and a check from an adjacent square can never be blocked).
This saves some space as well, especially for the queen in your example.
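A small sketch of that rule for the rook (my own illustration, not Nalimov's code): the squares orthogonally adjacent to the king are exactly the ones where a rook check could never be blocked, so they can be excluded from the index:

```cpp
// Count the squares a white rook can never occupy when it is white's move
// and the black king stands on ksq (0..63, a1 = 0): the orthogonally
// adjacent squares give a check that no interposed piece can block, so
// such positions are illegal regardless of where the other pieces stand.
int unblockable_rook_squares(int ksq) {
    int kf = ksq % 8, kr = ksq / 8, n = 0;
    const int df[4] = {1, -1, 0, 0};
    const int dr[4] = {0, 0, 1, -1};
    for (int d = 0; d < 4; d++) {
        int f = kf + df[d], r = kr + dr[d];
        if (f >= 0 && f < 8 && r >= 0 && r < 8)
            n++;  // e.g. d4, f4, e3, e5 for a king on e4
    }
    return n;
}
```

For a queen all eight adjacent squares qualify, which is why the saving is larger there.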
Codeman
 
Posts: 85
Joined: Fri Oct 19, 2007 7:50 pm

Re: Further compression

Postby koistinen » Sat Nov 08, 2008 1:04 am

EricLang wrote:Hmmm... I can't think of anything other than a sort of run-length pre-compression for that idea of yours. Is that how you would handle it, Koistinen?

No, not run-length, it is removal of illegal positions:
Code: Select all
int
encode(const unsigned char *inbuf, unsigned char *outbuf, long firstindex, int count)
{
  long index = firstindex;
  int i, outcount = 0;

  for (i = 0; i < count; i++) {
    if (legal_position(index)) {
      *outbuf++ = *inbuf++;         /* keep the byte of a legal position */
      outcount++;
    } else {
      inbuf++;                      /* drop the byte of an illegal position */
    }
    index++;
  }

  return outcount;
}

void
decode(const unsigned char *inbuf, unsigned char *outbuf, long firstindex, int count)
{
  long index = firstindex;
  int i;

  for (i = 0; i < count; i++) {
    if (legal_position(index))
      *outbuf++ = *inbuf++;         /* restore the byte of a legal position */
    else
      *outbuf++ = ILLEGAL_POSITION; /* placeholder for a removed position */
    index++;
  }
}
koistinen
 
Posts: 80
Joined: Fri May 02, 2008 7:59 pm
Location: Stockholm

Re: Further compression

Postby EricLang » Sat Nov 08, 2008 10:37 am

Simple and smart, thanks!
I'll have to check that the decode procedure isn't too time-consuming; probably not.
EricLang
 
Posts: 33
Joined: Wed Sep 03, 2008 5:31 am

Re: Further compression

Postby EricLang » Sun Nov 09, 2008 11:59 am

Okidoki, I got it working! My files are much smaller now than Nalimov's! I'll make a list here soon for comparison :)
EricLang
 
Posts: 33
Joined: Wed Sep 03, 2008 5:31 am

Re: Further compression

Postby notnale » Sun Nov 09, 2008 1:57 pm

If I were trying to write my own generator, what would the easiest/most efficient language be?
notnale
 
Posts: 43
Joined: Sun Aug 17, 2008 6:36 am

Re: Further compression

Postby Codeman » Sun Nov 09, 2008 2:48 pm

notnale wrote:If I were trying to write my own generator, what would the easiest/most efficient language be?


For EGTB generators you need to write code that runs as fast as possible, and you need good control over memory management.
I would suggest C++.
Codeman
 
Posts: 85
Joined: Fri Oct 19, 2007 7:50 pm

Re: Further compression

Postby EricLang » Sun Nov 09, 2008 4:08 pm

My generator is written in Delphi 7. No problems there :)
EricLang
 
Posts: 33
Joined: Wed Sep 03, 2008 5:31 am

Re: Further compression

Postby koistinen » Mon Aug 03, 2009 12:38 pm

Avoiding storing illegal positions looks rather promising; my guess is that the more pieces there are, the more useful it becomes.
I'm experimenting with Erlang (still learning), and for KRRvK I get:
[8325184,16777216,0.4962196350097656] legal for [white_rook,white_rook]
and for KRRvKR:
[508975152,1073741824,0.4740200489759445] legal for [white_rook,white_rook,
black_rook]

real 359m12.467s
user 355m20.060s
sys 0m12.925s
and for KRRRvK:
[387623040,1073741824,0.3610020875930786] legal for [white_rook,white_rook,
white_rook]

real 471m29.302s
user 469m55.926s
sys 0m0.252s

The last ought to be about the same proportion as for KRRRvKRR.
That means the bitmaps might be stored in about 64 GB * 0.36 ≈ 23 GB for endgames like KQRBvKQR.
The multiplier for just filtering out adjacent kings and pieces on the same square is 0.64, giving 41 GB bitmaps.

To compute it, I iterate over every square for every piece and count the legal positions with no two pieces on the same square.
I imagine computing illegal positions can be done much more quickly for blocks of positions.

Code: Select all
legal_count(P) ->
    Pieces = [white_king, black_king] ++ P,
    N = 1 bsl (6*lists:flatlength(Pieces)),
    Count = legal_count(Pieces, N-1, 0),
    [Count, N, Count/N].

legal_count(_Pieces, -1, Count) -> Count;
legal_count(Pieces, Cursor, Count) ->
    Board = retro_board(Pieces, Cursor),
    All_pieces = length(Pieces) == length(Board),
    Illegal = is_illegal(white, Board),
    if
        Illegal or not All_pieces ->
            legal_count(Pieces, Cursor-1, Count);
        true ->
            legal_count(Pieces, Cursor-1, 1+Count)
    end.
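For comparison, the 0.64 multiplier for the cheap filter (adjacent kings and shared squares only) can be reproduced for a 7-man set with a quick sketch; C++ here rather than Erlang, and the function name is mine:

```cpp
#include <cmath>
#include <cstdlib>

// Fraction of raw 7-man indices (64^7) that survive the two cheap filters:
// no two pieces on the same square, and the two kings not adjacent.
double cheap_filter_multiplier() {
    // Count ordered king pairs on distinct, non-adjacent squares.
    long long kings = 0;
    for (int a = 0; a < 64; a++)
        for (int b = 0; b < 64; b++) {
            if (a == b) continue;
            int df = std::abs(a % 8 - b % 8);
            int dr = std::abs(a / 8 - b / 8);
            if (df <= 1 && dr <= 1) continue;  // adjacent kings: illegal
            kings++;
        }
    // The remaining 5 pieces only need 5 distinct free squares.
    long long others = 62LL * 61 * 60 * 59 * 58;
    return (double)(kings * others) / std::pow(64.0, 7);
}
```

This evaluates to about 0.638, matching the quoted 0.64 and the 41 GB estimate above.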
koistinen
 
Posts: 80
Joined: Fri May 02, 2008 7:59 pm
Location: Stockholm

Re: Further compression

Postby koistinen » Thu Aug 20, 2009 3:49 pm

I have written some C++ code for computing the illegality/invalidity status of positions.
Using one core, it does this in approximately 15 minutes for a 7-man endgame (simulated by 640*KRRRvK).
There are many opportunities for optimization left, so it should be possible to make it fast enough to be useful for compression while computing an endgame, reducing the amount of data read from and written to disk.
Code is at http://teo.se/F.zip; it is a work in progress, with lots of room for improvement.
koistinen
 
Posts: 80
Joined: Fri May 02, 2008 7:59 pm
Location: Stockholm

