Statistics files re Nalimov's EGTs
Posted: Thu Jan 25, 2007 2:52 pm
At the risk of being accused of going to the well of 'data correctness' too often, I'll add some further comments.
Lest anyone get confused on visiting this thread later on, all references to "Guy's tbs [statistics files]" should be read as 'Guy's copy of Nalimov's tbs files'. Others may be recomputing the stats of Nalimov's EGTs but I am not.
I agree that the Nalimov format of tbs files is not ideal, for two reasons: counts are not right-aligned, and items with a zero count (e.g. mates at some depth, and even broken positions) are omitted, when a line with a '0' would have made subsequent analysis easier to do.
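To illustrate the two formatting points, here is a minimal Python sketch that takes a sparse depth-to-count table, fills in the omitted zero rows and right-aligns the counts. The function name `normalise_counts`, the dictionary input and the two-column output layout are all assumptions made for illustration; this does not parse the real tbs layout.

```python
def normalise_counts(sparse, max_depth=None):
    """Return one right-aligned text line per depth, including zero counts.

    sparse: dict mapping depth -> count, with zero-count depths omitted
            (hypothetical in-memory form, not the actual tbs layout).
    max_depth: last depth to emit; defaults to the largest depth present.
    """
    if max_depth is None:
        max_depth = max(sparse, default=0)
    # Emit every depth from 1 upwards, substituting 0 where a row was omitted.
    rows = [(depth, sparse.get(depth, 0)) for depth in range(1, max_depth + 1)]
    # Pad all counts to the width of the widest one so the digits line up.
    width = max((len(str(count)) for _, count in rows), default=1)
    return [f"{depth:>3} {count:>{width}}" for depth, count in rows]


if __name__ == "__main__":
    for line in normalise_counts({1: 120, 3: 7}):
        print(line)
```

With every depth present and the columns aligned, the figures can be pasted straight into a spreadsheet or compared line by line.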
The rigour which Eugene applied to EGT generation and verification may not have applied to his production of stats files: his apparent lack of interest in expunging the erroneous KBPKN statistics (a 'one off') rather indicates that.
I realised somewhat tardily that I could check the tbs figures against the index-ranges in the TBGEN code. It was then that the 'missing 2^32' phenomenon came to light - only in 6-man P-ful endgames I think.
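The check can be sketched in Python as follows. It assumes that the counts in a stats file, broken positions included, should sum to the size of the index range, and that any shortfall comes from counters that wrapped at 2^32. The function name `missing_wraps` and the figures in the example are invented for illustration and are not taken from any real tbs file or from the TBGEN code.

```python
WRAP = 2 ** 32  # a 32-bit unsigned counter wraps at this value


def missing_wraps(reported_counts, index_range):
    """Return how many multiples of 2^32 the reported total falls short by.

    reported_counts: iterable of the counts as read from a stats file
    index_range: expected number of indexed positions (assumed known)
    Raises ValueError if the shortfall is negative or not a multiple of 2^32,
    which would point to some other kind of error.
    """
    shortfall = index_range - sum(reported_counts)
    if shortfall < 0 or shortfall % WRAP != 0:
        raise ValueError(f"shortfall {shortfall} is not a multiple of 2^32")
    return shortfall // WRAP


if __name__ == "__main__":
    # Invented figures: the first true count is 6,000,000,000 but was
    # stored modulo 2^32, so the file shows 1,705,032,704 instead.
    reported = [6_000_000_000 % WRAP, 3_500_000_000, 500_000_000]
    print(missing_wraps(reported, 10_000_000_000))
```

The result says how many multiples of 2^32 are missing in total, not which count lost them; finding the affected line still needs a look at which figures are implausibly small.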
I have never changed even my copy of Nalimov's original tbs files, but only my transfer of them to xls's. It is clear that Nalimov's code eventually produced numbers > 2^32 correctly, see, e.g., KBBPKB btm draws. So, there may be some tbs files around with missing multiples of 2^32 - a few in 4-2p and more in 3-3p: it would be worth getting a definitive list of where the errors are, and any contributions to this list are welcome.
Recall suggests that my %s (of wins, draws and losses) in the xls on the ICGA website are after corrections to Nalimov's stats.
g