Methodology or strategy for building an EGTB test suite

Post by **Nguyen Pham** » Tue Sep 04, 2012 8:12 am

Hi all,

I am working on my own EGTB (3-4-5 men only), so I am collecting all the useful information I can find for this work. One of the good things I took from this forum is the EGTB test suite (thanks a lot for that). Unfortunately, the number of tests for 3-4-5 men is too small and covers only a few cases.

I have been thinking about improving the test set. At first glance, the current test suite seems to have been developed without a strategy. New tests appear to be added at random and without a clear purpose (why add this position and not that one, and what is each test meant to check?). People also seem to prefer adding complicated tests over simpler ones.

To me, the current test set focuses only on the correctness of endgames, which turns out to be not very important for creators: after generating a new EGTB, a creator can verify its correctness by searching (comparing the value of each position with the surrounding ones) or by comparing it with other EGTBs. Because the number of positions is very large, a few tests are definitely not enough for that purpose (though they are still useful for quick checks and for users).

IMO, the test set should be divided into at least two groups:

1) Test the existence of endgames:
An EGTB may consist of many files. I often have to ask myself whether they are all there, whether I have missed any, or whether I have created redundant ones. The situation gets more complicated when some EGTBs keep endgames for the white side only and not for the black side. In other cases, some endgames are not really needed, such as KQKQ for the black side (the white-side one can be used instead).

The test set should be divided into subgroups to check:
- test set for the existence of all white-side endgames
- test set for the existence of all black-side endgames
- test set for the existence of all endgames which are not really necessary, such as KQKQ for the black side
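The existence check above could be automated along these lines. This is only a minimal sketch under my own assumptions: "white side" and "black side" are taken to mean one file per side to move, the stronger side is always written first in the endgame name, and the file naming scheme (`kqkr.w.egtb`, `kqkr.b.egtb`) is hypothetical, so adapt it to whatever your generator writes.

```python
from itertools import combinations_with_replacement
from pathlib import Path

PIECES = "QRBNP"  # strongest first; kings are added separately


def side_key(pieces):
    """Sort key for one side: more pieces first, then stronger pieces first."""
    return (-len(pieces), [PIECES.index(c) for c in pieces])


def endgame_names(max_men=5):
    """List material signatures such as 'KQKR' for 3..max_men men."""
    names = []
    for men in range(3, max_men + 1):
        extra = men - 2  # pieces other than the two kings
        for w in range(extra, -1, -1):
            for ws in combinations_with_replacement(PIECES, w):
                for bs in combinations_with_replacement(PIECES, extra - w):
                    white, black = "".join(ws), "".join(bs)
                    if side_key(white) > side_key(black):
                        continue  # colour-mirrored duplicate, e.g. KRKQ
                    names.append(f"K{white}K{black}")
    return names


def is_symmetric(name):
    """True for endgames like KQKQ where both sides have the same material."""
    white, black = name[1:].split("K")
    return white == black


def check_files(folder, max_men=5):
    """Return (missing, redundant) file names under the assumed naming scheme."""
    folder = Path(folder)
    missing, redundant = [], []
    for name in endgame_names(max_men):
        for side in "wb":
            path = folder / f"{name.lower()}.{side}.egtb"  # hypothetical name
            if side == "b" and is_symmetric(name):
                # Not really necessary: the white-side file can be used instead.
                if path.exists():
                    redundant.append(path.name)
            elif not path.exists():
                missing.append(path.name)
    return missing, redundant
```

Calling `check_files("my_egtb_folder")` returns the files that are missing and the ones that are present but not strictly needed. For 3-4-5 men, `endgame_names()` yields 145 signatures, which matches the usual count for 5-men tablebases.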

2) Test the correctness:
As I said above, such a test set may not be very useful for creators (because they can check by searching), but it helps with quick checks and is more useful for users. A few of the following are not really test sets but methods of fast checking.

- check for corrupted data at a glance by looking at file sizes. Different EGTBs may have different file sizes, but they should be broadly similar: say, the 5-men set should be 20-64 times bigger than the 4-men set, which in turn should be 20-64 times bigger than the 3-men set. Pawn endgames are usually bigger than pawnless endgames.
- check for corrupted data: every endgame should have a few tests, no matter how simple or small it is. A few should be at the beginning, a few in the middle, and a few at the end of the index range. Ideally, every data block should have a test.
- test the range of data: test positions with the longest mates, the longest losses, and draws
- test based on statistics: some EGTBs (like mine) can produce statistics. Even though different EGTBs may have different statistics, they should be similar. For example, the win/loss/draw ratios should be similar.
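A few of these fast checks are easy to script. The sketch below is only an illustration: the function names, the 1% tolerance, and the dictionary layout for the statistics are my own assumptions, while the 20-64 ratio bounds are the rough figures mentioned above.

```python
def size_ratio_ok(smaller_total, larger_total, low=20, high=64):
    """Check that e.g. the 5-men total size is 20-64 times the 4-men total."""
    ratio = larger_total / smaller_total
    return low <= ratio <= high


def sample_indices(size, per_zone=3):
    """Pick test indices at the beginning, middle and end of an index range."""
    if size <= 0:
        return []
    mid_start = size // 2 - per_zone // 2
    zones = [
        range(0, per_zone),                      # beginning
        range(mid_start, mid_start + per_zone),  # middle
        range(size - per_zone, size),            # end
    ]
    # A set removes overlaps when the index range is very small.
    return sorted({i for zone in zones for i in zone if 0 <= i < size})


def wdl_ratios(counts):
    """Turn raw win/draw/loss counts into fractions of the total."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


def stats_similar(counts_a, counts_b, tolerance=0.01):
    """Compare win/draw/loss ratios of the same endgame from two EGTBs."""
    a, b = wdl_ratios(counts_a), wdl_ratios(counts_b)
    keys = set(a) | set(b)
    return all(abs(a.get(k, 0.0) - b.get(k, 0.0)) <= tolerance for k in keys)
```

For example, `sample_indices(index_size)` gives the indices to probe in each endgame (or in each data block, if called per block), and `stats_similar(mine, reference)` flags an endgame whose win/draw/loss split drifts away from another EGTB's numbers.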

Any ideas?

Thanks

/Pham