CCRL 40/40 Testing Conditions

Questions and comments related to CCRL testing study

Postby Ray » Fri Dec 15, 2006 5:46 am

For information, our testing conditions are as follows:

CCRL 40/40 Testing Conditions

Time control
Our time control is equivalent to 40 moves in 40 minutes on an AMD X2 4600+ at 2.4GHz.
We use Crafty 19.17 BH as a benchmark to determine the equivalent time control for a particular machine.


CCRL Testing Conditions

Endgame tablebases: 4 or 5 piece tablebases.

Pondering: OFF

Tournament type: Any. Match, Tournament, Gauntlet etc.
Matches or Gauntlets are usually limited to 60 games maximum per engine pair. Our standard match length in 40/40 is 30 games.

Hash size: set to the same value, either 128 or 256 MB, for all engines in a match or tourney. There are three exceptions:
1) "Deep" engines running on 2 CPUs should have double the hash size of single-CPU engines in the same tourney.
2) "Deep" engines running on 4 CPUs should have four times the hash size.
3) If an engine has problems with a particular hash size, it may play with a smaller hash.
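
The hash rule and its first two exceptions can be sketched in Python. This is only an illustration: the function name `hash_size_mb` and its parameters are made up here and are not part of any CCRL tool, and the third exception (an engine that has problems with a given size) is left to the tester's judgment.

```python
def hash_size_mb(base_mb: int, cpus: int = 1) -> int:
    """Hash size in MB for one engine, following the rule above.

    base_mb: hash used by single-CPU engines in the event (128 or 256).
    cpus:    number of CPUs the engine runs on (1, 2 or 4).
    Hypothetical helper, named here for illustration only.
    """
    if base_mb not in (128, 256):
        raise ValueError("base hash must be 128 or 256 MB")
    if cpus not in (1, 2, 4):
        raise ValueError("the rule only covers 1, 2 or 4 CPUs")
    # "Deep" engines get the base hash multiplied by their CPU count.
    return base_mb * cpus
```

For example, with a 128 MB base, a "deep" engine on 2 CPUs would get 256 MB and one on 4 CPUs would get 512 MB.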

EGTB hash: 32 MB

Tournament Interface: Any. Examples: Winboard, Arena, Shredder, Chessbase, Chess Partner.

Opening book: Any generic. Examples: remis.ctg, draw.ctg, 5moves.ctg, perfect.ctg etc. Book line length is limited to 12 moves per side maximum and book learning is off or the book set as read only. The same books are used for all engines in the match, tournament or gauntlet.

Engines with their own books have them disabled (deleted or switched off in parameters). Engines which can't disable their own book can't participate in CCRL testing.

Book learning: Off for all engines.

Position learning: Off for all engines. Alternatively learning files are set to read-only or deleted after each match.


Choosing engines to test

Engine choice is up to individual testers, with just a few limitations:

1. We test only stable versions of chess engines. We generally test only publicly available engines, although private versions and betas are tested solely at our discretion on a case-by-case basis.

Note: If we test a beta version that is later released as a stable version, and the engine author says there is no playing strength difference between the release and the beta, then the games of the beta version are renamed.

2. We try to avoid over-testing any particular pair of engines. The reason is to avoid rating distortions and have more reliable statistics. Normally we test with 30 games for each pair, and with 60 games for interesting pairs, but anything up to 100 is fine.

Tournament format is entirely up to individual testers. We run matches, round-robin tournaments, gauntlets, Swiss or knockout events.

New or old engines can be selected, and studying the current rating list will show which engines are currently being tested and which engines may need additional games. Estimated rating differences of ±400 points should be avoided.
Last edited by Ray on Fri Jul 13, 2007 2:24 pm, edited 1 time in total.
Ray
 
Posts: 8106
Joined: Mon Dec 19, 2005 3:33 am
Location: London, England

Re: CCRL 40/40 Testing Conditions

Postby Marc Lacrosse » Wed May 30, 2007 12:54 am

Ray Banks wrote:For information, our testing conditions are as follows:

CCRL 40/40 Testing Conditions

Time control
Our time control is equivalent to 40 moves in 40 minutes on Athlon 64 3800+ at 2.4GHz, or an AMD X2 4600+ also at 2.4GHz.
We use Crafty 19.17 BH as a benchmark to determine the equivalent time control for particular machine.


Hi Ray

How do I adjust the timings for a given machine according to the results of crafty 19.17 benchmark?
Is there a table or a formula for this ?

In which precise conditions do I do the benchmarking ?

Thanks

Marc
Marc Lacrosse
 
Posts: 2
Joined: Tue May 29, 2007 3:46 pm

Postby Kirill Kryukov » Wed May 30, 2007 1:35 am

Hi Marc! Here is extract from our conditions (updated version at our internal wiki page).

Time control

We are doing our testing in three time controls: 40/40, 40/12 and 40/4, as measured on an Athlon X2 4600+ machine. If you have a different machine (actually, even if you have the same machine), you need to benchmark it and adjust the time control according to the results. We use Crafty 19.17 BH as a benchmark to determine the equivalent time control for a particular machine.

Crafty 19.17 BH benchmark can be downloaded here. (Version 19.17, Brian Hoffman compile, 32-bit, single-CPU). Please note that we should all use the same version and compile, because different versions may be slightly faster or slower and will give a different benchmark time.

How to test: Reboot the machine (not required but preferable), extract the Crafty executable into a separate folder, make sure there are no other files in that folder, run Crafty, type 'bench' <enter>, and wait a while (don't use the computer during that time). When the benchmark ends, type 'quit' <enter> to quit Crafty. You will then see a new file "log.001" in the Crafty folder. Open that file and find a line "Total elapsed time: 96" near the end. (Your time may be different, of course.)
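
If you prefer not to open the log by hand, the last step can be sketched in Python. This is an illustration only: the helper names `elapsed_seconds` and `read_benchmark` are made up here, and the only thing assumed about the log layout is the "Total elapsed time: 96" line quoted above.

```python
import re
from pathlib import Path
from typing import Optional

# Line format taken from the example above: "Total elapsed time: 96".
# Anything else about the log layout is unknown to this sketch.
_ELAPSED = re.compile(r"Total elapsed time:\s*([0-9]+(?:\.[0-9]+)?)")


def elapsed_seconds(log_text: str) -> Optional[float]:
    """Return the last 'Total elapsed time' value in the log text, or None."""
    matches = _ELAPSED.findall(log_text)
    return float(matches[-1]) if matches else None


def read_benchmark(folder: str, log_name: str = "log.001") -> Optional[float]:
    """Read the log file from the Crafty folder and extract the benchmark time."""
    text = Path(folder, log_name).read_text(errors="replace")
    return elapsed_seconds(text)
```

Calling `read_benchmark` with the path of your Crafty folder would return the elapsed time as a number, or `None` if no such line is found.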

The time control for your machine is then computed from the Crafty 19.17 BH benchmark result as follows:

CCRL 40/40: T minutes / 40 moves repeated, where T = 40 * <your elapsed seconds> / 48 = <your elapsed seconds> / 1.2
CCRL 40/12: T minutes / 40 moves repeated, where T = 40 * <your elapsed seconds> / 160 = <your elapsed seconds> / 4
CCRL 40/4: T minutes / 40 moves repeated, where T = 40 * <your elapsed seconds> / 480 = <your elapsed seconds> / 12

Example: Your machine runs the Crafty 19.17 BH benchmark in 55 seconds and you want to run long time control games (CCRL 40/40). Compute T as 40 * 55 / 48 = 45.833333; rounding to the nearest integer, we get T = 46. So your time control for CCRL 40/40 is 40 moves in 46 minutes.
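
The same arithmetic for all three time controls, as a small Python sketch. The names `DIVISORS` and `ccrl_minutes` are made up for this illustration, and the handling of exact .5 ties (rounded up here) is an assumption, since the text only says "nearest integer".

```python
import math

# Divisors from the formulas above: T = 40 * elapsed / divisor.
DIVISORS = {"40/40": 48, "40/12": 160, "40/4": 480}


def ccrl_minutes(elapsed_seconds: float, control: str = "40/40") -> int:
    """Minutes per 40 moves for a machine with the given benchmark time."""
    t = 40 * elapsed_seconds / DIVISORS[control]
    # Assumption: exact .5 values round up; the post does not specify ties.
    return math.floor(t + 0.5)


if __name__ == "__main__":
    for name in DIVISORS:
        print(name, "->", ccrl_minutes(55, name), "minutes per 40 moves")
```

For the 55-second machine in the example this gives 46 minutes for CCRL 40/40, matching the hand calculation.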

Note 1. It is totally your choice which of the three time controls to use. Any of them is fine and will be a good addition to the database and rating list. Just be sure to check the right coordination page, and submit the games to the appropriate update thread. (It is best to add '4040', '4012' or '404' to the file name to help avoid mistakes.)

Note 2. We use a repeated time control. This means that in, say, 40/4 the engines have 4 minutes for the first 40 moves. Then they get another 4 minutes for the next 40 moves, and so on.
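
To make the repeated control concrete, here is a small Python sketch of how much time an engine has been granted in total by a given move. The function name `cumulative_minutes` is made up for this illustration. It counts only the time granted; whether unused time carries over into the next period depends on the tournament interface and is not covered here.

```python
def cumulative_minutes(move_number: int, minutes_per_period: float,
                       moves_per_period: int = 40) -> float:
    """Total time granted to one engine up to and including `move_number`."""
    if move_number < 1:
        raise ValueError("move numbers start at 1")
    # Moves 1-40 are in period 1, moves 41-80 in period 2, and so on.
    periods_started = (move_number - 1) // moves_per_period + 1
    return periods_started * minutes_per_period
```

In 40/4, an engine on move 40 has been granted 4 minutes in total, and on move 41 it has been granted 8.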

Note 3. (About benchmark hardware). Initially we used an Athlon 64 3800+ as our standard hardware. In January 2007 we changed to an Athlon X2 4600+ to better reflect the fact that we do a lot of multi-CPU testing. Those two platforms have the same Crafty benchmark result. Note that the benchmarking is always 32-bit single-CPU.
Kirill Kryukov
Site Admin
 
Posts: 6234
Joined: Sun Dec 18, 2005 6:58 pm
Location: Mishima, Japan

Postby Marc Lacrosse » Wed May 30, 2007 3:41 am

Kirill Kryukov wrote:Hi Marc! Here is extract from our conditions (updated version at our internal wiki page).


Hi Kirill!

Let me just say that I highly praise the extremely high standard of your tests and the excellent quality of your site (I myself am involved in some statistical analysis in my medical research professional activities).

Just a little question: how does your benchmark adjust on multiprocessor PCs?
When I just type "bench" in Crafty without any crafty.rc file, does it use multiple processors if they are present? How does it influence the adjustment for engines with/without SMP ability?

I am not yet quite sure that I will perfectly adjust to your protocol (I do not like "x moves in y time repeating", I do prefer pure Fischer timing; and also I do not like the variability of opening books in your tests) but I do intend to adjust the relative timings on my different PCs according to your kind of benchmarking.

Thanks a lot for all your good work !

You are my favorite ranking list !

Marc
Marc Lacrosse
 
Posts: 2
Joined: Tue May 29, 2007 3:46 pm

Postby Kirill Kryukov » Wed May 30, 2007 4:18 am

Thanks for kind words, Marc! :D

Marc Lacrosse wrote:Just a little question: how does your benchmark adjust on multiprocessor PCs?
When I just type "bench" in Crafty without any crafty.rc file, does it use multiple processors if they are present? How does it influence the adjustment for engines with/without SMP ability?

The exact Crafty compile I referred to in the previous post uses 1 CPU (or core) by default, so it runs the same way on a multi-CPU machine as it would on a single-CPU machine with the same CPU. This is what makes the benchmarking meaningful.

The benchmarking we use ensures that single-CPU engines search a comparable number of nodes on different machines, whether they have 1, 2 or 4 CPUs. When an engine can use multiple CPUs, that is its own advantage, which is reflected in the rating.

Marc Lacrosse wrote:... and also I do not like the variability of opening books in your tests

It is easy to use a single book when you are testing alone, but in a large team this will quickly become an issue. Everyone seems to have some preferences. So we now have only two requirements for a book: 1. The opening book must be general, which means not tuned to any particular engine. 2. The opening book must be limited to 12 moves maximum (24 plies). Personally I use books of 8 moves or shorter.
Kirill Kryukov
Site Admin
 
Posts: 6234
Joined: Sun Dec 18, 2005 6:58 pm
Location: Mishima, Japan

Postby Ray » Wed May 30, 2007 4:25 am

Kirill Kryukov wrote:It is easy to use single book when you are testing alone, but in a large team this will quickly become an issue. Everyone seems to have some preferences. So we now use only two requirements for a book: 1. Opening book must be general, which means not tuned to any particular engine. 2. Opening book must be limited to 12 moves maximum (24 plies). Personally I use 8 moves or shorter books.


Indeed - and no doubt Marc you've seen the book history page which shows what books we've used over time and which ones are most popular

http://www.computerchess.org.uk/ccrl/40 ... _book.html

.
Ray
 
Posts: 8106
Joined: Mon Dec 19, 2005 3:33 am
Location: London, England

Re:

Postby M Lacrosse » Sat Sep 29, 2007 8:22 pm

Kirill Kryukov wrote:(...)
We use Crafty 19.17 BH as a benchmark to determine the equivalent time control for particular machine.
(...)
Open that file and find a line "Total elapsed time: 96" near the end.


Hi Kyril

Is there a list with known "total elapsed time" for different PC architectures ?

Regards

Marc

PS I would like to know which is the fastest presently available monoprocessor architecture for 32 bits engines
Here I have :
HP core duo : TET = 38
Pentium M 2.0 : TET = 50
PIV 3.0 : TET = 68
M Lacrosse
 
Posts: 1
Joined: Sat Sep 29, 2007 8:08 pm

Re: Re:

Postby Kirill Kryukov » Sun Sep 30, 2007 12:48 am

M Lacrosse wrote:Is there a list with known "total elapsed time" for different PC architectures ?

Regards

Marc

PS I would like to know which is the fastest presently available monoprocessor architecture for 32 bits engines
Here I have :
HP core duo : TET = 38
Pentium M 2.0 : TET = 50
PIV 3.0 : TET = 68

We maintain an internal list of benchmark results on our machines. The list is unlikely to become public, but we may extract and publish some essence from it (theoretically, at least).

The fastest we have is 28 seconds on an overclocked Core 2 Duo. Though sometimes we forget to add machines to the list, so someone may already have a faster one.
Kirill Kryukov
Site Admin
 
Posts: 6234
Joined: Sun Dec 18, 2005 6:58 pm
Location: Mishima, Japan

Re: Re:

Postby Graham Banks » Sun Sep 30, 2007 4:48 am

From what I can gather, Marc would be interested to know the hardware that benchmarks at 28 seconds.
Graham Banks
 
Posts: 8438
Joined: Mon Dec 19, 2005 2:47 am
Location: Auckland, NZ

Re: CCRL 40/40 Testing Conditions

Postby Ray » Sun Sep 30, 2007 5:02 am

Nothing secretive about it - the Crafty benchmark is available for download and anyone can run it

28 seconds was - Core 2 Duo E6600 @ 3330 MHz
Ray
 
Posts: 8106
Joined: Mon Dec 19, 2005 3:33 am
Location: London, England

Re: CCRL 40/40 Testing Conditions

Postby Carl Mascott » Tue Jul 15, 2008 2:50 am

The download link in this thread for Crafty 19.17 BH no longer works.
I haven't been able to find the program elsewhere.
Could someone post the binary or a link?
Thanks!
Carl Mascott
 
Posts: 5
Joined: Sun Jul 13, 2008 12:25 am

Re: CCRL 40/40 Testing Conditions

Postby Graham Banks » Tue Jul 15, 2008 4:20 am

Carl Mascott wrote:The download link in this thread for Crafty 19.17 BH no longer works.
I haven't been able to find the program elsewhere.
Could someone post the binary or a link?
Thanks!


viewtopic.php?f=7&t=3608
Graham Banks
 
Posts: 8438
Joined: Mon Dec 19, 2005 2:47 am
Location: Auckland, NZ

Re: CCRL 40/40 Testing Conditions

Postby Shaun Brewer » Tue Jul 15, 2008 7:15 pm

Link fixed...

Shaun
Shaun Brewer
 
Posts: 5913
Joined: Sun May 14, 2006 12:24 am
Location: Brighton. UK

