KCEC
(Kirr's Chess Engine Comparison)
A tournament of original free chess engines
June 16, 2013
Testing summary:
Total: 135,679 games played by 202 programs, 1398 CPU days (X2 4600+)

White wins: 55,227 (40.7%)
Black wins: 47,434 (35.0%)
Draws: 33,018 (24.3%)
White score: 52.9%
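
The percentages above follow directly from the raw counts. As a minimal sketch (the variable and function names are illustrative and not part of KCEC), the white score can be reproduced by counting each draw as half a point:

```python
# Figures copied from the testing summary above.
white_wins = 55_227
black_wins = 47_434
draws = 33_018

total = white_wins + black_wins + draws


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# A draw is worth half a point to each side.
white_score = pct(white_wins + 0.5 * draws, total)

print("Total games:", total)
print("White wins %:", pct(white_wins, total))
print("Black wins %:", pct(black_wins, total))
print("Draws %:", pct(draws, total))
print("White score %:", white_score)
```

The three result counts add up to the stated total of 135,679 games, and the rounded values agree with the summary.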

Custom engine selection

Comparing 9 engines (8 distinct engines plus one older version, Alaric 704).
The 8 best versions of the selected engines played 152 games against each other.

KCEC Rating List: Custom engine selection

Ponder off, neutral book (up to 8 moves), 3-4-5 piece EGTB
Time control: Equivalent to 40 moves in 4 minutes on Athlon 64 X2 4600+ (2.4 GHz)
Computed on June 16, 2013 with Bayeselo based on 135,679 games
Note: Please see how to read the list
| Rank | Engine | Author (year) | Protocol | Country | Continent | Rating | Error | Av. Op. / Perf. markers | Slope | Av. Df. markers | Drawness | Games | LOS columns (values as listed) |
| 1 | Alaric 707 | Peter Fendrich (2007) | UCI | Swe | Europe | 2768 | +22 / −22 | V | −0.083 ±0.317 | o o | 86.8% ±7.8% | 864 | 100 |
|   | Alaric 704 | Peter Fendrich (2007) |  |  |  | 2713 | +26 / −26 |  | −0.425 ±0.619 | o | 103.9% ±11.7% | 556 | 55, 100, (empty), 95.8, 81, 100 |
| 2 | Ruffian 1.0.5 | Per-Ola Valfridsson (2003) | UCI | Swe | Europe | 2687 | +17 / −17 |  | −0.095 ±0.229 | o o | 108.5% ±8.2% | 1428 | 26, 100, 269, 100, 214, 100 |
| 3 | KnightDreamer 3.3 | Johan Melin (2004) | WB | Swe | Europe | 2499 | +16 / −16 |  | +0.003 ±0.277 | o o | 107.9% ±9.9% | 1390 | 188, 100, 487, 100, 461, 100 |
| 4 | Alarm 0.93.1 | Benny Antonsson, Erik Robertsson (2004) | WB | Swe | Europe | 2226 | +16 / −16 |  | −0.089 ±0.134 | o o o o | 80.0% ±9.6% | 1696 | 273, 100, 633, 100, 445, 100 |
| 5 | Marvin 1.3.0 | Martin Danielsson (2005) | UCI | Swe | Europe | 2054 | +16 / −16 | Λ | −0.163 ±0.128 | o o o o o | 98.7% ±11.2% | 1877 | 172, 100, 477, 99.9, 204, 100 |
| 6 | Sharper 0.17 | Albert Bertilsson (2003) | ! WB ! | Swe | Europe | 2022 | +16 / −16 | Λ | −0.046 ±0.118 | o o o o o | 83.1% ±9.1% | 1867 | 32, 100, 539, 100, 367, 100 |
| 7 | Rainman 0.7.5 | Johnny Bigert (2003) | ! WB ! | Swe | Europe | 1687 | +24 / −24 | Λ Λ Λ | +0.012 ±0.202 | o o o o o o | 132.6% ±19.0% | 800 | 335, 100, 551, 100, 519, (empty) |
| 8 | MiniMardi 1.3 | Juan Pablo Fernandez Alvarez (2003) | WB | Swe | Europe | 1503 | +33 / −33 | Λ Λ Λ Λ Λ Λ Λ Λ Λ | −0.268 ±0.461 | o o o o o o o o o | 106.4% ±26.4% | 480 | 184 |

Score matrix

Custom engine selection (best versions only)
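| # | Name | Elo | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| 1 | Alaric 707 | 2768 |  | 66% (18.5/28) |  |  |  |  |  |  |
| 2 | Ruffian 1.0.5 | 2687 | 34% (9.5/28) |  |  |  |  |  |  |  |
| 3 | KnightDreamer 3.3 | 2499 |  |  |  |  |  |  |  |  |
| 4 | Alarm 0.93.1 | 2226 |  |  |  |  | 61% (19.5/32) | 89% (28.5/32) |  |  |
| 5 | Marvin 1.3.0 | 2054 |  |  |  | 39% (12.5/32) |  | 75% (21/28) |  |  |
| 6 | Sharper 0.17 | 2022 |  |  |  | 11% (3.5/32) | 25% (7/28) |  |  |  |
| 7 | Rainman 0.7.5 | 1687 |  |  |  |  |  |  |  | 80% (25.5/32) |
| 8 | MiniMardi 1.3 | 1503 |  |  |  |  |  |  | 20% (6.5/32) |  |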
 
Score color legend (only pairs with at least 20 games): cells are shaded on a scale from 0% to 100% in 5% steps.
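
To put the head-to-head scores in context, they can be compared with what the listed ratings alone would predict. The sketch below uses the classical logistic Elo expectation, which is an assumption made here for illustration: Bayeselo fits its own model with draw and first-move-advantage parameters, so its predictions need not match this formula. The `pairings` list and the function name are illustrative; the numbers are taken from the score matrix.

```python
def expected_score(rating_a, rating_b):
    """Classical logistic Elo expectation for A against B (not Bayeselo's model)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# (engine A, rating A, engine B, rating B, points scored by A, games played)
pairings = [
    ("Alaric 707", 2768, "Ruffian 1.0.5", 2687, 18.5, 28),
    ("Alarm 0.93.1", 2226, "Marvin 1.3.0", 2054, 19.5, 32),
    ("Alarm 0.93.1", 2226, "Sharper 0.17", 2022, 28.5, 32),
    ("Marvin 1.3.0", 2054, "Sharper 0.17", 2022, 21.0, 28),
    ("Rainman 0.7.5", 1687, "MiniMardi 1.3", 1503, 25.5, 32),
]

for name_a, elo_a, name_b, elo_b, points, games in pairings:
    actual = points / games
    predicted = expected_score(elo_a, elo_b)
    print(f"{name_a} vs {name_b}: actual {actual:.0%}, "
          f"logistic expectation {predicted:.0%}")

# The five pairings account for all games among the selected best versions.
print("Games in matrix:", sum(p[5] for p in pairings))
```

Ratings on this list are computed from the complete game database, so a head-to-head result over 28 or 32 games can differ noticeably from the expectation derived from the ratings.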

LOS matrix

Custom engine selection (best versions only)
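Each cell shows the likelihood of superiority (LOS) of one engine over the other, in percent. These numbers are computed using Bayeselo for the complete game database.
| # | Name | Elo | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| 1 | Alaric 707 | 2768 |  | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| 2 | Ruffian 1.0.5 | 2687 | 0.0 |  | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| 3 | KnightDreamer 3.3 | 2499 | 0.0 | 0.0 |  | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| 4 | Alarm 0.93.1 | 2226 | 0.0 | 0.0 | 0.0 |  | 100.0 | 100.0 | 100.0 | 100.0 |
| 5 | Marvin 1.3.0 | 2054 | 0.0 | 0.0 | 0.0 | 0.0 |  | 99.9 | 100.0 | 100.0 |
| 6 | Sharper 0.17 | 2022 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 |  | 100.0 | 100.0 |
| 7 | Rainman 0.7.5 | 1687 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |  | 100.0 |
| 8 | MiniMardi 1.3 | 1503 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |  |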
LOS color legend: cells are shaded on a scale from 0 to 100 in steps of 10.
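
For intuition about where such LOS values come from, here is a rough approximation that uses only the ratings and error margins from the rating list. It rests on two assumptions that the page does not confirm: that the ± margins are 95% intervals, and that the two rating estimates are independent and normally distributed. Bayeselo itself derives LOS from its full model fitted to the game database, so this sketch is not its method and will not reproduce its figures exactly. The function names are illustrative.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx_los(rating_a, err_a, rating_b, err_b, z_interval=1.96):
    """Approximate probability that A is truly stronger than B.

    Assumes err_a and err_b are half-widths of 95% intervals (z = 1.96)
    and that the two estimates are independent; both are assumptions.
    """
    sigma_a = err_a / z_interval
    sigma_b = err_b / z_interval
    combined_sigma = math.hypot(sigma_a, sigma_b)
    return normal_cdf((rating_a - rating_b) / combined_sigma)


# Marvin 1.3.0 (2054 ±16) against Sharper 0.17 (2022 ±16), the closest pair.
los = approx_los(2054, 16, 2022, 16)
print(f"Approximate LOS, Marvin over Sharper: {100 * los:.1f}%")
```

Under these assumptions the closest pair on the list still comes out above 99%, in line with the 99.9 shown in the matrix; every other pair is separated by a much larger margin relative to its error bars.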

Created in 2005-2012 by Kirill Kryukov
Updated on June 16, 2013