Newb Questions

Questions and comments related to CCRL testing study
Erick Van Dam
Posts: 1
Joined: Fri Mar 04, 2011 8:44 pm
Sign-up code: 10159

Newb Questions

Post by Erick Van Dam »

First off, GREAT SITE!

I'm currently writing a chess engine from scratch and have been pitting it against some of the cellar dwellers in the 40/4 list. I'm trying to adhere to your testing constraints as much as possible. I have a few questions.

- Are the EGTBs engine-specific, or are they implemented by the front end, somewhat like the openings?
- Bayeselo seems to give a relative Elo rating. Did you start all engines at some baseline, or am I not using it properly? I'm really just trying to get an idea of where my engine would stack up. I would consider submitting it, but it is under heavy development.
- Is there a way for Bayeselo to be aware of the ratings of existing engines, and therefore give a more accurate rating for an engine I run a gauntlet with?

I'm sure I will think of more. Again, great site.
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan

Re: Newb Questions

Post by Kirill Kryukov »

Erick Van Dam wrote:First off, GREAT SITE!
Hi Erick, welcome to this forum and thanks for the comments!
Erick Van Dam wrote:I'm currently writing a chess engine from scratch and have been pitting it against some of the cellar dwellers in the 40/4 list. I'm trying to adhere to your testing constraints as much as possible. I have a few questions.

- Are the EGTBs engine-specific, or are they implemented by the front end, somewhat like the openings?
EGTB support is engine-specific for the best results. Actually it's a somewhat confusing topic, because I've seen some testers use EGTB adjudication, where the GUI queries the EGTB to decide the game outcome. I never use such adjudication in my matches. In any case, such adjudication can only happen once an engine has already reached a tablebase ending, whereas most of the benefit from tablebases is obtained earlier, by using them in search. If you are not sure which EGTB format to implement, I recommend looking at the Gaviota tablebases.
Erick Van Dam wrote:- Bayeselo seems to give a relative Elo rating. Did you start all engines at some baseline, or am I not using it properly? I'm really just trying to get an idea of where my engine would stack up. I would consider submitting it, but it is under heavy development.
Only Elo differences have meaning; a single Elo value by itself tells you nothing. So you can add or subtract any number from your Elo ratings to shift the rating scale up or down. Our ratings were originally calibrated to the SSDF scale using a weighted average of the ratings of all shared engines.
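(To see why only differences matter: the standard Elo model converts a rating difference directly into an expected score, so shifting every rating by the same constant changes nothing. A quick sketch in Python of that standard formula, not anything specific to Bayeselo:)

```python
def expected_score(elo_diff):
    """Expected score (0..1) for the higher-rated side, given the
    Elo rating difference, using the standard logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# A 0-point difference means an expected 50% score,
# and +200 Elo corresponds to roughly a 76% expected score.
print(round(expected_score(0), 2))    # 0.5
print(round(expected_score(200), 2))  # 0.76
```

Note that `expected_score(d) + expected_score(-d)` is always 1, which is exactly the shift-invariance: adding a constant to both ratings leaves the difference, and hence the prediction, unchanged.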
Erick Van Dam wrote:- Is there a way for Bayeselo to be aware of the ratings of existing engines, and therefore give a more accurate rating for an engine I run a gauntlet with?
Yes, there is. What you call "ratings of engines" usually comes from some large database of games. Just merge that database with your own games, and run Bayeselo on the combined database to compute the ratings.
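(The merge step is just PGN concatenation. A minimal sketch in Python, assuming the `bayeselo` binary is on your PATH; the file names are made up, and `readpgn`/`elo`/`mm`/`exactdist`/`ratings` are Bayeselo's usual interactive commands:)

```python
import subprocess

def merge_pgn(sources, dest):
    """Concatenate several PGN files into one combined database."""
    with open(dest, "w") as out:
        for path in sources:
            with open(path) as f:
                out.write(f.read())
                out.write("\n")  # keep games separated

def rate(pgn):
    """Feed the usual command sequence to bayeselo and return its output.
    Assumes 'bayeselo' is installed and on PATH."""
    commands = "readpgn {}\nelo\nmm\nexactdist\nratings\nx\nx\n".format(pgn)
    result = subprocess.run(["bayeselo"], input=commands,
                            capture_output=True, text=True)
    return result.stdout

# Hypothetical file names for illustration:
# merge_pgn(["ccrl-4040.pgn", "my-gauntlet.pgn"], "combined.pgn")
# print(rate("combined.pgn"))
```

Because all the games end up in one database, Bayeselo places your engine on the same scale as everything it played against, directly or indirectly.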
Erick Van Dam wrote:I'm sure I will think of more. Again, great site.
Good luck with your engine! :-)
Janne Kokkala
Posts: 3
Joined: Sun Nov 23, 2014 8:07 pm
Sign-up code: 10159

Re: Newb Questions

Post by Janne Kokkala »

Kirill Kryukov wrote:
Erick Van Dam wrote:- Bayeselo seems to give a relative Elo rating. Did you start all engines at some baseline, or am I not using it properly? I'm really just trying to get an idea of where my engine would stack up. I would consider submitting it, but it is under heavy development.
Only Elo differences have meaning; a single Elo value by itself tells you nothing. So you can add or subtract any number from your Elo ratings to shift the rating scale up or down. Our ratings were originally calibrated to the SSDF scale using a weighted average of the ratings of all shared engines.
Erick Van Dam wrote:- Is there a way for Bayeselo to be aware of the ratings of existing engines, and therefore give a more accurate rating for an engine I run a gauntlet with?
Yes, there is. What you call "ratings of engines" usually comes from some large database of games. Just merge that database with your own games, and run Bayeselo on the combined database to compute the ratings.
I understand that you run BayesElo on the whole database before each list update to get the ratings, am I right? If so, how exactly do you then determine the number to add to the ratings to get the numbers given in the list?

Reason for asking: I'd like to know roughly the "CCRL rating" of my (soon to be published) engine. So far, I have been running BayesElo on the downloaded CCRL database combined with my test results. As the number of games I add is relatively small, the rating differences between the other engines remain approximately constant, so I can determine the offset by comparing the CCRL ratings of a few engines to the ones produced by BayesElo. However, this is a bit tedious to do by hand, so I'd like to know if there is a way to do this step automatically as well. Of course, I can always resort to writing an ad hoc script that compares the BayesElo output to the downloaded CCRL rating list, but that doesn't feel very elegant.

Re: Newb Questions

Post by Kirill Kryukov »

Janne Kokkala wrote:I understand that you run BayesElo on the whole database before each list update to get the ratings, am I right?
Yes.
Janne Kokkala wrote:If so, how exactly do you then determine the number to add to the ratings to get the numbers given in the list?

Reason for asking: I'd like to know roughly the "CCRL rating" of my (soon to be published) engine. So far, I have been running BayesElo on the downloaded CCRL database combined with my test results. As the number of games I add is relatively small, the rating differences between the other engines remain approximately constant, so I can determine the offset by comparing the CCRL ratings of a few engines to the ones produced by BayesElo. However, this is a bit tedious to do by hand, so I'd like to know if there is a way to do this step automatically as well. Of course, I can always resort to writing an ad hoc script that compares the BayesElo output to the downloaded CCRL rating list, but that doesn't feel very elegant.
The number to add is chosen each time to bring the average rating of several selected engines close to a snapshot of the list taken a few years ago. Of course, only engines present in both the current list and the snapshot can be used for this. This is done automatically, but we don't have a separate script for just this step (it's done together with other processing). So I think writing a simple script to adjust your list to a reference list would be a good idea.
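(That adjustment step is easy to script. A minimal sketch in Python of the idea described above: take the engines shared between your BayesElo output and the reference list, use the mean rating difference over those engines as the offset, and shift your whole list by it. The engine names and numbers here are made up for illustration.)

```python
def calibrate(my_ratings, reference_ratings):
    """Shift my_ratings so that the engines shared with the reference
    list match it on average. Both arguments map engine name -> Elo."""
    shared = set(my_ratings) & set(reference_ratings)
    if not shared:
        raise ValueError("no shared engines to calibrate against")
    offset = sum(reference_ratings[e] - my_ratings[e]
                 for e in shared) / len(shared)
    return {name: elo + offset for name, elo in my_ratings.items()}

# Hypothetical numbers for illustration only:
bayeselo_out = {"EngineA": 0, "EngineB": -120, "MyEngine": -300}
ccrl_list    = {"EngineA": 2600, "EngineB": 2480}
adjusted = calibrate(bayeselo_out, ccrl_list)
print(round(adjusted["MyEngine"]))  # 2300
```

Averaging over several shared engines, rather than anchoring to a single one, smooths out the small rating shifts your added games cause in individual engines.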