Project

Post by **JWILD** » Tue Mar 03, 2015 4:11 pm

Dear CCRL community,
I am a graduate student at American University. I need to do a term project for a statistics course, Regression. I would love to use the data on the "Complete List," here: http://www.computerchess.org.uk/ccrl/40 ... t_all.html
Unfortunately, doing so would require a large amount of data entry unless someone can point me towards, or send me, a usable format (.txt would be best, but Excel or SPSS formats are fine too). Basically, I am interested in predictors of Elo: ponder hits, draw%, score% (a no-brainer), average opponent (another no-brainer), etc. I'm sure that score and average opponent will be hugely predictive of Elo, and that's fine for my purposes. I'm just interested in how predictive they are, as it cannot be a perfect relation because Elo weights recent games more heavily than older games (weighted average), whereas Score does not (simple mean of results). ANY help would be greatly appreciated.
Basically,
1. Where can I find the data in a manageable form or who might I speak to to request it?
2. Does anyone have any recommendations for predictors other than the ones I mentioned? LOS?
Thank you for reading!
- Jordan

Post by **Kirill Kryukov** » Wed Mar 04, 2015 1:54 pm

Welcome and good luck with this project!

JWILD wrote:Elo weights recent games more heavily than older games (weighted average).

True for human games, but not in computer chess. In engine games generally all games are weighted equally when computing the ratings.

JWILD wrote:1. Where can I find the data in a manageable form or who might I speak to to request it?

Ratings and all statistics are computed from the database of games. All CCRL games can be downloaded from the Games section of the web-site. You can then compute the ratings yourself using, e.g., Bayeselo. Alternatively, you can extract the ratings and stats from our web-site pages: easy for ratings, but probably not optimal for large tables such as LOS.
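As a rough illustration of the first route, here is a minimal Python sketch that tallies games, points, and draws per engine from the header tags of a PGN file. It assumes standard `White`, `Black`, and `Result` tags and that every game has some movetext after its tags; the function names `tally_results` and `percentages` are made up for this example and are not part of any CCRL tool.

```python
import re
from collections import defaultdict

# Matches a PGN tag pair line such as: [White "Some Engine 1.0"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]$')

# Points for (White, Black) by Result tag; unfinished games ("*") are skipped.
POINTS = {"1-0": (1.0, 0.0), "0-1": (0.0, 1.0), "1/2-1/2": (0.5, 0.5)}


def tally_results(pgn_text):
    """Return {engine: {"games", "points", "draws"}} from PGN header tags."""
    stats = defaultdict(lambda: {"games": 0, "points": 0.0, "draws": 0})
    tags = {}

    def flush():
        # Record the game described by the tags collected so far, if complete.
        white, black = tags.get("White"), tags.get("Black")
        result = tags.get("Result")
        if white and black and result in POINTS:
            for name, pts in zip((white, black), POINTS[result]):
                entry = stats[name]
                entry["games"] += 1
                entry["points"] += pts
                entry["draws"] += 1 if result == "1/2-1/2" else 0
        tags.clear()

    in_moves = False
    for raw in pgn_text.splitlines():
        line = raw.strip()
        match = TAG_RE.match(line)
        if match:
            if in_moves:  # a tag line after movetext starts the next game
                flush()
                in_moves = False
            tags[match.group(1)] = match.group(2)
        elif line:
            in_moves = True
    flush()  # last game in the file
    return dict(stats)


def percentages(entry):
    """Return (score%, draw%) for one engine's tally."""
    games = entry["games"]
    return 100.0 * entry["points"] / games, 100.0 * entry["draws"] / games
```

To use it, read a downloaded PGN file into a string (for example with `open(path, encoding="latin-1").read()`, where the encoding is a guess) and pass it to `tally_results`. The resulting table can then be written out as .txt or CSV for a statistics package.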

JWILD wrote:2. Does anyone have any recommendations for predictors other than the ones I mentioned? LOS?

LOS is computed only after you know the Elo rating of the engine in question, so it probably can't be a predictor. You can try scores and opponent ratings as you mentioned. To me, it would be more interesting to try to find improvements in the rating model itself.
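To see why score and average opponent should be so predictive, the textbook logistic Elo formula can be inverted to turn a score fraction into a rating difference. The sketch below is only that textbook approximation, with made-up function names; it is not the Bayeselo procedure, which fits all ratings jointly and models draws and the first-move advantage, so some gap between this estimate and the published ratings is expected.

```python
import math


def elo_diff_from_score(score):
    """Rating difference implied by a score fraction under the logistic Elo curve.

    Inverts E = 1 / (1 + 10 ** (-d / 400)), giving d = 400 * log10(E / (1 - E)).
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


def approx_rating(avg_opponent, score):
    """Rough rating estimate: average opponent rating plus the implied difference."""
    return avg_opponent + elo_diff_from_score(score)


# Example with invented numbers: 75% against opponents averaging 3000.
print(round(approx_rating(3000.0, 0.75), 1))
```

Comparing this estimate with the listed Elo for each engine would give a baseline; the residuals show how much the full rating model adds beyond score and average opponent.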

If you have any difficulties extracting data from the PGN database or the CCRL web-site, please post and someone will try to help!

CCRL Discussion Board
