FRC Openings Statistics
It is slightly hard to find, but on the games page there is a link called Games By Opening.
It opens up this page:
http://www.computerchess.org.uk/ccrl/40 ... y_eco.html
The column headers are clickable and sort the table. So click on "white score" and it gets interesting. Screen shot of first few lines
Re: FRC Openings Statistics
An example: here is the first opening in that list. 20 games, all decided (no draws), and you can see of course that each opening is played with sides reversed.
Re: FRC Openings Statistics
Of course there are not enough games in the database for meaningful statistics. As time permits I'm going to do some further analysis of the first few
Re: FRC Openings Statistics
nbrkbrnq opening experiment with Naum 3.1 64-bit playing itself is finished (350 games)
Total 350 games
- 35 drawn
- 289 wins for white
- 26 wins for black
So, White wins 82.6% of the games (a White score of 87.6% if draws count as half a point) - very high indeed
I'll repeat this with Rybka 3 when it comes out. At this early stage, I have to say if a human or engine gets this opening as white, they might have got lucky
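The two percentages that can be derived from these totals are easy to mix up, so here is a minimal sketch of both. The function names are my own, and it assumes the usual convention that a draw counts as half a point.

```python
def win_rate(wins, draws, losses):
    """Fraction of games won outright."""
    return wins / (wins + draws + losses)


def score(wins, draws, losses):
    """Points fraction, with a draw counted as half a point (assumed convention)."""
    return (wins + 0.5 * draws) / (wins + draws + losses)


# Naum 3.1 vs Naum 3.1 from nbrkbrnq, seen from White's side
naum = dict(wins=289, draws=35, losses=26)
print(f"win rate: {win_rate(**naum):.1%}")
print(f"score:    {score(**naum):.1%}")
```

The win rate ignores the 35 draws, while the score gives White half a point for each of them, which is why the second figure is higher.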
Re: FRC Openings Statistics
An experiment with NBQNBRKR with Naum 3.1 64-bit playing itself also completed (300 games)
Total 300 games
- 156 drawn
- 19 wins for white
- 125 wins for black
So, here clearly playing as black is an advantage, although white does have more than a 50% chance of salvaging a draw.
The result here bears no relationship to what was seen in our FRC games database, emphasising that statistically in the FRC database itself we don't have enough games to draw any valid conclusions about opening positions. I will also play this opening with Rybka 3 when it is out
Re: FRC Openings Statistics
Of course, what I'm doing here on a small scale is what the "chess960 at home" project is or was all about I think.
That project used BOINC distributed computing to play thousands of Glaurung vs Glaurung matches with the aim of gaining good statistical information. That project appears to have stopped, though?
- Kirill Kryukov
- Site Admin
- Posts: 7399
- Joined: Sun Dec 18, 2005 9:58 am
- Sign-up code: 0
- Location: Mishima, Japan
- Contact:
Re: FRC Openings Statistics
I don't think you can get meaningful statistics by repeating the same starting position with the same engine 100 times, because there is a large chance of repeating the whole game or the beginning of the game. The engine you are using has a particular understanding of particular positions. Even if the chosen moves sometimes vary, the results are not statistically reliable enough to say something about the position itself. All you can say is that in Naum's understanding the position is usually won by white.
I suggest a broader approach. Take one starting position, for example the same "nbrkbrnq". Then take all FRC-capable engines and let them play a round-robin from that position. Two cycles at least, so that each pair plays the position with both colours. Four cycles are probably OK too, but I doubt the usefulness of any more than that. You should get a decent number of games and sound statistics for this position from a "blitz computer chess" point of view.
If you want to extend your knowledge about this particular position further, repeat the whole round-robin at a slightly longer time control. Then a bit longer again, etc... This will again provide some interesting information about the position. For example, by seeing whether the average white score increases or decreases with longer time controls, you will see the trend and can predict how the position turns out at much longer time controls. This is very interesting stuff; doing this kind of thing even for one starting position would be great!
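The pairing scheme described above can be sketched as follows. This is only an illustration: the function name and the short engine list are placeholders, and it simply alternates colours within each pair so that an even number of cycles gives every pair the position equally often with both colours.

```python
from itertools import combinations


def round_robin(engines, cycles=2):
    """Return (white, black) pairings; colours alternate within each pair."""
    if cycles % 2:
        raise ValueError("use an even number of cycles so colours balance")
    games = []
    for a, b in combinations(engines, 2):
        for i in range(cycles):
            games.append((a, b) if i % 2 == 0 else (b, a))
    return games


# Illustrative subset of engines; any list of FRC-capable engines would do
engines = ["Naum", "Hiarcs", "Shredder", "Rybka"]
for white, black in round_robin(engines, cycles=2):
    print(f"{white} - {black}")

# Size of the experiment for a larger field
print(len(round_robin(list(range(16)), cycles=2)), "games for 16 engines")
```

With n engines and c cycles this yields c * n * (n - 1) / 2 games from the one starting position, so 240 games for 16 engines and two cycles.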
KCEC | EGTB Online | 3x3 Chess | 3x4 Chess | 4x4 Chess | Longest Checkmates | EGTB Test Suite | Opening Sampler | EGTB Bounty | NULP
Re: FRC Openings Statistics
Kirill Kryukov wrote: I don't think you can get meaningful statistics by repeating the same starting positions with the same engines 100 times. Because there is a large chance of repeating the whole game or the beginning of the game. The engine you are using has particular understanding of particular positions. Even if there is sometimes variation in the chosen moves, the results are not statistically reliable to say something about the position itself. All you can say is that in Naum's understanding the position is usually won by white.
Yes, but this is why I intend to repeat the experiment with
Hiarcs vs Hiarcs
Shredder vs Shredder
Rybka vs Rybka
Sjeng vs Sjeng
If the results across all engines are consistent, then it surely does say something about the position (for blitz games anyway)
You're right, I need to keep this small, focus on one position for now, and analyse it well in various different ways, and at longer time controls too.
Re: FRC Openings Statistics
Results so far for Hiarcs 12 vs Hiarcs 12 for nbrkbrnq
Total 94 games
- 49 drawn
- 28 wins for white
- 17 wins for black
Contrast this with the same position using Naum vs Naum
Total 350 games
- 35 drawn
- 289 wins for white
- 26 wins for black
Totally different results, and Kirill is right, this method of analysing opening positions is totally useless
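To put a number on "totally different", here is a rough sketch that compares the two white scores against their sampling error. It treats every game as an independent trial, which is optimistic for self-play (repeated games are exactly the concern raised earlier in the thread), so the error bars are if anything too narrow. The function name is my own.

```python
import math


def score_and_se(wins, draws, losses):
    """White score (draw = 0.5) and its standard error, assuming independent games."""
    n = wins + draws + losses
    s = (wins + 0.5 * draws) / n
    # Variance of a single game result taking the values 1, 0.5 or 0
    var = (wins * (1 - s) ** 2 + draws * (0.5 - s) ** 2 + losses * s ** 2) / n
    return s, math.sqrt(var / n)


naum_s, naum_se = score_and_se(289, 35, 26)   # Naum vs Naum, 350 games
hiarcs_s, hiarcs_se = score_and_se(28, 49, 17)  # Hiarcs vs Hiarcs, 94 games

# Difference between the two scores in units of its combined standard error
z = (naum_s - hiarcs_s) / math.hypot(naum_se, hiarcs_se)
print(f"Naum   {naum_s:.3f} +/- {naum_se:.3f}")
print(f"Hiarcs {hiarcs_s:.3f} +/- {hiarcs_se:.3f}")
print(f"difference = {z:.1f} standard errors")
```

A gap of many standard errors cannot be explained by the small number of games alone, so under these assumptions it reflects how the two engines play the position rather than sampling noise.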
- h.g.muller
- Posts: 223
- Joined: Mon Feb 19, 2007 8:24 am
- Sign-up code: 0
- Location: Amsterdam
- Contact:
Re: FRC Openings Statistics
Well, I would not say it is completely useless, but you would need an engine that randomizes its play very well. And for most engines we cannot control that.
Kirill's way also has its pitfalls. When doing the test with different engines, you will be playing engines of unequal strength against each other. And even if you let them play both sides, this tends to push the score towards 50%. This can be corrected for if the strength difference of the engines is known. In fact the white advantage of each position could be translated to Elo points, and put as independent fit parameters in the rating determination.
Games between players of very different strength will hardly contribute to the determination of these fit parameters, though. So it would be essential to play programs that are as equal in strength as possible. If I were to conduct such a study, I would give the engines time-odds to neutralize their Elo difference as much as possible, so that you can employ virtually all existing engines in a useful way. The CCRL rating lists should give you a pretty good idea how to handicap them in order to bring their Elo within, say, 100 points of each other. And then put them in a large round-robin tournament, of only 2 games per pairing. (This is what I did to make the Knightmate tournament interesting, although I had only 5 engines that could play the game at all.)
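As a rough illustration of both points, the sketch below uses the standard logistic Elo curve to convert between score and Elo, and models the white advantage as a fixed Elo bonus for whoever has white. That additive model and the function names are my assumptions for illustration, not necessarily how a rating program would fit it.

```python
import math


def expected_score(elo_diff):
    """Expected score for a side that is elo_diff points stronger (logistic Elo curve)."""
    return 1 / (1 + 10 ** (-elo_diff / 400))


def score_to_elo(score):
    """Inverse of expected_score; only defined for 0 < score < 1."""
    if not 0 < score < 1:
        raise ValueError("score must be strictly between 0 and 1")
    return 400 * math.log10(score / (1 - score))


def white_score_both_sides(white_adv, strength_diff):
    """Average white score when two engines play the position once with each colour."""
    stronger_is_white = expected_score(white_adv + strength_diff)
    weaker_is_white = expected_score(white_adv - strength_diff)
    return (stronger_is_white + weaker_is_white) / 2


# The same 50 Elo white advantage looks smaller as the engines get more unequal
for diff in (0, 100, 200, 400):
    s = white_score_both_sides(50, diff)
    print(f"strength diff {diff:3d}: white score {s:.3f}, apparent advantage {score_to_elo(s):5.1f} Elo")
```

Under this model equal engines reproduce the full 50 Elo, while a 400 Elo mismatch shrinks the measured white score towards 50%, which is the reason for pairing engines of similar (or handicapped) strength.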
Re: FRC Openings Statistics
Yes, I had identified the ELO differences as a potential problem with Kirill's approach. My choice would just be to choose a bunch of closely rated engines.
Of course, we could just choose a starting position, let a few different engines run infinite analysis on it for, say, 24 hrs, and see what the evaluations were. Someone on the Rybka forum did that for all 960 positions, but clearly not for 24 hrs, just a minute or two I think it was, and found no unequal positions. But the evaluation at move 1 doesn't give the whole story; games and opportunities develop as the game progresses - hence I was wanting to look at game results on some different positions. Maybe there is just no good way to do what I'm wanting to do...
- Kirill Kryukov
Re: FRC Openings Statistics
Unfortunately I am not aware of an FRC-capable GUI that can automatically pair only close engines based on their ELO difference (achieving a "diagonal" tourney, like my blitz comparison), or automatically give a time handicap based on ELO difference. Having such a GUI available would simplify such experiments a lot.
Re: FRC Openings Statistics
We have the FRC ratings list, so I manually choose the engines with close ratings.... GUI doesn't need to know
But anyway I'm beginning to think that there just is no good way to do what I'm trying to do
Re: FRC Openings Statistics
h.g.muller wrote: Well, I would not say it is completely useless, but you would need an engine that randomizes its play very well. And for most engines we cannot control that.
Interestingly Rybka has a "randomizer" feature which I wasn't aware of, mentioned here
http://www.rybkachess.com/index.php?aus ... +for+v+2.x
Might be just what I need
- h.g.muller
Re: FRC Openings Statistics
Kirill Kryukov wrote: Unfortunately I am not aware of a FRC-capable GUI that can automatically pair only close engines based on their ELO difference (achieving a "diagonal" tourney, like my blitz comparison), or automatically give time handicap based on ELO difference. Having such GUI available would simplify such experiments a lot.
Why would the time handicap have to be automatic? The ratings are known in advance, as they should be, since you would want to give the engines the same handicap during the entire tournament. So it seems to me that a fixed, manually precalculated handicap factor (e.g. exp(DELTA_Elo/100)) would be more than sufficient. And you could run that under PSWBTM + WinBoard (4.3.13 or higher).
- Kirill Kryukov
Re: FRC Openings Statistics
Kirill Kryukov wrote: Unfortunately I am not aware of a FRC-capable GUI that can automatically pair only close engines based on their ELO difference (achieving a "diagonal" tourney, like my blitz comparison), or automatically give time handicap based on ELO difference. Having such GUI available would simplify such experiments a lot.
h.g.muller wrote: Why would the time handicap have to be automatic? The ratings are known in advance, as they should, as you would want to give the engines the same handicap during the entire tournament. So it seems to me that a fixed, manually precalculated handicap factor (e.g. exp(DELTA_Elo/100)) would be more than sufficient. And you could run that under PSWBTM + WinBoard (4.3.13 or higher).
It's a pain to set up manual handicaps for each pair in a 16-engine tournament (CCRL 40/4 FRC has 16 engines right now if we count only best versions). What I mean by automatic time handicap is a handicap given by the GUI automatically, based on the engine ratings known in advance.
- h.g.muller
Re: FRC Openings Statistics
What difference does it make if you have to enter 16 ratings, or 16 handicap factors? In both cases you will have to enter 16 numbers, as no GUI will be able to guess the rating of an engine.
The factors are also very easy to calculate, as exp(DELTA_Elo/100), where DELTA_Elo is the number of Elo points you want to dumb the engine down by. You can almost calculate them by hand (a factor 2 for each 70 Elo you want to dumb it down), as it is good enough to equalize the Elo to within 20-30 points. E.g. if you want to dumb down 250 Elo, that is between 210 and 280, so between 3 and 4 factors of 2, i.e. between 8 and 16. It is closer to 280 than to 210, so take 13. Close enough... An alternative is to make a table for it (required factor vs Elo drop).
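Such a table can be generated in a few lines. This is a minimal sketch of the two rules of thumb given above (exp(DELTA_Elo/100) and a factor 2 per 70 Elo); the function names are my own, and the rules themselves are the approximations from this post, not measured values.

```python
import math


def factor_exp(elo_drop):
    """Time-odds factor from the exp(DELTA_Elo/100) rule of thumb."""
    return math.exp(elo_drop / 100)


def factor_doubling(elo_drop, elo_per_doubling=70):
    """Time-odds factor from the 'factor 2 per 70 Elo' rule of thumb."""
    return 2 ** (elo_drop / elo_per_doubling)


print("Elo drop  exp rule  doubling rule")
for drop in range(50, 301, 50):
    print(f"{drop:8d}  {factor_exp(drop):8.1f}  {factor_doubling(drop):13.1f}")
```

For a 250 Elo drop both rules give a factor of about 12, in line with the hand estimate of "between 8 and 16".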