FRC Openings Statistics

Questions and comments related to CCRL testing study
Ray
Posts: 22583
Joined: Sun Dec 18, 2005 6:33 pm
Sign-up code: 10159
Location: NZ

FRC Openings Statistics

Post by Ray »

It is slightly hard to find, but on the games page there is a link called "Games By Opening".

It opens up this page
http://www.computerchess.org.uk/ccrl/40 ... y_eco.html

The column headers are clickable and sort the table. Click on "white score" and it gets interesting. Screenshot of the first few lines:

Re: FRC Openings Statistics

Post by Ray »

An example: here is the first opening in that list. It has 20 games, all decided (no draws), and you can see of course that each opening is played with sides reversed.

Re: FRC Openings Statistics

Post by Ray »

Of course there are not enough games in the database for meaningful statistics. As time permits I'm going to do some further analysis of the first few

Re: FRC Openings Statistics

Post by Ray »

The nbrkbrnq opening experiment with Naum 3.1 64-bit playing itself is finished (350 games)

Total 350 games
- 35 drawn
- 289 wins for white
- 26 wins for black

So White wins about 82.6% of the games (a white score of about 87.6% if draws count as half a point) - very high indeed

I'll repeat this with Rybka 3 when it comes out. At this early stage, I have to say if a human or engine gets this opening as white, they might have got lucky :!:
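For anyone who wants to redo the arithmetic on the tally above, here is a minimal sketch. It assumes the usual convention that a draw counts as half a point for each side; the function names are made up for illustration.

```python
def white_score(white_wins: int, draws: int, black_wins: int) -> float:
    """Score from White's side: a win counts 1, a draw 0.5, a loss 0."""
    games = white_wins + draws + black_wins
    return (white_wins + 0.5 * draws) / games


def white_win_fraction(white_wins: int, draws: int, black_wins: int) -> float:
    """Plain fraction of games won by White, ignoring draws."""
    games = white_wins + draws + black_wins
    return white_wins / games


# Naum 3.1 vs itself from nbrkbrnq: 289 white wins, 35 draws, 26 black wins
naum = (289, 35, 26)
print(f"white score: {white_score(*naum):.1%}")         # 87.6%
print(f"white wins : {white_win_fraction(*naum):.1%}")  # 82.6%
```

The two numbers differ only by half of the draw share, so with few draws they are close together.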

Re: FRC Openings Statistics

Post by Ray »

An experiment with NBQNBRKR with Naum 3.1 64-bit playing itself also completed (300 games)

Total 300 games
- 156 drawn
- 19 wins for white
- 125 wins for black

So, here clearly playing as black is an advantage, although white does have more than a 50% chance of salvaging a draw.

The result here bears no relationship to what was seen in our FRC games database, which emphasises that the FRC database itself doesn't have enough games to draw any statistically valid conclusions about opening positions. I will also play this opening with Rybka 3 when it is out.

Re: FRC Openings Statistics

Post by Ray »

Of course, what I'm doing here on a small scale is what the "chess960 at home" project is or was all about, I think.
That project used BOINC distributed computing to play thousands of Glaurung vs Glaurung matches with the aim of gaining good statistical information. That project appears to have stopped, though?
Kirill Kryukov
Site Admin
Posts: 7399
Joined: Sun Dec 18, 2005 9:58 am
Sign-up code: 0
Location: Mishima, Japan

Re: FRC Openings Statistics

Post by Kirill Kryukov »

I don't think you can get meaningful statistics by repeating the same starting position with the same engine 100 times, because there is a large chance of repeating the whole game or the beginning of the game. The engine you are using has a particular understanding of particular positions. Even if there is sometimes variation in the chosen moves, the results are not statistically reliable enough to say something about the position itself. All you can say is that in Naum's understanding the position is usually won by white.

I suggest a broader approach. Take one starting position, for example the same "nbrkbrnq". Then take all FRC-capable engines and let them play a round-robin from that position. Two cycles at least, so that each pair plays the position with both colours. Four cycles are probably OK too, but I doubt the usefulness of any more than that. You should get a decent number of games and sound statistics for this position from a "blitz computer chess" point of view.
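A rough sketch of what such a schedule looks like, assuming every pair simply swaps colours on each successive cycle. The engine list is just an example taken from names mentioned in this thread, and `round_robin` is a hypothetical helper, not a feature of any GUI.

```python
from itertools import combinations


def round_robin(engines, cycles=2):
    """Every pair meets once per cycle, swapping colours on alternate cycles."""
    games = []
    for cycle in range(cycles):
        for a, b in combinations(engines, 2):
            # Even cycles: first engine is White; odd cycles: colours reversed
            white, black = (a, b) if cycle % 2 == 0 else (b, a)
            games.append((white, black))
    return games


engines = ["Naum", "Hiarcs", "Shredder", "Rybka"]  # example names only
schedule = round_robin(engines, cycles=2)
print(len(schedule))  # 12 games: 6 pairs x 2 cycles
for white, black in schedule:
    print(f"{white} (White) vs {black} (Black)")
```

With 16 engines the same scheme gives 120 pairs, so 240 games for two cycles and 480 for four, all from the one starting position.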

If you want to further extend your knowledge about this particular position, repeat the whole round-robin at a slightly longer time control, then at a longer one still, etc. This again will provide some interesting information about this position. For example, by seeing whether the average white score increases or decreases with longer time controls, you will see the tendency and can predict how this position turns out at much longer time controls. This is very interesting stuff; doing this kind of thing even for one starting position would be great!

Re: FRC Openings Statistics

Post by Ray »

Kirill Kryukov wrote: I don't think you can get meaningful statistics by repeating the same starting position with the same engine 100 times, because there is a large chance of repeating the whole game or the beginning of the game. The engine you are using has a particular understanding of particular positions. Even if there is sometimes variation in the chosen moves, the results are not statistically reliable enough to say something about the position itself. All you can say is that in Naum's understanding the position is usually won by white.
Yes, but this is why I intend to repeat the experiment with:
Hiarcs vs Hiarcs
Shredder vs Shredder
Rybka vs Rybka
Sjeng vs Sjeng

If the results across all engines are consistent, then it surely does say something about the position (for blitz games anyway)

You're right, I need to keep this small and focus on one position for now, and analyse it well in various different ways, and at longer time controls too.

Re: FRC Openings Statistics

Post by Ray »

Results so far for Hiarcs 12 vs Hiarcs 12 for nbrkbrnq

Total 94 games
- 49 drawn
- 28 wins for white
- 17 wins for black

Contrast this with the same position using Naum vs Naum

Total 350 games
- 35 drawn
- 289 wins for white
- 26 wins for black

Totally different results. Kirill is right: this method of analysing opening positions is totally useless :shock:
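To put a number on "totally different", here is a back-of-the-envelope sketch comparing the two tallies above. It treats every game as an independent sample, which is exactly the assumption Kirill questions for self-play from a fixed position, so the standard errors are optimistic at best. The helper name is made up for illustration.

```python
from math import sqrt


def score_and_se(wins, draws, losses):
    """White score and its standard error, assuming independent games."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score
    var = (wins * (1 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    return score, sqrt(var / n)


naum_score, naum_se = score_and_se(289, 35, 26)    # Naum vs Naum, 350 games
hiarcs_score, hiarcs_se = score_and_se(28, 49, 17)  # Hiarcs vs Hiarcs, 94 games

# How many combined standard errors separate the two white scores
z = (naum_score - hiarcs_score) / sqrt(naum_se ** 2 + hiarcs_se ** 2)

print(f"Naum   white score: {naum_score:.1%} +/- {naum_se:.1%}")
print(f"Hiarcs white score: {hiarcs_score:.1%} +/- {hiarcs_se:.1%}")
print(f"difference in standard errors: {z:.1f}")
```

Even under that generous assumption the gap is many standard errors wide, so it looks like a property of the engine rather than sampling noise.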
h.g.muller
Posts: 223
Joined: Mon Feb 19, 2007 8:24 am
Sign-up code: 0
Location: Amsterdam

Re: FRC Openings Statistics

Post by h.g.muller »

Well, I would not say it is completely useless, but you would need an engine that randomizes its play very well. And for most engines we cannot control that.

Kirill's way also has its pitfalls. When doing the test with different engines, you will be playing engines of unequal strength against each other. And even if you let them play both sides, this tends to push the score towards 50%. This can be corrected for if the strength difference of the engines is known. In fact, the white advantage of each position could be translated to Elo points and put in as independent fit parameters in the rating determination.
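As a rough illustration of translating a white score into Elo points, the sketch below uses the standard logistic Elo expectation curve. It is only the single-number conversion, not the full fit with one parameter per position described above, and the function names are made up.

```python
from math import log10


def elo_to_score(elo_diff: float) -> float:
    """Expected score for a side that is elo_diff points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def score_to_elo(score: float) -> float:
    """Inverse of elo_to_score; only defined for 0 < score < 1."""
    return 400.0 * log10(score / (1.0 - score))


# White score of Naum vs Naum from nbrkbrnq: 289 wins, 35 draws, 26 losses
naum_white_score = (289 + 0.5 * 35) / 350
print(f"white advantage: {score_to_elo(naum_white_score):.0f} Elo")
print(f"round trip     : {elo_to_score(score_to_elo(naum_white_score)):.3f}")
```

A score of exactly 50% maps to 0 Elo, and the conversion blows up as the score approaches 0% or 100%, which is one reason lopsided pairings carry little information.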

Games between players of very different strength will hardly contribute to the determination of these fit parameters, though. So it would be essential to play programs that are as equal in strength as possible. If I were to conduct such a study, I would give the engines time-odds to neutralize their Elo difference as much as possible, so that you can employ virtually all existing engines in a useful way. The CCRL rating list should give you a pretty good idea how to handicap them in order to bring their Elo within, say, 100 points of each other. And then put them in a large round-robin tournament of only 2 games per pairing. (This is what I did to make the Knightmate tournament interesting, although I had only 5 engines that could play the game at all.)

Re: FRC Openings Statistics

Post by Ray »

Yes, I had identified the Elo differences as a potential problem with Kirill's approach. My choice would just be to pick a bunch of closely rated engines.

Of course, we could just choose a starting position and let a few different engines run infinite analysis on it for, say, 24 hrs, and see what the evaluations are. Someone on the Rybka forum did that for all 960 positions, though clearly not for 24 hrs (just a minute or two, I think), and found no unequal positions. But the evaluation at move 1 doesn't give the whole story: opportunities develop as the game progresses, hence I wanted to look at game results for some different positions. Maybe there is just no good way to do what I want to do...

Re: FRC Openings Statistics

Post by Kirill Kryukov »

Unfortunately I am not aware of an FRC-capable GUI that can automatically pair only close engines based on their Elo difference (achieving a "diagonal" tourney, like my blitz comparison), or automatically give a time handicap based on Elo difference. Having such a GUI available would simplify such experiments a lot.
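Outside any GUI, the "diagonal" pairing could be prepared with a few lines of script and the resulting list fed to whatever tournament manager is in use. A minimal sketch follows; the engine names and ratings are placeholders, not actual CCRL figures, and the 100-point window is only an assumed cut-off.

```python
from itertools import combinations

# Placeholder ratings for illustration only (NOT real CCRL numbers)
ratings = {
    "EngineA": 2900,
    "EngineB": 2850,
    "EngineC": 2780,
    "EngineD": 2600,
}


def close_pairs(ratings, max_gap=100):
    """All pairs whose rating difference is at most max_gap Elo."""
    return [
        (a, b)
        for (a, ra), (b, rb) in combinations(ratings.items(), 2)
        if abs(ra - rb) <= max_gap
    ]


for a, b in close_pairs(ratings):
    print(f"{a} vs {b} (gap {abs(ratings[a] - ratings[b])} Elo)")
```

Note that an engine with no neighbour inside the window (EngineD here) gets no games at all, so the window may need widening for outliers.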

Re: FRC Openings Statistics

Post by Ray »

We have the FRC ratings list, so I can manually choose the engines with close ratings... the GUI doesn't need to know.
But anyway, I'm beginning to think that there just is no good way to do what I'm trying to do.

Re: FRC Openings Statistics

Post by Ray »

h.g.muller wrote:Well, I would not say it is completely useless, but you would need an engine that randomizes its play very well. And for most engines we cannot control that.
Interestingly Rybka has a "randomizer" feature which I wasn't aware of, mentioned here

http://www.rybkachess.com/index.php?aus ... +for+v+2.x

Might be just what I need :wink:

Re: FRC Openings Statistics

Post by h.g.muller »

Kirill Kryukov wrote: Unfortunately I am not aware of an FRC-capable GUI that can automatically pair only close engines based on their Elo difference (achieving a "diagonal" tourney, like my blitz comparison), or automatically give a time handicap based on Elo difference. Having such a GUI available would simplify such experiments a lot.
Why would the time handicap have to be automatic? The ratings are known in advance, as they should be, since you would want to give the engines the same handicap during the entire tournament. So it seems to me that a fixed, manually precalculated handicap factor (e.g. exp(DELTA_Elo/100)) would be more than sufficient. And you could run that under PSWBTM + WinBoard (4.3.13 or higher).

Re: FRC Openings Statistics

Post by Kirill Kryukov »

h.g.muller wrote:
Kirill Kryukov wrote: Unfortunately I am not aware of an FRC-capable GUI that can automatically pair only close engines based on their Elo difference (achieving a "diagonal" tourney, like my blitz comparison), or automatically give a time handicap based on Elo difference. Having such a GUI available would simplify such experiments a lot.
Why would the time handicap have to be automatic? The ratings are known in advance, as they should be, since you would want to give the engines the same handicap during the entire tournament. So it seems to me that a fixed, manually precalculated handicap factor (e.g. exp(DELTA_Elo/100)) would be more than sufficient. And you could run that under PSWBTM + WinBoard (4.3.13 or higher).
It's a pain to set up manual handicaps for each pair in a 16-engine tournament (CCRL 40/4 FRC has 16 engines right now if we count only the best versions). What I mean by automatic time handicap is a handicap given by the GUI automatically, based on the engine ratings known in advance.

Re: FRC Openings Statistics

Post by h.g.muller »

What difference does it make if you have to enter 16 ratings, or 16 handicap factors? In both cases you will have to enter 16 numbers, as no GUI will be able to guess the rating of an engine.

The factors are also very easy to calculate, as exp(DELTA_Elo/100), where DELTA_Elo is the number of Elo points you want to take off. You can almost calculate them by hand (a factor of 2 for each 70 Elo you want to dumb it down), as it is good enough to equalize the Elo to within 20-30 points. E.g. if you want to dumb down by 250 Elo, that is between 210 and 280, so between 3 and 4 factors of 2, which is between 8 and 16. It is closer to 280 than to 210, so take 13. Close enough... An alternative is to make a table for it (required factor vs Elo drop).
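Such a table takes only a few lines to generate. The sketch below simply evaluates the exp(DELTA_Elo/100) rule of thumb quoted in this thread; it is not a calibrated model of how any particular engine scales with time, and `handicap_factor` is a made-up name.

```python
from math import exp


def handicap_factor(elo_drop: float) -> float:
    """Time-odds factor for a desired Elo drop, using exp(drop / 100)."""
    return exp(elo_drop / 100.0)


# Required factor vs Elo drop, in steps of 50 Elo
print("Elo drop   time factor")
for drop in range(0, 301, 50):
    print(f"{drop:>8}   {handicap_factor(drop):>11.1f}")
```

The exact value for a 250 Elo drop comes out a little above 12, so the hand estimate of 13 is well within the 20-30 Elo tolerance mentioned above.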