Hi everyone,
I was wondering what you thought about the use of statistics to determine effective openings. Are they really predictive? I've seen authors who will suggest an opening based on a small statistical edge. But others claim that the stats are useless because too much happens between the opening and the endgame. Any thoughts on whether (or how) to use opening statistics?
Scott
As with all statistics, the sample seems to be the most important criterion for validity. In chess, that plays out as follows: a large number of recent games between top players will, I imagine, produce statistically meaningful results. If black won 59% of the last thousand games between 2600+ players over a 10-year span in which such-and-such opening was played, it's probably a good sign that the opening sucks for white. On the other hand, if the pool includes games between patzers going back to the 1800s, I wouldn't trust it for reliable information...
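To put a rough number on how much sample size matters, here is a minimal sketch that computes a Wilson score interval for a win rate. It assumes each game is simply a win or a non-win for black (draws are lumped in with losses) and that games are independent, which real databases only approximate. The function name `wilson_interval` and the 59-out-of-100 comparison case are my own illustration, not something from a chess tool.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win rate (z=1.96)."""
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half


# Same 59% win rate, two very different sample sizes.
for wins, games in [(590, 1000), (59, 100)]:
    low, high = wilson_interval(wins, games)
    print(f"{wins}/{games}: plausible win rate from {low:.3f} to {high:.3f}")
```

With a thousand games the interval stays within a few points of 59%. With a hundred games it is several times wider, so the same headline percentage tells you much less.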
In general terms it works, but paradoxically, when your opponent keeps playing the best moves, your winning chances get whittled down until there isn't enough left to win. Say your database gives you a 65% chance of winning after five moves in the Sicilian, but many of those games feature opponents making lousy moves: take that figure with a grain of salt. If your opponent is also using a database and keeps making the best percentage moves, your original 65% ends up around 48.2% (or less). That's why novelties are so prized by pros. A new move is an unknown factor for the opponent: no percentages to go on, just knowledge and experience.
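Here is a small sketch of why the headline figure shrinks. The database number for a position is an average over everything opponents have tried, weighted by how often each reply was played. An opponent who picks the best-scoring reply leaves you with only the worst of those numbers. The reply names and game counts below are made up for illustration, chosen only so that the average comes out at the 65% mentioned above.

```python
# Hypothetical replies to your move: name -> (games played, points you scored).
# Points count 1 for a win and 0.5 for a draw.
replies = {
    "popular but weak reply": (500, 400.0),
    "reasonable reply": (300, 156.0),
    "best reply": (200, 94.0),
}


def database_score(replies):
    """Score the database shows: all replies pooled together."""
    total_games = sum(games for games, _ in replies.values())
    total_points = sum(points for _, points in replies.values())
    return total_points / total_games


def best_reply(replies):
    """Reply that scores worst for you, i.e. best for the opponent."""
    name, (games, points) = min(
        replies.items(), key=lambda item: item[1][1] / item[1][0]
    )
    return name, points / games


print(f"database score: {database_score(replies):.1%}")
name, score = best_reply(replies)
print(f"against '{name}': {score:.1%}")
```

Repeat that at every move against a well-prepared opponent and the pooled percentage keeps drifting down toward whatever the critical line actually scores.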
Database stats probably help you to make sound moves, rather than winning moves. Any move that has been played frequently at GM/IM level will be sound.
Once the sample size shrinks, it is quite interesting to see some high scores for iffy moves, probably because a strong GM has played the move fairly often. It is not the move that's the winner, it's the GM.
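One rough way to separate the move from the player is to compare the actual score with what the ratings alone would predict, using the standard Elo expected-score formula. This is only a sketch: the ratings and results below are invented, and `score_vs_expectation` is a hypothetical helper, not a feature of any database program.

```python
def expected_score(own_rating, opp_rating):
    """Standard Elo expected score for the player rated own_rating."""
    return 1 / (1 + 10 ** ((opp_rating - own_rating) / 400))


def score_vs_expectation(games):
    """games: list of (own_rating, opp_rating, result), result in {1, 0.5, 0}.

    Returns (average actual score, average Elo-expected score).
    """
    n = len(games)
    actual = sum(result for _, _, result in games) / n
    expected = sum(expected_score(own, opp) for own, opp, _ in games) / n
    return actual, expected


# Hypothetical: a 2700 player trying an iffy move against 2500 opposition.
sample = [(2700, 2500, 1), (2700, 2500, 1), (2700, 2500, 0.5), (2700, 2500, 1)]
actual, expected = score_vs_expectation(sample)
print(f"actual {actual:.1%}, expected from ratings alone {expected:.1%}")
```

If most of the high score is already explained by the rating gap, the move itself deserves little of the credit.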
You can access databases of 2.5+ million games with a few clicks of the mouse, especially if you have ChessBase. But I found that if you keep only decisive games between players with an average rating of 2200, you're down to a little over 600,000 games.
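For anyone who keeps their games as plain records rather than in ChessBase, the same kind of filter can be sketched in a few lines. The records and field names here are hypothetical, and I am assuming the cutoff means the two players' average rating is at least 2200.

```python
# Hypothetical game records; field names are illustrative only.
games = [
    {"white_elo": 2650, "black_elo": 2610, "result": "1-0"},
    {"white_elo": 2400, "black_elo": 2380, "result": "1/2-1/2"},
    {"white_elo": 1900, "black_elo": 2000, "result": "0-1"},
    {"white_elo": 2300, "black_elo": 2150, "result": "0-1"},
    {"white_elo": 2100, "black_elo": 2250, "result": "1/2-1/2"},
]


def is_quality_decisive(game, min_average=2200):
    """Keep decisive games whose average player rating meets the cutoff."""
    average = (game["white_elo"] + game["black_elo"]) / 2
    return game["result"] in ("1-0", "0-1") and average >= min_average


filtered = [game for game in games if is_quality_decisive(game)]
print(f"{len(filtered)} of {len(games)} games kept")
```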
Quality matters, not just quantity.
But if you're playing 3 days per move, and you're ignoring chess history (opening books, databases of some sort), you're cheating yourself of an opportunity for productive study and chess improvement.
Say the 14th move gives a 65% chance of winning. That 65% reflects not only the value of the move but possibly also the quality of the subsequent play, which may or may not have anything to do with that particular move. And that is directly related to the quality of the players involved. So the stats may be misleading. Just playing the percentages is not enough.