I was giving some thought to this system of analysis recently while contemplating the future of the game mods (which in my opinion isn't coming back, sadly, but that's a different topic), and I've decided that it's a good system, but some extra mathy stuff might help decide how useful a given game is to look at. For instance, a game that's all theory, then a bunch of forced moves, then a draw is nearly useless, but will make both players look like cheats. Of course, you could look at it and tell, but it would be interesting to try to automate this process as much as possible.
One idea is to devise some kind of dimensionless quantity (forgive me, I've been too wrapped up in Fluid Mechanics, so I think in terms of dimensionless numbers all the time), possibly called the Rotella # (after the person who invented it, of course), that measures how many of the N non-theoretical moves in a game required deep thought or posed a problem to a player. The higher the Rotella # (Ro from now on), the more likely it is that a high match-up rate means something. This wouldn't be hard to implement. For instance:
You could take the top n move choices of an engine on each turn and see how much their evaluations differ. Set some threshold, let's just say 0.2 pawns, and count the number m of moves in this group that are within 0.2 pawns of the first choice. So each turn you'd have some fraction m/n that gives you a measure of how difficult it is to select the computer's first choice. Then:
Ro = [ sum (m/n) ] / N   (sum taken over the N non-theoretical moves)
Thus a Ro of 1 would mean that on every turn it was very difficult to pick out the computer's first choice, and a very small Ro would mean that a high match-up rate in that game means almost nothing. That said, Ro would depend heavily on n and the pawn threshold, but it would be interesting nonetheless.
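Here's a rough Python sketch of that calculation, just to make the arithmetic concrete. It assumes you already have the engine's evaluations (in pawns, from the mover's point of view) for the candidate moves on each non-theory turn. The function name rotella_number, its parameters, and the sample numbers are all made up for illustration. It also assumes that m counts the first choice itself, so a forced move scores 1/n rather than 0.

```python
from typing import Sequence


def rotella_number(
    evals_per_move: Sequence[Sequence[float]],
    n: int = 5,
    threshold: float = 0.2,
) -> float:
    """Average of m/n over the N non-theory moves of a game.

    evals_per_move: one entry per non-theory move, each holding the
        engine evaluations (in pawns) of the candidate moves.
    n: how many top engine choices to consider on each turn.
    threshold: how close (in pawns) a move must be to the first
        choice to count toward m.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not evals_per_move:
        raise ValueError("need at least one non-theory move")

    total = 0.0
    for evals in evals_per_move:
        # Keep only the engine's top n choices, best first.
        top = sorted(evals, reverse=True)[:n]
        if not top:
            raise ValueError("each move needs at least one evaluation")
        best = top[0]
        # Assumption: m includes the first choice itself, so m >= 1.
        m = sum(1 for e in top if best - e <= threshold)
        # Still divide by n when fewer than n moves exist, so a
        # position with one legal move scores as low as possible.
        total += m / n
    return total / len(evals_per_move)


# Hypothetical three-move game, top 4 engine choices per turn.
game = [
    [0.30, 0.25, 0.15, -0.40],   # three moves within 0.2 of the best
    [1.50, -0.80, -1.20, -2.00], # only one good move (forced)
    [0.00, -0.05, -0.10, -0.12], # all four moves about equal
]
print(rotella_number(game, n=4, threshold=0.2))
```

By hand, the sample game works out to (3/4 + 1/4 + 4/4) / 3 = 2/3. Loosening the threshold to 1.0 pawns pushes the same game up to 0.75, which shows how much Ro leans on the choice of n and threshold.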
P.S. - Now that I think about it, we might have had this when I was a game mod, and I just didn't know what it was. Maybe not. I'll wait for someone to tell me. 😀