Originally posted by XanthosNZ Yes but old hash table data is written over once the table is full. What if one side starts overwriting data from the other engine?
They have separate RAM, so hopefully each engine has enough that overwriting entries in its own hash table isn't a problem.
Originally posted by XanthosNZ Yes but old hash table data is written over once the table is full. What if one side starts overwriting data from the other engine?
I don't see how that is possible?! Each engine will run in its own process, and each process will have its own memory space. So, each has its own "protected" hash table.
Regarding clearing of hash tables, I think some engines have a parameter setting to do this or not after each move. Most engines will maintain their hash tables between moves (in the context of a "permanent brain off" engine match, running in a Fritz style GUI).
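To illustrate the point about each process having its own protected table, here is a toy sketch (not any real engine's code, and far simpler than a real transposition table): a fixed-size table that, once full, overwrites one of its *own* old entries. Since each engine process holds its own table object in its own memory space, neither side can clobber the other's data. The eviction rule (drop the shallowest-depth entry) is just one common replacement scheme, chosen here for illustration.

```python
# Hypothetical sketch of a fixed-size transposition table.
# Each engine process would own its own instance, so overwriting
# only ever happens within that process's own table.

class TranspositionTable:
    def __init__(self, max_entries=4):
        self.max_entries = max_entries
        self.table = {}  # position hash -> (search depth, score)

    def store(self, key, depth, score):
        # When the table is full, overwrite an old entry instead of
        # growing: here we evict the shallowest-depth entry.
        if key not in self.table and len(self.table) >= self.max_entries:
            victim = min(self.table, key=lambda k: self.table[k][0])
            del self.table[victim]
        self.table[key] = (depth, score)

    def probe(self, key):
        return self.table.get(key)

tt = TranspositionTable(max_entries=2)
tt.store(0xA1, depth=3, score=15)
tt.store(0xB2, depth=7, score=-20)
tt.store(0xC3, depth=5, score=40)   # table full: shallowest entry (0xA1) evicted
print(tt.probe(0xA1))  # None -- overwritten, but only within this process's table
print(tt.probe(0xB2))  # (7, -20)
```

A "clear hash after move" setting would just amount to calling something like `tt.table.clear()` between moves; leaving it off keeps the accumulated entries, as most engines do by default.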
Originally posted by TippedKing After reading countless postings about how Fritz is yesterday's news and HIARCS can easily cream it, HIARCS gets better depth more quickly, blah blah blah, I decided to run my own little engine tournament to see for myself.
In a 40/40 format, 319MB each for hash table, permanent brain off, both having access to my EGTB directory, and each using their o ...[text shortened]... next time I see one of the postings dissing Fritz. It obviously still knows what it is doing.
You have to play hundreds, perhaps thousands of games to get an accurate rating when matching engines.
Originally posted by LanndonKane You have to play hundreds, perhaps thousands of games to get an accurate rating when matching engines.
I used to waste my time explaining margins of error to people who made wild assertions like this, but that is in the past. You can look it up yourself. Use terms like 'standard error', 'statistics' and 'sample size'. You will find that as your sample size increases, your accuracy does improve, but that sample sizes do not have to be all that high to still get accurate information.
Yes, ten is a small sample size, but with results as lopsided as these you will find that Fritz is still a clear winner (under the parameters used in the test) even after you account for error margins. You will also find that you reach a point where adding games to the sample no longer significantly improves its statistical accuracy, and you reach that point long before you get anywhere near 'thousands of games'.
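The standard-error argument can be checked with a few lines of arithmetic. The sketch below treats each game as a Bernoulli-style trial (an approximation that ignores draws, which would only shrink the variance further) and computes a rough 95% interval around a hypothetical 8-2 match score. Even at n=10, the lower bound of the interval stays above 0.5, i.e. above an even score.

```python
import math

def match_standard_error(score_fraction, n_games):
    """Standard error of the mean score in an n-game match,
    approximating each game as an independent Bernoulli trial."""
    return math.sqrt(score_fraction * (1 - score_fraction) / n_games)

# Hypothetical example: an 8-2 result (score fraction 0.8).
p = 8 / 10
for n in (10, 100, 1000):
    se = match_standard_error(p, n)
    # Rough normal-approximation interval (can spill past [0, 1] at small n).
    low, high = p - 1.96 * se, p + 1.96 * se
    print(f"n={n:4d}  score={p:.2f}  95% CI approx ({low:.2f}, {high:.2f})")
```

At n=10 the interval is wide (roughly 0.55 to 1.0), but its lower end is still above 0.5; going from 100 to 1000 games narrows the interval far less than going from 10 to 100, which is the diminishing-returns point made above.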
What does engine vs. engine strength matter to anyone other than the engine manufacturers (who use it as a sales tool) and people playing engine vs. engine matches?
Engines playing against engines will never have to evaluate positions that they don't reach in these games. But what happens when you stick one such position into the engine for analysis (say it occurred during a game you played)? Now your engine is doing something different to what you measured. Yes, there will be a relationship (good engines will tend to be good at both) but they aren't the same.
Originally posted by XanthosNZ What does engine vs. engine strength matter to anyone other than the engine manufacturers (who use it as a sales tool) and people playing engine vs. engine matches?
Engines playing against engines will never have to evaluate positions that they don't reach in these games. But what happens when you stick one such position into the engine for analysis (sa ...[text shortened]... e will be a relationship (good engines will tend to be good at both) but they aren't the same.
I could not agree more with these statements. That isn't what my little impromptu test was meant to measure though. I simply had read all kinds of things that made it sound like the new HIARCS would just hands down completely slaughter Fritz because it was so positionally superior and calculated so much faster. I was genuinely curious, so I ran my own little engine match.
I will be the first to admit engine vs engine means pretty much nothing other than how well it plays against that particular engine with those particular settings.
Running a test like the one you have described would be much more relevant, but it is also a lot more difficult to do. Feeding in the positions isn't so hard, but any results that don't match the 'expected' outcome need to be analyzed to make sure the engine didn't find a better solution than the expected one 😉
Originally posted by TippedKing I could not agree more with these statements. That isn't what my little impromptu test was meant to measure though. I simply had read all kinds of things that made it sound like the new HIARCS would just hands down completely slaughter Fritz because it was so positionally superior and calculated so much faster. I was genuinely curious, so I ran my own ...[text shortened]... be analyzed to make sure the engine didn't find a better solution than the expected one 😉
I thought Hiarcs was a slower-calculating engine (so less depth) but with a more complex evaluation algorithm, Shredder being the opposite (deeper but with a simpler evaluation) and Fritz being somewhere in between.
Originally posted by TippedKing I could not agree more with these statements. That isn't what my little impromptu test was meant to measure though.
You could set up a computer account on FICS and let the engines loose for some fixed number of games, one engine at a time, then compare the ratings they achieve. You'd get masses of human opponents that way. There's a slight problem, though: the second engine would continue from the final rating of the first one... but given a big enough sample size, that could be ignored.