Moderator: Pachnes
astokes wrote:Kirill, your points on independence and accuracy are good ones, though I would have guessed DTC were similarly independent. Perhaps I don't fully grasp the metrics yet. The chess glossary defines DTC as "any conversion of material" which may be "either a promotion or capture". Doesn't this give you a constant-value boundary (zero) before leaping into a subordinate table?
astokes wrote:I guess DTC is an odd duck, because it doesn't have an advantageous size, nor does it express jugular efficiency (though it will steer you there eventually). DTC would be the algorithmic view of things, DTM the chess purist perspective, and DTZ the "blessed by committee" perspective. The WDL tables we are contemplating are actually WDL-unbounded and are not the same as WLD derived from DTZ-50, so these don't please everyone, either, despite being seemingly semantic free.
astokes wrote:I think I've detected the unstated assumption: the KPPP-KPP style piece combinations can't be generated with full chess accuracy without first generating myriads of promotion tables, most of which won't permit this kind of dataflow simplification, and if you've done the hard part already, why waste time coding up a quick conclusion for just one piece set? (My dirty secret, which I mentioned to Kirill in personal mail, is that I'm willing to tread somewhat lightly on full chess accuracy, so I never assume I'll bother will all those promotion tables before computing a piece set of interest.)
astokes wrote:Forum nit: "Log me in automatically" is not having much effect on my Firefox of many paranoid extensions. NoScript is the usual culprit, but k-k.com shows as completely green listed. It's disorienting, since after I log in, if I use the back button to return to the original resource (as is my habit), my login does not take effect (I tried a page refresh as well with no success).
Kirill Kryukov wrote: With WDL sub-endgame tables, however, any errors in DTx will remain there and not affect the other tables. This should allow to relax the verification requirements somewhat (e.g., to postpone the verification, because any error can be corrected locally without having to recompute a large number of dependant tables). This idea, as far as I understand, was not a part of Guy's proposal. Although the idea is trivial, I never saw it mentioned by others until I formulated it for the 7-piece EGTB Bounty. (I'm sure there should be earlier references that I'm missing).
guyhaw wrote:While we are talking about boundaries ...
Most people assume that the winner does the conversion into the next endgame. This is not always true: the loser can convert a Pawn, or be forced to capture. N Wirth, back in the 1990s, inherited a piece of code that assumed 'winner does all the conversions' and as a result his otherwise more accurate 'depths in plies' were unpredictably less accurate at times.
guyhaw wrote:J Schaeffer wrote a piece about managing bugs in the process of creating Checkers (WDL) EGTs ... ICGA conference I think. One of them was 'just verifying the internal consistency of an endgame EGT, as opposed to checking its consistency with successor EGTs as well'.
guyhaw wrote:The encoding of depth need not be aligned with the sign-bit of a notional integer of say 8 bits.
It needs to cover 'draw', 'index unused' (aka 'broken' in Nalimov terminology), lost in 0-n1 and won in 1-n2. If (n1 + n2 + 3) > 64, Nalimov used a 16-bit integer. The choice of 8 bits or 16 was made by Nalimov before the EGT was created.
This occasionally caused some rework when 8 had been chosen when 16 was needed, and even when 16 was chosen and 8 bits was sufficient.
guyhaw wrote:Of course, in the DTC and DTZ metrics, the n1 and n2 were smaller which helps with the compression of the EGT file.
guyhaw wrote:Jonathan Schaeffer's article on bugs and testing in the generation of EGTs was:
- - - - - Schaeffer, J., Björnsson, Y., Burch, N., Lake, R., Lu, P., Sutphen, S.: Building the Checkers 10-piece Endgame Databases.
- - - - - In: Advances in Computer Games 10, pp. 193-210 (2003)
I have no idea why they went through a round of internal testing without boundary-testing on EGTs. Maybe JS is making the simple point that internal testing does not pick up the endgame-boundary problems ... and there were problems to be picked up. Certainly, JdK with FEG initially ignored the possibility of capturing directly into a KNNK mate.
My article on 'Data Assurance in Opaque Computations' with Joe Hurd is at http://centaur.reading.ac.uk/4515/ .
g
astokes wrote:I found an hour over the weekend to compute the pawn slices of KPPP-KPP.
astokes wrote:Coding dominant move instead of DTx (as I am leaning toward with my compression method), probably 2.5 bits per table entry (on a fairly simple move order heuristic), times two sides to move.
astokes wrote:HDF5
Finally, I took a very quick look at HDF5, which I evaluated a couple of years ago for a project with large storage demands. It was very fast in the tests conducted, even from a rather minimal R binding. Fast, portable, infinitely scalable, accommodates metadata, has a rich tool ecosystem, and bindings for everything. Aimed at petabyte data sets, and well financed for future maintenance. Downside is an extremely rich API, so the tiny part of value to this project is tiny needles in a giant documentation haystack--if one takes the view that too much documentation is a bad thing.
HDF-EOS Project
It even seems to support rudimentary packed bit fields. I'll do some performance tests on that soon.
Anyone else have experience with this? I prefer to work with standard file types. Functionally I don't think it offers much over roll-your-own for EGTB tables. However, I would like to bias further collaboration to embed comprehensive combinatorial metadata (position counts) in the giant files (for validation and comparison), and a standard format might facilitate this.
guyhaw wrote:The DTM EGTs seems to compress by a factor of ~4, and the DTZ EGTs by a factor of ~8.
To reduce the cost, the program only iterates over all positions until a "small" number of changes occurs in an iteration. The positions that change value are saved in a queue.
A more general point is that the boundary of any set of values for a parameter in a computation is a 'dangerous place'. It's not naturally in the centre of the mindset of the algorithm designer and can be a place for 'one out' errors.
astokes wrote:Programmers get sucked into visualizing the interior case by planning to debug their code; I never plan to do so (for a special value of studiously near-sighted). We also suck ourselves into this by writing if/else constructions without distinguishing between necessity (solely dictated by pre/post conditions) and preference (in time, space, or policy) where other statement orderings were equally valid under program correctness.
astokes wrote:Edit: Probably more than half the comments in any code I write concerns differentiating shimness from brickness in whatever follows. I make heavy use of partial bricks, where the program asserts out in any subcase I haven't formally analyzed. Analysis is often quick when your program immediately bombs when fed some tasty new input, since you're probably thinking clearly about the new input at precisely that moment. I enjoy big computation because people look at you a little less funny for being quite so pedantic.
astokes wrote:Kirill, a quick comment on distributed file systems. HDF5 lives in the HPC space and has some application level support for this (mounds of documentation to wade through). I have no idea if what it provides overlaps with your ambitions.
On another front, distributed file systems is the raison d'être of DragonFly BSD. You could maybe post a question in one of their technical forums at some point to see whether any of those guys thinks this problem space is a good fit. They might be enthusiastic about supporting a computation that really pushes the edge in what their (admittedly young) OS is able to handle.
Users browsing this forum: Bing [Bot] and 1 guest