Stockfish Testing Queue

Finished - 29420 tests

14-11-13 Fis asp_tune_result diff
ELO: 0.75 +-3.0 (95%) LOS: 68.4%
Total: 20000 W: 4034 L: 3991 D: 11975
20000 @ 15+0.05 th 1 Quick test of some preliminary SPSA-found aspiration values to see if they are really an improvement. (Starting delta still seems to be increasing but is already significantly larger than before) Pri -1
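The ELO, error margin and LOS figures shown for fixed-length tests can be approximately reproduced from the W/L/D totals. The following is a minimal sketch, assuming the standard logistic Elo model, a normal approximation for the 95% interval, and the usual likelihood-of-superiority formula based on wins and losses only. The function names are illustrative and are not taken from fishtest itself.

```python
import math


def elo_from_score(score):
    """Convert an expected score in (0, 1) to a logistic Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_stats(wins, losses, draws):
    """Return (elo, 95% error margin, LOS) for a W/L/D record."""
    games = wins + losses + draws
    w, l, d = wins / games, losses / games, draws / games
    score = w + 0.5 * d
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = w * (1.0 - score) ** 2 + d * (0.5 - score) ** 2 + l * score ** 2
    # Half-width of the 95% interval on the score (normal approximation).
    half_width = 1.96 * math.sqrt(var / games)
    elo = elo_from_score(score)
    # Map the score interval to Elo and report half of its width.
    margin = (elo_from_score(score + half_width)
              - elo_from_score(score - half_width)) / 2.0
    # Likelihood of superiority: draws carry no information here.
    los = 0.5 * (1.0 + math.erf((wins - losses)
                                / math.sqrt(2.0 * (wins + losses))))
    return elo, margin, los


# Totals from the asp_tune_result entry above.
elo, margin, los = match_stats(4034, 3991, 11975)
print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {los * 100:.1f}%")
```

Under these assumptions the result agrees with the listed ELO and LOS to the displayed precision; the error margin may differ in the last digit depending on rounding.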
14-11-12 gli master diff
ELO: 36.21 +-1.9 (95%) LOS: 100.0%
Total: 40000 W: 8511 L: 4357 D: 27132
40000 @ 60+0.05 th 1 Regression test against sf5, previous was 31 +- 1.9 (this time using 8moves_v3 as previous regression test did)
14-11-12 gli master diff
ELO: 45.62 +-2.2 (95%) LOS: 100.0%
Total: 40000 W: 10801 L: 5579 D: 23620
40000 @ 60+0.05 th 1 Regression test against sf5, previous was 31 +- 1.9
14-11-13 Fis asp_tune diff
24457/25000 iterations
50000/50000 games played
50000 @ 15+0.05 th 1 Tune aspiration parameters. Pri -1
14-11-13 fwi timemanagement_depthbas diff
19676/20000 iterations
40000/40000 games played
40000 @ 15+0.05 th 1 My suspicion is that the optimisation got stuck in a local minimum. To test this I am using a different set of initial values. Final round of time management using previous depth. If this does not work I'll give up on this theme. Combination of previously used best values.
14-11-13 sni threats5 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 23462 W: 4735 L: 4777 D: 13950
sprt @ 15+0.05 th 1 Threat tempo in endgame
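For SPRT entries, the pair in parentheses after the LLR, (-2.94,2.94), is the stopping interval. A minimal sketch of where those numbers come from, assuming Wald's classic bounds with type I and type II error rates alpha = beta = 0.05 (an assumption, but one consistent with the displayed values); the function names are illustrative only:

```python
import math


def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's SPRT stopping bounds on the log-likelihood ratio."""
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this
    return lower, upper


def sprt_status(llr, alpha=0.05, beta=0.05):
    """Classify a test by its current (unrounded) LLR."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "passed"
    if llr <= lower:
        return "failed"
    return "running"


lower, upper = sprt_bounds()
print(f"({lower:.2f},{upper:.2f})")
print(sprt_status(-2.96), sprt_status(0.69), sprt_status(2.96))
```

The listing shows LLR values rounded to two decimals, so a borderline entry displayed as 2.94 or -2.94 cannot be classified reliably from the rounded figure alone.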
14-11-13 lbr tuned_tm diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 41649 W: 8329 L: 8321 D: 24999
sprt @ 15+0.05 th 1 Tuned values. Seems rather insensitive, not sure it got anywhere. I don't know what happened: I was only browsing it, but fishtest says I stopped it. Relaunched.
14-11-12 pec tune_tm diff
24238/25000 iterations
49870/50000 games played
50000 @ 15+0.05 th 1 Tune faster hard stop on unchanging first root move
14-11-12 fwi timemanagement_depthbas diff
19544/20000 iterations
40000/40000 games played
40000 @ 15+0.05 th 1 Final round of time management using previous depth. If this does not work I'll give up on this theme. Combination of previously used best values.
14-11-12 sni threats4 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 18857 W: 3723 L: 3778 D: 11356
sprt @ 15+0.05 th 1 Threat tempo (experimental run)
14-11-12 sg move_order diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 16525 W: 3300 L: 3361 D: 9864
sprt @ 15+0.05 th 1 give bonus to underpromotions
14-11-11 Roc Outpost diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 5298 W: 1054 L: 1147 D: 3097
sprt @ 15+0.05 th 1 Artificial outpost bonus, first try, using spsa values
14-11-11 Roc Outpost diff
24140/25000 iterations
50000/50000 games played
50000 @ 15+0.05 th 1 SPSA tuning for new "Artificial outpost" feature, computed only on squares where pure outposts were valued more than one.
14-11-08 fwi bestMove_changes_optimi diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 129943 W: 22155 L: 21888 D: 85900
sprt @ 60+0.05 th 1 Rerun @ sprt[0,4], because it's parameter tuning (had not realised there is a special rule for that, sorry). bestMove Changes SPSA optimised values. Values did not fully converge. So I am using the average values they oscillated about.
14-11-11 tvi hist_tune diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 11446 W: 2291 L: 2366 D: 6789
sprt @ 15+0.05 th 1 Tune history
14-11-11 lbr qstt diff
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 25085 W: 5059 L: 4991 D: 15035
sprt @ 15+0.05 th 1 allow quiet ttMove in qsearch(). check for non regression with hash pressure also.
14-11-10 lbr qstt diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41181 W: 7008 L: 6920 D: 27253
sprt @ 60+0.05 th 1 allow quiet ttMove in qsearch()
14-11-10 jos null_threat diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 12973 W: 2565 L: 2636 D: 7772
sprt @ 15+0.05 th 1 My version of null-move threat extension. Only on PvNodes.
14-11-10 lbr qstt diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 33067 W: 6670 L: 6571 D: 19826
sprt @ 15+0.05 th 1 allow quiet ttMove in qsearch()
14-11-10 mco null_extension diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 22473 W: 4522 L: 4567 D: 13384
sprt @ 15+0.05 th 1 Another attempt at null extension, this time without verification search
14-11-09 jos tune_shelter diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 35207 W: 5937 L: 5977 D: 23293
sprt @ 60+0.05 th 1 Final test for new SPSA values for StormDanger array, directly at LTC.
14-11-09 lbr history diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 11200 W: 1843 L: 2016 D: 7341
sprt @ 60+0.05 th 1 Left over from ONE_PLY=2. A micro functional change because granularity isn't the same.
14-11-09 sni probcut5 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 7810 W: 1555 L: 1641 D: 4614
sprt @ 15+0.05 th 1 Larger reduction
14-11-09 mco null_extension diff
LLR: 0.00 (-2.94,2.94) [0.00,6.00]
Total: 21274 W: 3726 L: 3621 D: 13927
sprt @ 60+0.05 th 1 LTC: Null search extension. See http://www.talkchess.com/forum/viewtopic.php?t=54281
14-11-09 gli accurate_pv diff
ELO: -2.02 +-2.0 (95%) LOS: 2.5%
Total: 40000 W: 6990 L: 7223 D: 25787
40000 @ 15+0.05 th 3 Verify no crashes after removing PV truncation code.
14-11-08 sni probcut3 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 45027 W: 9165 L: 9147 D: 26715
sprt @ 15+0.05 th 1 More accurate probcut
14-11-09 lbr history diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 8031 W: 1672 L: 1527 D: 4832
sprt @ 15+0.05 th 1 Left over from ONE_PLY=2. A micro functional change because granularity isn't the same.
14-11-08 mco null_extension diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 41072 W: 8428 L: 8206 D: 24438
sprt @ 15+0.05 th 1 Null search extension. See http://www.talkchess.com/forum/viewtopic.php?t=54281
14-11-09 mco null_extension diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 9763 W: 1913 L: 1993 D: 5857
sprt @ 15+0.05 th 1 Use beta in verification and alpha in null
14-11-09 mco null_extension^ diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 27723 W: 5521 L: 5552 D: 16650
sprt @ 15+0.05 th 1 Fix: the correct verification limit is alpha, not alpha-100
14-11-09 lbr aspiration diff
LLR: -2.96 (-2.94,2.94) [-1.00,4.00]
Total: 37020 W: 7404 L: 7430 D: 22186
sprt @ 15+0.05 th 1 move unviolated bound by 3/4 of the window
14-11-08 jos tune_shelter diff
ELO: 2.99 +-2.8 (95%) LOS: 98.1%
Total: 20000 W: 3539 L: 3367 D: 13094
20000 @ 60+0.05 th 1 Check also at LTC as proposed by Gary.
14-11-09 mco null_extension diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 4649 W: 898 L: 992 D: 2759
sprt @ 15+0.05 th 1 Use always beta as verification limit
14-11-09 lbr aspiration diff
LLR: -2.95 (-2.94,2.94) [-1.00,4.00]
Total: 14223 W: 2843 L: 2932 D: 8448
sprt @ 15+0.05 th 1 assume no instability
14-11-08 mco null_extension diff
LLR: 0.69 (-2.94,2.94) [-1.50,4.50]
Total: 41192 W: 8332 L: 8193 D: 24667
sprt @ 15+0.05 th 1 Set threshold at -100 below alpha
14-11-09 sni probcut4 diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 21174 W: 4227 L: 4275 D: 12672
sprt @ 15+0.05 th 1 Simpler threshold eval >= beta
14-11-08 mco no_protected diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 30138 W: 6047 L: 6265 D: 17826
sprt @ 15+0.05 th 1 Simplify evaluate_threats()
14-11-06 gli accurate_pv diff
ELO: 2.94 +-1.9 (95%) LOS: 99.9%
Total: 40000 W: 6308 L: 5970 D: 27722
40000 @ 60+0.05 th 3 Measure ELO of accurate PV at 60s, 3 threads, 32mb hash
14-11-07 jos tune_shelter diff
ELO: 1.20 +-3.0 (95%) LOS: 78.0%
Total: 20000 W: 4032 L: 3963 D: 12005
20000 @ 15+0.05 th 1 Since SPSA values are close to default, first a quick check at STC to see how much gain there is, if any.
14-11-07 pec master1 diff
ELO: 0.72 +-1.9 (95%) LOS: 77.4%
Total: 60000 W: 13687 L: 13563 D: 32750
60000 @ 6+0.03 th 1 Measure impact of rearranging timer thread at very fast tc. Seems like speed and elo drop
14-11-08 Fis laterKS_3minor diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 28131 W: 5704 L: 5733 D: 16694
sprt @ 15+0.05 th 1 Another version of later king safety to include 3 minors but not RB or RN. (If both patches pass I will play them head to head and ONLY submit the stronger to LTC)
14-11-08 lbr aspiration diff
LLR: -0.31 (-2.94,2.94) [0.00,5.00]
Total: 691 W: 120 L: 130 D: 441
sprt @ 60+0.05 th 1 aspiration: same but keep original growth 3/8
14-11-08 lbr aspiration diff
LLR: 2.94 (-2.94,2.94) [0.00,5.00]
Total: 21666 W: 3780 L: 3569 D: 14317
sprt @ 60+0.05 th 1 aspiration: be more optimistic wrt search instability
14-11-08 Fis laterKS diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 7300 W: 1406 L: 1493 D: 4401
sprt @ 15+0.05 th 1 Apply king safety even later in the game. Suggested by Ajith in the forum.
14-11-08 lbr aspiration diff
LLR: 2.94 (-2.94,2.94) [-1.00,4.00]
Total: 17326 W: 3618 L: 3441 D: 10267
sprt @ 15+0.05 th 1 aspiration: same but keep original growth 3/8
14-11-08 lbr aspiration diff
LLR: 2.96 (-2.94,2.94) [-1.00,4.00]
Total: 16362 W: 3371 L: 3197 D: 9794
sprt @ 15+0.05 th 1 aspiration: be more optimistic wrt search instability
14-11-07 fwi bestMove_changes_optimi diff
LLR: -2.97 (-2.94,2.94) [0.00,6.00]
Total: 30683 W: 5296 L: 5249 D: 20138
sprt @ 60+0.05 th 1 bestMove Changes SPSA optimised values. Values did not fully converge. So I am using the average values they oscillated about.
14-10-28 jos tune_shelter diff
24734/25000 iterations
50000/50000 games played
50000 @ 60+0.05 th 1 Check for tc dependency. Very low priority for times when queue is empty. Maybe this is of general interest. (If not, feel free to not approve/delete.)
14-11-02 gli accurate_pv diff
ELO: -3.60 +-2.9 (95%) LOS: 0.8%
Total: 20000 W: 3575 L: 3782 D: 12643
20000 @ 15+0.05 th 3 (3 threads) Measure ELO of accurate PV version 4 (allow TT-refined values to be used in PV q-search)
14-11-06 luc phatm diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 4545 W: 866 L: 961 D: 2718
sprt @ 15+0.05 th 1 Take into account ponder hitting when determining current move's importance: final take 3 (locally tuned values / in case queue gets empty)