Stockfish Testing Queue

Finished - 29420 tests

14-11-19 tja ttzero diff
LLR: 4.56 (-2.94,2.94) [-3.00,1.00]
Total: 220478 W: 44125 L: 44284 D: 132069
sprt @ 15+0.05 th 1 No-regression test for bugfix.
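The interval (-2.94,2.94) shown next to each LLR value is consistent with the standard Wald SPRT stopping thresholds for error rates alpha = beta = 0.05. The page does not state those error rates, so they are an assumption inferred from the numbers, and the function name below is hypothetical. A minimal sketch:

```python
import math


def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald SPRT thresholds on the log-likelihood ratio (LLR).

    alpha and beta are assumed error rates; the listing does not state them.
    """
    lower = math.log(beta / (1.0 - alpha))  # test stops (fails) at or below this
    upper = math.log((1.0 - beta) / alpha)  # test stops (passes) at or above this
    return lower, upper


lower, upper = sprt_bounds()
print(f"({lower:.2f},{upper:.2f})")  # prints (-2.94,2.94)
```

Read this way, an sprt entry is finished once its LLR leaves the interval on either side.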
14-11-20 mco pull_doubled_fix diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 39839 W: 6662 L: 6572 D: 26605
sprt @ 60+0.05 th 1 LTC: Regression test for this subtle evaluation asymmetry bug
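The LLR in the entry above can be approximately reproduced from its W/L/D counts and the [-3.00,1.00] hypotheses. The sketch below assumes a BayesElo-style trinomial model in which the draw parameter is estimated from the sample itself. The listing does not say which model it uses, so treat the model and all function names as assumptions.

```python
import math


def bayeselo_to_proba(elo, drawelo):
    """Win/loss/draw probabilities under the assumed BayesElo model."""
    win = 1.0 / (1.0 + 10.0 ** ((-elo + drawelo) / 400.0))
    loss = 1.0 / (1.0 + 10.0 ** ((elo + drawelo) / 400.0))
    return win, loss, 1.0 - win - loss


def estimate_drawelo(wins, losses, draws):
    """Draw parameter implied by the observed win and loss frequencies."""
    n = wins + losses + draws
    pw, pl = wins / n, losses / n
    return 200.0 * math.log10((1.0 - pl) / pl * (1.0 - pw) / pw)


def llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    drawelo = estimate_drawelo(wins, losses, draws)
    w0, l0, d0 = bayeselo_to_proba(elo0, drawelo)
    w1, l1, d1 = bayeselo_to_proba(elo1, drawelo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))


# Counts and hypotheses taken from the entry above.
value = llr(6662, 6572, 26605, elo0=-3.0, elo1=1.0)
print(round(value, 2))
```

Under these assumptions the result should land close to the 2.95 reported for that entry.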
14-11-20 gli keep_old_pv_at_root diff
ELO: -0.95 +-2.1 (95%) LOS: 19.3%
Total: 40000 W: 7851 L: 7960 D: 24189
40000 @ 15+0.05 th 1 Measure ELO again, this time only writing back the PV if the score is within the alpha-beta bounds
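For fixed-length entries like the one above, the ELO, error margin, and LOS figures can be approximately recovered from the W/L/D counts. The sketch below assumes the usual logistic Elo conversion, a normal approximation for the 95% margin, and LOS computed from decisive games only. The listing does not document its formulas, so these choices and the function names are assumptions.

```python
import math
from statistics import NormalDist


def score_to_elo(score):
    """Logistic Elo difference for an expected score in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_summary(wins, losses, draws, confidence=0.95):
    """Return (elo, margin, los) under the assumptions named above."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half = z * math.sqrt(var / n)
    margin = (score_to_elo(score + half) - score_to_elo(score - half)) / 2.0
    # Likelihood of superiority: draws carry no information about the sign.
    los = NormalDist().cdf((wins - losses) / math.sqrt(wins + losses))
    return score_to_elo(score), margin, los


# Counts taken from the entry above.
elo, margin, los = elo_summary(7851, 7960, 24189)
print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {los * 100:.1f}%")
```

With these counts the output should be close to the "ELO: -0.95 +-2.1 (95%) LOS: 19.3%" line of that entry.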
14-11-20 mco pull_accurate_pv_3 diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 54668 W: 9873 L: 9809 D: 34986
sprt @ 15+0.05 th 3 Retest with correct sprt parameters. Sorry for the mistake
14-11-19 sg spsa_knight_fork_threat diff
19381/20000 iterations
39888/40000 games played
40000 @ 15+0.05 th 1 The first SPSA tuning attempt seems to give a worse result than the initial values. Perhaps with higher resolution (c = 5) we get better results. Prio -1
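The SPSA entries report iterations and games played, at roughly two games per iteration. As a rough illustration of the technique, here is a generic SPSA loop on a toy objective. It is not the tuner behind this listing: in engine tuning the difference between the two perturbed parameter sets would presumably come from game results rather than a loss function. The fixed gains a and c, the toy loss, and all names are assumptions made for the sketch.

```python
import random


def spsa_minimize(loss, theta, a=0.1, c=0.5, iterations=200, seed=0):
    """Generic SPSA with fixed gains a (step size) and c (perturbation size)."""
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(iterations):
        # Perturb all parameters at once with a random +-1 vector.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c * d for t, d in zip(theta, delta)]
        minus = [t - c * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Two evaluations give a gradient estimate for every parameter.
        theta = [t - a * diff / (2.0 * c * d) for t, d in zip(theta, delta)]
    return theta


def toy_loss(theta):
    """Hypothetical objective with its minimum at (1, -2)."""
    return (theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2


start = [4.0, 0.0]
result = spsa_minimize(toy_loss, start)
print(toy_loss(start), toy_loss(result))
```

A larger c makes each perturbation easier to measure against noise but coarser, which is the trade-off the comment about resolution refers to.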
14-11-20 mco pull_doubled_fix diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21035 W: 4244 L: 4122 D: 12669
sprt @ 15+0.05 th 1 Regression test for this subtle evaluation asymmetry bug
14-11-19 lbr history diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 46285 W: 7766 L: 7704 D: 30815
sprt @ 60+0.05 th 1 d^2.1
14-11-18 nab tune_shelter diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 20956 W: 3468 L: 3555 D: 13933
sprt @ 60+0.05 th 1 LTC: Got something like 2-3 ELO in local testing. The values have never been tested in fishtest, so a test might be needed. LTC because StormDanger is TC sensitive.
14-11-19 mco pull_accurate_pv_3 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 24662 W: 4425 L: 4467 D: 15770
sprt @ 15+0.05 th 3 Regression test with 3 threads for this further simplification of accurate pv (changed from the one that failed)
14-11-18 gli keep_old_pv_at_root diff
ELO: -4.46 +-2.2 (95%) LOS: 0.0%
Total: 40000 W: 7770 L: 8284 D: 23946
40000 @ 15+0.05 th 1 Check no regression after change to keep old PV at root on fail-high/low, if the new PV is a prefix of the old PV.
14-11-19 mco pull_accurate_pv_3 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 9081 W: 1764 L: 1846 D: 5471
sprt @ 15+0.05 th 1 Regression test with 1 thread for this further simplification of accurate pv (changed from the one that failed)
14-11-16 fwi optimise_bestMove_chang diff
19810/20000 iterations
40000/40000 games played
40000 @ 60+0.05 th 1 LTC because TC sensitivity is quite likely. SPSA to optimise bestmove.
14-11-19 sg knight_fork_threats diff
ELO: -2.21 +-3.0 (95%) LOS: 7.8%
Total: 20000 W: 3947 L: 4074 D: 11979
20000 @ 15+0.05 th 1 Use the SPSA tuned values. Some parameters show a straight trend and so seem not to have converged, so an Elo measurement is done first.
14-11-18 mco pull_accurate_pv_3 diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 48915 W: 8787 L: 9028 D: 31100
sprt @ 15+0.05 th 3 Regression test with 3 threads for this further simplification of accurate pv
14-11-17 lbr history diff
LLR: 2.96 (-2.94,2.94) [-0.50,3.50]
Total: 124817 W: 25186 L: 24681 D: 74950
sprt @ 15+0.05 th 1 d^2.1
14-11-18 lbr history diff
LLR: -2.96 (-2.94,2.94) [-1.00,4.00]
Total: 51886 W: 10386 L: 10371 D: 31129
sprt @ 15+0.05 th 1 bonus=d^2.1 hmax=250^(2.1/2)
14-11-18 jos Outposts diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 14902 W: 3008 L: 3074 D: 8820
sprt @ 15+0.05 th 1 Try Ed's idea separately with 4 pawns as limit.
14-11-18 sg spsa_knight_fork_threat diff
19493/20000 iterations
40000/40000 games played
40000 @ 15+0.05 th 1 Tune now with SPSA, starting from the values I got from PBIL. This is my first SPSA run, so please check that everything is OK.
14-11-18 jos Outposts diff
ELO: -0.94 +-2.6 (95%) LOS: 23.6%
Total: 30000 W: 6310 L: 6391 D: 17299
30000 @ 10+0.05 th 1 Quick check of a modified outpost code.
14-11-15 Fis asp_tune_result diff
LLR: -3.03 (-2.94,2.94) [0.00,4.00]
Total: 77126 W: 12853 L: 12765 D: 51508
sprt @ 60+0.05 th 1 Results of aspiration tuning. LTC
14-11-17 sni no_shuffling diff
ELO: -0.56 +-3.9 (95%) LOS: 38.9%
Total: 10000 W: 1617 L: 1633 D: 6750
10000 @ 60+0.05 th 1 Estimate the Elo cost of avoiding piece shuffling
14-11-16 hxi piecevalues diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 14856 W: 2933 L: 2999 D: 8924
sprt @ 15+0.05 th 1 -50 mg bishop and knight
14-11-18 nab knight_outpost diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 5522 W: 1055 L: 1147 D: 3320
sprt @ 15+0.05 th 1 STC: Bonus for outpost knight supported by knight
14-11-18 nab king_opposition diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 16120 W: 2693 L: 2718 D: 10709
sprt @ 60+0.05 th 1 LTC: King opposition
14-11-17 sg knight_fork_threats diff
ELO: -0.21 +-3.1 (95%) LOS: 44.7%
Total: 20000 W: 4029 L: 4041 D: 11930
20000 @ 15+0.05 th 1 Add bonus for safe knight fork threats. Parameters tuned by population-based incremental learning (see also my post on the forum).
14-11-17 lbr history^ diff
LLR: -2.95 (-2.94,2.94) [-0.50,3.50]
Total: 14616 W: 2876 L: 2996 D: 8744
sprt @ 15+0.05 th 1 d^1.9
14-11-17 nab king_opposition diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 19010 W: 3929 L: 3769 D: 11312
sprt @ 15+0.05 th 1 STC: King opposition
14-11-17 mco accurate_pv_2 diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 26644 W: 5270 L: 5480 D: 15894
sprt @ 15+0.05 th 1 Test some functional changes to accurate_pv. It is a sensible simplification. Refer to the log messages for details.
14-11-17 lbr history^ diff
LLR: -2.96 (-2.94,2.94) [-0.50,3.50]
Total: 15102 W: 2966 L: 3085 D: 9051
sprt @ 15+0.05 th 1 d^1.8
14-11-17 lbr history diff
LLR: -2.95 (-2.94,2.94) [-0.50,3.50]
Total: 17782 W: 3495 L: 3606 D: 10681
sprt @ 15+0.05 th 1 d^2.2
14-11-16 gli kid diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 27196 W: 4462 L: 4500 D: 18234
sprt @ 60+0.05 th 1 One more try on KID king safety pattern detection.
14-11-17 mco accurate_pv diff
ELO: 1.22 +-6.6 (95%) LOS: 64.1%
Total: 10000 W: 4724 L: 4689 D: 587
10000 @ 10+0.05 th 3 Crash test for 'accurate pv' series, as required by Joona.
14-11-15 lbr history diff
LLR: -2.74 (-2.94,2.94) [-0.50,3.50]
Total: 127288 W: 25658 L: 25454 D: 76176
sprt @ 15+0.05 th 1 reduce history max by 1/4. last try
14-11-14 gli kid diff
19435/20000 iterations
39259/40000 games played
40000 @ 60+0.05 th 1 SPSA tuning of KID attack patterns, LTC because very TC sensitive
14-11-16 lbr asym diff
LLR: -2.96 (-2.94,2.94) [-0.50,3.50]
Total: 19372 W: 3827 L: 3934 D: 11611
sprt @ 15+0.05 th 1 asymmetric history: double malus
14-11-16 lbr asym^ diff
LLR: -2.95 (-2.94,2.94) [-0.50,3.50]
Total: 4580 W: 882 L: 1031 D: 2667
sprt @ 15+0.05 th 1 asymmetric history: double bonus
14-11-16 gli kid diff
LLR: -3.69 (-2.94,2.94) [-1.50,4.50]
Total: 20054 W: 3351 L: 3431 D: 13272
sprt @ 60+0.05 th 1 King-threat patterns, using SPSA tuned values. LTC because king safety is very TC sensitive.
14-11-15 fwi timemanagement_depthbas diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6769 W: 1304 L: 1392 D: 4073
sprt @ 15+0.05 th 1 spsa values, tuned @ slower time control
14-11-15 hxi statsupd diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 6338 W: 1232 L: 1322 D: 3784
sprt @ 15+0.05 th 1 try different history update
14-11-15 Fis asp_tune_result diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 16151 W: 3285 L: 3134 D: 9732
sprt @ 15+0.05 th 1 Results of aspiration tuning.
14-11-13 fwi timemanagement_depthbas diff
19776/20000 iterations
40000/40000 games played
40000 @ 60+0.05 th 1 Looking whether this converges to different values at slower time control. Also increased ck for Depth Factor to 50, to allow search range to be bigger.
14-11-15 lbr history diff
LLR: 3.25 (-2.94,2.94) [0.00,4.00]
Total: 21346 W: 3691 L: 3453 D: 14202
sprt @ 60+0.05 th 1 half history max
14-11-15 lbr history diff
LLR: 3.35 (-2.94,2.94) [-0.50,3.50]
Total: 17993 W: 3740 L: 3508 D: 10745
sprt @ 15+0.05 th 1 half history max
14-11-15 lbr noEvalImproving diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 5483 W: 1045 L: 1217 D: 3221
sprt @ 15+0.05 th 1 retire improving
14-11-14 jos null_threat diff
LLR: -3.13 (-2.94,2.94) [-1.50,4.50]
Total: 22679 W: 4621 L: 4671 D: 13387
sprt @ 15+0.05 th 1 Take 2. (Based on Marco's code)
14-11-15 lbr history diff
LLR: -2.95 (-2.94,2.94) [-0.50,3.50]
Total: 8385 W: 1667 L: 1805 D: 4913
sprt @ 15+0.05 th 1 double history max
14-11-14 Fis asp_tune diff
24380/25000 iterations
50000/50000 games played
50000 @ 15+0.05 th 1 Looks like a small improvement so far and we are still converging so tune a bit more. Pri -1
14-11-15 sni rule50 diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 1153 W: 172 L: 271 D: 710
sprt @ 60+0.05 th 1 Soft 50-move rule: instead of returning VALUE_DRAW abruptly after 50 moves, decrease eval slowly after 10 moves or more of piece shuffling
14-11-14 sni stochastic4 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 18856 W: 3843 L: 3898 D: 11115
sprt @ 15+0.05 th 1 Double stochastic mobility in midgame
14-11-14 sg move_order diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 9432 W: 1910 L: 1991 D: 5531
sprt @ 15+0.05 th 1 add bonus to double checks