Stockfish Testing Queue

Finished - 56600 tests

15-01-20 sg prune_pv diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 19173 W: 3774 L: 3828 D: 11571
sprt @ 15+0.05 th 1 After allowing pruning at PV nodes, try to exclude specific moves. Move count pruning: don't allow pruning counter moves at PV nodes (Take 1)
15-01-20 sg prune_pv diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10017 W: 1888 L: 1967 D: 6162
sprt @ 15+0.05 th 1 move count pruning: don't allow pruning killer moves at PV nodes (Take 2)
15-01-20 sg prune_pv diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 14011 W: 2890 L: 2744 D: 8377
sprt @ 15+0.05 th 1 move count pruning: don't allow pruning followup moves at PV nodes (Take 3)
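Note on the LLR lines: in each sprt record the pair in parentheses appears to be the stopping bounds of the sequential test, and the pair in square brackets the two Elo hypotheses being compared. The sketch below is one way to reproduce the LLR figure from the W/L/D counts. It assumes a BayesElo win/draw/loss model, a draw parameter estimated from the sample itself, and error rates alpha = beta = 0.05; the page does not state any of these, and all function names are made up for illustration.

```python
import math


def bayeselo_to_proba(elo, drawelo):
    """Return (win, draw, loss) probabilities under the BayesElo model (assumed)."""
    win = 1.0 / (1.0 + 10.0 ** ((-elo + drawelo) / 400.0))
    loss = 1.0 / (1.0 + 10.0 ** ((elo + drawelo) / 400.0))
    return win, 1.0 - win - loss, loss


def estimate_drawelo(wins, losses, draws):
    """Estimate the BayesElo draw parameter from the observed frequencies."""
    n = wins + losses + draws
    pw, pl = wins / n, losses / n
    return 200.0 * math.log10((1.0 - pl) / pl * (1.0 - pw) / pw)


def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of hypothesis elo1 against hypothesis elo0."""
    drawelo = estimate_drawelo(wins, losses, draws)
    w0, d0, l0 = bayeselo_to_proba(elo0, drawelo)
    w1, d1, l1 = bayeselo_to_proba(elo1, drawelo)
    return (wins * math.log(w1 / w0)
            + draws * math.log(d1 / d0)
            + losses * math.log(l1 / l0))


def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower and upper stopping bounds for the given error rates (assumed 5%)."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)


if __name__ == "__main__":
    # Counts from the "Take 3" record directly above, bounds [-1.50,4.50].
    llr = sprt_llr(2890, 2744, 8377, -1.5, 4.5)
    lower, upper = sprt_bounds()
    print(f"LLR: {llr:.2f} ({lower:.2f},{upper:.2f})")
```

Under these assumptions the "Take 3" counts give an LLR just above the upper bound, and the "Take 1" counts give one just below the lower bound, which agrees with the values listed.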
15-01-20 pec tm diff
ELO: 1.33 +-2.4 (95%) LOS: 85.7%
Total: 30000 W: 5884 L: 5769 D: 18347
30000 @ 15+0.05 th 1 Remove hard stop on unchanging root moves. Take 3
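Note on the ELO lines: for fixed-length tests the ELO and LOS figures can be recovered from the W/L/D counts. The sketch below assumes the usual logistic Elo formula applied to the mean score, and a normal approximation for both the 95% margin and the likelihood of superiority; the page does not say how its numbers are computed, and the function names are hypothetical.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference corresponding to an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def summarize(wins, losses, draws):
    """Return (elo, margin95, los) for a finished match.

    margin95 is half the width of a normal-approximation 95% interval,
    los is the probability that the true mean score exceeds 0.5.
    """
    n = wins + losses + draws
    w, l, d = wins / n, losses / n, draws / n
    mu = w + d / 2.0                      # mean score per game
    var = w * (1.0 - mu) ** 2 + l * mu ** 2 + d * (0.5 - mu) ** 2
    se = math.sqrt(var / n)               # standard error of the mean score
    z = 1.959964                          # two-sided 95% normal quantile
    margin = (elo_from_score(mu + z * se) - elo_from_score(mu - z * se)) / 2.0
    los = 0.5 * (1.0 + math.erf((mu - 0.5) / (se * math.sqrt(2.0))))
    return elo_from_score(mu), margin, los


if __name__ == "__main__":
    # Counts from the "Remove hard stop on unchanging root moves. Take 3" record.
    elo, margin, los = summarize(5884, 5769, 18347)
    print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {100.0 * los:.1f}%")
```

For the record above this comes out close to the listed ELO of 1.33 and LOS of 85.7%; the margin lands near the listed +-2.4 but may differ in the last digit depending on rounding.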
15-01-20 sg prune_pv diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 9905 W: 1638 L: 1693 D: 6574
sprt @ 60+0.05 th 1 LTC: move count pruning: don't allow pruning followup moves at PV nodes (Take 3)
15-01-20 n_p KingSafetySPSA diff
ELO: 1.38 +-2.1 (95%) LOS: 90.1%
Total: 41981 W: 8529 L: 8362 D: 25090
40000 @ 15+0.05 th 1 Testing the values obtained from the SPSA-session against the branch HighKingSafety.
15-01-20 pec tm diff
LLR: 3.19 (-2.94,2.94) [-3.00,1.00]
Total: 71981 W: 14341 L: 14300 D: 43340
sprt @ 15+0.05 th 1 Remove hard stop on unchanging root moves. Take 3. Test as simplification
15-01-21 pec tm diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22326 W: 3561 L: 3750 D: 15015
sprt @ 60+0.05 th 1 LTC. Remove hard stop on unchanging root moves. Take 3. Test as simplification
15-01-21 n_p KingSafetySPSA50 diff
ELO: -0.31 +-2.5 (95%) LOS: 40.1%
Total: 31006 W: 6258 L: 6286 D: 18462
40000 @ 15+0.05 th 1 Try SPSA-values with 50% more change.
15-01-21 n_p KingSafetySPSA diff
ELO: 2.07 +-2.1 (95%) LOS: 97.3%
Total: 34859 W: 5967 L: 5759 D: 23133
30000 @ 60+0.05 th 1 The new king safety values do not look like a significant regression in STC. See if there is a clear ELO-gain in LTC.
15-01-22 n_p KingSafetySPSAClean diff
ELO: -16.40 +-41.2 (95%) LOS: 21.7%
Total: 106 W: 18 L: 23 D: 65
40000 @ 15+0.05 th 1 Using the SPSA-values but with the 'noise cleaned', i.e. only using values that changed by at least 5% and by at least 2.
15-01-22 mco KingSafetySPSAClean diff
LLR: -1.55 (-2.94,2.94) [-1.50,4.50]
Total: 2867 W: 548 L: 596 D: 1723
sprt @ 15+0.05 th 1 Using the SPSA-values but with the 'noise cleaned', i.e. only using values that changed by at least 5% and by at least 2.
15-01-23 sg scale_endgame diff
ELO: 1.99 +-2.5 (95%) LOS: 94.2%
Total: 30000 W: 6103 L: 5931 D: 17966
30000 @ 15+0.05 th 1 Measure the effect of scaling down the endgame score. Perhaps this slightly discourages straight exchanges into endgames.
15-01-23 n_p KingSafetySPSAClean diff
ELO: -2.04 +-2.1 (95%) LOS: 2.7%
Total: 43000 W: 8491 L: 8744 D: 25765
40000 @ 15+0.05 th 1 Using the SPSA-values but with the 'noise cleaned', i.e. only using values that changed by at least 5% and by at least 2. The test was stopped yesterday due to a wrong bench (after a few thousand games, but shouldn't that happen automatically and immediately if the benches differ?), but I cannot find why the bench from the test branch should differ from the local one. So I am retesting this, hoping it was a quirk in fishtest.
15-01-23 sg spsa_scale_endgame diff
19833/20000 iterations
40000/40000 games played
40000 @ 15+0.05 th 1 The concept seems promising, so first try to optimize the parameters (including mindbreaker's idea)
15-01-24 roh QuickMove diff
LLR: -1.97 (-2.94,2.94) [-1.50,4.50]
Total: 21000 W: 4223 L: 4236 D: 12541
sprt @ 15+0.05 th 1 Changed the alternate move search depth.
15-01-24 sg spsa_scale_endgame diff
19353/20000 iterations
39670/40000 games played
40000 @ 15+0.05 th 1 My first tuning attempt breaks eval symmetry, so I now stick to my original approach. Mea culpa.
15-01-24 n_p KingSafetySPSANoise diff
ELO: -18.57 +-18.6 (95%) LOS: 2.5%
Total: 543 W: 96 L: 125 D: 322
40000 @ 15+0.05 th 1 The big movers only gave an ELO-drop. Perhaps only the "noise" will do better than the original values.
15-01-24 jos check_extension diff
LLR: -3.26 (-2.94,2.94) [-1.50,4.50]
Total: 6308 W: 1242 L: 1343 D: 3723
sprt @ 15+0.05 th 1 Also extend checks with negative SEE if remaining depth is small.
15-01-24 sg scale_endgame diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 48483 W: 9772 L: 9745 D: 28966
sprt @ 15+0.05 th 1 Tuning indicates my start value is already good, so test this now with sprt.
15-01-24 n_p KingSafetySPSANoise diff
ELO: -0.80 +-3.0 (95%) LOS: 30.4%
Total: 20086 W: 4007 L: 4053 D: 12026
40000 @ 15+0.05 th 1 The big movers gave an ELO-drop. Perhaps only the "noise" will do better than the original values.
15-01-24 sg fix_skill_level diff
ELO: 534.29 +-11.7 (95%) LOS: 100.0%
Total: 20000 W: 19098 L: 863 D: 39
20000 @ 15+0.05 th 1 Disable move pruning at the root node to fix the reported problem when using skill levels (test with skill level 1).
15-01-24 sg fix_skill_level diff
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 117279 W: 23585 L: 23642 D: 70052
sprt @ 15+0.05 th 1 Verify the skill level fix is no regression in standard play
15-01-25 lbr test diff
LLR: -3.72 (-2.94,2.94) [-3.00,1.00]
Total: 103036 W: 20322 L: 20713 D: 62001
sprt @ 15+0.05 th 1 seems ok in local testing
15-01-25 mco hashfull diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 7352 W: 1548 L: 1401 D: 4403
sprt @ 15+0.05 th 1 Regression test for hashfull patch.
15-01-25 mco hashfull diff
LLR: 0.16 (-2.94,2.94) [0.00,6.00]
Total: 4400 W: 757 L: 730 D: 2913
sprt @ 60+0.05 th 1 LTC: Regression test for hashfull patch.
15-01-25 mco hashfull diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 61432 W: 10307 L: 10251 D: 40874
sprt @ 60+0.05 th 1 LTC: Regression test for hashfull patch.
15-01-25 mco skill diff
ELO: -130.82 +-7.2 (95%) LOS: 0.0%
Total: 10000 W: 3144 L: 6741 D: 115
10000 @ 10+0.05 th 1 Compare simplified skill to current one: level 1
15-01-25 mco skill diff
ELO: -116.85 +-7.1 (95%) LOS: 0.0%
Total: 10000 W: 3274 L: 6516 D: 210
10000 @ 10+0.05 th 1 Compare simplified skill to current one: level 2
15-01-25 mco skill diff
ELO: -96.98 +-7.0 (95%) LOS: 0.0%
Total: 10000 W: 3495 L: 6216 D: 289
10000 @ 15+0.05 th 1 Compare simplified skill to current one: level 3
15-01-25 mco skill diff
ELO: -81.26 +-6.8 (95%) LOS: 0.0%
Total: 10000 W: 3613 L: 5910 D: 477
10000 @ 15+0.05 th 1 Compare simplified skill to current one: level 4
15-01-26 lbr skill diff
ELO: -171.16 +-7.6 (95%) LOS: 0.0%
Total: 10000 W: 2671 L: 7234 D: 95
10000 @ 9+0.03 th 1 Compare simplified skill to current one: level 0
15-01-26 lbr skill diff
ELO: -48.18 +-6.5 (95%) LOS: 0.0%
Total: 10000 W: 3862 L: 5240 D: 898
10000 @ 15+0.05 th 1 Compare simplified skill to current one: level 6
15-01-26 lbr skill diff
ELO: -23.38 +-6.3 (95%) LOS: 0.0%
Total: 10000 W: 4001 L: 4673 D: 1326
10000 @ 15+0.05 th 1 Compare simplified skill to current one: level 8
15-01-26 Roc PinnedPawn diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 6161 W: 1169 L: 1259 D: 3733
sprt @ 15+0.05 th 1 Removing Pinned Pawns from attacks. See Git Notes.
15-01-26 lbr skill diff
ELO: -14.88 +-6.2 (95%) LOS: 0.0%
Total: 10000 W: 3882 L: 4310 D: 1808
10000 @ 20+0.05 th 1 Compare simplified skill to current one: level 10
15-01-26 lbr skill diff
ELO: -2.62 +-4.3 (95%) LOS: 11.4%
Total: 20000 W: 7756 L: 7907 D: 4337
20000 @ 20+0.05 th 1 Compare simplified skill to current one: level 12
15-01-26 Mys chk_red diff
LLR: -3.00 (-2.94,2.94) [-1.50,4.50]
Total: 12154 W: 2448 L: 2523 D: 7183
sprt @ 15+0.05 th 1 Try reducing reduction for checks, low pri.
15-01-26 mco skill diff
ELO: 28.94 +-6.8 (95%) LOS: 100.0%
Total: 10000 W: 5384 L: 4553 D: 63
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: Level 1 vs Level 0
15-01-26 mco skill diff
ELO: 69.24 +-6.9 (95%) LOS: 100.0%
Total: 10000 W: 5940 L: 3973 D: 87
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: Level 2 vs Level 1
15-01-26 sg scale_endgame diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 17589 W: 3465 L: 3524 D: 10600
sprt @ 15+0.05 th 1 Scale down endgame by 13/16 (Take 2)
15-01-26 mco skill diff
ELO: 26.95 +-6.8 (95%) LOS: 100.0%
Total: 10000 W: 5351 L: 4577 D: 72
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: Level 1 vs Level 0 (take 2)
15-01-26 mco skill diff
ELO: 71.60 +-6.9 (95%) LOS: 100.0%
Total: 10000 W: 5972 L: 3940 D: 88
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: Level 2 vs Level 1 (take 2)
15-01-26 mco skill diff
ELO: 70.76 +-6.9 (95%) LOS: 100.0%
Total: 10000 W: 5946 L: 3937 D: 117
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: Level 3 vs Level 2 (take 2)
15-01-26 mco skill diff
ELO: 85.89 +-6.9 (95%) LOS: 100.0%
Total: 10000 W: 6130 L: 3707 D: 163
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: Level 4 vs Level 3 (take 2)
15-01-26 mco skill diff
ELO: 26.46 +-6.7 (95%) LOS: 100.0%
Total: 10000 W: 5270 L: 4510 D: 220
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: Level 5 vs Level 4 (take 2)
15-01-26 mco skill diff
ELO: 91.38 +-6.9 (95%) LOS: 100.0%
Total: 10000 W: 6139 L: 3568 D: 293
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: Level 6 vs Level 5 (take 2)
15-01-26 jos less_imbalance_eg^ diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 19060 W: 3827 L: 3881 D: 11352
sprt @ 15+0.05 th 1 80% endgame score, take 2.
15-01-26 jos less_imbalance_eg diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 15983 W: 3130 L: 3193 D: 9660
sprt @ 15+0.05 th 1 120% endgame score, take 3.
15-01-26 Roc PinnedPawn diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 7339 W: 1391 L: 1478 D: 4470
sprt @ 15+0.05 th 1 Take 2: exclude only the attacks from horz/vert pinned pawns.