Stockfish Testing Queue

Finished - 43823 tests

13-02-13 simplify_eval diff
ELO: 2.29 +-4.8 (95%) LOS: 93.0%
Total: 20000 W: 4075 L: 3943 D: 11982
20000 @ 40/5+0.1 th 1
13-02-13 bishop_pin_clop diff
ELO: 4.73 +-4.8 (95%) LOS: 99.9%
Total: 20000 W: 4166 L: 3894 D: 11940
20000 @ 40/3+0.1 th 1
13-02-15 eval_scale diff
ELO: -1.00 +-7.6 (95%) LOS: 34.3%
Total: 8000 W: 1602 L: 1625 D: 4773
8000 @ 40/3+0.1 th 1
13-02-15 skip_null diff
ELO: -4.05 +-4.8 (95%) LOS: 0.4%
Total: 20000 W: 3630 L: 3863 D: 12507
20000 @ 40/8+0.1 th 1
13-02-15 all_cut_squash diff
ELO: -4.00 +-7.6 (95%) LOS: 3.3%
Total: 8000 W: 1203 L: 1295 D: 5502
8000 @ 40/30+0.5 th 1
13-02-16 test294 diff
ELO: -0.74 +-7.6 (95%) LOS: 38.3%
Total: 8000 W: 1632 L: 1649 D: 4719
8000 @ 15+0.05 th 1
13-02-17 singular_tweak diff
ELO: -1.46 +-4.8 (95%) LOS: 16.5%
Total: 20000 W: 3667 L: 3751 D: 12582
20000 @ 15+0.05 th 1 Simpler version of all_cut_squash
13-02-17 singular_tweak4 diff
ELO: 2.99 +-4.8 (95%) LOS: 97.7%
Total: 20000 W: 3776 L: 3604 D: 12620
20000 @ 15+0.05 th 1
13-02-17 bishop_pin_clop diff
ELO: -1.18 +-5.6 (95%) LOS: 24.4%
Total: 15011 W: 2680 L: 2731 D: 9600
16000 @ 15+0.05 th 1
13-02-18 lazy_eval diff
ELO: 2.19 +-5.6 (95%) LOS: 88.3%
Total: 14943 W: 3177 L: 3083 D: 8683
16000 @ 5+0.05 th 1
13-02-18 remove_space_eval diff
ELO: 9.06 +-5.4 (95%) LOS: 100.0%
Total: 16000 W: 3306 L: 2889 D: 9805
16000 @ 15+0.05 th 1
13-02-18 scale_with_gameplay diff
ELO: -22.82 +-6.8 (95%) LOS: 0.0%
Total: 10000 W: 1407 L: 2063 D: 6530
10000 @ 20+0.05 th 1 Scale down score with game ply
13-02-18 singular_tweak5 diff
ELO: 1.53 +-4.8 (95%) LOS: 84.4%
Total: 19998 W: 3838 L: 3750 D: 12410
20000 @ 15+0.05 th 1
13-02-19 bishop_pin_clop diff
ELO: 12.30 +-4.4 (95%) LOS: 100.0%
Total: 24000 W: 4931 L: 4082 D: 14987
24000 @ 15+0.05 th 1 Remove previous pin code, add bishop pin
13-02-19 scale_with_gameplay diff
ELO: 3.65 +-6.8 (95%) LOS: 96.2%
Total: 10000 W: 1813 L: 1708 D: 6479
26000 @ 20+0.05 th 1
13-02-20 move_ordering diff
ELO: -1.34 +-3.9 (95%) LOS: 13.3%
Total: 31013 W: 5753 L: 5873 D: 19387
32000 @ 15+0.05 th 1
13-02-20 outpost diff
ELO: -2.37 +-5.4 (95%) LOS: 7.7%
Total: 16000 W: 2870 L: 2979 D: 10151
16000 @ 15+0.05 th 1
13-02-20 pinned_null diff
ELO: -9.14 +-5.4 (95%) LOS: 0.0%
Total: 16000 W: 2689 L: 3110 D: 10201
16000 @ 15+0.05 th 1
13-02-20 master diff
ELO: 4.76 +-6.8 (95%) LOS: 99.2%
Total: 10000 W: 1715 L: 1578 D: 6707
10000 @ 60+0.05 th 1 Regression test vs sf_2.3.1 (Take 2)
13-02-20 master diff
ELO: 4.50 +-4.8 (95%) LOS: 99.9%
Total: 20000 W: 3507 L: 3248 D: 13245
20000 @ 60+0.05 th 1 Another regression test at long TC but with "bishop pin" patch applied
13-02-20 scale_with_gameplay diff
ELO: 2.43 +-6.8 (95%) LOS: 89.3%
Total: 10000 W: 1618 L: 1548 D: 6834
10000 @ 60+0.05 th 1 Retest game ply scaling at longer TC
13-02-21 remove_space_eval diff
ELO: 0.54 +-5.4 (95%) LOS: 63.2%
Total: 16000 W: 2753 L: 2728 D: 10519
16000 @ 45+0.05 th 1 Test at longer TC
13-02-21 rook_pin diff
ELO: -0.93 +-5.4 (95%) LOS: 28.8%
Total: 16000 W: 2936 L: 2979 D: 10085
16000 @ 15+0.05 th 1 Re-run due to failure
13-02-22 scale_with_gameplay diff
ELO: -1.91 +-5.4 (95%) LOS: 13.1%
Total: 16000 W: 3045 L: 3133 D: 9822
16000 @ 20+0.05 th 1 Increase game ply scaling to 2% every 10 plies
13-02-22 lucas_evasion_prunable diff
ELO: 1.57 +-3.8 (95%) LOS: 90.4%
Total: 32000 W: 6259 L: 6114 D: 19627
32000 @ 15+0.05 th 1
13-02-22 lucas_see_pv diff
ELO: 7.69 +-5.4 (95%) LOS: 100.0%
Total: 16000 W: 3219 L: 2865 D: 9916
16000 @ 15+0.05 th 1
13-02-22 rook_pin diff
ELO: 0.22 +-5.4 (95%) LOS: 55.1%
Total: 16000 W: 2988 L: 2978 D: 10034
16000 @ 15+0.05 th 1 Exclude pawn pins
13-02-22 pinned_null diff
ELO: 3.11 +-3.8 (95%) LOS: 99.5%
Total: 32000 W: 6294 L: 6008 D: 19698
32000 @ 15+0.05 th 1 Only check for pins on full null moves
13-02-23 lucas_see_pv diff
ELO: 4.31 +-6.8 (95%) LOS: 98.3%
Total: 10000 W: 1765 L: 1641 D: 6594
10000 @ 60+0.05 th 1 Re-test at longer TC
13-02-24 scale_with_gameplay diff
ELO: -6.08 +-6.8 (95%) LOS: 0.2%
Total: 10000 W: 1752 L: 1927 D: 6321
10000 @ 20+0.05 th 1 Another try at 1%, this time scaling just endgame score
13-02-24 alhpa_pruning diff
ELO: 1.63 +-4.5 (95%) LOS: 88.1%
Total: 23000 W: 4232 L: 4124 D: 14644
30001 @ 20+0.05 th 1 From lucas_see_pv: test only the pruning condition on alpha
13-02-24 lucaas_see_pv_1 diff
ELO: 5.18 +-6.8 (95%) LOS: 99.2%
Total: 10000 W: 1952 L: 1803 D: 6245
10000 @ 20+0.05 th 1 From lucas_see_pv: test only the pruning condition on PV nodes
13-02-25 space diff
ELO: -2.17 +-5.4 (95%) LOS: 10.5%
Total: 16000 W: 3128 L: 3228 D: 9644
16000 @ 15+0.05 th 1 Reduce space weight
13-02-25 scale_with_gameplay diff
ELO: 1.46 +-6.8 (95%) LOS: 75.5%
Total: 10000 W: 1865 L: 1823 D: 6312
10000 @ 20+0.05 th 1 Last try, scaling midgame instead of endgame, always 1%
13-02-25 singular_tweak4 diff
ELO: 1.94 +-4.4 (95%) LOS: 92.0%
Total: 24000 W: 4620 L: 4486 D: 14894
24000 @ 15+0.05 th 1 Re-test against current master
13-02-25 rook_evaluation diff
ELO: -6.62 +-5.4 (95%) LOS: 0.0%
Total: 16000 W: 2921 L: 3226 D: 9853
16000 @ 15+0.05 th 1 From Eelco
13-02-25 alhpa_pruning diff
ELO: 3.20 +-4.8 (95%) LOS: 98.5%
Total: 19999 W: 3675 L: 3491 D: 12833
20000 @ 20+0.05 th 1 The rationale behind the condition on alpha is not clear to me, so remove it.
13-02-25 space diff
ELO: -1.95 +-5.4 (95%) LOS: 12.4%
Total: 15999 W: 2979 L: 3069 D: 9951
16000 @ 15+0.05 th 1 Reduce weight even more
13-02-25 easy_move diff
ELO: -155.54 +-23.1 (95%) LOS: 0.0%
Total: 1000 W: 68 L: 488 D: 444
16000 @ 45+0.05 th 1
13-02-26 xray diff
ELO: -26.46 +-21.5 (95%) LOS: 0.0%
Total: 1000 W: 177 L: 253 D: 570
16000 @ 15+0.05 th 1 X-ray through more types of pieces
13-02-26 scale_with_gameplay diff
ELO: -3.13 +-7.2 (95%) LOS: 7.5%
Total: 9000 W: 1541 L: 1622 D: 5837
10000 @ 20+0.05 th 1 Because 1% midgame is not bad, increase to 2%
13-02-26 6c7e7a2 diff
ELO: -0.87 +-4.4 (95%) LOS: 26.6%
Total: 24000 W: 4596 L: 4656 D: 14748
24000 @ 15+0.05 th 1 Re-test successful scale version
13-02-26 trapped_bishop diff
ELO: -2.50 +-5.4 (95%) LOS: 6.9%
Total: 15999 W: 2958 L: 3073 D: 9968
16000 @ 15+0.05 th 1 Generalize trapped bishop logic
13-02-27 master diff
ELO: 2.08 +-5.6 (95%) LOS: 90.0%
Total: 15000 W: 2504 L: 2414 D: 10082
16000 @ 60+0.05 th 1 Regression test after Lucas SEE prune merge
13-02-27 easy_move diff
ELO: 8.12 +-5.8 (95%) LOS: 100.0%
Total: 14000 W: 2475 L: 2148 D: 9377
16000 @ 45+0.05 th 1 Tweak easy move take #2
13-02-27 space diff
ELO: -2.48 +-5.4 (95%) LOS: 7.4%
Total: 16000 W: 3033 L: 3147 D: 9820
16000 @ 15+0.05 th 1 More subtle space tweak
13-02-27 xray diff
ELO: -33.92 +-12.5 (95%) LOS: 0.0%
Total: 3000 W: 491 L: 783 D: 1726
16000 @ 15+0.05 th 1 More x-ray take 2
13-02-27 king_safety_tune diff
ELO: -0.87 +-5.4 (95%) LOS: 30.2%
Total: 16000 W: 2964 L: 3004 D: 10032
16000 @ 15+0.05 th 1 Slightly reduce king safety scores
13-02-28 see_prune_depth diff
ELO: 9.56 +-6.8 (95%) LOS: 100.0%
Total: 10000 W: 1949 L: 1674 D: 6377
10000 @ 20+0.05 th 1 After Lucas's patch on SEE pruning in PV nodes, this seems to be a sensible parameter
13-02-28 pinned_null diff
ELO: 4.24 +-4.8 (95%) LOS: 99.9%
Total: 20000 W: 3380 L: 3136 D: 13484
20000 @ 60+0.05 th 1 Retest at longer TC
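
The ELO and LOS figures in each entry can be reproduced from its W/L/D counts. The following is a minimal Python sketch, assuming the standard logistic Elo model and a normal approximation for LOS (likelihood of superiority). The function names are illustrative and are not taken from the testing framework's source.

```python
import math


def elo_from_wld(wins, losses, draws):
    """Elo difference implied by a match score under the logistic Elo model."""
    games = wins + losses + draws
    # Each draw counts as half a point.
    score = (wins + 0.5 * draws) / games
    # Undefined when score is exactly 0 or 1 (all losses or all wins).
    return 400.0 * math.log10(score / (1.0 - score))


def los_from_wl(wins, losses):
    """Likelihood of superiority in [0, 1]; draws do not affect it."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


if __name__ == "__main__":
    # First entry above: simplify_eval, W: 4075 L: 3943 D: 11982
    print("ELO: %.2f" % elo_from_wld(4075, 3943, 11982))
    print("LOS: %.1f%%" % (100.0 * los_from_wl(4075, 3943)))
```

For the first entry (W: 4075 L: 3943 D: 11982) this gives about 2.29 Elo and 93.0% LOS, in line with the listed values. The +- error margin is not reproduced by this sketch.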