Stockfish Testing Queue

Finished - 4333 tests

14-12-11 sg stormdanger diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 40589 W: 8370 L: 8363 D: 23856
sprt @ 15+0.05 th 1 25% more Stormdanger bonus for blocked f6/c6 pawn (after Gary's compile fix)
14-12-11 sg stormdanger diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 5899 W: 1168 L: 1259 D: 3472
sprt @ 15+0.05 th 1 25% less Stormdanger bonus for blocked pawn on b or g file (after Gary's compile fix)
14-12-11 sg stormdanger diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 45064 W: 9115 L: 9097 D: 26852
sprt @ 15+0.05 th 1 Double Stormdanger bonus for blocked pawn on f5/c5
14-12-11 sg history_bonus diff
ELO: -3.23 +-3.1 (95%) LOS: 1.9%
Total: 20000 W: 3937 L: 4123 D: 11940
20000 @ 15+0.05 th 1 Use half history bonus, so updates can occur up to depth 31 (instead of depth 22). I expect no significant difference at STC, but do a quick measurement as a baseline for later attempts (Take 1)
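The ELO, error margin and LOS figures on fixed-games entries such as the one above can be recomputed from the W/L/D totals. The following is a minimal sketch, assuming the usual logistic Elo formula, a normal-approximation 95% interval and the common erf-based LOS formula; the function names are illustrative and not taken from the testing framework.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an average score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def summarize(wins, losses, draws):
    """Return (elo, 95% error margin, LOS) for a fixed-games result."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the score around its mean.
    variance = (wins * (1.0 - score) ** 2
                + losses * (0.0 - score) ** 2
                + draws * (0.5 - score) ** 2) / games
    half_width = 1.96 * math.sqrt(variance / games)
    elo = elo_from_score(score)
    error = (elo_from_score(score + half_width)
             - elo_from_score(score - half_width)) / 2.0
    # Likelihood of superiority: normal approximation, draws carry no information.
    los = 0.5 * (1.0 + math.erf((wins - losses)
                                / math.sqrt(2.0 * (wins + losses))))
    return elo, error, los

# Totals from the history_bonus entry above.
elo, error, los = summarize(wins=3937, losses=4123, draws=11940)
print(f"ELO: {elo:.2f} +-{error:.1f} (95%) LOS: {los * 100:.1f}%")
```

With these totals the sketch should land on roughly the same values as the logged line (ELO about -3.23, margin about 3.1, LOS about 1.9%).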
14-12-12 sg stormdanger diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10472 W: 2075 L: 2153 D: 6244
sprt @ 15+0.05 th 1 25% more Stormdanger bonus for blocked pawn on g6/b6. Try the opposite because this test failed badly: http://tests.stockfishchess.org/tests/view/5489fa5b0ebc591511eb6ef2
14-12-13 sg history diff
LLR: -4.39 (-2.94,2.94) [-1.50,4.50]
Total: 19474 W: 3886 L: 3991 D: 11597
sprt @ 15+0.05 th 1 Add half bonus if Max is exceeded (so higher depths influence history)
14-12-17 sg big_king_safety diff
ELO: 3.46 +-2.2 (95%) LOS: 99.9%
Total: 40000 W: 8275 L: 7877 D: 23848
40000 @ 15+0.05 th 1 Measure big king safety tuning with corrected values (see forum). Prio -2
14-12-17 sg big_king_safety diff
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 10311 W: 1876 L: 1721 D: 6714
sprt @ 60+0.05 th 1 LTC: Measure big king safety tuning with corrected values
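The LLR line of an sprt entry can be approximated from its W/L/D totals and the two Elo hypotheses in square brackets. This is a sketch under two assumptions that the listing itself does not state: that the bounds (-2.94,2.94) come from alpha = beta = 0.05, and that the hypotheses are expressed in the BayesElo model with the draw parameter estimated from the sample. All function names are illustrative.

```python
import math

def bayeselo_to_probs(elo, draw_elo):
    """Win/loss/draw probabilities under the BayesElo model."""
    win = 1.0 / (1.0 + 10.0 ** ((-elo + draw_elo) / 400.0))
    loss = 1.0 / (1.0 + 10.0 ** ((elo + draw_elo) / 400.0))
    return win, loss, 1.0 - win - loss

def estimate_draw_elo(wins, losses, draws):
    """Draw parameter that matches the observed win and loss rates."""
    games = wins + losses + draws
    p_win, p_loss = wins / games, losses / games
    return 200.0 * math.log10((1.0 - p_loss) / p_loss
                              * (1.0 - p_win) / p_win)

def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    draw_elo = estimate_draw_elo(wins, losses, draws)
    w0, l0, d0 = bayeselo_to_probs(elo0, draw_elo)
    w1, l1, d1 = bayeselo_to_probs(elo1, draw_elo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))

def sprt_bounds(alpha=0.05, beta=0.05):
    """Stop and reject below the lower bound, accept above the upper."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)

# Totals and hypotheses from the LTC big_king_safety entry above.
lower, upper = sprt_bounds()
llr = sprt_llr(wins=1876, losses=1721, draws=6714, elo0=0.0, elo1=6.0)
print(f"LLR: {llr:.2f} ({lower:.2f},{upper:.2f})")
```

Under these assumptions the result should come out close to the logged LLR of 2.96, just past the upper bound, which is why that test stopped as a pass.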
14-12-19 sg king_block_pawn diff
ELO: -2.18 +-2.4 (95%) LOS: 3.7%
Total: 32659 W: 6502 L: 6707 D: 19450
40000 @ 15+0.05 th 1 big_king_safety: it's unlikely, but perhaps the change from 200 to 300 for king blocks pawn is responsible for the Elo gain, so play it safe and test this version against the current master (see pull request comment)
14-12-14 sg spsa_big_king_safety diff
47490/50000 iterations
90852/100000 games played
100000 @ 60+0.05 th 1 Big king safety tuning. Stormdanger and Shelterweakness indexed by file pairs (a/h, b/g, c/f, d/e). Special case where king blocks pawn is incorporated in Stormdanger. LTC because it is TC-dependent. There are 93 parameters, so I use the following SPSA configuration: Games=100000 Gamma=0.159 Alpha=0.558 C=5 (except in maxSafety, where C=10 is used). Prio -1
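To illustrate what the Gamma, Alpha and C settings control, here is a generic SPSA loop on a toy quadratic objective. It is only a sketch: the step-size constants `a` and `A`, the objective and the target values are assumptions made for this example, and the real tuning compares perturbed parameter sets by playing games rather than by evaluating a loss function.

```python
import random

def spsa_minimize(loss, theta, iterations, a, A, alpha, c, gamma, seed=0):
    """Generic SPSA: two loss evaluations per iteration, any dimension."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, iterations + 1):
        a_k = a / (A + k) ** alpha   # step size, decays with Alpha
        c_k = c / k ** gamma         # perturbation size, decays with Gamma
        # Perturb every parameter at once by +c_k or -c_k.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Gradient estimate per parameter: diff / (2 * c_k * delta_i).
        theta = [t - a_k * diff / (2.0 * c_k * d)
                 for t, d in zip(theta, delta)]
    return theta

# Hypothetical optimum for three made-up parameters.
TARGET = [100.0, 200.0, 400.0]

def toy_loss(params):
    return sum((p - t) ** 2 for p, t in zip(params, TARGET))

start = [0.0, 0.0, 0.0]
tuned = spsa_minimize(toy_loss, start, iterations=2000,
                      a=0.3, A=200, alpha=0.558, c=5.0, gamma=0.159)
print([round(value, 1) for value in tuned])
```

The appeal for a 93-parameter tuning run is that each iteration needs only two evaluations regardless of the number of parameters.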
14-12-26 sg passed_pawns diff
LLR: -2.95 (-2.94,2.94) [-0.50,3.50]
Total: 9498 W: 1829 L: 1963 D: 5706
sprt @ 15+0.05 th 1 Because Stockfish underestimates passed pawns in the middlegame, raise the base bonus factor slightly
14-12-28 sg spsa_big_king_safety diff
53749/50000 iterations
85254/100000 games played
100000 @ 60+0.05 th 1 Retune king safety further because some parameters did not converge in the first run and the hand-set -300 values were not tuned at all. Additionally, attack units, king danger array and king safety weight are included too, so all king-safety-related terms are tuned. Low prio. Remark: the other tuning attempt http://tests.stockfishchess.org/tests/view/54a0620d0ebc597884f6935e from n_persson is wrong because of the reordering of stormdanger indices by Marco (see my comments on his repo).
14-12-29 sg pruning diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 40438 W: 8071 L: 8066 D: 24301
sprt @ 15+0.05 th 1 allow move count pruning only if distance to next Pv node is greater than one
14-12-29 sg pruning diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4693 W: 959 L: 1054 D: 2680
sprt @ 15+0.05 th 1 allow move count pruning only if distance to next Pv node is greater than two (Take 2)
14-12-29 sg pruning diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 7185 W: 1395 L: 1482 D: 4308
sprt @ 15+0.05 th 1 allow move count pruning only if distance to next Pv node is greater than one or the move is not a killer, counter or follow-up move (Take 3)
14-12-31 sg pruning diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 2671 W: 494 L: 594 D: 1583
sprt @ 15+0.05 th 1 allow razoring only if distance to next Pv node is greater than one
15-01-01 sg big_king_safety diff
ELO: 9.22 +-3.1 (95%) LOS: 100.0%
Total: 19514 W: 4340 L: 3822 D: 11352
20000 @ 15+0.05 th 1 Quick measure of tuned values. For the attack unit terms I use finer granularity (factor 4) because of the very low values.
15-01-02 sg big_king_safety diff
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 6399 W: 1192 L: 1056 D: 4151
sprt @ 60+0.05 th 1 LTC: Because STC looks very good so far (and I want to go to bed), I set up the LTC now. For the attack unit terms I use finer granularity (factor 4) because of the very low values.
15-01-07 sg spsa_pawns diff
51922/50000 iterations
99911/100000 games played
100000 @ 15+0.05 th 1 Tune pawn structure (except passed pawns and king shelter)
15-01-07 sg pawns diff
LLR: 2.95 (-2.94,2.94) [-0.50,3.50]
Total: 164666 W: 33319 L: 32703 D: 98644
sprt @ 15+0.05 th 1 Test pawn structure tuned values
15-01-09 sg pawns diff
LLR: -3.21 (-2.94,2.94) [0.00,4.00]
Total: 29850 W: 4948 L: 5020 D: 19882
sprt @ 60+0.05 th 1 LTC: Test pawn structure tuned values
15-01-10 sg pruning diff
ELO: 1.17 +-2.5 (95%) LOS: 82.2%
Total: 30000 W: 6029 L: 5928 D: 18043
30000 @ 15+0.05 th 1 Measure the effect of allowing move pruning at PV nodes. Inspired by the following talkchess discussion: http://www.talkchess.com/forum/viewtopic.php?t=54761&start=80
15-01-10 sg pruning diff
ELO: 1.71 +-2.6 (95%) LOS: 90.2%
Total: 23404 W: 4008 L: 3893 D: 15503
30000 @ 60+0.05 th 1 LTC: Measure the effect of allowing move pruning at PV nodes. Little gain at STC. Because I'm interested in how this scales, I prefer a fixed-games test to a no-regression SPRT.
15-01-11 sg pruning diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21553 W: 4342 L: 4221 D: 12990
sprt @ 15+0.05 th 1 No regression test: allow move pruning on pv nodes. Futility pruning in step 7 is added (pointed out by Lucas)
15-01-11 sg pruning diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 7675 W: 1351 L: 1209 D: 5115
sprt @ 60+0.05 th 1 LTC: No regression test: allow move pruning on pv nodes. Futility pruning in step 7 is added (pointed out by Lucas)
15-01-12 sg pruning diff
ELO: 1.51 +-2.6 (95%) LOS: 86.9%
Total: 20000 W: 3061 L: 2974 D: 13965
20000 @ 60+0.05 th 3 LTC: SMP-Measure (as proposed by Joona) allow move pruning on pv nodes. Futility pruning in step 7 is added (pointed out by Lucas). smp test proposed by Joerg
15-01-14 sg razor_pv diff
ELO: -1.01 +-2.5 (95%) LOS: 21.2%
Total: 30000 W: 5902 L: 5989 D: 18109
30000 @ 15+0.05 th 1 Now that we allow futility pruning at PV nodes, I want to measure whether other pruning or reduction methods are useful at PV nodes too. Test: allow razoring at PV nodes.
15-01-14 sg probcut_pv diff
ELO: 0.98 +-2.5 (95%) LOS: 78.1%
Total: 30000 W: 6032 L: 5947 D: 18021
30000 @ 15+0.05 th 1 Measure effect of allowing probcut on pv nodes
15-01-15 sg probcut_pv2 diff
ELO: 0.91 +-2.5 (95%) LOS: 76.5%
Total: 30000 W: 6015 L: 5936 D: 18049
30000 @ 15+0.05 th 1 allow probcut at pv nodes (except root node)
15-01-15 sg probcut_pv diff
ELO: -1.51 +-2.3 (95%) LOS: 9.8%
Total: 29311 W: 4775 L: 4902 D: 19634
30000 @ 60+0.05 th 1 LTC: Measure effect of allowing probcut on pv nodes
15-01-16 sg null_pv diff
ELO: -0.87 +-2.3 (95%) LOS: 22.9%
Total: 34620 W: 6825 L: 6912 D: 20883
30000 @ 15+0.05 th 1 Allow null move pruning on PV nodes, but always do a verification search there
15-01-17 sg spsa_backward_rank diff
36473/40000 iterations
71778/80000 games played
80000 @ 15+0.05 th 1 Tune new rank based penalty for backward pawns.
15-01-18 sg backward_rank diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8527 W: 1651 L: 1734 D: 5142
sprt @ 15+0.05 th 1 Test tuned values for new rank based backward pawn penalty.
15-01-19 sg backward_rank2 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 6505 W: 1230 L: 1319 D: 3956
sprt @ 15+0.05 th 1 After the failed SPSA tuning, try a simple linear rank-based penalty for backward pawns
15-01-20 sg prune_pv diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 19173 W: 3774 L: 3828 D: 11571
sprt @ 15+0.05 th 1 After allowing pruning at PV nodes, try to exclude specific moves. Move count pruning: don't allow pruning counter moves at PV nodes (Take 1)
15-01-20 sg prune_pv diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10017 W: 1888 L: 1967 D: 6162
sprt @ 15+0.05 th 1 move count pruning: don't allow pruning killer moves at PV nodes (Take 2)
15-01-20 sg prune_pv diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 14011 W: 2890 L: 2744 D: 8377
sprt @ 15+0.05 th 1 move count pruning: don't allow pruning followup moves at PV nodes (Take 3)
15-01-20 sg prune_pv diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 9905 W: 1638 L: 1693 D: 6574
sprt @ 60+0.05 th 1 LTC: move count pruning: don't allow pruning followup moves at PV nodes (Take 3)
15-01-23 sg scale_endgame diff
ELO: 1.99 +-2.5 (95%) LOS: 94.2%
Total: 30000 W: 6103 L: 5931 D: 17966
30000 @ 15+0.05 th 1 Measure effect of scaling down endgame score. Perhaps this discourages straightforward exchanges into endgames a little.
15-01-23 sg spsa_scale_endgame diff
19833/20000 iterations
40000/40000 games played
40000 @ 15+0.05 th 1 The concept seems promising, so first try to optimize the parameters (including mindbreaker's idea)
15-01-24 sg spsa_scale_endgame diff
19353/20000 iterations
39670/40000 games played
40000 @ 15+0.05 th 1 My first tuning attempt breaks eval symmetry, so I now stick to my original approach. Mea culpa.
15-01-24 sg scale_endgame diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 48483 W: 9772 L: 9745 D: 28966
sprt @ 15+0.05 th 1 Tuning indicates my start value is already good, so test it now with SPRT.
15-01-24 sg fix_skill_level diff
ELO: 534.29 +-11.7 (95%) LOS: 100.0%
Total: 20000 W: 19098 L: 863 D: 39
20000 @ 15+0.05 th 1 Disable move pruning at the root node to fix the reported problem when using skill levels (test with skill level 1).
15-01-24 sg fix_skill_level diff
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 117279 W: 23585 L: 23642 D: 70052
sprt @ 15+0.05 th 1 Verify the skill level fix is no regression in standard play
15-01-26 sg scale_endgame diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 17589 W: 3465 L: 3524 D: 10600
sprt @ 15+0.05 th 1 Scale down endgame by 13/16 (Take 2)
15-01-30 sg pawn_attack_threat diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 7925 W: 1666 L: 1537 D: 4722
sprt @ 15+0.05 th 1 Add bonus for possible safe pawn pushes which attack an enemy piece. Inspired by http://talkchess.com/forum/viewtopic.php?t=55142
15-01-30 sg pawn_attack_threat diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 40109 W: 6841 L: 6546 D: 26722
sprt @ 60+0.05 th 1 LTC: Add bonus for possible safe pawn pushes which attack an enemy piece. Inspired by http://talkchess.com/forum/viewtopic.php?t=55142
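The idea in the two entries above, a bonus for safe pawn pushes that would attack an enemy piece, can be sketched with bitboards. This is not the tested patch: it handles White only, treats a push as safe simply when the target square is empty and not attacked by an enemy pawn, and uses made-up helper names. Squares are numbered a1 = bit 0 to h8 = bit 63.

```python
FULL = 0xFFFFFFFFFFFFFFFF
FILE_A = 0x0101010101010101
FILE_H = 0x8080808080808080
RANK_3 = 0x0000000000FF0000

def bb(*names):
    """Bitboard with the named squares set, e.g. bb('e2', 'd4')."""
    board = 0
    for name in names:
        board |= 1 << ((ord(name[0]) - ord("a")) + 8 * (int(name[1]) - 1))
    return board

def black_pawn_attacks(pawns):
    """Squares attacked by black pawns (they capture toward rank 1)."""
    return ((pawns & ~FILE_A) >> 9) | ((pawns & ~FILE_H) >> 7)

def white_pawn_attacks(squares):
    """Squares a white pawn would attack from the given squares."""
    return (((squares & ~FILE_A) << 7) | ((squares & ~FILE_H) << 9)) & FULL

def pawn_push_threats(white_pawns, black_pawns, black_pieces, occupied):
    """Enemy pieces that a safe single or double pawn push would attack."""
    empty = ~occupied & FULL
    single = (white_pawns << 8) & empty
    # A double push needs an empty intermediate square on rank 3.
    double = ((single & RANK_3) << 8) & empty
    safe = (single | double) & ~black_pawn_attacks(black_pawns)
    return white_pawn_attacks(safe) & black_pieces

# White pawn e2, black knight d4: pushing to e3 would attack the knight.
threats = pawn_push_threats(bb("e2"), 0, bb("d4"), bb("e2", "d4"))
print(bin(threats).count("1"))
```

An evaluation term would multiply the number of set bits by a bonus value; the size of that bonus is exactly what tests like these measure.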
15-01-31 sg pawn_attack_threat2 diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 14079 W: 2910 L: 2764 D: 8405
sprt @ 15+0.05 th 1 Add bonus for possible safe pawn pushes which attack an enemy piece. Cover more cases by using a doubleAttackedBy array. (Take 2)
15-01-31 sg pawn_attack_threat2 diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 6092 W: 1064 L: 933 D: 4095
sprt @ 60+0.05 th 1 LTC: Add bonus for possible safe pawn pushes which attack an enemy piece. Cover more cases by using a doubleAttackedBy array. (Take 2)
15-01-31 sg pawn_attack_threat2 diff
ELO: -1.07 +-2.0 (95%) LOS: 15.1%
Total: 37089 W: 6061 L: 6175 D: 24853
40000 @ 60+0.05 th 1 Both versions of pawn attack threat passed (the second seems better at LTC judging by test run length, but this can be misleading). So measure in a direct match which one is better.