Stockfish Testing Queue

Finished - 4560 tests

15-01-11 sg pruning diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21553 W: 4342 L: 4221 D: 12990
sprt @ 15+0.05 th 1 No regression test: allow move pruning on pv nodes. Futility pruning in step 7 is added (pointed out by Lucas)
15-01-11 sg pruning diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 7675 W: 1351 L: 1209 D: 5115
sprt @ 60+0.05 th 1 LTC: No regression test: allow move pruning on pv nodes. Futility pruning in step 7 is added (pointed out by Lucas)
15-01-12 sg pruning diff
ELO: 1.51 +-2.6 (95%) LOS: 86.9%
Total: 20000 W: 3061 L: 2974 D: 13965
20000 @ 60+0.05 th 3 LTC: SMP-Measure (as proposed by Joona) allow move pruning on pv nodes. Futility pruning in step 7 is added (pointed out by Lucas). smp test proposed by Joerg
15-01-14 sg razor_pv diff
ELO: -1.01 +-2.5 (95%) LOS: 21.2%
Total: 30000 W: 5902 L: 5989 D: 18109
30000 @ 15+0.05 th 1 Now that we allow futility pruning at PV nodes, I want to measure whether other pruning or reduction methods are useful at PV nodes too. Test: allow razoring at PV nodes.
15-01-14 sg probcut_pv diff
ELO: 0.98 +-2.5 (95%) LOS: 78.1%
Total: 30000 W: 6032 L: 5947 D: 18021
30000 @ 15+0.05 th 1 Measure effect of allowing probcut on pv nodes
15-01-15 sg probcut_pv2 diff
ELO: 0.91 +-2.5 (95%) LOS: 76.5%
Total: 30000 W: 6015 L: 5936 D: 18049
30000 @ 15+0.05 th 1 allow probcut at pv nodes (except root node)
15-01-15 sg probcut_pv diff
ELO: -1.51 +-2.3 (95%) LOS: 9.8%
Total: 29311 W: 4775 L: 4902 D: 19634
30000 @ 60+0.05 th 1 LTC: Measure effect of allowing probcut on pv nodes
15-01-16 sg null_pv diff
ELO: -0.87 +-2.3 (95%) LOS: 22.9%
Total: 34620 W: 6825 L: 6912 D: 20883
30000 @ 15+0.05 th 1 Allow null move pruning on PV nodes, but always do a verification search there
15-01-17 sg spsa_backward_rank diff
36473/40000 iterations
71778/80000 games played
80000 @ 15+0.05 th 1 Tune new rank based penalty for backward pawns.
15-01-18 sg backward_rank diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8527 W: 1651 L: 1734 D: 5142
sprt @ 15+0.05 th 1 Test tuned values for new rank based backward pawn penalty.
15-01-19 sg backward_rank2 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 6505 W: 1230 L: 1319 D: 3956
sprt @ 15+0.05 th 1 After the failed SPSA tuning, try a simple linear rank-based penalty for backward pawns
15-01-20 sg prune_pv diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 19173 W: 3774 L: 3828 D: 11571
sprt @ 15+0.05 th 1 After allowing pruning at PV nodes, try to exclude specific moves. Move count pruning: don't allow pruning counter moves at PV nodes (Take 1)
15-01-20 sg prune_pv diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10017 W: 1888 L: 1967 D: 6162
sprt @ 15+0.05 th 1 move count pruning: don't allow pruning killer moves at PV nodes (Take 2)
15-01-20 sg prune_pv diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 14011 W: 2890 L: 2744 D: 8377
sprt @ 15+0.05 th 1 move count pruning: don't allow pruning followup moves at PV nodes (Take 3)
15-01-20 sg prune_pv diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 9905 W: 1638 L: 1693 D: 6574
sprt @ 60+0.05 th 1 LTC: move count pruning: don't allow pruning followup moves at PV nodes (Take 3)
15-01-23 sg scale_endgame diff
ELO: 1.99 +-2.5 (95%) LOS: 94.2%
Total: 30000 W: 6103 L: 5931 D: 17966
30000 @ 15+0.05 th 1 Measure effect of scaling down endgame score. Perhaps this slightly discourages straight exchanges into endgames.
15-01-23 sg spsa_scale_endgame diff
19833/20000 iterations
40000/40000 games played
40000 @ 15+0.05 th 1 The concept seems promising, so first try to optimize the parameters (including mindbreaker's idea)
15-01-24 sg spsa_scale_endgame diff
19353/20000 iterations
39670/40000 games played
40000 @ 15+0.05 th 1 My first tuning attempt breaks eval symmetry, so I now stick to my original approach. Mea culpa.
15-01-24 sg scale_endgame diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 48483 W: 9772 L: 9745 D: 28966
sprt @ 15+0.05 th 1 Tuning indicates my start value is already good, so test it now with SPRT.
15-01-24 sg fix_skill_level diff
ELO: 534.29 +-11.7 (95%) LOS: 100.0%
Total: 20000 W: 19098 L: 863 D: 39
20000 @ 15+0.05 th 1 Disable move pruning at the root node to fix the reported problem when using skill levels (test with skill level 1).
15-01-24 sg fix_skill_level diff
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 117279 W: 23585 L: 23642 D: 70052
sprt @ 15+0.05 th 1 Verify the skill level fix is no regression in standard play
15-01-26 sg scale_endgame diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 17589 W: 3465 L: 3524 D: 10600
sprt @ 15+0.05 th 1 Scale down endgame by 13/16 (Take 2)
15-01-30 sg pawn_attack_threat diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 7925 W: 1666 L: 1537 D: 4722
sprt @ 15+0.05 th 1 Add bonus for possible safe pawn pushes which attack an enemy piece. Inspired by http://talkchess.com/forum/viewtopic.php?t=55142
15-01-30 sg pawn_attack_threat diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 40109 W: 6841 L: 6546 D: 26722
sprt @ 60+0.05 th 1 LTC: Add bonus for possible safe pawn pushes which attack an enemy piece. Inspired by http://talkchess.com/forum/viewtopic.php?t=55142
15-01-31 sg pawn_attack_threat2 diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 14079 W: 2910 L: 2764 D: 8405
sprt @ 15+0.05 th 1 Add bonus for possible safe pawn pushes which attack an enemy piece. Cover more cases by using a doubleAttackedBy array. (Take 2)
15-01-31 sg pawn_attack_threat2 diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 6092 W: 1064 L: 933 D: 4095
sprt @ 60+0.05 th 1 LTC: Add bonus for possible safe pawn pushes which attack an enemy piece. Cover more cases by using a doubleAttackedBy array. (Take 2)
15-01-31 sg pawn_attack_threat2 diff
ELO: -1.07 +-2.0 (95%) LOS: 15.1%
Total: 37089 W: 6061 L: 6175 D: 24853
40000 @ 60+0.05 th 1 Both versions of pawn attack threat passed (the second seems better at LTC judging by the test run length, but this can be misleading). So measure in a direct match which one is better.
15-02-01 sg spsa_pawn_attack_threat diff
48483/50000 iterations
100000/100000 games played
100000 @ 15+0.05 th 1 Tune parameters of my pawn attack threat patch (which passed). Use 100000 games because the parameters are completely untuned.
15-02-02 sg tuned_pawn_attack_threa diff
ELO: 3.64 +-2.2 (95%) LOS: 100.0%
Total: 40000 W: 8213 L: 7794 D: 23993
40000 @ 15+0.05 th 1 Measure elo of tuned vs untuned pawn attack threat
15-02-02 sg spsa_pawn_attack_threat diff
38612/40000 iterations
80000/80000 games played
80000 @ 15+0.05 th 1 The last tuning was promising and at least one parameter does not seem to have converged, so try further tuning on top.
15-02-03 sg tuned2_pawn_attack_thre diff
ELO: -0.97 +-2.2 (95%) LOS: 19.3%
Total: 38311 W: 7575 L: 7682 D: 23054
40000 @ 15+0.05 th 1 Measure elo of second tuned vs first tuned pawn attack threat
15-02-03 sg tuned_pawn_attack_threa diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 37488 W: 6213 L: 6247 D: 25028
sprt @ 60+0.05 th 1 My first tuning seems to give the best parameters, so test them now at LTC against current master.
15-02-04 sg pawn_attack_threat3 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 14594 W: 2876 L: 2943 D: 8775
sprt @ 15+0.05 th 1 Recognize only attacks on minor pieces (that was Ludmil's original idea; I extended it to all pieces in my successful patch).
15-02-04 sg pawn_attack_threat3 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 15280 W: 3015 L: 3080 D: 9185
sprt @ 15+0.05 th 1 exclude queens as target
15-02-06 sg pawn_attack_threat3 diff
ELO: 1.34 +-3.0 (95%) LOS: 80.7%
Total: 20000 W: 3989 L: 3912 D: 12099
20000 @ 15+0.05 th 1 Allow queen as defender. sn's test, which allows all pieces as defenders, passed STC but struggles at LTC. So let's measure the effect for each piece type separately.
15-02-06 sg pawn_attack_threat3 diff
ELO: 0.28 +-3.0 (95%) LOS: 57.1%
Total: 20000 W: 4006 L: 3990 D: 12004
20000 @ 15+0.05 th 1 Allow rook as defender
15-02-06 sg pawn_attack_threat3 diff
ELO: 0.12 +-3.0 (95%) LOS: 53.1%
Total: 20000 W: 3933 L: 3926 D: 12141
20000 @ 15+0.05 th 1 Allow bishop as defender
15-02-06 sg pawn_attack_threat3 diff
ELO: 2.69 +-3.0 (95%) LOS: 95.9%
Total: 20000 W: 4072 L: 3917 D: 12011
20000 @ 15+0.05 th 1 Allow knight as defender
15-02-06 sg pawn_attack_threat3 diff
ELO: 1.73 +-3.1 (95%) LOS: 86.2%
Total: 19246 W: 3921 L: 3825 D: 11500
20000 @ 15+0.05 th 1 Allow king as defender
15-02-07 sg pawn_attack_threat3 diff
ELO: -0.81 +-2.6 (95%) LOS: 27.2%
Total: 27059 W: 5374 L: 5437 D: 16248
30000 @ 15+0.05 th 1 Allow knight, king and queen as defender. Combine the pieces which show some elo gain and measure if this adds up.
15-02-07 sg pawn_attack_threat3 diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 28215 W: 5739 L: 5767 D: 16709
sprt @ 15+0.05 th 1 Allow knight as defender. Retest with SPRT to check for luck in first run
15-02-07 sg pawn_attack_threat_see diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 23696 W: 4778 L: 4819 D: 14099
sprt @ 15+0.05 th 1 My current implementation detects safe pawn pushes as cheaply as possible, so many cases are not covered. Now try a SEE-based safety calculation for the remaining pushes
15-02-08 sg backward_pawn diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 11104 W: 2202 L: 2278 D: 6624
sprt @ 15+0.05 th 1 double up penalty if backward pawn is stopped by a pawn double attack
15-02-08 sg outposts_double diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 12099 W: 2426 L: 2500 D: 7173
sprt @ 15+0.05 th 1 Add 50% more bonus if outpost is defended by two pawns.
15-02-08 sg backward_pawn diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 23324 W: 4585 L: 4628 D: 14111
sprt @ 15+0.05 th 1 Add 50% penalty if backward pawn is stopped by a pawn double attack (Take 2)
15-02-08 sg isolated_pawn diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 13095 W: 2575 L: 2646 D: 7874
sprt @ 15+0.05 th 1 Fix error: add 50% penalty for isolated pawn which is stopped by a pawn double attack
15-02-11 sg spsa_pawn_attack_threat diff
46633/50000 iterations
99018/100000 games played
100000 @ 15+0.05 th 1 Use a different pawn attack threat bonus per piece type. Now tune these parameters, starting at value (20,20) from the current version.
15-02-12 sg pawn_attack_threat4 diff
ELO: 1.21 +-2.8 (95%) LOS: 79.8%
Total: 23278 W: 4746 L: 4665 D: 13867
30000 @ 15+0.05 th 1 Quick measurement of the first tuned parameters. The successful any safe pawn push patch is now merged in.
15-02-12 sg spsa_pawn_attack_threat diff
46974/50000 iterations
97604/100000 games played
100000 @ 15+0.05 th 1 The first tuning was done without the any_safe_pawn2 patch. The measurement (now including the any_safe_pawn2 patch) gives no significant gain, and these two ideas seem to interact strongly, as expected. Tuning is now done on top of this passed patch. Only my new parameters are tuned, not the 2 from the other patch, because we add code, so it has to prove itself first.
15-02-13 sg pawn_attack_threat4 diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 9307 W: 1916 L: 1784 D: 5607
sprt @ 15+0.05 th 1 Now hopefully the correct test of the tuned parameters. It's just not my day.
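
The ELO, LOS and LLR bound figures in the entries above can be roughly reproduced from the W/L/D totals. The following is a minimal sketch, assuming the usual logistic Elo formula, the normal-approximation LOS formula, and Wald's SPRT stopping bounds with alpha = beta = 0.05 (an assumption inferred from the (-2.94,2.94) interval shown in the listing). The function names are hypothetical and this is not taken from the testing framework's own code.

```python
import math


def elo_from_wld(wins, losses, draws):
    """Logistic Elo difference implied by the match score.

    Undefined for a score of exactly 0 or 1 (division by zero / log of zero).
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)


def los(wins, losses):
    """Likelihood of superiority; draws do not enter this approximation."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's SPRT stopping bounds for the log-likelihood ratio (LLR).

    alpha and beta are assumed error rates, not values stated in the listing.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


if __name__ == "__main__":
    # Totals from the 15-01-23 scale_endgame fixed-games entry.
    print(round(elo_from_wld(6103, 5931, 17966), 2))
    print(round(100.0 * los(6103, 5931), 1))
    print(tuple(round(b, 2) for b in sprt_bounds()))
```

For the 15-01-23 scale_endgame totals (W: 6103 L: 5931 D: 17966) this gives values close to the listed ELO of 1.99 and LOS of 94.2%. An SPRT run stops as passed once the LLR reaches the upper bound and as failed once it reaches the lower bound; computing the LLR itself depends on the Elo model used for the [elo0,elo1] hypotheses and is not attempted here.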