Stockfish Testing Queue

Finished - 22407 tests

26-12-14 sg passed_pawns diff
LLR: -2.95 (-2.94,2.94) [-0.50,3.50]
Total: 9498 W: 1829 L: 1963 D: 5706
sprt @ 15+0.05 th 1 Because Stockfish underestimates passed pawns in the middlegame, raise the base bonus factor slightly.
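The LLR and its stopping bounds in an SPRT entry such as the one above can be reproduced approximately from the W/L/D counts. The listing does not state the statistical model, so the following is only a sketch under assumptions: error rates alpha = beta = 0.05 (which would give the ±2.94 bounds), and a BayesElo-style win/loss/draw model whose draw parameter is estimated from the sample. The function names are illustrative.

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald stopping bounds for the log-likelihood ratio (assumed error rates)."""
    return math.log(beta / (1 - alpha)), math.log((1 - beta) / alpha)

def bayeselo_to_probs(elo, draw_elo):
    """Win/loss/draw probabilities under a BayesElo-style model (assumption)."""
    win = 1.0 / (1.0 + 10 ** ((-elo + draw_elo) / 400.0))
    loss = 1.0 / (1.0 + 10 ** ((elo + draw_elo) / 400.0))
    return win, loss, 1.0 - win - loss

def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0) for the given counts."""
    n = wins + losses + draws
    w, l = wins / n, losses / n
    # Estimate the draw parameter from the observed win and loss rates.
    draw_elo = 200 * math.log10((1 - l) / l * (1 - w) / w)
    w0, l0, d0 = bayeselo_to_probs(elo0, draw_elo)
    w1, l1, d1 = bayeselo_to_probs(elo1, draw_elo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))

lower, upper = sprt_bounds()
llr = sprt_llr(1829, 1963, 5706, elo0=-0.5, elo1=3.5)
print(f"bounds ({lower:.2f}, {upper:.2f}), LLR {llr:.2f}")
```

With the counts from the passed_pawns entry, the computed LLR comes out close to the -2.95 shown, just below the lower bound, which is consistent with that test having stopped as a failure.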
26-12-14 st timeman_0 diff
ELO: -0.33 +-3.1 (95%) LOS: 41.7%
Total: 20000 W: 4063 L: 4082 D: 11855
20000 @ 15+0.05 th 1 Let's see how Elo-sensitive that part of the code is. I expect a loss of around 5 to 10 Elo, but who knows.
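The ELO, 95% margin, and LOS figures of a fixed-games entry such as the one above can be approximated from its W/L/D counts. The listing does not say how they were computed; this sketch assumes the usual logistic Elo model, a normal approximation for the score, and an LOS that depends only on wins and losses. The function names are illustrative.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def summarize(wins, losses, draws):
    """Return (elo, 95% margin, LOS) for a win/loss/draw record."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    half_width = 1.959964 * math.sqrt(var / n)  # 95% normal quantile
    elo = elo_from_score(score)
    margin = (elo_from_score(score + half_width)
              - elo_from_score(score - half_width)) / 2
    # Likelihood of superiority: draws carry no information about direction.
    los = 0.5 * (1 + math.erf((wins - losses) / math.sqrt(2 * (wins + losses))))
    return elo, margin, los

elo, margin, los = summarize(4063, 4082, 11855)
print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {los:.1%}")
```

For the timeman_0 counts this gives values in line with the -0.33, +-3.1 and 41.7% shown in the entry.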
25-12-14 Ro MobilityNoDef diff
ELO: 0.83 +-3.1 (95%) LOS: 70.0%
Total: 20000 W: 4191 L: 4143 D: 11666
20000 @ 15+0.05 th 1 OK, let's keep the defenses... But actually, pawn defenses are excluded from the mobility calculation. See what happens if we add mobile pawns on rank 4 and above to our mobility area.
26-12-14 jo passed_filebonus diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 12118 W: 2447 L: 2521 D: 7150
sprt @ 15+0.05 th 1 Take 2 with SPSA values after 24k iterations.
26-12-14 np easy-move diff
ELO: -18.47 +-8.2 (95%) LOS: 0.0%
Total: 2862 W: 518 L: 670 D: 1674
20000 @ 15+0.05 th 1 measuring Elo difference
25-12-14 jo passed_filebonus diff
24212/25000 iterations
47890/50000 games played
50000 @ 30+0.05 th 1 SPSA tuning at intermediate tc, 25k iterations. Slightly reduced rank-based base values, to allow more weight to file bonus.
25-12-14 Ro MobilityNoDef diff
ELO: -6.05 +-3.1 (95%) LOS: 0.0%
Total: 20000 W: 3855 L: 4203 D: 11942
20000 @ 15+0.05 th 1 Same as previous test, but smaller reduction (1/3)
25-12-14 SC king_distance_KPPKPP diff
LLR: 0.11 (-2.94,2.94) [-1.50,4.50]
Total: 977 W: 208 L: 201 D: 568
sprt @ 15+0.05 th 1 King distance KPP KPP. Some minor ambiguities in code removed (double init of a variable). Signature error was seemingly caused by wrong assert spotted by Joerg Oster.
24-12-14 Ro MobilityNoDef diff
ELO: -18.41 +-2.6 (95%) LOS: 0.0%
Total: 30000 W: 5618 L: 7206 D: 17176
30000 @ 15+0.05 th 1 Measure what happens when excluding defenses when evaluating mobility
25-12-14 jo passed_filebonus diff
LLR: -3.42 (-2.94,2.94) [-1.50,4.50]
Total: 34459 W: 6972 L: 7000 D: 20487
sprt @ 15+0.05 th 1 Add bonus/penalty to passed pawns base values, depending on game phase and file.
25-12-14 Ro MobilityNoDef diff
ELO: -1.20 +-3.0 (95%) LOS: 22.1%
Total: 20000 W: 3984 L: 4053 D: 11963
20000 @ 15+0.05 th 1 See what happens if we change only the Queen mobility calculation (do not consider squares where we have some pieces)
24-12-14 jo opposition1 diff
LLR: -3.25 (-2.94,2.94) [-1.50,4.50]
Total: 6466 W: 1254 L: 1354 D: 3858
sprt @ 15+0.05 th 1 Try a different approach.
24-12-14 SC king_distance_KPPKPP diff
LLR: -0.49 (-2.94,2.94) [-1.50,4.50]
Total: 117 W: 13 L: 30 D: 74
sprt @ 15+0.05 th 1 King distance in KPP KPP, further try to resolve signature error problems.
23-12-14 lb reduction diff
ELO: -9.66 +-2.5 (95%) LOS: 0.0%
Total: 30000 W: 5479 L: 6313 D: 18208
30000 @ 15+0.05 th 1 measure before tuning
23-12-14 jo opposition diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 23727 W: 4757 L: 4798 D: 14172
sprt @ 15+0.05 th 1 Take 3.
23-12-14 jo opposition diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 25224 W: 5055 L: 5092 D: 15077
sprt @ 15+0.05 th 1 Apply bonus even later. Take 2.
23-12-14 jo opposition^ diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 21636 W: 4335 L: 4382 D: 12919
sprt @ 15+0.05 th 1 Small bonus for having the opposition in endgames. Take 1.
23-12-14 gl imbalance_tuning diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 9932 W: 1953 L: 2033 D: 5946
sprt @ 15+0.05 th 1 Test SPSA imbalance values
21-12-14 gl imbalance_tuning diff
48807/50000 iterations
91792/100000 games played
100000 @ 60+0.05 th 1 SPSA tuning of material imbalance at 60s
23-12-14 SC king_distance_KPPKPP diff
LLR: -0.06 (-2.94,2.94) [-1.50,4.50]
Total: 26 W: 5 L: 7 D: 14
sprt @ 15+0.05 th 1 KPP KPP with king distance. Fix for uninitialized variables to avoid signature errors. No real tuning, only some bench positions tested.
23-12-14 lb test diff
ELO: -18.75 +-2.6 (95%) LOS: 0.0%
Total: 26741 W: 4662 L: 6104 D: 15975
30000 @ 15+0.05 th 1 last try on this formula.
21-12-14 lb reduction diff
ELO: -20.08 +-2.5 (95%) LOS: 0.0%
Total: 29238 W: 4968 L: 6656 D: 17614
30000 @ 15+0.05 th 1 new reduction formula. take 2.
22-12-14 SC king_distance_KPPKPP diff
LLR: -1.11 (-2.94,2.94) [-1.50,4.50]
Total: 3259 W: 615 L: 646 D: 1998
sprt @ 15+0.05 th 1 Now also taking into account minimal king distance in evaluating the endgame. Parameter tuning taken over from previous SPSA run, let us see how good it is. I rechecked my test signature and I've got exactly the same bench as before.
22-12-14 SC king_distance_KPPKPP diff
LLR: -0.10 (-2.94,2.94) [-1.50,4.50]
Total: 183 W: 36 L: 39 D: 108
sprt @ 15+0.05 th 1 Now also taking into account minimal king distance in evaluating the endgame. Parameter tuning taken over from previous SPSA run, let us see how good it is. I rechecked my test signature and I've got exactly the same bench as before.
22-12-14 in movehorizon diff
LLR: -2.98 (-2.94,2.94) [-1.50,4.50]
Total: 36211 W: 7227 L: 7235 D: 21749
sprt @ 15+0.05 th 1 Another crude try. Plan time management 60 moves ahead.
22-12-14 in movehorizon diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 45320 W: 9237 L: 9218 D: 26865
sprt @ 15+0.05 th 1 Increase emergencyMoveHorizon. This is based on the crude observation that engine games tend to last slightly longer than the 40 moves estimate which is derived from statistical analysis over a human database.
21-12-14 lb reduction diff
ELO: -23.43 +-2.7 (95%) LOS: 0.0%
Total: 25726 W: 4338 L: 6070 D: 15318
30000 @ 15+0.05 th 1 new reduction formula. quick local tuning. see how far we get.
21-12-14 jo endgame_scaling diff
LLR: -3.22 (-2.94,2.94) [-1.50,4.50]
Total: 2710 W: 491 L: 600 D: 1619
sprt @ 15+0.05 th 1 Endgame scaling, take 2. Scale down more, if the defending king is close to the pawns.
21-12-14 SC streamline_KPPKPP diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 8923 W: 1762 L: 1844 D: 5317
sprt @ 15+0.05 th 1 Retry SPRT after moving from a formula to a table and after a long SPSA session. I also manually verified that for the (not really critical, but exemplary) position fen 8/2k1n3/1p1p4/6K1/2B1PP2/8/8/8 w - - the master also chooses f4f5, only at much higher depth than the patch.
19-12-14 SC streamline_KPPKPP diff
112202/50000 iterations
127275/300000 games played
300000 @ 5+0.25 th 1 Rescheduling of SPSA tuning of KPP vs KPP, this time with correct base branch. - 100000 games at priority -2 - tc 5+0.25 (endgame oriented) - starting values from previous SPSA run
20-12-14 jo KQKRPs_endgame diff
LLR: 3.93 (-2.94,2.94) [-3.00,1.00]
Total: 36758 W: 6170 L: 6024 D: 24564
sprt @ 60+0.05 th 1 LTC: Slightly changed downscaling to have more effect. I decided to only test take 2. See also commit notes, please.
17-12-14 Fi allowZeroKey diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 118640 W: 19909 L: 19946 D: 78785
sprt @ 60+0.05 th 1 Simplification. Don't treat 0 key as special.
19-12-14 sg king_block_pawn diff
ELO: -2.18 +-2.4 (95%) LOS: 3.7%
Total: 32659 W: 6502 L: 6707 D: 19450
40000 @ 15+0.05 th 1 big_king_safety: it's unlikely, but perhaps the change from 200 to 300 for king blocks pawn is responsible for the Elo gain, so go for safety and test this version against the current master (see pull request comment)
19-12-14 jo KQKRPs_endgame diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 49388 W: 10093 L: 10024 D: 29271
sprt @ 15+0.05 th 1 Slightly changed downscaling to have more effect. I decided to only test take 2. See also commit notes, please.
19-12-14 jh krp_endgame diff
LLR: 4.44 (-2.94,2.94) [-3.00,1.00]
Total: 32847 W: 5554 L: 5375 D: 21918
sprt @ 60+0.05 th 1 KQKRPs endgame adjustment.
19-12-14 mb midgame_pp1 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 14651 W: 2926 L: 2993 D: 8732
sprt @ 15+0.05 th 1 Raise midgame passed pawn values a bit.
18-12-14 pe tm diff
ELO: -0.66 +-2.5 (95%) LOS: 30.1%
Total: 30000 W: 5942 L: 5999 D: 18059
30000 @ 15+0.05 th 1 Decrease time available if for current iteration it looks like the best move would not change. Fix
19-12-14 jh krp_endgame diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 62936 W: 12648 L: 12604 D: 37684
sprt @ 15+0.05 th 1 KQKRPs endgame adjustment.
18-12-14 pe tm diff
ELO: -22.82 +-22.3 (95%) LOS: 2.2%
Total: 366 W: 60 L: 84 D: 222
30000 @ 15+0.05 th 1 Decrease time available if for current iteration it looks like the best move would not change
18-12-14 SC streamline_KPPKPP diff
23081/30000 iterations
41650/60000 games played
60000 @ 5+0.1 th 1 KPPKPP eval patch passed STC and failed LTC. Retry to tune a streamlined patch with an endgame-oriented tc.
17-12-14 lb tune diff
LLR: -4.38 (-2.94,2.94) [-3.00,1.00]
Total: 94218 W: 15734 L: 16115 D: 62369
sprt @ 60+0.05 th 1 LTC: simplified storm with locally tuned values
18-12-14 SC specKPPKPPeval diff
LLR: -3.73 (-2.94,2.94) [0.00,6.00]
Total: 11768 W: 1949 L: 2022 D: 7797
sprt @ 60+0.05 th 1 SPRT for a corrected version of KPP vs KPP eval patch. Previous version did not detect symmetric positions correctly.
17-12-14 sg big_king_safety diff
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 10311 W: 1876 L: 1721 D: 6714
sprt @ 60+0.05 th 1 LTC: Measure big king safety tuning with corrected values
17-12-14 jh time_1 diff
LLR: -3.71 (-2.94,2.94) [-1.50,4.50]
Total: 47696 W: 9608 L: 9610 D: 28478
sprt @ 15+0.05 th 1 Reduce bonus time after consecutive fail highs.
17-12-14 sg big_king_safety diff
ELO: 3.46 +-2.2 (95%) LOS: 99.9%
Total: 40000 W: 8275 L: 7877 D: 23848
40000 @ 15+0.05 th 1 Measure big king safety tuning with corrected values (see forum). Prio -2
17-12-14 SC specKPPKPPeval diff
LLR: 3.46 (-2.94,2.94) [-1.50,4.50]
Total: 35549 W: 7328 L: 7103 D: 21118
sprt @ 15+0.05 th 1 SPRT for a corrected version of KPP vs KPP eval patch. Previous version did not detect symmetric positions correctly.
14-12-14 sg spsa_big_king_safety diff
47490/50000 iterations
90852/100000 games played
100000 @ 60+0.05 th 1 Big king safety tuning. Stormdanger and Shelterweakness indexed by file pairs (a/h,b/g,c/f,d/e). Special case where king blocks pawn is incorporated in Stormdanger. LTC because TC-dependent. There are 93 parameters, so I use the following SPSA configuration: Games=100000 Gamma=0.159 Alpha=0.558 C=5 (except in maxSafety, where C=10 is used). Prio -1
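To illustrate what Alpha, Gamma and C control in an SPSA run like the one above, here is a minimal textbook-style SPSA loop. It is a sketch only: the gain schedules a_k = a / (A + k)^alpha and c_k = c / k^gamma follow the standard formulation, while the values of a and A, the two-parameter toy loss, and the function names are hypothetical and not taken from the test. The testing framework may parametrize C differently, and it tunes on game results rather than on a closed-form loss.

```python
import random

def spsa_minimize(loss, theta, iterations, a=0.1, c=5.0, A=10.0,
                  alpha=0.558, gamma=0.159, seed=0):
    """Textbook SPSA; a and A are hypothetical, alpha/gamma/c echo the entry."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, iterations + 1):
        a_k = a / (A + k) ** alpha   # step size, decays with Alpha
        c_k = c / k ** gamma         # perturbation size, decays with Gamma
        # Perturb every parameter at once by +c_k or -c_k.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Two evaluations yield a gradient estimate for all parameters.
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta

def toy_loss(p):
    # Hypothetical stand-in for match results: minimum at (30, 70).
    return (p[0] - 30.0) ** 2 + (p[1] - 70.0) ** 2

start = [50.0, 40.0]
tuned = spsa_minimize(toy_loss, start, iterations=500)
print(tuned, toy_loss(tuned))
```

The point of the simultaneous perturbation is that each iteration costs two evaluations regardless of the parameter count, which is what makes tuning 93 parameters at once feasible.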
17-12-14 np DB-PVStability-2 diff
ELO: -0.33 +-5.7 (95%) LOS: 45.5%
Total: 5263 W: 958 L: 963 D: 3342
20000 @ 30+0.05 th 1 Hypothesis: resolving PV stability at much higher depths won't help anymore. Faster machines should be more sensitive to this change.
16-12-14 np DB-PVStability diff
ELO: 0.55 +-3.0 (95%) LOS: 64.1%
Total: 18838 W: 3484 L: 3454 D: 11900
20000 @ 30+0.05 th 1 give more time for pv stability at higher depth.
16-12-14 lb tune diff
LLR: 3.26 (-2.94,2.94) [-3.50,0.50]
Total: 28113 W: 5754 L: 5655 D: 16704
sprt @ 15+0.05 th 1 simplified storm with locally tuned values