Stockfish Testing Queue

Finished - 56600 tests

15-01-26 Roc WeakDefenders201501 diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 18452 W: 3792 L: 3634 D: 11026
sprt @ 15+0.05 th 1 Take 2 on this WeakDefenders idea.
15-01-26 Roc WeakDefenders201501 diff
LLR: -2.97 (-2.94,2.94) [0.00,6.00]
Total: 10068 W: 1631 L: 1686 D: 6751
sprt @ 60+0.05 th 1 Take 2 on this WeakDefenders idea. LTC.
15-01-27 lbr skill diff
ELO: 106.78 +-7.0 (95%) LOS: 100.0%
Total: 10000 W: 6293 L: 3313 D: 394
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: 7 vs 6
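The ELO figure in fixed-length entries such as the one above appears consistent with the standard logistic Elo formula applied to the match score (wins plus half the draws, divided by total games). The listing does not state the formula it uses, so the sketch below is an assumption, and the function name `elo_from_wld` is hypothetical:

```python
import math

def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a W/L/D record (logistic model, assumed)."""
    games = wins + losses + draws
    # Score fraction: a win counts 1, a draw counts 0.5.
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    # Invert the logistic expectation: score = 1 / (1 + 10^(-elo/400)).
    return -400.0 * math.log10(1.0 / score - 1.0)

if __name__ == "__main__":
    # Record taken from the "7 vs 6" skill entry; compare with its ELO line.
    print(round(elo_from_wld(6293, 3313, 394), 2))
```

Running it on the W/L/D totals of other fixed-length entries in this listing is a quick way to check a transcribed result against its ELO line.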
15-01-27 lbr skill diff
ELO: 83.61 +-6.8 (95%) LOS: 100.0%
Total: 10000 W: 5890 L: 3529 D: 581
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: 8 vs 7
15-01-27 lbr skill diff
ELO: 18.68 +-6.6 (95%) LOS: 100.0%
Total: 10000 W: 4962 L: 4425 D: 613
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: 9 vs 8
15-01-27 lbr skill diff
ELO: 91.41 +-6.8 (95%) LOS: 100.0%
Total: 10000 W: 5916 L: 3344 D: 740
10000 @ 10+0.05 th 1 Double skill level resolution. Measure ELO gap: 10 vs 9
15-01-27 lbr skill diff
ELO: 93.28 +-6.7 (95%) LOS: 100.0%
Total: 10000 W: 5862 L: 3240 D: 898
10000 @ 15+0.05 th 1 Double skill level resolution. Measure ELO gap: 11 vs 10
15-01-27 lbr skill diff
ELO: 91.41 +-6.6 (95%) LOS: 100.0%
Total: 10000 W: 5762 L: 3190 D: 1048
10000 @ 15+0.05 th 1 Double skill level resolution. Measure ELO gap: 12 vs 11
15-01-27 lbr skill diff
ELO: 26.49 +-6.4 (95%) LOS: 100.0%
Total: 10000 W: 4776 L: 4015 D: 1209
10000 @ 15+0.05 th 1 Double skill level resolution. Measure ELO gap: 13 vs 12
15-01-27 Roc EvalPieceSimplification diff
LLR: 1.31 (-2.94,2.94) [-3.00,1.00]
Total: 48563 W: 9682 L: 9700 D: 29181
sprt @ 15+0.05 th 1 Attempt at a small simplification in evaluate_piece.
15-01-27 Roc EvalPieceSimplification diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41185 W: 8154 L: 8069 D: 24962
sprt @ 15+0.05 th 1 Attempt at a small simplification in evaluate_piece. Fixed error in initial test submission.
15-01-27 SC no_null_trick diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 14324 W: 2769 L: 2956 D: 8599
sprt @ 15+0.05 th 1 Testing removal of post-null move evaluation trick as a simplification.
15-01-27 lbr skill diff
ELO: 75.55 +-6.5 (95%) LOS: 100.0%
Total: 10000 W: 5417 L: 3276 D: 1307
10000 @ 15+0.05 th 1 Double skill level resolution. Measure ELO gap: 14 vs 13
15-01-27 lbr skill diff
ELO: 55.22 +-6.3 (95%) LOS: 100.0%
Total: 10000 W: 5027 L: 3451 D: 1522
10000 @ 15+0.05 th 1 Double skill level resolution. Measure ELO gap: 15 vs 14
15-01-27 lbr skill diff
ELO: 31.77 +-6.2 (95%) LOS: 100.0%
Total: 10000 W: 4596 L: 3684 D: 1720
10000 @ 15+0.05 th 1 Double skill level resolution. Measure ELO gap: 16 vs 15
15-01-27 lbr skill diff
ELO: 25.55 +-6.2 (95%) LOS: 100.0%
Total: 10000 W: 4504 L: 3770 D: 1726
10000 @ 15+0.05 th 1 Double skill level resolution. Measure ELO gap: 17 vs 16
15-01-28 lbr skill diff
ELO: 25.72 +-6.2 (95%) LOS: 100.0%
Total: 10000 W: 4461 L: 3722 D: 1817
10000 @ 20+0.05 th 1 Double skill level resolution. Measure ELO gap: 18 vs 17
15-01-28 lbr skill diff
ELO: 21.29 +-6.1 (95%) LOS: 100.0%
Total: 10000 W: 4352 L: 3740 D: 1908
10000 @ 20+0.05 th 1 Double skill level resolution. Measure ELO gap: 19 vs 18
15-01-28 Roc SPSA_EvalSimplification diff
12016/12500 iterations
25000/25000 games played
25000 @ 15+0.05 th 1 SPSA: the "simplification" idea was not harmful, but not exciting. Let's see whether tweaking some ThreatenedByPawn weights gives something more interesting that is worth pursuing further.
15-01-28 lbr skill diff
ELO: 24.95 +-6.1 (95%) LOS: 100.0%
Total: 10000 W: 4393 L: 3676 D: 1931
10000 @ 20+0.05 th 1 Double skill level resolution. 20 vs 19
15-01-28 lbr skill diff
ELO: 16.62 +-6.1 (95%) LOS: 100.0%
Total: 10000 W: 4265 L: 3787 D: 1948
10000 @ 20+0.05 th 1 Double skill level resolution. 21 vs 20
15-01-28 lbr skill diff
ELO: 18.78 +-6.1 (95%) LOS: 100.0%
Total: 10000 W: 4229 L: 3689 D: 2082
10000 @ 40+0.05 th 1 Double skill level resolution. 21 vs 20. Redo this one with a longer TC to see whether we hit the TC ceiling for that skill level.
15-01-28 roh QuickMove diff
LLR: -2.96 (-2.94,2.94) [0.00,6.00]
Total: 21144 W: 3614 L: 3614 D: 13916
sprt @ 60+0.05 th 1 Testing with 1000 iterations at 60+0.05 passes sanity check. Check with 20,000 matches while FishTest is idle.
15-01-28 n_p SPSAKingSafety2 diff
58111/50000 iterations
94197/100000 games played
100000 @ 60+0.05 th 1 Another SPSA session on king safety. The start values are from KingSafetySPSA.
15-01-28 Roc FCC_20150127 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 3513 W: 685 L: 783 D: 2045
sprt @ 15+0.05 th 1 A fix on take #3 of previous Fast contact checks attempt. A bit slower than RC3.
15-01-28 Roc FCC_20150127 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8367 W: 1691 L: 1775 D: 4901
sprt @ 15+0.05 th 1 A small variation.
15-01-29 Roc FCC_20150127 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 8408 W: 1637 L: 1721 D: 5050
sprt @ 15+0.05 th 1 One more tweak.
15-01-29 Mys no_trapRK2 diff
ELO: -33.59 +-3.1 (95%) LOS: 0.0%
Total: 21000 W: 3667 L: 5691 D: 11642
20000 @ 15+0.05 th 1 Indicator of removing trappedrook #2
15-01-29 Mys no_trapRK1 diff
ELO: -10.17 +-3.1 (95%) LOS: 0.0%
Total: 20000 W: 3857 L: 4442 D: 11701
20000 @ 15+0.05 th 1 Indicator of removing trappedrook #1
15-01-29 Roc FCC_20150127 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 10634 W: 2084 L: 2162 D: 6388
sprt @ 15+0.05 th 1 This time, use exactly the same bitboard for mobility and kingring calculation as master, hoping that the remaining code is still an "improvement".
15-01-29 n_p KingSafety diff
ELO: 3.21 +-2.1 (95%) LOS: 99.8%
Total: 40000 W: 8181 L: 7812 D: 24007
40000 @ 15+0.05 th 1 Checking the values from the SPSA-session on king safety.
15-01-29 lbr skill diff
ELO: -0.98 +-16.0 (95%) LOS: 45.2%
Total: 1777 W: 875 L: 880 D: 22
10000 @ 10+0.05 th 1 measure simplified skill levels: 2 vs 0
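The LOS (likelihood of superiority) values in this listing appear consistent with the common normal approximation that uses only wins and losses and ignores draws. The listing does not say how LOS is computed, so this is an assumption, and the function name `los_from_wl` is hypothetical:

```python
import math

def los_from_wl(wins: int, losses: int) -> float:
    """Likelihood of superiority from wins and losses (normal approximation, assumed)."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games, so no evidence either way
    # Draws do not enter the formula; only the win/loss imbalance matters.
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))

if __name__ == "__main__":
    # Record taken from the "2 vs 0" skill entry; compare with its LOS line.
    print(f"{100.0 * los_from_wl(875, 880):.1f}%")
```

A value near 50% means the record gives almost no evidence that either side is stronger, which matches the near-even win and loss counts in that entry.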
15-01-29 lbr skill diff
ELO: 101.89 +-25.6 (95%) LOS: 100.0%
Total: 740 W: 462 L: 251 D: 27
10000 @ 10+0.05 th 1 measure simplified skill levels: 4 vs 2
15-01-30 gli less_mc_pruning diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 1339 W: 222 L: 326 D: 791
sprt @ 15+0.05 th 1 Do less movecount pruning if there are big threats on the board
15-01-30 gli less_mc_pruning diff
ELO: -19.74 +-7.0 (95%) LOS: 0.0%
Total: 3859 W: 674 L: 893 D: 2292
20000 @ 15+0.05 th 1 (Pre-emptive ELO loss measurement :) Do less movecount pruning if there are big threats on the board
15-01-30 gli measure_pv_pruning diff
ELO: -3.89 +-2.0 (95%) LOS: 0.0%
Total: 40000 W: 6535 L: 6983 D: 26482
40000 @ 60+0.05 th 1 Measure ELO of futility pruning in PV nodes
15-01-30 n_p KingSafety50 diff
ELO: 2.81 +-2.2 (95%) LOS: 99.4%
Total: 40000 W: 8283 L: 7960 D: 23757
40000 @ 15+0.05 th 1 The SPSA-values with 50% more change.
15-01-30 gli less_mc_pruning2 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 1958 W: 314 L: 414 D: 1230
sprt @ 15+0.05 th 1 Don't movecount prune if bestValue is a lot lower than static eval. This implies we don't understand this position very well, so try more moves.
15-01-30 sg pawn_attack_threat diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 7925 W: 1666 L: 1537 D: 4722
sprt @ 15+0.05 th 1 Add bonus for possible safe pawn pushes which attack an enemy piece. Inspired by http://talkchess.com/forum/viewtopic.php?t=55142
15-01-30 sg pawn_attack_threat diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 40109 W: 6841 L: 6546 D: 26722
sprt @ 60+0.05 th 1 LTC: Add bonus for possible safe pawn pushes which attack an enemy piece. Inspired by http://talkchess.com/forum/viewtopic.php?t=55142
15-01-31 Roc QueenPin diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 10171 W: 1994 L: 2073 D: 6104
sprt @ 15+0.05 th 1 Reducing mobility of pinned pieces against Queen. Simplified version, take 1
15-01-31 Roc QueenPin diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 10919 W: 2212 L: 2289 D: 6418
sprt @ 15+0.05 th 1 Take 2: If a piece is pinned against Queen, adjust King Safety too.
15-01-31 sg pawn_attack_threat2 diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 14079 W: 2910 L: 2764 D: 8405
sprt @ 15+0.05 th 1 Add bonus for possible safe pawn pushes which attack an enemy piece. Cover more cases by using a doubleAttackedBy array. (Take 2)
15-01-31 SC no_tempo diff
ELO: -8.65 +-2.1 (95%) LOS: 0.0%
Total: 42000 W: 8017 L: 9062 D: 24921
40000 @ 15+0.05 th 1 Measure how much Elo the tempo evaluation is worth.
15-01-31 vin passed_blockers diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 15827 W: 3178 L: 3241 D: 9408
sprt @ 15+0.05 th 1 Try a small bonus for a protected middle game passed pawn, since it is more likely to survive long-term into the ending. Local 10,000 game test was promising.
15-01-31 sg pawn_attack_threat2 diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 6092 W: 1064 L: 933 D: 4095
sprt @ 60+0.05 th 1 LTC: Add bonus for possible safe pawn pushes which attack an enemy piece. Cover more cases by using a doubleAttackedBy array. (Take 2)
15-01-31 n_p KingSafety2 diff
ELO: 1.44 +-2.3 (95%) LOS: 88.8%
Total: 34359 W: 6910 L: 6768 D: 20681
40000 @ 15+0.05 th 1 We accidentally got another set of king safety values after "the great purge". Test these.
15-01-31 sg pawn_attack_threat2 diff
ELO: -1.07 +-2.0 (95%) LOS: 15.1%
Total: 37089 W: 6061 L: 6175 D: 24853
40000 @ 60+0.05 th 1 Both versions of pawn attack threat passed (the second seems better at LTC judging by test run length, but this can be misleading). So measure in a direct match which one is better.
15-01-31 Roc QueenPin diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8660 W: 1701 L: 1784 D: 5175
sprt @ 15+0.05 th 1 See git notes.
15-01-31 n_p KingSafety diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 47765 W: 8091 L: 7785 D: 31889
sprt @ 60+0.05 th 1 Try the best values of king safety in LTC.