Stockfish Testing Queue

Finished - 22425 tests

26-01-14 do less_pv diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 75105 W: 16954 L: 16838 D: 41313
sprt @ 15+0.05 th 1 Remove most PV distinctions - low priority retest with 2moves_v1 book. Previously passed after 78877 games.
26-01-14 mc bpsqt diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 47663 W: 10777 L: 10524 D: 26362
sprt @ 15+0.05 th 1 SPRT for Joona: bpsqt, 23k iter (retest with 2moves book)
26-01-14 jk lognull diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 5865 W: 1030 L: 1120 D: 3715
sprt @ 15+0.05 th 1 Logistic try for null move value based reduction.
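The LLR values and the (-2.94,2.94) stopping bounds shown in the sprt entries can be approximated with a short sketch. It assumes the bracketed pair (for example [-1.50,4.50]) gives the H0 and H1 hypotheses in BayesElo units, that alpha = beta = 0.05, and that the draw-elo parameter is estimated from the sample itself. None of these assumptions are stated in the listing, and the function names are illustrative only.

```python
import math


def bayeselo_to_proba(elo, drawelo):
    """Return (win, draw, loss) probabilities under the BayesElo model."""
    p_win = 1.0 / (1.0 + 10.0 ** ((-elo + drawelo) / 400.0))
    p_loss = 1.0 / (1.0 + 10.0 ** ((elo + drawelo) / 400.0))
    return p_win, 1.0 - p_win - p_loss, p_loss


def estimate_drawelo(p_win, p_loss):
    """Invert the BayesElo model to recover draw-elo from observed frequencies."""
    return 200.0 * math.log10((1.0 - p_loss) / p_loss * (1.0 - p_win) / p_win)


def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0) for a W/L/D sample."""
    n = wins + losses + draws
    drawelo = estimate_drawelo(wins / n, losses / n)
    w0, d0, l0 = bayeselo_to_proba(elo0, drawelo)
    w1, d1, l1 = bayeselo_to_proba(elo1, drawelo)
    return (wins * math.log(w1 / w0)
            + draws * math.log(d1 / d0)
            + losses * math.log(l1 / l0))


def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower and upper stopping bounds of the sequential test (assumed error rates)."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)


if __name__ == "__main__":
    # Figures from the "jk lognull" entry directly above.
    print(round(sprt_llr(1030, 1120, 3715, -1.5, 4.5), 2))
    print(tuple(round(b, 2) for b in sprt_bounds()))
```

Under these assumptions, the lognull figures above should land close to the listed LLR of -2.97, below the lower bound, which is why the test stopped as a failure.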
26-01-14 hx piece_values diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 53321 W: 9779 L: 9746 D: 33796
sprt @ 15+0.05 th 1 mg-eg value diff moved to psqt (keeping eg psqt near zero - all zero or about 12-16 squares positive)
26-01-14 mc null_verification diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 13199 W: 2374 L: 2445 D: 8380
sprt @ 15+0.05 th 1 Tweak null verification threshold: take 3
26-01-14 mc null_verification^ diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6422 W: 1146 L: 1234 D: 4042
sprt @ 15+0.05 th 1 Tweak null verification threshold: take 2
26-01-14 mc null_verification^^ diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16399 W: 2886 L: 2949 D: 10564
sprt @ 15+0.05 th 1 Tweak null verification threshold: take 1
26-01-14 ok piece_values diff
LLR: -2.96 (-2.94,2.94) [0.00,6.00]
Total: 31320 W: 4716 L: 4679 D: 21925
sprt @ 60+0.05 th 1 LTC for hx: mg-eg value diff moved to psqt (all eg)
26-01-14 al KingAttackW diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6937 W: 1262 L: 1349 D: 4326
sprt @ 15+0.05 th 1 more conservatively lowering king attack weights - no SPSA values but direct downscaling, w/ half-weighted rounding and truncating
25-01-14 al KingAttackW diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4969 W: 889 L: 981 D: 3099
sprt @ 15+0.05 th 1 lowering king attack weights - values obtained with help of JK's SPSA script on very low search depth. Check whether it scales. Try 1.
25-01-14 hx piece_values diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 48390 W: 8837 L: 8817 D: 30736
sprt @ 15+0.05 th 1 mg-eg value diff moved to psqt (avg)
25-01-14 sg nullmove diff
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 33695 W: 5309 L: 5056 D: 23330
sprt @ 60+0.05 th 1 LTC: Fixed version: increase reduction on greater difference eval-beta
25-01-14 al KingAttackW diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4656 W: 835 L: 928 D: 2893
sprt @ 15+0.05 th 1 lowering king attack weights - values obtained with help of JK's SPSA script. Try 2, rounding up more.
25-01-14 jk bpsqt diff
ELO: 2.63 +-2.0 (95%) LOS: 99.6%
Total: 50000 W: 10733 L: 10354 D: 28913
50000 @ 5+0.05 th 1 bpsqt, 23k iter
24-01-14 ok 3fold_fix diff
ELO: -1.50 +-1.9 (95%) LOS: 5.8%
Total: 40000 W: 5996 L: 6169 D: 27835
40000 @ 60+0.05 th 1 LTC for jo: Fix 3-fold repetition, take 5. Common implementation like everybody else does.
25-01-14 sg nullmove diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 26141 W: 4871 L: 4699 D: 16571
sprt @ 15+0.05 th 1 Fixed version: increase reduction on greater difference eval-beta
19-01-14 jo kingAttackWeights diff
ELO: -2.59 +-3.0 (95%) LOS: 4.3%
Total: 20000 W: 3704 L: 3853 D: 12443
20000 @ 15+0.05 th 1 KS: Measure kingAttackWeights (Rook)
23-01-14 pe time_trouble diff
ELO: 2.25 +-1.9 (95%) LOS: 98.9%
Total: 40000 W: 6468 L: 6209 D: 27323
40000 @ 60 th 1 Handle time trouble. Take 1. LTC test with no increment 60+0
25-01-14 hx piece_values diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 11909 W: 2201 L: 2275 D: 7433
sprt @ 15+0.05 th 1 mg-eg value diff moved to psqt (all mg)
25-01-14 sg nullmove diff
LLR: 1.82 (-2.94,2.94) [0.00,6.00]
Total: 29897 W: 4637 L: 4442 D: 20818
sprt @ 60+0.05 th 1 LTC: increase reduction on greater difference eval-beta
25-01-14 hx piece_values diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 31902 W: 5947 L: 5760 D: 20195
sprt @ 15+0.05 th 1 mg-eg value diff moved to psqt (all eg)
25-01-14 ur reduce_value_known_win diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 2090 W: 324 L: 423 D: 1343
sprt @ 15+0.05 th 1 Note that bench is not a monotonic function of VALUE_KNOWN_WIN and the bench is clearly random, but it is a clear functional change. (I guess it is going to fail, but if it fails it will probably fail fast, and I do not plan to try more changes of this parameter.) Changing it earlier from 15000 to 10000 may have caused some improvement, so I want to check whether a bigger change can cause a significant improvement.
25-01-14 lb pawn diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 8162 W: 1495 L: 1579 D: 5088
sprt @ 15+0.05 th 1 do not cumulate pawn penalties
19-01-14 jo kingAttackWeights diff
ELO: -3.13 +-3.0 (95%) LOS: 2.1%
Total: 20000 W: 3816 L: 3996 D: 12188
20000 @ 15+0.05 th 1 KS: Measure kingAttackWeights (Queen)
25-01-14 ur fix_bug_infinite diff
ELO: -0.56 +-2.9 (95%) LOS: 35.5%
Total: 20000 W: 3668 L: 3700 D: 12632
20000 @ 15+0.05 th 1 This time I skip null move pruning only when the number of legal moves is 1, and bench is the same. (If 1 is better than 10, then I learn that the idea of not using null move pruning when the number of legal moves is small is bad, because the price of counting the legal moves is paid in both cases.)
25-01-14 sg nullmove diff
LLR: -3.00 (-2.94,2.94) [0.00,6.00]
Total: 18198 W: 2777 L: 2798 D: 12623
sprt @ 60+0.05 th 1 LTC: slower increase reduction on greater difference eval-beta
19-01-14 jo kingAttackWeights diff
ELO: -3.21 +-2.9 (95%) LOS: 1.6%
Total: 20000 W: 3653 L: 3838 D: 12509
20000 @ 15+0.05 th 1 KS: Measure kingAttackWeights (Bishop)
25-01-14 sg nullmove diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10640 W: 1938 L: 2015 D: 6687
sprt @ 15+0.05 th 1 Increase nullmove reduction more. Before I do the LTC of the passed version, I try to increase the reduction further.
24-01-14 sg nullmove diff
LLR: 3.92 (-2.94,2.94) [-1.50,4.50]
Total: 62697 W: 11630 L: 11329 D: 39738
sprt @ 15+0.05 th 1 increase reduction on greater difference eval-beta
25-01-14 bi nullmove diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 4413 W: 778 L: 872 D: 2763
sprt @ 15+0.05 th 1 Don't reduce null move when 1 legal move. Easy fix to test position in Uri patch.
25-01-14 sg nullmove diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 18237 W: 2802 L: 2821 D: 12614
sprt @ 60+0.05 th 1 LTC: increase nullmove reduction
24-01-14 sg nullmove diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 49042 W: 9005 L: 8983 D: 31054
sprt @ 15+0.05 th 1 decrease nullmove reduction
24-01-14 sg nullmove diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 5699 W: 1095 L: 976 D: 3628
sprt @ 15+0.05 th 1 slower increase reduction on greater difference eval-beta
24-01-14 sg nullmove diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 17432 W: 3278 L: 3128 D: 11026
sprt @ 15+0.05 th 1 increase nullmove reduction
24-01-14 ur fix_bug_infinite diff
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 48524 W: 7419 L: 7412 D: 33693
sprt @ 60+0.05 th 1 Long time control. I do not expect a big Elo improvement, so I verify no regression with these bounds, considering that the queue is almost empty.
23-01-14 pe time_trouble diff
ELO: 1.73 +-1.9 (95%) LOS: 96.0%
Total: 40000 W: 6537 L: 6338 D: 27125
40000 @ 16+0.8 th 1 Handle time trouble. Take 1. LTC test for base/inc=20 16 + 0.8
23-01-14 ur fix_bug_infinite diff
ELO: 1.98 +-2.1 (95%) LOS: 97.0%
Total: 40000 W: 7485 L: 7257 D: 25258
40000 @ 15+0.05 th 1 Test fixing a bug that causes the program not to solve 8/8/8/2p5/1pp5/brpp4/1pprp2P/qnkbK3 w - - 0 1 or to return an evaluation that is too high.
23-01-14 pe time_trouble diff
ELO: 0.96 +-2.0 (95%) LOS: 83.2%
Total: 40000 W: 6722 L: 6611 D: 26667
40000 @ 1+1 th 1 Handle time trouble. Take 1. LTC test for playing only on increment 1+1
24-01-14 pe time_trouble_3 diff
LLR: -1.80 (-2.94,2.94) [-3.00,3.00]
Total: 105542 W: 22012 L: 22078 D: 61452
sprt @ 5+0.05 th 1 Take 3. Simpler version. Check if no spillover of sudden death tc of simpler version is better at non-zero increment at base/inc = 100. One more quick test while other patches are getting approved or crashing.
24-01-14 jo pp_material_scaling diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 3736 W: 909 L: 1012 D: 1815
sprt @ 5+0.05 th 1 pp: scaling down eval for major pieces present
23-01-14 jo rr diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 2321 W: 528 L: 635 D: 1158
sprt @ 5+0.05 th 1 changing the calculation of rr to a linear function
22-01-14 do master diff
ELO: 41.01 +-2.2 (95%) LOS: 100.0%
Total: 40000 W: 10847 L: 6147 D: 23006
40000 @ 60+0.05 th 1 Regression test after SEE simplification with shallow book (2moves_v1). With 8moves_v3 book it was +32.49 ELO.
24-01-14 pe time_trouble_3 diff
LLR: -2.95 (-2.94,2.94) [-3.00,3.00]
Total: 9996 W: 2379 L: 2492 D: 5125
sprt @ 1+0.05 th 1 Take 3. Simpler version. Check if no spillover of sudden death tc of simpler version is better at non-zero increment at base/inc = 20
24-01-14 pe time_trouble_3 diff
LLR: -2.95 (-2.94,2.94) [-3.00,3.00]
Total: 11305 W: 2863 L: 2978 D: 5464
sprt @ 0.05+0.05 th 1 Take 3. Simpler version. Check if no spillover of sudden death tc of simpler version is better at non-zero increment
23-01-14 pe time_trouble_3 diff
LLR: 2.96 (-2.94,2.94) [-3.00,3.00]
Total: 28831 W: 5767 L: 5660 D: 17404
sprt @ 15 th 1 Take 3. Simpler version. Check if simple version without fade down is better at no increment
23-01-14 jo 3fold_fix diff
ELO: -0.88 +-2.1 (95%) LOS: 20.1%
Total: 40000 W: 7241 L: 7342 D: 25417
40000 @ 15+0.05 th 1 Fix 3-fold repetition, take 5. Common implementation like everybody else does.
22-01-14 jo master diff
ELO: 17.88 +-14.5 (95%) LOS: 99.2%
Total: 1128 W: 318 L: 260 D: 550
40000 @ 60+0.05 th 1 Regression test after SEE simplification (chess960_book_3moves.pgn) to test for resolution of the chess960book. Low prio.
22-01-14 jo 3fold_fix diff
ELO: -1.09 +-2.0 (95%) LOS: 14.9%
Total: 40000 W: 7136 L: 7261 D: 25603
40000 @ 15+0.05 th 1 Fix 3-fold repetition, take 3. Seems illogical, but let's check it, too.
21-01-14 jo pp_blockSq diff
LLR: 0.60 (-2.94,2.94) [-1.50,4.50]
Total: 128000 W: 23809 L: 23455 D: 80736
sprt @ 15+0.05 th 1 More bonus for pp eval when our king attacks the blockSq
22-01-14 pe time_trouble diff
ELO: 3.29 +-2.4 (95%) LOS: 99.6%
Total: 30000 W: 5849 L: 5565 D: 18586
30000 @ 5+0.25 th 1 Handle time trouble. Take 1. STC equivalent time control for base/inc = 20
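For the fixed-length entries, the ELO, +- and LOS figures can be reproduced approximately from the W/L/D totals. The sketch below assumes the usual logistic Elo conversion of the score, a normal approximation for the 95% margin, and a likelihood of superiority based only on decisive games. These are common conventions rather than something the listing states, and the function names are illustrative.

```python
import math


def score_to_elo(score):
    """Convert an expected score in (0, 1) to a logistic Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_summary(wins, losses, draws, z=1.96):
    """Return (elo, margin, los) for a W/L/D record.

    margin is the half-width of the z-sigma interval in Elo (1.96 ~ 95%),
    los is the likelihood of superiority as a fraction in [0, 1].
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * score ** 2) / n
    stderr = math.sqrt(variance / n)
    elo = score_to_elo(score)
    margin = (score_to_elo(score + z * stderr)
              - score_to_elo(score - z * stderr)) / 2.0
    # Draws carry no information about which side is stronger.
    los = 0.5 * (1.0 + math.erf((wins - losses)
                                / math.sqrt(2.0 * (wins + losses))))
    return elo, margin, los


if __name__ == "__main__":
    # Figures from the last time_trouble entry above.
    elo, margin, los = elo_summary(5849, 5565, 18586)
    print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {100 * los:.1f}%")
```

With the totals of the last entry above, this should come out close to the listed ELO: 3.29 +-2.4 (95%) LOS: 99.6%.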