Stockfish Testing Queue

Finished - 30563 tests

15-03-02 zar one_lever diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 35125 W: 6914 L: 6925 D: 21286
sprt @ 15+0.05 th 1 Don't bonus lever with two opponents.
15-03-02 vin time_predictor diff
25188/25000 iterations
50000/50000 games played
50000 @ 30+0.05 th 1 Restart SPSA run as one parameter was going to hit the lower limit. (Thanks Binky) Also sync with latest master.
15-03-02 vin structural_mobility diff
LLR: -3.31 (-2.94,2.94) [-1.50,4.50]
Total: 21025 W: 4118 L: 4180 D: 12727
sprt @ 15+0.05 th 1 Try bonus for pawns that are more mobile structurally (e.g. candidate passers) as opposed to currently mobile (the current safe pawn push bonus)
15-03-02 sg long_chain diff
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 8225 W: 1708 L: 1578 D: 4939
sprt @ 15+0.05 th 1 Add bonus for inner pawns of a long chain. (even lower bonus)
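
The LLR lines in the sprt entries can be reproduced approximately from the W/L/D totals. The sketch below assumes a BayesElo-style draw model, with the bracketed pair (here [-1.50,4.50]) taken as the two hypotheses and the parenthesised pair (-2.94,2.94) taken as stopping bounds for alpha = beta = 0.05. The listing does not confirm any of this, and the function names are hypothetical. Under those assumptions, the totals of the long_chain entry above give a value close to the listed 2.97.

```python
import math

def bayeselo_to_probs(elo, draw_elo):
    """Win/loss/draw probabilities under a BayesElo-style model (assumed)."""
    win = 1.0 / (1.0 + 10.0 ** ((draw_elo - elo) / 400.0))
    loss = 1.0 / (1.0 + 10.0 ** ((draw_elo + elo) / 400.0))
    return win, loss, 1.0 - win - loss

def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    n = wins + losses + draws
    w, l = wins / n, losses / n
    # Estimate the draw parameter from the observed win and loss rates.
    draw_elo = 200.0 * math.log10((1.0 - l) / l * (1.0 - w) / w)
    w0, l0, d0 = bayeselo_to_probs(elo0, draw_elo)
    w1, l1, d1 = bayeselo_to_probs(elo1, draw_elo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))

def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower and upper stopping bounds for the LLR (alpha, beta assumed)."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)

# Totals from the long_chain entry: W: 1708 L: 1578 D: 4939
llr = sprt_llr(1708, 1578, 4939, -1.5, 4.5)
lower, upper = sprt_bounds()
print(f"LLR: {llr:.2f} ({lower:.2f},{upper:.2f})")
```

A test stops as passed once the LLR crosses the upper bound and as failed once it crosses the lower bound, which is why finished entries sit just outside one of the two.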
15-03-02 jos matimb diff
ELO: -3.46 +-2.1 (95%) LOS: 0.1%
Total: 44000 W: 9336 L: 9774 D: 24890
40000 @ 9+0.05 th 1 One last try to improve upon existing values.
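
The ELO and LOS figures in the fixed-length entries can be recovered approximately from the W/L/D totals. The following is a minimal sketch assuming the usual logistic Elo formula, a normal approximation for the 95% margin, and LOS computed from decisive games only. The function name is hypothetical and the listing does not state which formulas were actually used. For the matimb entry above, it yields values close to the listed -3.46 +-2.1 and 0.1%.

```python
import math

def elo_summary(wins, losses, draws):
    """Return (elo, 95% margin, LOS) from match totals (assumed formulas)."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n

    def to_elo(s):
        # Logistic Elo difference corresponding to an expected score s.
        return -400.0 * math.log10(1.0 / s - 1.0)

    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    half_width = 1.96 * math.sqrt(var / n)
    margin = (to_elo(score + half_width) - to_elo(score - half_width)) / 2.0

    # Likelihood of superiority: normal CDF of (W - L) / sqrt(W + L).
    los = 0.5 * (1.0 + math.erf((wins - losses)
                                / math.sqrt(2.0 * (wins + losses))))
    return to_elo(score), margin, los

# Totals from the matimb entry: W: 9336 L: 9774 D: 24890
elo, margin, los = elo_summary(9336, 9774, 24890)
print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {los * 100:.1f}%")
```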
15-03-02 lbr 54f8a9cb diff
ELO: -8.27 +-58.9 (95%) LOS: 39.1%
Total: 42 W: 6 L: 7 D: 29
20000 @ 15+0.05 th 1 SF 5: 4 vs. 2 threads
15-03-03 Fis TTflipEval diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 11453 W: 2226 L: 2301 D: 6926
sprt @ 15+0.05 th 1 Use the NULL move flip trick on TT evals also. 2MB hash
15-03-03 sg long_chain diff
LLR: -2.96 (-2.94,2.94) [0.00,6.00]
Total: 15665 W: 2575 L: 2603 D: 10487
sprt @ 60+0.05 th 1 LTC: Add bonus for inner pawns of a long chain. (even lower bonus)
15-03-03 Fis TTflipEval diff
LLR: -3.09 (-2.94,2.94) [-1.50,4.50]
Total: 10331 W: 1934 L: 2017 D: 6380
sprt @ 15+0.05 th 1 One more try also refreshing the flipped TT entries. 2MB
15-03-03 vin time_predictor diff
26652/25000 iterations
50000/50000 games played
50000 @ 30+0.05 th 1 One parameter has converged it seems, so one more run to stabilise the other. Priority -1 so as not to hold up all the eval/threading work.
15-03-03 sni king2 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8113 W: 1654 L: 1739 D: 4720
sprt @ 15+0.05 th 1 Give less importance to the pawn structure when the kings are separated
15-03-03 sni king2 diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 11459 W: 2310 L: 2385 D: 6764
sprt @ 15+0.05 th 1 Take 2. Less reduction with king separation.
15-03-03 sni king2 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 12731 W: 2514 L: 2586 D: 7631
sprt @ 15+0.05 th 1 Take 3. Quadratic reduction.
15-03-04 Fis tt_iteration2 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 2442 W: 433 L: 533 D: 1476
sprt @ 15+0.05 th 1 A modified version of this http://tests.stockfishchess.org/tests/view/54f1b4af0ebc594fbf9a9735 patch. Full credit to Marco and http://www.talkchess.com/forum/viewtopic.php?t=55501 2MB
15-03-04 sg long_chain diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 32674 W: 6534 L: 6338 D: 19802
sprt @ 15+0.05 th 1 Add bonus for inner pawns of a long chain. The last value 1/16 passed STC fast but failed LTC. Try now a higher value between 1/8 and 1/16.
15-03-04 jos outpost3 diff
ELO: 0.80 +-1.8 (95%) LOS: 80.3%
Total: 46000 W: 7775 L: 7669 D: 30556
40000 @ 9+0.05 th 1 Measure tuned outpost values with 8-moves book.
15-03-04 sg long_chain diff
LLR: -2.97 (-2.94,2.94) [0.00,6.00]
Total: 25573 W: 4245 L: 4226 D: 17102
sprt @ 60+0.05 th 1 LTC: Add bonus for inner pawns of a long chain. The last value 1/16 passed STC fast but failed LTC. Try now a higher value between 1/8 and 1/16.
15-03-04 jki cb2111f0b62af diff
ELO: -0.43 +-2.0 (95%) LOS: 33.5%
Total: 40000 W: 6594 L: 6643 D: 26763
40000 @ 60+0.05 th 1 c++11 migration, regression test
15-03-04 jki cb2111f0b62af diff
ELO: -32.20 +-8.1 (95%) LOS: 0.0%
Total: 2229 W: 252 L: 458 D: 1519
20000 @ 60+0.05 th 4 c++11 migration, regression test, 4 threads
15-03-05 vin time_predictor_a diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 1493 W: 219 L: 319 D: 955
sprt @ 30+0.05 th 1 Test of first possible metric (overall node growth) from statistical fit of node growth for time management. Test at same TC as was used for tuning initially.
15-03-05 sni king3 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4848 W: 959 L: 1053 D: 2836
sprt @ 15+0.05 th 1 Reward semi-open and open files on opponent king
15-03-05 jos no_is_draw diff
ELO: -4.66 +-3.0 (95%) LOS: 0.1%
Total: 20000 W: 3852 L: 4120 D: 12028
20000 @ 15+0.05 th 1 Can we drop the check for a draw at the beginning of qsearch? (We shouldn't miss a 3-fold rep because we always check in main search.)
15-03-06 jki spin diff
ELO: -5.82 +-10.7 (95%) LOS: 14.4%
Total: 1314 W: 204 L: 226 D: 884
10000 @ 15+0.05 th 16 c++11 regression test, spinlocks activated
15-03-06 jos outpost3 diff
ELO: 0.39 +-1.9 (95%) LOS: 65.5%
Total: 44000 W: 7594 L: 7545 D: 28861
40000 @ 9+0.05 th 1 Now measure some asymmetrical values. Also with 8-moves book to get a fair comparison to the symmetrical values.
15-03-06 lbr 27a18772 diff
ELO: 11.94 +-3.2 (95%) LOS: 100.0%
Total: 20000 W: 4883 L: 4196 D: 10921
20000 @ 9+0.03 th 1 quick test to confirm scaling regression interval. 1 thread.
15-03-06 lbr 27a18772 diff
ELO: 7.07 +-3.1 (95%) LOS: 100.0%
Total: 20000 W: 4276 L: 3869 D: 11855
20000 @ 9+0.03 th 3 quick test to confirm scaling regression interval. 3 threads.
15-03-06 lbr 27a18772 diff
ELO: 3.39 +-3.2 (95%) LOS: 98.0%
Total: 17226 W: 3422 L: 3254 D: 10550
20000 @ 9+0.03 th 7 quick test to confirm scaling regression interval. 7 threads.
15-03-07 lbr 2eec7103 diff
ELO: 0.90 +-3.1 (95%) LOS: 71.1%
Total: 19369 W: 4050 L: 4000 D: 11319
20000 @ 15+0.05 th 1 confirm scaling regression. 1 thread.
15-03-07 lbr 2eec7103 diff
ELO: 1.58 +-3.2 (95%) LOS: 83.5%
Total: 17564 W: 3409 L: 3329 D: 10826
20000 @ 15+0.05 th 3 confirm scaling regression. 3 threads.
15-03-07 lbr 27a18772 diff
ELO: 8.50 +-3.3 (95%) LOS: 100.0%
Total: 20000 W: 4961 L: 4472 D: 10567
20000 @ 9+0.03 th 1 bisect 766fb9c6..27a18772 1 thread
15-03-07 lbr 27a18772 diff
ELO: 10.48 +-3.1 (95%) LOS: 100.0%
Total: 20000 W: 4586 L: 3983 D: 11431
20000 @ 9+0.03 th 3 bisect 766fb9c6..27a18772 3 threads
15-03-07 lbr 27a18772 diff
ELO: 1.84 +-3.0 (95%) LOS: 88.2%
Total: 20000 W: 4066 L: 3960 D: 11974
20000 @ 9+0.03 th 7 bisect 766fb9c6..27a18772 7 threads
15-03-07 vin time_predictor diff
330/25000 iterations
670/50000 games played
50000 @ 30+0.05 th 1 Move to the second possible metric, since overall node growth was not good. This time tune against average width.
15-03-07 jki nolocks2 diff
ELO: 2.78 +-5.5 (95%) LOS: 83.8%
Total: 5000 W: 841 L: 801 D: 3358
20000 @ 15+0.05 th 16 Retire global locks, more compact implementation
15-03-07 vin time_predictor diff
567/25000 iterations
397/50000 games played
50000 @ 30+0.05 th 1 Move to the second possible metric, since overall node growth was not good. This time tune against average width. (Stopped previous run owing to UCI option parsing bug)
15-03-07 sni yielding_spinlock diff
ELO: 136.48 +-5.8 (95%) LOS: 100.0%
Total: 6073 W: 2566 L: 296 D: 3211
20000 @ 15+0.05 th 7 Estimate the value of using yielding spinlocks instead of mutexes for 7 threads (test is between C++11 compile with yielding spinlocks and C++11 compile with mutexes)
15-03-07 sni yielding_spinlock diff
ELO: -3.27 +-2.8 (95%) LOS: 1.1%
Total: 20000 W: 3298 L: 3486 D: 13216
20000 @ 15+0.05 th 7 Estimate the value of using yielding spinlocks instead of normal spinlocks for 7 threads (test is between C++11 compile with yielding spinlocks and C++11 compile with normal spinlocks)
15-03-07 jos no_is_draw diff
ELO: -1.70 +-3.0 (95%) LOS: 13.4%
Total: 20000 W: 3877 L: 3975 D: 12148
20000 @ 15+0.05 th 1 Only check for a draw when called from main search. (Idea from Crafty) Gives a nice boost in nps, but we also lose some of the pruning effect here.
15-03-07 vin time_predictor diff
582/25000 iterations
968/50000 games played
50000 @ 30+0.05 th 1 Last SPSA run was converging to nonsense values, so clearly something is not right. Constrain the action of the metric so it can work alongside BestMoveChanges.
15-03-07 vin measure_BMC diff
ELO: -21.01 +-3.1 (95%) LOS: 0.0%
Total: 20000 W: 3505 L: 4713 D: 11782
20000 @ 15+0.05 th 1 Measure scalability of BestMoveChanges. Part 1. Measure benefit of BMC for one thread.
15-03-07 vin measure_BMC diff
ELO: -21.60 +-2.9 (95%) LOS: 0.0%
Total: 20000 W: 3038 L: 4280 D: 12682
20000 @ 15+0.05 th 3 Measure scalability of BestMoveChanges. Part 2 - measure benefit with 3 threads. Priority -1.
15-03-07 jki nolocks3 diff
ELO: 2.48 +-3.1 (95%) LOS: 94.3%
Total: 16079 W: 2710 L: 2595 D: 10774
20000 @ 15+0.05 th 16 Retire global lock, version 3. Including Marco's cleanup.
15-03-07 lbr d6613b7 diff
ELO: 4.90 +-3.3 (95%) LOS: 99.8%
Total: 20000 W: 4858 L: 4576 D: 10566
20000 @ 9+0.03 th 1 bisecting. 1 thread
15-03-07 lbr d6613b7 diff
ELO: 5.09 +-3.1 (95%) LOS: 99.9%
Total: 20000 W: 4377 L: 4084 D: 11539
20000 @ 9+0.03 th 3 bisecting. 3 threads
15-03-07 lbr d6613b7 diff
ELO: 1.62 +-3.1 (95%) LOS: 84.7%
Total: 20000 W: 4183 L: 4090 D: 11727
20000 @ 9+0.03 th 7 bisecting. 7 threads
15-03-08 vin measure_BMC diff
ELO: -17.60 +-2.8 (95%) LOS: 0.0%
Total: 20000 W: 2915 L: 3927 D: 13158
20000 @ 60+0.05 th 1 Measure scalability of BestMoveChanges. Part 3 (and last). Measure benefit of BMC at LTC. Priority -1.
15-03-08 vin time_predictor_b diff
2480/25000 iterations
4999/50000 games played
50000 @ 60+0.05 th 1 Back to the second time predictor metric, after local testing to get a sensible starting point for the parameters. (On previous runs InstabilityMultiplier started far too high). Priority -1.
15-03-08 jos mobility diff
ELO: 0.62 +-2.2 (95%) LOS: 70.9%
Total: 41000 W: 8788 L: 8715 D: 23497
40000 @ 9+0.05 th 1 Measure new values for Mobility.
15-03-08 sg double_history diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 4747 W: 1005 L: 885 D: 2857
sprt @ 15+0.05 th 1 Collect double history data and use them for quiet move ordering. Inspired by http://talkchess.com/forum/viewtopic.php?t=55535
15-03-08 sg double_history diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 5726 W: 1001 L: 872 D: 3853
sprt @ 60+0.05 th 1 LTC: Collect double history data and use them for quiet move ordering. Inspired by http://talkchess.com/forum/viewtopic.php?t=55535