Stockfish Testing Queue

Finished - 38853 tests

15-03-03 sni king2 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8113 W: 1654 L: 1739 D: 4720
sprt @ 15+0.05 th 1 Give less importance to the pawn structure when the kings are separated
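Note on the SPRT lines: the pair (-2.94,2.94) shown next to each LLR value matches Wald's stopping bounds for error rates of 5% on each side. The sketch below shows that relationship. The values alpha = beta = 0.05 are an assumption inferred from the displayed bounds, not something this listing states, and the function names are hypothetical.

```python
import math

def sprt_bounds(alpha, beta):
    """Wald's SPRT stopping bounds for the log-likelihood ratio (LLR).

    alpha: accepted probability of passing a patch that is not better
    beta:  accepted probability of failing a patch that is better
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

def sprt_decision(llr, lower, upper):
    """Classify a running LLR against the bounds."""
    if llr <= lower:
        return "rejected"
    if llr >= upper:
        return "accepted"
    return "continue"

# Assumption: alpha = beta = 0.05, consistent with the (-2.94,2.94) pair above.
lower, upper = sprt_bounds(0.05, 0.05)
print(f"bounds: ({lower:.2f}, {upper:.2f})")
print(sprt_decision(-2.95, lower, upper))  # LLR taken from the first record
```

A test stops as soon as its LLR crosses either bound, which is why the SPRT records in this list end with an LLR close to -2.94 or 2.94.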
15-03-03 sni king2 diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 11459 W: 2310 L: 2385 D: 6764
sprt @ 15+0.05 th 1 Take 2. Less reduction with king separation.
15-03-03 sni king2 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 12731 W: 2514 L: 2586 D: 7631
sprt @ 15+0.05 th 1 Take 3. Quadratic reduction.
15-03-04 Fis tt_iteration2 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 2442 W: 433 L: 533 D: 1476
sprt @ 15+0.05 th 1 A modified version of this http://tests.stockfishchess.org/tests/view/54f1b4af0ebc594fbf9a9735 patch. Full credit to Marco and http://www.talkchess.com/forum/viewtopic.php?t=55501 2MB
15-03-04 sg long_chain diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 32674 W: 6534 L: 6338 D: 19802
sprt @ 15+0.05 th 1 Add bonus for inner pawns of a long chain. The last value 1/16 passed STC fast but failed LTC. Try now a higher value between 1/8 and 1/16.
15-03-04 jos outpost3 diff
ELO: 0.80 +-1.8 (95%) LOS: 80.3%
Total: 46000 W: 7775 L: 7669 D: 30556
40000 @ 9+0.05 th 1 Measure tuned outpost values with 8-moves book.
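Note on the fixed-games lines: the ELO and LOS figures can be reproduced from the W/L/D totals using the standard logistic Elo formula and the usual normal approximation for likelihood of superiority. This is a minimal sketch under that assumption. It is not taken from the testing framework's own code, and the function names are hypothetical.

```python
import math

def elo_from_wld(wins, losses, draws):
    """Logistic Elo difference implied by the average score per game."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)

def los_from_wl(wins, losses):
    """Likelihood of superiority (normal approximation; draws cancel out)."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

# Totals from the outpost3 record above: W: 7775 L: 7669 D: 30556
elo = elo_from_wld(7775, 7669, 30556)
los = los_from_wl(7775, 7669)
print(f"ELO: {elo:.2f}  LOS: {100.0 * los:.1f}%")
```

For the outpost3 totals this gives values that agree with the listed ELO: 0.80 and LOS: 80.3%. The 95% error margin shown after "+-" is not reproduced here.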
15-03-04 sg long_chain diff
LLR: -2.97 (-2.94,2.94) [0.00,6.00]
Total: 25573 W: 4245 L: 4226 D: 17102
sprt @ 60+0.05 th 1 LTC: Add bonus for inner pawns of a long chain. The last value 1/16 passed STC fast but failed LTC. Try now a higher value between 1/8 and 1/16.
15-03-04 jki cb2111f0b62af diff
ELO: -0.43 +-2.0 (95%) LOS: 33.5%
Total: 40000 W: 6594 L: 6643 D: 26763
40000 @ 60+0.05 th 1 C++11 migration, regression test
15-03-04 jki cb2111f0b62af diff
ELO: -32.20 +-8.1 (95%) LOS: 0.0%
Total: 2229 W: 252 L: 458 D: 1519
20000 @ 60+0.05 th 4 C++11 migration, regression test, 4 threads
15-03-05 vin time_predictor_a diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 1493 W: 219 L: 319 D: 955
sprt @ 30+0.05 th 1 Test of first possible metric (overall node growth) from statistical fit of node growth for time management. Test at same TC as was used for tuning initially.
15-03-05 sni king3 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4848 W: 959 L: 1053 D: 2836
sprt @ 15+0.05 th 1 Reward semi-open and open files on opponent king
15-03-05 jos no_is_draw diff
ELO: -4.66 +-3.0 (95%) LOS: 0.1%
Total: 20000 W: 3852 L: 4120 D: 12028
20000 @ 15+0.05 th 1 Can we drop the check for a draw at the beginning of qsearch? (We shouldn't miss a 3-fold rep because we always check in main search.)
15-03-06 jki spin diff
ELO: -5.82 +-10.7 (95%) LOS: 14.4%
Total: 1314 W: 204 L: 226 D: 884
10000 @ 15+0.05 th 16 C++11 regression test, spinlocks activated
15-03-06 jos outpost3 diff
ELO: 0.39 +-1.9 (95%) LOS: 65.5%
Total: 44000 W: 7594 L: 7545 D: 28861
40000 @ 9+0.05 th 1 Now measure some asymmetrical values. Also with 8-moves book to get a fair comparison to the symmetrical values.
15-03-06 lbr 27a18772 diff
ELO: 11.94 +-3.2 (95%) LOS: 100.0%
Total: 20000 W: 4883 L: 4196 D: 10921
20000 @ 9+0.03 th 1 quick test to confirm scaling regression interval. 1 thread.
15-03-06 lbr 27a18772 diff
ELO: 7.07 +-3.1 (95%) LOS: 100.0%
Total: 20000 W: 4276 L: 3869 D: 11855
20000 @ 9+0.03 th 3 quick test to confirm scaling regression interval. 3 threads.
15-03-06 lbr 27a18772 diff
ELO: 3.39 +-3.2 (95%) LOS: 98.0%
Total: 17226 W: 3422 L: 3254 D: 10550
20000 @ 9+0.03 th 7 quick test to confirm scaling regression interval. 7 threads.
15-03-07 lbr 2eec7103 diff
ELO: 0.90 +-3.1 (95%) LOS: 71.1%
Total: 19369 W: 4050 L: 4000 D: 11319
20000 @ 15+0.05 th 1 confirm scaling regression. 1 thread.
15-03-07 lbr 2eec7103 diff
ELO: 1.58 +-3.2 (95%) LOS: 83.5%
Total: 17564 W: 3409 L: 3329 D: 10826
20000 @ 15+0.05 th 3 confirm scaling regression. 3 threads.
15-03-07 lbr 27a18772 diff
ELO: 8.50 +-3.3 (95%) LOS: 100.0%
Total: 20000 W: 4961 L: 4472 D: 10567
20000 @ 9+0.03 th 1 bisect 766fb9c6..27a18772 1 thread
15-03-07 lbr 27a18772 diff
ELO: 10.48 +-3.1 (95%) LOS: 100.0%
Total: 20000 W: 4586 L: 3983 D: 11431
20000 @ 9+0.03 th 3 bisect 766fb9c6..27a18772 3 threads
15-03-07 lbr 27a18772 diff
ELO: 1.84 +-3.0 (95%) LOS: 88.2%
Total: 20000 W: 4066 L: 3960 D: 11974
20000 @ 9+0.03 th 7 bisect 766fb9c6..27a18772 7 threads
15-03-07 vin time_predictor diff
330/25000 iterations
670/50000 games played
50000 @ 30+0.05 th 1 Move to the second possible metric, since overall node growth was not good. This time tune against average width.
15-03-07 jki nolocks2 diff
ELO: 2.78 +-5.5 (95%) LOS: 83.8%
Total: 5000 W: 841 L: 801 D: 3358
20000 @ 15+0.05 th 16 Retire global locks, more compact implementation
15-03-07 vin time_predictor diff
567/25000 iterations
397/50000 games played
50000 @ 30+0.05 th 1 Move to the second possible metric, since overall node growth was not good. This time tune against average width. (Stopped previous run owing to UCI option parsing bug)
15-03-07 sni yielding_spinlock diff
ELO: 136.48 +-5.8 (95%) LOS: 100.0%
Total: 6073 W: 2566 L: 296 D: 3211
20000 @ 15+0.05 th 7 Estimate the value of using yielding spinlocks instead of mutexes for 7 threads (test is between C++11 compile with yielding spinlocks and C++11 compile with mutexes)
15-03-07 sni yielding_spinlock diff
ELO: -3.27 +-2.8 (95%) LOS: 1.1%
Total: 20000 W: 3298 L: 3486 D: 13216
20000 @ 15+0.05 th 7 Estimate the value of using yielding spinlocks instead of normal spinlocks for 7 threads (test is between C++11 compile with yielding spinlocks and C++11 compile with normal spinlocks)
15-03-07 jos no_is_draw diff
ELO: -1.70 +-3.0 (95%) LOS: 13.4%
Total: 20000 W: 3877 L: 3975 D: 12148
20000 @ 15+0.05 th 1 Only check for a draw when called from main search. (Idea from Crafty) Gives a nice boost in nps, but we also lose some of the pruning effect here.
15-03-07 vin time_predictor diff
582/25000 iterations
968/50000 games played
50000 @ 30+0.05 th 1 Last SPSA run was converging to nonsense values, so clearly something is not right. Constrain the action of the metric so it can work alongside BestMoveChanges.
15-03-07 vin measure_BMC diff
ELO: -21.01 +-3.1 (95%) LOS: 0.0%
Total: 20000 W: 3505 L: 4713 D: 11782
20000 @ 15+0.05 th 1 Measure scalability of BestMoveChanges. Part 1. Measure benefit of BMC for one thread.
15-03-07 vin measure_BMC diff
ELO: -21.60 +-2.9 (95%) LOS: 0.0%
Total: 20000 W: 3038 L: 4280 D: 12682
20000 @ 15+0.05 th 3 Measure scalability of BestMoveChanges. Part 2 - measure benefit with 3 threads. Priority -1.
15-03-07 jki nolocks3 diff
ELO: 2.48 +-3.1 (95%) LOS: 94.3%
Total: 16079 W: 2710 L: 2595 D: 10774
20000 @ 15+0.05 th 16 Retire global lock, version 3. Including Marco's clean up.
15-03-07 lbr d6613b7 diff
ELO: 4.90 +-3.3 (95%) LOS: 99.8%
Total: 20000 W: 4858 L: 4576 D: 10566
20000 @ 9+0.03 th 1 bisecting. 1 thread
15-03-07 lbr d6613b7 diff
ELO: 5.09 +-3.1 (95%) LOS: 99.9%
Total: 20000 W: 4377 L: 4084 D: 11539
20000 @ 9+0.03 th 3 bisecting. 3 threads
15-03-07 lbr d6613b7 diff
ELO: 1.62 +-3.1 (95%) LOS: 84.7%
Total: 20000 W: 4183 L: 4090 D: 11727
20000 @ 9+0.03 th 7 bisecting. 7 threads
15-03-08 vin measure_BMC diff
ELO: -17.60 +-2.8 (95%) LOS: 0.0%
Total: 20000 W: 2915 L: 3927 D: 13158
20000 @ 60+0.05 th 1 Measure scalability of BestMoveChanges. Part 3 (and last). Measure benefit of BMC at LTC. Priority -1.
15-03-08 vin time_predictor_b diff
2480/25000 iterations
4999/50000 games played
50000 @ 60+0.05 th 1 Back to the second time predictor metric, after local testing to get a sensible starting point for the parameters. (On previous runs InstabilityMultiplier started far too high). Priority -1.
15-03-08 jos mobility diff
ELO: 0.62 +-2.2 (95%) LOS: 70.9%
Total: 41000 W: 8788 L: 8715 D: 23497
40000 @ 9+0.05 th 1 Measure new values for Mobility.
15-03-08 sg double_history diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 4747 W: 1005 L: 885 D: 2857
sprt @ 15+0.05 th 1 Collect double history data and use them for quiet move ordering. Inspired by http://talkchess.com/forum/viewtopic.php?t=55535
15-03-08 sg double_history diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 5726 W: 1001 L: 872 D: 3853
sprt @ 60+0.05 th 1 LTC: Collect double history data and use them for quiet move ordering. Inspired by http://talkchess.com/forum/viewtopic.php?t=55535
15-03-08 Fis fastMove diff
ELO: 8.22 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4203 L: 3730 D: 12067
20000 @ 15+0.05 th 1 This might need some tuning first but I would like to know where I'm starting from. See commit comments for patch description. Pri -1
15-03-09 Fis fastMove diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 23266 W: 4662 L: 4492 D: 14112
sprt @ 15+0.05 th 1 Fixed-games test looks good, so now running sprt.
15-03-09 jos mobility+outpost3 diff
ELO: 1.82 +-2.6 (95%) LOS: 91.7%
Total: 30000 W: 6507 L: 6350 D: 17143
30000 @ 9+0.05 th 1 Measure combined new mobility and outpost values. Each one showed a positive result, but not enough to pass a sprt. Also tried to somewhat 'normalize' the mobility values, though obviously there is some kind of local maximum for the queen.
15-03-09 jos mobility+outpost3 diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 57644 W: 11805 L: 11431 D: 34408
sprt @ 15+0.05 th 1 Let's see if it passes sprt.
15-03-09 Fis fastMove diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 12470 W: 2091 L: 1931 D: 8448
sprt @ 60+0.05 th 1 Spend much less time on "only" moves detected using a pv stability metric. LTC.
15-03-09 sg double_history diff
ELO: 7.26 +-3.3 (95%) LOS: 100.0%
Total: 14937 W: 2710 L: 2398 D: 9829
20000 @ 15+0.05 th 7 Check for possible regression at multicore. Collect double history data and use them for quiet move ordering. Inspired by http://talkchess.com/forum/viewtopic.php?t=55535
15-03-09 SC oppbis_eval diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 1710 W: 246 L: 407 D: 1057
sprt @ 15+0.05 th 1 Simplify scaling with opposite-colored bishops.
15-03-09 sni yielding_spinlock diff
ELO: -15.10 +-13.7 (95%) LOS: 1.5%
Total: 829 W: 121 L: 157 D: 551
20000 @ 15+0.05 th 7 Estimate the value of using yielding spinlocks against current master (test is between C++11 compile with yielding spinlocks and current master without C++11 branch)
15-03-09 hxi cmh2 diff
ELO: 6.25 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4062 L: 3702 D: 12236
20000 @ 15+0.05 th 1 quick check double history with a) 2 * weight b) used in LMR reduction code
15-03-09 SC oppbis_eval diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 2332 W: 423 L: 591 D: 1318
sprt @ 15+0.05 th 1 Simplify scaling with opposite-colored bishops, bugfix. I somehow got the values completely wrong in the previous test.