Stockfish Testing Queue

Finished - 33282 tests

15-03-07 lbr 27a18772 diff
ELO: 1.84 +-3.0 (95%) LOS: 88.2%
Total: 20000 W: 4066 L: 3960 D: 11974
20000 @ 9+0.03 th 7 bisect 766fb9c6..27a18772 7 threads
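The ELO, error margin and LOS figures in fixed-length entries like the one above can be approximated from the W/L/D totals alone. The following is a minimal sketch, assuming the standard logistic Elo model, a normal approximation for the 95% interval, and a normal approximation for LOS that ignores draws. The function name `elo_summary` is made up for illustration, and the page's own formulas may differ in detail. For the entry above it gives an Elo of about 1.84 and an LOS of about 88.2%, with a margin close to the listed +-3.0.

```python
import math

def elo_summary(wins, losses, draws):
    """Return (elo, 95% margin, LOS) from a win/loss/draw record.

    Assumptions (not taken from the page): logistic Elo model,
    normal approximation for the interval and for LOS.
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n  # mean score per game

    def to_elo(s):
        # Logistic model: expected score s <-> rating difference in Elo
        return -400.0 * math.log10(1.0 / s - 1.0)

    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0)
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    half_width = 1.959964 * math.sqrt(var / n)  # 95% interval on the score
    margin = (to_elo(score + half_width) - to_elo(score - half_width)) / 2.0

    # Likelihood of superiority: Phi((W - L) / sqrt(W + L))
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    return to_elo(score), margin, los

# Totals from the first entry: W: 4066 L: 3960 D: 11974
elo, margin, los = elo_summary(4066, 3960, 11974)
print(f"ELO: {elo:.2f} +-{margin:.2f} (95%) LOS: {los * 100:.1f}%")
```

Draws shrink the per-game variance, which is why a 20000-game run with roughly 60% draws resolves to about +-3 Elo.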
15-03-07 vin time_predictor diff
330/25000 iterations
670/50000 games played
50000 @ 30+0.05 th 1 Move to the second possible metric, since overall node growth was not good. This time tune against average width.
15-03-07 jki nolocks2 diff
ELO: 2.78 +-5.5 (95%) LOS: 83.8%
Total: 5000 W: 841 L: 801 D: 3358
20000 @ 15+0.05 th 16 Retire global locks, more compact implementation
15-03-07 vin time_predictor diff
567/25000 iterations
397/50000 games played
50000 @ 30+0.05 th 1 Move to the second possible metric, since overall node growth was not good. This time tune against average width. (Stopped previous run owing to UCI option parsing bug)
15-03-07 sni yielding_spinlock diff
ELO: 136.48 +-5.8 (95%) LOS: 100.0%
Total: 6073 W: 2566 L: 296 D: 3211
20000 @ 15+0.05 th 7 Estimate the value of using yielding spinlocks instead of mutexes for 7 threads (test is between C++11 compile with yielding spinlocks and C++11 compile with mutexes)
15-03-07 sni yielding_spinlock diff
ELO: -3.27 +-2.8 (95%) LOS: 1.1%
Total: 20000 W: 3298 L: 3486 D: 13216
20000 @ 15+0.05 th 7 Estimate the value of using yielding spinlocks instead of normal spinlocks for 7 threads (test is between C++11 compile with yielding spinlocks and C++11 compile with normal spinlocks)
15-03-07 jos no_is_draw diff
ELO: -1.70 +-3.0 (95%) LOS: 13.4%
Total: 20000 W: 3877 L: 3975 D: 12148
20000 @ 15+0.05 th 1 Only check for a draw when called from main search. (Idea from Crafty) Gives a nice boost in nps, but we also lose some of the pruning effect here.
15-03-07 vin time_predictor diff
582/25000 iterations
968/50000 games played
50000 @ 30+0.05 th 1 Last SPSA run was converging to nonsense values, so clearly something is not right. Constrain the action of the metric so it can work alongside BestMoveChanges.
15-03-07 vin measure_BMC diff
ELO: -21.01 +-3.1 (95%) LOS: 0.0%
Total: 20000 W: 3505 L: 4713 D: 11782
20000 @ 15+0.05 th 1 Measure scalability of BestMoveChanges. Part 1. Measure benefit of BMC for one thread.
15-03-07 vin measure_BMC diff
ELO: -21.60 +-2.9 (95%) LOS: 0.0%
Total: 20000 W: 3038 L: 4280 D: 12682
20000 @ 15+0.05 th 3 Measure scalability of BestMoveChanges. Part 2 - measure benefit with 3 threads. Priority -1.
15-03-07 jki nolocks3 diff
ELO: 2.48 +-3.1 (95%) LOS: 94.3%
Total: 16079 W: 2710 L: 2595 D: 10774
20000 @ 15+0.05 th 16 Retire global lock, version 3. Including Marco's clean up.
15-03-07 lbr d6613b7 diff
ELO: 4.90 +-3.3 (95%) LOS: 99.8%
Total: 20000 W: 4858 L: 4576 D: 10566
20000 @ 9+0.03 th 1 bisecting. 1 thread
15-03-07 lbr d6613b7 diff
ELO: 5.09 +-3.1 (95%) LOS: 99.9%
Total: 20000 W: 4377 L: 4084 D: 11539
20000 @ 9+0.03 th 3 bisecting. 3 threads
15-03-07 lbr d6613b7 diff
ELO: 1.62 +-3.1 (95%) LOS: 84.7%
Total: 20000 W: 4183 L: 4090 D: 11727
20000 @ 9+0.03 th 7 bisecting. 7 threads
15-03-08 vin measure_BMC diff
ELO: -17.60 +-2.8 (95%) LOS: 0.0%
Total: 20000 W: 2915 L: 3927 D: 13158
20000 @ 60+0.05 th 1 Measure scalability of BestMoveChanges. Part 3 (and last). Measure benefit of BMC at LTC. Priority -1.
15-03-08 vin time_predictor_b diff
2480/25000 iterations
4999/50000 games played
50000 @ 60+0.05 th 1 Back to the second time predictor metric, after local testing to get a sensible starting point for the parameters. (On previous runs InstabilityMultiplier started far too high). Priority -1.
15-03-08 jos mobility diff
ELO: 0.62 +-2.2 (95%) LOS: 70.9%
Total: 41000 W: 8788 L: 8715 D: 23497
40000 @ 9+0.05 th 1 Measure new values for Mobility.
15-03-08 sg double_history diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 4747 W: 1005 L: 885 D: 2857
sprt @ 15+0.05 th 1 Collect double history data and use them for quiet move ordering. Inspired by http://talkchess.com/forum/viewtopic.php?t=55535
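The `LLR: x (-2.94,2.94) [elo0,elo1]` lines of sprt entries can be read as a sequential probability ratio test: the run stops once the log-likelihood ratio crosses either bound. The following is a sketch under two assumptions that the page does not state: error rates alpha = beta = 0.05 (which yield the +-2.94 bounds), and a BayesElo-style win/loss/draw model with the draw parameter estimated from the sample. All function names are illustrative. With the totals of the entry above and bounds [-1.50,4.50], this model gives an LLR of about 2.96.

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower and upper LLR stopping bounds (alpha, beta are assumed values)."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)

def bayeselo_to_probs(elo, drawelo):
    """Win, loss and draw probabilities under the BayesElo model."""
    p_win = 1.0 / (1.0 + 10.0 ** ((-elo + drawelo) / 400.0))
    p_loss = 1.0 / (1.0 + 10.0 ** ((elo + drawelo) / 400.0))
    return p_win, p_loss, 1.0 - p_win - p_loss

def estimate_drawelo(wins, losses, draws):
    """Estimate the draw parameter from the observed frequencies."""
    n = wins + losses + draws
    pw, pl = wins / n, losses / n
    return 200.0 * math.log10((1.0 - pl) / pl * (1.0 - pw) / pw)

def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    drawelo = estimate_drawelo(wins, losses, draws)
    w0, l0, d0 = bayeselo_to_probs(elo0, drawelo)
    w1, l1, d1 = bayeselo_to_probs(elo1, drawelo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))

lower, upper = sprt_bounds()
# Totals from the entry above: W: 1005 L: 885 D: 2857, bounds [-1.50,4.50]
llr = sprt_llr(1005, 885, 2857, -1.5, 4.5)
verdict = "pass" if llr >= upper else "fail" if llr <= lower else "continue"
print(f"LLR: {llr:.2f} ({lower:.2f},{upper:.2f}) -> {verdict}")
```

A run that ends with a positive LLR just above 2.94 accepted H1, while one ending just below -2.94 accepted H0, which is why finished sprt entries cluster at +-2.95 or so.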
15-03-08 sg double_history diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 5726 W: 1001 L: 872 D: 3853
sprt @ 60+0.05 th 1 LTC: Collect double history data and use them for quiet move ordering. Inspired by http://talkchess.com/forum/viewtopic.php?t=55535
15-03-08 Fis fastMove diff
ELO: 8.22 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4203 L: 3730 D: 12067
20000 @ 15+0.05 th 1 This might need some tuning first but I would like to know where I'm starting from. See commit comments for patch description. Pri -1
15-03-09 Fis fastMove diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 23266 W: 4662 L: 4492 D: 14112
sprt @ 15+0.05 th 1 Fixed-games run looks good, so on to sprt.
15-03-09 jos mobility+outpost3 diff
ELO: 1.82 +-2.6 (95%) LOS: 91.7%
Total: 30000 W: 6507 L: 6350 D: 17143
30000 @ 9+0.05 th 1 Measure combined new mobility and outpost values. Each one showed a positive result, but not enough to pass a sprt. Also tried to somewhat 'normalize' the mobility values, though obviously there is some kind of local maximum for the queen.
15-03-09 jos mobility+outpost3 diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 57644 W: 11805 L: 11431 D: 34408
sprt @ 15+0.05 th 1 Let's see if it passes sprt.
15-03-09 Fis fastMove diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 12470 W: 2091 L: 1931 D: 8448
sprt @ 60+0.05 th 1 Spend much less time on "only" moves detected using a pv stability metric. LTC.
15-03-09 sg double_history diff
ELO: 7.26 +-3.3 (95%) LOS: 100.0%
Total: 14937 W: 2710 L: 2398 D: 9829
20000 @ 15+0.05 th 7 Check for possible regression at multicore. Collect double history data and use them for quiet move ordering. Inspired by http://talkchess.com/forum/viewtopic.php?t=55535
15-03-09 SC oppbis_eval diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 1710 W: 246 L: 407 D: 1057
sprt @ 15+0.05 th 1 Simplify scaling with opposite-colored bishops.
15-03-09 sni yielding_spinlock diff
ELO: -15.10 +-13.7 (95%) LOS: 1.5%
Total: 829 W: 121 L: 157 D: 551
20000 @ 15+0.05 th 7 Estimate the value of using yielding spinlocks against current master (test is between C++11 compile with yielding spinlocks and current master without C++11 branch)
15-03-09 hxi cmh2 diff
ELO: 6.25 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4062 L: 3702 D: 12236
20000 @ 15+0.05 th 1 quick check double history with a) 2 * weight b) used in LMR reduction code
15-03-09 SC oppbis_eval diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 2332 W: 423 L: 591 D: 1318
sprt @ 15+0.05 th 1 Simplify scaling with opposite-colored bishops, bugfix. I somehow got the values completely wrong in the previous test.
15-03-09 hxi cmh2 diff
ELO: 2.50 +-3.1 (95%) LOS: 94.5%
Total: 20000 W: 4118 L: 3974 D: 11908
20000 @ 15+0.05 th 1 quick check double history with a) 1 * weight b) used in LMR reduction code
15-03-09 hxi cmh2 diff
ELO: 6.08 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4052 L: 3702 D: 12246
20000 @ 15+0.05 th 1 quick check double history with a) 2 * weight b) not used in LMR reduction code
15-03-09 hxi cmh2 diff
ELO: 6.52 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4190 L: 3815 D: 11995
20000 @ 15+0.05 th 1 quick check double history with a) 1 * weight b) not used in LMR reduction code
15-03-09 jos mobility+outpost3 diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 69138 W: 11698 L: 11323 D: 46117
sprt @ 60+0.05 th 1 LTC: new mobility + outpost values.
15-03-10 vin pawn_tune diff
24463/25000 iterations
50000/50000 games played
50000 @ 15+0.05 th 1 Now that all the pawn tweaking has subsided, try re-tuning the base value of the pawn. All the extra bonuses may have affected the pawns v piece imbalance assessment.
15-03-10 mco c++11_official diff
ELO: 2.67 +-5.8 (95%) LOS: 81.5%
Total: 4679 W: 825 L: 789 D: 3065
20000 @ 15+0.05 th 16 C++11 native Win32 vs master: 16 threads
15-03-10 mco c++11_official diff
ELO: 1.67 +-2.8 (95%) LOS: 87.7%
Total: 20000 W: 3465 L: 3369 D: 13166
20000 @ 15+0.05 th 4 C++11 native Win32 vs master: 4 threads
15-03-10 sg king_shelter diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 11233 W: 2181 L: 2257 D: 6795
sprt @ 15+0.05 th 1 Add penalty if king shelter pawns are attackable by enemy pawns
15-03-10 sg king_shelter diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 7048 W: 1340 L: 1427 D: 4281
sprt @ 15+0.05 th 1 Try the other direction. Reduce penalty if king shelter pawns are not attackable by enemy pawns
15-03-11 sg double_history diff
ELO: 0.31 +-2.2 (95%) LOS: 60.8%
Total: 36695 W: 7330 L: 7297 D: 22068
40000 @ 15+0.05 th 1 I noted an NPS slowdown for the simplified version, perhaps the result of a particular combination of compiler and hardware. To be safe, test the simplified version against the original.
15-03-11 vin tuned_pawn diff
ELO: -2.20 +-2.9 (95%) LOS: 6.9%
Total: 22000 W: 4306 L: 4445 D: 13249
20000 @ 15+0.05 th 1 Quick measure of tuned pawn values from SPSA run.
15-03-11 vin blocker2 diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 44719 W: 9028 L: 8798 D: 26893
sprt @ 15+0.05 th 1 2nd try at an idea that passed STC but not LTC last time. But instead of complicating it, try making it simpler instead. Local STC test of 3000 games looked promising.
15-03-11 sni yielding_spinlock2 diff
ELO: 6.02 +-3.0 (95%) LOS: 100.0%
Total: 16672 W: 2929 L: 2640 D: 11103
20000 @ 15+0.05 th 7 Test yielding spinlocks against current master
15-03-11 sni adaptive_mutex2 diff
ELO: 3.74 +-2.9 (95%) LOS: 99.4%
Total: 18000 W: 3139 L: 2945 D: 11916
20000 @ 15+0.05 th 7 Test adaptive mutexes against current master
15-03-11 vin blocker2 diff
LLR: -3.09 (-2.94,2.94) [0.00,6.00]
Total: 16836 W: 2756 L: 2783 D: 11297
sprt @ 60+0.05 th 1 Retest at LTC after STC pass.
15-03-11 jki nolocks4 diff
ELO: 0.49 +-5.7 (95%) LOS: 56.6%
Total: 5000 W: 897 L: 890 D: 3213
5000 @ 15+0.05 th 16 Test nolocks branch against c++11 master branch (this is only to catch serious regression or crash)
15-03-12 Fis fastMove diff
ELO: 3.02 +-3.0 (95%) LOS: 97.5%
Total: 20000 W: 4015 L: 3841 D: 12144
20000 @ 15+0.05 th 1 Quick sanity test to prove that merge was fine as suggested by Joona.
15-03-12 vin blocker2 diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 16226 W: 3102 L: 3164 D: 9960
sprt @ 15+0.05 th 1 Last try - perhaps just the rook counts (remembering it was the only one that moved positively in the SPSA) as a sort of offset to "on a half-open file" bonus
15-03-12 vin doubled_passer_2 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10783 W: 2093 L: 2170 D: 6520
sprt @ 15+0.05 th 1 2nd go at this idea. Since (accidentally) increasing the endgame penalty made the patch fail STC instead of pass, try removing it and increasing the midgame penalty.
15-03-12 sg update_stats diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 5175 W: 983 L: 1076 D: 3116
sprt @ 15+0.05 th 1 Allow MOVE_NONE (should help move ordering at root) and MOVE_NULL (should help move ordering in null move pruning) as the previous move in the counter moves and counter history stats update. For these moves, NO_PIECE is always used as the piece, to separate them from other moves like Ra1 and thereby avoid noise. Local tests seem promising.
15-03-12 sg spsa_cmh diff
24172/25000 iterations
48368/50000 games played
50000 @ 15+0.05 th 1 Tune counter move history weight
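The tuning entries report iterations and games played rather than Elo, and the counters suggest roughly two games per iteration, which is consistent with SPSA (simultaneous perturbation stochastic approximation): each iteration perturbs all parameters at once in a random direction, compares the two perturbed settings, and nudges the parameters toward the better one. The following is a generic textbook sketch on a noise-free toy objective, not the tuner used for these runs. The step sizes `a` and `c`, the function names and the target values are all made up for illustration, and a real tuner would replace the objective difference with game results and typically decay the step sizes over time.

```python
import random

def spsa_maximize(objective, theta, iterations, a=0.1, c=0.5, seed=1):
    """Maximize `objective` with constant-gain SPSA (illustrative settings)."""
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(iterations):
        # One random +-1 direction per parameter, drawn simultaneously
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c * d for t, d in zip(theta, delta)]
        minus = [t - c * d for t, d in zip(theta, delta)]
        # Two evaluations per iteration (in engine tuning: a pair of games)
        diff = objective(plus) - objective(minus)
        # Gradient estimate for each parameter is diff / (2 * c * delta_i)
        theta = [t + a * diff / (2.0 * c * d) for t, d in zip(theta, delta)]
    return theta

def toy_objective(params):
    """Hypothetical stand-in for playing strength, peaking at (3, -1)."""
    target = (3.0, -1.0)
    return -sum((p - t) ** 2 for p, t in zip(params, target))

tuned = spsa_maximize(toy_objective, [0.0, 0.0], iterations=200)
print([round(t, 3) for t in tuned])
```

Only two evaluations are needed per iteration regardless of how many parameters are tuned, which is what makes the method practical when every evaluation costs a full game.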