Stockfish Testing Queue

Finished - 38773 tests

15-03-07 vin measure_BMC diff
ELO: -21.60 +-2.9 (95%) LOS: 0.0%
Total: 20000 W: 3038 L: 4280 D: 12682
20000 @ 15+0.05 th 3 Measure scalability of BestMoveChanges. Part 2 - measure benefit with 3 threads. Priority -1.
15-03-07 jki nolocks3 diff
ELO: 2.48 +-3.1 (95%) LOS: 94.3%
Total: 16079 W: 2710 L: 2595 D: 10774
20000 @ 15+0.05 th 16 Retire global lock, version 3. Including Marco's cleanup.
15-03-07 lbr d6613b7 diff
ELO: 4.90 +-3.3 (95%) LOS: 99.8%
Total: 20000 W: 4858 L: 4576 D: 10566
20000 @ 9+0.03 th 1 Bisecting. 1 thread
15-03-07 lbr d6613b7 diff
ELO: 5.09 +-3.1 (95%) LOS: 99.9%
Total: 20000 W: 4377 L: 4084 D: 11539
20000 @ 9+0.03 th 3 Bisecting. 3 threads
15-03-07 lbr d6613b7 diff
ELO: 1.62 +-3.1 (95%) LOS: 84.7%
Total: 20000 W: 4183 L: 4090 D: 11727
20000 @ 9+0.03 th 7 Bisecting. 7 threads
15-03-08 vin measure_BMC diff
ELO: -17.60 +-2.8 (95%) LOS: 0.0%
Total: 20000 W: 2915 L: 3927 D: 13158
20000 @ 60+0.05 th 1 Measure scalability of BestMoveChanges. Part 3 (and last). Measure benefit of BMC at LTC. Priority -1.
15-03-08 vin time_predictor_b diff
2480/25000 iterations
4999/50000 games played
50000 @ 60+0.05 th 1 Back to the second time predictor metric, after local testing to get a sensible starting point for the parameters. (On previous runs InstabilityMultiplier started far too high). Priority -1.
15-03-08 jos mobility diff
ELO: 0.62 +-2.2 (95%) LOS: 70.9%
Total: 41000 W: 8788 L: 8715 D: 23497
40000 @ 9+0.05 th 1 Measure new values for Mobility.
15-03-08 sg double_history diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 4747 W: 1005 L: 885 D: 2857
sprt @ 15+0.05 th 1 Collect double history data and use them for quiet move ordering. Inspired by http://talkchess.com/forum/viewtopic.php?t=55535
15-03-08 sg double_history diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 5726 W: 1001 L: 872 D: 3853
sprt @ 60+0.05 th 1 LTC: Collect double history data and use them for quiet move ordering. Inspired by http://talkchess.com/forum/viewtopic.php?t=55535
15-03-08 Fis fastMove diff
ELO: 8.22 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4203 L: 3730 D: 12067
20000 @ 15+0.05 th 1 This might need some tuning first but I would like to know where I'm starting from. See commit comments for patch description. Pri -1
15-03-09 Fis fastMove diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 23266 W: 4662 L: 4492 D: 14112
sprt @ 15+0.05 th 1 Fixed games look good, so SPRT.
15-03-09 jos mobility+outpost3 diff
ELO: 1.82 +-2.6 (95%) LOS: 91.7%
Total: 30000 W: 6507 L: 6350 D: 17143
30000 @ 9+0.05 th 1 Measure combined new mobility and outpost values. Each one showed a positive result, but not enough to pass a sprt. Also tried to somewhat 'normalize' the mobility values, though obviously there is some kind of local maximum for the queen.
15-03-09 jos mobility+outpost3 diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 57644 W: 11805 L: 11431 D: 34408
sprt @ 15+0.05 th 1 Let's see if it passes sprt.
15-03-09 Fis fastMove diff
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 12470 W: 2091 L: 1931 D: 8448
sprt @ 60+0.05 th 1 Spend much less time on "only" moves detected using a pv stability metric. LTC.
15-03-09 sg double_history diff
ELO: 7.26 +-3.3 (95%) LOS: 100.0%
Total: 14937 W: 2710 L: 2398 D: 9829
20000 @ 15+0.05 th 7 Check for possible regression at multicore. Collect double history data and use them for quiet move ordering. Inspired by http://talkchess.com/forum/viewtopic.php?t=55535
15-03-09 SC oppbis_eval diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 1710 W: 246 L: 407 D: 1057
sprt @ 15+0.05 th 1 Simplify scaling with opposite-colored bishops.
15-03-09 sni yielding_spinlock diff
ELO: -15.10 +-13.7 (95%) LOS: 1.5%
Total: 829 W: 121 L: 157 D: 551
20000 @ 15+0.05 th 7 Estimate the value of using yielding spinlocks against current master (test is between C++11 compile with yielding spinlocks and current master without C++11 branch)
15-03-09 hxi cmh2 diff
ELO: 6.25 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4062 L: 3702 D: 12236
20000 @ 15+0.05 th 1 quick check double history with a) 2 * weight b) used in LMR reduction code
15-03-09 SC oppbis_eval diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 2332 W: 423 L: 591 D: 1318
sprt @ 15+0.05 th 1 Simplify scaling with opposite-colored bishops, bugfix. I somehow got the values completely wrong in the previous test.
15-03-09 hxi cmh2 diff
ELO: 2.50 +-3.1 (95%) LOS: 94.5%
Total: 20000 W: 4118 L: 3974 D: 11908
20000 @ 15+0.05 th 1 quick check double history with a) 1 * weight b) used in LMR reduction code
15-03-09 hxi cmh2 diff
ELO: 6.08 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4052 L: 3702 D: 12246
20000 @ 15+0.05 th 1 quick check double history with a) 2 * weight b) not used in LMR reduction code
15-03-09 hxi cmh2 diff
ELO: 6.52 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4190 L: 3815 D: 11995
20000 @ 15+0.05 th 1 quick check double history with a) 1 * weight b) not used in LMR reduction code
15-03-09 jos mobility+outpost3 diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 69138 W: 11698 L: 11323 D: 46117
sprt @ 60+0.05 th 1 LTC: new mobility + outpost values.
15-03-10 vin pawn_tune diff
24463/25000 iterations
50000/50000 games played
50000 @ 15+0.05 th 1 Now that all the pawn tweaking has subsided, try re-tuning the base value of the pawn. All the extra bonuses may have affected the pawns v piece imbalance assessment.
15-03-10 mco c++11_official diff
ELO: 2.67 +-5.8 (95%) LOS: 81.5%
Total: 4679 W: 825 L: 789 D: 3065
20000 @ 15+0.05 th 16 C++11 native Win32 vs master: 16 threads
15-03-10 mco c++11_official diff
ELO: 1.67 +-2.8 (95%) LOS: 87.7%
Total: 20000 W: 3465 L: 3369 D: 13166
20000 @ 15+0.05 th 4 C++11 native Win32 vs master: 4 threads
15-03-10 sg king_shelter diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 11233 W: 2181 L: 2257 D: 6795
sprt @ 15+0.05 th 1 Add penalty if king shelter pawns are attackable by enemy pawns
15-03-10 sg king_shelter diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 7048 W: 1340 L: 1427 D: 4281
sprt @ 15+0.05 th 1 Try the other direction. Reduce penalty if king shelter pawns are not attackable by enemy pawns
15-03-11 sg double_history diff
ELO: 0.31 +-2.2 (95%) LOS: 60.8%
Total: 36695 W: 7330 L: 7297 D: 22068
40000 @ 15+0.05 th 1 I noted an NPS slowdown for the simplified version, perhaps the result of a combination of compiler and hardware. To be safe, test the simplified version against the original version.
15-03-11 vin tuned_pawn diff
ELO: -2.20 +-2.9 (95%) LOS: 6.9%
Total: 22000 W: 4306 L: 4445 D: 13249
20000 @ 15+0.05 th 1 Quick measure of tuned pawn values from SPSA run.
15-03-11 vin blocker2 diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 44719 W: 9028 L: 8798 D: 26893
sprt @ 15+0.05 th 1 2nd try at an idea that passed STC but not LTC last time. But instead of complicating it, try making it simpler instead. Local STC test of 3000 games looked promising.
15-03-11 sni yielding_spinlock2 diff
ELO: 6.02 +-3.0 (95%) LOS: 100.0%
Total: 16672 W: 2929 L: 2640 D: 11103
20000 @ 15+0.05 th 7 Test yielding spinlocks against current master
15-03-11 sni adaptive_mutex2 diff
ELO: 3.74 +-2.9 (95%) LOS: 99.4%
Total: 18000 W: 3139 L: 2945 D: 11916
20000 @ 15+0.05 th 7 Test adaptive mutexes against current master
15-03-11 vin blocker2 diff
LLR: -3.09 (-2.94,2.94) [0.00,6.00]
Total: 16836 W: 2756 L: 2783 D: 11297
sprt @ 60+0.05 th 1 Retest at LTC after STC pass.
15-03-11 jki nolocks4 diff
ELO: 0.49 +-5.7 (95%) LOS: 56.6%
Total: 5000 W: 897 L: 890 D: 3213
5000 @ 15+0.05 th 16 Test nolocks branch against c++11 master branch (this is only to catch serious regression or crash)
15-03-12 Fis fastMove diff
ELO: 3.02 +-3.0 (95%) LOS: 97.5%
Total: 20000 W: 4015 L: 3841 D: 12144
20000 @ 15+0.05 th 1 Quick sanity test to prove that merge was fine as suggested by Joona.
15-03-12 vin blocker2 diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 16226 W: 3102 L: 3164 D: 9960
sprt @ 15+0.05 th 1 Last try - perhaps just the rook counts (remembering it was the only one that moved positively in the SPSA) as a sort of offset to "on a half-open file" bonus
15-03-12 vin doubled_passer_2 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10783 W: 2093 L: 2170 D: 6520
sprt @ 15+0.05 th 1 2nd go at this idea. Since (accidentally) increasing the endgame penalty made the patch fail STC instead of pass, try removing it and increasing the midgame penalty.
15-03-12 sg update_stats diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 5175 W: 983 L: 1076 D: 3116
sprt @ 15+0.05 th 1 Allow MOVE_NONE (should help move ordering at root) and MOVE_NULL (should help move ordering in null move pruning) as the previous move in counter moves and counter history stats updates. For these moves, NO_PIECE is always used as the piece, to separate them from other moves like Ra1 and thereby avoid noise. Local tests seem promising.
15-03-12 sg spsa_cmh diff
24172/25000 iterations
48368/50000 games played
50000 @ 15+0.05 th 1 Tune counter move history weight
15-03-12 jos mobility+outpost3 diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 82485 W: 16388 L: 16247 D: 49850
sprt @ 15+0.05 th 1 Retest with new interpolated mobility values, as suggested by Joona.
15-03-12 mbo no_countermoves_for_sor diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 40107 W: 7765 L: 7998 D: 24344
sprt @ 15+0.05 th 1 Do not use counter moves for sorting. Test as simplification.
15-03-12 mbo remove_followupmoves diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 34891 W: 6904 L: 6808 D: 21179
sprt @ 15+0.05 th 1 Remove followup moves. Test as simplification.
15-03-12 mbo remove_followupmoves_no diff
LLR: -1.50 (-2.94,2.94) [-3.00,1.00]
Total: 107206 W: 21087 L: 21364 D: 64755
sprt @ 15+0.05 th 1 Removes followupmoves, and does not use countermoves for sorting. Test as simplification.
15-03-12 vin blocker2 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6007 W: 1133 L: 1223 D: 3651
sprt @ 15+0.05 th 1 Hmm.. so Q+R together have passed STC SPRT twice. But R alone failed STC. So perhaps it was the Q all along?
15-03-12 hxi fmh diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 72978 W: 14476 L: 14383 D: 44119
sprt @ 15+0.05 th 1 try Followup Move History
15-03-12 hxi cmh3 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 20275 W: 3892 L: 3944 D: 12439
sprt @ 15+0.05 th 1 try Counter Move History max 4 times higher
15-03-12 jki spinl diff
ELO: 3.54 +-2.9 (95%) LOS: 99.2%
Total: 17971 W: 2976 L: 2793 D: 12202
20000 @ 15+0.05 th 7 Yielding spinlocks. Retest with the latest updates.
15-03-13 sg tuned_cmh diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 17233 W: 3267 L: 3364 D: 10602
sprt @ 15+0.05 th 1 The tuned counter move history weight seems small, but let's see.
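
The ELO, +- and LOS figures on the fixed-game entries above can be reproduced from the W/L/D totals. The following is a minimal sketch, not taken from the testing framework's source: it assumes the logistic Elo model, a normal approximation for the 95% margin, and the common erf-based likelihood-of-superiority formula. The function names are hypothetical.

```python
import math


def elo_from_score(score):
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, margin) where margin is the half-width of the ~95% interval."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    half = z * math.sqrt(var / n)  # half-width of the interval in score units
    elo = elo_from_score(score)
    margin = (elo_from_score(score + half) - elo_from_score(score - half)) / 2.0
    return elo, margin


def los(wins, losses):
    """Likelihood of superiority; draws carry no information here (assumed formula)."""
    return 0.5 + 0.5 * math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses)))


if __name__ == "__main__":
    # First measure_BMC entry: W: 3038 L: 4280 D: 12682
    elo, margin = elo_with_margin(3038, 4280, 12682)
    print(f"ELO: {elo:.2f} +-{margin:.1f}")  # listed as ELO: -21.60 +-2.9

    # nolocks3 entry: W: 2710 L: 2595
    print(f"LOS: {100.0 * los(2710, 2595):.1f}%")  # listed as LOS: 94.3%
```

The sprt entries report a log-likelihood ratio (LLR) against the bounds in parentheses instead; that calculation depends on the Elo model used by the framework and is not covered by this sketch.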