Stockfish Testing Queue

Finished - 2606 tests

15-08-17 An researchStatBonus4 diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 151824 W: 25179 L: 24459 D: 102186
sprt @ 10+0.1 th 1 STC Take 4. Scale with R value from LMR instead of node depth.
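
The pair `(-2.94,2.94)` shown next to each LLR value is consistent with Wald's SPRT stopping bounds when both error rates are 5%. The page does not state these rates, so `alpha = beta = 0.05` is an assumption here, and the function name is purely illustrative. A minimal sketch:

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's approximate SPRT stopping bounds on the log-likelihood ratio.

    alpha: assumed probability of accepting H1 when H0 is true.
    beta:  assumed probability of accepting H0 when H1 is true.
    """
    lower = math.log(beta / (1.0 - alpha))   # stop and accept H0 at or below this
    upper = math.log((1.0 - beta) / alpha)   # stop and accept H1 at or above this
    return lower, upper

lower, upper = sprt_bounds()
print(f"({lower:.2f},{upper:.2f})")
```

Under this reading, a test stops once its LLR leaves the interval, which is why most finished entries report an LLR just beyond one of the two bounds.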
16-08-17 Vo lmrQSv diff
LLR: 3.56 (-2.94,2.94) [0.00,5.00]
Total: 40696 W: 7416 L: 7094 D: 26186
sprt @ 10+0.1 th 1 Different variant... (fixed patch)
16-08-17 pb smp_pickbest3 diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 4878 W: 940 L: 795 D: 3143
sprt @ 5+0.05 th 3 Trying to further improve the pickbest logic by comparing nominalDepth (= the rootDepth effectively used for the best move) instead of completedDepth.
15-08-17 sg update_stats_mcp2 diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 102267 W: 18660 L: 18101 D: 65506
sprt @ 10+0.1 th 1 Only for depth < 8
15-08-17 lb stats16 diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 31489 W: 4532 L: 4295 D: 22662
sprt @ 40+0.4 th 1 16bit stats, 3rd test. Verify that we really have an Elo gain with strong hash pressure. This time use a longer time control and a larger hash, to ensure that the hash size is big w.r.t. CPU cache sizes and that the gain is not an artificial effect that only works with microscopic hash sizes but doesn't scale.
15-08-17 sg update_stats_mcp diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 27488 W: 5003 L: 4763 D: 17722
sprt @ 10+0.1 th 1 Even more stats bonus
14-08-17 mc fix diff
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 14436 W: 2304 L: 2196 D: 9936
sprt @ 10+0.1 th 3 Fix regression introduced by "Thread code reformat".
14-08-17 Vo lmrQS diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 14529 W: 2682 L: 2496 D: 9351
sprt @ 10+0.1 th 1 stc
14-08-17 mc ord^ diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17970 W: 2982 L: 2857 D: 12131
sprt @ 10+0.1 th 1 Unify captures scoring: take 2
13-08-17 lb stats16 diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 73542 W: 13058 L: 13026 D: 47458
sprt @ 10+0.1 th 1 16bit stats, 2nd test. Now verify no regression with low hash pressure.
09-08-17 lb stats16 diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 258430 W: 46977 L: 45943 D: 165510
sprt @ 10+0.1 th 1 16bit stats: does reducing the memory footprint by 1.2MB translate into a measurable speed-up? Let's find out in 2 tests: (1) Hash=2 (high pressure), (2) Hash=8 (low pressure).
13-08-17 Vo lmrTweak3b diff
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 9630 W: 1802 L: 1636 D: 6192
sprt @ 10+0.1 th 1 stc
10-08-17 vd scoreSimp diff
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 34641 W: 4521 L: 4419 D: 25701
sprt @ 60+0.6 th 1 ltc, take 2
09-08-17 vd scoreSimp diff
LLR: 3.17 (-2.94,2.94) [-3.00,1.00]
Total: 70240 W: 12543 L: 12494 D: 45203
sprt @ 10+0.1 th 1 stc
09-08-17 vd scoreSimp diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 27377 W: 4957 L: 4847 D: 17573
sprt @ 10+0.1 th 1 stc, take 2
05-08-17 sg lmr_mc_depth diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 106936 W: 19421 L: 18844 D: 68671
sprt @ 10+0.1 th 1 Combine with another of my yellow LMR tweaks. Retry with the now-fixed formula (I inserted a term at the wrong place).
07-08-17 Vo lmrTweak3 diff
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 40655 W: 7450 L: 7152 D: 26053
sprt @ 10+0.1 th 1 ver. 3
06-08-17 ia lever_count diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 20536 W: 3771 L: 3559 D: 13206
sprt @ 10+0.1 th 1 malus for having few levers
05-08-17 II tmm_simple' diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 2559 W: 502 L: 368 D: 1689
sprt @ 80/20 th 1 This should be a significant improvement in x/y time controls when x is huge. Test with x=80. It's a non-functional change for x<=50.
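
The LLR figure for an entry can be approximated from its W/L/D counts and the bracketed Elo bounds. The sketch below assumes the bounds are expressed in BayesElo units and that the draw parameter (`drawelo`) is estimated from the sample itself. The page does not state this, and all function names are illustrative. With the counts of the entry above (W: 502, L: 368, D: 1689, bounds [0.00,5.00]) it comes out close to the listed 2.96.

```python
import math

def bayeselo_to_probs(elo, drawelo):
    """Win, loss and draw probabilities under the BayesElo model."""
    p_win = 1.0 / (1.0 + 10.0 ** ((-elo + drawelo) / 400.0))
    p_loss = 1.0 / (1.0 + 10.0 ** ((elo + drawelo) / 400.0))
    return p_win, p_loss, 1.0 - p_win - p_loss

def estimate_drawelo(wins, losses, draws):
    """Estimate the BayesElo draw parameter from observed frequencies."""
    n = wins + losses + draws
    pw, pl = wins / n, losses / n
    return 200.0 * math.log10((1.0 - pl) / pl * (1.0 - pw) / pw)

def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0).

    Requires wins, losses and draws to all be positive.
    """
    drawelo = estimate_drawelo(wins, losses, draws)
    w0, l0, d0 = bayeselo_to_probs(elo0, drawelo)
    w1, l1, d1 = bayeselo_to_probs(elo1, drawelo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))

llr = sprt_llr(502, 368, 1689, elo0=0.0, elo1=5.0)
print(f"LLR: {llr:.2f}")
```

Because the win and loss terms nearly cancel, the result is sensitive to the draw model; a different model (for example plain logistic Elo) would give a different LLR for the same counts.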
03-08-17 El reg_test diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 36987 W: 6792 L: 6510 D: 23685
sprt @ 10+0.1 th 1 Run an SPRT test to check for regression.
03-08-17 II tmm_simple diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 5913 W: 1217 L: 1069 D: 3627
sprt @ 15+0 th 1 For completeness of results, let's also see sudden-death performance.
02-08-17 II tmm_simple diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 75356 W: 9690 L: 9640 D: 56026
sprt @ 60+0.6 th 1 LTC: Take 2: as some unstable machines are losing on time, try to increase Move Overhead. This should be negligible on higher time controls, but I'm not sure for STC.
02-08-17 II tmm_simple diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 25363 W: 4658 L: 4545 D: 16160
sprt @ 10+0.1 th 1 Take 2: as some unstable machines are losing on time, try to increase Move Overhead. This should be negligible on higher time controls, but I'm not sure for STC.
02-08-17 tt fix_opposite_bishops diff
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 24249 W: 4349 L: 4275 D: 15625
sprt @ 10+0.1 th 1 Only affects evaluation of KBBBK, KBBBBK, etc.
01-08-17 II tmm_simple diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 19377 W: 3650 L: 3526 D: 12201
sprt @ 40/10 th 1 Test also movestogo case.
31-07-17 Vo yellowCombo diff
LLR: 2.94 (-2.94,2.94) [0.00,5.00]
Total: 8260 W: 1490 L: 1333 D: 5437
sprt @ 10+0.1 th 1 Combine two yellow LTC patches...
30-07-17 Ro WeakUnopposed diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 32557 W: 5914 L: 5652 D: 20991
sprt @ 10+0.1 th 1 Take 3: Increase also the supported outpost
27-07-17 Vo searchT diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 114130 W: 20626 L: 20085 D: 73419
sprt @ 10+0.1 th 1 stc
26-07-17 sn tweak_asymmetry diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 11199 W: 2088 L: 1916 D: 7195
sprt @ 10+0.1 th 1 Tweak asymmetry measure. Tested on top of the passed "Connected3" patch.
25-07-17 vd mdp diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 46576 W: 5939 L: 5851 D: 34786
sprt @ 60+0.6 th 1 ltc
25-07-17 vd mdp diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 11997 W: 2172 L: 2036 D: 7789
sprt @ 10+0.1 th 1 stc
24-07-17 sn Connected3 diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 48044 W: 6371 L: 6099 D: 35574
sprt @ 60+0.6 th 1 LTC: Tweak connected pawns seed[] array: +5. Tested on top of the passed patch "Connected2" by Alain.
24-07-17 vd qsearchhist diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 24253 W: 4421 L: 4194 D: 15638
sprt @ 10+0.1 th 1 stc, take 3
22-07-17 sn Connected3 diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 99033 W: 17939 L: 17448 D: 63646
sprt @ 10+0.1 th 1 Tweak connected pawns seed[] array: +5. Tested on top of the passed patch "Connected2" by Alain.
17-07-17 pb smp_pickbest2 diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 83333 W: 13769 L: 13320 D: 56244
sprt @ 5+0.05 th 5 Take 2, taking advantage of the empty framework for another try.
21-07-17 Fi master diff
ELO: 8.22 +-2.1 (95%) LOS: 100.0%
Total: 40000 W: 7750 L: 6804 D: 25446
40000 @ 10+0.1 th 1 Let's see what a doubling is worth with TT pressure. To be compared to http://tests.stockfishchess.org/tests/view/5971a2f80ebc5916ff649df0 Half throughput.
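
For fixed-length tests like this one, the `ELO` figure can be recovered from the W/L/D counts with the usual logistic Elo formula. The sketch below also estimates a 95% margin from the per-game score variance. How the page computes its own margin is not stated, so that part is an assumption, and the function names are illustrative.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference for an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, margin); margin is half the width of the z-sigma interval."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, scoring win=1, draw=0.5, loss=0.
    variance = (wins * (1.0 - score) ** 2
                + losses * (0.0 - score) ** 2
                + draws * (0.5 - score) ** 2) / n
    stderr = math.sqrt(variance / n)
    elo = elo_from_score(score)
    high = elo_from_score(score + z * stderr)
    low = elo_from_score(score - z * stderr)
    return elo, (high - low) / 2.0

elo, margin = elo_with_margin(7750, 6804, 25446)
print(f"ELO: {elo:.2f} +-{margin:.1f} (95%)")
```

The high draw count matters here: draws contribute almost nothing to the variance, which is why 40000 games are enough for a margin of about two Elo.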
20-07-17 sg ttRefresh2 diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 99443 W: 12796 L: 12777 D: 73870
sprt @ 60+0.6 th 1 LTC. Instead of a full refresh, increment the generation. Bench is the same but differs for higher depth. (Try as simplification)
20-07-17 Ro Connected2 diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 24721 W: 3306 L: 3191 D: 18224
sprt @ 60+0.6 th 1 LTC
21-07-17 Fi master diff
ELO: 2.30 +-2.0 (95%) LOS: 98.6%
Total: 40000 W: 7376 L: 7111 D: 25513
40000 @ 10+0.1 th 1 Measure how much elo we get from doubling TT size from 4MB to 8MB at STC. See pull #1178
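
The `LOS` (likelihood of superiority) value can be approximated from wins and losses alone using a normal approximation. This is a common formula and is assumed here rather than confirmed as the one the page uses; the function name is illustrative.

```python
import math

def likelihood_of_superiority(wins, losses):
    """Normal approximation to the probability that the tested side is stronger."""
    z = (wins - losses) / math.sqrt(wins + losses)
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

los = likelihood_of_superiority(7376, 7111)
print(f"LOS: {100.0 * los:.1f}%")
```

Draws do not enter this approximation: only the decisive games carry information about which side is stronger.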
21-07-17 Vo hpt3 diff
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 33795 W: 6166 L: 5898 D: 21731
sprt @ 10+0.1 th 1 stc
21-07-17 vd qsearchhist diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 9840 W: 1844 L: 1677 D: 6319
sprt @ 10+0.1 th 1 STC, take 2.
20-07-17 Ro Connected2 diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 19613 W: 3663 L: 3540 D: 12410
sprt @ 10+0.1 th 1 STC is struggling, try a different way
19-07-17 sg ttRefresh2 diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 60012 W: 10753 L: 10698 D: 38561
sprt @ 10+0.1 th 1 Before going to LTC, check first at STC with hash pressure (Hash=1). Instead of a full refresh, increment the generation. Bench is the same but differs for higher depth. (Try as simplification)
19-07-17 sg ttRefresh2 diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 44271 W: 8060 L: 7979 D: 28232
sprt @ 10+0.1 th 1 Instead of a full refresh, increment the generation. Bench is the same but differs for higher depth. (Try as simplification)
18-07-17 Vo lmrT2 diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 15238 W: 2766 L: 2578 D: 9894
sprt @ 10+0.1 th 1 stc
17-07-17 sg ttRefresh diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 71414 W: 9181 L: 9126 D: 53107
sprt @ 60+0.6 th 1 LTC: Is generation refresh during tt probing really useful?
17-07-17 sg ttRefresh diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 18369 W: 3427 L: 3302 D: 11640
sprt @ 10+0.1 th 1 Is generation refresh during tt probing really useful?
13-07-17 Fi PSQT diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 134383 W: 17486 L: 16993 D: 99904
sprt @ 60+0.6 th 1 LTC Take 2
30-06-17 vd deStackMovepick diff
LLR: 3.13 (-2.94,2.94) [-3.00,1.00]
Total: 381053 W: 68071 L: 68551 D: 244431
sprt @ 10+0.1 th 1 test for no regression
12-07-17 Vo aspAlpha diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 87507 W: 11264 L: 11230 D: 65013
sprt @ 60+0.6 th 1 LTC: Don't modify alpha window on fail-high