Stockfish Testing Queue

Finished - 50774 tests

15-10-07 mco lazy_smp diff
LLR: 0.19 (-2.94,2.94) [-3.00,1.00]
Total: 32878 W: 5500 L: 5543 D: 21835
sprt @ 15+0.05 th 3 Test for no regression: locking simplification
15-10-07 Roc SkipWhenPawnThreats diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 5664 W: 997 L: 1098 D: 3569
sprt @ 15+0.05 th 1 Take 1
15-10-06 aji smp_hybrid3 diff
LLR: 1.30 (-2.94,2.94) [0.00,5.00]
Total: 4679 W: 725 L: 655 D: 3299
sprt @ 15+0.05 th 7 Hybrid of lazy and ybw search: STC
15-10-07 Mys RCC diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15261 W: 2343 L: 2214 D: 10704
sprt @ 60+0.05 th 1 Re-run of a passed simplification to ensure no surprises after several evaluation patches, as suggested here (but using stricter bounds rather than [0, 4]): https://github.com/official-stockfish/Stockfish/pull/442
15-10-06 Fis lazy_smp diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23568 W: 3859 L: 4051 D: 15658
sprt @ 15+0.05 th 3 It should be better to let all threads increment BestMoveChanges. Also simplification. (Testing against fixed easy move.)
15-10-06 Voy HR diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 29135 W: 5368 L: 5368 D: 18399
sprt @ 15+0.05 th 1 Take 3: Update just Fail Low refutation bonus for history.
15-10-05 jhe lazy_smp2 diff
ELO: -4.13 +-3.5 (95%) LOS: 1.1%
Total: 10000 W: 1296 L: 1415 D: 7289
10000 @ 60+0.10 th 3 3 thread LTC test for mbootsector's Lazy SMP 2.
15-10-06 Fis Tempo diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 21797 W: 3998 L: 4080 D: 13719
sprt @ 15+0.05 th 1 Based on SF eval of root position, +18cp (46 internal) at depth 42, it seems tempo should be much higher. Take 2 using 22. (Take 1 priority dropped to -1)
15-10-06 Voy HR diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 7117 W: 1243 L: 1338 D: 4536
sprt @ 15+0.05 th 1 Take 2: Update just Fail High refutation bonus for history.
15-10-06 aji smp_hybrid2 diff
Pending...
sprt @ 15+0.05 th 7 Second attempt at lazy smp + ybw smp hybrid: STC
15-10-06 Fis lazy_smp diff
ELO: -3.51 +-3.9 (95%) LOS: 4.0%
Total: 10000 W: 1610 L: 1711 D: 6679
10000 @ 15+0.05 th 3 latest lazy_smp w/ easy move fix
15-10-06 aji smp_hybrid diff
Pending...
sprt @ 15+0.05 th 7 A first-cut attempt at a hybrid between traditional and lazy smp: STC
15-10-06 Voy HR diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 28184 W: 5088 L: 5093 D: 18003
sprt @ 15+0.05 th 1 Update History for Fail Low/High Refutation bonus.
15-10-06 aji widen diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 4366 W: 728 L: 835 D: 2803
sprt @ 15+0.05 th 2 Try to widen search at 2 threads: STC
15-10-06 Voy AgeStatRevisit diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 4943 W: 850 L: 954 D: 3139
sprt @ 15+0.05 th 1 Try 50% decay...since we are now dealing with bigger numbers.
15-10-06 Roc JustHanging diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 11525 W: 2107 L: 2183 D: 7235
sprt @ 15+0.05 th 1 Fixed some duplication
15-10-06 aji widen diff
LLR: -0.07 (-2.94,2.94) [0.00,5.00]
Total: 25 W: 5 L: 8 D: 12
sprt @ 15+0.05 th 2 See if there is any value in widening search at 2 threads: STC
15-10-06 aji widen diff
LLR: 0.69 (-2.94,2.94) [0.00,5.00]
Total: 1534 W: 303 L: 267 D: 964
sprt @ 15+0.05 th 2 See if there is any value in widening search at 2 threads: STC
15-10-06 sni 4men_probe_in_qsearch diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 9031 W: 1318 L: 1406 D: 6307
sprt @ 60+0.05 th 1 4-Syzygy vs 4-Syzygy: test the effect of probing the 4 men tables in qsearch at LTC. Lower throughput.
15-10-05 mbo lazy_smp2 diff
ELO: 31.92 +-8.6 (95%) LOS: 100.0%
Total: 1430 W: 232 L: 101 D: 1097
5000 @ 120+0.1 th 23 New version of Lazy SMP. 23 threads. VLTC, just because the queue is almost empty. 120+0.1 is twice the time of the previous test, but short enough not to cause timeouts in the worker.
15-10-06 IIv lazy_smp diff
LLR: -1.11 (-2.94,2.94) [0.00,5.00]
Total: 9300 W: 1478 L: 1488 D: 6334
sprt @ 15+0.05 th 4 My variant of lazy_smp2. Standard test, 4 threads.
15-10-06 Voy FHR diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 4204 W: 680 L: 787 D: 2737
sprt @ 15+0.05 th 1 Try out an idea to improve Fail High Refutation bonus logic.
15-10-05 pec master diff
ELO: 44.23 +-1.9 (95%) LOS: 100.0%
Total: 40000 W: 8755 L: 3690 D: 27555
40000 @ 60+0.05 th 1 Regression test until 83e19f, as the framework is empty anyway and the test has not been run for a long time. Lower throughput.
15-10-05 IIv reduction_tune diff
9866/10000 iterations
20000/20000 games played
20000 @ 30+0.05 th 1 Tuning moves 7-12, session 2.
15-10-05 Voy AgeStatRevisit diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 18766 W: 3384 L: 3429 D: 11953
sprt @ 15+0.05 th 1 Since the way we update stats has dramatically changed, let's try aging the stats again. Decay of 25%. Based off passed YellowCombo. (Fix Bench)
15-10-05 Roc JustHanging diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 14455 W: 2596 L: 2660 D: 9199
sprt @ 15+0.05 th 1 take 1b with 66% bonus
15-10-05 Roc JustHanging diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 18273 W: 3310 L: 3357 D: 11606
sprt @ 15+0.05 th 1 Respin of take #1, which had a wrong bench
15-10-05 Roc Hanging2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 6380 W: 1150 L: 1248 D: 3982
sprt @ 15+0.05 th 1 Take 3
15-10-05 Roc Hanging2 diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 5111 W: 874 L: 978 D: 3259
sprt @ 15+0.05 th 1 take 2
15-10-05 Voy Simple diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 37702 W: 6995 L: 6903 D: 23804
sprt @ 15+0.05 th 1 Simplification of recent passed test (YellowCombo)...hoping that this may get a bit of ELO as well.
15-10-05 Mys RT diff
LLR: -0.02 (-2.94,2.94) [0.00,5.00]
Total: 648 W: 122 L: 120 D: 406
sprt @ 15+0.05 th 1 Ranks & threats ?
15-10-05 Roc tune_check diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 21150 W: 3825 L: 3860 D: 13465
sprt @ 15+0.05 th 1 Verifying the new values after 40M and ck=4
15-10-04 Roc tune_check diff
19609/20000 iterations
39769/40000 games played
40000 @ 30+0.05 th 1 Tuning the new check bonus trying with ck=8 instead of 4 (default was 2.5)
15-10-04 sg new_history diff
LLR: -3.76 (-2.94,2.94) [0.00,5.00]
Total: 51839 W: 9511 L: 9448 D: 32880
sprt @ 15+0.05 th 1 First attempt was neutral. So double up weight at move ordering.
15-10-04 SC scale_factor_tunable diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 36089 W: 5589 L: 5331 D: 25169
sprt @ 60+0.05 th 1 Values after 183k iterations. Let us see. LTC.
15-10-04 sni 4men_probe_in_qsearch diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 37998 W: 6911 L: 6874 D: 24213
sprt @ 15+0.05 th 1 4-Syzygy vs 4-Syzygy: test the effect of probing the 4 men tables in qsearch.
15-10-04 Voy YellowCombo diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 46036 W: 7046 L: 6756 D: 32234
sprt @ 60+0.05 th 1 LTC: http://tests.stockfishchess.org/tests/view/560c959f0ebc597e4f23e409 , http://tests.stockfishchess.org/tests/view/560a1ae60ebc597e4f23e36e
15-10-04 Mys AU diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10038 W: 1733 L: 1816 D: 6489
sprt @ 15+0.05 th 1 Fixed bench
15-10-04 jos lazy_smp2 diff
LLR: -0.01 (-2.94,2.94) [-2.00,5.00]
Total: 148 W: 16 L: 16 D: 116
sprt @ 180+2 th 7 Lazy SMP. 7 Threads XLTC. This should already give a good hint about scalability. Resubmitted as sprt[-2, 5] test, so that more machines are able to participate. Test should stop if it turns out to be much weaker or stronger, otherwise we can stop after 5,000 or 10,000 games manually.
15-10-04 Mys QCC diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 3160 W: 536 L: 648 D: 1976
sprt @ 15+0.05 th 1 Larger bonus for Queen contact checks where the King is on the edge of the board.
15-10-04 sni good_knight4 diff
LLR: -1.66 (-2.94,2.94) [0.00,5.00]
Total: 3823 W: 675 L: 729 D: 2419
sprt @ 15+0.05 th 1 Take 4, bonus=S(0,5)
15-10-04 Roc tune_check diff
19070/20000 iterations
40000/40000 games played
40000 @ 30+0.05 th 1 Tuning the new check bonus
15-10-04 sg new_history diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 36106 W: 6494 L: 6466 D: 23146
sprt @ 15+0.05 th 1 Introduce new history table based on from square.
15-10-04 Voy YellowCombo diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 21802 W: 4107 L: 3887 D: 13808
sprt @ 15+0.05 th 1 http://tests.stockfishchess.org/tests/view/560c959f0ebc597e4f23e409 , http://tests.stockfishchess.org/tests/view/560a1ae60ebc597e4f23e36e
15-10-04 sni bad_knight diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 3954 W: 661 L: 770 D: 2523
sprt @ 15+0.05 th 1 Bad knight
15-10-04 Roc UnprotectedPhalanx diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 11908 W: 2152 L: 2226 D: 7530
sprt @ 15+0.05 th 1 UP_20151003_1
15-10-03 Roc SafeSentry diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 14676 W: 2609 L: 2672 D: 9395
sprt @ 15+0.05 th 1 Fixed bench
15-10-03 Voy BalanceStatFA diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 11836 W: 2106 L: 2181 D: 7549
sprt @ 15+0.05 th 1 One last shot...prior test gave a good clue what's going on. I think this version will work.
15-10-03 Roc SemiBackward2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 5227 W: 908 L: 1011 D: 3308
sprt @ 15+0.05 th 1 Fixed array index
15-10-03 sg checked diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 63017 W: 11448 L: 11389 D: 40180
sprt @ 15+0.05 th 1 OK, the patch has an effect on the endgame. Now try the opposite and double up the endgame score.
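
The ELO, "+-" and LOS figures on the fixed-game entries, and the (-2.94,2.94) LLR limits on the sprt entries, can be approximated from the W/L/D counts with the usual formulas. The following is a minimal sketch, not the framework's own code: the function names are hypothetical, the logistic Elo model and normal approximation are assumed, and alpha = beta = 0.05 is an assumption chosen because it gives limits of about +-2.94.

```python
import math


def _score_to_elo(score):
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_from_wld(wins, losses, draws):
    """Point estimate of the Elo difference from win/loss/draw counts."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return _score_to_elo(score)


def elo_error_95(wins, losses, draws):
    """Half-width of an approximate 95% interval around the Elo estimate."""
    games = wins + losses + draws
    w, l, d = wins / games, losses / games, draws / games
    mu = w + 0.5 * d
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = w * (1.0 - mu) ** 2 + l * (0.0 - mu) ** 2 + d * (0.5 - mu) ** 2
    half = 1.96 * math.sqrt(var / games)
    return (_score_to_elo(mu + half) - _score_to_elo(mu - half)) / 2.0


def los(wins, losses):
    """Likelihood of superiority (normal approximation); draws do not enter."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower and upper LLR stopping limits for an SPRT (assumed error rates)."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)


if __name__ == "__main__":
    # Counts taken from the "jhe lazy_smp2" entry above.
    w, l, d = 1296, 1415, 7289
    print(f"ELO: {elo_from_wld(w, l, d):.2f} +-{elo_error_95(w, l, d):.1f} "
          f"LOS: {100 * los(w, l):.1f}%")
    low, high = sprt_bounds()
    print(f"LLR limits: ({low:.2f},{high:.2f})")
```

Under these assumptions, an sprt test stops as failed once its LLR falls to the lower limit and as passed once it reaches the upper limit. The bracketed pair such as [0.00,5.00] gives the two Elo hypotheses being compared and is not computed here.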