Stockfish Testing Queue

Finished - 27208 tests

06-10-15 aj widen diff
LLR: -0.07 (-2.94,2.94) [0.00,5.00]
Total: 25 W: 5 L: 8 D: 12
sprt @ 15+0.05 th 2 see if there is any value in widening search at 2 threads : STC
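How the LLR figures and the (-2.94,2.94) bounds in these entries might be computed can be sketched as follows. This is a minimal sketch, not necessarily the framework's own code. It assumes a trinomial BayesElo model, that the bracketed values such as [0.00,5.00] are the two hypotheses in BayesElo units, that the draw parameter is estimated from the sample, and that alpha = beta = 0.05. The function names are hypothetical.

```python
import math


def bayeselo_to_proba(elo, drawelo):
    """Win/loss/draw probabilities under the BayesElo model (assumed model)."""
    win = 1.0 / (1.0 + 10.0 ** ((-elo + drawelo) / 400.0))
    loss = 1.0 / (1.0 + 10.0 ** ((elo + drawelo) / 400.0))
    return win, loss, 1.0 - win - loss


def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0).

    Requires wins, losses and draws to all be positive, because the
    draw parameter is estimated from the observed frequencies.
    """
    n = wins + losses + draws
    p_win, p_loss = wins / n, losses / n
    drawelo = 200.0 * math.log10((1 - p_loss) / p_loss * (1 - p_win) / p_win)
    w0, l0, d0 = bayeselo_to_proba(elo0, drawelo)
    w1, l1, d1 = bayeselo_to_proba(elo1, drawelo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))


def sprt_bounds(alpha=0.05, beta=0.05):
    """Stopping bounds for the LLR; alpha and beta are assumed error rates."""
    return math.log(beta / (1 - alpha)), math.log((1 - beta) / alpha)


if __name__ == "__main__":
    # Entry above: Total: 25 W: 5 L: 8 D: 12, hypotheses [0.00,5.00]
    print(round(sprt_llr(5, 8, 12, 0.0, 5.0), 2))
    print([round(b, 2) for b in sprt_bounds()])
```

Under these assumptions a test stops as failed once the LLR drops below the lower bound and as passed once it exceeds the upper bound. For the 25-game entry above, the sketch should land close to the listed LLR of -0.07.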
06-10-15 aj widen diff
LLR: 0.69 (-2.94,2.94) [0.00,5.00]
Total: 1534 W: 303 L: 267 D: 964
sprt @ 15+0.05 th 2 see if there is any value in widening search at 2 threads : STC
06-10-15 sn 4men_probe_in_qsearch diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 9031 W: 1318 L: 1406 D: 6307
sprt @ 60+0.05 th 1 4-Syzygy vs 4-Syzygy: test the effect of probing the 4 men tables in qsearch at LTC. Lower throughput.
05-10-15 mb lazy_smp2 diff
ELO: 31.92 +-8.6 (95%) LOS: 100.0%
Total: 1430 W: 232 L: 101 D: 1097
5000 @ 120+0.1 th 23 New version of Lazy SMP. 23 Threads. VLTC, just because the queue is almost empty. 120+0.1 is twice the time of the previous test, but short enough not to cause timeouts in the worker.
06-10-15 II lazy_smp diff
LLR: -1.11 (-2.94,2.94) [0.00,5.00]
Total: 9300 W: 1478 L: 1488 D: 6334
sprt @ 15+0.05 th 4 My variant of lazy_smp2. Standard test, 4 threads.
06-10-15 Vo FHR diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 4204 W: 680 L: 787 D: 2737
sprt @ 15+0.05 th 1 Try out an idea to improve Fail High Refutation bonus logic.
05-10-15 pe master diff
ELO: 44.23 +-1.9 (95%) LOS: 100.0%
Total: 40000 W: 8755 L: 3690 D: 27555
40000 @ 60+0.05 th 1 Regression test until 83e19f, as the framework is empty anyway and this test has not been run for a long time. Lower throughput.
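The ELO line of a fixed-length entry such as the one above can be recovered from its W/L/D totals. The sketch below assumes the standard logistic Elo formula and a normal approximation for the 95% margin. The function names are hypothetical, and the framework's own error-bar method may differ in detail.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference for an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, margin); z=1.96 assumes a 95% normal interval."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win=1, draw=0.5, loss=0)
    variance = (wins * (1.0 - score) ** 2
                + losses * score ** 2
                + draws * (0.5 - score) ** 2) / n
    stderr = math.sqrt(variance / n)
    elo = elo_from_score(score)
    # Half-width of the interval, mapped from score space to Elo space
    margin = (elo_from_score(score + z * stderr)
              - elo_from_score(score - z * stderr)) / 2.0
    return elo, margin


if __name__ == "__main__":
    # Entry above: Total: 40000 W: 8755 L: 3690 D: 27555
    elo, margin = elo_with_margin(8755, 3690, 27555)
    print(f"ELO: {elo:.2f} +-{margin:.1f} (95%)")
```

Draws shrink the per-game variance, which is why a 40000-game run with a high draw rate yields such a narrow interval.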
05-10-15 II reduction_tune diff
9866/10000 iterations
20000/20000 games played
20000 @ 30+0.05 th 1 Tuning moves 7-12, session 2.
05-10-15 Vo AgeStatRevisit diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 18766 W: 3384 L: 3429 D: 11953
sprt @ 15+0.05 th 1 Since the way we update stats has dramatically changed, let's try aging the stats again. Decay of 25%. Based on the passed YellowCombo. (Fix Bench)
05-10-15 Ro JustHanging diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 14455 W: 2596 L: 2660 D: 9199
sprt @ 15+0.05 th 1 take 1b with 66% bonus
05-10-15 Ro JustHanging diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 18273 W: 3310 L: 3357 D: 11606
sprt @ 15+0.05 th 1 Respin of take # 1 which had a wrong bench
05-10-15 Ro Hanging2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 6380 W: 1150 L: 1248 D: 3982
sprt @ 15+0.05 th 1 Take 3
05-10-15 Ro Hanging2 diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 5111 W: 874 L: 978 D: 3259
sprt @ 15+0.05 th 1 take 2
05-10-15 Vo Simple diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 37702 W: 6995 L: 6903 D: 23804
sprt @ 15+0.05 th 1 Simplification of the recently passed test (YellowCombo), hoping that this may gain a bit of ELO as well.
05-10-15 My RT diff
LLR: -0.02 (-2.94,2.94) [0.00,5.00]
Total: 648 W: 122 L: 120 D: 406
sprt @ 15+0.05 th 1 Ranks & threats ?
05-10-15 Ro tune_check diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 21150 W: 3825 L: 3860 D: 13465
sprt @ 15+0.05 th 1 Verifying the new values after 40M and ck=4
04-10-15 Ro tune_check diff
19609/20000 iterations
39769/40000 games played
40000 @ 30+0.05 th 1 Tuning the new check bonus trying with ck=8 instead of 4 (default was 2.5)
04-10-15 sg new_history diff
LLR: -3.76 (-2.94,2.94) [0.00,5.00]
Total: 51839 W: 9511 L: 9448 D: 32880
sprt @ 15+0.05 th 1 The first attempt was neutral, so double the weight in move ordering.
04-10-15 SC scale_factor_tunable diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 36089 W: 5589 L: 5331 D: 25169
sprt @ 60+0.05 th 1 Values after 183k iterations. Let us see. LTC.
04-10-15 sn 4men_probe_in_qsearch diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 37998 W: 6911 L: 6874 D: 24213
sprt @ 15+0.05 th 1 4-Syzygy vs 4-Syzygy: test the effect of probing the 4 men tables in qsearch.
04-10-15 Vo YellowCombo diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 46036 W: 7046 L: 6756 D: 32234
sprt @ 60+0.05 th 1 LTC: http://tests.stockfishchess.org/tests/view/560c959f0ebc597e4f23e409 , http://tests.stockfishchess.org/tests/view/560a1ae60ebc597e4f23e36e
04-10-15 My AU diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10038 W: 1733 L: 1816 D: 6489
sprt @ 15+0.05 th 1 Fixed bench
04-10-15 jo lazy_smp2 diff
LLR: -0.01 (-2.94,2.94) [-2.00,5.00]
Total: 148 W: 16 L: 16 D: 116
sprt @ 180+2 th 7 Lazy SMP. 7 Threads XLTC. This should already give a good hint about scalability. Resubmitted as sprt[-2, 5] test, so that more machines are able to participate. Test should stop if it turns out to be much weaker or stronger, otherwise we can stop after 5,000 or 10,000 games manually.
04-10-15 My QCC diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 3160 W: 536 L: 648 D: 1976
sprt @ 15+0.05 th 1 Larger bonus for Queen contact checks where the King is on the edge of the board.
04-10-15 sn good_knight4 diff
LLR: -1.66 (-2.94,2.94) [0.00,5.00]
Total: 3823 W: 675 L: 729 D: 2419
sprt @ 15+0.05 th 1 Take 4, bonus=S(0,5)
04-10-15 Ro tune_check diff
19070/20000 iterations
40000/40000 games played
40000 @ 30+0.05 th 1 Tuning the new check bonus
04-10-15 sg new_history diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 36106 W: 6494 L: 6466 D: 23146
sprt @ 15+0.05 th 1 Introduce new history table based on from square.
04-10-15 Vo YellowCombo diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 21802 W: 4107 L: 3887 D: 13808
sprt @ 15+0.05 th 1 http://tests.stockfishchess.org/tests/view/560c959f0ebc597e4f23e409 , http://tests.stockfishchess.org/tests/view/560a1ae60ebc597e4f23e36e
04-10-15 sn bad_knight diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 3954 W: 661 L: 770 D: 2523
sprt @ 15+0.05 th 1 Bad knight
04-10-15 Ro UnprotectedPhalanx diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 11908 W: 2152 L: 2226 D: 7530
sprt @ 15+0.05 th 1 UP_20151003_1
03-10-15 Ro SafeSentry diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 14676 W: 2609 L: 2672 D: 9395
sprt @ 15+0.05 th 1 Fixed bench
03-10-15 Vo BalanceStatFA diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 11836 W: 2106 L: 2181 D: 7549
sprt @ 15+0.05 th 1 One last shot...prior test gave a good clue what's going on. I think this version will work.
03-10-15 Ro SemiBackward2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 5227 W: 908 L: 1011 D: 3308
sprt @ 15+0.05 th 1 Fixed array index
03-10-15 sg checked diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 63017 W: 11448 L: 11389 D: 40180
sprt @ 15+0.05 th 1 OK, the patch has an effect on the endgame. Now try the opposite and double the endgame score.
03-10-15 Ro SemiBackward2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 5964 W: 1089 L: 1189 D: 3686
sprt @ 15+0.05 th 1 SB_20151003_1
03-10-15 II reduction_tune diff
9855/10000 iterations
20000/20000 games played
20000 @ 30+0.05 th 1 Tuning moves 7-12, session 1. Read more under comments.
03-10-15 SC scale_factor_tunable diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 45401 W: 8590 L: 8274 D: 28537
sprt @ 15+0.05 th 1 Values after 183k iterations. Let us see.
03-10-15 sn second_push3 diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 44242 W: 8155 L: 8159 D: 27928
sprt @ 15+0.05 th 1 Add some more weight to the proximity of our king to the squares in front of the passed pawn. Take 3, weight = 5/4
03-10-15 Vo BalanceStatFA diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 12566 W: 2234 L: 2306 D: 8026
sprt @ 15+0.05 th 1 STC: Final attempt at this.
03-10-15 mb lazy_smp2 diff
ELO: -0.00 +-54.9 (95%) LOS: 50.0%
Total: 25 W: 2 L: 2 D: 21
15000 @ 600+0.05 th 7 Lazy SMP. 7 Threads XLLTC.
03-10-15 sg checked diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 25362 W: 4527 L: 4597 D: 16238
sprt @ 15+0.05 th 1 Because safe checks are already rewarded in the middlegame, clear the new checked bonus there.
03-10-15 sg checked diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 15191 W: 2745 L: 2850 D: 9596
sprt @ 15+0.05 th 1 We discussed at the repo of the checked bonus patch how the endgame value affects the patch. Local testing seems to indicate that the bonus is useless in the endgame, so try a checked bonus of S(20,0).
03-10-15 jo lazy_smp diff
ELO: -9.17 +-5.5 (95%) LOS: 0.1%
Total: 5000 W: 756 L: 888 D: 3356
5000 @ 15+0.05 th 3 Are we now competitive with 3 Threads (-13 elo)? Further simplified the changing of the search depth of the helper threads, and also restored the old iterative deepening loop logic.
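The LOS (likelihood of superiority) value reported next to ELO can be approximated from wins and losses alone. This is a minimal sketch assuming the common normal-approximation formula, in which draws do not enter. The function name is hypothetical.

```python
import math


def los(wins, losses):
    """Likelihood of superiority via the normal approximation (assumed formula)."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


if __name__ == "__main__":
    # Entry above: W: 756 L: 888
    print(f"LOS: {100.0 * los(756, 888):.1f}%")
    # A balanced result such as W: 2 L: 2 gives exactly 50%
    print(f"LOS: {100.0 * los(2, 2):.1f}%")
```

With equal wins and losses the argument of `erf` is zero, so the result is exactly 50%, in line with the 25-game lazy_smp2 entry earlier in the list.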
26-09-15 SC scale_factor_tuning diff
89027/100000 iterations
184197/200000 games played
200000 @ 60+0.05 th 1 As we have a lot of machines active, let me submit a very long LTC tuning session on rarely considered parameters, just in case we left some ELO lying around. Low throughput, such that it kicks in only if no Priority 0 stuff is waiting. Maybe we can get an answer before next TCEC stage.
02-10-15 II reduction_tune diff
9899/10000 iterations
20000/20000 games played
20000 @ 30+0.05 th 1 Tuning first 6 moves - session 3. Read more under comments.
02-10-15 Vo BalanceStat3b diff
LLR: -0.55 (-2.94,2.94) [0.00,5.00]
Total: 86539 W: 15900 L: 15553 D: 55086
sprt @ 15+0.05 th 1 STC: v.3b
30-09-15 sn king_separation3 diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 115087 W: 17578 L: 17034 D: 80475
sprt @ 60+0.05 th 1 Try king separation (take 4, weight = 8). Tested against master with Stefan's and Jonathan's patches: LTC
02-10-15 Vo BalanceStat4 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 18127 W: 2726 L: 2780 D: 12621
sprt @ 60+0.05 th 1 LTC: v.4
03-10-15 Vo BalanceStatLMR diff
LLR: -2.61 (-2.94,2.94) [0.00,5.00]
Total: 7854 W: 1399 L: 1476 D: 4979
sprt @ 15+0.05 th 1 stc
02-10-15 Vo BalanceStat2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 35331 W: 5336 L: 5327 D: 24668
sprt @ 60+0.05 th 1 LTC: v.2