Stockfish Testing Queue

Finished - 29280 tests

16-01-22 Voy ct diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 13268 W: 2416 L: 2527 D: 8325
sprt @ 10+0.1 th 1 Tweak Good Capture's SEE threshold...
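Reading the LLR line: the pair in parentheses is the stopping interval and the pair in brackets is the Elo hypothesis pair. Below is a minimal sketch of how such numbers can be derived. It assumes error rates alpha = beta = 0.05, which reproduces the 2.94 shown. It also assumes a BayesElo-style win/draw/loss model with the draw parameter estimated from the sample. Both are assumptions, and the function names are illustrative, not taken from the framework's code.

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Stopping interval of Wald's SPRT for the given error rates (assumed values)."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

def bayeselo_to_probs(elo, drawelo):
    """Win, loss and draw probabilities under the assumed BayesElo model."""
    win = 1.0 / (1.0 + 10.0 ** ((-elo + drawelo) / 400.0))
    loss = 1.0 / (1.0 + 10.0 ** ((elo + drawelo) / 400.0))
    return win, loss, 1.0 - win - loss

def estimate_drawelo(wins, losses, draws):
    """Draw parameter estimated from the observed win and loss frequencies."""
    n = wins + losses + draws
    w, l = wins / n, losses / n
    return 200.0 * math.log10((1.0 - l) / l * (1.0 - w) / w)

def llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    drawelo = estimate_drawelo(wins, losses, draws)
    w0, l0, d0 = bayeselo_to_probs(elo0, drawelo)
    w1, l1, d1 = bayeselo_to_probs(elo1, drawelo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))
```

Under these assumptions, llr(2416, 2527, 8325, 0.0, 4.0) should land close to the value displayed for the entry above, and a test stops once the LLR leaves the interval returned by sprt_bounds().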
16-01-19 Voy cmhd diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 16238 W: 2141 L: 2205 D: 11892
sprt @ 60+0.6 th 1 This patch failed yellow at STC. It is TC-sensitive since it kicks in only at high depths, so I would like to test it at LTC. Low throughput.
16-01-21 sg aspiration diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 36653 W: 6791 L: 6758 D: 23104
sprt @ 10+0.1 th 1 For re-searches at the root node, use tighter bounds.
16-01-19 SC multiStepsLMR diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 75461 W: 13986 L: 13882 D: 47593
sprt @ 10+0.1 th 1 Limit reductions to 6. Test as a parameter tweak. The limit was chosen so that the number of search nodes in the bench positions is minimal.
16-01-22 SC reductionSimple diff
ELO: -40.20 +-17.5 (95%) LOS: 0.0%
Total: 573 W: 76 L: 142 D: 355
10000 @ 10+0.1 th 1 Would this even more exotic attempt to drastically reduce the complexity of reduction estimation have a chance of passing a simplification SPRT after tuning?
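For fixed-length tests, the ELO line can be recomputed from the W/L/D totals. The sketch below assumes the usual logistic Elo formula and a normal approximation for the 95% interval. The function names are illustrative.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference for an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_error(wins, losses, draws, z=1.96):
    """Elo estimate and half-width of the interval at z standard errors."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + losses * (0.0 - score) ** 2
           + draws * (0.5 - score) ** 2) / n
    se = math.sqrt(var / n)
    low = elo_from_score(score - z * se)
    high = elo_from_score(score + z * se)
    return elo_from_score(score), (high - low) / 2.0
```

Applied to the totals of the entry above (W: 76, L: 142, D: 355), this gives roughly -40.2 with a half-width of about 17.5.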
16-01-21 Voy cmt diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 17807 W: 3221 L: 3270 D: 11316
sprt @ 10+0.1 th 1 Local testing shows 5 elo @4k games. See what framework has to say...
16-01-22 SC LMR_nomovecount diff
ELO: -80.37 +-28.6 (95%) LOS: 0.0%
Total: 220 W: 19 L: 69 D: 132
10000 @ 10+0.1 th 1 Would this exotic attempt to remove the moveCount dependency have a chance of passing a simplification SPRT after tuning?
16-01-21 pb0 tt_save_5 diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 14201 W: 2621 L: 2729 D: 8851
sprt @ 10+0.1 th 1 Testing this further TT tweak as proposed by Nikita-Guskov on pull#574 (actually it's related to #575)
16-01-20 pb0 tt_save_simplification diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22994 W: 4129 L: 4326 D: 14539
sprt @ 10+0.1 th 1 A local test with 1500 games (Hash=8) showed no regression (W: 321 L: 308 D: 871), so trying this here with the standard Hash=4
16-01-19 jki distinct_iteration_11 diff
LLR: -0.22 (-2.94,2.94) [-1.00,3.00]
Total: 4694 W: 563 L: 568 D: 3563
sprt @ 12+0.12 th 21 LTC: distinct_iter11. Make sure that it doesn't regress with a high number of threads.
16-01-19 Voy cs diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 7665 W: 1388 L: 1481 D: 4796
sprt @ 10+0.1 th 1 Take 2..
16-01-19 Voy sort diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 10997 W: 1998 L: 2076 D: 6923
sprt @ 10+0.1 th 1 Sort bad quiets if no moves were played in earlier stages...
16-01-19 cib simple_delta diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 26090 W: 3495 L: 3679 D: 18916
sprt @ 60+0.6 th 1 LTC. Simplify aspiration window computation, based on wfenchel patch. Note: using a Hash of 64 MB (I hope that is the right value for LTC).
16-01-19 Voy cs diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 6847 W: 1199 L: 1295 D: 4353
sprt @ 10+0.1 th 1 Use File to help sort captures...
16-01-19 Mys qp diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 20212 W: 3740 L: 3778 D: 12694
sprt @ 10+0.1 th 1 Endgame queen pawns
16-01-19 pb0 master diff
LLR: 2.96 (-2.94,2.94) [-5.00,0.00]
Total: 11075 W: 2070 L: 1992 D: 7013
sprt @ 10+0.1 th 1 Calculate appropriate hash size for testing. Is the lowest we can use, without losing elo compared to 128, still Hash=4 after #575 has been merged?
16-01-18 Voy srhd diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 28217 W: 5259 L: 5262 D: 17696
sprt @ 10+0.1 th 1 Don't do stat lmr-reductions at high depths.
16-01-18 Voy cmhd diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 64609 W: 12105 L: 11949 D: 40555
sprt @ 10+0.1 th 1 Don't update cm at high depth
16-01-18 cib simple_delta diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 35935 W: 6714 L: 6619 D: 22602
sprt @ 10+0.1 th 1 Reschedule with correct bounds. Simplify aspiration window computation, based on wfenchel patch. https://github.com/wfenchel/Stockfish/commit/98ca634007a996c7ead062e7e075163ad91be69c
16-01-18 pec tm diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 41963 W: 7967 L: 7883 D: 26113
sprt @ 10+0.1 th 1 Simplification
16-01-18 cib simple_delta diff
LLR: -1.71 (-2.94,2.94) [0.00,5.00]
Total: 28713 W: 5428 L: 5375 D: 17910
sprt @ 10+0.1 th 1 Simplify aspiration window computation, based on wfenchel patch. https://github.com/wfenchel/Stockfish/commit/98ca634007a996c7ead062e7e075163ad91be69c
16-01-18 Voy cmld diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 18456 W: 3464 L: 3509 D: 11483
sprt @ 10+0.1 th 1 Don't update CM at low depths.
16-01-18 Mys bp diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 19535 W: 3716 L: 3756 D: 12063
sprt @ 10+0.1 th 1 Re-run! Fix bench (?)
16-01-18 lbr master diff
LLR: 2.97 (-2.94,2.94) [-5.00,0.00]
Total: 22227 W: 3049 L: 3005 D: 16173
sprt @ 60+0.6 th 1 Same exercise at LTC. Hash=64.
16-01-17 Voy flcm diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 21318 W: 3942 L: 3975 D: 13401
sprt @ 10+0.1 th 1 Update CM if the opponent forces a fail low.
16-01-17 lbr master diff
LLR: -2.96 (-2.94,2.94) [-5.00,0.00]
Total: 41408 W: 5540 L: 5800 D: 30068
sprt @ 60+0.6 th 1 Same exercise at LTC. Hash=32.
16-01-17 pb0 lazy_smp_tt diff
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 23975 W: 3294 L: 3210 D: 17471
sprt @ 60+0.6 th 1 Is the patch a major regression with a single thread? LTC
16-01-17 SC multiStepsLMR diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 2877 W: 448 L: 560 D: 1869
sprt @ 10+0.1 th 1 Before going to full depth search, re-search with half the reductions, but only if r >= 7, as suggested by SPSA tuning.
16-01-17 SC multiStepTuning diff
44013/50000 iterations
91271/100000 games played
100000 @ 10+0.1 th 1 multiStepLMR for r >= 6 was failing with a positive score. Since this part of the code is very Elo-sensitive, try to make it successful with SPSA tuning. Don't use nodestime, since the number of nodes searched can vary greatly in this setting. Rescheduled after increasing c (in particular for the threshold).
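For readers unfamiliar with SPSA, the sketch below shows the generic algorithm on a toy problem. The parameter c mentioned in the entry above is the perturbation size, and the counters show roughly two games per iteration, which fits one evaluation at theta + c*delta against theta - c*delta. Everything here is an illustration under assumptions: the quadratic toy_loss stands in for noisy match results, and the gain constants are textbook defaults, not values used by the framework.

```python
import random

def spsa_minimize(loss, theta, iterations=2000, a=0.1, c=0.1,
                  A=100, alpha=0.602, gamma=0.101, seed=0):
    """Minimize loss(theta) with simultaneous perturbation (SPSA)."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, iterations + 1):
        a_k = a / (k + A) ** alpha   # step size, decays slowly
        c_k = c / k ** gamma         # perturbation size, the "c" that gets tuned
        # Perturb every parameter at once by +c_k or -c_k.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        # Two evaluations per iteration give a gradient estimate for all parameters.
        diff = loss(plus) - loss(minus)
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta

def toy_loss(theta, target=(3.0, -2.0)):
    """Hypothetical noiseless objective with its minimum at target."""
    return sum((t - g) ** 2 for t, g in zip(theta, target))
```

Calling spsa_minimize(toy_loss, [8.0, -5.0]) moves both parameters close to the target. With a noisy objective, a larger c trades gradient bias for a stronger signal per pair of evaluations.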
16-01-16 pb0 lazy_smp_tt diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 10264 W: 1498 L: 1321 D: 7445
sprt @ 15+0.15 th 7 Since the queue is completely empty, I am going for the tests proposed by Joona. As this is a low-hash-pressure patch, we need a very high hash size.
16-01-16 lan KRPPKRPScaleFactors diff
19788/20000 iterations
40000/40000 games played
40000 @ 20+0.2 th 1 Tune the scale factors for KRPPKRP endgames. Do ranks 7 and 8 really scale as 0 ?
16-01-17 pec delta diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 15159 W: 2774 L: 2834 D: 9551
sprt @ 10+0.1 th 1 take 2
16-01-17 lbr master diff
LLR: -2.95 (-2.94,2.94) [-5.00,0.00]
Total: 5387 W: 650 L: 786 D: 3951
sprt @ 60+0.6 th 1 Same exercise at LTC. Hash=16.
16-01-16 cib contra1 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 26371 W: 4887 L: 4898 D: 16586
sprt @ 10+0.1 th 1 Expanding counter move stats table with an additional dimension.
16-01-17 lbr master diff
LLR: -2.95 (-2.94,2.94) [-5.00,0.00]
Total: 15692 W: 2819 L: 3013 D: 9860
sprt @ 10+0.1 th 1 Calculate appropriate hash size for testing. What is the lowest we can use, without losing elo compared to 128 ? Hash=2.
16-01-17 pec delta diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 10536 W: 1915 L: 1995 D: 6626
sprt @ 10+0.1 th 1 Simple split of the aspiration window size calculation, to reflect that fail lows bear a more severe penalty than fail highs
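One possible reading of this idea, sketched on a toy search: keep separate window sizes for the lower and the upper side and widen the lower side faster after a fail low. All numbers (18, 2.0, 1.5) and names are hypothetical and are not taken from the patch. The make_fake_search helper is a stand-in for a real fail-hard search.

```python
def aspiration_search(search, prev_score, delta_low=18, delta_high=18,
                      grow_low=2.0, grow_high=1.5, inf=32000):
    """Re-search with an asymmetrically growing window until the score fits inside."""
    alpha = max(prev_score - delta_low, -inf)
    beta = min(prev_score + delta_high, inf)
    researches = 0
    while True:
        score = search(alpha, beta)
        if score <= alpha and alpha > -inf:    # fail low: widen the lower side more
            delta_low = int(delta_low * grow_low)
            alpha = max(score - delta_low, -inf)
        elif score >= beta and beta < inf:     # fail high: widen the upper side less
            delta_high = int(delta_high * grow_high)
            beta = min(score + delta_high, inf)
        else:
            return score, researches
        researches += 1

def make_fake_search(true_value):
    """Hypothetical fail-hard search that clamps the true value into the window."""
    return lambda alpha, beta: max(alpha, min(beta, true_value))
```

With these toy numbers, a position whose value dropped by 100 is resolved after two re-searches, while one that rose by 100 needs three.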
16-01-16 SC multiStepTuning diff
10359/50000 iterations
20951/100000 games played
100000 @ 10+0.1 th 1 multiStepLMR for r >= 6 was failing with a positive score. Since this part of the code is very Elo-sensitive, try to make it successful with SPSA tuning. Don't use nodestime, since the number of nodes searched can vary greatly in this setting.
16-01-17 lbr master diff
LLR: 2.96 (-2.94,2.94) [-5.00,0.00]
Total: 16929 W: 3113 L: 3060 D: 10756
sprt @ 10+0.1 th 1 Calculate appropriate hash size for testing. What is the lowest we can use, without losing elo compared to 128 ? Hash=8.
16-01-17 lbr master diff
LLR: 2.96 (-2.94,2.94) [-5.00,0.00]
Total: 15470 W: 2917 L: 2858 D: 9695
sprt @ 10+0.1 th 1 Calculate appropriate hash size for testing. What is the lowest we can use, without losing elo compared to 128 ? Hash=4.
16-01-16 IIv pvS_delta diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 16141 W: 2952 L: 3008 D: 10181
sprt @ 10+0.1 th 1 Tuned values.
16-01-16 pb0 lazy_smp_tt diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 14817 W: 2103 L: 1915 D: 10799
sprt @ 30+0.3 th 3 Since the queue is completely empty, I am going for the tests proposed by Joona. As this is a low-hash-pressure patch, we need a very high hash size.
16-01-16 pb0 lazy_bestmoves_first_he diff
ELO: 10.22 +-5.6 (95%) LOS: 100.0%
Total: 5000 W: 910 L: 763 D: 3327
5000 @ 10+0.1 th 7 ... and how much is the difference between first and last helper? Both working with half density but first one with skipsize 1 and last one with skipsize 2
16-01-16 SC multiStepsLMR diff
LLR: -2.93 (-2.94,2.94) [0.00,5.00]
Total: 30273 W: 5639 L: 5632 D: 19002
sprt @ 10+0.1 th 1 Before going to full depth search, re-search with half the reductions, but only if r >= 6.
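The multi-step idea described above can be pictured as a sequence of search depths, each tried only if the previous, shallower search failed high. This is a sketch of that reading only. The function name and the integer halving are assumptions, not code from the branch.

```python
def research_depths(depth, r, threshold=6):
    """Depths tried in order for a move reduced by r plies."""
    steps = []
    if r > 0:
        steps.append(depth - r)           # first try: fully reduced search
        if r >= threshold:
            steps.append(depth - r // 2)  # intermediate step: half the reduction
    steps.append(depth)                   # last resort: full depth search
    return steps
```

For example, research_depths(20, 8) yields [12, 16, 20], while research_depths(20, 4) skips the intermediate step and yields [16, 20].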
16-01-16 SC quietValues diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 6699 W: 1199 L: 1296 D: 4204
sprt @ 10+0.1 th 1 When updating stats, also use value of searched quiets. Take 5.
16-01-16 pb0 lazy_smp_tt diff
ELO: 4.71 +-3.2 (95%) LOS: 99.8%
Total: 20000 W: 4550 L: 4279 D: 11171
20000 @ 2+0.1 th 1 Since the 1-thread queue is currently empty, retry to measure more accurately whether we really don't regress at this TC, which would be surprising and would indicate that the depth margin might have some dependence on root depth
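The LOS (likelihood of superiority) figure can be reproduced from wins and losses alone. The sketch below assumes the common normal-approximation formula in which draws carry no information. The function name is illustrative.

```python
import math

def los(wins, losses):
    """Likelihood of superiority under a normal approximation (draws ignored)."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
```

For the entry above (W: 4550, L: 4279) this gives about 0.998, in line with the 99.8% shown.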
16-01-16 SC multiStepsLMR diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 7720 W: 1401 L: 1494 D: 4825
sprt @ 10+0.1 th 1 Before going to full depth search, re-search with half the reductions.
16-01-16 SC quietValues diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 4906 W: 869 L: 974 D: 3063
sprt @ 10+0.1 th 1 When updating stats, also use value of searched quiets. Take 4.
16-01-16 SC multiStepsLMR diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 3797 W: 657 L: 767 D: 2373
sprt @ 10+0.1 th 1 Go all steps till full depth search.
16-01-02 pb0 distinct_iter_11 diff
LLR: 0.37 (-2.94,2.94) [0.00,5.00]
Total: 82846 W: 10555 L: 10278 D: 62013
sprt @ 15+0.15 th 21 rotating symmetric halfdensity patterns with gradually increasing skipsizes till size 4, LTC
16-01-16 SC quietValues diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 3105 W: 523 L: 636 D: 1946
sprt @ 10+0.1 th 1 When updating stats, also use value of searched quiets. Try to fix undefined bench issue. Take 2.