Stockfish Testing Queue

Finished - 3054 tests

25-02-18 An prelimProbcut2 diff
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 22907 W: 3954 L: 3738 D: 15215
sprt @ 60+0.6 th 1 LTC. Even if the other passes, this fixes two oddities (depth 5 and return value), and I imagine the maintainers would prefer this one. Priority -1 until (if) prelimProbcut2 passes.
20-02-18 SC master diff
ELO: 149.98 +-4.0 (95%) LOS: 100.0%
Total: 15731 W: 7474 L: 1076 D: 7181
30000 @ 10+0.1 th 1 As a reference: Elo without contempt. Low TP
25-02-18 pe tunedtm diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 18970 W: 4268 L: 4029 D: 10673
sprt @ 10+0.1 th 1 Tuned values with adjustment
24-02-18 sn razoring2 diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 62620 W: 13948 L: 13537 D: 35135
sprt @ 10+0.1 th 1 Razoring margin = 400
25-02-18 sn tweak_asymmetry2 diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 27776 W: 4771 L: 4536 D: 18469
sprt @ 60+0.6 th 1 LTC: Tweak asymmetry measure
25-02-18 An prelimProbcut diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 23202 W: 4044 L: 3826 D: 15332
sprt @ 60+0.6 th 1 LTC
25-02-18 An prelimProbcut2 diff
LLR: 4.73 (-2.94,2.94) [0.00,5.00]
Total: 36281 W: 8221 L: 7830 D: 20230
sprt @ 10+0.1 th 1 Same as before except 1) Do not change how we handle ProbCut at the lowest depth (D5) and 2) Do not alter the return bound.
25-02-18 sn mergeCapt diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21293 W: 3527 L: 3407 D: 14359
sprt @ 60+0.6 th 1 LTC for Jerry: merge all capture init stages
25-02-18 sg storm_danger2 diff
LLR: 2.94 (-2.94,2.94) [0.00,5.00]
Total: 16467 W: 3750 L: 3537 D: 9180
sprt @ 10+0.1 th 1 +50%
25-02-18 jd blockedShelter diff
LLR: 2.94 (-2.94,2.94) [0.00,5.00]
Total: 8424 W: 1948 L: 1775 D: 4701
sprt @ 10+0.1 th 1 Take 2. Limit to h / f, a / c files and relative rank 3.
25-02-18 An prelimProbcut diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 9841 W: 2251 L: 2071 D: 5519
sprt @ 10+0.1 th 1 Perform a preliminary depth one search when we are looking at possible ProbCuts
18-02-18 sg remove_dyn_ct diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 51503 W: 11382 L: 11319 D: 28802
sprt @ 10+0.1 th 1 For comparison to Jörg's test I try a full revert of dynamic contempt, which includes setting contempt back to 20
24-02-18 pe tunedtm diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 35413 W: 7797 L: 7495 D: 20121
sprt @ 10+0.1 th 1 Check final tuning as is
24-02-18 sn tweak_asymmetry2 diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13146 W: 3038 L: 2840 D: 7268
sprt @ 10+0.1 th 1 Tweak asymmetry measure
24-02-18 vd mergeCapt diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 16789 W: 3685 L: 3554 D: 9550
sprt @ 10+0.1 th 1 stc
24-02-18 Ha razor diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 32890 W: 5577 L: 5476 D: 21837
sprt @ 60+0.6 th 1 LTC: Razoring simplification
23-02-18 Ha razor diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 8043 W: 1865 L: 1716 D: 4462
sprt @ 10+0.1 th 1 Razoring simplification
21-02-18 tv Razor3 diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 74540 W: 12888 L: 12491 D: 49161
sprt @ 60+0.6 th 1 LTC Margin 590
21-02-18 sn tempo_fix diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 52602 W: 11776 L: 11403 D: 29423
sprt @ 10+0.1 th 1 No Tempo for draw scores given by heuristic functions. The question is how this change will affect the search (more positions will be near VALUE_DRAW)
22-02-18 Fi ContemptTweak diff
ELO: 167.03 +-3.2 (95%) LOS: 100.0%
Total: 30000 W: 15885 L: 2480 D: 11635
30000 @ 10+0.1 th 1 Tweak take 2 final try
20-02-18 pr ps_leverpush diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 56669 W: 12582 L: 12530 D: 31557
sprt @ 10+0.1 th 1 simplification: leverPush doesn't seem to do much on my local machines.
22-02-18 sn gradient diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 26352 W: 5938 L: 5676 D: 14738
sprt @ 10+0.1 th 1 Adjust gradient based on current bestValue
21-02-18 SC fb9f7abc369c1baa0b95867 diff
ELO: 166.98 +-3.0 (95%) LOS: 100.0%
Total: 30000 W: 15430 L: 2028 D: 12542
30000 @ 10+0.1 th 1 Take 3. Contempt 12 with quadratic part. Will also test for [0, 5]
21-02-18 sn LSB2' diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 29693 W: 6677 L: 6399 D: 16617
sprt @ 10+0.1 th 1 Try k+=3. Bench: 5619410
20-02-18 SC fb9f7abc369c1baa0b95867 diff
ELO: 150.96 +-2.9 (95%) LOS: 100.0%
Total: 30000 W: 14358 L: 2086 D: 13556
30000 @ 10+0.1 th 1 Take 1 of trying to replace offset contempt by nonlinear one. Testing to check which variant is efficient against bad engines, will then move to [-3, 1]. See discussion on github for issues fixing in dynamic contempt case. Thx to Stefan for checking.
20-02-18 SC symmetricContempt diff
ELO: 153.93 +-2.9 (95%) LOS: 100.0%
Total: 30000 W: 14556 L: 2071 D: 13373
30000 @ 10+0.1 th 1 Take 2 logarithmic (I have to fix the SF7 bench for take 1)
15-02-18 sn tweak_futility_margins diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 109891 W: 24566 L: 23891 D: 61434
sprt @ 10+0.1 th 1 Tweak futility margins values, and introduce an array to store them. Tested as SPRT[0..5] because of the introduction of the array, which is complicated and forces a slow memory access.
20-02-18 Fi ContemptTweak diff
ELO: 167.82 +-3.1 (95%) LOS: 100.0%
Total: 30000 W: 15828 L: 2368 D: 11804
30000 @ 10+0.1 th 1 Contempt tweak
20-02-18 tv Razor3 diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 24496 W: 5470 L: 5210 D: 13816
sprt @ 10+0.1 th 1 Margin 590, I expect this to fail fast
19-02-18 Fi UnbiasedContempt diff
ELO: 154.63 +-3.0 (95%) LOS: 100.0%
Total: 30000 W: 14970 L: 2435 D: 12595
30000 @ 10+0.1 th 1 Take 2
18-02-18 mc eval_style diff
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 75666 W: 16482 L: 16616 D: 42568
sprt @ 10+0.1 th 1 Verify coding style tweaks do not introduce some hidden regression
19-02-18 Fi PawnContempt diff
ELO: 165.22 +-3.1 (95%) LOS: 100.0%
Total: 30000 W: 15521 L: 2241 D: 12238
30000 @ 10+0.1 th 1 Use pawns for contempt phase. Idea by Ronald https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/p0u40TPPyAg
19-02-18 Fi UnbiasedContempt diff
ELO: 157.49 +-3.0 (95%) LOS: 100.0%
Total: 30000 W: 15001 L: 2263 D: 12736
30000 @ 10+0.1 th 1 See if we can keep the mg eg gradient for contempt but around 0 so we don't bias the scores.
19-02-18 Fi master diff
ELO: 165.88 +-3.1 (95%) LOS: 100.0%
Total: 30000 W: 15602 L: 2276 D: 12122
30000 @ 10+0.1 th 1 Baseline for http://tests.stockfishchess.org/tests/view/5a8a55a50ebc590297cc841a
18-02-18 tv RazorEG diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 47714 W: 10624 L: 10258 D: 26832
sprt @ 10+0.1 th 1 Try 580/620 margin
17-02-18 vd killerCombine diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 55777 W: 12367 L: 12313 D: 31097
sprt @ 10+0.1 th 1 stc
17-02-18 tv RazorEG diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 21687 W: 4835 L: 4598 D: 12254
sprt @ 10+0.1 th 1 Try a game stage dependent razor margin
17-02-18 lb threatByPawn diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 50844 W: 11166 L: 11102 D: 28576
sprt @ 10+0.1 th 1 simplify ThreatBySafePawn
17-02-18 jd simplifyThreats2 diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 36770 W: 8018 L: 7926 D: 20826
sprt @ 10+0.1 th 1 Remove another unnecessary condition per Stephane's comment.
10-02-18 pe tunedtm diff
LLR: 3.50 (-2.94,2.94) [0.00,4.00]
Total: 113172 W: 25224 L: 24582 D: 63366
sprt @ 10+0.1 th 1 Tuned values with adjustment. Take 2
15-02-18 tv Tweak diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 23005 W: 5111 L: 4857 D: 13037
sprt @ 10+0.1 th 1 Razor depth and margin (prio -1 for now)
15-02-18 pr ps_trappedrook4 diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 16288 W: 2813 L: 2685 D: 10790
sprt @ 60+0.6 th 1 simplification. semiopenFiles is set with pawns anywhere on the files (even very advanced). Therefore, using them to check for a trapped rook may not be best
15-02-18 pr ps_trappedrook4 diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 13377 W: 3009 L: 2871 D: 7497
sprt @ 10+0.1 th 1 simplify trapped rook logic. semiopen files are true if there is a pawn anywhere on the file so this seems like a poor condition. mob <=3 probably covers this well enough.
15-02-18 jd RookKS diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 4765 W: 1120 L: 965 D: 2680
sprt @ 10+0.1 th 1 Modify kingDanger by # of rooks.
14-02-18 at tt_key64_CS2 diff
ELO: 4.43 +-2.8 (95%) LOS: 99.9%
Total: 24000 W: 5073 L: 4767 D: 14160
24000 @ 10+0.1 th 2 TT with full 64-bit keys and cluster size 2. Testing with 2 threads.
14-02-18 Vo seT3 diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 14225 W: 3248 L: 3027 D: 7950
sprt @ 10+0.1 th 1 stc
14-02-18 sn simplify_tropism diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 41675 W: 9168 L: 9086 D: 23421
sprt @ 10+0.1 th 1 Is the "shift-and-superpose" trick still necessary to compute the tropism total? It avoids a popcount, but is the speed benefit substantial enough to justify it? Try to simplify.
14-02-18 sn soft_futility_pruning diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 26969 W: 6110 L: 5844 D: 15015
sprt @ 10+0.1 th 1 Array for the margins (take 1, start=30). Bench: 4454324
13-02-18 sn soft_futility_pruning diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 25536 W: 5756 L: 5498 D: 14282
sprt @ 10+0.1 th 1 Soft futility pruning, take 2. Bench: 4336799
13-02-18 pb futile_extension diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 8405 W: 1889 L: 1718 D: 4798
sprt @ 10+0.1 th 1 take 3: try the opposite
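
Note on reading the figures above: the sketch below shows one way the "ELO: x +-y (95%)" lines of the fixed-game tests and the (-2.94,2.94) bounds of the sprt tests could be reproduced from the listed numbers. It assumes the usual logistic Elo model, a normal approximation for the 95% interval, and SPRT error rates alpha = beta = 0.05. The function names are hypothetical, and whether the testing framework computes its figures exactly this way is an assumption, not something stated in this log.

```python
import math


def elo_from_wld(wins, losses, draws):
    """Return (elo, margin95) from game counts.

    Assumption: logistic Elo model, with win = 1, draw = 0.5, loss = 0.
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n

    # Per-game variance of the score around its mean.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n

    # Half-width of a two-sided 95% interval on the mean score
    # (normal approximation; 1.959964 is the 97.5% normal quantile).
    half_width = 1.959964 * math.sqrt(var / n)

    def to_elo(s):
        return -400.0 * math.log10(1.0 / s - 1.0)

    elo = to_elo(score)
    margin = (to_elo(score + half_width) - to_elo(score - half_width)) / 2.0
    return elo, margin


def sprt_bounds(alpha=0.05, beta=0.05):
    """Return the (lower, upper) LLR stopping bounds of Wald's SPRT.

    Assumption: alpha = beta = 0.05, chosen because it reproduces the
    (-2.94, 2.94) pair printed on every sprt entry.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


if __name__ == "__main__":
    # Counts taken from the "20-02-18 SC master" entry in the list.
    elo, margin = elo_from_wld(7474, 1076, 7181)
    print(f"ELO: {elo:.2f} +-{margin:.1f} (95%)")

    lower, upper = sprt_bounds()
    print(f"LLR bounds: ({lower:.2f},{upper:.2f})")
```

Run on the "20-02-18 SC master" entry (W: 7474 L: 1076 D: 7181), this should land close to the reported 149.98 +-4.0. The bracketed pair on each sprt line, such as [0.00,5.00] or [-3.00,1.00], names the two Elo hypotheses being compared and is not computed by this sketch.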