Stockfish Testing Queue

Finished - 1117 tests

18-06-26 31m skewerThreat^ diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 3020 W: 634 L: 752 D: 1634
sprt @ 10+0.1 th 1 Middlegame only, S(10, 0).
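The interval (-2.94, 2.94) shown on each LLR line is consistent with the standard Wald SPRT stopping bounds for error rates alpha = beta = 0.05. The page does not state the error rates, so treat those values, and the helper name `sprt_bounds`, as assumptions in this minimal sketch.

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's SPRT stopping bounds on the log-likelihood ratio (LLR).

    alpha and beta are assumed error rates; they are not given on this page.
    """
    lower = math.log(beta / (1 - alpha))   # the run stops as a failure at or below this
    upper = math.log((1 - beta) / alpha)   # the run stops as a pass at or above this
    return lower, upper

lower, upper = sprt_bounds()
```

Under these assumptions, an LLR of -2.96 has crossed the lower bound, so the run stops as a failure, while an LLR of 2.95 has crossed the upper bound and passes.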
18-06-26 31m skewerThreat diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 20760 W: 4602 L: 4631 D: 11527
sprt @ 10+0.1 th 1 Inspired by Bryan's analysis of TCEC 12 SF Game 29, move 70. S(10, 10) penalty if the opponent has a bishop skewer threat. More specifically, check all bishop attacks from our rook square. If they intersect with both a friendly rook/queen and a landing square for an enemy bishop along a single diagonal, but not an enemy bishop already present (i.e., there is not already a skewer), then apply penalty. This will also include bishop forks.
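The detection rule described above can be sketched on a plain 8x8 grid. This is one reading of the description, not the patch itself: squares are (file, rank) tuples, a "landing square" is assumed to be an empty square that some enemy bishop attacks, and every function name is hypothetical.

```python
# The two diagonals through a square, each as a pair of opposite directions.
DIAGONALS = (((1, 1), (-1, -1)), ((1, -1), (-1, 1)))

def ray(sq, step, occupied):
    """Squares a bishop on `sq` attacks in one direction, up to and including the first blocker."""
    f, r = sq[0] + step[0], sq[1] + step[1]
    out = []
    while 0 <= f < 8 and 0 <= r < 8:
        out.append((f, r))
        if (f, r) in occupied:
            break
        f, r = f + step[0], r + step[1]
    return out

def bishop_attacks(sq, occupied):
    """All squares a bishop on `sq` attacks, given the occupied squares."""
    return {s for pair in DIAGONALS for step in pair for s in ray(sq, step, occupied)}

def skewer_threat(our_rook, our_heavy, enemy_bishops, occupied):
    """True if an enemy bishop could land on a diagonal shared by our rook and another rook/queen."""
    # Assumption: landing squares are empty squares attacked by an enemy bishop.
    landing = set()
    for b in enemy_bishops:
        landing |= bishop_attacks(b, occupied) - occupied
    for pair in DIAGONALS:
        # Bishop attacks from the rook square, restricted to a single diagonal.
        line = {s for step in pair for s in ray(our_rook, step, occupied)}
        if (line & (our_heavy - {our_rook})      # another friendly rook/queen on the line
                and line & landing               # a square an enemy bishop can reach
                and not line & enemy_bishops):   # no bishop already there
            return True
    return False
```

Because the landing square may lie either between the two heavy pieces or beyond the rook, this check covers forks as well as skewers, matching the last sentence of the note.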
18-06-25 31m passerDoubleSupport4 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10256 W: 2249 L: 2330 D: 5677
sprt @ 10+0.1 th 1 I also hope I can retry this 14K green / 48K yellow from June 1, since many tweaks have been made to passed pawn evaluation since then: PassedRank, PassedFile, and PassedDanger. PassedDanger, in particular, interacts very directly with this patch, through the bonus k*w: w is PassedDanger, and k is changed by this patch.
18-06-25 31m tweak_threatOnPawn2 diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 46089 W: 10072 L: 10055 D: 25962
sprt @ 10+0.1 th 1 My apologies for my recent absence from fishtest while traveling. I'm very impressed by the recent tuning work by @candirufish and am curious how it affects the most recent branch I had been working on, since that tuning substantially changes a lot of pawn evaluation, and I know that results from previous tweaks to ThreatByMinor/Rook have changed dramatically in just a few weeks of other commits (e.g., the queen tweak went from just barely failing yellow at LTC, after STC green, to -7 Elo when retested 5-6 weeks later). This was a 90K STC yellow, 104K LTC yellow on June 11-12.
18-06-13 31m tweak_threatOnPawn diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 42688 W: 9452 L: 9447 D: 23789
sprt @ 10+0.1 th 1 Because the framework is mostly empty, let's merge the new master and try much smaller compensation. I know this is a very small tweak, but (a) this is applied to each and every pawn on the board, so it's likely quite sensitive to small changes (a change of 3 already produced a large change), and (b) the main patch was 90K yellow STC, 104K yellow LTC, and thus we only need a small improvement.
18-06-13 31m tweak_threatOnPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 30197 W: 6668 L: 6713 D: 16816
sprt @ 10+0.1 th 1 It appears that, on average, there are about 0.5 threats by minors or rooks on pawns per bench position. Revert to the best take, which gave a 6 cp middlegame bonus to the side with the threats, and compensate by increasing the middlegame value of a pawn by 3 cp.
18-06-13 31m combo_TOP_delta diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 36502 W: 8128 L: 8147 D: 20227
sprt @ 10+0.1 th 1 Another attempt to combo with a promising tweak: @cancetin's delta (61K green STC, 126K yellow LTC). I may not be available to start an LTC run if this passes and would be grateful to anyone who does.
18-06-13 31m combo_TOP_CO diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 21779 W: 4779 L: 4857 D: 12143
sprt @ 10+0.1 th 1 Combo [0, 4]. Even though my speculative LTC failed yellow, it performed at least as well as the STC (90K yellow -> 104K yellow). I'm encouraged by the good scaling to try combo patches. Combine with @snicolet's recent, almost-passed complexity offset tweak (37K green -> 73K yellow). I may not be available to start an LTC run if this passes and would be grateful to anyone who does.
18-06-12 31m tweak_threatOnPawn^^ diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 104513 W: 17939 L: 17753 D: 68821
sprt @ 60+0.6 th 1 Speculative LTC for this 90K yellow, which will at least reveal whether this scales well. Low throughput (166).
18-06-12 31m tweak_threatOnPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 45188 W: 9981 L: 9967 D: 25240
sprt @ 10+0.1 th 1 Merge the SPSA results and try again. Interestingly, the eg components have changed to accommodate the new mg bonuses, which have changed slightly. Hopefully this turns the 90K yellow green; if not, I'll try a speculative LTC to get an idea of how this branch scales.
18-06-12 31m tune_threatOnPawn diff
27980/30000 iterations
58516/60000 games played
60000 @ 20+0.2 th 1 I know that (a) I may be close to a green, due to a recent 90K yellow (90.3% LOS, +1.14 Elo), and (b) small changes can make a big difference on this branch. Try to accomplish the rest by SPSA. Also tune the endgame components, since they are likely not orthogonal to the brand-new mg ones. Quadruple ck for the new mg values, and double ck for the old eg ones.
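For readers unfamiliar with SPSA: one iteration perturbs every parameter at once by plus or minus its ck, compares the two perturbed settings, and nudges each parameter toward the better side. The sketch below is a generic, simplified SPSA loop on a toy noise-free objective, not fishtest's tuner. The objective, the constant gain `a`, the constant `ck` values, and all names are assumptions for illustration; real tuning compares the two settings by playing games and typically decays both gains over time.

```python
import random

def spsa(theta, ck, objective, a=0.1, iterations=300, seed=0):
    """Simplified SPSA that maximizes `objective`.

    theta: starting parameter values; ck: per-parameter perturbation sizes.
    The gain `a` and the perturbations are held constant here for brevity.
    """
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(iterations):
        # Perturb all parameters simultaneously, each by +ck or -ck.
        flips = [rng.choice((-1, 1)) for _ in theta]
        plus = [t + c * f for t, c, f in zip(theta, ck, flips)]
        minus = [t - c * f for t, c, f in zip(theta, ck, flips)]
        diff = objective(plus) - objective(minus)
        # Two-point gradient estimate per parameter, then a step uphill.
        theta = [t + a * diff / (2.0 * c) * f for t, c, f in zip(theta, ck, flips)]
    return theta

# Toy objective with a known optimum (hypothetical values, not tuned results).
TARGET = [6.0, 26.0]

def toy_objective(params):
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))

tuned = spsa([0.0, 20.0], ck=[4.0, 2.0], objective=toy_objective)
```

A larger ck widens the gap between the two perturbed settings, which matters when the comparison is noisy, as game results are.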
18-06-11 31m tweak_threatOnPawn^ diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 90704 W: 20119 L: 19924 D: 50661
sprt @ 10+0.1 th 1 So far, it appears that larger values may be better over master's 0 (2, +0.27 Elo; 4, +0.63 Elo). Continue increasing: try 6.
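Elo and LOS figures of the kind quoted in these notes can be estimated from a W/L/D line. The sketch below uses the common logistic Elo formula and a normal approximation for LOS; the function names are hypothetical, and fishtest's own calculation may differ in detail.

```python
import math

def elo_estimate(wins, losses, draws):
    """Logistic Elo difference implied by the average score."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)

def los(wins, losses):
    """Likelihood of superiority, normal approximation (draws carry no information)."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

# Example input: the final totals of the entry above (W: 20119 L: 19924 D: 50661).
example_elo = elo_estimate(20119, 19924, 50661)
example_los = los(20119, 19924)
```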
18-06-11 31m tweak_threatOnPawn diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 20779 W: 4579 L: 4661 D: 11539
sprt @ 10+0.1 th 1 Value 8.
18-06-11 31m tweak_threatOnPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 52554 W: 10510 L: 10477 D: 31567
sprt @ 10+0.1 th 1 Double effect: increase the middlegame component of ThreatByMinor[PAWN] and ThreatByRook[PAWN] from 0 to 4.
18-06-11 31m tweak_threatOnPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 39413 W: 7813 L: 7829 D: 23771
sprt @ 10+0.1 th 1 Inspired by @xoroshiro's work: applying a mild middlegame penalty for our own hanging pawns appeared strong at STC but yellow at LTC; simply increasing the Hanging bonus failed. We appear to have ThreatByMinor and ThreatByRook for pawns, but the middlegame component is zero--maybe this is what is missing. Increase to 2.
18-06-09 31m passerDoubleSupport3 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10800 W: 2075 L: 2154 D: 6571
sprt @ 10+0.1 th 1 I can't believe I didn't think of this before. Expand the bonus by removing the k > 0 condition: bonus will be given even if the blocking square appears unsafe, if we have two or more rooks or queens on the file. Hopefully this maintains the green STC result while improving on the yellow LTC.
18-06-08 31m tune_passerDoubleSuppor diff
28324/30000 iterations
59159/60000 games played
60000 @ 20+0.2 th 1 Rather than search for new greens, I would like to try to improve the scaling of this green STC/yellow LTC. Since it changes k, which interacts with PassedDanger values w, it could change the optimum values of PassedDanger dramatically (because even a small change in k dramatically changes the resulting Score, (k*w, k*w)). Tune the four nonzero values of PassedDanger with quadruple the initially recommended ck (since these were quite small).
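The sensitivity described above follows from the bonus being the product k*w, applied to both game phases. A minimal sketch, in which the w values are placeholders rather than the real PassedDanger table and `passer_bonus` is a hypothetical name:

```python
def passer_bonus(k, w):
    """Score added for a passed pawn: (middlegame, endgame) = (k*w, k*w)."""
    return (k * w, k * w)

# Placeholder danger weights (NOT Stockfish's PassedDanger values).
placeholder_w = [1, 2, 4, 8]

# Raising k by 2 adds 2*w to each component, so the change grows with w.
deltas = [passer_bonus(k=4, w=w)[0] - passer_bonus(k=2, w=w)[0] for w in placeholder_w]
```

Since a change in k is multiplied by w, the values of w that were optimal for the old k need not stay optimal, which is the motivation given for retuning them.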
18-06-06 31m RookOFPP^ diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 26496 W: 5261 L: 5268 D: 15967
sprt @ 10+0.1 th 1 The versions that included a middlegame bonus failed badly, but the endgame-only version failed with a nearly neutral Elo estimate (+0.12). Try 50% larger effect.
18-06-06 31m RookOFPP diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 20841 W: 4137 L: 4170 D: 12534
sprt @ 10+0.1 th 1 Also try endgame-only double effect.
18-06-06 31m RookOFPP diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 22100 W: 4448 L: 4474 D: 13178
sprt @ 10+0.1 th 1 Same, but endgame bonus only.
18-06-06 31m RookOFPP^^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 14980 W: 2985 L: 3044 D: 8951
sprt @ 10+0.1 th 1 It occurs to me that a bonus for open files, applied when only we have rooks, has shown some potential in the past, and the same is true for passed pawns. However, giving too much bonus for one implicitly penalizes the other. Try combining them. If only we have rooks, add the number of friendly passed pawns and open files and apply a saturated bonus, such that 1 leads to S(8, 8) but 8 leads only to S(20, 20).
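The note gives only two points of the saturated bonus: a count of 1 yields S(8, 8) and a count of 8 yields S(20, 20). The patch's actual formula is not shown here. The sketch below uses one hypothetical curve, 8 + 4*log2(n), chosen only because it passes through both points; the function name and the cap at 8 are likewise assumptions.

```python
import math

def saturated_bonus(passed_pawns, open_files, only_we_have_rooks):
    """Return an (mg, eg) bonus that grows slowly with the combined count.

    The logarithmic shape is an assumption; only the two endpoints come from the note.
    """
    n = min(passed_pawns + open_files, 8)  # assumed cap at the largest count in the note
    if not only_we_have_rooks or n <= 0:
        return (0, 0)
    value = round(8 + 4 * math.log2(n))
    return (value, value)
```

With a curve of this shape each additional passed pawn or open file adds less than the one before, so neither term can dominate the other by accumulating linearly.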
18-06-06 31m RookOFPP^ diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 5975 W: 1116 L: 1217 D: 3642
sprt @ 10+0.1 th 1 Same, but middlegame bonus only.
18-06-05 31m RookPassedPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 22871 W: 4576 L: 4599 D: 13696
sprt @ 10+0.1 th 1 One more check: S(0, 8). If this fails, I may be running out of ideas on this branch.
18-06-05 31m RookPassedPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 19834 W: 3948 L: 3985 D: 11901
sprt @ 10+0.1 th 1 Make sure that the yellow result for S(0, 5)--33K games, +0.8 Elo--wasn't due to insufficient effect size, because this logic is much narrower than some of my previous S(0, 5) tests. Try S(3, 8).
18-06-04 31m RookPassedPawn diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 33335 W: 6793 L: 6766 D: 19776
sprt @ 10+0.1 th 1 Another version. S(0, 5) bonus for every "extra" passed pawn we have, if we have rooks but the opponent does not. This differs from previous tests by subtracting the number of enemy passed pawns from our own.
18-06-04 31m RookPassedPawn^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 24328 W: 4875 L: 4891 D: 14562
sprt @ 10+0.1 th 1 A tweak of the yellow passerNoOpposingR2 tests. If we have at least one rook, give a small bonus for every passed pawn of either color--rooks are uniquely helpful not just for defending our own passed pawns, but also for attacking the enemy's passed pawns. If both sides have rooks, this cancels out. S(2, 2).
18-06-04 31m RookPassedPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 12491 W: 2470 L: 2541 D: 7480
sprt @ 10+0.1 th 1 Same, but S(0, 5).
18-06-04 31m passerNoOpposingR2^ diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 36951 W: 7386 L: 7345 D: 22220
sprt @ 10+0.1 th 1 I had debated whether S(2, 2) was large enough to have an effect; I decided no and tried S(5, 5). S(5, 5) was clearly large enough to have an effect--it failed in fewer than 4000 games!--so let's try the much smaller effect I originally intended.
18-06-04 31m passerNoOpposingR2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 33174 W: 6613 L: 6589 D: 19972
sprt @ 10+0.1 th 1 S(0, 5). See S(5, 0) test for further explanation.
18-06-04 31m passerNoOpposingR2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 11479 W: 2324 L: 2399 D: 6756
sprt @ 10+0.1 th 1 Intuitively, S(0, 10) seems too large, but I would like to test this to be sure.
18-06-04 31m passerNoOpposingR2^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 11806 W: 2289 L: 2363 D: 7154
sprt @ 10+0.1 th 1 Try to split up the terrible S(5, 5) patch into MG and EG components to see if one specifically caused the regression. S(5, 0).
18-06-04 31m passerNoOpposingR2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 3731 W: 710 L: 822 D: 2199
sprt @ 10+0.1 th 1 If we have a rook and the enemy does not, give a small bonus for every one of our passed pawns. Here, S(5, 5).
18-06-03 31m passerNoOpposingR diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 19284 W: 3858 L: 3897 D: 11529
sprt @ 10+0.1 th 1 Revert to the best version so far (entire file, ignore queens) and increase the effect from k += 2 to k += 3.
18-06-03 31m passerNoOpposingR diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10773 W: 2149 L: 2228 D: 6396
sprt @ 10+0.1 th 1 Also consider queens in both conditions.
18-06-03 31m passerNoOpposingR diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 33753 W: 6765 L: 6738 D: 20250
sprt @ 10+0.1 th 1 Same, but also consider support from in front of the pawn, i.e., file_bb(s) rather than forward_file_bb(Them, s).
18-06-03 31m passerNoOpposingR^ diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 23425 W: 4666 L: 4687 D: 14072
sprt @ 10+0.1 th 1 Another passed pawn idea. Rooks are the best for attacking/defending passed pawns, because of their influence throughout the entire file. Give extra passed pawn bonus if we support our pawn from behind with a rook, but the opponent has no rook to answer. This could be helpful in imbalanced endgames.
18-06-03 31m passerDoubleSupport diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 8439 W: 1632 L: 1721 D: 5086
sprt @ 10+0.1 th 1 It seems that what is potentially needed is not a narrowing of this bonus, but an expansion of it. Also give the extra bonus if we attack the (empty) blockSq twice. (This is a different form of double support.)
18-06-02 31m passerDoubleSupport diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 38233 W: 7637 L: 7590 D: 23006
sprt @ 10+0.1 th 1 A narrow restriction to the 14K green/48K yellow: don't apply extra bonus if the enemy also has two rooks/queens on the file.
18-06-02 31m passerDoubleSupport diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 13198 W: 2587 L: 2655 D: 7956
sprt @ 10+0.1 th 1 My apologies: I think I intended to use between_bb rather than LineBB. Try this version instead.
18-06-02 31m passerDoubleSupport diff
LLR: -1.35 (-2.94,2.94) [0.00,5.00]
Total: 5935 W: 1194 L: 1225 D: 3516
sprt @ 10+0.1 th 1 A tweak to the 14K green/48K yellow: exclude the case where enemy pieces block our double support of the passer. (Hopefully, I've done this correctly: take the two friendly rooks/queens/pawns furthest apart on the file, and make sure there are no enemies between them.)
18-06-01 31m passerDoubleSupport diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 42965 W: 8730 L: 8659 D: 25576
sprt @ 10+0.1 th 1 I was disappointed to see the STC 14K green fail yellow at LTC, since a prior test on this branch had scaled very well. Try increasing the size of the effect even further, to k += 4. My hope is that a large effect, if still a gain, can give an LTC green if it makes it past STC (i.e., that the current effect is of inadequate size at LTC).
18-06-01 31m passerDoubleSupport2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 47773 W: 9762 L: 9668 D: 28343
sprt @ 10+0.1 th 1 Check the effect of the new logic (using the whole file) on this 96K yellow.
18-06-01 31m passerDoubleSupport diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 35182 W: 7113 L: 7079 D: 20990
sprt @ 10+0.1 th 1 Greater effect: k += 3.
18-06-01 31m passerDoubleSupport^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 8728 W: 1700 L: 1788 D: 5240
sprt @ 10+0.1 th 1 One of a few variants of the current LTC, with priority -1 until it finishes, to be raised to 0 if it fails and deleted if it passes. If I am unavailable to do so, hopefully someone else can (as it's not at all clear when this LTC will finish). Here, try less effect: k++ rather than k += 2.
18-06-01 31m passerDoubleSupport diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 33053 W: 6635 L: 6611 D: 19807
sprt @ 10+0.1 th 1 A different idea. Consider cases where we have more rook/queen defenders along the file than the opponent has rook/queen attackers: at least two friendly rooks/queens behind the passed pawn versus fewer than two enemy ones in front of it, or at least one versus none. (I ignore 3 vs 2, because I suspect this is very rare.)
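The two cases in this condition can be written as a small predicate. This is a sketch of the rule as described, with hypothetical names; counting the pieces on the file is left out.

```python
def has_file_majority(friendly_behind, enemy_in_front):
    """True when our rooks/queens behind the passer outnumber the enemy ones in front.

    Covers "2 or more vs fewer than 2" and "1 or more vs 0"; 3 vs 2 is deliberately ignored.
    """
    return (friendly_behind >= 2 and enemy_in_front < 2) or \
           (friendly_behind >= 1 and enemy_in_front == 0)
```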
18-06-01 31m passerDoubleSupport diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 48645 W: 7092 L: 7038 D: 34515
sprt @ 60+0.6 th 1 LTC: A wider application of the same bonus as the 60K STC, 61K LTC yellow. Rather than only consider defense of the passed pawn from behind, consider the entire file.
18-06-01 31m passerDoubleSupport diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 14052 W: 2949 L: 2755 D: 8348
sprt @ 10+0.1 th 1 A wider application of the same bonus as the 60K STC, 61K LTC yellow. Rather than only consider defense of the passed pawn from behind, consider the entire file.
18-06-01 31m passerDoubleSupport^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 25329 W: 5186 L: 5196 D: 14947
sprt @ 10+0.1 th 1 Another attempt to narrow the extra bonus. Require that the passer, supported twice from behind, is not attacked twice from its front.
18-06-01 31m passerDoubleSupport^^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 16963 W: 3394 L: 3444 D: 10125
sprt @ 10+0.1 th 1 It appears more likely than not that neither speculative LTC will pass. However, there's still useful information: passerDoubleSupport appears to scale much better than passerDoubleSupport2 (60K yellow -> 60K games and counting, versus 96K yellow -> 18K red). Let's try a few tweaks to the logic in the 60K yellow and hope for a green. Here, consider rooks only; previous versions included queens.
18-05-30 31m ee33a3f7deb5a63d96466af diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 61227 W: 9107 L: 9005 D: 43115
sprt @ 60+0.6 th 1 Speculative LTC for this 60K yellow STC from passerDoubleSupport. If this also scales poorly, perhaps there is no LTC Elo to gain here. Low throughput (166).