Stockfish Testing Queue

Finished - 977 tests

18-06-30 31m HangingQueenPinner diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 14222 W: 3151 L: 3212 D: 7859
sprt @ 10+0.1 th 1 Since the framework is mostly empty, try a middle-ground approach, -30% WeakQueen compensation.
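The bracket (-2.94,2.94) shown on each LLR line is consistent with the standard Wald SPRT stopping bounds. A minimal sketch, assuming error rates alpha = beta = 0.05 (the listing does not state them):

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald SPRT stopping bounds for the log-likelihood ratio (LLR)."""
    lower = math.log(beta / (1 - alpha))   # reject the patch at or below this
    upper = math.log((1 - beta) / alpha)   # accept the patch at or above this
    return lower, upper

lower, upper = sprt_bounds()
print(f"({lower:.2f},{upper:.2f})")  # (-2.94,2.94)
```

Under this reading, a run stops as failed once its LLR drops to the lower bound, which fits the many runs in this list that end at about -2.95.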
18-06-29 31m HangingQueenPinner diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 9550 W: 2064 L: 2149 D: 5337
sprt @ 10+0.1 th 1 An aggressive attempt, doubling the compensation. Reduce WeakQueen by 40%.
18-06-29 31m tweak_WeakQueen diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 27274 W: 6058 L: 6114 D: 15102
sprt @ 10+0.1 th 1 This tweak improves HangingQueenPinner, but it's not clear if the improvement comes from their interaction or if this tweak would perform better alone. Let's test it: reduce WeakQueen by 20% to S(40, 8).
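The percentage reductions in these WeakQueen tests can be checked with simple arithmetic. A sketch, assuming the unmodified value is S(50, 10): that baseline is inferred from "reduce WeakQueen by 20% to S(40, 8)" and is not stated in the listing. `scale_score` is a hypothetical helper, not a Stockfish function.

```python
def scale_score(score, percent_reduction):
    """Reduce both the middlegame and endgame parts by a percentage."""
    mg, eg = score
    keep = 100 - percent_reduction
    # Integer arithmetic avoids floating-point rounding surprises.
    return (mg * keep // 100, eg * keep // 100)

BASELINE = (50, 10)  # assumed baseline, inferred from the 20% -> S(40, 8) comment

print(scale_score(BASELINE, 20))  # (40, 8)
```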
18-06-29 31m HangingQueenPinner diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 42993 W: 9572 L: 9490 D: 23931
sprt @ 10+0.1 th 1 Counterintuitively, it seems that the narrower logic in this patch actually leads to this loop being activated about 19% more often, not less (based on applying dbg_mean_of() to the boolean during bench). Compensate by reducing the value of WeakQueen accordingly.
18-06-29 31m HangingQueenPinner diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 21010 W: 4630 L: 4658 D: 11722
sprt @ 10+0.1 th 1 Since the framework isn't very busy, here's a sanity check (and the more intuitive change) for comparison: apply the opposite compensation to WeakQueen.
18-06-29 31m HangingQueenPinner diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 11207 W: 2482 L: 2558 D: 6167
sprt @ 10+0.1 th 1 An attempt to implement an idea Bryan posted in response to TCEC 12 Superfinal Game 38. No WeakQueen penalty if the discovered attacker is hanging, in which case the queen can simply take the "attacking" piece if the enemy blocker is moved by the opponent. The logic in this patch is technically inaccurate if there are two or more blockers/pinners, but this occurs in only 0.07% of positions, and this logic should be fast to compute.
18-06-28 31m skewerThreat diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 6755 W: 1444 L: 1542 D: 3769
sprt @ 10+0.1 th 1 Upon further consideration, @Hanamuke had a really good point about a bug in my previous implementation, which occurred if the bishop and first rook were on the same file/rank, so the bishop had 2 different squares from which to attack the rook. Try this implementation (with rooks only), which hopefully solves all the previous issues.
18-06-26 31m skewerThreat diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 8129 W: 1772 L: 1864 D: 4493
sprt @ 10+0.1 th 1 S(10, 10) again. Ignore queens and only consider skewers of two rooks by an enemy bishop, inspired partly by helpful feedback from @Hanamuke. This eliminates the problem of undefended bishops, but is narrower than originally intended. A broader but still accurate implementation will be a bit more complicated, so try this first.
18-06-26 31m skewerThreat diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 6635 W: 1420 L: 1519 D: 3696
sprt @ 10+0.1 th 1 I don't understand why the MG+EG version is nearly neutral while either component by itself is terrible. Merge new master and try double effect, S(20, 20), for more information. I anticipate a failure in fewer than 20K games, but I want to be sure.
18-06-26 31m skewerThreat diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 7127 W: 1505 L: 1601 D: 4021
sprt @ 10+0.1 th 1 Endgame only, S(0, 10).
18-06-26 31m skewerThreat^ diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 3020 W: 634 L: 752 D: 1634
sprt @ 10+0.1 th 1 Middlegame only, S(10, 0).
18-06-26 31m skewerThreat diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 20760 W: 4602 L: 4631 D: 11527
sprt @ 10+0.1 th 1 Inspired by Bryan's analysis of TCEC 12 SF Game 29, move 70. S(10, 10) penalty if the opponent has a bishop skewer threat. More specifically, check all bishop attacks from our rook square. If they intersect with both a friendly rook/queen and a landing square for an enemy bishop along a single diagonal, but not an enemy bishop already present (i.e., there is not already a skewer), then apply penalty. This will also include bishop forks.
18-06-25 31m passerDoubleSupport4 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10256 W: 2249 L: 2330 D: 5677
sprt @ 10+0.1 th 1 I also hope I can retry this 14K green / 48K yellow from June 1, since many tweaks have been made to passed pawn evaluation since then: PassedRank, PassedFile, and PassedDanger. PassedDanger, in particular, interacts very directly with this patch, through the bonus k*w: w is PassedDanger, and k is changed by this patch.
18-06-25 31m tweak_threatOnPawn2 diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 46089 W: 10072 L: 10055 D: 25962
sprt @ 10+0.1 th 1 My apologies for my recent absence from fishtest while traveling. I'm very impressed by the recent tuning work by @candirufish and am curious how it affects the most recent branch I had been working on, since that tuning substantially changes a lot of pawn evaluation. I also know that results from previous tweaks to ThreatByMinor/Rook have changed dramatically in just a few weeks of other commits (e.g., the queen tweak went from just barely failing yellow at LTC, after STC green, to -7 Elo when retested 5-6 weeks later). This was a 90K STC yellow, 104K LTC yellow on June 11-12.
18-06-13 31m tweak_threatOnPawn diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 42688 W: 9452 L: 9447 D: 23789
sprt @ 10+0.1 th 1 Because the framework is mostly empty, let's merge the new master and try much smaller compensation. I know this is a very small tweak, but (a) this is applied to each and every pawn on the board, so it's likely quite sensitive to small changes (a change of 3 already produced a large change), and (b) the main patch was 90K yellow STC, 104K yellow LTC, and thus we only need a small improvement.
18-06-13 31m tweak_threatOnPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 30197 W: 6668 L: 6713 D: 16816
sprt @ 10+0.1 th 1 It appears that, on average, there are about 0.5 threats by minors or rooks on pawns per bench position. Revert to the best take, which gave a 6 cp middlegame bonus to the side with the threats, and compensate by increasing the middlegame value of a pawn by 3 cp.
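One way to read the compensation arithmetic: with roughly 0.5 such threats per position and a 6 cp middlegame bonus per threat, the average evaluation shift is about 3 cp, which matches the 3 cp pawn-value adjustment in the comment. A sketch of that calculation (the variable names are illustrative only):

```python
threats_per_position = 0.5  # approximate average over bench positions, per the comment
bonus_per_threat_cp = 6     # middlegame bonus per threat, per the comment

# Average middlegame shift introduced by the bonus, in centipawns.
average_shift_cp = threats_per_position * bonus_per_threat_cp
print(average_shift_cp)  # 3.0
```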
18-06-13 31m combo_TOP_delta diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 36502 W: 8128 L: 8147 D: 20227
sprt @ 10+0.1 th 1 Another attempt to combo with a promising tweak: @cancetin's delta (61K green STC, 126K yellow LTC). I may not be available to start a LTC run if this passes and would be grateful to anyone who does.
18-06-13 31m combo_TOP_CO diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 21779 W: 4779 L: 4857 D: 12143
sprt @ 10+0.1 th 1 Combo [0, 4]. Even though my speculative LTC failed yellow, it performed at least as well as the STC (90K yellow -> 104K yellow). I'm encouraged by the good scaling to try combo patches. Combine with @snicolet's recent, almost-passed complexity offset tweak (37K green -> 73K yellow). I may not be available to start a LTC run if this passes and would be grateful to anyone who does.
18-06-12 31m tweak_threatOnPawn^^ diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 104513 W: 17939 L: 17753 D: 68821
sprt @ 60+0.6 th 1 Speculative LTC for this 90K yellow, which will at least reveal whether this scales well. Low throughput (166).
18-06-12 31m tweak_threatOnPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 45188 W: 9981 L: 9967 D: 25240
sprt @ 10+0.1 th 1 Merge the SPSA results and try again. Interestingly, the eg components have changed to accommodate the new mg bonuses, which have changed slightly. Hopefully this turns the 90K yellow green; if not, I'll try a speculative LTC to get an idea of how this branch scales.
18-06-12 31m tune_threatOnPawn diff
27980/30000 iterations
58516/60000 games played
60000 @ 20+0.2 th 1 I know that (a) I may be close to a green, due to a recent 90K yellow (90.3% LOS, +1.14 Elo), and (b) small changes can make a big difference on this branch. Try to accomplish the rest by SPSA. Tune also the endgame components, since they are likely not orthogonal to brand-new mg ones. Quadruple ck for the new mg values, and double ck for the old eg ones.
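LOS and Elo figures like those quoted here can be estimated from a run's W/L/D totals. A minimal sketch using the common logistic Elo formula and a normal approximation for LOS; both are general-purpose formulas assumed for illustration, since the listing does not say how fishtest derived its numbers. It is applied to the final totals of the 90K run listed below.

```python
import math

def elo_from_wld(wins, losses, draws):
    """Logistic Elo difference implied by the match score."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400 * math.log10(1 / score - 1)

def los_from_wl(wins, losses):
    """Likelihood of superiority, normal approximation (draws carry no information)."""
    return 0.5 * (1 + math.erf((wins - losses) / math.sqrt(2 * (wins + losses))))

W, L, D = 20119, 19924, 50661  # final totals of the 90K tweak_threatOnPawn run
print(f"Elo: {elo_from_wld(W, L, D):+.2f}")
print(f"LOS: {los_from_wl(W, L):.1%}")
```

These estimates need not reproduce the figures quoted in the comment, which may have been read off at a different point in the run or computed with a different model.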
18-06-11 31m tweak_threatOnPawn^ diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 90704 W: 20119 L: 19924 D: 50661
sprt @ 10+0.1 th 1 So far, it appears that larger values may be better over master's 0 (2, +0.27 Elo; 4, +0.63 Elo). Continue increasing: try 6.
18-06-11 31m tweak_threatOnPawn diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 20779 W: 4579 L: 4661 D: 11539
sprt @ 10+0.1 th 1 Value 8.
18-06-11 31m tweak_threatOnPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 52554 W: 10510 L: 10477 D: 31567
sprt @ 10+0.1 th 1 Double effect: increase the middlegame component of ThreatByMinor[PAWN] and ThreatByRook[PAWN] from 0 to 4.
18-06-11 31m tweak_threatOnPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 39413 W: 7813 L: 7829 D: 23771
sprt @ 10+0.1 th 1 Inspired by @xoroshiro's work: applying a mild middlegame penalty for our own hanging pawns appeared strong at STC but yellow at LTC; simply increasing the Hanging bonus failed. We appear to have ThreatByMinor and ThreatByRook for pawns, but the middlegame component is zero--maybe this is what is missing. Increase to 2.
18-06-09 31m passerDoubleSupport3 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10800 W: 2075 L: 2154 D: 6571
sprt @ 10+0.1 th 1 I can't believe I didn't think of this before. Expand the bonus by removing the k > 0 condition: bonus will be given even if the blocking square appears unsafe, if we have two or more rooks or queens on the file. Hopefully this maintains the green STC result while improving on the yellow LTC.
18-06-08 31m tune_passerDoubleSuppor diff
28324/30000 iterations
59159/60000 games played
60000 @ 20+0.2 th 1 Rather than search for new greens, I would like to try to improve the scaling of this green STC/yellow LTC. Since it changes k, which interacts with PassedDanger values w, it could change the optimum values of PassedDanger dramatically (because even a small change in k dramatically changes the resulting Score, (k*w, k*w)). Tune the four nonzero values of PassedDanger with quadruple the initially recommended ck (since these were quite small).
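On the role of ck: in SPSA, ck is the size of the random perturbation applied to each parameter before the two perturbed variants are compared, so a larger ck probes values further from the current ones. The following is a generic textbook-style sketch with constant gains and a toy deterministic loss, not fishtest's implementation; `spsa_step`, `toy_loss`, and all numeric values are illustrative assumptions.

```python
import random

def spsa_step(theta, ck, ak, loss, rng):
    """One SPSA iteration: perturb all parameters at once, compare, update."""
    # Random +/-1 direction for every parameter.
    delta = [rng.choice((-1, 1)) for _ in theta]
    plus = [t + c * d for t, c, d in zip(theta, ck, delta)]
    minus = [t - c * d for t, c, d in zip(theta, ck, delta)]
    diff = loss(plus) - loss(minus)
    # Per-parameter gradient estimate is diff / (2 * ck * delta).
    return [t - a * diff / (2 * c * d)
            for t, a, c, d in zip(theta, ak, ck, delta)]

def toy_loss(params, target=(12.0, 30.0)):
    """Stand-in for 'play games and measure the result': a simple quadratic."""
    return sum((p - g) ** 2 for p, g in zip(params, target))

rng = random.Random(1)
theta = [5.0, 20.0]
for _ in range(500):
    theta = spsa_step(theta, ck=[2.0, 2.0], ak=[0.05, 0.05], loss=toy_loss, rng=rng)
print([round(t, 3) for t in theta])
```

In a real tuning run the loss is a noisy match result rather than a smooth function, which is one reason the choice of ck matters: too small and the difference between the two variants drowns in noise.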
18-06-06 31m RookOFPP^ diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 26496 W: 5261 L: 5268 D: 15967
sprt @ 10+0.1 th 1 The versions that included a middlegame bonus failed badly, but the endgame-only version failed with a nearly neutral Elo estimate (+0.12). Try 50% larger effect.
18-06-06 31m RookOFPP diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 20841 W: 4137 L: 4170 D: 12534
sprt @ 10+0.1 th 1 Also try endgame-only double effect.
18-06-06 31m RookOFPP diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 22100 W: 4448 L: 4474 D: 13178
sprt @ 10+0.1 th 1 Same, but endgame bonus only.
18-06-06 31m RookOFPP^^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 14980 W: 2985 L: 3044 D: 8951
sprt @ 10+0.1 th 1 It occurs to me that a bonus for open files when only we have rooks has shown some potential in the past, and the same is true of a bonus for passed pawns. However, giving too much bonus for one implicitly penalizes the other. Try combining them. If only we have rooks, add the number of friendly passed pawns and open files and apply a saturated bonus, such that 1 leads to S(8, 8) but 8 leads only to S(20, 20).
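The comment fixes only two points of the saturated bonus: a count of 1 gives S(8, 8) and a count of 8 gives S(20, 20). One hypothetical curve through both points is logarithmic. This illustrates what "saturated" could mean; it is not the formula used in the patch, and the cap at 8 is an added assumption.

```python
import math

def saturated_bonus(count):
    """Hypothetical saturating bonus: 8 at count 1, rising only to 20 at count 8."""
    if count <= 0:
        return (0, 0)
    # Each doubling of the count adds 4; counts above 8 are capped (assumption).
    value = round(8 + 4 * math.log2(min(count, 8)))
    return (value, value)

for n in (1, 2, 4, 8):
    print(n, saturated_bonus(n))
```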
18-06-06 31m RookOFPP^ diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 5975 W: 1116 L: 1217 D: 3642
sprt @ 10+0.1 th 1 Same, but middlegame bonus only.
18-06-05 31m RookPassedPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 22871 W: 4576 L: 4599 D: 13696
sprt @ 10+0.1 th 1 One more check: S(0, 8). If this fails, I may be running out of ideas on this branch.
18-06-05 31m RookPassedPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 19834 W: 3948 L: 3985 D: 11901
sprt @ 10+0.1 th 1 Make sure that the yellow result for S(0, 5)--33K games, +0.8 Elo--wasn't due to insufficient effect size, because this logic is much narrower than some of my previous S(0, 5) tests. Try S(3, 8).
18-06-04 31m RookPassedPawn diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 33335 W: 6793 L: 6766 D: 19776
sprt @ 10+0.1 th 1 Another version. S(0, 5) bonus for every "extra" passed pawn we have, if we have rooks but the opponent does not. This differs from previous tests by subtracting the number of enemy passed pawns from our own.
18-06-04 31m RookPassedPawn^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 24328 W: 4875 L: 4891 D: 14562
sprt @ 10+0.1 th 1 A tweak of the yellow passerNoOpposingR2 tests. If we have at least one rook, give a small bonus for every passed pawn of either color--rooks are uniquely helpful not just for defending our own passed pawns, but also for attacking the enemy's passed pawns. If both sides have rooks, this cancels out. S(2, 2).
18-06-04 31m RookPassedPawn diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 12491 W: 2470 L: 2541 D: 7480
sprt @ 10+0.1 th 1 Same, but S(0, 5).
18-06-04 31m passerNoOpposingR2^ diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 36951 W: 7386 L: 7345 D: 22220
sprt @ 10+0.1 th 1 I had debated whether S(2, 2) was large enough to have an effect; I decided no and tried S(5, 5). S(5, 5) was clearly large enough to have an effect--it failed in fewer than 4000 games!--so let's try the much smaller effect I originally intended.
18-06-04 31m passerNoOpposingR2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 33174 W: 6613 L: 6589 D: 19972
sprt @ 10+0.1 th 1 S(0, 5). See S(5, 0) test for further explanation.
18-06-04 31m passerNoOpposingR2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 11479 W: 2324 L: 2399 D: 6756
sprt @ 10+0.1 th 1 Intuitively, S(0, 10) seems too large, but I would like to test this to be sure.
18-06-04 31m passerNoOpposingR2^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 11806 W: 2289 L: 2363 D: 7154
sprt @ 10+0.1 th 1 Try to split up the terrible S(5, 5) patch into MG and EG components to see if one specifically caused the regression. S(5, 0).
18-06-04 31m passerNoOpposingR2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 3731 W: 710 L: 822 D: 2199
sprt @ 10+0.1 th 1 If we have a rook and the enemy does not, give a small bonus for every one of our passed pawns. Here, S(5, 5).
18-06-03 31m passerNoOpposingR diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 19284 W: 3858 L: 3897 D: 11529
sprt @ 10+0.1 th 1 Revert to the best version so far (entire file, ignore queens) and increase the effect from k += 2 to k += 3.
18-06-03 31m passerNoOpposingR diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10773 W: 2149 L: 2228 D: 6396
sprt @ 10+0.1 th 1 Also consider queens in both conditions.
18-06-03 31m passerNoOpposingR diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 33753 W: 6765 L: 6738 D: 20250
sprt @ 10+0.1 th 1 Same, but also consider support from in front of the pawn, i.e., file_bb(s) rather than forward_file_bb(Them, s).
18-06-03 31m passerNoOpposingR^ diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 23425 W: 4666 L: 4687 D: 14072
sprt @ 10+0.1 th 1 Another passed pawn idea. Rooks are the best for attacking/defending passed pawns, because of their influence throughout the entire file. Give extra passed pawn bonus if we support our pawn from behind with a rook, but the opponent has no rook to answer. This could be helpful in imbalanced endgames.
18-06-03 31m passerDoubleSupport diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 8439 W: 1632 L: 1721 D: 5086
sprt @ 10+0.1 th 1 It seems that what is potentially needed is not a narrowing of this bonus, but an expansion of it. Also give the extra bonus if we attack the (empty) blockSq twice. (This is a different form of double support.)
18-06-02 31m passerDoubleSupport diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 38233 W: 7637 L: 7590 D: 23006
sprt @ 10+0.1 th 1 A narrow restriction to the 14K green/48K yellow: don't apply extra bonus if the enemy also has two rooks/queens on the file.
18-06-02 31m passerDoubleSupport diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 13198 W: 2587 L: 2655 D: 7956
sprt @ 10+0.1 th 1 My apologies: I think I intended to use between_bb rather than LineBB. Try this version instead.
18-06-02 31m passerDoubleSupport diff
LLR: -1.35 (-2.94,2.94) [0.00,5.00]
Total: 5935 W: 1194 L: 1225 D: 3516
sprt @ 10+0.1 th 1 A tweak to the 14K green/48K yellow: exclude the case where enemy pieces block our double support of the passer. (Hopefully, I've done this correctly: take the two friendly rooks/queens/pawns furthest apart on the file, and make sure there are no enemies between them.)
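The check described in parentheses can be sketched on a single file using rank numbers instead of bitboards. `no_enemy_between` is a hypothetical helper, not Stockfish code; ranks run from 1 to 8, and only squares strictly between the two outermost friendly pieces are examined, not the whole file.

```python
def no_enemy_between(friendly_ranks, enemy_ranks):
    """True if no enemy piece sits strictly between the outermost friendly pieces."""
    if len(friendly_ranks) < 2:
        return False  # assumption: without two supporters there is no double support
    lo, hi = min(friendly_ranks), max(friendly_ranks)
    return not any(lo < r < hi for r in enemy_ranks)

print(no_enemy_between([1, 6], [3]))  # False: the enemy on rank 3 blocks the support
print(no_enemy_between([1, 6], [7]))  # True: the enemy is outside the supported span
```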