Stockfish Testing Queue

Finished - 1117 tests

18-07-16 31m KingDangerQueens diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 36839 W: 8220 L: 8169 D: 20450
sprt @ 10+0.1 th 1 kingDanger > 800.
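A note on the LLR lines in these entries: the (-2.94,2.94) pair matches Wald's SPRT stopping bounds if the error rates are alpha = beta = 0.05. The sketch below rests on that assumption; the function names are hypothetical and this is not fishtest's own code. Displayed LLR values are rounded to two decimals, so a value shown as exactly 2.94 or -2.94 may sit on either side of the true bound.

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's SPRT stopping bounds on the log-likelihood ratio (LLR)."""
    lower = math.log(beta / (1.0 - alpha))   # stop and reject at or below this
    upper = math.log((1.0 - beta) / alpha)   # stop and accept at or above this
    return lower, upper

def sprt_state(llr, alpha=0.05, beta=0.05):
    """Classify a test by its current LLR (assumed error rates of 5%)."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr <= lower:
        return "failed"
    if llr >= upper:
        return "passed"
    return "running"

lower, upper = sprt_bounds()
print(round(lower, 2), round(upper, 2))
print(sprt_state(-2.96))
```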
18-07-16 31m KingDangerQueens^ diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 9905 W: 2181 L: 2264 D: 5460
sprt @ 10+0.1 th 1 Halfway through the brief LTC tuning, there are no large changes, and I'm not sure why (inadequate ck?). For now, experiment with the one change that appeared faintly in the results: reducing the threshold. Use kingDanger > 900. (The best result so far was with 1000.)
18-07-15 31m tune_KingDangerQueens diff
5192/10000 iterations
10991/20000 games played
20000 @ 60+0.6 th 1 For a mostly empty framework, a low-throughput LTC tuning session modeled on weakpawn_tune. A slight rewrite of the test that was strong at STC but appeared to be a regression at LTC; try to find values that will scale better. 20K 60+0.6 games seem to be a better investment than the equivalent 60K 20+0.2 games.
18-07-16 31m KingDangerQueens diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 20189 W: 4524 L: 4555 D: 11110
sprt @ 10+0.1 th 1 Also restore the best version so far (top quartile of kingDanger, increase by 10%) and try an idea inspired by @Vizvezdenec's comment. Consider any difference in the number of queens.
18-07-15 31m KingDangerQueens^^ diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 23132 W: 5167 L: 5184 D: 12781
sprt @ 10+0.1 th 1 Apply the increase in kingDanger to the top tenth.
18-07-15 31m KingDangerQueens^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 22305 W: 4921 L: 4942 D: 12442
sprt @ 10+0.1 th 1 Return to using the top quartile, but increase by 5% rather than 10%.
18-07-15 31m KingDangerQueens diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 22113 W: 4947 L: 4969 D: 12197
sprt @ 10+0.1 th 1 Increase the top quartile of kingDanger values by 15%.
18-07-15 31m KingDangerQueens^^^ diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 7544 W: 1629 L: 1723 D: 4192
sprt @ 10+0.1 th 1 I was lucky at STC with my initial values; this test introduces two parameters, a % increase to kingDanger and a minimum kingDanger value above which to apply it. Search for better parameters. Here, apply the increase to the top half of kingDanger values rather than the top quarter.
18-07-15 31m KingDangerQueens diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 24589 W: 4173 L: 4197 D: 16219
sprt @ 60+0.6 th 1 LTC. Scale up kingDanger by 10% if we have queens but the opponent does not (i.e., materially imbalanced and/or advantageous positions). Only apply if kingDanger > 1000 (roughly the top quartile of kingDanger values). This should especially penalize positions with already high king danger.
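The rule described above could be sketched roughly as follows. This is an illustration only, not the patch itself: the function name, its arguments, and the use of integer arithmetic are assumptions, and the threshold and percentage are left as parameters because the surrounding tests vary both.

```python
def scale_king_danger(king_danger, we_have_queens, they_have_queens,
                      threshold=1000, percent=10):
    """Return kingDanger, increased by `percent` when we have queens,
    the opponent does not, and kingDanger exceeds `threshold`.

    Hypothetical helper; names and rounding are assumptions.
    """
    if we_have_queens and not they_have_queens and king_danger > threshold:
        # Integer arithmetic, truncating toward zero for positive values.
        return king_danger + king_danger * percent // 100
    return king_danger

print(scale_king_danger(1200, True, False))                 # above threshold
print(scale_king_danger(900, True, False, threshold=800))   # lower threshold
```

Values at or below the threshold, and positions where both sides still have queens, pass through unchanged.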
18-07-14 31m KingDangerQueens diff
LLR: 2.94 (-2.94,2.94) [0.00,5.00]
Total: 17144 W: 3887 L: 3671 D: 9586
sprt @ 10+0.1 th 1 Same, but only apply if kingDanger > 1000. Based on bench, this applies to roughly the top quartile of kingDanger values.
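For a rough sense of scale, the W/L/D totals of this passing test can be turned into a point estimate of the Elo difference with the usual logistic formula. This is a back-of-the-envelope sketch, not the model fishtest uses for its LLR, and it says nothing about the error bars, which are wide at this game count.

```python
import math

def elo_from_wld(wins, losses, draws):
    """Logistic Elo point estimate from win/loss/draw counts."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games      # average score per game
    return 400.0 * math.log10(score / (1.0 - score))

# Totals from the entry above: W: 3887 L: 3671 D: 9586
elo = elo_from_wld(3887, 3671, 9586)
print(f"{elo:+.2f} Elo")
```

With these totals the estimate comes out at roughly +4.4 Elo.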
18-07-14 31m KingDangerQueens^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 14422 W: 3205 L: 3265 D: 7952
sprt @ 10+0.1 th 1 Inspired by the TCEC bonus game, LC0 1-0 Stockfish. I have heard discussions for some time (e.g., AlphaZero games) about SF overvaluing material advantage in the face of extreme king danger or other factors. Here, scale up kingDanger by 10% if we have queens but the opponent does not (i.e., materially imbalanced and/or advantageous positions). This should especially penalize positions with already high king danger.
18-07-14 31m WeakKingDefense2^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 10620 W: 2327 L: 2406 D: 5887
sprt @ 10+0.1 th 1 Unfortunately, I didn't realize the conflict this branch had with a recently committed PR. Try my recently tuned values, but simply overwrite the recently committed tuning where necessary.
18-07-14 31m WeakKingDefense2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 4577 W: 945 L: 1054 D: 2578
sprt @ 10+0.1 th 1 Attempt to merge my code with the new master while preserving as many of the new master values as possible.
18-07-14 31m tune_WeakKingDefense2 diff
27815/30000 iterations
57443/60000 games played
60000 @ 20+0.2 th 1 After 20K games, little to no progress was made. I suspect that I need larger ck values; quadruple them and reschedule.
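To illustrate what quadrupling ck does, here is a minimal sketch of the standard SPSA gain sequences (Spall's form, with the commonly used exponents 0.602 and 0.101). Whether the framework uses exactly this schedule is an assumption, and the numeric values of a, c, and A below are made up for the example.

```python
def spsa_gains(k, a, c, A=0.0, alpha=0.602, gamma=0.101):
    """Step size a_k and perturbation size c_k at iteration k (k >= 1).

    Standard SPSA decay; parameter values are illustrative assumptions.
    """
    a_k = a / (A + k) ** alpha   # how far the parameter moves per iteration
    c_k = c / k ** gamma         # how far the two probe points are spread
    return a_k, c_k

# Quadrupling c scales every c_k by 4, so each pair of games compares
# parameter values that are four times further apart.
for k in (1, 1000, 30000):
    _, small = spsa_gains(k, a=1.0, c=2.0)
    _, large = spsa_gains(k, a=1.0, c=8.0)
    print(k, round(small, 3), round(large, 3))
```

Larger probes make the measured difference between the two sides less noisy, at the cost of a coarser estimate of the local gradient.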
18-07-14 31m tune_WeakKingDefense2 diff
10197/30000 iterations
20994/60000 games played
60000 @ 20+0.2 th 1 Since the framework is completely empty, I'm curious to see if tuning these two parameters will improve the results.
18-07-14 31m WeakKingDefense2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 30800 W: 6801 L: 6780 D: 17219
sprt @ 10+0.1 th 1 Try both including pawns and using the larger weight (which together lead to a large effect), then compensate for the result. This results in very little change to the average kingDanger.
18-07-14 31m WeakKingDefense2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 14558 W: 3192 L: 3252 D: 8114
sprt @ 10+0.1 th 1 Also apply compensation to the previously-failed no-pawns version: compare the new average kingDanger to the master one and apply the difference to the constant term. This requires a much larger tweak to the constant, from +17 to -56.
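The compensation step described above amounts to shifting the constant by the difference between the two averages. A minimal sketch, assuming the kingDanger values have been collected from a bench run of each build; the helper name and the sample numbers are hypothetical and are not the values from this test.

```python
def compensated_constant(constant, master_samples, patch_samples):
    """Shift `constant` so the patch's mean kingDanger matches master's.

    Both sample lists are kingDanger values gathered over the same
    positions (e.g. during bench); all names here are assumptions.
    """
    master_mean = sum(master_samples) / len(master_samples)
    patch_mean = sum(patch_samples) / len(patch_samples)
    # If the patch raises the average, lower the constant by that amount.
    return constant - round(patch_mean - master_mean)

# Illustrative numbers only.
master = [100, 200, 300]
patch = [150, 250, 350]
print(compensated_constant(0, master, patch))
```

This keeps the mean fixed but not the distribution, so positions at the extremes can still be evaluated quite differently.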
18-07-14 31m WeakKingDefense2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 8787 W: 1870 L: 1958 D: 4959
sprt @ 10+0.1 th 1 Compensate by tweaking the constant term to maintain the same average kingDanger. I'm not sure how Elo-sensitive this constant term is, but the framework is empty, so I would like to experiment with it.
18-07-06 31m WeakKingDefense2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 13998 W: 3018 L: 3081 D: 7899
sprt @ 10+0.1 th 1 Include weak pawns, and reduce the coefficient from 30 to 6 to maintain the same average effect.
18-07-06 31m WeakKingDefense2^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 13839 W: 3046 L: 3109 D: 7684
sprt @ 10+0.1 th 1 Try the kingDanger approach suggested by @Rocky640. Begin by implementing his proposed code verbatim.
18-07-05 31m simplify_ThreatByKing diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 108031 W: 18329 L: 18350 D: 71352
sprt @ 60+0.6 th 1 LTC. Try to compensate with a parameter tweak to maintain the same average bonus.
18-07-05 31m simplify_ThreatByKing diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 7677 W: 1772 L: 1623 D: 4282
sprt @ 10+0.1 th 1 Try to compensate with a parameter tweak to maintain the same average bonus.
18-07-05 31m simplify_ThreatByKing diff
LLR: -2.94 (-2.94,2.94) [-3.00,1.00]
Total: 81401 W: 17770 L: 18095 D: 45536
sprt @ 10+0.1 th 1 My first attempt at simplification; I apologize for any errors. Simplify ThreatByKing to be a single Score, removing the more_than_one special case. While working on WeakKingDefense, I noticed that this case only activates in 2-3 positions per 1000 during bench. Perhaps the reason tuning gave a strange negative middlegame value, as mentioned by @candirufish, is that this case is too rare to have a meaningful effect. Try to eliminate it completely.
18-07-05 31m WeakKingDefense^^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 26222 W: 5744 L: 5746 D: 14732
sprt @ 10+0.1 th 1 Based on a comment by @Rocky640 on overload_king. Bonus if the enemy king is defending weak pieces. Start with Overload's S(10, 5).
18-07-05 31m WeakKingDefense diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 18662 W: 4111 L: 4150 D: 10401
sprt @ 10+0.1 th 1 Overload excludes pawns and our double attacks, so try re-excluding our double attacks.
18-07-05 31m WeakKingDefense^ diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 13760 W: 3029 L: 3093 D: 7638
sprt @ 10+0.1 th 1 Overload excludes pawns and our double attacks, so try re-excluding pawns.
18-07-04 31m overload_king diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 38938 W: 8636 L: 8575 D: 21727
sprt @ 10+0.1 th 1 Also try the simplest idea, simply doubling the Overload penalty, i.e., OverloadKing = Overload, to be simplified later.
18-07-04 31m overload_king^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 29351 W: 6504 L: 6490 D: 16357
sprt @ 10+0.1 th 1 S(20, 0).
18-07-04 31m overload_royals diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 17848 W: 3906 L: 3949 D: 9993
sprt @ 10+0.1 th 1 Previous attempts made by @Rocky640 and me at queen overload used similar bonus and logic to overload_king, re-applying the Overload score. I wonder how attackedByKing | attackedByQueen fares...
18-07-04 31m overload_king^^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 15340 W: 3306 L: 3362 D: 8672
sprt @ 10+0.1 th 1 S(15, 0).
18-07-04 31m overload_king^^^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 9767 W: 2126 L: 2209 D: 5432
sprt @ 10+0.1 th 1 Since middlegame-only S(10, 0) was the best so far, quickly search this parameter space with a series of SPRTs. S(5, 0).
18-07-04 31m overload_king diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 32973 W: 7304 L: 7272 D: 18397
sprt @ 10+0.1 th 1 We've considered overloading for queens, but what about kings? A king which is the sole defender of an attacked piece is weak to attack. S(10, 0) bonus if the enemy king is overloaded.
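In bitboard terms, "the enemy king is the sole defender of an attacked piece" might be expressed along these lines. This is a sketch using plain Python integers as 64-bit bitboards; the argument names are hypothetical stand-ins for the engine's attack tables, and which enemy pieces to include (for example, whether to exclude pawns) is left to the caller.

```python
def overloaded_king_targets(enemy_pieces, attacked_by_us,
                            attacked_by_enemy_king, attacked_twice_by_enemy):
    """Enemy pieces we attack whose only defender is the enemy king."""
    return (enemy_pieces
            & attacked_by_us
            & attacked_by_enemy_king
            & ~attacked_twice_by_enemy)   # a second defender rules it out

def overload_king_bonus(targets, mg=10, eg=0):
    """S(mg, eg) per target; S(10, 0) is the value named in this test."""
    n = bin(targets).count("1")           # popcount
    return mg * n, eg * n

def bit(square):
    return 1 << square

# Two attacked enemy pieces next to their king; only one lacks a second defender.
pieces = bit(10) | bit(20)
targets = overloaded_king_targets(pieces,
                                  attacked_by_us=bit(10) | bit(20),
                                  attacked_by_enemy_king=bit(10) | bit(20),
                                  attacked_twice_by_enemy=bit(20))
print(overload_king_bonus(targets))
```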
18-07-04 31m overload_king diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 15022 W: 3294 L: 3351 D: 8377
sprt @ 10+0.1 th 1 Middlegame and endgame. S(10, 10).
18-07-04 31m overload_king diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 15260 W: 3345 L: 3401 D: 8514
sprt @ 10+0.1 th 1 Rather than middlegame-only S(10, 0), try endgame-only S(0, 10) for comparison.
18-07-04 31m PiecePressure diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 14693 W: 3264 L: 3323 D: 8106
sprt @ 10+0.1 th 1 A more complicated attempt. For doubly-attacked, doubly-defended non-pawn enemies, calculate their mobility. If less than or equal to 3 (inspired by trapped rook), apply bonus.
18-07-03 31m PiecePressure diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10660 W: 2279 L: 2358 D: 6023
sprt @ 10+0.1 th 1 I wonder how this performs with non-pawn enemies rather than pawn enemies. S(10, 5) bonus (inspired by Overload) for every non-pawn enemy attacked at least twice and defended at least twice. These pieces are not hanging but still imply that we are attacking.
18-07-03 31m PiecePressure diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 7693 W: 1657 L: 1751 D: 4285
sprt @ 10+0.1 th 1 Smaller effect and middlegame only. S(5, 0).
18-07-02 31m PawnPressure diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 11239 W: 2451 L: 2527 D: 6261
sprt @ 10+0.1 th 1 Inspired by @snicolet's weak_pawns tests, but closer to my previous pin_pressure ones: bonus for enemy pawns doubly attacked and doubly defended, because they imply an attack by our side. Also require that they are neither pawn-attacked (leading to easy exchanges) nor pawn-defended. S(15, 5).
18-07-02 31m PawnPressure diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 6134 W: 1307 L: 1408 D: 3419
sprt @ 10+0.1 th 1 Remove the conditions that the enemy pawns be neither pawn-defended nor pawn-attacked.
18-06-30 31m HangingQueenPinner diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 16385 W: 3560 L: 3611 D: 9214
sprt @ 10+0.1 th 1 I'm starting to wonder if the 42K yellow (-20%) was a fluke. Try -10% compensation, since -30% performed poorly. Possibly my final attempt on this branch.
18-06-30 31m HangingQueenPinner diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 14222 W: 3151 L: 3212 D: 7859
sprt @ 10+0.1 th 1 Since the framework is mostly empty, try a middle-ground approach, -30% WeakQueen compensation.
18-06-29 31m HangingQueenPinner diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 9550 W: 2064 L: 2149 D: 5337
sprt @ 10+0.1 th 1 An aggressive attempt, doubling the compensation. Reduce WeakQueen by 40%.
18-06-29 31m tweak_WeakQueen diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 27274 W: 6058 L: 6114 D: 15102
sprt @ 10+0.1 th 1 This tweak improves HangingQueenPinner, but it's not clear if the improvement comes from their interaction or if this tweak would perform better alone. Let's test it: reduce WeakQueen by 20% to S(40, 8).
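The arithmetic behind "reduce WeakQueen by 20% to S(40, 8)" can be sketched as below. The starting value S(50, 10) is inferred from that statement rather than quoted from the source, and the helper name and the floor-division rounding are assumptions.

```python
def scale_score(mg, eg, percent):
    """Scale a (middlegame, endgame) score pair by a signed percentage."""
    factor = 100 + percent
    return mg * factor // 100, eg * factor // 100

# Inferred baseline S(50, 10), reduced by 20%.
print(scale_score(50, 10, -20))
```

The other compensation levels tried on the HangingQueenPinner branch (-10%, -30%, -40%) can be plugged in the same way.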
18-06-29 31m HangingQueenPinner diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 42993 W: 9572 L: 9490 D: 23931
sprt @ 10+0.1 th 1 Counterintuitively, it seems that the narrower logic in this patch actually leads to this loop being activated about 19% more often, not less (based on applying dbg_mean_of() to the boolean during bench). Compensate by reducing the value of WeakQueen accordingly.
18-06-29 31m HangingQueenPinner diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 21010 W: 4630 L: 4658 D: 11722
sprt @ 10+0.1 th 1 Since the framework isn't very busy, here's a sanity check (and the more intuitive change) for comparison: apply the opposite compensation to WeakQueen.
18-06-29 31m HangingQueenPinner diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 11207 W: 2482 L: 2558 D: 6167
sprt @ 10+0.1 th 1 An attempt to implement an idea Bryan posted in response to TCEC 12 Superfinal Game 38. No WeakQueen penalty if the discovered attacker is hanging, in which case the queen can simply take the "attacking" piece if the enemy blocker is moved by the opponent. The logic in this patch is technically inaccurate if there are two or more blockers/pinners, but this occurs in only 0.07% of positions, and this logic should be fast to compute.
18-06-28 31m skewerThreat diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 6755 W: 1444 L: 1542 D: 3769
sprt @ 10+0.1 th 1 Upon further consideration, @Hanamuke had a really good point about a bug in my previous implementation, which occurred if the bishop and first rook were on the same file/rank, so the bishop had 2 different squares from which to attack the rook. Try this implementation (with rooks only), which hopefully solves all the previous issues.
18-06-26 31m skewerThreat diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 8129 W: 1772 L: 1864 D: 4493
sprt @ 10+0.1 th 1 S(10, 10) again. Ignore queens and only consider skewers of two rooks by an enemy bishop, inspired partly by helpful feedback from @Hanamuke. This eliminates the problem of undefended bishops, but is narrower than originally intended. A broader but still accurate implementation will be a bit more complicated, so try this first.
18-06-26 31m skewerThreat diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 6635 W: 1420 L: 1519 D: 3696
sprt @ 10+0.1 th 1 I don't understand why the MG+EG version is nearly neutral while either component by itself is terrible. Merge new master and try double the effect, S(20, 20), for more information. I anticipate a failure in less than 20K games, but I want to be sure.
18-06-26 31m skewerThreat diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 7127 W: 1505 L: 1601 D: 4021
sprt @ 10+0.1 th 1 Endgame only, S(0, 10).