Stockfish Testing Queue

Finished - 1408 tests

18-10-13 31m combo_uh_wq diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 18906 W: 4041 L: 4131 D: 10734
sprt @ 10+0.1 th 1 Revive sg's update_history (June 3, 73K LTC yellow). Combo with @SFisGOD's 66K yellow mentioned previously.
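
A note on reading the LLR lines: each one shows the current log-likelihood ratio, the stopping bounds in parentheses, and the two Elo hypotheses in brackets. The bounds (-2.94, 2.94) match what Wald's sequential probability ratio test gives for error rates of 5% each. That alpha = beta = 0.05 setting is an assumption inferred from the displayed numbers, and the helper names are hypothetical. A minimal Python sketch:

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's SPRT stopping bounds for the log-likelihood ratio.

    alpha = beta = 0.05 is an assumption inferred from the listed bounds.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

def sprt_status(llr, alpha=0.05, beta=0.05):
    """Classify a test by comparing its LLR against the stopping bounds."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "passed"
    if llr <= lower:
        return "failed"
    return "running"

lower, upper = sprt_bounds()
print(f"bounds: ({lower:.2f}, {upper:.2f})")
print(sprt_status(-2.96), sprt_status(2.95), sprt_status(-1.97))
```

The listed LLR values are rounded to two decimals, so an entry shown as -2.94 may already have crossed the unrounded lower bound.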
18-10-13 31m combo_uh_cp diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 13518 W: 2869 L: 2980 D: 7669
sprt @ 10+0.1 th 1 Revive sg's update_history (June 3, 73K LTC yellow). Combo with sg's 102K yellow mentioned previously.
18-10-13 31m combo_top_ver diff
LLR: -2.94 (-2.94,2.94) [0.00,4.00]
Total: 33496 W: 7148 L: 7181 D: 19167
sprt @ 10+0.1 th 1 Revive my own tweak_threatOnPawn^^ (June 12, LTC 104K yellow). Combo with @DU-jdto's 73K yellow mentioned previously.
18-10-13 31m combo_top_cp diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 27179 W: 5795 L: 5853 D: 15531
sprt @ 10+0.1 th 1 Revive my own tweak_threatOnPawn^^ (June 12, LTC 104K yellow). Combo with sg's 102K yellow mentioned previously.
18-10-13 31m combo_top_wq diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 21020 W: 4401 L: 4483 D: 12136
sprt @ 10+0.1 th 1 Revive my own tweak_threatOnPawn^^ (June 12, LTC 104K yellow). Combo with @SFisGOD's 66K yellow mentioned previously.
18-10-13 31m combo_ver_wq diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 46525 W: 7449 L: 7458 D: 31618
sprt @ 60+0.6 th 1 The STC has already outperformed the one that led to my recent successful speculative LTC. Therefore, before taking any further steps, run an LTC. (I am submitting this now because I probably will not be available when the STC finishes.) Please raise to normal throughput (1000) from the reduced, speculative LTC throughput (166) if the STC passes.
18-10-13 31m combo_ver_wq diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 100567 W: 21521 L: 21296 D: 57750
sprt @ 10+0.1 th 1 I went back to June 21 and found another seemingly unmerged potential combo candidate, @DU-jdto's verification (73K yellow). Combo with @SFisGOD's 66K yellow mentioned previously.
18-10-13 31m combo_ver_cp diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 49404 W: 10555 L: 10527 D: 28322
sprt @ 10+0.1 th 1 I went back to June 21 and found another seemingly unmerged potential combo candidate, @DU-jdto's verification (73K yellow). Combo with sg's 102K yellow mentioned previously.
18-10-13 31m combo_cp_wq diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 57564 W: 12330 L: 12270 D: 32964
sprt @ 10+0.1 th 1 Looking back through long yellow LTC [0, 4] runs, I think we missed an opportunity in sg's connected_pawns from August 15 (102K yellow). I intend to watch carefully for another promising [0, 4] for a combo. Try @SFisGOD's recent 66K LTC yellow, weakQueen5.
18-10-13 31m qsFut^^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 42126 W: 6880 L: 6837 D: 28409
sprt @ 60+0.6 th 1 LTC for @vondele. Take 5, param 9500.
18-10-11 31m BsafeCh1 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 17584 W: 2799 L: 2853 D: 11932
sprt @ 60+0.6 th 1 I'm curious to see how this very promising STC (108K yellow) from @Vizvezdenec scales, and the framework is completely empty. Speculative LTC, low throughput (166).
18-10-11 31m QsafeChQ6 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 23612 W: 3806 L: 3836 D: 15970
sprt @ 60+0.6 th 1 LTC for @Vizvezdenec: Take 6.
18-10-09 31m combo_TOQ4_pss diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 137198 W: 22351 L: 21772 D: 93075
sprt @ 60+0.6 th 1 Speculative LTC for the combo. I am not dissuaded by the long yellow (83K) STC, because I chose pss specifically for its LTC performance. Low throughput (166).
18-10-09 31m combo_TOQ4_pss diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 83913 W: 17986 L: 17825 D: 48102
sprt @ 10+0.1 th 1 Combo my 88K-and-counting LTC tweak with a 113K LTC yellow by @jdonald from August 29, pss. (Can I combo all three? I thought this wasn't allowed, but I think I've seen it done before...)
18-10-09 31m combo_TOQ4_PF diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 27592 W: 5823 L: 5880 D: 15889
sprt @ 10+0.1 th 1 Combo my 88K-and-counting LTC tweak with a 145K LTC yellow by @snicolet from August 18, pawnless_flank6.
18-10-09 31m tweak_threatOnQueen4 diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 29502 W: 6252 L: 6301 D: 16949
sprt @ 10+0.1 th 1 Double effect (+30). It's possible that not many approvers will be available if/when the +15 LTC fails, so I am submitting this now instead. I intend to run this at priority 0 (not the current -1) if the LTC fails and cancel this test otherwise.
18-10-08 31m tweak_threatOnQueen4 diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 38142 W: 8322 L: 8011 D: 21809
sprt @ 10+0.1 th 1 Removing immediate threats against the queen changed my 103K yellow to a 20K red. Check to see if we can't gain Elo by just adding a roughly equivalent amount, 15, to the middlegame components of ThreatByMinor[QUEEN] and ThreatByRook[QUEEN].
18-10-08 31m TrappedQueenRisk^ diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 32963 W: 7033 L: 7004 D: 18926
sprt @ 10+0.1 th 1 Another tweak of the 103K yellow. This was a middlegame-only bonus; add an endgame component of equal value.
18-10-08 31m TrappedQueenRisk diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 13417 W: 2813 L: 2879 D: 7725
sprt @ 10+0.1 th 1 Another tweak of the 103K yellow. This was a middlegame-only bonus; add an endgame component half the size of the middlegame one.
18-10-08 31m TrappedQueenRisk diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 20883 W: 4501 L: 4530 D: 11852
sprt @ 10+0.1 th 1 Tweak of the 103K yellow. I've been including the square the queen currently occupies as "safe", but perhaps I shouldn't--the risk of being trapped is much more about the number of safe squares to move to, regardless of whether the current square is "safe" for now. Exclude it here (remove "| s").
18-10-08 31m TrappedQueenRisk^^^^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 39030 W: 6326 L: 6296 D: 26408
sprt @ 60+0.6 th 1 I might be able to turn the 103K yellow (estimated +1.62 Elo) green with a tuning run or more tweaks, but there's no point if it doesn't scale. Therefore, I would like to jump straight to speculative LTC to at least see if there's anything here (especially since the framework is empty). Low throughput (166).
18-10-08 31m TrappedQueenRisk diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 103794 W: 22327 L: 21956 D: 59511
sprt @ 10+0.1 th 1 Surprised that the larger effect seems better so far--increase further to gather more data.
18-10-08 31m TrappedQueenRisk diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 27847 W: 5962 L: 5958 D: 15927
sprt @ 10+0.1 th 1 75, 25. (In other words: S(75, 0) for no safe mobility, S(50, 0) for one safe square, S(25, 0) for two, and no bonus for three or more.)
18-10-08 31m TrappedQueenRisk^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 14516 W: 3080 L: 3140 D: 8296
sprt @ 10+0.1 th 1 Parameters 45, 15 are on the cusp of passing (-1 < LLR < 0 after 71K games and counting); so far bigger is better. Keep going. 60, 20.
18-10-08 31m TrappedQueenRisk diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 12552 W: 2640 L: 2710 D: 7202
sprt @ 10+0.1 th 1 Narrower than the neutral version. S(20, 0) for no safe mobility, S(10, 0) for one safe square, no bonus if two or more safe squares.
18-10-08 31m TrappedQueenRisk diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 9103 W: 1891 L: 1978 D: 5234
sprt @ 10+0.1 th 1 Broader than the neutral version. S(40, 0) for no safe mobility, S(30, 0) for one safe square, S(20, 0) for two, S(10, 0) for three, no bonus for four or more.
18-10-08 31m TrappedQueenRisk^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 22701 W: 4900 L: 4920 D: 12881
sprt @ 10+0.1 th 1 An initial attempt based on Bryan's comments and suggestions from CCCC Bonus Game 10. Check whether the enemy queen is on our side of the board and unable to immediately trade for our own. If so, count the "safe" squares it attacks/occupies (excluding its own friendly pieces). Define "safe" to be squares that we do not attack, or squares we attack only with a queen and they defend with a second piece. If fewer than three safe squares, give us a bonus. Start with S(30, 0) for no safe mobility and reduce by S(10, 0) for each safe square (modeled on TrappedRook penalty).
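
The bonus schedule described in this entry, and the variants in the TrappedQueenRisk entries around it, can be written as a base value that shrinks by a fixed step per safe square. The sketch below is a Python illustration only: the function name and the (mg, eg) tuple standing in for S(mg, eg) are hypothetical, and counting the safe squares is left out entirely.

```python
def trapped_queen_bonus(safe_squares, base=30, step=10):
    """Middlegame-only bonus for a nearly trapped enemy queen.

    Starts at `base` with no safe squares and drops by `step` per safe
    square, never going below zero. Returns (mg, eg), a stand-in for
    S(mg, eg); the endgame part is zero, as in the description.
    """
    mg = max(0, base - step * safe_squares)
    return (mg, 0)

# Initial attempt: S(30, 0), S(20, 0), S(10, 0), then no bonus.
print([trapped_queen_bonus(n) for n in range(4)])
# The "75, 25" variant: S(75, 0), S(50, 0), S(25, 0), then no bonus.
print([trapped_queen_bonus(n, base=75, step=25) for n in range(4)])
```

With this parameterization the "half effect" test is base 15, step 5, and the "broader" test is base 40, step 10.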
18-10-08 31m TrappedQueenRisk diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 5771 W: 1158 L: 1260 D: 3353
sprt @ 10+0.1 th 1 Half effect. S(15, 0) for no safe mobility, S(10, 0) for one safe square, S(5, 0) for two safe squares, no bonus otherwise.
18-10-06 31m RookBlockedPawns3 diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 13422 W: 2880 L: 2946 D: 7596
sprt @ 10+0.1 th 1 Double effect (S(2, 2)). For example, if we have two rooks and five blocked pawns when the opponent has more bishops, S(20, 20) penalty. (I expect that this is too much but would like to test to be sure.)
18-10-06 31m RookBlockedPawns3^ diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 11121 W: 2315 L: 2392 D: 6414
sprt @ 10+0.1 th 1 Revisiting my older ideas. The key change from RookBlockedPawns is that the earlier patch cancelled out the penalty when we had unblocked pawns, whereas this branch ignores them (thus applying the penalty more broadly). If the opponent has more bishops, then give a S(1, 1) penalty per friendly blocked pawn per friendly rook. This can grow quickly; in key positions in SF 0-1 Houdini (in CCCC Stage 3), this penalty would total S(10, 10).
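
To see how quickly this penalty grows, here is a small Python sketch of the rule as described: a unit penalty per friendly blocked pawn per friendly rook, applied only when the opponent has more bishops. The function name, argument names, and the (mg, eg) tuple for S(mg, eg) are hypothetical, not taken from the patch.

```python
def rook_blocked_pawns_penalty(rooks, blocked_pawns, our_bishops,
                               their_bishops, unit=(1, 1)):
    """Penalty as (mg, eg): `unit` per blocked pawn per rook.

    Applies only if the opponent has more bishops than we do.
    """
    if their_bishops <= our_bishops:
        return (0, 0)
    count = rooks * blocked_pawns
    return (unit[0] * count, unit[1] * count)

# Two rooks and five blocked pawns against an extra enemy bishop.
print(rook_blocked_pawns_penalty(2, 5, our_bishops=0, their_bishops=1))
# The same position with the doubled unit S(2, 2).
print(rook_blocked_pawns_penalty(2, 5, our_bishops=0, their_bishops=1, unit=(2, 2)))
```

The first call gives the S(10, 10) total, and the second gives the S(20, 20) example from the double-effect test.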
18-10-06 31m RookVersusBishop2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 3765 W: 742 L: 854 D: 2169
sprt @ 10+0.1 th 1 My attempt to implement one of @NKONSTANTAKIS's suggestions. Add analogous logic for the rook, also with divisor 10.
18-10-06 31m RookVersusBishop diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 17522 W: 3748 L: 3794 D: 9980
sprt @ 10+0.1 th 1 Narrower version, applied only if we have the bishop pair. Divisor 10.
18-10-06 31m RookVersusBishop diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10918 W: 2263 L: 2341 D: 6314
sprt @ 10+0.1 th 1 A different narrow version, applied only if the enemy has more rooks but we have more bishops. (Previous tests only had the first condition.) Divisor 10.
18-10-05 31m RookVersusBishop^ diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 6631 W: 1384 L: 1483 D: 3764
sprt @ 10+0.1 th 1 Divide by 8, resulting in a 25% larger effect than the 38K yellow (dividing by 10). Although this is a small change, if it also performs well, I'll have greater confidence that the positive yellow result wasn't just a lucky test.
18-10-05 31m RookVersusBishop diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 38677 W: 8264 L: 8208 D: 22205
sprt @ 10+0.1 th 1 A fundamentally different attempt at improving evaluation in recent exchange-up positions at CCCC. I noticed that in these positions the bishops tend to be highly mobile. If the opponent has more rooks, increase the size of the mobility bonus for our bishops by 10%--thus penalizing immobile bishops more harshly but giving a bonus (cancelling out some of the material difference) correlated to increased bishop mobility.
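
The divisor arithmetic used across the RookVersusBishop tests can be checked with a short Python sketch. It assumes the extra term is the mobility bonus divided by the divisor, added to the original bonus. That reading is an assumption, the names are hypothetical, and exact fractions are used here only to compare effect sizes.

```python
from fractions import Fraction

def scaled_bishop_mobility(bonus, our_rooks, their_rooks, divisor=10):
    """Add bonus/divisor to a bishop mobility bonus if the enemy has more rooks.

    A negative bonus (an immobile bishop) becomes more negative.
    """
    if their_rooks > our_rooks:
        return bonus + Fraction(bonus, divisor)
    return Fraction(bonus)

def relative_effect(divisor, reference=10):
    """Size of the extra term for `divisor` relative to the reference divisor."""
    return Fraction(reference, divisor)

print(scaled_bishop_mobility(100, our_rooks=1, their_rooks=2))  # divisor 10: +10%
print(relative_effect(8))  # divisor 8 versus divisor 10
print(relative_effect(5))  # divisor 5 versus divisor 10
```

Under this reading, a divisor of 8 gives an extra term 5/4 the size of divisor 10 (25% larger), and a divisor of 5 doubles it.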
18-10-05 31m RookVersusBishop diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 8973 W: 1907 L: 1994 D: 5072
sprt @ 10+0.1 th 1 The first test looks promising so far (after 23K games, Elo estimate +2.06 [-0.90,4.99]). Try dividing by 5 rather than 10 (thus doubling the effect size).
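
An Elo estimate with an interval, like the one quoted in this entry, can be approximated from the W/L/D counts on a Total line. The Python sketch below uses the logistic Elo formula and a normal approximation for the 95% interval. The framework's own estimate may rest on a different model, so the figures need not match what it reports, and the function names are hypothetical.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference for an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_estimate(wins, losses, draws, z=1.96):
    """Return (elo, low, high) using a normal approximation on the mean score."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - z * stderr),
            elo_from_score(score + z * stderr))

# Counts from this entry's Total line.
elo, low, high = elo_estimate(wins=1907, losses=1994, draws=5072)
print(f"Elo {elo:+.2f} [{low:+.2f}, {high:+.2f}]")
```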
18-10-01 31m RookBlockedPawns diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 33720 W: 7148 L: 7116 D: 19456
sprt @ 10+0.1 th 1 I continue to draw inspiration from @Vizvezdenec's tests. Retry the 4, 2 test but only apply the penalty if the opponent has a bishop pair--this especially shouldn't apply if the enemy doesn't have minors.
18-10-01 31m RookBlockedPawns diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 18037 W: 3835 L: 3878 D: 10324
sprt @ 10+0.1 th 1 The only-if-bishop-pair test seems to be roughly neutral...perhaps the effect size is too small. Double it. 8, 4.
18-10-01 31m RookBlockedPawns2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 12066 W: 2536 L: 2608 D: 6922
sprt @ 10+0.1 th 1 Simply multiply by the number of enemy bishops, thus also increasing the effect size. 4, 2.
18-10-01 31m RookBlockedPawns2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 14714 W: 3174 L: 3233 D: 8307
sprt @ 10+0.1 th 1 Bugfix (the prior version could become a bonus, rather than a penalty, if we had more minors). 2, 1.
18-10-01 31m RookBlockedPawns2^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 7079 W: 1472 L: 1568 D: 4039
sprt @ 10+0.1 th 1 Bugfix (the prior version could become a bonus, rather than a penalty, if we had more minors). 4, 2.
18-10-01 31m RookBlockedPawns2 diff
LLR: -1.97 (-2.94,2.94) [0.00,5.00]
Total: 8720 W: 1864 L: 1909 D: 4947
sprt @ 10+0.1 th 1 Parameters 2, 1. These are the best-performing so far despite the suspiciously small effect size (I suspect luck), but now that the penalty is multiplied by the difference in minors, it could become quite large.
18-10-01 31m RookBlockedPawns2^ diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 5356 W: 1094 L: 1199 D: 3063
sprt @ 10+0.1 th 1 Adapting an idea from the forum. Rather than requiring that the enemy have the bishop pair, more generally multiply the penalty by the number of enemy minors minus the number of friendly minors. Parameters 4, 2.
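
The multiplier in this entry, and the sign problem that the later bugfix entries mention, can be illustrated in a few lines of Python. Clamping at zero is only one possible fix and is an assumption here, since the entries do not say how the bug was corrected. The function name is hypothetical.

```python
def minor_multiplier(enemy_minors, friendly_minors, clamp=True):
    """Multiplier for the penalty: enemy minors minus friendly minors.

    Without clamping, the value goes negative when we have more minors,
    which would turn the penalty into a bonus.
    """
    diff = enemy_minors - friendly_minors
    return max(0, diff) if clamp else diff

print(minor_multiplier(3, 1))               # enemy has two more minors
print(minor_multiplier(1, 3, clamp=False))  # negative: penalty flips sign
print(minor_multiplier(1, 3))               # clamped: no penalty
```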
18-10-01 31m RookBlockedPawns diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 13964 W: 2950 L: 3013 D: 8001
sprt @ 10+0.1 th 1 Try double effect of best version: parameters 4, 2.
18-10-01 31m RookBlockedPawns^^ diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 37518 W: 8066 L: 8015 D: 21437
sprt @ 10+0.1 th 1 I think @Vizvezdenec's formulation is really interesting--here's my attempt at his idea. First try: parameters 2, 1.
18-10-01 31m RookBlockedPawns diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 17446 W: 3700 L: 3746 D: 10000
sprt @ 10+0.1 th 1 Third try: parameters 3, 2.
18-10-01 31m RookBlockedPawns^ diff
LLR: -3.16 (-2.94,2.94) [0.00,5.00]
Total: 6479 W: 1315 L: 1423 D: 3741
sprt @ 10+0.1 th 1 Second try: parameters 3, 1.
18-09-30 31m RookOpportunity^ diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 6030 W: 1225 L: 1326 D: 3479
sprt @ 10+0.1 th 1 Using both MG and EG performed quite poorly. Try EG only.
18-09-30 31m RookOpportunity diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 5304 W: 1075 L: 1180 D: 3049
sprt @ 10+0.1 th 1 MG only.
18-09-30 31m RookOpportunity diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 3597 W: 710 L: 823 D: 2064
sprt @ 10+0.1 th 1 Only penalize if no opportunities at all.