Stockfish Testing Queue

Finished - 35727 tests

15-02-17 vin wedges diff
LLR: -2.96 (-2.94,2.94) [0.00,6.00]
Total: 41041 W: 6830 L: 6737 D: 27474
sprt @ 60+0.05 th 1 STC test of alternate version also passed, so try at LTC. Hopefully at least one of them will emerge as a better candidate.
15-02-17 jki smp5 diff
ELO: -6.01 +-5.0 (95%) LOS: 0.9%
Total: 6246 W: 1002 L: 1110 D: 4134
10000 @ 15+0.05 th 16 MAX_SLAVES_PER_SPLITPOINT = 5
15-02-17 jki smp3 diff
ELO: -15.72 +-6.0 (95%) LOS: 0.0%
Total: 4336 W: 635 L: 831 D: 2870
10000 @ 15+0.05 th 16 MAX_SLAVES_PER_SPLITPOINT = 3
15-02-18 sni connected_pawns2 diff
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 30398 W: 5315 L: 5063 D: 20020
sprt @ 60+0.05 th 1 LTC: Try to create mobile phalanxes
15-02-18 sg pawn_attack_threat5 diff
ELO: -0.31 +-2.5 (95%) LOS: 40.2%
Total: 30000 W: 5957 L: 5984 D: 18059
30000 @ 15+0.05 th 1 Measure Elo for the tuned parameters at STC first. I expect little or no gain, because the parameters were tuned at LTC and recent tests show a strong TC dependency.
15-02-18 Roc PawnDefensePush diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8658 W: 1706 L: 1789 D: 5163
sprt @ 15+0.05 th 1 Larger score S(15,15). Fixed base signature in the test submission.
15-02-18 vin wedges2 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 15564 W: 3031 L: 3095 D: 9438
sprt @ 15+0.05 th 1 Rewrite to use a BB approach which allows finer control. Also try a rank-based bonus. If this STC is a regression from previous try, then we'll change the scores to match the previous try and use that as the base for tuning.
15-02-18 jki smpinf diff
ELO: -44.95 +-12.6 (95%) LOS: 0.0%
Total: 956 W: 99 L: 222 D: 635
10000 @ 15+0.05 th 16 MAX_SLAVES_PER_SPLITPOINT = 100
15-02-18 mco smp diff
ELO: -0.52 +-2.8 (95%) LOS: 36.0%
Total: 20000 W: 3489 L: 3519 D: 12992
20000 @ 15+0.05 th 4 Crash test for smp simplification patch
15-02-18 vin wedges_spsa diff
19762/20000 iterations
40000/40000 games played
40000 @ 15+0.05 th 1 Take the wedge scores from the middle run as these performed best, and use these as the start for SPSA run. Implementation is different but static eval (and so bench signature) is the same.
15-02-18 Roc PawnDefensePush diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 15086 W: 2991 L: 3056 D: 9039
sprt @ 15+0.05 th 1 S(5, 5). Fixed Git last commit
15-02-18 sni piece_support2 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 14413 W: 2883 L: 2950 D: 8580
sprt @ 15+0.05 th 1 Implement Alain Savard's idea of a bonus for pawn pushes supporting one of our pieces (using a mask to restrict the area in the enemy camp)
15-02-19 lbr maxslaves diff
ELO: -180.32 +-46.6 (95%) LOS: 0.0%
Total: 130 W: 7 L: 69 D: 54
20000 @ 15+0.05 th 2 MaxSlavesPerSplitPoint = Threads / 2 (suggested by Vincent)
15-02-19 lbr smp3 diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 6769 W: 1109 L: 1179 D: 4481
sprt @ 15+0.05 th 4 MAX_SLAVES_PER_SPLITPOINT = 1+log2(Threads) for Threads=4. No change on 2 or 8 threads. Within error bar on 16. See if we get an elo gain on 4.
15-02-19 sg pawn_attack_threat5 diff
LLR: -2.97 (-2.94,2.94) [0.00,6.00]
Total: 29837 W: 4936 L: 4897 D: 20004
sprt @ 60+0.05 th 1 As expected, the patch seems neutral at STC. Now test at LTC, where the parameters were tuned.
15-02-19 sni king_on_pieces diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10386 W: 2037 L: 2115 D: 6234
sprt @ 15+0.05 th 1 Tweak KingOnOne and KingOnMany values
15-02-19 vin wedges diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 27744 W: 5455 L: 5486 D: 16803
sprt @ 15+0.05 th 1 STC test of SPSA tuned values. Since the values for the two ranks turned out to be virtually identical, go back to the simpler code format (which also yellowed at LTC and is therefore the most promising)
15-02-19 mco late_join diff
ELO: -0.76 +-2.8 (95%) LOS: 29.9%
Total: 20000 W: 3477 L: 3521 D: 13002
20000 @ 15+0.05 th 4 Use only 'level' as late join metric: quick test to get a rough idea if this could work.
15-02-19 jos passed_defdef diff
LLR: 2.94 (-2.94,2.94) [-1.50,4.50]
Total: 13771 W: 2758 L: 2615 D: 8398
sprt @ 15+0.05 th 1 Bonus for a passer which is supported by a pawn, which again is also defended by a pawn.
15-02-19 sg asp_window diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 13774 W: 2671 L: 2781 D: 8322
sprt @ 15+0.05 th 1 Increase the aspiration window on re-search by a constant (=4). So this is more like a tuning.
15-02-19 zar tune_double diff
LLR: -2.94 (-2.94,2.94) [0.00,4.00]
Total: 114045 W: 22830 L: 22571 D: 68644
sprt @ 15+0.05 th 1 Add penalty for doubled pawns in A and H files
15-02-19 SC ortho_threats diff
ELO: -23.35 +-4.5 (95%) LOS: 0.0%
Total: 10000 W: 1822 L: 2493 D: 5685
10000 @ 15+0.05 th 1 Orthogonality experiment, take 1. Replace threats evaluation by linear combination of king, passed_pawns and mobility. Coefficients obtained from bench. A quick check to see how much Elo is lost.
15-02-19 Roc CenterWedge diff
LLR: -4.10 (-2.94,2.94) [-1.50,4.50]
Total: 61556 W: 12085 L: 12065 D: 37406
sprt @ 15+0.05 th 1 Doubly supported wedge (or pawn in the center bind). A bonus or a liability? Try a S(5,5) bonus.
15-02-19 mco official diff
ELO: -1.78 +-4.0 (95%) LOS: 19.1%
Total: 9575 W: 1547 L: 1596 D: 6432
10000 @ 15+0.05 th 16 Regression test at 16 threads for the full smp series. It should be a non-functional change, but because it is SMP stuff it is better to be safe.
15-02-19 mco late_join diff
ELO: 1.36 +-3.9 (95%) LOS: 75.0%
Total: 10000 W: 1690 L: 1651 D: 6659
10000 @ 15+0.05 th 16 Use only 'level' as late join metric: quick test to get a rough idea if this could work. RESCHEDULE with 16 threads (with 4 threads it seems ok).
15-02-20 vin wedges diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 9741 W: 1865 L: 1945 D: 5931
sprt @ 15+0.05 th 1 Ok, final try for this approach. Suitably enlightened about the limitations of SPSA. Since slightly increasing the bonus was bad, try slightly reducing it.
15-02-20 jos passed_defdef diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 5110 W: 809 L: 887 D: 3414
sprt @ 60+0.05 th 1 LTC: Bonus for a passer which is supported by a pawn, which again is also defended by a pawn.
15-02-20 sni mobility diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 18318 W: 3674 L: 3730 D: 10914
sprt @ 15+0.05 th 1 Bigger penalty for pieces with very bad mobility
15-02-20 Roc CenterWedge diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 60597 W: 12047 L: 11988 D: 36562
sprt @ 15+0.05 th 1 1st one was S(5,0). Try with S(10, 0)
15-02-20 jos passed_defdef diff
19222/20000 iterations
38760/40000 games played
40000 @ 60+0.05 th 1 Idea looks promising at STC. Try to tune at LTC.
15-02-20 lan end_double_penalty diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 17917 W: 3492 L: 3550 D: 10875
sprt @ 15+0.05 th 1 Increase penalty for doubled pawns only on H file. An idea of Lyudmil Tsvetkov.
15-02-20 sni probabilistic diff
LLR: -1.04 (-2.94,2.94) [-1.50,4.50]
Total: 9075 W: 1541 L: 1555 D: 5979
sprt @ 15+0.05 th 4 Experimental run : use the probability of a cut to amend the YBWC strategy
15-02-21 ren remove_is_ok diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 55714 W: 11020 L: 10974 D: 33720
sprt @ 15+0.05 th 1 Remove the calls to the method is_ok(Move). Seems to increase elo in local tests.
15-02-21 sg queen_contact_check diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 19077 W: 3747 L: 3801 D: 11529
sprt @ 15+0.05 th 1 If no queen contact check exists, give a bonus to queen moves which threaten such a check.
15-02-21 jos passed_defdef diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 17331 W: 3514 L: 3360 D: 10457
sprt @ 15+0.05 th 1 Take 2 with tuned values.
15-02-21 mco late_join_full diff
ELO: 3.54 +-4.4 (95%) LOS: 94.4%
Total: 8153 W: 1411 L: 1328 D: 5414
10000 @ 15+0.05 th 16 Don't stop at the first failed join attempt: on the bench this patch reduced missed join attempts after locking from 75% to about 10%.
15-02-21 sni probabilistic2 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 13639 W: 2243 L: 2313 D: 9083
sprt @ 15+0.05 th 7 Experimental run 2 : use the probability of a cut to amend the YBWC strategy. This version seemed reasonable in local testing with 3 threads after 400 games (score of stockfish vs base: 83 - 48 - 269 [0.544] 400).
15-02-21 jos passed_defdef diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 15378 W: 2530 L: 2559 D: 10289
sprt @ 60+0.05 th 1 LTC: Take 2 with tuned values.
15-02-21 sni probcut_tweak2 diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 45966 W: 9107 L: 8876 D: 27983
sprt @ 15+0.05 th 1 Probcut tweak
15-02-21 ren remove_is_ok diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 12817 W: 2559 L: 2423 D: 7835
sprt @ 15+0.05 th 1 Remove the calls to the method is_ok(Move). Seems to increase elo in local tests.
15-02-21 Roc CenterWedge diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 50774 W: 10198 L: 9952 D: 30624
sprt @ 15+0.05 th 1 S(15, 0)
15-02-21 Roc WedgeNoBind diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6358 W: 1296 L: 1173 D: 3889
sprt @ 15+0.05 th 1 Retake on Vincent's idea about Wedges. Removed overlap with CenterBind cases.
15-02-21 sni probcut_tweak2 diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 13998 W: 2276 L: 2312 D: 9410
sprt @ 60+0.05 th 1 LTC: Probcut tweak
15-02-22 Roc WedgeNoBind diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 17424 W: 2896 L: 2915 D: 11613
sprt @ 60+0.05 th 1 Retake on Vincent's idea about Wedges. Removed overlap with CenterBind cases.
15-02-22 ren remove_is_ok diff
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 23418 W: 3916 L: 3800 D: 15702
sprt @ 60+0.05 th 1 Remove the calls to the method is_ok(Move). Seems to increase elo in local tests. Passed STC as simplification. LTC as simplification.
15-02-22 Roc SemiOpenRook diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 12362 W: 2385 L: 2458 D: 7519
sprt @ 15+0.05 th 1 SemiOpenFile bonus also when Rook is in front of our pawn.
15-02-22 Roc CenterWedge diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4879 W: 905 L: 998 D: 2976
sprt @ 15+0.05 th 1 S(20,0) (I lowered the priority of the S(15,0) test)
15-02-22 vin pawn_tweak_5th diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 21204 W: 4130 L: 4213 D: 12861
sprt @ 15+0.05 th 1 We've had a bunch of near misses all of which reward pawns on the 5th rank. Perhaps we simply need to tweak the psq table. Parameter tweak patch.
15-02-22 vin true_wedge diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 52007 W: 10319 L: 10283 D: 31405
sprt @ 15+0.05 th 1 Try detecting only 'true' wedges involving two or more pawns. Annoyingly none of the bench positions yield these so the signature is the same; which might also mean these are too rare to matter. But let's see.
15-02-22 mco measure_level diff
ELO: -12.86 +-13.3 (95%) LOS: 2.9%
Total: 1000 W: 172 L: 209 D: 619
10000 @ 15+0.05 th 16 Measure 'level' metric alone
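
The ELO, error bar, LOS and LLR-bound figures in the entries above can be roughly reproduced from the W/L/D totals. The sketch below is not taken from the testing framework's own code. It assumes a logistic Elo model, a normal approximation for the 95% interval, and SPRT error rates alpha = beta = 0.05, which happen to give the (-2.94,2.94) bounds shown in the listing. The function names are hypothetical.

```python
import math


def elo_from_score(score):
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def summarize(wins, losses, draws, z=1.96):
    """Return (elo, half_width_95, los) for a W/L/D total.

    Assumption: per-game results are treated as independent draws from a
    trinomial distribution, and the interval uses a normal approximation.
    """
    n = wins + losses + draws
    w, l, d = wins / n, losses / n, draws / n
    mu = w + d / 2.0  # mean score per game
    # Per-game variance of the score around its mean.
    var = w * (1.0 - mu) ** 2 + d * (0.5 - mu) ** 2 + l * (0.0 - mu) ** 2
    se = math.sqrt(var / n)  # standard error of the mean score
    elo = elo_from_score(mu)
    # Half-width of the interval, measured in Elo.
    half = (elo_from_score(mu + z * se) - elo_from_score(mu - z * se)) / 2.0
    # Likelihood of superiority: draws carry no information about the sign.
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    return elo, half, los


def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower and upper LLR stopping bounds of Wald's SPRT (assumed error rates)."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)


if __name__ == "__main__":
    elo, half, los = summarize(172, 209, 619)  # totals of the last entry above
    print(f"ELO: {elo:.2f} +-{half:.1f} (95%) LOS: {los * 100:.1f}%")
    lower, upper = sprt_bounds()
    print(f"LLR bounds: ({lower:.2f},{upper:.2f})")
```

Applied to the final entry (W: 172 L: 209 D: 619), this gives about -12.86 +-13.3 with an LOS of about 2.9%, in line with the listed values. The LLR value itself is not computed here, since it depends on the Elo model and the [elo0,elo1] bounds chosen for each test.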