Stockfish Testing Queue

Pending - 0 tests 0.0 hrs

None

Active - 0 tests

Finished - 179 tests

19-05-15 Ala master diff
ELO: 28.93 +-1.7 (95%) LOS: 100.0%
Total: 40000 W: 6896 L: 3573 D: 29531
40000 @ 30+0.3 th 8 Multicore regression/progression test against SF10 after "Update failedHighCnt rule" of May 15th. Low TP.
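The ELO and +- figures on fixed-games entries like the one above can be approximately reproduced from the W/L/D totals. The sketch below assumes the standard logistic Elo model and a normal approximation for the 95% interval; whether the framework uses exactly this formula is an assumption, and the function name is illustrative only.

```python
import math


def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, margin) for a W/L/D record under the logistic Elo model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games

    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + losses * (0.0 - score) ** 2
        + draws * (0.5 - score) ** 2
    ) / games
    half_width = z * math.sqrt(variance / games)

    def to_elo(s):
        # Logistic model: expected score s corresponds to this Elo difference.
        return -400.0 * math.log10(1.0 / s - 1.0)

    elo = to_elo(score)
    # Half of the Elo interval spanned by score +/- half_width.
    margin = (to_elo(score + half_width) - to_elo(score - half_width)) / 2.0
    return elo, margin


elo, margin = elo_with_margin(6896, 3573, 29531)
print(f"ELO: {elo:.2f} +-{margin:.1f}")
```

For the totals W: 6896 L: 3573 D: 29531 this should land close to the listed 28.93 +-1.7.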
19-05-15 Ala master diff
ELO: 19.76 +-1.8 (95%) LOS: 100.0%
Total: 40000 W: 6938 L: 4665 D: 28397
40000 @ 60+0.6 th 1 Regression/progression test against SF10 after "Update failedHighCnt rule" of May 15th.
19-05-06 Ala BadOutpost diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 5647 W: 1205 L: 1342 D: 3100
sprt @ 10+0.1 th 1 Rate a knight outpost depending on how many enemy pieces (non-pawn) are close to it. First attempt with guessed values.
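The (-2.94,2.94) pair shown on sprt entries matches the usual Wald stopping bounds for a sequential probability ratio test with both error rates set to 0.05. The sketch below derives those bounds and classifies an LLR value; the 0.05 error rates are inferred from the displayed bounds rather than stated on this page, and the helper names are hypothetical.

```python
import math


def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's stopping bounds on the log-likelihood ratio (LLR)."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def sprt_state(llr, alpha=0.05, beta=0.05):
    """Classify an LLR value against the stopping bounds."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr <= lower:
        return "rejected"
    if llr >= upper:
        return "accepted"
    return "undecided"


lower, upper = sprt_bounds()
print(f"({lower:.2f},{upper:.2f})")
print(sprt_state(-2.95))  # LLR from the BadOutpost entry
print(sprt_state(-1.69))  # LLR from the GlobalMobility entry
```

Under these assumptions an entry with an LLR between the bounds, such as -1.69, had not reached a statistical decision when it finished.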
19-05-05 Ala TrappedRookPlus diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 17225 W: 3780 L: 3859 D: 9586
sprt @ 10+0.1 th 1 If a rook on rank 1 is completely surrounded by friendly pieces, consider its mobility to be 0.
19-05-03 Ala BishopPawnShelter diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 15188 W: 3377 L: 3466 D: 8345
sprt @ 10+0.1 th 1 Fixed version
19-05-03 Ala BishopPawnShelter diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 10289 W: 2227 L: 2341 D: 5721
sprt @ 10+0.1 th 1 Bench is the same at default depth, but it changes at higher depth.
19-04-29 Ala GlobalMobility diff
LLR: -1.69 (-2.94,2.94) [0.50,4.50]
Total: 421 W: 63 L: 158 D: 200
sprt @ 10+0.1 th 1 The idea behind this test is to not only evaluate each piece's mobility separately, but to look at all the mobilities together to better detect cramped positions. First crude attempt.
19-04-24 Ala master diff
ELO: 16.39 +-1.8 (95%) LOS: 100.0%
Total: 40000 W: 6519 L: 4634 D: 28847
40000 @ 60+0.6 th 1 Regression/progression test against SF10 after "Remove useless initializations" of April 24th.
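The LOS (likelihood of superiority) figure can be sketched from wins and losses alone, since draws do not indicate which side is stronger. This is a minimal sketch using the common normal approximation; that the framework computes LOS this way is an assumption.

```python
import math


def likelihood_of_superiority(wins, losses):
    """Normal-approximation LOS from decisive games only."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games: no evidence either way
    z = (wins - losses) / math.sqrt(2.0 * decisive)
    return 0.5 * (1.0 + math.erf(z))


# W and L from the entry above (D is ignored by this formula).
print(f"LOS: {100 * likelihood_of_superiority(6519, 4634):.1f}%")
```

With a win surplus this large relative to the number of decisive games, the value rounds to 100.0%, as listed.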
19-04-19 Ala 82ad9ce9cfb0eff33f1d781 diff
ELO: 24.33 +-1.7 (95%) LOS: 100.0%
Total: 40000 W: 6430 L: 3633 D: 29937
40000 @ 30+0.3 th 8 As the framework is empty, try out a multicore regression test (same commit as last regular regression test) as suggested in issue #2094. Very low TP.
19-04-07 Ala KDconversionTune diff
9676/10000 iterations
20000/20000 games played
20000 @ 10+0.1 th 1 STC verification for a new set of improved variance parameters.
19-04-07 Ala KDconversionTune diff
1343/10000 iterations
2779/20000 games played
20000 @ 10+0.1 th 1 The conversion formula from king danger to mg and eg scores is somewhat arbitrary. This tune is meant to check the resilience of the kd->score mapping and, if the values deviate significantly, to guide future attempts at new formulas. This is an STC verification for variance parameters, as it is critical that the values move slowly enough to not mess up ordering, but that they still move.
19-04-06 Ala CenterBlockPawns diff
LLR: -1.78 (-2.94,2.94) [0.00,3.50]
Total: 62974 W: 10958 L: 10883 D: 41133
sprt @ 60+0.6 th 1 This was tuned at LTC and ended up neutral at STC. Spec LTC, low TP.
19-04-06 Ala CenterBlockPawns diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 32508 W: 7235 L: 7238 D: 18035
sprt @ 10+0.1 th 1 Test tune results
19-04-02 Ala CenterBlockPawnsTune diff
71115/75000 iterations
148293/150000 games played
150000 @ 60+0.6 th 1 Variance lowered after STC test indicating it should still allow values to move. Very low TP to start, so as not to slow down the current spec LTCs; I may adapt later depending on fishtest load.
19-04-02 Ala CenterBlockPawnsTune diff
6700/7500 iterations
14263/15000 games played
15000 @ 10+0.1 th 1 This short STC tuning run is meant to sanity-check the variance values before running a long LTC tune.
19-04-01 Ala master diff
ELO: 16.58 +-1.8 (95%) LOS: 100.0%
Total: 40000 W: 6649 L: 4742 D: 28609
40000 @ 60+0.6 th 1 Regression/progression test against SF10 after "Assorted trivial cleanups 3/2019" of March 31st.
19-03-14 Ala VizMinorPawnTune diff
48030/50000 iterations
100000/100000 games played
100000 @ 30+0.3 th 1 Restart tune for Viz with higher variance as values moved less than expected
19-03-17 Ala QueenDraw diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 18162 W: 3982 L: 4057 D: 10123
sprt @ 10+0.1 th 1 Detect many additional KQ(P)KRPs draws. (I know bench doesn't change, but it's functionally different)
19-03-15 Ala master diff
ELO: 16.58 +-1.8 (95%) LOS: 100.0%
Total: 40000 W: 6542 L: 4635 D: 28823
40000 @ 60+0.6 th 1 Regression/progression test against SF10 after "Increase thread stack for OS X (#2035)" of March 12th.
19-03-14 Ala VizMinorPawnTune diff
3235/50000 iterations
6851/100000 games played
100000 @ 30+0.3 th 1 Tune minorBehindPawn array depending on minor rank for Viz.
19-03-05 Ala EnemyPawnShield diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 20956 W: 4633 L: 4694 D: 11629
sprt @ 10+0.1 th 1 Take 1
19-03-02 Ala RookDefence diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 16650 W: 3518 L: 3601 D: 9531
sprt @ 10+0.1 th 1 Penalize defending rook which can't move without dropping a piece
19-02-21 Ala MinorSynergyTune diff
23628/25000 iterations
50002/50000 games played
50000 @ 20+0.2 th 1 Score adjustment for Qv3M depending on passed pawns and how well the minor side covers its pieces. Short tune at high variance to quickly get a rough approximation.
19-02-20 Ala TrappedRookExtended diff
LLR: -2.32 (-2.94,2.94) [0.50,4.50]
Total: 30908 W: 6880 L: 6856 D: 17172
sprt @ 10+0.1 th 1 Separating piece evals appears to cause a huge hit to performance. Interestingly, the additional trapped rook malus failed, but by significantly less. Testing against the non-functional version to see if it really is better and deserves work on a more efficient implementation. Low TP.
19-02-20 Ala TiedUpQueen diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 9040 W: 1923 L: 2043 D: 5074
sprt @ 10+0.1 th 1 Additional bonus when restricted enemy piece is the queen
19-02-20 Ala 4f94ff9cfbe5074bce82fdd diff
LLR: -2.94 (-2.94,2.94) [-3.00,1.00]
Total: 4089 W: 849 L: 1023 D: 2217
sprt @ 10+0.1 th 1 Measure impact of separating piece attack/mobility setup and piece eval. No functional change, low TP.
19-02-20 Ala KingMob diff
LLR: -0.60 (-2.94,2.94) [0.50,4.50]
Total: 63 W: 4 L: 38 D: 21
sprt @ 10+0.1 th 1 Take 1, eg kingdanger malus for king which has very low mobility.
19-02-19 Ala TrappedRookExtended diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 5399 W: 1099 L: 1236 D: 3064
sprt @ 10+0.1 th 1 Take 1 : bigger trapped rook malus if king has no mobility
19-02-18 Ala AttackedByX diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 1934 W: 371 L: 528 D: 1035
sprt @ 10+0.1 th 1 Test tune results
19-02-17 Ala AttackedByXTune diff
95715/100000 iterations
199271/200000 games played
200000 @ 20+0.2 th 1 Tune for ThreatBy depending on attacker count (found and fixed cause of bench issues).
19-02-17 Ala AttackedByXTune diff
136/100000 iterations
372/200000 games played
200000 @ 20+0.2 th 1 Make ThreatBy more consistent in regard to accumulation of bonus when several attacking pieces can trigger a bonus. Tune at 20+0.2 only to allow a higher number of games as there are many values to tune. (Fixed initialization)
19-02-17 Ala AttackedByXTune diff
5636/100000 iterations
11747/200000 games played
200000 @ 20+0.2 th 1 Make ThreatBy more consistent in regard to accumulation of bonus when several attacking pieces can trigger a bonus. Tune at 20+0.2 only to allow a higher number of games as there are many values to tune.
19-02-15 Ala AttackedByXTune diff
202/60000 iterations
541/120000 games played
120000 @ 30+0.3 th 1 Continue tune with fixed fake check issue and values from stopped tune. Test separating knight and bishop threat to see if there is any noticeable divergence.
19-02-14 Ala AttackedByXTune diff
14758/75000 iterations
30807/150000 games played
150000 @ 30+0.3 th 1 Queen x-ray and attackedBy3 attempts have been unsuccessful. Try to combine them and to improve the most affected element : "threat by" arrays. (Queen x-rays are still ignored for mobility)
19-02-12 Ala master diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 44941 W: 9013 L: 9258 D: 26670
sprt @ 10+0.1 th 1 Framework is empty, so I'm scheduling this non-regression test for C=32 vs C=0. Low TP.
19-02-11 Ala SafePawnThreatTune diff
48230/50000 iterations
100000/100000 games played
100000 @ 30+0.03 th 1 Tune v2 for Viz
19-02-11 Ala SafePawnThreatTune diff
48172/50000 iterations
100000/100000 games played
100000 @ 30+0.3 th 1 Tune for Viz : different threat by safe pawn depending on targeted piece
19-02-09 Ala AssymKingPSQT diff
LLR: -1.50 (-2.94,2.94) [0.00,3.50]
Total: 95656 W: 16166 L: 15986 D: 63504
sprt @ 60+0.6 th 1 Spec LTC, as it may be better on longer TC. Low TP.
19-02-09 Ala AssymKingPSQT diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 34715 W: 7616 L: 7609 D: 19490
sprt @ 10+0.1 th 1 Test tune results
19-02-09 Ala PawnPushTweakBoth diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 16736 W: 3652 L: 3734 D: 9350
sprt @ 10+0.1 th 1 Test tune results for an attempt at more accurate pawn push threats
19-02-07 Ala PawnPushTweakTune diff
47831/50000 iterations
99214/100000 games played
100000 @ 20+0.2 th 1 Quick tune for differentiated parameters for threats by pawn push
19-02-06 Ala 8x8KingPSQTune2 diff
28739/30000 iterations
59714/60000 games played
60000 @ 60+0.6 th 1 STC with raw unsmoothed values did +1.5 elo. Finish tune on rebased master with lower variance to help convergence and some pawn value co-tuning.
19-02-07 Ala PawnPushTweak diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 18211 W: 4023 L: 4097 D: 10091
sprt @ 10+0.1 th 1 Take 3, try to avoid "blocking" a pawn push threat by doing a reckless counter pawn push. (fixed bench)
19-02-06 Ala 8x8KingPSQT diff
LLR: -1.68 (-2.94,2.94) [0.50,4.50]
Total: 79475 W: 17620 L: 17321 D: 44534
sprt @ 10+0.1 th 1 Test 106K 8x8 king PSQT tune results (non-smoothed)
19-02-06 Ala PawnPushTweak diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 23652 W: 5218 L: 5265 D: 13169
sprt @ 10+0.1 th 1 Take 2 : fix conditions so that double square pawn push doesn't require intermediate square to be safe, only not attacked by a pawn.
19-02-06 Ala PawnPushTweak diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 13431 W: 2882 L: 2980 D: 7569
sprt @ 10+0.1 th 1 Try to better assess pawn push threats to better evaluate positions like those in the lost game 11 of TCEC SuFi
19-02-05 Ala KingSafetyParams diff
LLR: -2.96 (-2.94,2.94) [0.00,3.50]
Total: 88935 W: 14711 L: 14640 D: 59584
sprt @ 60+0.6 th 1 LTC for king safety params tune (yellow STC was with elo gainer bounds, would have passed with param tweak bounds)
19-02-06 Ala DynamContemptTweak diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 10501 W: 2291 L: 2404 D: 5806
sprt @ 10+0.1 th 1 Try to make SF accept smaller compensation to simplify if position is bad (< -0.4cp against itself) to limit "contempt blunders".
19-02-05 Ala KingSafetyParams diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 88849 W: 19567 L: 19294 D: 49988
sprt @ 10+0.1 th 1 Test tuning results
19-02-04 Ala KingSafetyTune2 diff
24023/25000 iterations
49975/50000 games played
50000 @ 60+0.6 th 1 Finish king safety tuning with tweaked variances.