Stockfish Testing Queue

Pending - 0 tests 0.0 hrs

None

Active - 0 tests

Finished - 482 tests

09-12-17 II contem diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 109961 W: 14232 L: 14095 D: 81634
sprt @ 60+0.6 th 1 LTC: Take 2: Larger change
12-12-17 II contem diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 39958 W: 7273 L: 7293 D: 25392
sprt @ 10+0.1 th 1 Last try on this idea: (Tempo, Contempt) = (24, 2).
12-12-17 II evalcon diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 6651 W: 1307 L: 1405 D: 3939
sprt @ 10+0.1 th 1 Evaluation based contempt - take 2.
11-12-17 II evalcon diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 5854 W: 1084 L: 1185 D: 3585
sprt @ 10+0.1 th 1 Evaluation based contempt - take 1.
09-12-17 II tempo diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 84768 W: 15355 L: 15223 D: 54190
sprt @ 10+0.1 th 1 Is Stockfish ready for higher tempo? (an alternative / check for running LTC test)
10-12-17 II stempo diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 18955 W: 2367 L: 2424 D: 14164
sprt @ 60+0.6 th 1 LTC: Define Tempo as Score.
10-12-17 II stempo diff
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 9323 W: 1749 L: 1585 D: 5989
sprt @ 10+0.1 th 1 Define Tempo as Score.
09-12-17 II contem diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 25041 W: 4710 L: 4467 D: 15864
sprt @ 10+0.1 th 1 Take 2: Larger change
09-12-17 II contem diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 23666 W: 4256 L: 4332 D: 15078
sprt @ 10+0.1 th 1 Small increase for contempt and tempo, based on tuning graphs.
08-12-17 II tune_con diff
28747/30000 iterations
59121/60000 games played
60000 @ 10+0.1 th 1 Try to tune Contempt, White Contempt and Tempo (corrected ck values - they were too low for contempt and wcontempt because of the scaling with PawnValueMg)
08-12-17 II wcontempt diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 24000 W: 4304 L: 4327 D: 15369
sprt @ 10+0.1 th 1 Test White Contempt.
08-12-17 II tune_con diff
389/30000 iterations
801/60000 games played
60000 @ 10+0.1 th 1 Try to tune Contempt, White Contempt and Tempo.
02-12-17 II time2 diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 26827 W: 3408 L: 3486 D: 19933
sprt @ 60+0.6 th 1 LTC: Time management - half tuned, half guess
01-12-17 II time2 diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 97450 W: 17882 L: 17393 D: 62175
sprt @ 10+0.1 th 1 Time management - half tuned, half guess
01-12-17 II pcv2 diff
LLR: -2.94 (-2.94,2.94) [0.00,4.00]
Total: 26693 W: 4805 L: 4870 D: 17018
sprt @ 10+0.1 th 1 Piece values - half tuned, half guess.
24-11-17 II time diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 170461 W: 31092 L: 30666 D: 108703
sprt @ 10+0.1 th 1 Time management - tuned values.
17-11-17 II tune_time diff
47611/50000 iterations
97259/100000 games played
100000 @ 10+0.1 th 1 One more try to tune time management, half throughput.
22-11-17 II eval diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 12197 W: 2150 L: 2223 D: 7824
sprt @ 10+0.1 th 1 Try also time randomisation, to see a difference.
22-11-17 II eval diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 12939 W: 2307 L: 2377 D: 8255
sprt @ 10+0.1 th 1 Try the Zobrist key randomisation
17-11-17 II pcv diff
LLR: -2.94 (-2.94,2.94) [0.00,4.00]
Total: 23504 W: 4207 L: 4283 D: 15014
sprt @ 10+0.1 th 1 Try tuned values.
15-11-17 II tune_pcv diff
46515/50000 iterations
98356/100000 games played
100000 @ 10+0.1 th 1 Last try to tune piece values with SPSA, half throughput.
12-11-17 II hSMP diff
ELO: -2.65 +-9.6 (95%) LOS: 29.4%
Total: 1313 W: 165 L: 175 D: 973
2000 @ 5+0.05 th 60 Just to see whether this is promising now that CoffeeOne is in the framework - I'll lower the priority of the 30-threaded test.
11-11-17 II tune_pcv diff
48659/50000 iterations
100000/100000 games played
100000 @ 10+0.1 th 1 Tune piece values, hopefully with correct code this time.
11-11-17 II hSMP diff
ELO: 0.42 +-3.7 (95%) LOS: 58.7%
Total: 9030 W: 1243 L: 1232 D: 6555
20000 @ 5+0.05 th 30 As this is identical to master up to 21 threads (see forum), I suspect this should be tested with a much higher number of threads. So, a fixed number test with 30 threads for now.
05-11-17 II tune_pcv diff
10146/50000 iterations
20682/100000 games played
100000 @ 10+0.1 th 1 The priority of the first tuning was set to -1. Trying to catch good values for this tricky tuning, where it seems that the parameters are very Elo sensitive and close to optimum, and BishopValueMg also behaves very asymmetrically.
05-11-17 II bishop_mg diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 5673 W: 1009 L: 1147 D: 3517
sprt @ 10+0.1 th 1 Test BishopValueMg from the current tuning - it is very suspicious.
05-11-17 II tune_pcv diff
13411/50000 iterations
27561/100000 games played
100000 @ 10+0.1 th 1 Unfortunately, I had too narrow SPSA bounds in my local tunings and the tests failed miserably. On the bright side, I found that these values are very Elo sensitive, and I would like to take another try here.
02-11-17 II k_simple diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 75647 W: 9646 L: 9600 D: 56401
sprt @ 60+0.6 th 1 LTC for #1279
01-11-17 II sd_bugfix diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 19041 W: 4057 L: 3823 D: 11161
sprt @ 10+0 th 1 Try to see whether there is some Elo gain in this patch (sorry for a mistake in the first run)
01-11-17 II mobility diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 8790 W: 1496 L: 1622 D: 5672
sprt @ 10+0.1 th 1 Mobility - tuned values.
01-11-17 II sd_bugfix diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 290 W: 13 L: 174 D: 103
sprt @ 10+0 th 1 Try to see whether there is some Elo gain in this patch (#1279)
27-10-17 II tune_eval diff
47551/50000 iterations
100000/100000 games played
100000 @ 10+0.1 th 1 Continue tuning mobility as explained in the forum.
29-10-17 II sd_bugfix diff
ELO: 0.44 +-2.9 (95%) LOS: 61.6%
Total: 22000 W: 4532 L: 4504 D: 12964
20000 @ 10+0 th 1 Test #1279 for time losses (Move Overhead = 20, no adjudication rules!)
29-10-17 II master' diff
ELO: 0.52 +-3.1 (95%) LOS: 62.9%
Total: 20000 W: 4143 L: 4113 D: 11744
20000 @ 10+0 th 1 Test master for time losses (Move Overhead = 20, no adjudication rules!)
28-10-17 II sd_bugfix diff
ELO: -1.15 +-3.1 (95%) LOS: 23.4%
Total: 20000 W: 4119 L: 4185 D: 11696
20000 @ 10+0 th 1 Measure time losses for #1279 (without adjudication rules)
28-10-17 II master' diff
ELO: -0.66 +-3.1 (95%) LOS: 33.8%
Total: 20000 W: 4100 L: 4138 D: 11762
20000 @ 10+0 th 1 Measure time losses for master (without adjudication rules)
28-10-17 II psqt diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 12202 W: 2215 L: 2330 D: 7657
sprt @ 10+0.1 th 1 It seems that I have some problem with tuning piece values; try again only psqt values.
28-10-17 II psqt diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 650 W: 101 L: 273 D: 276
sprt @ 10+0.1 th 1 Locally tuned psqt and piece values.
23-10-17 II mobility diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 69326 W: 12289 L: 12213 D: 44824
sprt @ 10+0.1 th 1 Tuned values.
20-10-17 II tune_eval diff
47798/50000 iterations
100000/100000 games played
100000 @ 10+0.1 th 1 Try some tunings based on experience with the simulator - all steps will be explained in the forum.
22-10-17 II sd diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 18561 W: 3402 L: 3277 D: 11882
sprt @ 10+0.1 th 1 STC: A simpler solution for long sudden death games (corrected).
22-10-17 II sd diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 15221 W: 2804 L: 2864 D: 9553
sprt @ 16+0 th 1 A more complicated version, but now testing for an Elo gain.
22-10-17 II sd diff
LLR: 3.30 (-2.94,2.94) [-3.00,1.00]
Total: 22876 W: 4278 L: 4142 D: 14456
sprt @ 16+0 th 1 A simpler solution for long sudden death games (corrected).
22-10-17 II sd diff
Pending...
sprt @ 16+0 th 1 A simpler solution for long sudden death games.
14-10-17 II sudden_death diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 312860 W: 57315 L: 58014 D: 197531
sprt @ 16+0 th 1 Try to reduce time losses in the sudden death case.
17-10-17 II psqt diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 12039 W: 2145 L: 2261 D: 7633
sprt @ 10+0.1 th 1 Take 2 - without changing piece values.
17-10-17 II psqt diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 237 W: 10 L: 196 D: 31
sprt @ 10+0.1 th 1 Try to tune psqt tables once more - these are locally tuned values BEFORE doing simulations.
15-10-17 II sudden_death diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 65912 W: 11739 L: 11694 D: 42479
sprt @ 10+0.1 th 1 (standard STC now) Last try: from the previous tests it is obvious that aggressive time usage is unfortunately important for Elo performance, and as this also has a slight impact on the increment case (should be tested separately), this is my final recommendation for a possible reduction of time losses in the sudden death case.
15-10-17 II sudden_death diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 8636 W: 1646 L: 1504 D: 5486
sprt @ 16+0 th 1 Last try: from the previous tests it is obvious that aggressive time usage is unfortunately important for Elo performance, and as this also has a slight impact on the increment case (should be tested separately), this is my final recommendation for a possible reduction of time losses in the sudden death case.
14-10-17 II mtg' diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 113963 W: 20008 L: 20042 D: 73913
sprt @ 60/15 th 1 Try to reduce the possibility of time losses, take 2 - cleaner solution.
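
For readers who want to re-derive the ELO, LOS and LLR figures in the entries above from the W/L/D totals, here is a minimal sketch. The function names are made up for illustration. The LLR part assumes a trinomial BayesElo model with the draw parameter estimated from the sample and the bracketed bounds (for example [0.00,5.00]) read as BayesElo values. That is an assumption about how this page computed its numbers, not something the page itself states.

```python
import math


def elo_diff(wins, losses, draws):
    """Logistic Elo difference implied by the observed score fraction."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)


def los(wins, losses):
    """Likelihood of superiority, normal approximation (draws cancel out)."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


def bayeselo_to_proba(elo, drawelo):
    """Win/draw/loss probabilities under the BayesElo model (assumed model)."""
    p_win = 1.0 / (1.0 + 10.0 ** ((-elo + drawelo) / 400.0))
    p_loss = 1.0 / (1.0 + 10.0 ** ((elo + drawelo) / 400.0))
    return p_win, 1.0 - p_win - p_loss, p_loss


def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0), BayesElo units.

    Requires wins, losses and draws to all be positive.
    """
    games = wins + losses + draws
    p_win, p_loss = wins / games, losses / games
    # Estimate the draw parameter from the sample itself.
    drawelo = 200.0 * math.log10((1 - p_loss) / p_loss * (1 - p_win) / p_win)
    w0, d0, l0 = bayeselo_to_proba(elo0, drawelo)
    w1, d1, l1 = bayeselo_to_proba(elo1, drawelo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))


if __name__ == "__main__":
    # hSMP, 2000 @ 5+0.05 th 60: W 165, L 175, D 973
    print(round(elo_diff(165, 175, 973), 2), round(100 * los(165, 175), 1))
    # stempo, sprt @ 10+0.1, bounds [0.00,5.00]: W 1749, L 1585, D 5989
    print(round(sprt_llr(1749, 1585, 5989, 0.0, 5.0), 2))
```

Under these assumptions the two sample calls come out close to the values listed for those entries (ELO -2.65, LOS 29.4%, LLR 2.95). An SPRT run stops once the LLR leaves the (-2.94,2.94) interval shown on each LLR line.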