Stockfish Testing Queue

Pending - 0 tests 0.0 hrs

None

Active - 0 tests

Finished - 949 tests

18-08-17 jo adaptive_pruning diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 8714 W: 1546 L: 1634 D: 5534
sprt @ 10+0.1 th 1 Take 2, based on material. (Bugfix, thanks Stefan!)
18-08-17 jo adaptive_pruning diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 1601 W: 252 L: 372 D: 977
sprt @ 10+0.1 th 1 Take 2, based on material.
17-08-17 jo adaptive_pruning diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 1491 W: 225 L: 345 D: 921
sprt @ 10+0.1 th 1 First draft of adaptive pruning based on number of legal moves.
07-08-17 jo always_save_draws diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 52323 W: 9389 L: 9293 D: 33641
sprt @ 10+0.1 th 1 Always save draw scores into TT. (Not committable since it doesn't handle contempt setting, but let's see how it's doing.)
30-07-17 jo qr_battery diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10767 W: 1854 L: 1934 D: 6979
sprt @ 10+0.1 th 1 Take 2.
29-07-17 jo qr_battery diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 22581 W: 4003 L: 4033 D: 14545
sprt @ 10+0.1 th 1 Queen rook battery. Take 1.
24-07-17 jo rook_psqt diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 10948 W: 1910 L: 2029 D: 7009
sprt @ 10+0.1 th 1 Rook psqt endgame tweak.
21-07-17 jo noEvalInTT^ diff
ELO: -17.13 +-6.5 (95%) LOS: 0.0%
Total: 4000 W: 636 L: 833 D: 2531
10000 @ 10+0.1 th 1 Just a quick measure of how much we lose by not hashing the static eval. (2 entries per cluster.)
19-07-17 jo rewrite_staticEval diff
LLR: -2.93 (-2.94,2.94) [0.00,5.00]
Total: 5289 W: 935 L: 1037 D: 3317
sprt @ 10+0.1 th 1 Rewrite static evaluation part in search() and qsearch(). Because it's a functional change, test with SPRT(0, 5).
19-07-17 jo imb4 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 6099 W: 1026 L: 1125 D: 3948
sprt @ 10+0.1 th 1 Try a pawn-count based bonus for the bishop pair.
17-07-17 jo tt_tweak diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 9702 W: 1695 L: 1779 D: 6228
sprt @ 10+0.1 th 1 Another TT tweak.
15-07-17 jo bestmax diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 6813 W: 994 L: 1090 D: 4729
sprt @ 5+0.05 th 7 One more try.
14-07-17 jo bestmax diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 3198 W: 427 L: 536 D: 2235
sprt @ 5+0.05 th 7 Pick the thread with higher selDepth now that we have more accurate info for each best move and each thread.
10-07-17 jo imb8 diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 83815 W: 10648 L: 10581 D: 62586
sprt @ 60+0.6 th 1 LTC: Try the opposite, halve imbalance resolution. This would noticeably decrease parameter space and increase tuning sensitivity.
12-07-17 jo imb1 diff
LLR: -2.94 (-2.94,2.94) [0.00,4.00]
Total: 25671 W: 4546 L: 4615 D: 16510
sprt @ 10+0.1 th 1 Do we really need the increased resolution for evaluating material imbalances? Final test of this series.
10-07-17 jo imb4 diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 61099 W: 11012 L: 10961 D: 39126
sprt @ 10+0.1 th 1 Go even further and quarter imbalance resolution.
10-07-17 jo imb8 diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 27094 W: 4990 L: 4742 D: 17362
sprt @ 10+0.1 th 1 Try the opposite, halve imbalance resolution. This would noticeably decrease parameter space and increase tuning sensitivity.
04-07-17 jo imbalanceNEU diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 1589 W: 272 L: 439 D: 878
sprt @ 10+0.1 th 1 Attempt to rewrite imbalance eval. Start with bishop pair. (Test as simplification, even though the majority of the LOC deleted are simple constant values. But this may still finish faster than a fixed-number-of-games test, and will still give enough information on how much worse this is, too!)
03-07-17 jo redundancy_fix diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 25554 W: 4600 L: 4617 D: 16337
sprt @ 10+0.1 th 1 Now test Knight and Queen redundancy. Take 2.
02-07-17 jo redundancy_fix diff
ELO: -2.19 +-2.9 (95%) LOS: 6.9%
Total: 20000 W: 3542 L: 3668 D: 12790
20000 @ 10+0.1 th 1 A quick check to see what happens if we correctly apply the redundancy bonus only when we have more than one piece of a kind. (Bugfix for bishop pair.)
02-07-17 jo redundancy_fix diff
ELO: -42.43 +-8.7 (95%) LOS: 0.0%
Total: 2691 W: 429 L: 756 D: 1506
20000 @ 10+0.1 th 1 A quick check to see what happens if we correctly apply the redundancy bonus only when we have more than one piece of a kind.
28-06-17 jo singular1 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 24129 W: 4282 L: 4305 D: 15542
sprt @ 10+0.1 th 1 Compensate by increasing search depth. Take 2.
27-06-17 jo singular1 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 8572 W: 1249 L: 1338 D: 5985
sprt @ 20+0.2 th 1 Don't allow singular extensions during singular search. This should make singular searches a bit cheaper, especially at higher depths. But maybe also a bit more unreliable? (Bench only changes at higher depths, therefore I test at 20+0.2.)
25-06-17 jo aspiration_simple diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 12016 W: 2040 L: 2216 D: 7760
sprt @ 10+0.1 th 1 Simplification. Leave the non-failing bounds untouched.
24-06-17 jo aspiration_full diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 12211 W: 2171 L: 2286 D: 7754
sprt @ 10+0.1 th 1 Now go the full way and always center window around new score. Take 2. (Suggested by Uri)
23-06-17 jo aspiration_full diff
LLR: -3.26 (-2.94,2.94) [0.00,4.00]
Total: 80431 W: 14368 L: 14269 D: 51794
sprt @ 10+0.1 th 1 Aspiration change. (Still ok to test as parameter tweak?!)
19-06-17 jo statsFix diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 5410 W: 959 L: 1062 D: 3389
sprt @ 10+0.1 th 1 Reset stat scores for TT move and captures and also decrease reduction for good moves in these cases. Take 2.
19-06-17 jo qimbalance diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 15608 W: 2810 L: 2869 D: 9929
sprt @ 10+0.1 th 1 Retest queen vs 3 minors imbalance tweak.
19-06-17 jo statsFix diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 6478 W: 1129 L: 1227 D: 4122
sprt @ 10+0.1 th 1 Reset stat scores for TT move and captures and also further increase reduction for bad moves in these cases.
17-06-17 jo statsT diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 29004 W: 5297 L: 5298 D: 18409
sprt @ 10+0.1 th 1 Exclude promotions as previous quiet ttMove.
10-06-17 jo null_tweak2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 31989 W: 5864 L: 5852 D: 20273
sprt @ 10+0.1 th 1 Only allow consecutive null moves if the static eval of the last null move was high enough. Take 2.
10-06-17 jo null_tweak2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 8375 W: 1525 L: 1615 D: 5235
sprt @ 10+0.1 th 1 Don't allow immediate consecutive null moves. Take 1. Idea is that 1 null move might work but 2 are already too generous.
30-05-17 jo measure_endgames diff
ELO: -5.25 +-2.9 (95%) LOS: 0.0%
Total: 20000 W: 3522 L: 3824 D: 12654
20000 @ 10+0.1 th 1 Endgame experiment. A quick measure of most of the endgames deleted. (Half throughput)
23-05-17 jo imbalance_depth1 diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 4727 W: 885 L: 1028 D: 2814
sprt @ 10+0.1 th 1 Another tuning experiment.
20-05-17 jo lmr_threshold1 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 13776 W: 2445 L: 2512 D: 8819
sprt @ 10+0.1 th 1 LMR move count threshold tweak.
20-05-17 jo outpost_depth1 diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 24200 W: 4360 L: 4434 D: 15406
sprt @ 10+0.1 th 1 Almost passed attempt plus changed Lever values.
12-05-17 jo lazy_skip2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 2416 W: 345 L: 458 D: 1613
sprt @ 5+0.05 th 7 Two ideas combined.
08-05-17 jo no_protector diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 46587 W: 8428 L: 8665 D: 29494
sprt @ 10+0.1 th 1 Simplify away Protector eval and try to compensate by modifying piece values. Take 1.
03-05-17 jo issue502 diff
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 41443 W: 5339 L: 5299 D: 30805
sprt @ 60+0.6 th 1 Make sure the final solution for issue #502 doesn't regress! See PR #1074
26-04-17 jo outpost_depth1 diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 17355 W: 3060 L: 3158 D: 11137
sprt @ 10+0.1 th 1 Now with some more tuning on pawns, each of them with not less than 1 million games.
23-04-17 jo lazy_skip2 diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 20272 W: 3171 L: 3215 D: 13886
sprt @ 5+0.05 th 7 Also skip depths lower than those the main thread has already finished searching. Seems to be a non-issue at shallow depths.
22-04-17 jo lazy_random diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 7735 W: 1168 L: 1260 D: 5307
sprt @ 5+0.05 th 7 Now correctly search all shuffled root moves, but only for the first iteration. The effect is that more nodes are being searched. Whether this is good or bad ... ?
16-04-17 jo lazy_random diff
LLR: -1.72 (-2.94,2.94) [0.00,5.00]
Total: 32000 W: 5176 L: 5123 D: 21701
sprt @ 5+0.05 th 7 Does it help to shuffle the root moves of the helper threads?
12-04-17 jo outpost_depth1 diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 270128 W: 49716 L: 48945 D: 171467
sprt @ 10+0.1 th 1 Final try with some more tuned values.
13-04-17 jo lazyOddEven diff
ELO: -8.57 +-6.8 (95%) LOS: 0.7%
Total: 3000 W: 419 L: 493 D: 2088
3000 @ 5+0.05 th 11 Try an odd-even distribution of the root moves for higher threads. (Compare with tries of Stéphane.)
10-04-17 jo sf2 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 3460 W: 565 L: 675 D: 2220
sprt @ 10+0.1 th 1 Scale more. Take 2.
10-04-17 jo sf2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 16177 W: 2962 L: 3018 D: 10197
sprt @ 10+0.1 th 1 Endgames with still lots of pawns tend to be drawish. Take 1.
07-04-17 jo outpost_depth1 diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 31397 W: 5653 L: 5703 D: 20041
sprt @ 10+0.1 th 1 Result of a tuning at fixed depth 1 with a much narrower range.
06-04-17 jo outpost_depth1 diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 7283 W: 1290 L: 1422 D: 4571
sprt @ 10+0.1 th 1 A last fixed depth 1 tuning experiment with values after more than 1.2 million games played. (most likely still not enough!)
05-04-17 jo threats_tuning2 diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 40621 W: 7317 L: 7335 D: 25969
sprt @ 10+0.1 th 1 Take 2 with updated values.
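
The ELO, LLR and bound figures in the entries above can be approximately reproduced from the W/L/D counts alone. The sketch below is a hedged illustration, not the queue's actual code: it assumes a logistic Elo estimate with a delta-method 95% margin for the fixed-games tests, and a BayesElo-style trinomial model (draw rate estimated from the sample) for the SPRT log-likelihood ratio. The page does not state which model it uses, and the function names are made up for this example.

```python
import math

def elo_with_margin(wins, losses, draws, z=1.96):
    """Logistic Elo estimate and approximate 95% margin from W/L/D counts."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1 - score) ** 2 + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    elo = -400 * math.log10(1 / score - 1)
    # Delta method: propagate the standard error through the Elo curve.
    slope = 400 / (math.log(10) * score * (1 - score))
    return elo, z * slope * math.sqrt(var / n)

def bayeselo_probs(elo, draw_elo):
    """Win/loss/draw probabilities under the assumed BayesElo model."""
    win = 1 / (1 + 10 ** ((-elo + draw_elo) / 400))
    loss = 1 / (1 + 10 ** ((elo + draw_elo) / 400))
    return win, loss, 1 - win - loss

def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    n = wins + losses + draws
    pw, pl = wins / n, losses / n
    # Estimate the draw parameter from the observed win and loss rates.
    draw_elo = 200 * math.log10((1 - pl) / pl * (1 - pw) / pw)
    w0, l0, d0 = bayeselo_probs(elo0, draw_elo)
    w1, l1, d1 = bayeselo_probs(elo1, draw_elo)
    return (wins * math.log(w1 / w0) + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))

def sprt_bounds(alpha=0.05, beta=0.05):
    """Stopping bounds of the sequential test for the given error rates."""
    return math.log(beta / (1 - alpha)), math.log((1 - beta) / alpha)

# Counts taken from two entries in the list above.
elo, margin = elo_with_margin(636, 833, 2531)    # 21-07-17 noEvalInTT^
llr = sprt_llr(1546, 1634, 5534, 0.0, 5.0)       # 18-08-17 adaptive_pruning
lower, upper = sprt_bounds()
print(f"ELO: {elo:.2f} +-{margin:.1f}  LLR: {llr:.2f} ({lower:.2f},{upper:.2f})")
```

Under these assumptions the noEvalInTT^ counts give roughly -17.1 +-6.5, the first adaptive_pruning counts give an LLR of about -2.95 for [0.00,5.00], and error rates of 0.05 give bounds of about (-2.94,2.94), in line with the figures listed. A test stops as failed once the LLR drops below the lower bound and as passed once it exceeds the upper bound.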