Stockfish Testing Queue

Pending - 0 tests 0.0 hrs

None

Active - 0 tests

Finished - 856 tests

18-03-14 lbr connectivity diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 20739 W: 4259 L: 4342 D: 12138
sprt @ 10+0.1 th 1 Connectivity tune
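The bracket (-2.94,2.94) shown next to each LLR is consistent with Wald's SPRT stopping bounds for error rates alpha = beta = 0.05. That error rate is an assumption inferred from the displayed numbers, not something this page states. A minimal sketch of how such bounds and the resulting verdict could be computed (the function names are hypothetical, not fishtest's):

```python
import math


def sprt_bounds(alpha=0.05, beta=0.05):
    """Return (lower, upper) Wald SPRT bounds on the log-likelihood ratio.

    alpha = beta = 0.05 is an assumption chosen to match the
    (-2.94, 2.94) bracket shown in the listing.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def sprt_status(llr, alpha=0.05, beta=0.05):
    """Classify an LLR value against the SPRT bounds."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "H1 accepted"   # patch passes
    if llr <= lower:
        return "H0 accepted"   # patch fails
    return "inconclusive"      # test stopped before reaching a bound


if __name__ == "__main__":
    lo, hi = sprt_bounds()
    print(f"bounds: ({lo:.2f}, {hi:.2f})")
    for llr in (-2.95, 2.95, 1.76):
        print(llr, sprt_status(llr))
```

The listing rounds LLR to two decimals, so a displayed value of exactly -2.94 or 2.94 cannot be classified reliably from the page alone.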
18-03-14 lbr probcut diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 2545 W: 489 L: 607 D: 1449
sprt @ 10+0.1 th 1 probcut tweak. test on top of PR 1487 (make using qsearch implicit)
18-03-12 lbr probcut diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 43866 W: 8963 L: 9208 D: 25695
sprt @ 10+0.1 th 1 Probcut: only 1 capture
18-03-12 lbr probcutTTmove diff
LLR: -2.94 (-2.94,2.94) [-3.00,1.00]
Total: 1361 W: 198 L: 360 D: 803
sprt @ 10+0.1 th 1 Probcut TT move
18-03-11 lbr probcut diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 12989 W: 2669 L: 2857 D: 7463
sprt @ 10+0.1 th 1 take 3: Is this threshold really sensitive, or can it be simplified away ?
18-03-11 lbr probcut diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 16310 W: 3273 L: 3326 D: 9711
sprt @ 10+0.1 th 1 take 2
18-03-11 lbr probcut diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 5789 W: 1144 L: 1246 D: 3399
sprt @ 10+0.1 th 1 Use TT refined eval for probcut threshold
18-02-17 lbr threatByPawn diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 107825 W: 17850 L: 18177 D: 71798
sprt @ 60+0.6 th 1 simplify ThreatBySafePawn
18-02-17 lbr threatByPawn diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 50844 W: 11166 L: 11102 D: 28576
sprt @ 10+0.1 th 1 simplify ThreatBySafePawn
18-02-17 lbr threat diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 14389 W: 3225 L: 3421 D: 7743
sprt @ 10+0.1 th 1 Drop 'defended' from threats logic (Marco's test) + rescale weights (based on bench), to avoid losing elo by simple bias from optimally tuned values.
18-01-14 lbr space diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 11055 W: 2032 L: 2110 D: 6913
sprt @ 10+0.1 th 1 take 3
18-01-14 lbr space diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 10260 W: 1876 L: 1957 D: 6427
sprt @ 10+0.1 th 1 take 2
18-01-14 lbr space diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 2405 W: 408 L: 524 D: 1473
sprt @ 10+0.1 th 1 Increase space area
17-10-29 lbr noendgame diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 13751 W: 2441 L: 2622 D: 8688
sprt @ 10+0.1 th 1 Remove all specific endgame knowledge. Leaving only general purpose scaling rules (and of course perfect knowledge for draw by chess rules, and syzygy endgames). Suggested by Marco in PR #1280. This test is running with adjudication disabled (see hack in UCI::value()), to make sure that the inability to convert difficult endgames will be penalized. It is, of course, running without syzygy (impossible to test in fishtest).
17-09-07 lbr stopFaster diff
LLR: 1.76 (-2.94,2.94) [-3.00,1.00]
Total: 128000 W: 23237 L: 23363 D: 81400
sprt @ 10+0.1 th 1 check time 2x more often
17-08-27 lbr qsdraw diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 10469 W: 1866 L: 1947 D: 6656
sprt @ 10+0.1 th 1 Don't check draw in deep QS
17-08-21 lbr noeasy diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 15490 W: 2862 L: 3048 D: 9580
sprt @ 10+0.1 th 1 retire easy move
17-08-20 lbr time diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 16371 W: 3092 L: 2963 D: 10316
sprt @ 40/10 th 1 Restore safety margin of 60ms. Tournament tc.
17-08-19 lbr time diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 59664 W: 10945 L: 10891 D: 37828
sprt @ 16+0 th 1 Restore safety margin of 60ms. Sudden death tc.
17-08-19 lbr time diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 58008 W: 10674 L: 10617 D: 36717
sprt @ 10+0.1 th 1 Restore safety margin of 60ms. Previous code used 60ms incompressible safety margin, plus an additional 30ms for each "move to go". This patch is also a simplification, removing a small wart.
17-08-15 lbr stats16 diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 31489 W: 4532 L: 4295 D: 22662
sprt @ 40+0.4 th 1 16bit stats, 3rd test. Verify that we really have an elo gain with strong hash pressure. This time use longer time control, and larger hash to ensure that the hash size is big w.r.t. CPU cache sizes, and it's not an artificial effect that only works with microscopic hash sizes but doesn't scale.
17-08-13 lbr stats16 diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 73542 W: 13058 L: 13026 D: 47458
sprt @ 10+0.1 th 1 16bit stats, 2nd test. Now verify no regression with low hash pressure.
17-08-09 lbr stats16 diff
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 258430 W: 46977 L: 45943 D: 165510
sprt @ 10+0.1 th 1 16bit stats: does reducing memory footprint by 1.2MB translate into a measurable speed-up? let's find out in 2 tests: (1) Hash=2 (high pressure) (2) Hash=8 (low pressure).
17-08-05 lbr futility^ diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 61689 W: 10966 L: 10915 D: 39808
sprt @ 10+0.1 th 1 futility: 1 depth lower
17-08-05 lbr futility diff
LLR: -3.50 (-2.94,2.94) [0.00,4.00]
Total: 43380 W: 7795 L: 7833 D: 27752
sprt @ 10+0.1 th 1 futility: 1 depth higher
17-08-05 lbr singular3 diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 12208 W: 2085 L: 2200 D: 7923
sprt @ 10+0.1 th 1 singular tweak: take 3
17-08-05 lbr singular4 diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 7098 W: 1241 L: 1374 D: 4483
sprt @ 10+0.1 th 1 singular tweak: take 4
17-08-05 lbr singular diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 4806 W: 817 L: 958 D: 3031
sprt @ 10+0.1 th 1 singular tweak
17-08-05 lbr singular2 diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 9144 W: 1513 L: 1638 D: 5993
sprt @ 10+0.1 th 1 singular tweak: take 2
17-07-01 lbr singular diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 6954 W: 1223 L: 1319 D: 4412
sprt @ 10+0.1 th 1 don't update stats with partial search
17-06-26 lbr recursive_singular diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 31680 W: 5701 L: 5598 D: 20381
sprt @ 10+0.1 th 1 Allow recursive SE
17-01-29 lbr master diff
ELO: 8.82 +-1.5 (95%) LOS: 100.0%
Total: 40000 W: 4595 L: 3580 D: 31825
40000 @ 60+0.6 th 1 Regression test until "Simplify TT penalty stat"
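The ELO, error margin, and LOS figures of a fixed-length test like the one above can be reproduced approximately from the W/L/D counts. The sketch below uses the standard logistic Elo formula, a normal approximation for the 95% margin, and the usual erf-based likelihood of superiority. It is an illustration under those assumptions and may not match fishtest's exact computation; the function name is hypothetical.

```python
import math


def elo_stats(wins, losses, draws):
    """Return (elo, margin95, los) estimated from game counts.

    elo      : logistic Elo difference implied by the mean score
    margin95 : half-width of an approximate 95% interval, in Elo
    los      : likelihood of superiority, in [0, 1]
    """
    n = wins + losses + draws
    w, l, d = wins / n, losses / n, draws / n
    score = w + 0.5 * d

    def to_elo(s):
        return -400.0 * math.log10(1.0 / s - 1.0)

    # Per-game variance of the score, then standard error of the mean.
    var = w * (1.0 - score) ** 2 + l * score ** 2 + d * (0.5 - score) ** 2
    stderr = math.sqrt(var / n)

    # Map the score interval through the Elo curve and halve its width.
    z = 1.959964  # two-sided 95% normal quantile
    margin = (to_elo(score + z * stderr) - to_elo(score - z * stderr)) / 2.0

    # Draws carry no information about which side is stronger.
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    return to_elo(score), margin, los


if __name__ == "__main__":
    elo, margin, los = elo_stats(4595, 3580, 31825)
    print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {los * 100:.1f}%")
```

The counts used in the usage line are those of the regression test above (W: 4595 L: 3580 D: 31825).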
17-01-29 lbr hinder diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 7609 W: 1331 L: 1502 D: 4776
sprt @ 10+0.1 th 1 last try
17-01-28 lbr hinder diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 35434 W: 6410 L: 6628 D: 22396
sprt @ 10+0.1 th 1 remove HinderPassedPawn, compensating in Passed[MG][]
17-01-28 lbr hinder diff
LLR: -0.02 (-2.94,2.94) [-3.00,1.00]
Total: 89 W: 11 L: 12 D: 66
sprt @ 10+0.1 th 1 do we need HinderPassedPawn ?
17-01-28 lbr ring diff
LLR: -0.02 (-2.94,2.94) [-3.00,1.00]
Total: 23 W: 3 L: 4 D: 16
sprt @ 10+0.1 th 1 simplify king ring
17-01-24 lbr bonus diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 4768 W: 810 L: 915 D: 3043
sprt @ 10+0.1 th 1 depth^1.9
17-01-24 lbr bonus^ diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 2753 W: 423 L: 536 D: 1794
sprt @ 10+0.1 th 1 depth^1.8
17-01-22 lbr tune diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 921 W: 121 L: 244 D: 556
sprt @ 10+0.1 th 1 try tuned values
17-01-22 lbr tune diff
36508/40000 iterations
74996/80000 games played
80000 @ 20+0.2 th 1 tune history
17-01-13 lbr lazy diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 40683 W: 7223 L: 7449 D: 26011
sprt @ 10+0.1 th 1 simplify the lazy eval, but respecting the normal eval logic (mg/eg blending + tempo)
17-01-12 lbr counterMoves diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21975 W: 3177 L: 3360 D: 15438
sprt @ 30+0.3 th 1 do we need counterMoves ? Rerun STC at 30+0.3 (Throughput x1/3), because I suspect counterMoves do not scale, and CMH takes over at longer tc.
17-01-11 lbr counterMoves diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 27570 W: 4844 L: 5047 D: 17679
sprt @ 10+0.1 th 1 do we need counterMoves ?
17-01-10 lbr counterMoves diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 85688 W: 11144 L: 10985 D: 63559
sprt @ 60+0.6 th 1 LTC: Use (from,to) instead of (pc,to) for MoveStats
17-01-10 lbr counterMoves diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 39202 W: 7113 L: 6823 D: 25266
sprt @ 10+0.1 th 1 Use (from,to) instead of (pc,to) for MoveStats
17-01-08 lbr history diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 26565 W: 3519 L: 3406 D: 19640
sprt @ 60+0.6 th 1 LTC: do we still need HistoryStats ?
17-01-08 lbr history diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 6780 W: 1122 L: 1289 D: 4369
sprt @ 10+0.1 th 1 take 2: double FromToStats to compensate for the removal of HistoryStats
17-01-08 lbr history diff
LLR: 3.44 (-2.94,2.94) [-3.00,1.00]
Total: 120831 W: 21572 L: 21594 D: 77665
sprt @ 10+0.1 th 1 do we still need HistoryStats ?
17-01-07 lbr tune diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 36974 W: 4822 L: 4819 D: 27333
sprt @ 60+0.6 th 1 LTC: test tuned values. SPRT(0,5), because (0,4) is really too costly.
17-01-06 lbr tune diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 224957 W: 40981 L: 40060 D: 143916
sprt @ 10+0.1 th 1 test tuned values: rescheduling, as it was stopped by biffhero (i don't know if this is the known fishtest bug causing wrong bench from time to time...)