Stockfish Testing Queue

Finished - 50774 tests

15-10-20 mbo accurate_tt diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 2776 W: 416 L: 528 D: 1832
sprt @ 10+0.1 th 7 Test a more accurate TT entry. (Second try, because my master was not updated)
15-10-20 mbo accurate_tt diff
LLR: 0.19 (-2.94,2.94) [0.00,5.00]
Total: 74 W: 15 L: 7 D: 52
sprt @ 10+0.1 th 7 Test a more accurate TT entry.
15-10-05 Voy Simple diff
LLR: 0.17 (-2.94,2.94) [-3.00,1.00]
Total: 128000 W: 19228 L: 19408 D: 89364
sprt @ 60+0.05 th 1 LTC: Simplification of a recently passed test (YellowCombo), hoping that this may gain a bit of ELO as well.
15-10-20 Roc RookExperiment diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 12110 W: 2385 L: 2501 D: 7224
sprt @ 10+0.1 th 1 Do we need the rook x-ray through a rook? Run as a parameter tweak.
15-10-20 sni king_march2 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 17399 W: 3403 L: 3452 D: 10544
sprt @ 10+0.1 th 1 Try to use vertical king separation (take 3)
15-10-15 Roc MobExtended diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 17449 W: 3350 L: 3399 D: 10700
sprt @ 10+0.1 th 1 Take 2: only extend the Rook
15-10-19 Roc MinorThreatSimplified diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 74116 W: 11001 L: 10958 D: 52157
sprt @ 60+0.4 th 1 Simpler threat handling
15-10-20 IIv time_management diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 32390 W: 4953 L: 4954 D: 22483
sprt @ 60+0.05 th 1 Time usage, sprt[0,5], LTC. 60+0.05 is more interesting here than 40+0.4, because 40/0.4 = 10/0.1 (the same base-to-increment ratio as the 10+0.1 STC).
15-10-20 sni king_march diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 6927 W: 1357 L: 1454 D: 4116
sprt @ 10+0.1 th 1 Try to use vertical king separation (take 2, double bonus)
15-10-20 IIv time_management diff
LLR: -0.03 (-2.94,2.94) [0.00,5.00]
Total: 76 W: 8 L: 9 D: 59
sprt @ 60+0.4 th 1 Time usage, sprt[0,5], Take2, LTC
15-10-20 tvi lazy_NUMA2 diff
ELO: -6.52 +-7.9 (95%) LOS: 5.4%
Total: 2664 W: 458 L: 508 D: 1698
5000 @ 10+0.1 th 3 Retest of rewritten version on non NUMA
15-10-19 IIv time_management diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 8549 W: 1596 L: 1685 D: 5268
sprt @ 10+0.1 th 1 Time usage, sprt[0,5], new parameter added
15-10-19 tvi lazy_NUMA2 diff
ELO: -31.39 +-35.7 (95%) LOS: 4.1%
Total: 111 W: 12 L: 22 D: 77
5000 @ 10+0.1 th 23 Quick test of NUMA hack on NUMA iron
15-10-18 mco lazy_smp diff
ELO: 28.01 +-6.7 (95%) LOS: 100.0%
Total: 2275 W: 349 L: 166 D: 1760
10000 @ 120+0.1 th 20 lazy smp pre-merge test: high-threads scenario. With 20 threads, test with a fixed number of games at XLTC to compare against the same conditions at LTC.
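For fixed-game entries such as the one above, the ELO figure and its 95% margin can be approximated from the W/L/D counts alone. The following is a minimal sketch under stated assumptions, not the framework's own code: the function names are hypothetical, the logistic Elo model is assumed, and the interval is a plain normal approximation with z = 1.96.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, margin); margin is half the width of the ~95% interval."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + losses * (0.0 - score) ** 2
           + draws * (0.5 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    lo = elo_from_score(score - z * stderr)
    hi = elo_from_score(score + z * stderr)
    return elo_from_score(score), (hi - lo) / 2.0

# Counts from the lazy_smp entry above (W: 349 L: 166 D: 1760).
elo, margin = elo_with_margin(349, 166, 1760)
print(f"ELO: {elo:.2f} +-{margin:.1f} (95%)")
```

With those counts the result lands close to the 28.01 +-6.7 reported in the entry.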
15-10-20 Voy SPL diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 12458 W: 2329 L: 2401 D: 7728
sprt @ 10+0.1 th 1 Try to improve SEE pruning logic...
15-10-19 IIv time_management diff
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 25583 W: 5011 L: 4770 D: 15802
sprt @ 10+0.1 th 1 Time usage, sprt[0,5], Take2
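The LLR lines can be reproduced, at least approximately, with a trinomial BayesElo model. The sketch below is an assumption about how the figure is computed, not code taken from this page: the function names are hypothetical, the [elo0,elo1] interval is read as BayesElo units, the draw parameter is estimated from the sample, and alpha = beta = 0.05 is inferred from the (-2.94,2.94) bounds.

```python
import math

def bayeselo_to_proba(elo, drawelo):
    """Win/loss/draw probabilities under the BayesElo model."""
    win = 1.0 / (1.0 + 10.0 ** ((-elo + drawelo) / 400.0))
    loss = 1.0 / (1.0 + 10.0 ** ((elo + drawelo) / 400.0))
    return win, loss, 1.0 - win - loss

def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0).

    Requires wins, losses and draws to all be non-zero.
    """
    n = wins + losses + draws
    w, l = wins / n, losses / n
    # Draw parameter estimated from the observed frequencies.
    drawelo = 200.0 * math.log10((1 - l) / l * (1 - w) / w)
    w0, l0, d0 = bayeselo_to_proba(elo0, drawelo)
    w1, l1, d1 = bayeselo_to_proba(elo1, drawelo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))

def sprt_bounds(alpha=0.05, beta=0.05):
    """Stopping bounds: fail below the first value, pass above the second."""
    return math.log(beta / (1 - alpha)), math.log((1 - beta) / alpha)

# Counts from the time_management entry above, tested as sprt[0,5].
print(round(sprt_llr(5011, 4770, 15802, 0.0, 5.0), 2))
print(tuple(round(b, 2) for b in sprt_bounds()))
```

For the entry above (W: 5011 L: 4770 D: 15802) this gives an LLR of about 2.96, just past the upper bound of log(19), roughly 2.94, which is why the test stopped as a pass.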
15-10-19 sni pawn_mobility3 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 8210 W: 1537 L: 1627 D: 5046
sprt @ 10+0.1 th 1 Pawn mobility in endgame
15-10-19 Voy QuietMoves diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 4558 W: 823 L: 930 D: 2805
sprt @ 10+0.1 th 1 Include pseudo-quiet moves that have a negative SEE value in our stats.
15-10-19 gli razor_margin diff
LLR: -0.33 (-2.94,2.94) [0.00,5.00]
Total: 396 W: 42 L: 54 D: 300
sprt @ 240+0.4 th 3 Testing timeout fix.
15-10-19 IIv time_management diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 12148 W: 2337 L: 2410 D: 7401
sprt @ 10+0.1 th 1 Time usage, sprt [0,5], Take1
15-10-19 sni material5 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 16728 W: 3131 L: 3184 D: 10413
sprt @ 10+0.1 th 1 Keep balanced material to attack (take2, lower malus)
15-10-19 sg initiative diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 47259 W: 9174 L: 9089 D: 28996
sprt @ 10+0.1 th 1 Use pawn span for initiative. Half weight. Take 4
15-10-19 Roc MinorThreatSimplified diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 18879 W: 3725 L: 3600 D: 11554
sprt @ 10+0.1 th 1 Simpler threat handling
15-10-15 Roc MobExtended diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 59483 W: 11382 L: 11245 D: 36856
sprt @ 10+0.1 th 1 Take 3: extend only the Bishop
15-10-19 Voy HistoryStats diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 42420 W: 8177 L: 8184 D: 26059
sprt @ 10+0.1 th 1 Take 2. (More aggressive tweak)
15-10-19 sg initiative diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 16502 W: 3194 L: 3247 D: 10061
sprt @ 10+0.1 th 1 Pawn count and pawn span are highly correlated (r=0.78). So try to replace the former with the latter.
15-10-19 sg initiative diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 20994 W: 4103 L: 4135 D: 12756
sprt @ 10+0.1 th 1 Use pawn span for initiative. Double up weight. Take 2
15-10-18 sni material5 diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 20309 W: 3871 L: 3908 D: 12530
sprt @ 10+0.1 th 1 Keep balanced material to attack
15-10-19 gli razor_margin diff
LLR: -0.12 (-2.94,2.94) [0.00,5.00]
Total: 1070 W: 103 L: 105 D: 862
sprt @ 240+0.4 th 3 Testing timeout fix.
15-10-18 jhe simple_time_2 diff
LLR: -3.04 (-2.94,2.94) [-3.00,1.00]
Total: 7571 W: 1065 L: 1232 D: 5274
sprt @ 40+0.4 th 1 LTC
15-10-18 sni material5 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 12721 W: 2415 L: 2485 D: 7821
sprt @ 10+0.1 th 1 Try to keep enough material (take 2)
15-10-18 IIv time_management diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15253 W: 2998 L: 2866 D: 9389
sprt @ 10+0.1 th 1 My last try to help with time_management. I know this is not an ideal solution, but maybe it could serve until a better solution is found.
15-10-18 tvi lazy_NUMA2 diff
ELO: 0.23 +-6.0 (95%) LOS: 53.0%
Total: 4551 W: 801 L: 798 D: 2952
5000 @ 10+0.1 th 3 Quick test of NUMA hack
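The LOS (likelihood of superiority) figure on fixed-game entries depends only on wins and losses, since draws carry no information about which side is stronger. A minimal sketch, assuming the usual normal approximation; the function name is hypothetical and this is not taken from the framework's source:

```python
import math

def los(wins, losses):
    """Likelihood of superiority in [0, 1], ignoring draws."""
    return 0.5 + 0.5 * math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses)))

# Counts from the lazy_NUMA2 entry above (W: 801 L: 798).
print(f"LOS: {100.0 * los(801, 798):.1f}%")
```

With W: 801 L: 798 this comes out near the 53.0% shown in the entry, which is barely better than a coin flip.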
15-10-19 gli razor_margin diff
Pending...
sprt @ 240+0.4 th 3 Testing timeout fix.
15-10-18 sg initiative diff
LLR: -3.26 (-2.94,2.94) [0.00,5.00]
Total: 88856 W: 17286 L: 17028 D: 54542
sprt @ 10+0.1 th 1 Use pawn span for initiative.
15-10-17 Voy HistoryStats diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 88058 W: 16923 L: 16768 D: 54367
sprt @ 10+0.1 th 1 Improve History Stats weights...
15-10-18 SC assorted_tuning diff
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 21577 W: 3507 L: 3289 D: 14781
sprt @ 40+0.4 th 1 Retest assorted tuning without time management as discussed in https://github.com/official-stockfish/Stockfish/pull/464 LTC
15-10-17 jhe simple_time_2 diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 53079 W: 10230 L: 10166 D: 32683
sprt @ 10+0.1 th 1 Further Code Cleanup.
15-10-15 sni threats6 diff
LLR: -3.04 (-2.94,2.94) [0.00,5.00]
Total: 38229 W: 7295 L: 7256 D: 23678
sprt @ 10+0.1 th 1 Change values of threats on attacked queens (take 1, change = S(10,10))
15-10-18 mco lazy_smp diff
ELO: -4.09 +-33.1 (95%) LOS: 40.4%
Total: 85 W: 8 L: 9 D: 68
10000 @ 180+0.1 th 20 lazy smp pre-merge test: high-threads scenario. With 20 threads, test with a fixed number of games at XXLTC to compare against the same conditions at LTC.
15-10-18 sni material5 diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 26878 W: 5149 L: 5156 D: 16573
sprt @ 10+0.1 th 1 Try to keep enough material
15-10-18 jos razor_margin diff
LLR: 0.08 (-2.94,2.94) [0.00,5.00]
Total: 64 W: 4 L: 1 D: 59
sprt @ 240+0.4 th 3 Respin an old patch at very long TC to check that the timeout fix works. The test will be cancelled as soon as the result is clear one way or the other. Now with 3 threads and 4 minutes, to make sure we hit the 5-minute limit.
15-10-18 sni rook_scale_factor diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 33176 W: 6229 L: 6270 D: 20677
sprt @ 10+0.1 th 1 Simplify logic in SF's KRPPKRP endgame. Simplification, tested as sprt(0..4)
15-10-18 mco lazy_smp diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 4235 W: 671 L: 528 D: 3036
sprt @ 60+0.1 th 7 lazy smp pre-merge test: middle case scenario. With 7 threads YBW is still good but lazy should start to be effective. Test as simplification because lazy_smp saves 385(!) lines of code. LTC version.
15-10-18 mco lazy_smp diff
ELO: 44.75 +-7.6 (95%) LOS: 100.0%
Total: 2069 W: 407 L: 142 D: 1520
10000 @ 60+0.1 th 20 lazy smp pre-merge test: high-threads scenario. With 20 threads test with fixed number of games at LTC because the advantage of lazy should be already clear after just 10K games and resources available for such a test are very few. Set at high priority because we want to allocate the few high core machines available. This is the _real_ test where lazy should prove stronger.
15-10-18 jos razor_margin diff
LLR: -0.06 (-2.94,2.94) [0.00,5.00]
Total: 29 W: 1 L: 3 D: 25
sprt @ 180+0.4 th 1 Respin an old patch at very long TC to check that the timeout fix works. The test will be cancelled as soon as the result is clear one way or the other.
15-10-18 SC assorted_tuning diff
LLR: 3.07 (-2.94,2.94) [0.00,4.00]
Total: 15124 W: 2974 L: 2756 D: 9394
sprt @ 10+0.1 th 1 Retest assorted tuning without time management as discussed in https://github.com/official-stockfish/Stockfish/pull/464
15-10-17 sni king_march diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 33232 W: 6508 L: 6485 D: 20239
sprt @ 10+0.1 th 1 Try to use vertical king separation info
15-10-18 mco lazy_smp diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 3607 W: 674 L: 526 D: 2407
sprt @ 10+0.1 th 7 lazy smp pre-merge test: middle case scenario. With 7 threads YBW is still good but lazy should start to be effective. Test as simplification because lazy_smp saves 385(!) lines of code.
15-10-17 aji hybrid_history diff
ELO: 2.92 +-3.9 (95%) LOS: 93.0%
Total: 10000 W: 1669 L: 1585 D: 6746
10000 @ 10+0.1 th 7 Use a shared history/countermoves within the cluster: STC