Stockfish Testing Queue

Finished - 38987 tests

14-01-31 uri lessnull diff
ELO: 1.54 +-2.1 (95%) LOS: 92.8%
Total: 40000 W: 7429 L: 7252 D: 25319
40000 @ 15+0.05 th 1 I want to compare: skip null move pruning only at depths greater than 14, to see if I get different results.
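The ELO, error-bar, and LOS figures in fixed-game entries such as the one above can be recomputed from the W/L/D counts. The following is a minimal sketch, not the site's own code: it assumes the logistic Elo formula, a normal approximation for the 95% interval, and an LOS that ignores draws. The function names `elo_from_score` and `summarize` are hypothetical.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def summarize(wins, losses, draws):
    """Return (elo, 95% half-width in Elo, LOS) for a W/L/D record."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + losses * (0.0 - score) ** 2
           + draws * (0.5 - score) ** 2) / n
    half = 1.96 * math.sqrt(var / n)  # normal-approximation half-width
    elo = elo_from_score(score)
    err = (elo_from_score(score + half) - elo_from_score(score - half)) / 2.0
    # LOS: normal CDF of (W - L) / sqrt(W + L); draws carry no information.
    los = 0.5 * (1.0 + math.erf((wins - losses)
                                / math.sqrt(2.0 * (wins + losses))))
    return elo, err, los

# Counts taken from the "uri lessnull" entry above.
elo, err, los = summarize(7429, 7252, 25319)
print(f"ELO: {elo:.2f} +-{err:.1f} (95%) LOS: {100 * los:.1f}%")
```

Under these assumptions the printed line should agree with the entry's reported figures up to rounding.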
14-01-31 Fis king_pawns_psqt diff
LLR: -0.00 (-2.94,2.94) [-1.50,4.50]
Total: 65 W: 9 L: 9 D: 47
sprt @ 15+0.05 th 1 Make king side pawns worth a bit more than queen side.
14-01-31 jki ppsqt3 diff
ELO: 1.64 +-2.0 (95%) LOS: 94.8%
Total: 50000 W: 10664 L: 10428 D: 28908
50000 @ 5+0.05 th 1 ppsqt: H-file penalty and center bonus
14-01-31 Fis king_pawns_psqt diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 31058 W: 5767 L: 5791 D: 19500
sprt @ 15+0.05 th 1 Make king side pawns worth a bit more than queen side.
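The LLR line in sprt entries such as the one above can be approximated from the W/L/D counts and the bracketed Elo bounds. The sketch below makes several assumptions that the listing itself does not state: a BayesElo-style win/draw/loss model, a draw parameter (`drawelo`) estimated from the sample, bounds [-1.50,4.50] expressed in BayesElo units, and alpha = beta = 0.05 (which is consistent with the displayed stopping bounds of -2.94 and 2.94). All function names are hypothetical.

```python
import math

def bayeselo_to_proba(elo, drawelo):
    """Win/loss/draw probabilities under the BayesElo model."""
    win = 1.0 / (1.0 + 10.0 ** ((-elo + drawelo) / 400.0))
    loss = 1.0 / (1.0 + 10.0 ** ((elo + drawelo) / 400.0))
    return win, loss, 1.0 - win - loss

def estimate_drawelo(wins, losses, draws):
    """Draw parameter that reproduces the observed win and loss rates."""
    n = wins + losses + draws
    w, l = wins / n, losses / n
    return 200.0 * math.log10((1.0 - l) / l * (1.0 - w) / w)

def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    drawelo = estimate_drawelo(wins, losses, draws)
    w0, l0, d0 = bayeselo_to_proba(elo0, drawelo)
    w1, l1, d1 = bayeselo_to_proba(elo1, drawelo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))

def sprt_bounds(alpha=0.05, beta=0.05):
    """Stopping thresholds: fail below the first, pass above the second."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)

# Counts and bounds taken from the "Fis king_pawns_psqt" entry above.
llr = sprt_llr(5767, 5791, 19500, elo0=-1.5, elo1=4.5)
lower, upper = sprt_bounds()
print(f"LLR: {llr:.2f} ({lower:.2f},{upper:.2f})")
```

If the assumptions hold, the LLR should land close to the value shown in the entry and fall below the lower threshold, which is why that test stopped as a failure.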
14-01-31 uri time_manag_change diff
LLR: -2.98 (-2.94,2.94) [-1.50,4.50]
Total: 9597 W: 1748 L: 1829 D: 6020
sprt @ 15+0.05 th 1 I test some changes: the change in timeman.cpp failed with a positive score in the past, and the changes in search are the same change that passed stage 1 and failed stage 2 without increasing slowmover from 70 to 80.
14-01-31 uri lessnull diff
ELO: 0.94 +-2.1 (95%) LOS: 81.5%
Total: 40000 W: 7327 L: 7219 D: 25454
40000 @ 40/15 th 1 Test a simpler version of not using null move pruning. I use this time control because I am afraid that at a fast time control the change does not reach enough depth to cause a significant change in the endgame.
14-01-31 jos iid_depth diff
LLR: -2.96 (-2.94,2.94) [-4.00,0.00]
Total: 18520 W: 3274 L: 3495 D: 11751
sprt @ 15+0.05 th 1 Small simplification to IID. Same reduced depth for both, PV and NonPV nodes.
14-01-31 jki ppsqt3 diff
ELO: 0.26 +-1.9 (95%) LOS: 60.9%
Total: 50000 W: 9380 L: 9342 D: 31278
50000 @ 15+0.05 th 1 ppsqt: H-file penalty and center bonus
14-01-31 uri lessnull diff
ELO: 0.25 +-2.3 (95%) LOS: 58.5%
Total: 32004 W: 5791 L: 5768 D: 20445
40000 @ 40/15 th 1 This seems to be the best version so far based on 15+0.05, so I want to test whether that is only luck or whether it also performs better at this time control, where you get higher depth in the endgame.
14-01-31 sg check_killers diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 21726 W: 4000 L: 4049 D: 13677
sprt @ 15+0.05 th 1 add check killers
14-02-01 uri lessnull diff
ELO: 0.33 +-2.9 (95%) LOS: 58.8%
Total: 20000 W: 3699 L: 3680 D: 12621
20000 @ 15+0.05 th 1 This is only for measurement. I expect to get a significant negative result after 20000 games and if I do not get it then it means that I probably need a longer time control to get them so I also need a longer time control to test my ideas
14-02-01 uri lessnull diff
ELO: -2.21 +-2.7 (95%) LOS: 5.4%
Total: 20000 W: 3058 L: 3185 D: 13757
20000 @ 60+0.05 th 1 I am surprised not to see a negative result after some thousands of games, so I test it at a longer time control to see if we have a bigger difference. Maybe it is better simply to avoid null move pruning when the remaining depth is high; in this case we can also do without the verification search code and detect zugzwangs by not using null move pruning.
14-02-01 sg lmr_exclude diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4448 W: 772 L: 865 D: 2811
sprt @ 15+0.05 th 1 LMR: exclude double checks
14-02-01 sg lmr_exclude diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 4556 W: 789 L: 882 D: 2885
sprt @ 15+0.05 th 1 LMR: exclude discovered checks
14-02-01 sg lmr_exclude diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 5965 W: 1066 L: 1156 D: 3743
sprt @ 15+0.05 th 1 LMR: less reduction for double checks
14-02-01 sg lmr_exclude diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 32903 W: 6017 L: 6037 D: 20849
sprt @ 15+0.05 th 1 LMR: less reduction for discovered checks
14-02-01 uri lessnull diff
ELO: -3.23 +-2.9 (95%) LOS: 1.4%
Total: 20000 W: 3525 L: 3711 D: 12764
20000 @ 15+0.05 th 1 Another try to change null move pruning. This is a simplification: I remove null move verification and skip null move pruning when the remaining depth is higher than the number of legal moves and at least 12 (so practically zugzwang problems should be solved after enough time).
14-02-01 uri zugzwang_detect diff
ELO: -1.45 +-2.1 (95%) LOS: 8.3%
Total: 40000 W: 7208 L: 7375 D: 25417
40000 @ 40/15 th 1 I do not know why my previous version failed, and I want to test something simpler: only limiting reductions to 10 plies.
14-02-01 jki ppsqt3 diff
ELO: 1.65 +-1.7 (95%) LOS: 97.1%
Total: 50000 W: 7971 L: 7733 D: 34296
50000 @ 60+0.05 th 1 LTC: ppsqt: H-file penalty and center bonus
14-02-02 Fis pvinstability_tmratio diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 57549 W: 10582 L: 10330 D: 36637
sprt @ 15+0.05 th 1 Combo of pv_instability and tune_tmratio both of which failed but got positive scores.
14-02-02 inf razor_margin diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 15440 W: 2777 L: 2842 D: 9821
sprt @ 15+0.05 th 1 Take 1
14-02-02 uri zugzwang_detect diff
ELO: -2.66 +-2.1 (95%) LOS: 0.6%
Total: 40000 W: 7255 L: 7561 D: 25184
40000 @ 15+0.05 th 1 I try to be less aggressive in null move pruning when the number of pieces of the side to move is small. I also want to measure being more aggressive when the number of pieces is big.
14-02-02 uri zugzwang_detect diff
ELO: -3.55 +-2.1 (95%) LOS: 0.0%
Total: 40000 W: 7161 L: 7570 D: 25269
40000 @ 15+0.05 th 1 Testing the opposite direction: more aggressive null move pruning in the middle game. It is possible that both of them are positive, and if I see positive results in both tests I am going to use SPRT for making both changes.
14-02-02 inf reduction diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 15046 W: 2672 L: 2738 D: 9636
sprt @ 15+0.05 th 1 Take 1: More aggressive reduction
14-02-02 inf reduction diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4738 W: 798 L: 890 D: 3050
sprt @ 15+0.05 th 1 Take 2: Even more aggressive reduction. The bench is so drastic that either this fails badly, or it passes amazingly.
14-02-02 inf razor_margin diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 120433 W: 22163 L: 21957 D: 76313
sprt @ 15+0.05 th 1 Take 2: Tested very well locally at very short TC
14-02-02 inf pv_instability diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 5280 W: 919 L: 1010 D: 3351
sprt @ 15+0.05 th 1 Variation of pv_instability by mstembera (bench unchanged but functional change)
14-02-02 hxi rook_eval diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4235 W: 740 L: 834 D: 2661
sprt @ 15+0.05 th 1 small change in spec. evaluation for rooks
14-02-02 Fis pvinstability_tmratio diff
LLR: -2.97 (-2.94,2.94) [0.00,6.00]
Total: 11337 W: 1693 L: 1744 D: 7900
sprt @ 60+0.05 th 1 Combo of pv_instability and tune_tmratio both of which failed but got positive scores.
14-02-03 inf razor_margin diff
ELO: -2.71 +-2.1 (95%) LOS: 0.5%
Total: 40000 W: 7190 L: 7502 D: 25308
40000 @ 15+0.05 th 1 v + razor_margin(depth) / original values: Verify if regression.
14-02-03 inf pv_instability diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 2881 W: 491 L: 589 D: 1801
sprt @ 15+0.05 th 1 Take 2: /= 2
14-02-03 dor rook_eval diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10814 W: 2459 L: 2537 D: 5818
sprt @ 15+0.05 th 1 quickly failed patch (4235 games) - retest with 2moves_v1 book
14-02-03 dor blocked_storm_pawn_radi diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 24012 W: 5643 L: 5458 D: 12911
sprt @ 15+0.05 th 1 quickly passed patch (6239 games) - retest with 2moves_v1 book (my last test of this book)
14-02-03 uri lessnull1 diff
ELO: -0.77 +-2.1 (95%) LOS: 24.0%
Total: 36965 W: 6671 L: 6753 D: 23541
40000 @ 40/15 th 1 Another try to avoid null move pruning in the first plies (mainly in the endgame). The bench is slightly smaller and it can solve 8/8/8/2p5/1pp5/brpp4/1pprp2P/qnkbK3 w - - 0 1. If I see that 0 is inside the error bounds then I am going to try another 40,000-game match in order to tune parameters.
14-02-03 jos razoring diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 5952 W: 1046 L: 1135 D: 3771
sprt @ 15+0.05 th 1 Razoring, Take 3.
14-02-03 jos razoring^^ diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 7715 W: 1369 L: 1454 D: 4892
sprt @ 15+0.05 th 1 Razoring, Take 1.
14-02-03 dor king_exposed diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 7525 W: 1821 L: 1686 D: 4018
sprt @ 15+0.05 th 1 retire KingExposed[] and merge its values into KPSQT - low priority retest with 2moves_v1 book, previously passed after 5348 games
14-02-03 dor onepawn diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 15115 W: 3527 L: 3370 D: 8218
sprt @ 15+0.05 th 1 Scale down evaluation when only one pawn left - low priority retest with 2moves_v1 book, previously passed after 11921 games
14-02-03 dor pawn_dist diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 7330 W: 1688 L: 1777 D: 3865
sprt @ 15+0.05 th 1 Try to improve on 'pawns on both wing' patch. Increase bonus to 20. - low priority retest with 2moves_v1 book, previously passed after 18331 games
14-02-03 dor king_safety_trigger4 diff
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 19406 W: 4611 L: 4439 D: 10356
sprt @ 15+0.05 th 1 take 1 - low priority retest with 2moves_v1 book, previously passed after 17938 games
14-02-03 dor king_pawn_attacks diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 5292 W: 1196 L: 1291 D: 2805
sprt @ 15+0.05 th 1 Further push along Chris pawn on king attack idea (take 2) - low priority retest with 2moves_v1 book, previously passed after 30171 games
14-02-03 dor more_ks^ diff
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 57181 W: 13344 L: 13057 D: 30780
sprt @ 15+0.05 th 1 Always compute KS - low priority retest with 2moves_v1 book, previously failed after 20359 games
14-02-03 dor backward3 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 17013 W: 3868 L: 3927 D: 9218
sprt @ 15+0.05 th 1 Retry this old idea. Slightly changed - low priority retest with 2moves_v1 book, previously failed after 22358 games.
14-02-03 dor only_move diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 20220 W: 3740 L: 3792 D: 12688
sprt @ 15+0.05 th 1 easy move with 1 pawn adv. - low priority retest with 2moves_v1 book, previously failed after 26259 games
14-02-03 dor stsu diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 1915 W: 434 L: 542 D: 939
sprt @ 15+0.05 th 1 check rammed (last check, since 2 tests failed miserably and one neutral, this must be a gem) - low priority retest with 2moves_v1 book, previously failed after 1329 games
14-02-03 dor storm_blocked diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8504 W: 1911 L: 1996 D: 4597
sprt @ 15+0.05 th 1 Don't count blocked files in kingRing - low priority retest with 2moves_v1 book, previously failed after 9826 games
14-02-03 inf razor_margin diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 13821 W: 2496 L: 2565 D: 8760
sprt @ 15+0.05 th 1 Take 3: Try to tune around (v + razor_margin); 25K iterations
14-02-03 Tha gives_check diff
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 12441 W: 2333 L: 2196 D: 7912
sprt @ 15+0.05 th 1 Speed improvement, optimized common case of pos.gives_check inline to avoid fairly expensive function calls.
14-02-04 inf razor_margin diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 55607 W: 10330 L: 10290 D: 34987
sprt @ 15+0.05 th 1 Take 4: return v; 25K iterations
14-02-04 inf pv_instability diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 3806 W: 627 L: 721 D: 2458
sprt @ 15+0.05 th 1 Take 3: Decay PV faster when depth is greater
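Every entry in this listing carries a "Total: N W: w L: l D: d" line, so a small parser is enough to pull the counts out for further analysis. This is a sketch only; the regular expression and the name `parse_total` are assumptions based on the line layout seen here, not part of any official tool.

```python
import re

# Matches lines such as "Total: 3806 W: 627 L: 721 D: 2458".
TOTAL_RE = re.compile(r"Total:\s*(\d+)\s+W:\s*(\d+)\s+L:\s*(\d+)\s+D:\s*(\d+)")

def parse_total(line):
    """Extract game counts from a 'Total:' line and sanity-check them."""
    match = TOTAL_RE.search(line)
    if match is None:
        raise ValueError(f"not a Total line: {line!r}")
    total, wins, losses, draws = (int(g) for g in match.groups())
    if wins + losses + draws != total:
        raise ValueError(f"counts do not add up in: {line!r}")
    return {"total": total, "wins": wins, "losses": losses, "draws": draws}

# Counts taken from the last entry above.
record = parse_total("Total: 3806 W: 627 L: 721 D: 2458")
print(record["wins"] - record["losses"])
```

The sum check catches truncated or mis-copied lines before the counts are fed into any Elo or LLR calculation.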