Stockfish Testing Queue

Finished - 22425 tests

03-02-14 jo razoring diff
LLR: -2.94 (-2.94,2.94) [-1.50,4.50]
Total: 5952 W: 1046 L: 1135 D: 3771
sprt @ 15+0.05 th 1 Razoring, Take 3.
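For readers who want to see where an "LLR: x (-2.94,2.94) [elo0,elo1]" line comes from, here is a minimal sketch of a sequential probability ratio test on win/draw/loss counts. It assumes the BayesElo model with the draw parameter estimated from the sample, and alpha = beta = 0.05 (which gives the ±2.94 bounds). The function names are hypothetical, and the exact formulation used by the testing framework is an assumption here, not something the listing states.

```python
import math

def bayeselo_to_probs(elo, draw_elo):
    """Win, draw and loss probabilities under the BayesElo model (assumed)."""
    p_win = 1.0 / (1.0 + 10.0 ** ((-elo + draw_elo) / 400.0))
    p_loss = 1.0 / (1.0 + 10.0 ** ((elo + draw_elo) / 400.0))
    return p_win, 1.0 - p_win - p_loss, p_loss

def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    n = wins + losses + draws
    p_win, p_loss = wins / n, losses / n
    # Estimate the draw parameter from the observed sample.
    draw_elo = 200.0 * math.log10((1 - p_loss) / p_loss * (1 - p_win) / p_win)
    w0, d0, l0 = bayeselo_to_probs(elo0, draw_elo)
    w1, d1, l1 = bayeselo_to_probs(elo1, draw_elo)
    return (wins * math.log(w1 / w0)
            + draws * math.log(d1 / d0)
            + losses * math.log(l1 / l0))

def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower (reject) and upper (accept) stopping bounds for the LLR."""
    return math.log(beta / (1 - alpha)), math.log((1 - beta) / alpha)

# Counts from the razoring entry above, tested with bounds [-1.50, 4.50].
llr = sprt_llr(1046, 1135, 3771, -1.5, 4.5)
lower, upper = sprt_bounds()
print(f"LLR: {llr:.2f} ({lower:.2f},{upper:.2f})")
```

Under these assumptions the result should land close to the LLR reported for that entry. A test stops as failed once the LLR drops to the lower bound and as passed once it reaches the upper bound.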
03-02-14 do blocked_storm_pawn_radi diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 24012 W: 5643 L: 5458 D: 12911
sprt @ 15+0.05 th 1 quickly passed patch (6239 games) - retest with 2moves_v1 book (my last test of this book)
02-02-14 ur zugzwang_detect diff
ELO: -3.55 +-2.1 (95%) LOS: 0.0%
Total: 40000 W: 7161 L: 7570 D: 25269
40000 @ 15+0.05 th 1 Testing the opposite direction of more aggressive null move pruning in the middle game. It is possible that both of them are positive, and if I see positive results in both tests I am going to use SPRT for making both changes
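The "ELO: x +-y (95%)" lines of fixed-length tests can be approximated from the totals alone. The sketch below assumes the logistic Elo formula applied to the mean score, with the error bar taken as half the Elo width of a normal 95% confidence interval on that score. The function names are hypothetical, and this may differ in detail from what the testing framework actually computes.

```python
import math
from statistics import NormalDist

def score_to_elo(score):
    """Logistic Elo difference for an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_error(wins, losses, draws, confidence=0.95):
    """Return (elo, half-width of the confidence interval in Elo)."""
    n = wins + losses + draws
    w, l, d = wins / n, losses / n, draws / n
    mu = w + d / 2.0                      # mean score per game
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = w * (1 - mu) ** 2 + d * (0.5 - mu) ** 2 + l * (0 - mu) ** 2
    stdev = math.sqrt(var / n)            # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    lo, hi = mu - z * stdev, mu + z * stdev
    return score_to_elo(mu), (score_to_elo(hi) - score_to_elo(lo)) / 2.0

# Totals from the zugzwang_detect entry above.
elo, err = elo_with_error(7161, 7570, 25269)
print(f"ELO: {elo:.2f} +-{err:.1f} (95%)")
```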
03-02-14 do rook_eval diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10814 W: 2459 L: 2537 D: 5818
sprt @ 15+0.05 th 1 quickly failed patch (4235 games) - retest with 2moves_v1 book
03-02-14 in razor_margin diff
ELO: -2.71 +-2.1 (95%) LOS: 0.5%
Total: 40000 W: 7190 L: 7502 D: 25308
40000 @ 15+0.05 th 1 v + razor_margin(depth) / original values: Verify if regression.
26-01-14 do easymove diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 27069 W: 6209 L: 6237 D: 14623
sprt @ 15+0.05 th 1 Second try to change easy move: this time only check for easy move at depth=12, and also get rid of another line of code. - low priority retest with 2moves_v1 book. Previously failed after 68216 games.
26-01-14 do ks6 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 7357 W: 1646 L: 1735 D: 3976
sprt @ 15+0.05 th 1 king safety, variant 6 - low priority retest with 2moves_v1 book. Previously failed after 67536 games.
02-02-14 hx rook_eval diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4235 W: 740 L: 834 D: 2661
sprt @ 15+0.05 th 1 small change in spec. evaluation for rooks
03-02-14 in pv_instability diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 2881 W: 491 L: 589 D: 1801
sprt @ 15+0.05 th 1 Take 2: /= 2
01-02-14 ur zugzwang_detect diff
ELO: -1.45 +-2.1 (95%) LOS: 8.3%
Total: 40000 W: 7208 L: 7375 D: 25417
40000 @ 40/15 th 1 I do not know why my previous version failed, so I want to test something simpler: only limiting reductions to 10 plies
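The "LOS" (likelihood of superiority) figure can be sketched with a widely used normal approximation that depends only on wins and losses, since draws do not favour either side. The function name is hypothetical, and the formula is an assumption about how the figure is derived rather than something stated in the listing.

```python
import math

def likelihood_of_superiority(wins, losses):
    """Approximate probability that the tested patch is stronger than master."""
    z = (wins - losses) / math.sqrt(wins + losses)
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Wins and losses from the zugzwang_detect entry above.
los = likelihood_of_superiority(7208, 7375)
print(f"LOS: {100 * los:.1f}%")
```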
02-02-14 Fi pvinstability_tmratio diff
LLR: -2.97 (-2.94,2.94) [0.00,6.00]
Total: 11337 W: 1693 L: 1744 D: 7900
sprt @ 60+0.05 th 1 Combo of pv_instability and tune_tmratio both of which failed but got positive scores.
02-02-14 in razor_margin diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 120433 W: 22163 L: 21957 D: 76313
sprt @ 15+0.05 th 1 Take 2: Tested very well locally at very short TC
02-02-14 in pv_instability diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 5280 W: 919 L: 1010 D: 3351
sprt @ 15+0.05 th 1 Variation of pv_instability by mstembera (bench unchanged but functional change)
02-02-14 in reduction diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4738 W: 798 L: 890 D: 3050
sprt @ 15+0.05 th 1 Take 2: Even more aggressive reduction. The bench is so drastic that either this fails badly, or it passes amazingly.
02-02-14 in reduction diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 15046 W: 2672 L: 2738 D: 9636
sprt @ 15+0.05 th 1 Take 1: More aggressive reduction
02-02-14 in razor_margin diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 15440 W: 2777 L: 2842 D: 9821
sprt @ 15+0.05 th 1 Take 1
02-02-14 Fi pvinstability_tmratio diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 57549 W: 10582 L: 10330 D: 36637
sprt @ 15+0.05 th 1 Combo of pv_instability and tune_tmratio both of which failed but got positive scores.
01-02-14 ur lessnull diff
ELO: -3.23 +-2.9 (95%) LOS: 1.4%
Total: 20000 W: 3525 L: 3711 D: 12764
20000 @ 15+0.05 th 1 Another try to change null move pruning. This is a simplification: I remove null move verification and instead skip null move pruning when the remaining depth is higher than the number of legal moves and at least 12 (so in practice zugzwang problems should be solved after enough time).
26-01-14 do mate2 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 30215 W: 6797 L: 6816 D: 16602
sprt @ 15+0.05 th 1 Mate detection: take 2 - low priority retest with 2moves_v1 book. Previously failed after 57881 games.
01-02-14 ur lessnull diff
ELO: -2.21 +-2.7 (95%) LOS: 5.4%
Total: 20000 W: 3058 L: 3185 D: 13757
20000 @ 60+0.05 th 1 I am surprised not to see a negative result after some thousands of games, so I am testing at a longer time control to see if we get a bigger difference. Maybe it is better simply to avoid null move pruning when the remaining depth is high; in that case we can also drop the verification search code and detect zugzwangs without it by not using null move pruning.
26-01-14 do pp_blocked2 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 41126 W: 9259 L: 9246 D: 22621
sprt @ 15+0.05 th 1 Blockade of rook passers is difficult to eliminate. Take 2 - low priority retest with 2moves_v1 book. Previously failed after 72434 games.
01-02-14 sg lmr_exclude diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 32903 W: 6017 L: 6037 D: 20849
sprt @ 15+0.05 th 1 LMR: less reduction for discovered checks
01-02-14 sg lmr_exclude diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 5965 W: 1066 L: 1156 D: 3743
sprt @ 15+0.05 th 1 LMR: less reduction for double checks
01-02-14 sg lmr_exclude diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 4556 W: 789 L: 882 D: 2885
sprt @ 15+0.05 th 1 LMR: exclude discovered checks
26-01-14 do kingAttWeight^ diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 13921 W: 3094 L: 3163 D: 7664
sprt @ 15+0.05 th 1 RookWeight, Take 3. - low priority retest with 2moves_v1 book. Previously failed after 60119 games.
01-02-14 ur lessnull diff
ELO: 0.33 +-2.9 (95%) LOS: 58.8%
Total: 20000 W: 3699 L: 3680 D: 12621
20000 @ 15+0.05 th 1 This is only for measurement. I expect to get a significant negative result after 20000 games; if I do not get it, then I probably need a longer time control to see such results, and therefore also a longer time control to test my ideas
26-01-14 do update_on_tt_hit diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 48832 W: 11215 L: 10956 D: 26661
sprt @ 15+0.05 th 1 Update History and Counter move on TT hit - low priority retest with 2moves_v1 book. Previously passed after 52690 games.
01-02-14 sg lmr_exclude diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4448 W: 772 L: 865 D: 2811
sprt @ 15+0.05 th 1 LMR: exclude double checks
26-01-14 do fast_threat diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 50503 W: 11595 L: 11552 D: 27356
sprt @ 15+0.05 th 1 Higher value of threatened by pawn to compensate - low priority retest with 2moves_v1 book. Previously passed after 64445 games.
28-01-14 jo pp_blockSq diff
ELO: 2.00 +-2.2 (95%) LOS: 96.2%
Total: 40000 W: 8507 L: 8277 D: 23216
40000 @ 5+0.05 th 1 Penalty for them_king attacking the blockSq (4*rr) - retest with 8moves_v3.pgn
28-01-14 jo pp_blockSq diff
ELO: 0.03 +-2.2 (95%) LOS: 50.9%
Total: 40000 W: 8449 L: 8446 D: 23105
40000 @ 5+0.05 th 1 Penalty for them_king attacking the blockSq (6*rr) - retest with 8moves_v3.pgn
31-01-14 jk ppsqt3 diff
ELO: 0.26 +-1.9 (95%) LOS: 60.9%
Total: 50000 W: 9380 L: 9342 D: 31278
50000 @ 15+0.05 th 1 ppsqt: H-file penalty and center bonus
29-01-14 jk ppsqt1 diff
ELO: -1.34 +-1.8 (95%) LOS: 6.9%
Total: 50000 W: 8368 L: 8561 D: 33071
50000 @ 60+0.05 th 1 Pawn psqt: Only h-pawn penalty
26-01-14 do update_stats diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 87680 W: 19711 L: 19558 D: 48411
sprt @ 15+0.05 th 1 update stats for pv moves too. Use weighted counter move stats. - low priority retest with 2moves_v1 book. Previously passed after 62429 games.
31-01-14 ur lessnull diff
ELO: 0.25 +-2.3 (95%) LOS: 58.5%
Total: 32004 W: 5791 L: 5768 D: 20445
40000 @ 40/15 th 1 This seems to be the best version so far based on 15+0.05, so I want to test whether that is only luck or whether it also performs better at this time control, where you get higher depth in the endgame.
31-01-14 ur lessnull diff
ELO: 1.54 +-2.1 (95%) LOS: 92.8%
Total: 40000 W: 7429 L: 7252 D: 25319
40000 @ 15+0.05 th 1 For comparison, I want to start skipping null move pruning only at depths greater than 14, to see if I get different results
31-01-14 sg check_killers diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 21726 W: 4000 L: 4049 D: 13677
sprt @ 15+0.05 th 1 add check killers
31-01-14 ur null_changing diff
ELO: 0.67 +-2.0 (95%) LOS: 74.0%
Total: 40000 W: 7200 L: 7123 D: 25677
40000 @ 40/15 th 1 I believe that my change should help mainly in endgames, and at 15+0.05 we have no time left for the endgame, so I want to test at a different type of time control where the program can reach bigger depths in endgames, to see if the behaviour is different.
31-01-14 ur lessnull diff
ELO: 0.94 +-2.1 (95%) LOS: 81.5%
Total: 40000 W: 7327 L: 7219 D: 25454
40000 @ 40/15 th 1 Test a simpler version of not using null move pruning. I use this time control because I am afraid that at a fast time control the search does not reach enough depth for the change to make a significant difference in the endgame
31-01-14 jo iid_depth diff
LLR: -2.96 (-2.94,2.94) [-4.00,0.00]
Total: 18520 W: 3274 L: 3495 D: 11751
sprt @ 15+0.05 th 1 Small simplification to IID. Same reduced depth for both, PV and NonPV nodes.
31-01-14 jk ppsqt3 diff
ELO: 1.64 +-2.0 (95%) LOS: 94.8%
Total: 50000 W: 10664 L: 10428 D: 28908
50000 @ 5+0.05 th 1 ppsqt: H-file penalty and center bonus
31-01-14 ur time_manag_change diff
LLR: -2.98 (-2.94,2.94) [-1.50,4.50]
Total: 9597 W: 1748 L: 1829 D: 6020
sprt @ 15+0.05 th 1 I am testing some changes. The change in timeman.cpp failed with a positive score in the past, and the changes in search are the same change that passed stage 1 and failed stage 2 without increasing slowmover from 70 to 80.
28-01-14 ur null_changing diff
ELO: -0.85 +-2.1 (95%) LOS: 21.1%
Total: 40000 W: 7371 L: 7469 D: 25160
40000 @ 15+0.05 th 1 The target is to solve the following position fast. First I want to measure the change that does not change bench significantly, and next I may try to tune it. The idea is that the number of legal moves is usually high when eval is significantly above beta; if it is small, I suspect that something is wrong and avoid null move pruning. 8/8/8/2p5/1pp5/brpp4/1pprp2P/qnkbK3 w - - 0 1
31-01-14 Fi king_pawns_psqt diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 31058 W: 5767 L: 5791 D: 19500
sprt @ 15+0.05 th 1 Make king side pawns worth a bit more than queen side.
29-01-14 gl master diff
ELO: 36.66 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 9110 L: 4905 D: 25985
40000 @ 60+0.05 th 1 Regression test after bishop psqt tweak
31-01-14 Fi king_pawns_psqt diff
LLR: -0.00 (-2.94,2.94) [-1.50,4.50]
Total: 65 W: 9 L: 9 D: 47
sprt @ 15+0.05 th 1 Make king side pawns worth a bit more than queen side.
31-01-14 pe tm diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 2586 W: 439 L: 538 D: 1609
sprt @ 15+0.05 th 1 tune time management
31-01-14 hw more_see_2 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 17524 W: 3204 L: 3264 D: 11056
sprt @ 15+0.05 th 1 Second attempt at SEE optimization.
28-01-14 Fi pv_instability diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 58345 W: 9084 L: 8922 D: 40339
sprt @ 60+0.05 th 1 Decay PV instability faster since the most recent should be by far the most important. Take 2.
31-01-14 in futility_pruning diff
LLR: -2.96 (-2.94,2.94) [0.00,6.00]
Total: 9995 W: 1465 L: 1522 D: 7008
sprt @ 60+0.05 th 1 Take 4 (LTC)