Stockfish Testing Queue

Finished - 40836 tests

14-01-24 joa pp_material_scaling diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 3736 W: 909 L: 1012 D: 1815
sprt @ 5+0.05 th 1 pp: scaling down eval for major pieces present
14-01-24 uri fix_bug_infinite diff
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 48524 W: 7419 L: 7412 D: 33693
sprt @ 60+0.05 th 1 Long time control. I do not expect a big Elo improvement, so I verify no regression with these bounds, considering that the queue is almost empty.
14-01-24 sg nullmove diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 17432 W: 3278 L: 3128 D: 11026
sprt @ 15+0.05 th 1 increase nullmove reduction
14-01-24 sg nullmove diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 49042 W: 9005 L: 8983 D: 31054
sprt @ 15+0.05 th 1 decrease nullmove reduction
14-01-24 sg nullmove diff
LLR: 3.92 (-2.94,2.94) [-1.50,4.50]
Total: 62697 W: 11630 L: 11329 D: 39738
sprt @ 15+0.05 th 1 increase reduction on greater difference eval-beta
14-01-24 sg nullmove diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 5699 W: 1095 L: 976 D: 3628
sprt @ 15+0.05 th 1 slower increase reduction on greater difference eval-beta
14-01-25 sg nullmove diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10640 W: 1938 L: 2015 D: 6687
sprt @ 15+0.05 th 1 more increase nullmove reduction. Before I do the LTC of the passed version, I try to increase the reduction further.
14-01-25 sg nullmove diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 18237 W: 2802 L: 2821 D: 12614
sprt @ 60+0.05 th 1 LTC: increase nullmove reduction
14-01-25 bin nullmove diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 4413 W: 778 L: 872 D: 2763
sprt @ 15+0.05 th 1 Don't reduce null move when 1 legal move. Easy fix to test position in Uri's patch.
14-01-25 uri fix_bug_infinite diff
ELO: -1.49 +-2.9 (95%) LOS: 15.8%
Total: 20000 W: 3643 L: 3729 D: 12628
20000 @ 15+0.05 th 1 In addition to the previous change, I do not use null move pruning when the remaining depth is high and the number of legal moves is small. I prefer to measure the difference first before using SPRT.
14-01-25 uri fix_bug_infinite diff
ELO: -0.56 +-2.9 (95%) LOS: 35.5%
Total: 20000 W: 3668 L: 3700 D: 12632
20000 @ 15+0.05 th 1 this time I only decide not to use null move pruning when the number of legal moves is 1, and bench is the same (if 1 is better than 10 then I learn that the idea of not using null move pruning when the number of legal moves is small is bad, because the price of counting the legal moves is paid in both cases).
14-01-25 sg nullmove diff
LLR: -3.00 (-2.94,2.94) [0.00,6.00]
Total: 18198 W: 2777 L: 2798 D: 12623
sprt @ 60+0.05 th 1 LTC: slower increase reduction on greater difference eval-beta
14-01-25 sg nullmove diff
LLR: 1.82 (-2.94,2.94) [0.00,6.00]
Total: 29897 W: 4637 L: 4442 D: 20818
sprt @ 60+0.05 th 1 LTC: increase reduction on greater difference eval-beta
14-01-25 hxi piece_values diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 31902 W: 5947 L: 5760 D: 20195
sprt @ 15+0.05 th 1 mg-eg value diff moved to psqt (all eg)
14-01-25 hxi piece_values diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 11909 W: 2201 L: 2275 D: 7433
sprt @ 15+0.05 th 1 mg-eg value diff moved to psqt (all mg)
14-01-25 hxi piece_values diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 48390 W: 8837 L: 8817 D: 30736
sprt @ 15+0.05 th 1 mg-eg value diff moved to psqt (avg)
14-01-25 lbr pawn diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 8162 W: 1495 L: 1579 D: 5088
sprt @ 15+0.05 th 1 do not cumulate pawn penalties
14-01-25 uri reduce_value_known_win diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 2090 W: 324 L: 423 D: 1343
sprt @ 15+0.05 th 1 Note that bench is not a monotonic function of VALUE_KNOWN_WIN and the bench is clearly random, but it is a clear functional change (I guess it is going to fail, but if it fails it will probably fail fast, and I do not plan to try more changes of this parameter). Changing it earlier from 15000 to 10000 may have caused some improvement, so I want to check whether a bigger change can cause a significant improvement.
14-01-25 alw KingAttackW diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4969 W: 889 L: 981 D: 3099
sprt @ 15+0.05 th 1 lowering king attack weights - values obtained with help of JK's SPSA script on very low search depth. Check whether it scales. Try 1.
14-01-25 sg nullmove diff
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 26141 W: 4871 L: 4699 D: 16571
sprt @ 15+0.05 th 1 Fixed version: increase reduction on greater difference eval-beta
14-01-25 jki bpsqt diff
ELO: 2.63 +-2.0 (95%) LOS: 99.6%
Total: 50000 W: 10733 L: 10354 D: 28913
50000 @ 5+0.05 th 1 bpsqt, 23k iter
14-01-25 sg nullmove diff
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 33695 W: 5309 L: 5056 D: 23330
sprt @ 60+0.05 th 1 LTC: Fixed version: increase reduction on greater difference eval-beta
14-01-25 alw KingAttackW diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4656 W: 835 L: 928 D: 2893
sprt @ 15+0.05 th 1 lowering king attack weights - values obtained with help of JK's SPSA script. Try 2, rounding up more.
14-01-26 gli bpsqt diff
LLR: -3.00 (-2.94,2.94) [-1.50,4.50]
Total: 54006 W: 10132 L: 10097 D: 33777
sprt @ 15+0.05 th 1 SPRT for Joona: bpsqt, 23k iter
14-01-26 alw KingAttackW diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6937 W: 1262 L: 1349 D: 4326
sprt @ 15+0.05 th 1 more conservatively lowering king attack weights - no SPSA values but direct downscaling, w/ half-weighted rounding and truncating
14-01-26 oki piece_values diff
LLR: -2.96 (-2.94,2.94) [0.00,6.00]
Total: 31320 W: 4716 L: 4679 D: 21925
sprt @ 60+0.05 th 1 LTC for hx: mg-eg value diff moved to psqt (all eg)
14-01-26 hxi piece_values diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 53321 W: 9779 L: 9746 D: 33796
sprt @ 15+0.05 th 1 mg-eg value diff moved to psqt (keeping eg psqt near zero - all zero or about 12-16 squares positive)
14-01-26 mco null_verification^^ diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16399 W: 2886 L: 2949 D: 10564
sprt @ 15+0.05 th 1 Tweak null verification threshold: take 1
14-01-26 mco null_verification^ diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6422 W: 1146 L: 1234 D: 4042
sprt @ 15+0.05 th 1 Tweak null verification threshold: take 2
14-01-26 mco null_verification diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 13199 W: 2374 L: 2445 D: 8380
sprt @ 15+0.05 th 1 Tweak null verification threshold: take 3
14-01-26 mco bpsqt diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 47663 W: 10777 L: 10524 D: 26362
sprt @ 15+0.05 th 1 SPRT for Joona: bpsqt, 23k iter (retest with 2moves book)
14-01-26 dor less_pv diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 75105 W: 16954 L: 16838 D: 41313
sprt @ 15+0.05 th 1 Remove most PV distinctions - low priority retest with 2moves_v1 book. Previously passed after 78877 games.
14-01-26 dor pp_blocked diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 47291 W: 10670 L: 10638 D: 25983
sprt @ 15+0.05 th 1 Blockade of rook passers is difficult to eliminate - low priority retest with 2moves_v1 book. Previously passed after 74017 games.
14-01-26 dor update_stats diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 87680 W: 19711 L: 19558 D: 48411
sprt @ 15+0.05 th 1 update stats for pv moves too. Use weighted counter move stats. - low priority retest with 2moves_v1 book. Previously passed after 62429 games.
14-01-26 dor fast_threat diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 50503 W: 11595 L: 11552 D: 27356
sprt @ 15+0.05 th 1 Higher value of threatened by pawn to compensate - low priority retest with 2moves_v1 book. Previously passed after 64445 games.
14-01-26 dor update_on_tt_hit diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 48832 W: 11215 L: 10956 D: 26661
sprt @ 15+0.05 th 1 Update History and Counter move on TT hit - low priority retest with 2moves_v1 book. Previously passed after 52690 games.
14-01-26 dor kingAttWeight^ diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 13921 W: 3094 L: 3163 D: 7664
sprt @ 15+0.05 th 1 RookWeight, Take 3. - low priority retest with 2moves_v1 book. Previously failed after 60119 games.
14-01-26 dor pp_blocked2 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 41126 W: 9259 L: 9246 D: 22621
sprt @ 15+0.05 th 1 Blockade of rook passers is difficult to eliminate. Take 2 - low priority retest with 2moves_v1 book. Previously failed after 72434 games.
14-01-26 dor mate2 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 30215 W: 6797 L: 6816 D: 16602
sprt @ 15+0.05 th 1 Mate detection: take 2 - low priority retest with 2moves_v1 book. Previously failed after 57881 games.
14-01-26 dor ks6 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 7357 W: 1646 L: 1735 D: 3976
sprt @ 15+0.05 th 1 king safety, variant 6 - low priority retest with 2moves_v1 book. Previously failed after 67536 games.
14-01-26 dor easymove diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 27069 W: 6209 L: 6237 D: 14623
sprt @ 15+0.05 th 1 second try to change easy move and this time only check for easy move at depth=12 and also getting rid of another line of the code. - low priority retest with 2moves_v1 book. Previously failed after 68216 games.
14-01-26 jki lognull diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 5865 W: 1030 L: 1120 D: 3715
sprt @ 15+0.05 th 1 Logistic try for null move value based reduction.
14-01-26 uri less_known_win diff
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 45061 W: 8224 L: 8222 D: 28615
sprt @ 15+0.05 th 1 I submit the test of only changing one value as Marco suggested.
14-01-26 cee rep_hash^ diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 13414 W: 2416 L: 2486 D: 8512
sprt @ 15+0.05 th 1 Aid repetition detection by hashing
14-01-26 cee rep_hash diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 14744 W: 2622 L: 2689 D: 9433
sprt @ 15+0.05 th 1 Use 32 bit hash
14-01-26 jki bpsqt diff
ELO: 1.22 +-2.0 (95%) LOS: 88.8%
Total: 50000 W: 10575 L: 10399 D: 29026
50000 @ 5+0.05 th 1 bpsqt, 49k iterations
14-01-26 Fis pv_instability diff
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 45270 W: 8385 L: 8163 D: 28722
sprt @ 15+0.05 th 1 Decay PV instability faster since the most recent should be by far the most important.
14-01-27 uri less_known_win diff
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 22655 W: 3478 L: 3394 D: 15783
sprt @ 60+0.05 th 1 I submit the test of only changing one value as Marco suggested.
14-01-27 Fis pv_instability diff
LLR: -2.96 (-2.94,2.94) [0.00,6.00]
Total: 49773 W: 7643 L: 7522 D: 34608
sprt @ 60+0.05 th 1 Decay PV instability faster since the most recent should be by far the most important.
14-01-27 jki bpsqt diff
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 14231 W: 2684 L: 2542 D: 9005
sprt @ 15+0.05 th 1 STC: bpsqt, 49k iterations
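
Reading the numbers: the ELO, LOS and LLR figures in the entries above can be approximately recomputed from the W/L/D counts. The sketch below is not taken from the testing framework's code. It assumes a logistic Elo estimate with a normal-approximation 95% interval, a normal-approximation LOS, and an LLR computed under the BayesElo win/draw/loss model with the draw parameter estimated from the sample. It also assumes the bounds (-2.94,2.94) come from alpha = beta = 0.05. All function names are hypothetical.

```python
import math


def elo_margin_los(wins, losses, draws):
    """Return (elo, 95% margin, LOS) from game counts (normal approximation)."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n

    def to_elo(s):
        # Logistic Elo difference corresponding to an expected score s.
        return -400.0 * math.log10(1.0 / s - 1.0)

    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    half = 1.96 * math.sqrt(var / n)
    margin = (to_elo(score + half) - to_elo(score - half)) / 2.0
    # Likelihood of superiority: Phi((W - L) / sqrt(W + L)).
    los = 0.5 * (1.0 + math.erf((wins - losses)
                                / math.sqrt(2.0 * (wins + losses))))
    return to_elo(score), margin, los


def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower and upper LLR stopping bounds for the given error rates."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)


def bayeselo_llr(wins, losses, draws, elo0, elo1):
    """LLR of H1 (elo1) against H0 (elo0), assuming the BayesElo model."""
    n = wins + losses + draws
    pw, pl = wins / n, losses / n
    # Draw parameter estimated from the observed win and loss rates.
    drawelo = 200.0 * math.log10((1.0 - pl) / pl * (1.0 - pw) / pw)

    def probs(elo):
        p_win = 1.0 / (1.0 + 10.0 ** ((-elo + drawelo) / 400.0))
        p_loss = 1.0 / (1.0 + 10.0 ** ((elo + drawelo) / 400.0))
        return p_win, p_loss, 1.0 - p_win - p_loss

    w0, l0, d0 = probs(elo0)
    w1, l1, d1 = probs(elo1)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))


if __name__ == "__main__":
    # Fixed-games entry: Total: 20000 W: 3643 L: 3729 D: 12628
    elo, margin, los = elo_margin_los(3643, 3729, 12628)
    print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {los * 100:.1f}%")
    # SPRT entry: Total: 3736 W: 909 L: 1012 D: 1815, bounds [-1.50,4.50]
    lo, hi = sprt_bounds()
    llr = bayeselo_llr(909, 1012, 1815, -1.5, 4.5)
    print(f"LLR: {llr:.2f} ({lo:.2f},{hi:.2f}) [-1.50,4.50]")
```

Under these assumptions a test stops as failed once the LLR drops below the lower bound and as passed once it rises above the upper bound, which is why finished SPRT entries mostly sit just outside (-2.94,2.94).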