Stockfish Testing Queue

Finished - 1741 tests

15-08-31 SC capture_tuned diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 26284 W: 4862 L: 4874 D: 16548
sprt @ 15+0.05 th 1 Almost final values from SPSA tuning for piece type dependent rank penalty in capture ordering.
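The LLR line of each sprt entry can be approximated from its W/L/D counts and the bracketed Elo bounds. The following is a minimal sketch, not the framework's own code: it assumes the bounds are expressed in the BayesElo model with the draw parameter estimated from the sample, and that the stopping bounds (-2.94, 2.94) come from alpha = beta = 0.05. All function names are hypothetical.

```python
import math


def bayeselo_to_probs(elo, draw_elo):
    """Return (win, loss, draw) probabilities under the BayesElo model."""
    p_win = 1.0 / (1.0 + 10.0 ** ((-elo + draw_elo) / 400.0))
    p_loss = 1.0 / (1.0 + 10.0 ** ((elo + draw_elo) / 400.0))
    return p_win, p_loss, 1.0 - p_win - p_loss


def estimate_draw_elo(wins, losses, draws):
    """Estimate the BayesElo draw parameter from observed frequencies."""
    n = wins + losses + draws
    w, l = wins / n, losses / n
    return 200.0 * math.log10((1.0 - l) / l * (1.0 - w) / w)


def sprt_llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0).

    Assumes all three counts are non-zero.
    """
    draw_elo = estimate_draw_elo(wins, losses, draws)
    w0, l0, d0 = bayeselo_to_probs(elo0, draw_elo)
    w1, l1, d1 = bayeselo_to_probs(elo1, draw_elo)
    return (wins * math.log(w1 / w0)
            + losses * math.log(l1 / l0)
            + draws * math.log(d1 / d0))


def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower and upper stopping bounds for the LLR (assumed error rates)."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)


if __name__ == "__main__":
    # Counts and bounds taken from the capture_tuned entry above.
    print(round(sprt_llr(4862, 4874, 16548, 0.0, 5.0), 2))
    print(tuple(round(b, 2) for b in sprt_bounds()))
```

Under these assumptions the result for this entry should land close to the reported LLR; a test stops as failed once the LLR drops below the lower bound and as passed once it exceeds the upper bound.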
15-08-31 SC return_if_4 diff
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 1543 W: 198 L: 315 D: 1030
sprt @ 15+0.05 th 1 Why are we searching positions with 4 or fewer pieces on the board? If I understand it correctly, (almost all) those positions are correctly treated by evaluate(pos).
15-08-08 SC capture_tuning diff
74060/75000 iterations
149158/150000 games played
150000 @ 30+0.05 th 1 Reschedule the tuning session for piece type dependent rank penalty, starting from the most promising values so far. The earlier run was stopped by icewulf.
15-08-07 SC capture_tuned diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 7313 W: 1332 L: 1426 D: 4555
sprt @ 15+0.05 th 1 Another attempt at using piece value dependent rank penalty. This time values are chosen such that SEE is best approximated on my personal bench.
15-08-04 SC capture_tuned diff
LLR: -3.26 (-2.94,2.94) [0.00,5.00]
Total: 6012 W: 1084 L: 1197 D: 3731
sprt @ 15+0.05 th 1 Previous version was 0 for pawns and high for kings. Now 0 for pawns, high for queens.
15-08-04 SC capture_tuned diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 7541 W: 1362 L: 1455 D: 4724
sprt @ 15+0.05 th 1 No rank penalty for pawn captures, high for queens.
15-08-03 SC capture_promotion diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 15489 W: 2876 L: 2934 D: 9679
sprt @ 15+0.05 th 1 Score promotions with capture over all other captures.
15-08-02 SC capture_tuned diff
LLR: -4.18 (-2.94,2.94) [0.00,5.00]
Total: 27659 W: 5196 L: 5253 D: 17210
sprt @ 15+0.05 th 1 Extrapolate linearly from final values of SPSA tuning. Take 2 on tuning piece type dependent rank penalty.
15-08-02 SC capture_tuned diff
LLR: -3.37 (-2.94,2.94) [0.00,5.00]
Total: 15258 W: 2833 L: 2910 D: 9515
sprt @ 15+0.05 th 1 icewulf stopped the SPSA tuning due to wrong bench. Check whether current values are already good enough.
15-08-01 SC capture_tuning diff
10399/60000 iterations
20605/120000 games played
120000 @ 30+0.05 th 1 Tuning rank penalty in capture scoring depending on moved piece. Idea by VoyagerOne.
15-07-31 SC opposite_bishops_ks diff
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 9979 W: 1879 L: 1962 D: 6138
sprt @ 15+0.05 th 1 King safety and opposite bishops, values from local tuning. Take 3.
15-07-29 SC opposite_bishops_ks diff
LLR: -3.53 (-2.94,2.94) [0.00,5.00]
Total: 19876 W: 3663 L: 3727 D: 12486
sprt @ 15+0.05 th 1 Testing opposite bishops based king safety bonus, take 2.
15-07-29 SC ks_increased diff
LLR: -4.06 (-2.94,2.94) [0.00,4.00]
Total: 85445 W: 16165 L: 16081 D: 53199
sprt @ 15+0.05 th 1 SPSA tuning of opposite bishop bonus showed a slightly increased king safety in general. Check whether increasing it by 3% is an Elo gain. Rescheduling: test was stopped by a mysterious wrong bench.
15-07-28 SC ks_increased diff
LLR: -0.65 (-2.94,2.94) [0.00,4.00]
Total: 26083 W: 4925 L: 4868 D: 16290
sprt @ 15+0.05 th 1 SPSA tuning of opposite bishop bonus showed a slightly increased king safety in general. Check whether increasing it by 3% is an Elo gain.
15-07-28 SC opposite_bishops_ks diff
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 9128 W: 1645 L: 1731 D: 5752
sprt @ 15+0.05 th 1 Testing opposite bishops based king safety bonus, take 1.
15-07-25 SC opposite_bishops_ks_per diff
61029/60000 iterations
120000/120000 games played
120000 @ 30+0.05 th 1 Try to tune via SPSA the bonus for the opposite bishops contribution to king safety. Low throughput, so it will kick in only if framework goes idle again.
15-07-26 SC capture_tuned diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 18577 W: 3476 L: 3569 D: 11532
sprt @ 15+0.05 th 1 The tuning session converged to almost the same values as it started with. Check whether this minimal change is an improvement. Throughput 200.
15-07-26 SC opposite_bishops_separa diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 7764 W: 1407 L: 1492 D: 4865
sprt @ 15+0.05 th 1 In endings with opposite bishops, add king distance to scale factor. Inspired by http://www.chess.com/video/player/opposite-colored-bishops-separation-anxiety.
15-07-24 SC capture_tuning diff
61975/60000 iterations
120000/120000 games played
120000 @ 30+0.05 th 1 Try to get the right values via SPSA as long as the framework is idle.
15-07-24 SC opposite_bishops_ks diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10334 W: 1917 L: 1995 D: 6422
sprt @ 15+0.05 th 1 With opposite bishops, attacking side is favoured in middlegame. Emphasize king safety if this is the case.
15-07-23 SC capture_MVV_LVA_rank diff
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 7271 W: 1307 L: 1401 D: 4563
sprt @ 15+0.05 th 1 I just realized that back then I did not try MVV+LVA+rank heuristics together. If it's ok for the maintainers I would switch to test with [0,5] [0,5] as suggested by Lucas.
15-07-22 SC capture_pawn_rank diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 8433 W: 1500 L: 1583 D: 5350
sprt @ 15+0.05 th 1 Make rank malus in movepick dependent on pawn count. The fewer the enemy pawns, the smaller the malus for capturing in the enemy field.
15-07-21 SC improving diff
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 30227 W: 4704 L: 4764 D: 20759
sprt @ 60+0.05 th 1 A stricter, but not so strict definition of improving. LTC. Wrong bench from JoJom.
15-07-21 SC improving diff
Pending...
sprt @ 60+0.05 th 1 A stricter, but not so strict definition of improving. LTC.
15-07-20 SC improving diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 76170 W: 14510 L: 14399 D: 47261
sprt @ 15+0.05 th 1 A stricter, but not so strict definition of improving.
15-07-19 SC QKfork2 diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 11530 W: 2087 L: 2162 D: 7281
sprt @ 15+0.05 th 1 Local tuning suggested much higher bonus. Take 3.
15-07-18 SC QKfork2 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 22025 W: 4298 L: 4345 D: 13382
sprt @ 15+0.05 th 1 If QK are diagonally adjacent to each other, the risk of being forked or X-rayed is higher. Give a slightly higher penalty, but only if we have a knight or a bishop. Take 2.
15-07-18 SC QKfork diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 13406 W: 2466 L: 2536 D: 8404
sprt @ 15+0.05 th 1 If QK are diagonally adjacent to each other, the risk of being forked or X-rayed is higher. Give a small penalty. Rescheduled: wrong bench from icewulf.
15-07-18 SC QKfork diff
LLR: -1.11 (-2.94,2.94) [-1.50,4.50]
Total: 2511 W: 463 L: 496 D: 1552
sprt @ 15+0.05 th 1 If QK are diagonally adjacent to each other, the risk of being forked or X-rayed is higher. Give a small penalty. Rescheduled: wrong bench from icewulf. Please stop it.
15-07-17 SC QKfork diff
LLR: -0.14 (-2.94,2.94) [-1.50,4.50]
Total: 105 W: 21 L: 26 D: 58
sprt @ 15+0.05 th 1 If QK are diagonally adjacent to each other, the risk of being forked or X-rayed is higher. Give a small penalty.
15-07-12 SC piece_values_see diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 511 W: 90 L: 207 D: 214
sprt @ 15+0.05 th 1 Further attempt on SEE-dedicated piece values. Tuned locally.
15-07-10 SC MVV_dynamic_OU_tuned diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 11327 W: 2091 L: 2167 D: 7069
sprt @ 15+0.05 th 1 A further attempt (nr 5) on dynamic capture scoring, this time tuned locally. Also interesting to understand whether tuning with search on bench positions could work.
15-06-26 SC piece_values_see diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 6960 W: 1316 L: 1404 D: 4240
sprt @ 15+0.05 th 1 Use final tuned values from SPSA.
15-06-26 SC MVV_dynamic_OU_tuned diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 3056 W: 537 L: 635 D: 1884
sprt @ 15+0.05 th 1 Give a further try to the dynamic capture stats, this time removing the term regarding the moved piece.
15-06-24 SC see_values diff
53358/30000 iterations
75000/60000 games played
60000 @ 30+0.05 th 1 Try to use a different PieceValue for SEE-related activities. Tuning session.
15-06-25 SC piece_values_see diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 37818 W: 7110 L: 7116 D: 23592
sprt @ 15+0.05 th 1 The SPSA tuning seems to be converging to some strange values. Test whether choosing between the MG and EG value based on the indications of the SPSA tuning is good.
15-06-20 SC MVV_dynamic_OU_tuning diff
42003/30000 iterations
61000/60000 games played
60000 @ 60+0.05 th 1 MVV_dynamic_OU looked slightly better than MVV_dynamic but it looks like it is going to fail. Tune at 30s (framework is not really loaded now). If tuned version fails, I am moving to something else.
15-06-22 SC MVV_dynamic_OU_tuned diff
LLR: -2.96 (-2.94,2.94) [-1.50,4.50]
Total: 11520 W: 2133 L: 2208 D: 7179
sprt @ 15+0.05 th 1 Going to SPRT with almost final values.
15-06-20 SC MVV_dynamic_OU diff
LLR: -3.06 (-2.94,2.94) [-1.50,4.50]
Total: 18176 W: 3408 L: 3469 D: 11299
sprt @ 15+0.05 th 1 First shot was not that bad. Small variation: return to default values rather than to 0.
15-06-20 SC MVV_dynamic diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 9533 W: 1792 L: 1872 D: 5869
sprt @ 15+0.05 th 1 A first, very primitive attempt at collecting capture stats during play and use it for probcut stage in move picker.
15-06-19 SC adjudication_1 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 54821 W: 10740 L: 10698 D: 33383
sprt @ 10+0.05 th 1 Testing whether fishtest is sensitive to non-functional manipulations of score at short TC, see https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/ClIF6fyNeow. Take 2, UCI::value /= 10.
15-06-19 SC adjudication_1 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 422 W: 72 L: 191 D: 159
sprt @ 10+0.05 th 1 Testing whether fishtest is sensitive to non-functional manipulations of score at short TC, see https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/ClIF6fyNeow. Take 1, UCI::value *= 10.
15-06-14 SC scale_regression_tuned diff
ELO: -19.30 +-3.0 (95%) LOS: 0.0%
Total: 20000 W: 3428 L: 4538 D: 12034
20000 @ 10+0.05 th 1 Now that it is tuned, let us go and see where it came out. Previously it was -17.5 ELO.
15-06-10 SC phalanx_1 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6015 W: 1135 L: 1225 D: 3655
sprt @ 15+0.05 th 1 Further tweak of phalanx scoring with file, now using the same scoring logic as for phalanx rank. Now I've got some very interesting search lines from the initial position, if this means anything. Personal log can be found on Google Docs, see https://docs.google.com/document/d/1TdMfx2a_fp5CjkSYwDrkPY8LgnowhHJ9A0JcDD-_xuA/edit?usp=sharing
15-06-08 SC phalanx_1 diff
LLR: -2.97 (-2.94,2.94) [-1.50,4.50]
Total: 16223 W: 3019 L: 3082 D: 10122
sprt @ 15+0.05 th 1 Last attempt at file dependent phalanxes. Give a small relative bonus only in middlegame.
15-06-07 SC phalanx_1 diff
LLR: -2.95 (-2.94,2.94) [-1.50,4.50]
Total: 12609 W: 2303 L: 2375 D: 7931
sprt @ 15+0.05 th 1 Make bonus 3x larger.
15-06-07 SC phalanx_1 diff
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 16333 W: 2526 L: 2553 D: 11254
sprt @ 60+0.05 th 1 Give a small bonus for central phalanxes. Inspired by Botvinnik's treatment of the Queen's Gambit, e.g. http://www.chessgames.com/perl/chessgame?gid=1032264. LTC.
15-06-06 SC phalanx_1 diff
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 3856 W: 789 L: 673 D: 2394
sprt @ 15+0.05 th 1 Give a small bonus for central phalanxes. Inspired by Botvinnik's treatment of the Queen's Gambit, e.g. http://www.chessgames.com/perl/chessgame?gid=1032264
15-06-05 SC scale_regression_6 diff
ELO: -17.56 +-3.0 (95%) LOS: 0.0%
Total: 20000 W: 3500 L: 4510 D: 11990
20000 @ 10+0.05 th 1 Analyzing the results of scale_regression_5, I discovered that scale factors during the opening could get unnaturally high. New coefficients including search from initial position in the training data. If this does not improve against the linear model, I give up and report on the forum about the experiment.
15-06-05 SC scale_regression_5 diff
ELO: -40.94 +-3.5 (95%) LOS: 0.0%
Total: 15347 W: 2327 L: 4127 D: 8893
20000 @ 10+0.05 th 1 Linear models could not get better than -15 ELO. Let us see whether moving to a simple quadratic model and removing a large part of the infrastructure for scaling factors is an improvement.
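
The ELO lines of the fixed-length scale_regression entries can be recomputed from their W/L/D counts. The sketch below assumes the standard logistic Elo formula, a normal approximation for the 95% interval, and LOS taken as the normal probability that the true score exceeds 50%. The function names are hypothetical, and the framework's exact computation may differ in detail.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference for an expected score strictly in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_summary(wins, losses, draws, z=1.96):
    """Return (elo, 95% margin, LOS) for a finished match."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (wins * (1.0 - score) ** 2
                + losses * (0.0 - score) ** 2
                + draws * (0.5 - score) ** 2) / n
    stderr = math.sqrt(variance / n)
    elo = elo_from_score(score)
    # Half-width of the interval after mapping the score bounds to Elo.
    margin = (elo_from_score(score + z * stderr)
              - elo_from_score(score - z * stderr)) / 2.0
    # Likelihood of superiority under the normal approximation.
    los = 0.5 * (1.0 + math.erf((score - 0.5) / (stderr * math.sqrt(2.0))))
    return elo, margin, los


if __name__ == "__main__":
    # Counts taken from the scale_regression_tuned entry above.
    elo, margin, los = elo_summary(3428, 4538, 12034)
    print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {los * 100:.1f}%")
```

For the three scale_regression entries this should agree with the listed ELO values to within rounding, which suggests the listing uses the plain logistic formula rather than the BayesElo scale for these lines.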