Stockfish Testing Queue

Pending - 0 tests 0.0 hrs

None

Active - 0 tests

Finished - 1094 tests

19-07-14 jos opening1 diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 5112 W: 1058 L: 1197 D: 2857
sprt @ 10+0.1 th 1 Take 2, less restrictive.
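
The interval (-2.94,2.94) shown on every LLR line is consistent with Wald's SPRT stopping thresholds for error rates alpha = beta = 0.05. A minimal sketch follows, assuming those error rates (the listing itself does not state them) and assuming the bracketed pair such as [0.50,4.50] gives the Elo hypotheses H0 and H1. The function names are illustrative and not taken from the fishtest code.

```python
import math


def sprt_bounds(alpha=0.05, beta=0.05):
    """Return Wald's (lower, upper) log-likelihood-ratio thresholds.

    alpha and beta are ASSUMED error rates; 0.05 each reproduces the
    (-2.94, 2.94) interval printed in the listing.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def sprt_state(llr, alpha=0.05, beta=0.05):
    """Classify an LLR value against the stopping thresholds."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "H1 accepted"   # test passes
    if llr <= lower:
        return "H0 accepted"   # test fails
    return "undecided"         # test stopped or still running


if __name__ == "__main__":
    print(sprt_bounds())
    for llr in (-2.95, 2.96, -0.21):
        print(llr, sprt_state(llr))
```

The listed LLR values are rounded to two decimals, so an entry displayed exactly at a bound may classify differently here than it did on the server.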
19-07-14 jos opening1 diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 5899 W: 1256 L: 1392 D: 3251
sprt @ 10+0.1 th 1 Don't skip quiet moves in the opening.
19-07-14 jos eval_in_check diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 4170 W: 819 L: 962 D: 2389
sprt @ 10+0.1 th 1 Call qsearch to get a valid eval when in check.
19-06-26 jos master_25mr diff
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 69594 W: 11923 L: 11550 D: 46121
sprt @ 60+0.6 th 1 Experiment mainly of theoretical interest. Can we test with a 25-moves rule at standard time controls? Retest the shuffle extension. (Low TP) STC looks neutral, now run LTC.
19-06-25 jos master_25mr diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 31858 W: 7134 L: 7140 D: 17584
sprt @ 10+0.1 th 1 Experiment mainly of theoretical interest. Can we test with a 25-moves rule at standard time controls? Retest the shuffle extension. (Low TP)
19-06-24 jos blockade1 diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 15578 W: 3416 L: 3503 D: 8659
sprt @ 10+0.1 th 1 Take 2. Keep lazy eval but exclude possibly blocked positions.
19-06-24 jos blockade1 diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 13686 W: 2967 L: 3064 D: 7655
sprt @ 10+0.1 th 1 Correctly evaluate fully blocked positions. For this to work lazy eval had to be removed. (This will probably fail, of course!)
19-06-23 jos dcExt diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 11265 W: 2451 L: 2560 D: 6254
sprt @ 10+0.1 th 1 Extend if side to move is double checked.
19-06-20 jos fh_patch diff
LLR: -0.21 (-2.94,2.94) [0.00,3.50]
Total: 31102 W: 5399 L: 5322 D: 20381
sprt @ 60+0.6 th 1 Simplified version. (Spec. LTC)
19-06-20 jos fh_patch diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 65905 W: 14996 L: 14830 D: 36079
sprt @ 10+0.1 th 1 Simplified version.
19-06-20 jos fh_patch diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 5587 W: 1165 L: 1302 D: 3120
sprt @ 10+0.1 th 1 Fundamental change of the fail-high patch. Only reduce by 2 plies and only once per iteration. After the reduced search also repeat the nominal search. Always display the nominal search depth.
19-06-17 jos lmr_escape diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 24993 W: 5591 L: 5631 D: 13771
sprt @ 10+0.1 th 1 Take 3.
19-06-17 jos lmr_escape diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 70362 W: 15715 L: 15600 D: 39047
sprt @ 10+0.1 th 1 Tweak the escape part.
19-06-17 jos lmr_escape diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 7931 W: 1716 L: 1842 D: 4373
sprt @ 10+0.1 th 1 Tweak 2.
19-06-16 jos lmr_escape diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 105432 W: 17877 L: 18204 D: 69351
sprt @ 60+0.6 th 1 LTC: Always consider escaping a capture in LMR.
19-06-16 jos lmr_escape diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 143135 W: 31840 L: 31960 D: 79335
sprt @ 10+0.1 th 1 Always consider escaping a capture in LMR.
19-06-15 jos pv_vector_new diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 55090 W: 12318 L: 12209 D: 30563
sprt @ 10+0.1 th 1 Retest for a gain now that MAX_PLY has almost been doubled.
19-05-24 jos multipv_fix diff
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 7369 W: 3197 L: 2911 D: 1261
sprt @ 60+0.6 th 1 LTC: Some kind of bugfix of multipv mode. (See commit message.)
19-05-23 jos multipv_fix diff
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 8233 W: 3708 L: 3424 D: 1101
sprt @ 10+0.1 th 1 Some kind of bugfix of multipv mode. (See commit message.)
19-05-23 jos cutNode4 diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 12737 W: 2788 L: 2890 D: 7059
sprt @ 10+0.1 th 1 CutNode tweak.
19-05-23 jos captures16 diff
LLR: -2.94 (-2.94,2.94) [-3.00,1.00]
Total: 95389 W: 21110 L: 21465 D: 52814
sprt @ 10+0.1 th 1 Shrink capturesSearched, functional change at higher depths. Test for non-regression.
19-05-22 jos vector_captures_quiets diff
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 6221 W: 1297 L: 1474 D: 3450
sprt @ 10+0.1 th 1 Curious to see how this performs. Use vectors for captures and quiets. (Half throughput)
19-05-19 jos multipv_jos diff
ELO: 28.17 +-6.0 (95%) LOS: 100.0%
Total: 10000 W: 4257 L: 3448 D: 2295
10000 @ 10+0.1 th 1 Bugfix and slightly more emphasis on score diff. ... and 15.
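
The ELO, error margin and LOS figures of the fixed-length tests can be recomputed from the W/L/D counts. The sketch below assumes the standard logistic Elo model, a normal-approximation 95% margin (delta method) and the usual erf-based LOS formula. It reproduces the numbers listed here to the displayed precision, but it is not taken from the fishtest source, and the function name is illustrative.

```python
import math


def elo_summary(wins, losses, draws):
    """Return (elo, margin95, los) from game counts.

    Assumes 0 < score < 1 and wins + losses > 0; otherwise the
    logarithm or a division below is undefined.
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n

    # Logistic Elo difference implied by the mean score.
    elo = 400.0 * math.log10(score / (1.0 - score))

    # Per-game variance of the result (1, 0.5 or 0 points).
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    stderr = math.sqrt(var / n)

    # Delta method: scale the score error by d(elo)/d(score).
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
    margin95 = 1.959964 * stderr * slope

    # Likelihood of superiority; draws carry no information here.
    los = 0.5 * (1.0 + math.erf((wins - losses)
                                / math.sqrt(2.0 * (wins + losses))))
    return elo, margin95, los


if __name__ == "__main__":
    elo, margin, los = elo_summary(4257, 3448, 2295)
    print(f"ELO: {elo:.2f} +-{margin:.1f} (95%) LOS: {100 * los:.1f}%")
```

For the entry directly above (W: 4257 L: 3448 D: 2295) this gives an Elo of about 28.17 with a margin of about 6.0, matching the listed line.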
19-05-19 jos multipv_jos diff
ELO: -38.16 +-6.7 (95%) LOS: 0.0%
Total: 10000 W: 4246 L: 5340 D: 414
10000 @ 10+0.1 th 1 Bugfix and slightly more emphasis on score diff. Test also at Skill Level 5 ...
19-05-19 jos multipv_jos diff
ELO: -40.34 +-6.4 (95%) LOS: 0.0%
Total: 10000 W: 3749 L: 4905 D: 1346
10000 @ 10+0.1 th 1 Bugfix and slightly more emphasis on score diff. Now test with Skill Level 10.
19-05-19 jos multipv_jos diff
ELO: 59.68 +-6.3 (95%) LOS: 100.0%
Total: 10000 W: 5094 L: 3393 D: 1513
10000 @ 10+0.1 th 1 Bugfix and slightly more emphasis on score diff.
19-05-19 jos multipv_jos diff
ELO: -202.62 +-20.3 (95%) LOS: 0.0%
Total: 1421 W: 293 L: 1039 D: 89
10000 @ 10+0.1 th 1 Also consider the previous score of this PV line. Take 1.
19-05-18 jos multipv_jos_vdv diff
ELO: 10.03 +-4.5 (95%) LOS: 100.0%
Total: 20000 W: 9011 L: 8434 D: 2555
20000 @ 10+0.1 th 1 Base and test branch in multipv=4 mode. (combo of vdv's multipv patch and mine for comparison.)
19-05-18 jos multipv_jos_vdv diff
ELO: -401.28 +-6.6 (95%) LOS: 0.0%
Total: 20000 W: 565 L: 16953 D: 2482
20000 @ 10+0.1 th 1 Make sure the remaining nonPV moves are searched with the nominal search depth. Is this the reason for the bad performance of the 1st test?
19-05-18 jos MultiPVDepthPRrebased diff
ELO: 34.87 +-8.9 (95%) LOS: 100.0%
Total: 5089 W: 2430 L: 1921 D: 738
20000 @ 10+0.1 th 1 Base and test branch in multipv=4 mode.
19-05-17 jos multipv_jos_vdv diff
ELO: -409.26 +-6.7 (95%) LOS: 0.0%
Total: 20000 W: 535 L: 17071 D: 2394
20000 @ 10+0.1 th 1 Also run a combo of vdv's multipv patch and mine for comparison.
19-05-02 jos psqt2 diff
ELO: -22.62 +-4.7 (95%) LOS: 0.0%
Total: 10000 W: 2055 L: 2705 D: 5240
10000 @ 10+0.1 th 1 Add some more manual values and run another quick check. (Baseline -154 elo)
19-04-28 jos scale_blocked diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 3482 W: 720 L: 868 D: 1894
sprt @ 10+0.1 th 1 A more generalized form of blocked position detection. First, run as standard patch and alongside the shuffle patch.
19-04-21 jos psqt2 diff
ELO: -147.40 +-5.6 (95%) LOS: 0.0%
Total: 10000 W: 1204 L: 5209 D: 3587
10000 @ 10+0.1 th 1 A quick check of manual rook values.
19-04-21 jos rpt diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 17684 W: 3841 L: 3936 D: 9907
sprt @ 10+0.1 th 1 Test manual values against master.
19-04-17 jos psqt_file_rank diff
ELO: -408.94 +-15.5 (95%) LOS: 0.0%
Total: 3833 W: 106 L: 3274 D: 453
20000 @ 10+0.1 th 1 Check new values. Baseline was -154 elo!
19-04-16 jos tune_psqt_new diff
95512/100000 iterations
200000/200000 games played
200000 @ 20+0.2 th 1 One more try with different settings, now with different ck values for each piece type. tc=20+0.2 nodestime=600, Pawn ck=20, Knight ck=60, Bishop ck=40, Rook ck=20, Queen ck=30, King ck=80, rk=0.010 (x10 compared to 1st session to allow faster change of values!).
19-04-16 jos tune_psqt_new diff
3005/100000 iterations
6275/200000 games played
200000 @ 20+0.2 th 1 One more try with different settings, tc=20+0.2 nodestime=600, ck=80, rk=0.020 (x20 compared to 1st session to allow faster change of values!).
19-04-16 jos psqt_file_rank diff
ELO: -154.91 +-4.0 (95%) LOS: 0.0%
Total: 20000 W: 2337 L: 10707 D: 6956
20000 @ 10+0.1 th 1 Check values after 2nd tuning session for progress. Baseline was -152 elo.
19-04-15 jos tune_psqt_new diff
95510/100000 iterations
199919/200000 games played
200000 @ 20+0.2 th 1 Second tuning session, after which I will check progress to decide whether it's worth continuing or not.
19-04-14 jos tune_psqt_new diff
95466/100000 iterations
199888/200000 games played
200000 @ 20+0.2 th 1 First tuning session, tc=20+0.2 nodestime=600, ck=60, rk=0.0010. Running with 8moves book to tune for more 'common' opening lines. (I hope everything is set up correctly and fishtest is now able to handle 192 parameters ...)
19-04-14 jos psqt_file_rank diff
ELO: -154.32 +-5.7 (95%) LOS: 0.0%
Total: 10000 W: 1189 L: 5360 D: 3451
10000 @ 10+0.1 th 1 Reset PSTs to zero and calculate them by file and rank. This significantly reduces the parameter space. (A quick first measurement as baseline before tuning.)
19-03-20 jos vector_pv diff
LLR: 1.76 (-2.94,2.94) [-3.00,1.00]
Total: 43404 W: 7309 L: 7287 D: 28808
sprt @ 60+0.6 th 1 LTC: Using vectors to build the pv. Seems simpler and easier to understand than the current solution. Test as simplification.
19-03-19 jos vector_pv diff
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 38117 W: 8500 L: 8411 D: 21206
sprt @ 10+0.1 th 1 Using vectors to build the pv. Seems simpler and easier to understand than the current solution. Test as simplification.
19-03-11 jos no_qs_at_root diff
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 134343 W: 29782 L: 29884 D: 74677
sprt @ 10+0.1 th 1 Now that we no longer enter qsearch() while still at root node, we can simplify away 2 changes and modify one assert to catch this more easily in the future. Test for no regression.
19-03-06 jos mcts_aspiration diff
ELO: -6.05 +-4.0 (95%) LOS: 0.1%
Total: 10000 W: 1626 L: 1800 D: 6574
10000 @ 60+0.6 th 1 LTC estimate for information purpose only! (1/3 throughput)
19-03-06 jos mcts_aspiration diff
LLR: -2.96 (-2.94,2.94) [0.50,4.50]
Total: 5575 W: 1195 L: 1333 D: 3047
sprt @ 10+0.1 th 1 Shift the aspiration window towards the mcts-like score. I would expect this to work better at longer time controls (if at all!), but try it anyway.
19-03-02 jos bugfix2_red diff
LLR: -2.95 (-2.94,2.94) [0.50,4.50]
Total: 21374 W: 4633 L: 4692 D: 12049
sprt @ 10+0.1 th 1 How is this one doing?
19-02-23 jos bugfix_red diff
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 81307 W: 17583 L: 17907 D: 45817
sprt @ 10+0.1 th 1 Non-regression test for PR#2017
19-02-22 jos multicut_tweak diff
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 86892 W: 19163 L: 18985 D: 48744
sprt @ 10+0.1 th 1 Not sure if this can be considered a bugfix, so test as parameter tweak. Allow returning a value outside the AB window, as everywhere else (fail-soft).
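
Each entry above carries a "Total: N W: w L: l D: d" line. As a rough sketch, such a line could be parsed and sanity-checked as shown below. The regular expression is inferred from the lines in this listing only, and the function and field names are hypothetical.

```python
import re

# Pattern inferred from lines such as
# "Total: 5112 W: 1058 L: 1197 D: 2857" in this listing.
TOTAL_RE = re.compile(
    r"Total:\s*(\d+)\s+W:\s*(\d+)\s+L:\s*(\d+)\s+D:\s*(\d+)"
)


def parse_total(line):
    """Parse a 'Total:' line into counts plus two derived fields."""
    match = TOTAL_RE.search(line)
    if match is None:
        raise ValueError(f"not a Total line: {line!r}")
    total, wins, losses, draws = (int(g) for g in match.groups())
    return {
        "total": total,
        "wins": wins,
        "losses": losses,
        "draws": draws,
        # The three outcomes should add up to the game count.
        "consistent": wins + losses + draws == total,
        "draw_ratio": draws / total if total else 0.0,
    }


if __name__ == "__main__":
    print(parse_total("Total: 5112 W: 1058 L: 1197 D: 2857"))
```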