Moronic Linux
The root cause appeared to be a crazy Linux update on solar: /bin/mail
moved to /usr/bin/mail, breaking the code that submits completed games.
This type of totally unnecessary change is exactly what makes Linux an
unreliable platform compared to the BSDs or Solaris.
I lost the logs of the games from February 22 to March 18, or about 5,000 games.
On March 18 /bin/mail was sym-linked to the new mail program.
This is a workaround until I have time to recompile the binaries.
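A hard-coded path like /bin/mail is what made the submit code fragile here. As a minimal sketch of a more tolerant lookup (not the code Rookie actually uses), the helper below tries a list of known locations and then falls back to a PATH search. The function name `find_mailer` and the candidate list are hypothetical; only the two paths come from the text above.

```python
import os
import shutil
from typing import Optional, Sequence

# Assumption: these are the only two locations worth probing directly.
# Both paths are the ones named in the text; the list itself is illustrative.
CANDIDATES = ("/usr/bin/mail", "/bin/mail")


def find_mailer(candidates: Sequence[str] = CANDIDATES,
                name: str = "mail") -> Optional[str]:
    """Return the first usable mail program, or None if there is none."""
    for path in candidates:
        # Accept only regular files (or symlinks to them) that are executable.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    # Fall back to whatever the PATH environment variable resolves.
    return shutil.which(name)
```

A caller would treat a `None` result as a hard error and log it, so that completed games are not silently dropped for weeks.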
No full recovery
Today I found time to restore some of the damage. I imported the missing
games from the game e-mails that FICS and ICC send out, so at least I have
the PGN data back.
Anyway, here are the corrected results. On ICC, Rookie(C) has fully recovered
to its expected position with respect to blik(C).
On FICS, this isn't the case.
Maybe this is due to some players preferring one opponent over the other.
It does mean that even extended server play is not a reliable method
to measure strength differences. That is a very disappointing conclusion.

Developments
I have been busy relocating for work again, so not much has been
done on Rookie.
The eval tuning experiment has failed.
I will focus now on book learning and tuning, as that
must be the other easy way to improve Rookie.
Last month I rewrote the book file handling code to make
it more flexible to use.
I have some other plans for that code as well, which I won't disclose for now.

| Account | Server | Rating stats (last 1000) | | | |
| | | Average | Sigma | Accuracy | Delta rating |
| Rookie(C) | chessclub.com | 2130 | 80 | 2.5 | -392 |
| blik(C) | chessclub.com | 2499 | 65 | 2.0 | +32 |
| Rookie(C) | freechess.org | 1983 | 24 | 0.8 | -238 |
| blik(C) | freechess.org | 2190 | 57 | 1.8 | +7 |
Disaster strikes
After copying the new vector to the Internet versions of the engine,
Rookie(C) got some games quite quickly.
However, the first couple of games already indicated that something was
wrong, very wrong.
Some players commented that this version has gone totally crazy.
It indeed looks like the computer is drunk.
When you observe the games, the erratic behavior becomes clear almost
immediately:
Rookie(C) whispers +2 pawns up or more in equal positions, gives
away material, refuses to capture easy pawns and so on...
It is really horrible.
No need to explain that the ratings took a steep dive right away.
I didn't dare check precisely how far it sank, but so
far it looks like Rookie(C) lost 100 to 200 points...
Lessons learnt
So, the first big 'improvement' towards Rookie 3.0 has turned out terribly.
Not a good start! What can we learn from that?
I promise that the current version will go offline as soon as I can't bear it anymore: probably just a few days from now. Players can enjoy beating the sitting duck until that time...

| Account | Server | Blitz games | Rating stats (last 1000) | | |
| | | | Average | Sigma | Accuracy |
| Rookie(C) | chessclub.com | 1998 | 2522 | 55 | 1.7 |
| blik(C) | chessclub.com | 2198 | 2467 | 55 | 1.7 |
| Rookie(C) | freechess.org | 1282 | 2221 | 39 | 1.3 |
| blik(C) | freechess.org | 1604 | 2183 | 61 | 1.9 |
2007-01-02 22:02:31 <comp: warning: unhandled alarm signal in stopped engine
2007-01-02 22:02:35 <comp: engine panic: 5 unhandled alarm signals
2007-01-02 22:02:40 <comp: engine panic: 10 unhandled alarm signals, stopping engine
2007-01-02 22:02:45 <comp: engine panic: 15 unhandled alarm signals, resetting data structures
2007-01-02 22:03:00 <comp: engine panic: 30 unhandled alarm signals, leaving in 30 seconds
2007-01-02 22:03:31 <comp: engine panic: 60 unhandled alarm signals, leaving now
RookICS: Lost contact with computer.
RookICS: main.c:457, exit code 0

2007-01-01 20:11:04 <ics: Movelist for game 119:
2007-01-01 20:11:04 <ics: Rupert (2236) vs. Rookie (2258) --- Mon Jan 1, 12:10 PST 2007
2007-01-01 20:11:04 <ics: Rated blitz match, initial time: 3 minutes, increment: 1 seconds.
2007-01-01 20:11:04 >comp: [Event "fics server game, rated blitz match"]
(And after that, not even an exit message...)
Evaluator auto-tuning started
As a second milestone, the old scripts for tuning the evaluator are
running again for the first time in almost 6 years.
Solar has 4 CPU cores, so now I have enough computing power
available to run this kind of long-term calibration without
impacting online play too much.
I have extracted a test set of 47,265 positions from old
blik(C) games.
(This is the total number that passed the sanity filters. Initial runs
suggested that maybe the targeted number of 10,000 positions
is not enough for calibration purposes.)
The scripts use a hill-climbing algorithm on 205 of Rookie's
adjustable evaluation parameters.
This should be enough to test the feasibility of this
auto-tuning method.
Not all parameters are covered, but it is a good start.
It will take a couple of weeks to iterate over the set a number
of times:
long enough to do the rating null measurement and test the setup in the
meantime.
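
The hill-climbing idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual tuning scripts: the names `hill_climb` and `toy_error`, the step size, and the three-parameter toy objective are all hypothetical. In the real setup the error function would score a parameter vector against the test set of positions, which is where nearly all the computing time goes.

```python
from typing import Callable, List


def hill_climb(params: List[int],
               error: Callable[[List[int]], float],
               step: int = 1,
               max_passes: int = 10) -> List[int]:
    """Coordinate-wise hill climbing: nudge one parameter at a time and
    keep the change only when the error goes down."""
    best = list(params)
    best_err = error(best)
    for _ in range(max_passes):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                trial_err = error(trial)
                if trial_err < best_err:
                    best, best_err = trial, trial_err
                    improved = True
                    break  # keep the first improving direction for this parameter
        if not improved:
            break  # local optimum at this step size
    return best


def toy_error(p: List[int]) -> float:
    """Stand-in objective (assumption): squared distance to a known optimum."""
    target = [3, -2, 5]
    return float(sum((a - b) ** 2 for a, b in zip(p, target)))


tuned = hill_climb([0, 0, 0], toy_error, max_passes=20)
```

Each pass costs up to two error evaluations per parameter, so with 205 parameters and tens of thousands of positions a single pass is already expensive. Hill climbing also stops at the first local optimum it finds, which need not be the best setting overall.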