ctl_cyrusdb Segmentation fault
I would like to share this with everyone else out there using Cyrus imapd as their main pop3/imap server.
This week I'm involved in a mail server migration. This server runs Sendmail (MTA), Cyrus imapd (pop3/imap), OpenLDAP (as the authentication backend), Cyrus sasl (as the connector between sendmail, cyrus and openldap) and Bogofilter (as the anti-spam solution). It has been running smoothly for the last year and a half (about 5000 email accounts), but some weeks ago we decided to move the server from its current location to another provider.
As you can imagine, this is a dedicated server in a hosting company, so we couldn't physically move the machine from one location to another. We just picked a better one (this move means a big hardware upgrade) and installed FreeBSD 7 (what else?) on it.
The first step in the migration was to install every package needed. As sendmail is in the default FreeBSD base, I installed:
- openldap-client-2.4.13
- openldap-server-2.4.13
- cyrus-imapd-2.3.13
- cyrus-sasl-2.1.22_2
- cyrus-sasl-saslauthd-2.1.22_1
- bogofilter-1.1.7_1
- milter-bogom-1.9.2
I won't go through the whole setup process in detail, only the cyrus imapd part of it. This was my first migration between different servers, trying to mirror every piece of data from one server to another and, to add some more ingredients to the recipe, I switched cyrus from version 2.2.13 to 2.3.13.
Configuration files (cyrus.conf, imapd.conf) were easy to migrate: I just copied them over ssh from one server to the other.
Then I had to migrate both /var/imap and /var/spool/imap (the first is where cyrus keeps mailbox information, metadata and so on, while the second is where the emails themselves are stored). As I said, this server holds nearly 5000 email accounts (no quota!!), so you can imagine the size of /var/spool/imap. That, and the fact that the server was still running while I did the first copy, pointed me to rsync for the task. I temporarily enabled root access through ssh on my old server and, from the new one, issued:
/usr/local/bin/rsync -vzaHe ssh --delete root@72.55.165.91:/var/spool/imap/ /var/spool/imap/
/usr/local/bin/rsync -vzaHe ssh --delete root@72.55.165.91:/var/imap/ /var/imap/
Important: do not forget the H option, which tells rsync to preserve hard links instead of copying each link as a separate file.
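To get a feeling for how much the H option matters on a given spool, one can count the files that have more than one hard link. This is only a minimal sketch: the function name count_hardlinked is made up for this example, and it relies on nothing but the standard -links test of find.

```shell
#!/bin/sh
# Hypothetical helper (not part of cyrus or rsync): count the regular
# files under a directory whose hard link count is greater than 1.
count_hardlinked() {
    find "$1" -type f -links +1 | wc -l | tr -d ' '
}

# Example with the spool path used in this post:
#   count_hardlinked /var/spool/imap
```

Running it on both servers after the final sync and comparing the two numbers gives a rough hint of whether the hard links survived the copy. The numbers will only match if no mail was delivered in between.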
It took some time to sync the /var/spool/imap directory. Once it had finished, I checked the permissions on both directories (they should be owned by cyrus:cyrus) and tried to start the imapd server:
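The ownership check can be scripted instead of done by eye. The following is a sketch under assumptions: list_wrong_owner is a hypothetical helper name, and the cyrus:cyrus owner is the one mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print every path under DIR whose owner or group
# differs from the expected one. No output means everything matches.
list_wrong_owner() {
    dir=$1 user=$2 group=$3
    find "$dir" \( ! -user "$user" -o ! -group "$group" \) -print
}

# Example with the directories from this post:
#   list_wrong_owner /var/imap cyrus cyrus
#   list_wrong_owner /var/spool/imap cyrus cyrus
```

If either call prints anything, those are the paths to fix with chown before starting imapd.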
/usr/local/etc/rc.d/imapd restart
just to notice this in /var/log/messages:
Apr 23 12:15:17 ns205739 kernel: pid 13988 (ctl_cyrusdb), uid 60: exited on signal 11
and this in /var/log/debug:
Apr 23 12:15:17 ns205739 master[13988]: about to exec /usr/local/cyrus/bin/ctl_cyrusdb
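When the log is long, a small filter helps to spot which programs died on a signal. This is a sketch that assumes the kernel message format shown above ("pid N (name), uid N: exited on signal N"); crashed_by_signal is a name invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print
# "program signal" for every "exited on signal" kernel message.
crashed_by_signal() {
    sed -n 's/.*pid [0-9]* (\([^)]*\)), uid [0-9]*: exited on signal \([0-9]*\).*/\1 \2/p'
}

# Example:
#   crashed_by_signal < /var/log/messages
```

Signal 11 is SIGSEGV, the segmentation fault.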
The imapd server was trying to check its databases before starting imap, pop and the rest of the services, and something was going wrong there. So I tried to run ctl_cyrusdb directly from a shell, as root:
ns205739# /usr/local/cyrus/bin/ctl_cyrusdb -r
Segmentation fault
ns205739#
A really ugly message, isn't it?
At first I thought it could be a mismatch in the version of Berkeley DB used by cyrus to generate the .db files in /var/imap, but I soon realized that both servers were running the same version, bdb 4.1. You should still check this first, because it is a common problem when working with cyrus servers. Just use ldd:
ns205739# ldd /usr/local/cyrus/bin/ctl_cyrusdb|grep libdb
libdb41.so.1 => /usr/local/lib/libdb41.so.1 (0x800e9d000)
ns205739#
and check that the same version of the library is used on both servers:
prunus# ldd /usr/local/cyrus/bin/ctl_cyrusdb|grep libdb
libdb41.so.1 => /usr/local/lib/libdb41.so.1 (0x281bb000)
prunus#
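The comparison is easier if only the library name is kept, since the load address at the end of each ldd line differs between machines anyway. A minimal sketch, where libdb_names is a hypothetical helper that reads ldd output on stdin:

```shell
#!/bin/sh
# Hypothetical helper: from ldd output on stdin, print only the names
# of the Berkeley DB libraries (first column of lines mentioning libdb).
libdb_names() {
    awk '/libdb/ { print $1 }' | sort -u
}

# Example, to be run on each server and compared:
#   ldd /usr/local/cyrus/bin/ctl_cyrusdb | libdb_names
```

With the output shown above, both servers would report libdb41.so.1.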
Then I took a really loooong ride through the Internet, using Google to search for this issue (or similar ones), only to find a lot of topics about:
- using ctl_mboxlist to export/import the mailboxes between servers - DIDN'T WORK
- using reconstruct to rebuild the mailboxes.db file - DIDN'T WORK
- using some tools from berkeley db to update/rebuild the .db files within /var/imap - DIDN'T WORK
Luckily for me, I finally found the solution: remove the db.backup* directories from /var/imap. After that, both ctl_cyrusdb and everything else began to work smoothly on this new server.
The db.backup directories are created by cyrus itself, from time to time, and contain copies of the mailboxes.db files, just in case something goes wrong and you need to get them back. But, for some reason, the copy of ctl_cyrusdb on the new server didn't like the backups from the old one.
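Since those directories are backups, a more cautious variant of the fix is to move them out of the way instead of deleting them. That is not what I did; it is only a sketch. The helper name set_aside_db_backups and the destination directory are invented for this example, and it assumes imapd is stopped while it runs.

```shell
#!/bin/sh
# Hypothetical helper: move every db.backup* directory found in the
# cyrus configuration directory into a separate holding directory.
set_aside_db_backups() {
    confdir=$1 dest=$2
    mkdir -p "$dest" || return 1
    for d in "$confdir"/db.backup*; do
        [ -d "$d" ] || continue   # nothing matched, or not a directory
        mv "$d" "$dest"/ || return 1
    done
}

# Example (the destination path is an arbitrary choice):
#   set_aside_db_backups /var/imap /var/tmp/imap-db-backups-old
```

Once the new server has been running fine for a while, the holding directory can be removed.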
Anyone with a nice easy-to-understand reason for that?