Entries : Category [ FreeBSD ]

21 January


or what does tuning stand for, from your point of view?

man tuning?

man tuning

This is another discovery from this busy morning. I was searching for a way to change some kernel values in FreeBSD, values that apparently cannot be changed through sysctl (I love sysctl), and I finally found this interesting man page, tuning(7), about how to tune a FreeBSD system.

It covers things like file systems, physical access to hard drives, CPU performance, etc.

It is worth a read.

Posted by wu at 13:49 | Comments (0) | Trackbacks (0)
24 January

FreeBSD ports and Python versions

or how the env magic saved me again...

If you have some experience with FreeBSD ports, and more precisely with the Python-related ones (for example, extra modules like imaging or psycopg), you have probably already noticed that it is common to have two different Python versions installed on the same system.

For example, on one of my FreeBSD servers I have both Python 2.4 and 2.5 installed: 2.5 is the current stable version, but I need 2.4 for certain things, like Zope.

And what happens when you need an additional module for a given version of Python?

While working on that server today, trying to set up a Plone site, I noticed that I needed the Python imaging module. I checked with pkg_info (because I thought it was already installed) and found that a version of the module was indeed installed:

[prunus] /var/log> pkg_info | grep imaging
py25-imaging-1.1.6_2 The Python Imaging Library
[prunus] /var/log>

The module was installed for Python 2.5. The problem was that Plone runs on top of Zope, which needs Python 2.4, and as I did not have the module installed for that version of Python, Plone didn't find it.

No problem, I just went to /usr/ports/graphics/py-imaging and ran another make install.

As you can imagine, it tried to re-install the py25-imaging package and, obviously, it failed:

===>  Checking if graphics/py-imaging already installed
===>   py25-imaging-1.1.6_2 is already installed
      You may wish to ``make deinstall'' and install this port again
      by ``make reinstall'' to upgrade it properly.
      If you really wish to overwrite the old port of graphics/py-imaging
      without deleting it first, set the variable "FORCE_PKG_REGISTER"
      in your environment or the "make install" command line.
*** Error code 1


After carefully reading the error message, I opened the Makefile inside the port directory, searching for some knob or option to set the default Python version... but no luck. What now?

Well, after some googling I found something interesting in the Porter's Handbook:

"The Ports Collection supports parallel installation of multiple Python versions. Ports should make sure to use a correct python interpreter, according to the user-settable PYTHON_VERSION variable."

AHA! It was PYTHON_VERSION... So the only thing I needed was to set that variable in the user environment to get the correct version of the module installed:

prunus# setenv PYTHON_VERSION python2.4
prunus# make clean install
===>  Cleaning for py24-imaging-1.1.6_2
===>  Found saved configuration for py25-imaging-1.1.6_2
===>  Extracting for py24-imaging-1.1.6_2
=> MD5 Checksum OK for python/Imaging-1.1.6.tar.gz.
=> SHA256 Checksum OK for python/Imaging-1.1.6.tar.gz.
===>  Patching for py24-imaging-1.1.6_2
===>  Applying FreeBSD patches for py24-imaging-1.1.6_2

[ ... ]

===>   Registering installation for py24-imaging-1.1.6_2

NOTE: as you have probably noticed, this time I wasn't using sudo. If I had added the PYTHON_VERSION variable to my user environment and then used sudo, the process executed through sudo would not have been able to access that environment variable.

UPDATE: As betabug said in the comments, you can pass environment variables to sudo, for example, to install elementtree for both python 2.5 and 2.4:

[prunus] /usr/ports/devel/py-elementtree> sudo make install
=> elementtree-1.2.6-20050316.tar.gz doesn't seem to exist in /usr/ports/distfiles/.
=> Attempting to fetch from http://effbot.org/downloads/.

[ ... ]

===>   Registering installation for py25-elementtree-1.2.6
[prunus] /usr/ports/devel/py-elementtree> sudo make clean
===>  Cleaning for py25-elementtree-1.2.6
[prunus] /usr/ports/devel/py-elementtree> sudo env PYTHON_VERSION=python2.4 make install
===>  Extracting for py24-elementtree-1.2.6
=> MD5 Checksum OK for elementtree-1.2.6-20050316.tar.gz.

[ ... ]

===>   Registering installation for py24-elementtree-1.2.6
[prunus] /usr/ports/devel/py-elementtree>
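
The sudo env trick can be wrapped in a small loop when a port is needed for several Python versions. What follows is only a minimal sketch, not something from the original session: the function name port_cmds is made up for the example, and it just prints the command lines (a dry run) so they can be reviewed before being run from the port directory.

```sh
#!/bin/sh
# port_cmds: print one "sudo env PYTHON_VERSION=... make clean install"
# command line per Python version given as an argument (dry run only).
# The function name is an assumption made for this sketch.
port_cmds() {
    for version in "$@"; do
        printf 'sudo env PYTHON_VERSION=%s make clean install\n' "$version"
    done
}

# env sets the variable for the command it launches. With sudo in front,
# env itself runs as root and hands the variable over to make.
env PYTHON_VERSION=python2.4 sh -c 'echo "child sees: $PYTHON_VERSION"'
```

Calling port_cmds python2.4 inside /usr/ports/devel/py-elementtree would print the same command used in the session above, with clean added before install.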

Posted by wu at 14:15 | Comments (0) | Trackbacks (0)
12 May

FreeBSD, ports and OpenLDAP versions

or how to set up the proper openldap version when needed.

This is mostly a reminder for myself, but perhaps some of you will find it helpful.

Today, while trying to set up LDAP-based user authentication on FreeBSD (using OpenLDAP as the backend), I got this error while trying to install pam_ldap from ports:

===>  openldap-client-2.3.41 conflicts with installed package(s):

      They install files into the same place.
      Please remove them first with pkg_delete(1).
*** Error code 1

Stop in /usr/ports/net/openldap23-client.
*** Error code 1

Stop in /usr/ports/security/pam_ldap.

The problem here was that the pam_ldap port has openldap-client as a dependency, and the default version for that dependency is OpenLDAP 2.3, while my server already had OpenLDAP 2.4 on it.
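
Before building a port with such a dependency, it can help to check which OpenLDAP branch is already installed. A minimal sketch, assuming package names of the form name-version as printed by pkg_info; the helper name pkg_branch is hypothetical:

```sh
#!/bin/sh
# pkg_branch: print the major.minor branch of a package name such as
# "openldap-client-2.3.41". Hypothetical helper written for this sketch.
pkg_branch() {
    version=${1##*-}        # strip everything up to the last hyphen
    major=${version%%.*}    # text before the first dot
    rest=${version#*.}      # text after the first dot
    minor=${rest%%.*}       # text before the next dot
    printf '%s.%s\n' "$major" "$minor"
}

# The package name taken from the error message above.
pkg_branch openldap-client-2.3.41
```

On the server itself the names could come from pkg_info | grep openldap, the same kind of check used for py-imaging in the earlier post.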

What to do now?

Easy, just adding one line to /etc/make.conf solved the problem:


Back to the port directory, make install finished smoothly:

===>   Compressing manual pages for pam_ldap-1.8.4
===>   Registering installation for pam_ldap-1.8.4
[PXGOServer] /usr/ports/security/pam_ldap>

Posted by wu at 11:22 | Comments (0) | Trackbacks (0)
12 August

FreeBSD, OpenLDAP and the boot process

or how difficult it is to debug some problems.

This is something that happens from time to time to me, so I need to write the solution to the problem somewhere as soon as I find it.

Here at the office I have a FreeBSD intranet server, which acts as an authentication server for the local network. User information is stored in an LDAP database (OpenLDAP), and every service that performs any kind of user authentication/identification connects to it to retrieve that information. That means local system accounts, samba accounts, svn and trac accounts and more.

It is not my first time setting up such a server, and I've set up LDAP authentication backends for other kinds of servers (mail servers, web servers, etc.), but it is the first time I have had to deal with this issue.

When booting, the server took too long to finish the boot process, because each time a daemon (like named, sendmail or mountd) was launched before slapd (the OpenLDAP daemon), it tried to connect to the LDAP database to look up the user associated with the daemon. As slapd wasn't running, it kept retrying over and over until some kind of timeout forced the boot process to continue with the rest of the daemons.

And that was unacceptable. Can you imagine a server that takes more than 5 minutes to boot?

The first thing that came to mind was the FreeBSD boot order: perhaps changing the position of slapd, so that it starts before any other daemon, could be a solution...

It could be, but I wasn't sure about the root of the problem. I mean, it was quite strange that something like this hadn't made much more noise on the Internet. So I tried Google for a while, and I ended up finding an email on the FreeBSD ports mailing list that covers exactly the same problem.

Someone there pointed to a possible solution, modifying the RC scripts to force slapd to be launched before anything else (yup, my idea!) but, reading carefully, I found something very interesting:

Additionally you could set "bind_policy soft" in
${LOCALBASE}/etc/nss_ldap.conf to let nss_ldap return in case of
connection problems to slapd instead of waiting forever.

That's it! I mean, one would expect that having something like:

# nsswitch.conf(5) - name service switch configuration file
# $FreeBSD: src/etc/nsswitch.conf,v 1.1 2006/05/03 15:14:47 ume Exp $
group: files ldap
group_compat: nis
hosts: files dns
networks: files
passwd: files ldap
passwd_compat: nis
shells: files
services: compat
services_compat: nis
protocols: files
rpc: files

in /etc/nsswitch.conf should prevent LDAP queries from hanging forever (which in fact seemed to be my problem). User and group information should be retrieved first from local files (/etc/group, /etc/passwd, etc.), then from the LDAP database. But apparently that was not working at all (I have no idea why).

Well, a quick look over /usr/local/etc/nss_ldap.conf and I realized that I had two options that could be the reason behind the boot delay:

bind_timelimit 120
bind_policy hard

It seems that bind_policy hard forces the LDAP connection attempts to be retried over and over until bind_timelimit is reached (which in my case was 2 minutes). That means every failed connection to the slapd daemon took 2 minutes to give up... awful.

After changing those options to:

bind_timelimit 10
bind_policy soft

everything went back to normal, and the server took only a few seconds to start.
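
To get a feeling for the numbers, here is a rough back-of-the-envelope sketch. It assumes that each daemon started before slapd performs a single lookup that waits for the full bind_timelimit; that model, and the function name worst_case_delay, are assumptions made for the example, not measurements.

```sh
#!/bin/sh
# worst_case_delay: seconds lost at boot if each of N daemons performs one
# lookup that waits for the full bind_timelimit. Both the function name and
# the one-lookup-per-daemon model are assumptions of this sketch.
worst_case_delay() {
    timelimit=$1
    daemons=$2
    echo $((timelimit * daemons))
}

worst_case_delay 120 3   # bind_timelimit 120 and three daemons (named, sendmail, mountd)
worst_case_delay 10 3    # bind_timelimit 10 and the same three daemons
```

Under that assumption, the old value accounts for 360 seconds with just the three daemons named earlier, in line with the five-plus minutes observed, while the new one caps the delay at 30 seconds. With bind_policy soft the lookups should return even sooner, since nss_ldap is supposed to give up instead of retrying.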

UPDATE: It seems that setting bind_policy to soft wasn't a good idea after all. It solved the boot problem, but it caused some other problems, like users not being associated with their groups in the LDAP database. FreeBSD apparently knows those groups exist in the LDAP database (running getent group shows all of them, including the usernames associated with each group), but running id as a user who belongs to some groups defined in the LDAP database reports only the groups defined in /etc/group. Setting bind_policy back to hard solved the problem, but now I'm back to the boot problem again...

Posted by wu at 12:27 | Comments (0) | Trackbacks (0)
31 October

FreeBSD, OpenLDAP and the boot process (II)

there it is again! arrrggg

One of the things that scares me the most, the FreeBSD, OpenLDAP and boot process thing, is back!

This time, I had to reboot that faulty server because of a kernel panic caused by a usb-connected external hd. I just noticed some noise in /var/log/messages:

Oct 31 17:06:37 PXGOServer kernel: g_vfs_done():da1s1d[WRITE(offset=6144000, length=14336)]error = 6
Oct 31 17:06:47 PXGOServer kernel: g_vfs_done():da1s1e[WRITE(offset=6144000, length=8192)]error = 6
Oct 31 17:09:33 PXGOServer kernel: g_vfs_done():da1s1e[READ(offset=114688, length=16384)]error = 6
Oct 31 17:09:33 PXGOServer kernel: g_vfs_done():da1s1d [READ(offset=114688, length=16384)]error = 6
Oct 31 17:09:38 PXGOServer kernel: g_vfs_done():da1s1d[READ(offset=114688, length=16384)]error = 6

and BAM, a panic there.

Of course, rebooting the server took too much time (as always), so I decided to take another deep look into the problem (as I've done before).

But this time, I'll try to solve the problem from another point of view. Let's try to force slapd to be booted early in the boot process.

Just two links to understand it better:

(If you are used to managing FreeBSD servers, you probably know them already).

After some research (20-30 minutes), I found a quick solution. I just had to modify /usr/local/etc/rc.d/slapd to change two lines, from:

# REQUIRE: NETWORKING SERVERS
# BEFORE: securelevel

to:

# REQUIRE: NETWORKING
# BEFORE: SERVERS securelevel

What does this mean?

It's pretty easy to understand. The original two lines said that slapd needed every rc script in the NETWORKING group (setting up network interfaces, routing tables, etc.) and the SERVERS group (named, mountd, sendmail, etc.) to have run before it was started, while slapd itself had to be started before the securelevel rc script.

After my change, slapd only needs rc scripts in the NETWORKING group to be already started and it will be started before any other server rc script.
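
To double-check which dependency lines an rc script currently declares, a small helper can print them. This is a sketch only: the name rc_deps is hypothetical, and the sample script below is an illustration written for the example, not a copy of the real slapd script.

```sh
#!/bin/sh
# rc_deps: print the REQUIRE and BEFORE keyword lines of an rc.d script.
# Hypothetical helper; these are the comment lines that rcorder(8) reads.
rc_deps() {
    grep -E '^# (REQUIRE|BEFORE):' "$1"
}

# Illustrative sample script (an assumption, not the real slapd script).
sample=$(mktemp)
cat > "$sample" <<'EOF'
#!/bin/sh
# PROVIDE: slapd
# REQUIRE: NETWORKING
# BEFORE: SERVERS securelevel
EOF
rc_deps "$sample"
rm -f "$sample"
```

On the server, rc_deps /usr/local/etc/rc.d/slapd would show the lines just edited, and rcorder(8) can print the start order it computes from the scripts passed to it.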

A quick reboot and I noticed that slapd was started before every other service (including named, sendmail, mountd, sshd, apache, zope, postgresql and some other ones).

Everything was fine, but there is still a delay, because while starting slapd itself, the system still tries to connect to the OpenLDAP database in order to find the user responsible for running the slapd process... (WTF! That's the process you are trying to start!).

Posted by wu at 17:41 | Comments (0) | Trackbacks (0)
17 March

FreeBSD + gmirror = defence against hard drive failures

this is probably one of the most awesome tests I've ever done

The poweredge 1800, beautiful, isn't it?

Some days ago I bought a used Dell PowerEdge 1800 server (yes, another one after the 6650). This new server will replace the one this site currently runs on. It has 1x2.8 GHz 64-bit Intel Dual Xeon processor, 2 GB of DDRII RAM and 3x73 GB UltraSCSI 360 hard drives (10000 rpm) attached to an Adaptec 39160 SCSI card.

Last weekend I installed FreeBSD (what else?) 7.1 (amd64) on it, using the first SCSI hard drive (da0). A minimal install was enough; then I followed the usual process and updated it to 7-STABLE (creating my own customized kernel configuration file).

The poweredge 1800, without the frontal protector

Once the box was completely up to date (I even got a synced version of the ports tree using cvsup) I set up a gmirror following this chapter from the FreeBSD Handbook. Gmirror is part of the new GEOM framework, which lets admins manage things like software RAID easily (I recommend taking some time to read that chapter of the Handbook, it is really worth a read).

One of the shiny features of gmirror is that it allows us to create a RAID-1 mirror of our main hard drive (that is, the drive FreeBSD boots from), and it is supposed to handle hard drive failures: if one of the disks attached to the mirror fails, the system should stay up and running as if nothing had happened.

So, there I was, with FreeBSD and a complete gmirror (I had to wait until the mirroring process was complete) with 2 hot-plug SCSI drives... could there be a better environment to test gmirror and see how it copes with a hard drive failure? :D

unplugging the disk from a live system

The test was pretty easy to perform. I just picked one of the disks (da0, the one the gmirror was built from) and pulled it from the hot-plug SCSI card...

And the system kept running smoothly! The only notice that the disk had been unplugged appeared in the /var/log/messages log file:

Mar 15 12:51:20 nidhogg kernel: ahc0: Someone reset channel A
Mar 15 12:51:31 nidhogg kernel: (da1:ahc0:0:1:0): WRITE(10). CDB: 2a 0 8 8b b9 39 0 0 1 0
Mar 15 12:51:31 nidhogg kernel: (da1:ahc0:0:1:0): CAM Status: SCSI Status Error
Mar 15 12:51:31 nidhogg kernel: (da1:ahc0:0:1:0): SCSI Status: Check Condition
Mar 15 12:51:31 nidhogg kernel: (da1:ahc0:0:1:0): UNIT ATTENTION asc:29,2
Mar 15 12:51:31 nidhogg kernel: (da1:ahc0:0:1:0): SCSI bus reset occurred
Mar 15 12:51:31 nidhogg kernel: (da1:ahc0:0:1:0): Retrying Command (per Sense Data)
Mar 15 12:51:31 nidhogg kernel: (da0:ahc0:0:0:0): lost device
Mar 15 12:51:31 nidhogg kernel: (da0:ahc0:0:0:0): Invalidating pack
Mar 15 12:51:31 nidhogg kernel: GEOM_MIRROR: Cannot write metadata on da0 (device=gm0, error=6).
Mar 15 12:51:31 nidhogg kernel: GEOM_MIRROR: Cannot update metadata on disk da0 (error=6).
Mar 15 12:51:31 nidhogg kernel: GEOM_MIRROR: Device gm0: provider da0 disconnected.
Mar 15 12:51:31 nidhogg kernel: (da0:ahc0:0:0:0): Synchronize cache failed, status == 0x4a, scsi status == 0x0
Mar 15 12:51:31 nidhogg kernel: (da0:ahc0:0:0:0): removing device entry

The system is warning us about some problems on the ahc0 controller (the Adaptec SCSI card), as it seems that one of the disks attached to it isn't there anymore. Then GEOM warns us too, about a failure when trying to write metadata to one of the disks attached to the mirror (but it kept running fine, using only the other disk).

The first part of the test was completely successful, but I still had to check whether the server was able to reboot without the first hard drive attached to it, and indeed it did! I rebooted the box without the first SCSI disk and everything booted up fine; the system was up and running in a matter of seconds.

plugging back the disk into a live system

To end my tests, I just plugged the drive back in the scsi hot-plug card, just to notice some more information in /var/log/messages:

Mar 15 12:56:25 nidhogg kernel: ahc0: Someone reset channel A
Mar 15 12:56:30 nidhogg kernel: (da0:ahc0:0:1:0): WRITE(10). CDB: 2a 0 8 8b b9 39 0 0 1 0
Mar 15 12:56:30 nidhogg kernel: (da0:ahc0:0:1:0): CAM Status: SCSI Status Error
Mar 15 12:56:30 nidhogg kernel: (da0:ahc0:0:1:0): SCSI Status: Check Condition
Mar 15 12:56:30 nidhogg kernel: (da0:ahc0:0:1:0): UNIT ATTENTION asc:29,2
Mar 15 12:56:30 nidhogg kernel: (da0:ahc0:0:1:0): SCSI bus reset occurred
Mar 15 12:56:30 nidhogg kernel: (da0:ahc0:0:1:0): Retrying Command (per Sense Data)

And the drive was back!

One thing to watch out for: using the gmirror status command I noticed that the disk wasn't recognized by gmirror, so I had to tell the mirror to forget its disconnected components and re-add the disk to it:

gmirror forget gm0
gmirror insert gm0 /dev/da0

This led the mirror to re-sync the da0 device with the current mirror (which is understandable, as the system kept running and writing data to da1 while da0 wasn't connected at all).
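
When scripting a check like this, the mirror state can be pulled out of the gmirror status output. The following is a minimal sketch: it assumes the usual three-column layout (Name, Status, Components), the helper name mirror_state is hypothetical, and the sample input is an illustration, not output captured from this server.

```sh
#!/bin/sh
# mirror_state: read "gmirror status" output on stdin and print the Status
# column for the given mirror. Hypothetical helper; it assumes the
# Name / Status / Components column layout.
mirror_state() {
    awk -v name="mirror/$1" '$1 == name { print $2 }'
}

# Illustrative sample input (an assumption, not real output from nidhogg).
mirror_state gm0 <<'EOF'
      Name    Status  Components
mirror/gm0  DEGRADED  da1
EOF
```

On a live system the call would be gmirror status | mirror_state gm0, run before and after the forget and insert commands.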

So, successful test! Now I have a server that tolerates hard drive failures.

Just to end this post, two comments:

1- gmirror comes with some commands (like status or list) that are useful for getting information about the mirror itself, but another tool for gathering information (this time about disk usage/activity) is systat. The command:

systat -iostat 2

will show you information about your disks usage in a top-like interactive way:

          /0%  /10  /20  /30  /40  /50  /60  /70  /80  /90  /100

(check the man page of systat to learn more about it)

2- I did not run stress tests or performance benchmarks on the mirror (that will be the next test), but it would be nice to check whether there are any performance issues when using the mirror instead of only one of the disks (and, if there are any, to judge whether the performance penalty is acceptable given that the system will keep running if one of the drives crashes).

Posted by wu at 08:19 | Comments (0) | Trackbacks (0)
23 April

ctl_cyrusdb Segmentation fault

upgrade cyrus imapd from 2.2.13 to 2.3.13

I would like to share this with everyone else out there using Cyrus imapd as their main pop3/imap server.

This week I'm involved in a mail server migration. This server runs Sendmail (MTA), Cyrus imapd (pop3/imap), OpenLDAP (as the authentication backend), Cyrus sasl (as the connector between sendmail, cyrus and openldap) and Bogofilter (as the anti-spam solution). It has been running smoothly for the last year and a half (about 5000 email accounts), but some weeks ago we decided to move the server from its current location to another provider.

As you can imagine, this is a dedicated server in a hosting company, so we couldn't physically move the server from one location to another. We just picked a better one (this move means a big hardware upgrade) and installed FreeBSD 7 (what else?) on it.

First step in the migration was to install every package needed. As sendmail is in the default FreeBSD base, I installed:

I won't go into detail about setting up the whole thing, only the Cyrus imapd part of it. This was my first migration between different servers, trying to mirror every piece of data from one server to another and, to add some more ingredients to the recipe, I switched Cyrus from version 2.2.13 to 2.3.13.

Configuration files (cyrus.conf, imapd.conf) were easy to migrate, I just copied them using ssh from one server to another.

Then I had to migrate both /var/imap and /var/spool/imap (the first one is the place where Cyrus saves mailbox information, metadata and so on, while the second one is the place where emails are really stored). As I've told you, this server holds nearly 5000 email accounts (no quota!!), so you can imagine the size of /var/spool/imap. That, and the fact that the server was running while I did the first copy, pointed me to rsync to perform the task. I temporarily enabled root access through ssh on my old server and, from the new one, issued:

/usr/local/bin/rsync -vzaHe ssh --delete root@ /var/spool/imap/
/usr/local/bin/rsync -vzaHe ssh --delete root@ /var/imap/

Important: do not forget the -H option, as it makes rsync preserve hard links too.
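
A dry-run sketch of how those two rsync invocations could be generated. Everything that is not in the commands above is an assumption: the host name old-server.example.org is a placeholder (the real one is not given here), the function name sync_cmd is hypothetical, and the sketch assumes each directory is copied to the same path on the new server. It only prints the commands; nothing is copied.

```sh
#!/bin/sh
# sync_cmd: print the rsync command for one directory (dry run only).
# OLD_HOST is a placeholder; the real host name is not given in the post.
OLD_HOST=${OLD_HOST:-old-server.example.org}

sync_cmd() {
    dir=$1
    # -a archive, -H preserve hard links, -z compress, -v verbose,
    # -e ssh as transport, --delete removes files gone from the source.
    printf '/usr/local/bin/rsync -vzaHe ssh --delete root@%s:%s/ %s/\n' \
        "$OLD_HOST" "$dir" "$dir"
}

sync_cmd /var/spool/imap
sync_cmd /var/imap
```

Since the first copy was taken while the old server was still running, a second pass once the mail services have been stopped there would be needed to pick up whatever changed in between.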

It took some time to sync the /var/spool/imap directory. Once it had finished, I checked the permissions on both directories (they should be owned by cyrus:cyrus) and tried to start the imapd server:

/usr/local/etc/rc.d/imapd restart

just to notice this in /var/log/messages:

Apr 23 12:15:17 ns205739 kernel: pid 13988 (ctl_cyrusdb), uid 60: exited on signal 11

and this in /var/log/debug:

Apr 23 12:15:17 ns205739 master[13988]: about to exec /usr/local/cyrus/bin/ctl_cyrusdb

The imapd server was trying to check its databases before starting imap, pop and the rest of the services, and something was going wrong there. Then I tried running ctl_cyrusdb directly from a shell, as root:

ns205739# /usr/local/cyrus/bin/ctl_cyrusdb -r
Segmentation fault

A really ugly message, isn't it?

At first I thought it could be a mismatch in the version of Berkeley DB used by Cyrus to generate the .db files in /var/imap, but I soon realized that both servers were running the same version, bdb 4.1. You should still check this first, because it is a common problem when working with Cyrus servers. Just use ldd:

ns205739# ldd /usr/local/cyrus/bin/ctl_cyrusdb|grep libdb
        libdb41.so.1 => /usr/local/lib/libdb41.so.1 (0x800e9d000)

and check if there is the same version of the library in both servers:

prunus# ldd /usr/local/cyrus/bin/ctl_cyrusdb|grep libdb
        libdb41.so.1 => /usr/local/lib/libdb41.so.1 (0x281bb000)
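
The comparison can be made less error-prone by extracting just the library name from the ldd output on each server. A minimal sketch; the helper name libdb_soname is hypothetical:

```sh
#!/bin/sh
# libdb_soname: read ldd output on stdin and print the name of the
# Berkeley DB library it links against. Hypothetical helper for this sketch.
libdb_soname() {
    awk '$1 ~ /^libdb/ { print $1 }'
}

# Sample line copied from the ldd output shown above.
printf '        libdb41.so.1 => /usr/local/lib/libdb41.so.1 (0x800e9d000)\n' | libdb_soname
```

Running ldd /usr/local/cyrus/bin/ctl_cyrusdb | libdb_soname on both servers should print the same name if they link against the same Berkeley DB version.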

Then I took a really loooong ride through the Internet, using google to search for this issue (or similar ones), just to find a lot of topics about:

Luckily for me, I just found the solution: remove the db.backup* directories from /var/imap. After that, both ctl_cyrusdb and everything else began to work smoothly on this new server.

The db.backup directories are created by cyrus itself, from time to time, and contain copies of the mailboxes.db files, just in case something goes wrong and you need to get them back. But, for some reason, the copy of ctl_cyrusdb in the new server didn't like the backups from the old one.
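
A more cautious variant of the fix is to move the backup directories aside instead of deleting them, so they can be put back if needed. This is a sketch of that variant, not what was actually done here; the function name stash_backups and the holding directory are assumptions.

```sh
#!/bin/sh
# stash_backups: move every db.backup* entry out of a Cyrus config directory
# into a holding directory instead of deleting it. Hypothetical helper; the
# post removed the directories, moving them aside is a more cautious variant.
stash_backups() {
    confdir=$1
    holding=$2
    mkdir -p "$holding" || return 1
    for entry in "$confdir"/db.backup*; do
        [ -e "$entry" ] || continue   # the glob matched nothing
        mv "$entry" "$holding"/ || return 1
    done
}
```

With imapd stopped, a call such as stash_backups /var/imap /var/tmp/imap-db-backups (the second path is just an example) would leave /var/imap without any db.backup* entries before ctl_cyrusdb runs again.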

Anyone with a nice easy-to-understand reason for that?

Posted by wu at 18:42 | Comments (1) | Trackbacks (0)
03 November

pkg_add -r and sudo in FreeBSD

strange behaviour indeed, never happened to me before.

I usually install software on my FreeBSD servers using the ports collection instead of installing pre-compiled binary packages (I don't want a flame war about this, I prefer to compile the software, period) but, from time to time, I use the package collection too.

Today, while playing with an 8.0-RC2 box, I noticed that pkg_add -r (to install packages remotely from the FreeBSD ftp server) doesn't work properly with sudo:

[nidhogg] ~> sudo pkg_add -r lynx
Error: Unable to get ftp://ftp.freebsd.org/pub/FreeBSD/ports/amd64/packages-8.0-release/Latest/lynx.tbz: Syntax error, command unrecognized
pkg_add: unable to fetch 'ftp://ftp.freebsd.org/pub/FreeBSD/ports/amd64/packages-8.0-release/Latest/lynx.tbz' by URL
[nidhogg] ~>

It works fine as root:

nidhogg# pkg_add -r lynx
Fetching ftp://ftp.freebsd.org/pub/FreeBSD/ports/amd64/packages-8.0-release/Latest/lynx.tbz... Done.

To enable certificate handling for SSL connnections, set
SSL_CERT_DIR and SSL_CERT_FILE in your environment to the
proper values (depending upon which SSL library
/usr/local/bin/lynx uses), as described in:




You may also need to generate keys and certificates as
described in the latter document and your SSL documentation.


Can anyone explain it? (Probably something related to sudo being more restrictive with environment variables?)
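
One way to test that guess is to compare the environment of a root shell with the one sudo hands to a command. The sketch below only lists variable names that are present in the first listing and absent from the second; the function names are hypothetical, and it does not claim that a missing variable is the actual cause.

```sh
#!/bin/sh
# env_names: print the sorted variable names found in "env" output on stdin.
env_names() {
    sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' | sort -u
}

# missing_under_sudo: names present in the first env listing (file $1)
# but absent from the second one (file $2). Hypothetical helper.
missing_under_sudo() {
    root_list=$(mktemp)
    sudo_list=$(mktemp)
    env_names < "$1" > "$root_list"
    env_names < "$2" > "$sudo_list"
    comm -23 "$root_list" "$sudo_list"
    rm -f "$root_list" "$sudo_list"
}
```

The two listings could be produced with env > root.env in a root shell and sudo env > sudo.env as the regular user, then compared with missing_under_sudo root.env sudo.env.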

Posted by wu at 15:51 | Comments (1) | Trackbacks (0)
27 November

FreeBSD 8.0 released

what timing, just released with self() 3.0!!

Wow, FreeBSD 8.0 was released yesterday!

Perhaps you think it could be a coincidence that the next major release of FreeBSD was announced just the very same day the next major release of self() was announced too... but I know it wasn't ;D

More about the new version of one of my favourite operating systems:

I've been running 8.0-RC* and even -BETA versions for some time now, and you will find the usual FreeBSD quality plus some major improvements. I recommend that all FreeBSD users out there upgrade ASAP.

Posted by wu at 17:13 | Comments (0) | Trackbacks (0)
30 December

mergemaster -U

and it was there all the time, in the man page!

From mergemaster(8):

-U     Attempt to auto upgrade files that have not
       been user modified.

Which is a very useful option to use if you are upgrading a FreeBSD server from a -RELEASE version to a -STABLE version, for example from 8.0-RELEASE to 8-STABLE.

In such case there are a lot of files that will be modified during the installation (like those in /etc/defaults or /etc/periodic) but whose only modification is a timestamp in the cvs information at the beginning of the file.

Passing the -U argument to mergemaster saves you from having to check each file separately.
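
To see by hand whether two versions of a file differ only in that version-control line, something like the following sketch can be used. It is not how mergemaster -U makes its decision, just a way to look at the same situation; the function name same_except_id is hypothetical.

```sh
#!/bin/sh
# same_except_id: succeed when two files are identical once lines carrying
# the $FreeBSD$ version-control tag are ignored. Hypothetical helper; this is
# NOT how mergemaster -U decides, only a manual check of the same situation.
same_except_id() {
    old=$(grep -v '\$FreeBSD' "$1")
    new=$(grep -v '\$FreeBSD' "$2")
    [ "$old" = "$new" ]
}
```

For example, same_except_id old.conf new.conf && echo "only the tag changed", where the two file names stand for the installed file and the freshly built one.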

(Lucky me that I found this!)

Posted by wu at 12:52 | Comments (0) | Trackbacks (0)