sshd: Corrupted MAC on input.
Quite a strange error, don't you think?
It took me some time to realize what was happening, so I hope that sharing this information will help somebody in the future.
Some days ago I was syncing some data between two FreeBSD 8.x servers using rsync when the transfer suddenly stopped, and on the origin server I saw:
/home/ficheros/201001261403000.pdf
Received disconnect from xxx.xxx.xxx.xxx: 2: Packet corrupt
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: connection unexpectedly closed (14093 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(601) [sender=3.0.7]
(I've replaced the host's real IP address with xxx.xxx.xxx.xxx)
At first I thought it was just a network error, so I tried again, but after a while I found that the error was appearing randomly.
The next thing I did was check the logs on the destination server, where I found some interesting lines in /var/log/auth.log:
Sep 21 07:59:12 mangosta sshd[48978]: Corrupted MAC on input.
Sep 21 07:59:12 mangosta sshd[48978]: Disconnecting: Packet corrupt
(mangosta is the hostname of the destination server)
Wow, Corrupted MAC on input? Could it be that someone was doing something nasty with my servers?
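If the error shows up randomly, it can help to pull every occurrence out of the log and see whether the events cluster around busy periods. The following is only a minimal sketch: the function names are made up for illustration, and the regular expression assumes the classic syslog line layout seen in the two lines above.

```python
import re
from collections import Counter

# Assumed layout: "Mon DD HH:MM:SS host sshd[pid]: message"
SSHD_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"sshd\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)

def corrupted_mac_events(lines):
    """Return (timestamp, host, pid) for each 'Corrupted MAC on input' line."""
    events = []
    for line in lines:
        m = SSHD_LINE.match(line.strip())
        if m and m.group("msg").startswith("Corrupted MAC on input"):
            events.append((m.group("stamp"), m.group("host"), int(m.group("pid"))))
    return events

def events_per_hour(events):
    """Count events per 'Mon DD HH' bucket to see whether they cluster in time."""
    return Counter(stamp[:9] for stamp, _host, _pid in events)

# The two lines from auth.log quoted in this post
sample = [
    "Sep 21 07:59:12 mangosta sshd[48978]: Corrupted MAC on input.",
    "Sep 21 07:59:12 mangosta sshd[48978]: Disconnecting: Packet corrupt",
]
for stamp, host, pid in corrupted_mac_events(sample):
    print(stamp, host, pid)
```

To run it against a real log, pass an open file object (for example the result of open("/var/log/auth.log")) instead of the sample list.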
Short answer: NO.
The problem was that the origin server was under heavy load: it was running out of memory and swapping a lot (taken from the output of the top command):
Swap: 1024M Total, 577M Used, 447M Free, 56% Inuse
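For anyone who wants to keep an eye on this while a long sync is running, here is a rough sketch that parses a Swap line like the one above and works out the percentage in use. The function names are hypothetical, and the parsing assumes the sizes carry a K, M or G suffix as in the line quoted here.

```python
import re

# Multipliers to normalise sizes to megabytes
SIZE_UNITS = {"K": 1 / 1024, "M": 1, "G": 1024}

def parse_swap_line(line):
    """Parse a top 'Swap:' line into {'total': MB, 'used': MB, 'free': MB}."""
    fields = {}
    for value, unit, label in re.findall(r"(\d+)([KMG])\s+(Total|Used|Free)", line):
        fields[label.lower()] = int(value) * SIZE_UNITS[unit]
    return fields

def swap_in_use_percent(line):
    """Percentage of swap in use, computed from the Used and Total fields."""
    swap = parse_swap_line(line)
    return 100 * swap["used"] / swap["total"]

line = "Swap: 1024M Total, 577M Used, 447M Free, 56% Inuse"
print(parse_swap_line(line))
print(int(swap_in_use_percent(line)))  # 577 / 1024 is about 56.3%, so this prints 56
```

The computed figure agrees with the 56% that top itself reports, which is a quick sanity check that the fields were read correctly.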
As soon as I stopped some services on that server, the load went down a little and there was plenty of memory for rsync to eat, and it worked nicely (as usual).
So, this is one of those cases where the error messages are totally useless. It was probably all rsync's fault, because it usually needs a lot of memory for that kind of sync (-vza), so once the system was running out of memory, rsync wasn't able to send data over the wire properly. On the other end, as soon as sshd received some malformed data from the origin server, it probably cut the connection with the Packet corrupt message.
Perhaps using wireshark or tcpdump to gather more information would be helpful in this kind of scenario.