sshd: Corrupted MAC on input.
rsync + Received disconnect from xxx.xxx.xxx.xxx: 2: Packet corrupt
Quite a strange error, don't you think?
It took me some time to realize what was happening, so I hope sharing this information with the rest of the world will help somebody in the future.
Some days ago I was syncing some data between two FreeBSD 8.x servers using rsync when the data transmission suddenly stopped and I saw this on the origin server:
Received disconnect from xxx.xxx.xxx.xxx: 2: Packet corrupt
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: connection unexpectedly closed (14093 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(601) [sender=3.0.7]
(I've replaced the actual host IP address with xxx.xxx.xxx.xxx)
At first I thought it must have been some network error, so I tried again but, after a while, I found that the error was appearing randomly.
Next thing I did was to check the logs in the destination server, and I found some interesting lines in /var/log/auth.log:
Sep 21 07:59:12 mangosta sshd: Corrupted MAC on input.
Sep 21 07:59:12 mangosta sshd: Disconnecting: Packet corrupt
(mangosta being the hostname of the destination server)
Wow, "Corrupted MAC on input"? Could it be that someone was doing something nasty with my servers?
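If the error shows up at random, it can help to pull every occurrence out of the log and compare the timestamps with what the machines were doing at the time. Here is a minimal sketch in Python, assuming the log lines look like the two shown above; the function name `find_mac_errors` and the regular expression are my own illustration, not part of sshd or any standard tool.

```python
import re

# Assumed syslog layout: "Mon DD HH:MM:SS host sshd[pid]: message"
# (the "[pid]" part is optional, matching the lines quoted above).
LOG_LINE = re.compile(
    r"^(\w{3}\s+\d+\s[\d:]{8})\s(\S+)\ssshd(?:\[\d+\])?:\s(.*)$"
)

def find_mac_errors(lines):
    """Return (timestamp, hostname) for every 'Corrupted MAC on input' line."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and "Corrupted MAC on input" in match.group(3):
            hits.append((match.group(1), match.group(2)))
    return hits

sample = [
    "Sep 21 07:59:12 mangosta sshd: Corrupted MAC on input.",
    "Sep 21 07:59:12 mangosta sshd: Disconnecting: Packet corrupt",
]
print(find_mac_errors(sample))  # [('Sep 21 07:59:12', 'mangosta')]
```

To run it against a real log, pass an open file instead of `sample`, for example `find_mac_errors(open("/var/log/auth.log"))`.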
Short answer: NO.
The problem was that the origin server was under heavy load: it was running out of memory and swapping a lot (extracted from the output of the top command):
Swap: 1024M Total, 577M Used, 447M Free, 56% Inuse
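For anyone who wants to watch for this condition automatically, the swap line from top can be parsed with a few lines of Python. This is only a sketch: it assumes the line has exactly the format shown above, and the helper names `parse_swap_line` and `swap_in_use_percent` are made up for this example.

```python
import re

def parse_swap_line(line):
    """Parse a top 'Swap:' summary line into a dict of sizes in megabytes."""
    scale = {"K": 1 / 1024, "M": 1, "G": 1024}  # convert each unit to MB
    fields = {}
    for value, unit, label in re.findall(r"(\d+)([KMG])\s+(Total|Used|Free)", line):
        fields[label.lower()] = int(value) * scale[unit]
    return fields

def swap_in_use_percent(line):
    """Recompute the percentage of swap in use from the Used and Total fields."""
    fields = parse_swap_line(line)
    return 100.0 * fields["used"] / fields["total"]

line = "Swap: 1024M Total, 577M Used, 447M Free, 56% Inuse"
print(parse_swap_line(line))             # {'total': 1024, 'used': 577, 'free': 447}
print(round(swap_in_use_percent(line)))  # 56
```

The recomputed value matches the 56% that top reported. What counts as "too much" swap before starting a big rsync is something you would have to decide for your own machine.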
As soon as I stopped some services on that server, the load went down a little, there was plenty of memory for rsync to eat, and it worked nicely (as usual).
So, this is one of those cases where the error messages are totally useless. It was probably all rsync's fault, because it usually needs a lot of memory to do that kind of sync (-vza), so as soon as the system was running out of memory, rsync wasn't able to send data over the wire properly. At the other end, as soon as sshd received some malformed data from the origin server, it probably cut the connection with the "Packet corrupt" message.
Perhaps using wireshark or tcpdump to gather some more information could be helpful in this kind of scenario.
Thanks for posting this information. I'm having the same issue using scp to download a database export. My hosting provider tried to blame it on the network... but given prior experience on my shared server, I know it's a lack of resources as you've explained. Thanks for writing this explanation up. I'm going to share it with my hosting provider and see if they have a solution for me as a result.
I'd like to express my thanks as well. I have a small embedded device being backed up nightly by Dirvish (using rsync) and have had the corrupted MAC error the last 3 nights. Worked well for quite a while, system update recently has knocked something out of whack. Amongst other things it runs rtorrent 24/7, occasionally getting up to 70% load, though most of the time it's below 30%. Really don't know why rtorrent does that, have tweaked on it quite extensively. Anyhow, I'll just try putting in a cron job to suspend the rtorrent to give it some breathing room during the backup.
An update on my problem for future searchers of this bizarre problem.
I placed my embedded computer in a less accessible place and had only been doing soft resets after OS updates. Today I did a hard reset (power off, on) and it solved the problem. The load in my case had nothing to do with it, even with a starting load of close to 0.00 it swings up to 1.7, even 2.5 briefly from a very busy rsync. Yet works fine.
Hope you are as fortunate.
Should have added that this is the load on the embedded computer being backed up, which is where my problem was occurring.
Sorry for the repeat post, no editing option.
Thanks for posting this information!
I came across the same problem recently on some crappy hardware and would have taken it for another hardware problem. Thanks to this page, I found out this one at least was a software bug, so I could go on and copy the data another way.
FWIW, killing most other processes let rsync get a little further, but not enough in my case. rsync version (and everything else) was Debian squeeze.