[BackupPC-users] BackupPC and PowerEdge E1410 CPU 1 IERR
Jonathan Dill
2006-12-15 15:01:21 UTC
Maybe this is a shot in the dark. I have already asked on the Linux
PowerEdge mailing list and am just hoping that someone has had a very
similar problem and can help narrow down the possibilities.

I have a PowerEdge 1900 with Ubuntu Dapper 6.06.1 LTS x86_64 with dual
Xeon 5110 processors, PERC 5/i with 2 SATA drives in RAID1, 3 more
drives on motherboard SATA with LVM for the backup data. The server is
stable as long as I don't run BackupPC. I called Dell and they
recommended taking out the Intel add-in gigabit card and trying the onboard
Broadcom. I tried that and it still crashed with the same error, so it's
not the network card. I have run a regular rsync of about 200 GB from
the server to another computer, and that worked fine, and that was more
CPU and I/O intensive than BackupPC. I really need to get some more
time to do more hardware diagnostics, but I'm having to do it off hours
and get someone there to "escort" me so it's been hard to schedule downtime.

I disabled all but one host, then ran "BackupPC_dump -v -f host" from the
console while also tailing /var/log/messages. There were no errors in
the system log. The dump was just a ways into the stream of dumping
files, create / pool, and toward the end there are a few "Can't link X to"
errors. I started the dump around 11pm and it crashed at 4:13am. This
was only supposed to be about 18 GB, so it seems like it should have been
quicker than that.

There are a lot of hardware things that it could be: CPU, memory, riser
board, PERC controller, motherboard. The disks that it is dumping to
are on the motherboard SATA controller and not on the PERC. Any ideas
that could help me to pinpoint this problem?

Thanks,
Jonathan
Guus Houtzager
2006-12-15 15:21:11 UTC
On Fri, 2006-12-15 at 10:01 -0500, Jonathan Dill wrote:

[...]

Ok, more details please, you're being too vague. Is your linux box
crashing (kernel oops, freeze, spontaneous reboot) or is just the backup
failing? What version of backuppc are you running? If you get an oops,
can you post it here? What are the last lines of the LOG file for the
host you tried?
What backup method are you using (rsync/ssh, rsyncd, smb, tar)? What
filesystem do you use for your (c)pool? How many files does the backup
you tried consist of?

Usually if it's hardware related, the memory is the culprit. Try a
memtest86+ run and see if that shows any errors.

Hth,

Guus
Jonathan Dill
2006-12-15 15:49:53 UTC
Post by Guus Houtzager
Ok, more details please, you're being too vague. Is your linux box
crashing (kernel oops, freeze, spontaneous reboot) or is just the backup
failing? What version of backuppc are you running? If you get an oops,
can you post it here? What are the last lines of the LOG file for the
host you tried?
No kernel oops, no spontaneous reboots, no errors in the system log, no
thermal warnings. The system freezes and the LCD on the front bezel
shows "E1410 CPU 1 IERR". The Dell tech told me this usually indicates
a problem on the PCI bus, most often the add-in network card, but we
ruled that out.

backuppc-2.1.2-2ubuntu5
perl-5.8.7-10ubuntu
rsync-2.6.6-1ubuntu2

***@imageserver:~# uname -a
Linux imageserver 2.6.15-27-amd64-server #1 SMP Fri Dec 8 18:02:49 UTC 2006 x86_64 GNU/Linux

LOG (entire contents):
2006-12-14 23:00:00 full backup started for directory DriveC
Tino Schwarze
2006-12-15 15:28:17 UTC
Post by Guus Houtzager
Usually if it's hardware related, the memory is the culprit. Try a
memtest86+ run and see if that shows any errors.
You might want to use the Dell tools to check the memory, since the Dell
guys usually want those results before they care.

HTH,

Tino.
--
www.quantenfeuerwerk.de
www.spiritualdesign-chemnitz.de
www.lebensraum11.de