Mailing List Archive



Re: [tlug] freeze on AMD64 Dual Opteron server



Marcus Metzler <mocm@example.com> writes:

>>>>>> "Evan" == Evan Monroig <evan.monroig@example.com> writes:
>
> Looks like it may be a problem with your network card (or network
> chipset on the board), since you get the lockup as soon as you start
> sending lots of data over the network.
> I used to have problems like that with the Realtek GBit cards when
> there were still problems with the driver.

Bingo !

I added another network card (a standard 10/100 card using the e100
driver), and I am in the process of copying these gigabytes (right now
the mileage is 2.5G) without any problems.

> Since you are running an SMP system, the problem with
> the network (or maybe some other) driver could be that it is not SMP safe.

I will try to investigate this.

By the way, the 'faulty' card identifies itself as:

0000:02:09.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5703X Gigabit Ethernet (rev 02)
        Subsystem: Tyan Computer: Unknown device 2885
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 64 (16000ns min), Cache Line Size: 0x10 (64 bytes)
        Interrupt: pin A routed to IRQ 193
        Region 0: Memory at fc6f0000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at fc6e0000 [disabled] [size=64K]
        Capabilities: [40] PCI-X non-bridge device.
                Command: DPERE- ERO- RBC=2 OST=0
                Status: Bus=2 Dev=9 Func=1 64bit+ 133MHz+ SCD- USC-, DC=simple, DMMRBC=2, DMOST=0, DMCRS=1, RSCEM-
        Capabilities: [48] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] Vital Product Data
        Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/3 Enable-
                Address: 05012201aa210200  Data: 0380

And I found one thread [1] from someone who had a similar problem and
solved it by using the bcm5700 driver instead of tg3.
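
Before swapping drivers it helps to confirm which one is actually bound
to each interface. The following is only a minimal sketch, assuming a
kernel that exposes sysfs under /sys with the usual
class/net/<iface>/device/driver symlink; the function name bound_drivers
and its sysfs_root parameter are made up for illustration.

```python
import os


def bound_drivers(sysfs_root="/sys"):
    """Map each network interface to the kernel driver bound to it.

    Assumes the sysfs layout class/net/<iface>/device/driver, where
    'driver' is a symlink whose last path component is the driver name.
    Interfaces without that symlink (e.g. loopback) map to None.
    """
    net_dir = os.path.join(sysfs_root, "class", "net")
    drivers = {}
    if not os.path.isdir(net_dir):
        return drivers
    for iface in sorted(os.listdir(net_dir)):
        link = os.path.join(net_dir, iface, "device", "driver")
        if os.path.islink(link):
            # The symlink target ends in the driver name, e.g. ".../drivers/tg3"
            drivers[iface] = os.path.basename(os.readlink(link))
        else:
            drivers[iface] = None
    return drivers


if __name__ == "__main__":
    for iface, driver in bound_drivers().items():
        print(f"{iface}: {driver or 'no driver bound'}")
```

On a box like this one, the Broadcom interface would be expected to
show tg3 (or bcm5700 after switching) and the added card e100, but that
depends on which modules are loaded.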

> Usually there is no time to flush the logs before it freezes
> completely.

Yes, I tried to set up syslog to flush the logs after each write, but to
no avail.

> You could try booting a kernel without SMP to see if that is the problem. 
> If possible you can try a different NIC.
>
> If you get any clue as to what part of the system or which driver could
> be responsible you can contact the maintainer of that driver.

Now I guess I know, and I will contact the maintainer of the driver.

Thanks for everything,

Evan


[1] http://lists.suse.com/archive/suse-amd64/2004-Jun/0023.html

