I am seeing intermittent connectivity "dropouts" at one of my customers' locations. The drops only last a few seconds, so email, web and other services work without interruption, but the PPTP VPN drops and clients must reconnect manually. The customer has not used the VPN intensively until now, so this fault might have existed since the server was set up just before this summer.
The best log I have found that shows this is the syswatch log. Here is a typical example from today:
Thu Oct 9 12:09:58 2014 info: system - heartbeat... ...
Thu Oct 9 12:12:00 2014 info: eth0 - ping check on server #1 failed - (ISP gateway IP) ...
Thu Oct 9 12:12:05 2014 info: eth0 - ping check on server #2 failed - 69.90.141.72 ...
Thu Oct 9 12:12:05 2014 warn: eth0 - connection warning ...
Thu Oct 9 12:12:15 2014 info: eth0 - ping check on server #1 passed - (ISP gateway IP) ...
Thu Oct 9 12:19:15 2014 info: system - heartbeat... ...
At best there are hours or days between drops, but sometimes it is only 5-20 minutes. Drops have also occurred when there was no specific user activity such as VPN.
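To put numbers on how long each drop lasts and how far apart the drops are, the syswatch log can be scanned with a small script. This is only a rough sketch: it assumes every line looks like the excerpt above (timestamp, level, message), treats the first failed ping check as the start of a drop and the next passed ping check as its end, and the function names and the default path /var/log/syswatch are my own assumptions.

```python
import re
from datetime import datetime

# Assumed line shape, copied from the syswatch excerpt:
#   "Thu Oct 9 12:12:00 2014 info: eth0 - ping check on server #1 failed - ..."
LINE_RE = re.compile(
    r"^(\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})\s+(\w+):\s+(.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    # Collapse runs of spaces so "Oct  9" and "Oct 9" parse the same way.
    stamp = " ".join(match.group(1).split())
    when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    return when, match.group(2), match.group(3)


def find_outages(lines):
    """Pair the first failed ping check with the next passed one."""
    outages = []
    start = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, _level, message = parsed
        if "ping check" not in message:
            continue
        if "failed" in message and start is None:
            start = when
        elif "passed" in message and start is not None:
            outages.append((start, when))
            start = None
    return outages


def report(outages):
    """Print each drop, its length, and the time since the previous drop ended."""
    previous_end = None
    for start, end in outages:
        gap = ""
        if previous_end is not None:
            gap = ", %s since previous drop" % (start - previous_end)
        print("%s  down ~%ds%s" % (start, (end - start).total_seconds(), gap))
        previous_end = end


def main(path="/var/log/syswatch"):  # path is an assumption, adjust as needed
    with open(path) as handle:
        report(find_outages(handle))
```

Calling main() on the server, or main("copy-of-syswatch.log") on a copy of the file, prints one line per drop. The drop length is only as precise as the syswatch ping interval, so treat it as an estimate.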
COS is on an HP DL380P with an HP Ethernet 1Gb 4-port 331FLR Adapter.
eth0 is the WAN interface. Its IP is set manually.
WAN is connected to a fibre line via a Cisco 24-port switch (I don't have the model number at hand), which is VLAN-ed by the ISP; we are the only customer on this switch. The cable is 7 metres of good-quality Cat6 UTP.
The ISP has checked their logs and found nothing logged since June 2014. COS, however, reports a lot of drops in the syswatch logs that are still there.
I don't know whether this is related to hardware/drivers or to COS, but I suspect hardware/drivers.
I have not been on site since we discovered this fault. I will of course try switching to another port on the HP 4-port card and replacing the cable as soon as I get there (tomorrow, hopefully).
Are there any good methods to find out where the fault is? I have located the drivers page at HP.com, but I am not experienced enough to dare try this without help.
There is "Premium subscription" support, so I am considering opening a ticket, but if this is possible to fix myself... :-)
Responses (4)
Accepted Answer
Thanks a lot for your help, Nick. This COS is a Pro, and uptime is too essential to dare play with kernels etc. ;-)
Now, when I searched the scary interweb for possible hints, I saw several people complaining about instability involving ACK/ARP, virtual OSes and these NICs. This made me suspicious that iLO 4 (HP Integrated Lights-Out 4) might be the troublemaker, as I had configured it to share eth0 with COS. I have never got this port sharing to work as intended, even though I followed HP's recipe. Anyway, I went to the customer, reconfigured iLO to use its dedicated port, and voila&presto&all: stable as it should be. Knock on wood.
When I have more time, I will dig a bit more into this iLO shared-NIC setup, as it makes it possible to access the BIOS and maintenance tools even when the "lights are out" and the OS is not responding. Maybe not very important, since COS has a tendency to work as it should with no "lights out" :woohoo:
Accepted Answer
3.124 is not the most recent. The most recent kmod driver on ELRepo is 3.133, so you could try Tim's driver, which is that version. On HP's site they appear to have v3.136, but you'll need to find the sources for it so it can be recompiled. Having said that, if you are using the Community version of ClearOS you can update your kernel (or perhaps just reboot if the newer kernel is already installed). The latest kernel (2.6.32-431.23.3.v6.x86_64) has v3.132 of the tg3 driver, which is pretty recent.
I don't know which log files can be used for troubleshooting. Try all of them, especially messages and system.
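One way to "try all of them" without reading every file end to end is to pull out only the lines written around the time of a drop. The sketch below is a guess at how that could be done: it assumes the file uses the traditional syslog stamp ("Oct  9 12:12:01 host ...", no year), and the helper names, the 60-second window and the keywords "eth0" and "tg3" are my own choices, not anything ClearOS defines.

```python
from datetime import datetime, timedelta


def syslog_time(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a traditional syslog line.

    Syslog stamps carry no year, so the caller supplies one.
    Returns None when the line does not start with such a stamp.
    """
    parts = line.split(None, 3)
    if len(parts) < 3:
        return None
    text = "%d %s %s %s" % (year, parts[0], parts[1], parts[2])
    try:
        return datetime.strptime(text, "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def lines_near(lines, when, window_s=60, keywords=("eth0", "tg3")):
    """Return lines within window_s seconds of `when` that mention a keyword."""
    window = timedelta(seconds=window_s)
    hits = []
    for line in lines:
        stamp = syslog_time(line, when.year)
        if stamp is None or abs(stamp - when) > window:
            continue
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(line.rstrip("\n"))
    return hits
```

For example, with open("/var/log/messages") as f: lines_near(f, datetime(2014, 10, 9, 12, 12, 5)) would list anything mentioning eth0 or tg3 within a minute of the drop shown in the syswatch excerpt. An empty result is informative too, since it suggests the kernel did not log a link event at that moment.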
Accepted Answer
Ahh, I forgot the version. It is 64-bit: 2.6.32-358.23.2.v6.x86_64
The driver seems to be version 3.124, so if I understand correctly that is as new as possible?
Are there any other log files I could check for signs or causes, apart from syswatch?
EDIT: Found that the HP 331FLR NIC has a Broadcom BCM5719 chip, if that clarifies anything.. :-)
Accepted Answer
If you suspect drivers, can you check your current version with "modinfo tg3"? Also, are you running 32bit or 64bit ClearOS (do a "uname -r" if you don't know)? Tim's latest kmod-tg3 driver is here for 64bit and here for 32bit. The 64bit one is the latest kmod one available. The 32bit one is a little older. On their site HP appear to have slightly later drivers (3.136), but I can't see their sources, which you would need in order to compile a ClearOS-compatible driver - you can't use the RHEL drivers directly.
If you install new drivers it is best to reboot to have them take effect. Alternatively you could try restarting networking from the terminal (not remotely).
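If it helps, the version check can be scripted so the installed tg3 version is compared against a candidate before anything is installed. This is just a sketch: it assumes modinfo is on the PATH and that "modinfo -F version tg3" prints a dotted version such as 3.124, and the function names are made up for this example.

```python
import re
import subprocess


def version_tuple(text):
    """Turn a version string such as '3.124' into (3, 124) for comparison.

    Only the digit groups are used, so a trailing letter suffix is ignored.
    """
    return tuple(int(part) for part in re.findall(r"\d+", text))


def installed_tg3_version():
    """Ask modinfo for the version field of the tg3 module."""
    output = subprocess.check_output(["modinfo", "-F", "version", "tg3"])
    return output.decode().strip()


def is_older(installed, candidate):
    """True when `installed` is an older version than `candidate`."""
    return version_tuple(installed) < version_tuple(candidate)
```

With the numbers from this thread, is_older("3.124", "3.133") is True, so the kmod driver would be an upgrade. On the server itself, is_older(installed_tg3_version(), "3.133") does the same check against whatever is currently installed.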