Resolved
0 votes
ClearOS gateway keeps dropping internet connectivity 2 - 3 times per week.
Syswatch log shows this -

Tue Mar 3 10:10:56 2015 info: eth0 - ping check on server #1 failed - 68.185.15.105 ...
Tue Mar 3 10:11:01 2015 info: eth0 - ping check on server #2 failed - 69.90.141.72 ...
Tue Mar 3 10:11:01 2015 warn: eth0 - connection warning ...
Tue Mar 3 10:11:13 2015 info: eth0 - ping check on gateway failed - 68.185.15.105 ...
Tue Mar 3 10:11:15 2015 info: eth0 - ping check on server #1 failed - 68.185.15.105 ...
Tue Mar 3 10:11:20 2015 info: eth0 - ping check on server #2 failed - 69.90.141.72 ...
Tue Mar 3 10:11:20 2015 warn: eth0 - connection warning ...
Tue Mar 3 10:11:30 2015 info: system - heartbeat... ...
Tue Mar 3 10:11:32 2015 info: eth0 - ping check on gateway failed - 68.185.15.105 ...
Tue Mar 3 10:11:34 2015 info: eth0 - ping check on server #1 failed - 68.185.15.105 ...
Tue Mar 3 10:11:39 2015 info: eth0 - ping check on server #2 failed - 69.90.141.72 ...
Tue Mar 3 10:11:39 2015 warn: eth0 - connection warning ...
Tue Mar 3 10:11:49 2015 info: eth0 - ping check on server #1 passed - 68.185.15.105 ...
Tue Mar 3 10:20:49 2015 info: system - heartbeat... ...

Power cycling the system is the only way I can bring connectivity back. Please help.
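To see how often the link actually drops, the syswatch warnings can be counted per interface. The following is only a sketch: it assumes every log line has the same layout as the excerpt above, and the function name `summarise_syswatch` is made up for illustration.

```shell
#!/bin/sh
# Summarise syswatch connection warnings per interface.
# Assumption: lines look like the excerpt in this post, e.g.
#   Tue Mar 3 10:11:01 2015 warn: eth0 - connection warning ...
summarise_syswatch() {
    # $1 = path to a syswatch log file
    awk '
        / warn: / && /connection (warning|is down)/ {
            # Fields: Day Mon DD HH:MM:SS YYYY level: iface - message
            iface = $7
            count[iface]++
            last[iface] = $2 " " $3 " " $4
        }
        END {
            for (i in count)
                printf "%s %d last=%s\n", i, count[i], last[i]
        }
    ' "$1"
}
```

It would be called with the log file as its only argument, for example `summarise_syswatch /var/log/syswatch` (that path is an assumption, adjust it to wherever syswatch writes its log on your system).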

Here's some System info -
Version ClearOS Community release 6.6.0 (Final)
Kernel Version 2.6.32-504.8.1.v6.x86_64

02:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
Subsystem: Super Micro Computer Inc Device 10d3
Kernel driver in use: e1000e
Kernel modules: e1000e
03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
Subsystem: Super Micro Computer Inc Device 10d3
Kernel driver in use: e1000e
Kernel modules: e1000e


2.6.32-504.8.1.v6.x86_64

[root@pathwayinc ~]# modinfo e1000e
filename: /lib/modules/2.6.32-504.8.1.v6.x86_64/kernel/drivers/net/e1000e/e1000e.ko
version: 2.3.2-k
license: GPL
description: Intel(R) PRO/1000 Network Driver
author: Intel Corporation, <linux.nics@intel.com>
srcversion: AFD3474CD5CF7B1D88BDD10

Thank you in advance.
Thursday, March 05 2015, 06:07 PM
Responses (19)
  • Accepted Answer

    Tuesday, November 29 2016, 05:56 PM - #Permalink
    Resolved
    0 votes
    walter ferry dissmann wrote:
    Good to know, and to avoid those nics for my future servers ;)
    Even Windoze does not come with every driver. The only real difference is that Windoze drivers come pre-compiled, whereas Linux drivers generally need compiling. The r8168 is fine with the correct drivers. The e1000e problem may also affect only certain cards, but I'm not sure. I think e1000e cards are now obsolete, replaced by igb-supported cards and others.
  • Accepted Answer

    Tuesday, November 29 2016, 01:25 PM - #Permalink
    Resolved
    0 votes
    Nick Howitt wrote:

    It is not just a ClearOS issue. If you follow the links through, the solution was found on an Ubuntu machine. This is a completely different stable to FC/RHEL/CentOS/ClearOS, so I'd assume it is a Linux-wide issue.

    Re the r8168, this has never been part of the kernel so the issue will affect at least all FC/RHEL/CentOS/ClearOS derivatives and may also be Linux-wide if other distro streams have not backported the r8168 stuff into the r8169 driver.


    Good to know, and to avoid those nics for my future servers ;)
  • Accepted Answer

    Tuesday, November 29 2016, 12:56 PM - #Permalink
    Resolved
    0 votes
    It is not just a ClearOS issue. If you follow the links through, the solution was found on an Ubuntu machine. This is a completely different stable to FC/RHEL/CentOS/ClearOS, so I'd assume it is a Linux-wide issue.

    Re the r8168, this has never been part of the kernel so the issue will affect at least all FC/RHEL/CentOS/ClearOS derivatives and may also be Linux-wide if other distro streams have not backported the r8168 stuff into the r8169 driver.
  • Accepted Answer

    Tuesday, November 29 2016, 12:49 PM - #Permalink
    Resolved
    0 votes
    Nick Howitt wrote:

    Note that from the thread I linked to, upgrading the driver did not help - the OP started with the kmod driver then downgraded but neither helped. Turning off tso seemed to be the only solution, whichever driver is used.


    I think it worked. Good and simple stuff, but why does ClearOS struggle with such a popular NIC? I have two on that server that gave me trouble: the RTL8168 and the e1000e.

    The RTL8168 was using the 8169 driver. And now the e1000e. :/

    Thanks a lot for the help. I searched a lot and tried different solutions, and the fix was really simple!
  • Accepted Answer

    Monday, November 28 2016, 09:11 PM - #Permalink
    Resolved
    1 votes
    Note that from the thread I linked to, upgrading the driver did not help - the OP started with the kmod driver then downgraded but neither helped. Turning off tso seemed to be the only solution, whichever driver is used.
  • Accepted Answer

    Monday, November 28 2016, 07:28 PM - #Permalink
    Resolved
    0 votes
    Nick Howitt wrote:

    I don't think it will help because the only recent issues I've heard about the e1000e driver are in the thread I linked to, but you are welcome to try my driver. If you are miles away, once installed, you will need to reboot the server to make the change take effect. Anything else risks losing the interface and should only be attempted while on site.


    Well, I turned off TSO and GSO.

    Let's see. If I get 8 days of uptime I'm happy, because I will be near the server!

    Thanks a lot!
  • Accepted Answer

    Monday, November 28 2016, 07:26 PM - #Permalink
    Resolved
    0 votes
    I don't think it will help because the only recent issues I've heard about the e1000e driver are in the thread I linked to, but you are welcome to try my driver. If you are miles away, once installed, you will need to reboot the server to make the change take effect. Anything else risks losing the interface and should only be attempted while on site.
  • Accepted Answer

    Monday, November 28 2016, 07:11 PM - #Permalink
    Resolved
    0 votes
    Nick Howitt wrote:

    You can't use the ElRepo drivers directly. You need to recompile the source against the ClearOS kernel, but I don't think that will help you. If you think it will help I have a compiled version here.

    Have a look at this thread and see if it helps.


    Why do you think it won't help me?
    Is the only solution to change the NIC or install another one?

    Another question: let's say new drivers will do the trick, can I do this remotely? Because I'm 600 km away from the server. Lol :(

    Will I lose connectivity by updating the driver?

    Thanks for the reply, man. I will do this:

    ethtool -K eno1 tso off


    And see if it works
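For anyone repeating this, here is a small sketch of the TSO/GSO change together with a check of the result. The wrapper names (`run`, `disable_offloads`) and the `DRY_RUN` switch are my own illustration, not anything ClearOS ships. `ethtool -K` changes offload settings and lower-case `ethtool -k` lists them.

```shell
#!/bin/sh
# Sketch: turn off TCP segmentation offload (TSO) and generic
# segmentation offload (GSO) on one interface, then list the settings.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

disable_offloads() {
    iface="$1"
    run ethtool -K "$iface" tso off gso off
    # Lower-case -k shows the current offload settings for checking.
    run ethtool -k "$iface"
}
```

Calling `disable_offloads eno1` prints the two commands. With `DRY_RUN=0` it runs them as root. Settings made with `ethtool -K` do not survive a reboot, so they would need to be reapplied at boot by whatever startup mechanism you prefer.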
  • Accepted Answer

    Monday, November 28 2016, 07:06 PM - #Permalink
    Resolved
    0 votes
    You can't use the ElRepo drivers directly. You need to recompile the source against the ClearOS kernel, but I don't think that will help you. If you think it will help I have a compiled version here.

    Have a look at this thread and see if it helps.
  • Accepted Answer

    Monday, November 28 2016, 06:13 PM - #Permalink
    Resolved
    0 votes
    Same issue here: ClearOS 7.2 with e1000e.


    [root@fwrps ~]# ethtool -i eno1
    driver: e1000e
    version: 3.2.5-k
    firmware-version: 0.8-4
    bus-info: 0000:00:1f.6
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: no



    Mon Nov 28 10:09:33 2016 info: eno1 - ping check on gateway failed - 192.168.0.1
    Mon Nov 28 10:09:35 2016 debug: eno1 - ping check on server #1 failed - 8.8.8.8 (ping size: 1)
    Mon Nov 28 10:09:37 2016 info: eno1 - ping check on server #1 failed - 8.8.8.8
    Mon Nov 28 10:09:44 2016 info: eno1 - ping check on server #2 failed - 54.152.208.245
    Mon Nov 28 10:09:44 2016 warn: eno1 - connection is down
    Mon Nov 28 10:09:46 2016 info: eno1 - waiting for static IP reconnect
    Mon Nov 28 10:09:46 2016 info: system - changing active WAN list - none (was eno1)
    Mon Nov 28 10:09:46 2016 info: system - current WANs in use - none


    So after a while I lose connectivity.

    Simply adding ElRepo and installing with yum install kmod-e1000e gives a bunch of dependency-related errors.

    Any workarounds? Even scripting something like restarting the interface? Or even changing this automatic behaviour of switching the "active WAN" to none?
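On the scripting idea, a restart-the-interface watchdog could look roughly like the sketch below. It is untested against this fault and everything in it (`watchdog_once`, `PROBE`, `DRY_RUN`) is a hypothetical name. Note that the original poster in this thread could only recover by power cycling, so bouncing the interface may not be enough.

```shell
#!/bin/sh
# Sketch of a one-shot link check intended to be run from cron.
# DRY_RUN=1 (the default here) only reports what would be done.
DRY_RUN="${DRY_RUN:-1}"
# The probe command is overridable so the logic can be tried offline.
PROBE="${PROBE:-ping -c 3 -W 2}"

link_ok() {
    # $1 = address to probe; succeeds if the probe command succeeds
    $PROBE "$1" >/dev/null 2>&1
}

restart_iface() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would restart $1"
    else
        ifdown "$1" && ifup "$1"
    fi
}

watchdog_once() {
    iface="$1"
    gateway="$2"
    if link_ok "$gateway"; then
        echo "ok"
    else
        restart_iface "$iface"
    fi
}
```

With the addresses from the log above it would be called as `watchdog_once eno1 192.168.0.1`.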


    Thanks in Advance.
  • Accepted Answer

    Friday, March 13 2015, 10:59 PM - #Permalink
    Resolved
    0 votes
    Sorry for not posting sooner. Worked on it last Friday and now...it's been 7 days without losing connectivity. Nick - Thank you for your help. It's greatly appreciated.
  • Accepted Answer

    Friday, March 06 2015, 09:07 PM - #Permalink
    Resolved
    0 votes
    I've compiled and uploaded a version of the e1000e driver for ClearOS 6.6 which you are welcome to try, but please uninstall your current kmod-e1000e driver first with:
    rpm -e kmod-e1000e
  • Accepted Answer

    Friday, March 06 2015, 07:29 PM - #Permalink
    Resolved
    0 votes
    Thank you, Nick. Very much appreciated. I have this weekend to work on this. I'll post more info then.

    Again, thank you.
  • Accepted Answer

    Friday, March 06 2015, 07:07 PM - #Permalink
    Resolved
    0 votes
    Digging further into the rpm, there is a post-install script which runs a command similar to:
    /sbin/weak-modules --add-kernel --dry-run --verbose 2.6.32-504.8.1.v6.x86_64
    without the dry-run and verbose options. This throws an error:
    Module igb.ko from kernel 2.6.32-431.23.3.v6.x86_64 is not compatible   with kernel 2.6.32-504.8.1.v6.x86_64 in symbols: set_ethtool_ops_ext
    Perhaps this is why it did not work and you may need a new version compiled but I am a little out of my depth.
  • Accepted Answer

    Friday, March 06 2015, 05:14 PM - #Permalink
    Resolved
    0 votes
    I can recompile the driver against the current kernel so it will install correctly but that defeats the point of a kmod driver. I've pinged an e-mail to Tim and to the Elrepo mailing list. Let's see what happens.

    It is probably safe to copy the driver across, or perhaps copy it into /lib/modules/2.6.32-504.8.1.v6.x86_64/extra/e1000e then run "depmod -a" then reboot. Back up or move files rather than delete them.
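The manual copy described above might be scripted along these lines. This is a sketch only: the helper names and the `DRY_RUN` switch are invented for illustration, and a module built for a different kernel may still refuse to load if the kernel symbols differ.

```shell
#!/bin/sh
# Sketch: place a driver module under extra/ for a given kernel,
# keeping any existing copy, then rebuild the module dependency map.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_module_copy() {
    src="$1"     # path to the e1000e.ko file to install
    kver="$2"    # target kernel version, as printed by uname -r
    dest="/lib/modules/$kver/extra/e1000e"
    run mkdir -p "$dest"
    # Move an existing module aside rather than deleting it.
    if [ -e "$dest/e1000e.ko" ]; then
        run mv "$dest/e1000e.ko" "$dest/e1000e.ko.bak"
    fi
    run cp "$src" "$dest/"
    run depmod -a "$kver"
}
```

For example, `install_module_copy /root/e1000e.ko 2.6.32-504.8.1.v6.x86_64` prints the planned steps (the source path here is a placeholder). A reboot is still needed afterwards for the new module to be loaded.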
  • Accepted Answer

    Thursday, March 05 2015, 08:59 PM - #Permalink
    Resolved
    0 votes
    Image file shows what's in the /lib/modules directory.
    I found 2.6.32-504.8.1.v6.x86_64/kernel/drivers/net that the modinfo e1000e is referencing.
    I also found the e1000e.ko file from the update I installed last night in the /lib/modules/2.6.32-431.3.1.v6.x86_64/extra/e1000e folder.

    I wonder if I can copy the e1000e.ko file from that folder to the /lib/modules/2.6.32-504.8.1.v6.x86_64/kernel/drivers/net folder and make it work? There's probably a better idea out there, such as a file I can modify to point to the /lib/modules/2.6.32-431.3.1.v6.x86_64/extra/e1000e folder. Please feel free to respond.

    Thank you again. http://www.clearfoundation.com/media/kunena/attachments/legacy/images/ClearOS-20150305.JPG
  • Accepted Answer

    Thursday, March 05 2015, 08:33 PM - #Permalink
    Resolved
    0 votes
    What you did should have worked. For mine, use -Uvh instead, as you are updating over Tim's package.

    I am wondering if the installation of the e1000e driver may be failing but I don't know enough. The equivalent r8168 one references "weak-updates":
    [root@server ~]# modinfo r8168
    filename: /lib/modules/2.6.32-504.8.1.v6.x86_64/weak-updates/r8168/r8168.ko
    version: 8.039.00-NAPI
    license: GPL
    description: RealTek RTL-8168 Gigabit Ethernet driver
    author: Realtek and the Linux r8168 crew <netdev@vger.kernel.org>
    srcversion: 1ABAB2C5CDB55DDB867B0D8
    etc
    My igb one seems to have the same issue but I don't know if it is a problem or not.
  • Accepted Answer

    Thursday, March 05 2015, 08:01 PM - #Permalink
    Resolved
    0 votes
    Thank you for the quick response.

    I ran this command last night and restarted the system after it "installed" -

    rpm -ivh ftp://download.clearfoundation.com/community/timb80/repo/clearos/6.3/testing/x86_64/kmod-e1000e-3.0.4-1.clearos.x86_64.rpm

    modinfo e1000e still shows this

    modinfo e1000e
    filename: /lib/modules/2.6.32-504.8.1.v6.x86_64/kernel/drivers/net/e1000e/e1000e.ko
    version: 2.3.2-k
    license: GPL

    Do I need to use "-Uvh" instead of "-ivh"?

    I did notice that there's a newer version on the link you gave me - (kmod-e1000e-3.1.0.2-1.clearos6.njh.x86_64.rpm) I can install this tonight.
    How do I know it installed correctly?
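One rough way to answer that question is to look at where modinfo finds the module. Judging by the paths quoted in this thread, the stock driver sits under kernel/drivers/ while add-on kmod builds land under extra/ or weak-updates/. The helper below is a sketch built on that assumption, and `classify_module_path` is a made-up name.

```shell
#!/bin/sh
# Sketch: classify a module path as the stock in-kernel driver or an
# add-on copy, based on the directory layout seen in this thread.
classify_module_path() {
    case "$1" in
        */kernel/drivers/*)          echo "stock" ;;
        */extra/*|*/weak-updates/*)  echo "add-on" ;;
        *)                           echo "unknown" ;;
    esac
}

# On a live system the path comes from the filename field of modinfo:
#   classify_module_path "$(modinfo -F filename e1000e)"
```

If this still reports "stock" after installing the rpm, the new module is not the one modprobe will pick up. After a reboot, `ethtool -i eth0` shows the version of the driver that is actually loaded.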


    Thank you again for your time.
  • Accepted Answer

    Thursday, March 05 2015, 07:34 PM - #Permalink
    Resolved
    0 votes
    The 82574L used to have a problem with the stock drivers but I thought that was with older versions of the driver. You can install Tim's e1000e driver from here which is the most recent. I've also got it compiled here.