Network interface cycles Link Yes/No when DHCP client

Bug

Resolved

Greetings,

I have a handful of ClearOS 7 Community systems that occasionally lose network connectivity on an interface. Typically it is the External one, and it seems to happen more often when the interface is set to receive an IP via DHCP.

Right now I have a system that is "broken", so I'm collecting data and posting here in hopes of getting to the bottom of this frustratingly random problem.

This particular machine has two network adapters: one is built into the Dell motherboard (enp0s25), and the other is an Intel PCIe add-on card (p1p1):



[root@system ~]# lspci -k | grep Eth -A 3

00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (Lewisville) (rev 04)

        Subsystem: Dell Device 047e

        Kernel driver in use: e1000e

        Kernel modules: e1000e

--

01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

        Subsystem: Intel Corporation Gigabit CT Desktop Adapter

        Kernel driver in use: e1000e

        Kernel modules: e1000e



[root@system ~]# ifconfig

enp0s25: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500

        inet 10.5.5.1  netmask 255.255.255.0  broadcast 10.5.5.255

        inet6 fe80::1a03:73ff:fe3b:e05f  prefixlen 64  scopeid 0x20<link>

        ether 18:03:73:3b:e0:5f  txqueuelen 1000  (Ethernet)

        RX packets 38362474  bytes 28872594295 (26.8 GiB)

        RX errors 0  dropped 226  overruns 0  frame 0

        TX packets 47796355  bytes 56127797835 (52.2 GiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

        device interrupt 20  memory 0xe1b00000-e1b20000



lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536

        inet 127.0.0.1  netmask 255.0.0.0

        inet6 ::1  prefixlen 128  scopeid 0x10<host>

        loop  txqueuelen 1000  (Local Loopback)

        RX packets 206246471  bytes 65738824235 (61.2 GiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 206246471  bytes 65738824235 (61.2 GiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0



p1p1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500

        inet 192.168.0.5  netmask 255.255.255.0  broadcast 192.168.0.255

        inet6 fe80::6a05:caff:fe80:1082  prefixlen 64  scopeid 0x20<link>

        ether 68:05:ca:80:10:82  txqueuelen 1000  (Ethernet)

        RX packets 56879014  bytes 67190367287 (62.5 GiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 46920170  bytes 31030020923 (28.8 GiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

        device interrupt 16  memory 0xe1ac0000-e1ae0000



[root@system ~]# grep IF /etc/clearos/network.conf

EXTIF="p1p1"

LANIF="enp0s25"

DMZIF=""

HOTIF=""

When I have p1p1 (external interface) set to DHCP, it will pull an address from the modem (192.168.0.1) it's connected to pretty much instantly: 192.168.0.5. However, despite receiving an address, I am unable to ping my modem/gateway (192.168.0.1) from the ClearOS box.

Running `arp` shows the gateway entry as INCOMPLETE. However, tcpdump shows that the box is sending ARP requests and receiving replies from the gateway that carry its MAC address. The reply doesn't seem to make it into the ARP table: the INCOMPLETE entry never clears, and the box keeps re-requesting the gateway's MAC address every second:



23:54:14.590636 ARP, Request who-has 192.168.0.1 tell 192.168.0.5, length 28

23:54:14.593611 ARP, Reply 192.168.0.1 is-at b0:b9:8a:05:e3:da, length 46

23:54:15.592637 ARP, Request who-has 192.168.0.1 tell 192.168.0.5, length 28

23:54:15.593598 ARP, Reply 192.168.0.1 is-at b0:b9:8a:05:e3:da, length 46
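
To check over a longer capture whether every ARP request really is being answered, the tcpdump text output can be paired up mechanically. The following is a minimal Python sketch, not a definitive tool: it assumes the one-line tcpdump format shown above, and `pair_arp` is a hypothetical helper name, not part of tcpdump or ClearOS.

```python
import re
from datetime import datetime

# Patterns for tcpdump's one-line ARP output, assumed from the capture above.
REQUEST = re.compile(r"^(\S+) ARP, Request who-has (\S+) tell (\S+),")
REPLY = re.compile(r"^(\S+) ARP, Reply (\S+) is-at ([0-9a-f:]{17}),")


def _stamp(text):
    # tcpdump's default timestamp is HH:MM:SS.microseconds with no date,
    # so a capture that crosses midnight would need extra handling.
    return datetime.strptime(text, "%H:%M:%S.%f")


def pair_arp(lines):
    """Return (answered, unanswered).

    answered   -- list of (target_ip, mac, delay_seconds), one per answered request
    unanswered -- sorted list of target IPs whose last request got no reply
    """
    pending = {}  # target IP -> time of the latest unanswered request
    answered = []
    for raw in lines:
        line = raw.strip()
        req = REQUEST.match(line)
        if req:
            pending[req.group(2)] = _stamp(req.group(1))
            continue
        rep = REPLY.match(line)
        if rep and rep.group(2) in pending:
            sent = pending.pop(rep.group(2))
            delay = (_stamp(rep.group(1)) - sent).total_seconds()
            answered.append((rep.group(2), rep.group(3), delay))
    return answered, sorted(pending)


# The four lines from the capture above.
sample = [
    "23:54:14.590636 ARP, Request who-has 192.168.0.1 tell 192.168.0.5, length 28",
    "23:54:14.593611 ARP, Reply 192.168.0.1 is-at b0:b9:8a:05:e3:da, length 46",
    "23:54:15.592637 ARP, Request who-has 192.168.0.1 tell 192.168.0.5, length 28",
    "23:54:15.593598 ARP, Reply 192.168.0.1 is-at b0:b9:8a:05:e3:da, length 46",
]

answered, unanswered = pair_arp(sample)
for ip, mac, delay in answered:
    print(f"{ip} is-at {mac} after {delay * 1000:.3f} ms")
print("unanswered:", unanswered)
```

If every request pairs with a reply within a few milliseconds, as in this sample, the replies are reaching the wire and the problem is more likely on the receiving host than at the gateway.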

If I wait 40-100 seconds or so, it will change the Link status in the Webconfig from "Yes" to "No" for about 2 seconds, and the IP address shown in Webconfig will disappear.

Running `ip monitor` during this time indicates that something is blowing the interface away in the system and re-creating it:



192.168.0.1 dev p1p1  FAILED

192.168.0.1 dev p1p1  FAILED

192.168.0.1 dev p1p1  FAILED

192.168.0.1 dev p1p1  FAILED

192.168.0.1 dev p1p1  FAILED

3: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default

    link/ether 68:05:ca:80:10:82 brd ff:ff:ff:ff:ff:ff

Deleted fe80::/64 dev p1p1 proto kernel metric 256 pref medium

Deleted 3: p1p1    inet 192.168.0.5/24 brd 192.168.0.255 scope global dynamic p1p1

       valid_lft 86365sec preferred_lft 86365sec

Deleted 192.168.0.0/24 dev p1p1 proto kernel scope link src 192.168.0.5

Deleted broadcast 192.168.0.255 dev p1p1 table local proto kernel scope link src 192.168.0.5

Deleted broadcast 192.168.0.0 dev p1p1 table local proto kernel scope link src 192.168.0.5

Deleted local 192.168.0.5 dev p1p1 table local proto kernel scope host src 192.168.0.5

Deleted 224.0.0.22 dev p1p1 lladdr 01:00:5e:00:00:16 NOARP

Deleted 1.1.1.1 dev p1p1 lladdr 00:00:00:00:00:00 PERMANENT

Deleted 8.8.8.8 dev p1p1 lladdr 00:00:00:00:00:00 PERMANENT

Deleted 224.0.0.251 dev p1p1 lladdr 01:00:5e:00:00:fb NOARP

Deleted 192.168.0.20 dev p1p1  FAILED

Deleted 192.168.0.1 dev p1p1  FAILED

3: p1p1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default

    link/ether 68:05:ca:80:10:82 brd ff:ff:ff:ff:ff:ff

Deleted ff02::1:ff80:1082 dev p1p1 lladdr 33:33:ff:80:10:82 NOARP

Deleted ff02::16 dev p1p1 lladdr 33:33:00:00:00:16 NOARP

Deleted ff00::/8 dev p1p1 table local metric 256 pref medium

Deleted 3: p1p1    inet6 fe80::6a05:caff:fe80:1082/64 scope link

       valid_lft forever preferred_lft forever

Deleted local fe80::6a05:caff:fe80:1082 dev lo table local proto unspec metric 0 pref medium

3: p1p1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default

    link/ether 68:05:ca:80:10:82 brd ff:ff:ff:ff:ff:ff

10.5.5.181 dev enp0s25 lladdr 9c:4e:36:3b:52:d0 REACHABLE

ff00::/8 dev p1p1 table local metric 256 pref medium

fe80::/64 dev p1p1 proto kernel metric 256 pref medium

3: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default

    link/ether 68:05:ca:80:10:82 brd ff:ff:ff:ff:ff:ff

3: p1p1    inet6 fe80::6a05:caff:fe80:1082/64 scope link

       valid_lft forever preferred_lft forever

local fe80::6a05:caff:fe80:1082 dev lo table local proto unspec metric 0 pref medium

3: p1p1    inet 192.168.0.5/24 brd 192.168.0.255 scope global dynamic p1p1

       valid_lft 86400sec preferred_lft 86400sec

local 192.168.0.5 dev p1p1 table local proto kernel scope host src 192.168.0.5

broadcast 192.168.0.255 dev p1p1 table local proto kernel scope link src 192.168.0.5

192.168.0.0/24 dev p1p1 proto kernel scope link src 192.168.0.5

broadcast 192.168.0.0 dev p1p1 table local proto kernel scope link src 192.168.0.5

default via 192.168.0.1 dev p1p1

192.168.0.1 dev p1p1  FAILED

192.168.0.1 dev p1p1  FAILED

`/var/log/messages` tells a similar story:



Nov  6 23:43:44 system kernel: e1000e: p1p1 NIC Link is Down

Nov  6 23:43:45 system kernel: IPv6: ADDRCONF(NETDEV_UP): p1p1: link is not ready

Nov  6 23:43:48 system kernel: e1000e: p1p1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

Nov  6 23:43:48 system kernel: IPv6: ADDRCONF(NETDEV_CHANGE): p1p1: link becomes ready

Nov  6 23:43:48 system dhclient[25693]: DHCPREQUEST on p1p1 to 255.255.255.255 port 67 (xid=0x7d3b331a)

Nov  6 23:43:48 system dhclient[25693]: DHCPACK from 192.168.0.1 (xid=0x7d3b331a)

Nov  6 23:43:50 system dhclient[25693]: bound to 192.168.0.5 -- renewal in 42130 seconds.

Nov  6 23:44:06 system dnsmasq-dhcp[1535]: DHCPINFORM(enp0s25) 10.5.5.181 9c:4e:36:3b:52:d0

Nov  6 23:44:06 system dnsmasq-dhcp[1535]: DHCPACK(enp0s25) 10.5.5.181 9c:4e:36:3b:52:d0 MarkMartin

Nov  6 23:44:19 system kernel: e1000e: p1p1 NIC Link is Down

Nov  6 23:44:19 system kernel: IPv6: ADDRCONF(NETDEV_UP): p1p1: link is not ready

Nov  6 23:44:22 system kernel: e1000e: p1p1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

Nov  6 23:44:22 system kernel: IPv6: ADDRCONF(NETDEV_CHANGE): p1p1: link becomes ready
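
To see how regular the link flapping is, the kernel lines can be reduced to Down/Up pairs. This is a rough Python sketch under stated assumptions: it only understands the e1000e message format shown above, syslog lines carry no year so one is passed in, month names are parsed in the default (English) locale, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches only the e1000e link messages in the format shown above.
LINK = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: e1000e: (\S+) NIC Link is (Up|Down)"
)


def link_events(lines, year=2019):
    """Return a list of (timestamp, interface, 'Up' or 'Down')."""
    events = []
    for raw in lines:
        m = LINK.match(raw.strip())
        if not m:
            continue
        # Collapse the padded day ("Nov  6") before parsing.
        stamp = " ".join(m.group(1).split())
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((when, m.group(2), m.group(3)))
    return events


def down_durations(events):
    """Pair each Down with the next Up on the same interface; return seconds down."""
    went_down = {}
    durations = []
    for when, iface, state in events:
        if state == "Down":
            went_down[iface] = when
        elif iface in went_down:
            started = went_down.pop(iface)
            durations.append((iface, (when - started).total_seconds()))
    return durations


# Link lines taken from the /var/log/messages excerpt above.
sample = [
    "Nov  6 23:43:44 system kernel: e1000e: p1p1 NIC Link is Down",
    "Nov  6 23:43:48 system kernel: e1000e: p1p1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx",
    "Nov  6 23:44:19 system kernel: e1000e: p1p1 NIC Link is Down",
    "Nov  6 23:44:22 system kernel: e1000e: p1p1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx",
]

events = link_events(sample)
durations = down_durations(events)
for iface, seconds in durations:
    print(f"{iface} was down for {seconds:.0f} s")
```

Short, evenly spaced outages like these would be consistent with a process deliberately cycling the interface rather than with a flaky cable, though the sketch itself cannot tell the two apart.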

I've tried increasing my kernel logging to Debug level to try getting a handle on this problem, but haven't been able to uncover anything yet.

Setting a Static IP seems to prevent it from destroying the interface constantly, but doesn't fix what seems to be an ARP issue.

Can someone please point me to the next step in diagnosing this? Many thanks!

network

In Other Network Topics

Thursday, November 07 2019, 05:28 AM

Accepted Answer

Nick Howitt

Monday, October 19 2020, 09:15 PM

If you're into hacking, have a look at /usr/sbin/syswatch around lines 666 to 669. This is where it takes down and brings back up the interface. You can probably add a:

            system("/usr/sbin/dhclient -r $extif >/dev/null 2>/dev/null");

            system("/usr/sbin/dhclient $extif >/dev/null 2>/dev/null");

I am not sure where they should appear in the sequence. It is something to play with.

Responses (31)

Nick Howitt
Thursday, November 07 2019, 09:18 AM

I am curious about your p1p1 interface as I didn't think that numbering system was used any more.

Can you have a look in /var/log/syswatch to see if it is bringing the interface down? It uses ping checking to test if the interface is up.

[edit]
What type of network is your modem on?
[/edit]

Marvin Martin

Thursday, November 07 2019, 01:35 PM

> I am curious about your p1p1 interface as I didn't think that numbering system was used any more.

Well, I don't know what to say to that: I see it pretty frequently when I put in PCIe network cards.

> Can you have a look in /var/log/syswatch to see if it is bringing the interface down? It uses ping checking to test if the interface is up.

Sure thing: Here's what I am seeing:



Wed Nov  6 23:39:01 2019  info:  system - WAN network is not up

Wed Nov  6 23:39:21 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)

Wed Nov  6 23:39:34 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)

Wed Nov  6 23:39:34 2019  warn:    p1p1 - connection is down

Wed Nov  6 23:39:36 2019  info:    p1p1 - restarting DHCP connection

Wed Nov  6 23:39:43 2019  info:  system - WAN network is not up

Wed Nov  6 23:40:03 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)

Wed Nov  6 23:40:16 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)

Wed Nov  6 23:40:16 2019  warn:    p1p1 - connection is down

Wed Nov  6 23:40:18 2019  info:    p1p1 - restarting DHCP connection

Now, when I changed the interface to Static, the behavior changed a bit, and it seems to restart the syswatch service itself:



Wed Nov  6 23:44:22 2019  info:    p1p1 - waiting for static IP reconnect

Wed Nov  6 23:44:22 2019  info:  system - WAN network is not up

Wed Nov  6 23:44:25 2019  info:  system - syswatch terminated

Wed Nov  6 23:44:25 2019  info:  system - syswatch started

Wed Nov  6 23:44:25 2019  info:  config - IP referrer tool is installed

Wed Nov  6 23:44:25 2019  info:  config - debug level - 0

Wed Nov  6 23:44:25 2019  info:  config - retries - 5

Wed Nov  6 23:44:25 2019  info:  config - heartbeat - 10

Wed Nov  6 23:44:25 2019  info:  config - interval - 60 seconds

Wed Nov  6 23:44:25 2019  info:  config - offline interval - 10 seconds

Wed Nov  6 23:44:25 2019  info:  config - referrer IP detection - enabled

Wed Nov  6 23:44:25 2019  info:  config - ping server auto-detect - enabled

Wed Nov  6 23:44:25 2019  info:  config - try pinging gateway - yes

Wed Nov  6 23:44:25 2019  info:  config - number of external networks - 1

Wed Nov  6 23:44:25 2019  info:  config - monitoring external network - p1p1

Wed Nov  6 23:44:25 2019  info:  config - number of standby networks - 0

Wed Nov  6 23:44:25 2019  info:    info - loading network configuration

Wed Nov  6 23:44:25 2019  info:    info - network configuration for p1p1 -                   config: ifcfg-p1p1

Wed Nov  6 23:44:25 2019  info:    info - network configuration for p1p1 -                   onboot: enabled

Wed Nov  6 23:44:25 2019  info:    info - network configuration for p1p1 -                     type: static

Wed Nov  6 23:44:25 2019  info:    info - network configuration for p1p1 -                     wifi: disabled

Wed Nov  6 23:44:25 2019  info:    info - network configuration for p1p1 -                  gateway: 192.168.0.1

Wed Nov  6 23:44:25 2019  info:    info - Overide configuration for p1p1 -                  ARP nud: enabled

Wed Nov  6 23:44:25 2019  info:    info - Overide configuration for p1p1 -Restart Static Conenction: disabled

Wed Nov  6 23:44:25 2019  info:    info - Overide configuration for p1p1 -             Ping Gateway: disabled

Wed Nov  6 23:44:25 2019  info:    info - Overide configuration for p1p1 -            Ping Server 1: disabled

Wed Nov  6 23:44:25 2019  info:    info - Overide configuration for p1p1 -            Ping Server 2: disabled

Wed Nov  6 23:44:25 2019  info:    info - Overide configuration for p1p1 -               Ping Proto: disabled

Wed Nov  6 23:44:25 2019  info:    info - Overide configuration for p1p1 -             Ping Timeout: disabled

Wed Nov  6 23:44:25 2019  info:    p1p1 - network - IP address - 192.168.0.5

Wed Nov  6 23:44:25 2019  info:    p1p1 - network - gateway - 192.168.0.1

Wed Nov  6 23:44:25 2019  info:    p1p1 - network - type - private IP range

Wed Nov  6 23:44:35 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)

Wed Nov  6 23:44:48 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)

Wed Nov  6 23:44:48 2019  warn:    p1p1 - connection is down

Wed Nov  6 23:44:50 2019  info:    p1p1 - waiting for static IP reconnect

Wed Nov  6 23:44:50 2019  info:    p1p1 - new IP address detected - 192.168.0.5

Wed Nov  6 23:45:00 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)

Wed Nov  6 23:45:13 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)

Wed Nov  6 23:45:13 2019  warn:    p1p1 - connection is down

Wed Nov  6 23:45:13 2019  info:  system - WAN network is not up

Wed Nov  6 23:45:13 2019  info:  system - restarting firewall

Wed Nov  6 23:49:54 2019  info:  system - updating intrusion prevention whitelist

Wed Nov  6 23:49:54 2019  info:  system - adding ping server 1.1.1.1

Wed Nov  6 23:49:54 2019  info:  system - adding ping server 8.8.8.8

Wed Nov  6 23:49:54 2019  info:  system - adding DNS server 192.168.0.1

Wed Nov  6 23:49:54 2019  info:  system - reloading intrusion prevention system

Wed Nov  6 23:50:26 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)

Wed Nov  6 23:50:39 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)

Wed Nov  6 23:50:39 2019  warn:    p1p1 - connection is down

Wed Nov  6 23:50:41 2019  info:    p1p1 - waiting for static IP reconnect

Wed Nov  6 23:50:41 2019  info:  system - WAN network is not up

Wed Nov  6 23:51:01 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)

Wed Nov  6 23:51:14 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)

Wed Nov  6 23:51:14 2019  warn:    p1p1 - connection is down

Wed Nov  6 23:51:16 2019  info:    p1p1 - waiting for static IP reconnect

Wed Nov  6 23:51:16 2019  info:  system - WAN network is not up

Wed Nov  6 23:51:36 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)

Wed Nov  6 23:51:49 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)

Wed Nov  6 23:51:49 2019  warn:    p1p1 - connection is down

Wed Nov  6 23:51:51 2019  info:    p1p1 - waiting for static IP reconnect

Wed Nov  6 23:51:51 2019  info:  system - WAN network is not up

And so forth.
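
For a log this repetitive, a tally per message plus the time between restarts is easier to read than the raw lines. The following Python sketch is one way to do that; it assumes the syswatch line layout shown above (timestamp, level, source, message), and `parse` and `summarize` are hypothetical helper names rather than anything shipped with syswatch.

```python
import re
from collections import Counter
from datetime import datetime

# Line layout assumed from the /var/log/syswatch excerpts above:
#   "Wed Nov  6 23:39:34 2019  warn:    p1p1 - connection is down"
ENTRY = re.compile(
    r"^(\w{3} \w{3}\s+\d+ \d{2}:\d{2}:\d{2} \d{4})\s+(\w+):\s+(\S+) - (.*)$"
)


def parse(lines):
    """Yield (timestamp, level, source, message) for each line that fits."""
    for raw in lines:
        m = ENTRY.match(raw.strip())
        if not m:
            continue
        stamp = " ".join(m.group(1).split())  # collapse the padded day
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        yield when, m.group(2), m.group(3), m.group(4)


def summarize(lines):
    """Count messages per source and measure seconds between DHCP restarts."""
    counts = Counter()
    restarts = []
    for when, level, source, message in parse(lines):
        # Drop trailing details such as the ping target so similar lines group.
        counts[(source, message.split(" - ")[0])] += 1
        if message == "restarting DHCP connection":
            restarts.append(when)
    gaps = [(b - a).total_seconds() for a, b in zip(restarts, restarts[1:])]
    return counts, gaps


# Lines taken from the DHCP-mode excerpt above.
sample = [
    "Wed Nov  6 23:39:01 2019  info:  system - WAN network is not up",
    "Wed Nov  6 23:39:21 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)",
    "Wed Nov  6 23:39:34 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)",
    "Wed Nov  6 23:39:34 2019  warn:    p1p1 - connection is down",
    "Wed Nov  6 23:39:36 2019  info:    p1p1 - restarting DHCP connection",
    "Wed Nov  6 23:39:43 2019  info:  system - WAN network is not up",
    "Wed Nov  6 23:40:03 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)",
    "Wed Nov  6 23:40:16 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)",
    "Wed Nov  6 23:40:16 2019  warn:    p1p1 - connection is down",
    "Wed Nov  6 23:40:18 2019  info:    p1p1 - restarting DHCP connection",
]

counts, gaps = summarize(sample)
for (source, message), n in sorted(counts.items()):
    print(f"{n:3d}  {source}: {message}")
print("seconds between DHCP restarts:", gaps)
```

Comparing the restart gaps with the link Down/Up times in `/var/log/messages` is a quick way to check whether the two logs are describing the same cycle.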

This looks quite interesting to me: does syswatch actually destroy and recreate network interfaces if it doesn't detect connectivity, and if so, can this be turned off?

However that wouldn't explain why it's not updating the ARP status from INCOMPLETE--it seems to me we're looking at a pretty low-level problem, but I'm determined to chase it down. I will increase the logging level on syswatch and see what happens.

> What type of network is your modem on?

In this case, it's on a cable network. Modem is a Netgear AC1750 (model C6300). Provider is Antietam Broadband (Hagerstown, MD, USA)

(And, FWIW, yes, I have checked and I do have internet if I plug other equipment directly into the modem. But for good measure I will switch out the network cable and move to a different port on the modem.)

Marvin Martin

Thursday, November 07 2019, 02:52 PM

Ok, I've swapped out the network cable and moved to a different port on my modem, FWIW.

I increased `syswatch` logging to level 7 (debug).

Now I'm getting a constant loop of this below in the syswatch log:

Thu Nov  7 09:47:13 2019  info:  system - WAN network is not up

Thu Nov  7 09:47:13 2019 debug:    p1p1 - using failed interval time (sec) - 10

Thu Nov  7 09:47:23 2019 debug:    p1p1 - updating ARP table

Thu Nov  7 09:47:23 2019 debug:    p1p1 - added ARP nud - 00:00:00:00:00:00 / 8.8.8.8

Thu Nov  7 09:47:23 2019 debug:    p1p1 - added ARP nud - 00:00:00:00:00:00 / 1.1.1.1

Thu Nov  7 09:47:23 2019 debug:    p1p1 - ping timeout: 5

Thu Nov  7 09:47:23 2019 debug:    p1p1 - ping check started with proto icmp Timeout of 5s

Thu Nov  7 09:47:28 2019 debug:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 1)

Thu Nov  7 09:47:33 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)

Thu Nov  7 09:47:41 2019 debug:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 1)

Thu Nov  7 09:47:46 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)

Thu Nov  7 09:47:46 2019 debug:    p1p1 - ping check down count - 124

Thu Nov  7 09:47:46 2019  warn:    p1p1 - connection is down

Thu Nov  7 09:47:46 2019 debug:  system - connection summary (up:down:warn = total) - 0:1:0 = 1

Thu Nov  7 09:47:48 2019  info:    p1p1 - restarting DHCP connection

Thu Nov  7 09:47:55 2019 debug:    p1p1 - unable to determine IP address

Thu Nov  7 09:47:55 2019 debug:    p1p1 - network - IP address - 192.168.0.5

Thu Nov  7 09:47:55 2019 debug:    p1p1 - network - gateway - 192.168.0.1

Thu Nov  7 09:47:55 2019 debug:    p1p1 - network - type - private IP range

Thu Nov  7 09:47:55 2019 debug:    p1p1 - checking IP address for changes

Thu Nov  7 09:47:55 2019 debug:    p1p1 - skipping check on get_ip request - count 2

Thu Nov  7 09:47:55 2019 debug:    p1p1 - network is down

Thu Nov  7 09:47:55 2019 debug:  system - no WANS available for multiwan

Thu Nov  7 09:47:55 2019  info:  system - WAN network is not up
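
The debug log shows syswatch adding neighbor entries with an all-zero MAC for the two ping servers, and the earlier `ip monitor` output lists those entries as PERMANENT next to a FAILED gateway entry. A small Python sketch can pick both kinds out of `ip neigh show` style output. It assumes the line format seen in the `ip monitor` output, the helper names are hypothetical, and it only lists entries: it makes no claim about whether the placeholder entries are related to the failure.

```python
ZERO_MAC = "00:00:00:00:00:00"
UNRESOLVED = {"FAILED", "INCOMPLETE"}


def parse_neigh(line):
    """Parse one neighbor line such as
    '1.1.1.1 dev p1p1 lladdr 00:00:00:00:00:00 PERMANENT'
    into a dict, or return None if the line does not fit that shape."""
    fields = line.split()
    if len(fields) < 4 or fields[1] != "dev":
        return None
    entry = {"ip": fields[0], "dev": fields[2], "lladdr": None, "state": fields[-1]}
    if len(fields) >= 6 and fields[3] == "lladdr":
        entry["lladdr"] = fields[4]
    return entry


def classify(lines):
    """Return (placeholders, unresolved) as two lists of entry dicts."""
    placeholders, unresolved = [], []
    for line in lines:
        entry = parse_neigh(line)
        if entry is None:
            continue
        if entry["lladdr"] == ZERO_MAC:
            placeholders.append(entry)
        if entry["state"] in UNRESOLVED:
            unresolved.append(entry)
    return placeholders, unresolved


# Entries as they appear in the `ip monitor` output earlier in the thread.
sample = [
    "192.168.0.1 dev p1p1  FAILED",
    "1.1.1.1 dev p1p1 lladdr 00:00:00:00:00:00 PERMANENT",
    "8.8.8.8 dev p1p1 lladdr 00:00:00:00:00:00 PERMANENT",
    "10.5.5.181 dev enp0s25 lladdr 9c:4e:36:3b:52:d0 REACHABLE",
]

placeholders, unresolved = classify(sample)
print("all-zero MAC:", [e["ip"] for e in placeholders])
print("unresolved:  ", [e["ip"] for e in unresolved])
```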

Nick Howitt
Thursday, November 07 2019, 03:00 PM

You can stop syswatch with a "systemctl stop syswatch". You also need to disable it or servicewatch will try to restart it - "systemctl disable syswatch". I am not sure how that would affect getting an IP address. It also fires off some magic with the firewall, although, once the firewall is set up, there probably wouldn't be any changes to it needed as it should keep pulling the same DHCP IP from the router. Note there are other options in /etc/syswatch to allow you to do things like udp pings and so on. I don't know the expected behaviour of syswatch but if it detects the WAN is down (with no pings), it does drop the interface and restart it.

Are 8.8.8.8 and 1.1.1.1 pingable normally? Have you disabled ICMP anywhere?

As a sideways jump are you able to put your modem into Bridge Mode so ClearOS gets the WAN IP?

The AC1750 is a router, not a modem. What connects your cable to the router?

How do you have your router wired? If it is still working as a WiFi router, the WiFi devices won't be able to access anything behind ClearOS.

Out of curiosity, what is the output of:
grep MODE /etc/clearos/network.conf
ip r

Marvin Martin
Thursday, November 07 2019, 03:47 PM

> You can stop syswatch with a "systemctl stop syswatch"...

Thanks for the info. I then went looking in the syswatch code on GitHub and confirmed what you're saying here: yes, it does drop the interface.

> Are 8.8.8.8 and 1.1.1.1 pingable normally?

Normally, when things are working properly, yes.

But pings to them right now look like this:

[root@system etc]# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
From 192.168.0.5 icmp_seq=1 Destination Host Unreachable
From 192.168.0.5 icmp_seq=2 Destination Host Unreachable
From 192.168.0.5 icmp_seq=3 Destination Host Unreachable
From 192.168.0.5 icmp_seq=4 Destination Host Unreachable
^C
--- 8.8.8.8 ping statistics ---
5 packets transmitted, 0 received, +4 errors, 100% packet loss, time 4000ms
pipe 4

As noted earlier, 192.168.0.5 is my p1p1 interface IP, assigned via DHCP.

If I try to ping the gateway (modem) at 192.168.0.1, I get the exact same response.

> Have you disabled ICMP anywhere?

Not that I'm aware of.

> As a sideways jump are you able to put your modem into Bridge Mode so ClearOS gets the WAN IP?

Yes, I could, but right now I'm connected to the modem too to bypass the ClearOS gateway so I have internet... see below

> The AC1750 is a router, not a modem. What connects your cable to the router?

Wow, that's confusing--I didn't realize that Netgear uses that number on multiple pieces of equipment. I do indeed have a combo/all-in-one cable modem router. The operative detail in this case is the "C6300" rather than the one starting with "AC..."
This is the exact model I have: https://www.netgear.com/home/products/networking/cable-modems-routers/C6300.aspx

> How do you have your router wired? If it is still working as a WiFi router, the WiFi devices won't be able to access anything behind ClearOS.

My network is configured like so:
cable from ISP -> Netgear C6300 Modem (WAN, dynamic IP from ISP; LAN 192.168.0.1) -> ClearOS WAN p1p1 (192.168.0.5) and then ClearOS LAN enp0s25 (10.5.5.1) -> LAN switch -> computers

The C6300 has four ports on it, and as is obvious from above, is doing NAT. So right now I'm connected directly to one of the other ports on my C6300 so I have internet.
I have WiFi disabled; and a separate AP on my LAN switch.

> ...what is the output of grep MODE...

[root@system etc]# grep MODE /etc/clearos/network.conf
MODE="gateway"

> ... ip r

[root@system etc]# ip r
default via 192.168.0.1 dev p1p1
10.5.5.0/24 dev enp0s25 proto kernel scope link src 10.5.5.1
192.168.0.0/24 dev p1p1 proto kernel scope link src 192.168.0.5

Sorry, I should've thought to put these basic network details in my very first post for clarity.

Nick Howitt
Thursday, November 07 2019, 04:55 PM

Do you have a spare cheap switch you could put between the modem and server in case the NICs at each end have an issue with each other?

Marvin Martin
Thursday, November 07 2019, 05:15 PM

Sure: I will get a switch and let you know what happens. It may be a few hours though until I have a chance to do that.

Tony Ellis
Thursday, November 07 2019, 11:24 PM

Interesting - don't know if there is any relation below to Marvin's issues...

I have a multiwan setup - cable (Netgear CG3100D-2) and ADSL2+ (TP-Link TD-W9980). Both modems are in NAT mode and have 4 LAN ports. Each feeds three ClearOS firewalls...

Syswatch showed a tendency on interfaces connected to the TP-Link ADSL2+ modem to go into the "connection is down", retry, "connection is down", retry cycle. Depending on whether the interface was static or DHCP, I would get the usual messages like "waiting for static IP reconnect" or "restarting DHCP connection", along with losing the gateway address etc.

What has been perfectly stable for several years has been to set the modem in DHCP mode and enter the MAC addresses of the 3 firewall NICs into the Reservation table so it gives out a static address to each firewall. Then configure each of the gateways in static mode with the same address as in the modem's reservation table. Might seem a weird thing to do - but works for me...

Marvin Martin

Saturday, November 09 2019, 10:29 PM

@Tony Ellis, thanks for your input: I'm glad to hear that I'm not the only one who has run into this DHCP client issue.

@Nick Howitt:
I finally got a chance to plug a switch in. I used an ancient hub -- yes, a true hub! (so I could do some Wiresharking) -- and put that between my WAN NIC and the modem. No change in behavior.

Today I took some time to dive deep with _strace_, under the theory that it has something to do with the syswatch process.

The problem goes in a constant, repeatable loop, according to the logfiles. (Note: I'm guessing on my log loop start/stop positions; I am quite at the edge of my knowledge here, so I may have split it in the middle. In any case, all of this happens every time, in sequence.)

First, I have the syswatch log output, with debug verbosity enabled:

Sat Nov  9 16:32:24 2019 debug:    p1p1 - updating ARP table

Sat Nov  9 16:32:24 2019 debug:    p1p1 - added ARP nud - 00:00:00:00:00:00 / 8.8.8.8

Sat Nov  9 16:32:24 2019 debug:    p1p1 - added ARP nud - 00:00:00:00:00:00 / 1.1.1.1

Sat Nov  9 16:32:24 2019 debug:    p1p1 - ping timeout: 5

Sat Nov  9 16:32:24 2019 debug:    p1p1 - ping check started with proto icmp Timeout of 5s

Sat Nov  9 16:32:29 2019 debug:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 1)

Sat Nov  9 16:32:34 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)

Sat Nov  9 16:32:42 2019 debug:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 1)

Sat Nov  9 16:32:47 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)

Sat Nov  9 16:32:47 2019 debug:    p1p1 - ping check down count - 1050

Sat Nov  9 16:32:47 2019  warn:    p1p1 - connection is down

Sat Nov  9 16:32:47 2019 debug:  system - connection summary (up:down:warn = total) - 0:1:0 = 1

Sat Nov  9 16:32:49 2019  info:    p1p1 - restarting DHCP connection

Sat Nov  9 16:33:00 2019 debug:    p1p1 - unable to determine IP address

Sat Nov  9 16:33:00 2019 debug:    p1p1 - network - IP address - 192.168.0.5

Sat Nov  9 16:33:00 2019 debug:    p1p1 - network - gateway - 192.168.0.1

Sat Nov  9 16:33:00 2019 debug:    p1p1 - network - type - private IP range

Sat Nov  9 16:33:00 2019 debug:    p1p1 - checking IP address for changes

Sat Nov  9 16:33:00 2019 debug:    p1p1 - skipping check on get_ip request - count 9

Sat Nov  9 16:33:00 2019 debug:    p1p1 - network is down

Sat Nov  9 16:33:00 2019 debug:  system - no WANS available for multiwan

Sat Nov  9 16:33:00 2019 debug:    p1p1 - using failed interval time (sec) - 10

And now, the enormous output of _strace_ over the same timeframe, watching the syswatch PID, broken up with my notes/speculations on what is happening. As noted earlier, this is my _guess_. Please don't be misled if I am wrong.

I am hoping a code expert can look over this and see the issue.

## next block after sleep message on prior cycle; see last block below--similar##

16:32:24.072406 [00007f4e1c9d7817] geteuid() = 0

16:32:24.072734 [00007f4e1ca00710] open("/etc/protocols", O_RDONLY|O_CLOEXEC) = 3

16:32:24.073041 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=6545, ...}) = 0

16:32:24.073308 [00007f4e1ca09d4a] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4e1e21b000

16:32:24.073557 [00007f4e1ca00950] read(3, "# /etc/protocols:\n# $Id: protocols,v 1.11 2011/05/03 14:45:40 ovasik Exp $\n#\n# Internet (IP) protoco"..., 4096) = 4096

16:32:24.073812 [00007f4e1ca01000] close(3) = 0

16:32:24.073977 [00007f4e1ca09dd7] munmap(0x7f4e1e21b000, 4096) = 0

16:32:24.074215 [00007f4e1ca10a07] socket(AF_INET, SOCK_RAW, IPPROTO_ICMP) = 3

16:32:24.074298 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091bc50) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.074410 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:32:24.074497 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091bc50) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.074594 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:32:24.074665 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:32:24.074781 [00007f4e1ca109aa] setsockopt(3, SOL_SOCKET, SO_BINDTODEVICE, "p1p1\0", 5) = 0

16:32:24.074944 [00007f4e1ca10577] bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.0.5")}, 16) = 0

16:32:24.075074 [00007f4e1c9d7817] geteuid() = 0

16:32:24.075164 [00007f4e1ca00710] open("/etc/protocols", O_RDONLY|O_CLOEXEC) = 4

16:32:24.075235 [00007f4e1ca002c4] fstat(4, {st_mode=S_IFREG|0644, st_size=6545, ...}) = 0

16:32:24.075325 [00007f4e1ca09d4a] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4e1e21b000

16:32:24.075406 [00007f4e1ca00950] read(4, "# /etc/protocols:\n# $Id: protocols,v 1.11 2011/05/03 14:45:40 ovasik Exp $\n#\n# Internet (IP) protoco"..., 4096) = 4096

16:32:24.075538 [00007f4e1ca01000] close(4) = 0

16:32:24.075603 [00007f4e1ca09dd7] munmap(0x7f4e1e21b000, 4096) = 0

16:32:24.075726 [00007f4e1ca10a07] socket(AF_INET, SOCK_RAW, IPPROTO_ICMP) = 4

16:32:24.075792 [00007f4e1ca05a19] ioctl(4, TCGETS, 0x7ffd4091bc50) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.075869 [00007f4e1ccedd80] lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:32:24.075949 [00007f4e1ca05a19] ioctl(4, TCGETS, 0x7ffd4091bc50) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.076031 [00007f4e1ccedd80] lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:32:24.076090 [00007f4e1cced894] fcntl(4, F_SETFD, FD_CLOEXEC) = 0

16:32:24.076160 [00007f4e1ca109aa] setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "p1p1\0", 5) = 0

16:32:24.076251 [00007f4e1ca10577] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.0.5")}, 16) = 0





#writes "update ARP table" logline to file:

16:32:24.076348 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:24.076472 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 5

16:32:24.076562 [00007f4e1ccedd80] lseek(5, 0, SEEK_END) = 9524324

16:32:24.076655 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.076730 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = 9524324

16:32:24.076794 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFREG|0644, st_size=9524324, ...}) = 0

16:32:24.076866 [00007f4e1cced894] fcntl(5, F_SETFD, FD_CLOEXEC) = 0

16:32:24.076943 [00007f4e1cced6a0] write(5, "Sat Nov  9 16:32:24 2019 debug:    p1p1 - updating ARP table\n", 61) = 61

16:32:24.077047 [00007f4e1cced760] close(5) = 0





##I don't know what this is. ##

16:32:24.077222 [00007f4e1ca010e7] pipe([5, 6]) = 0

16:32:24.077340 [00007f4e1ca010e7] pipe([7, 8]) = 0

16:32:24.077415 [00007f4e1c9d6922] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4e1e20ba10) = 20573

16:32:24.077773 [00007f4e1cced760] close(8) = 0

16:32:24.077849 [00007f4e1cced760] close(6) = 0

16:32:24.077932 [00007f4e1cced700] read(7, "", 4) = 0

16:32:24.078386 [00007f4e1cced760] close(7) = 0

16:32:24.078488 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091bac0) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.078559 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:32:24.078637 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0

16:32:24.078720 [00007f4e1cced700] read(5, "00:00:00:00:00:00\n", 8192) = 18

16:32:24.083027 [00007f4e1cced700] read(5, "", 8192) = 0

16:32:24.083325 [00007f4e1cced700] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20573, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

16:32:24.083388 [00007f4e1ccee5f9] rt_sigreturn({mask=[]}) = 0

16:32:24.083457 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0

16:32:24.083551 [00007f4e1cced760] close(5) = 0

16:32:24.083629 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.083720 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.083822 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.083913 [00007f4e1ccee14c] wait4(20573, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 20573

16:32:24.083998 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:32:24.084061 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:32:24.084117 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:32:24.084173 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:32:24.084274 [00007f4e1ccee6fd] rt_sigaction(SIGCHLD, NULL, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.084352 [00007f4e1ccee14c] wait4(-1, 0x7ffd4091ba54, 0, NULL) = -1 ECHILD (No child processes)

16:32:24.084415 [00007f4e1c9475e0] rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0

16:32:24.084545 [00007f4e1ca010e7] pipe([5, 6]) = 0

16:32:24.084606 [00007f4e1ca010e7] pipe([7, 8]) = 0

16:32:24.084664 [00007f4e1c9d6922] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4e1e20ba10) = 20579

16:32:24.084961 [00007f4e1cced760] close(8) = 0

16:32:24.085062 [00007f4e1cced760] close(6) = 0

16:32:24.085182 [00007f4e1cced700] read(7, "", 4) = 0

16:32:24.085533 [00007f4e1cced760] close(7) = 0

16:32:24.085613 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091bac0) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.085707 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:32:24.085819 [00007f4e1cced700] read(5, "", 8192) = 0

16:32:24.089225 [00007f4e1cced700] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20579, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

16:32:24.089281 [00007f4e1ccee5f9] rt_sigreturn({mask=[]}) = 0

16:32:24.089380 [00007f4e1cced760] close(5) = 0

16:32:24.089477 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.089546 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.089672 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.089745 [00007f4e1ccee14c] wait4(20579, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 20579

16:32:24.089828 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:32:24.089901 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:32:24.089976 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:32:24.090032 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:32:24.090113 [00007f4e1ccee6fd] rt_sigaction(SIGCHLD, NULL, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.090178 [00007f4e1ccee14c] wait4(-1, 0x7ffd4091ba54, 0, NULL) = -1 ECHILD (No child processes)

16:32:24.090256 [00007f4e1c9475e0] rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0





#writes "Added ARP nud for 8.8.8.8" logline to file

16:32:24.090344 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:24.090494 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 5

16:32:24.090573 [00007f4e1ccedd80] lseek(5, 0, SEEK_END) = 9524385

16:32:24.090640 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.090707 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = 9524385

16:32:24.090763 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFREG|0644, st_size=9524385, ...}) = 0

16:32:24.090829 [00007f4e1cced894] fcntl(5, F_SETFD, FD_CLOEXEC) = 0

16:32:24.090904 [00007f4e1cced6a0] write(5, "Sat Nov  9 16:32:24 2019 debug:    p1p1 - added ARP nud - 00:00:00:00:00:00 / 8.8.8.8\n", 86) = 86

16:32:24.090993 [00007f4e1cced760] close(5) = 0

16:32:24.091093 [00007f4e1ca010e7] pipe([5, 6]) = 0

16:32:24.091160 [00007f4e1ca010e7] pipe([7, 8]) = 0



## does something else##

16:32:24.091224 [00007f4e1c9d6922] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4e1e20ba10) = 20582

16:32:24.091546 [00007f4e1cced760] close(8) = 0

16:32:24.091614 [00007f4e1cced760] close(6) = 0

16:32:24.091706 [00007f4e1cced700] read(7, "", 4) = 0

16:32:24.092031 [00007f4e1cced760] close(7) = 0

16:32:24.092102 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091bac0) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.092161 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:32:24.092234 [00007f4e1cced700] read(5, "", 8192) = 0

16:32:24.095512 [00007f4e1cced700] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20582, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

16:32:24.095569 [00007f4e1ccee5f9] rt_sigreturn({mask=[]}) = 0

16:32:24.095645 [00007f4e1cced760] close(5) = 0

16:32:24.095704 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.095772 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.095830 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.095888 [00007f4e1ccee14c] wait4(20582, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 20582

16:32:24.095953 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:32:24.096007 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:32:24.096063 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:32:24.096119 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:32:24.096199 [00007f4e1ccee6fd] rt_sigaction(SIGCHLD, NULL, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:24.096268 [00007f4e1ccee14c] wait4(-1, 0x7ffd4091ba54, 0, NULL) = -1 ECHILD (No child processes)

16:32:24.096330 [00007f4e1c9475e0] rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0



#writes "Added ARP nud for 1.1.1.1" logline to file

16:32:24.096409 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:24.096504 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 5

16:32:24.096573 [00007f4e1ccedd80] lseek(5, 0, SEEK_END) = 9524471

16:32:24.096637 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.096696 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = 9524471

16:32:24.096749 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFREG|0644, st_size=9524471, ...}) = 0

16:32:24.096808 [00007f4e1cced894] fcntl(5, F_SETFD, FD_CLOEXEC) = 0

16:32:24.096872 [00007f4e1cced6a0] write(5, "Sat Nov  9 16:32:24 2019 debug:    p1p1 - added ARP nud - 00:00:00:00:00:00 / 1.1.1.1\n", 86) = 86

16:32:24.097009 [00007f4e1cced760] close(5) = 0





#writes "ping timeout: 5" logline to file

16:32:24.097107 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:24.097202 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 5

16:32:24.097275 [00007f4e1ccedd80] lseek(5, 0, SEEK_END) = 9524557

16:32:24.097332 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.097391 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = 9524557

16:32:24.097444 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFREG|0644, st_size=9524557, ...}) = 0

16:32:24.097505 [00007f4e1cced894] fcntl(5, F_SETFD, FD_CLOEXEC) = 0

16:32:24.097569 [00007f4e1cced6a0] write(5, "Sat Nov  9 16:32:24 2019 debug:    p1p1 - ping timeout: 5\n", 58) = 58

16:32:24.097669 [00007f4e1cced760] close(5) = 0





#writes "pingcheck started" logline to file

16:32:24.097790 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:24.097886 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 5

16:32:24.098018 [00007f4e1ccedd80] lseek(5, 0, SEEK_END) = 9524615

16:32:24.098105 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:24.098168 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = 9524615

16:32:24.098267 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFREG|0644, st_size=9524615, ...}) = 0

16:32:24.098364 [00007f4e1cced894] fcntl(5, F_SETFD, FD_CLOEXEC) = 0

16:32:24.098429 [00007f4e1cced6a0] write(5, "Sat Nov  9 16:32:24 2019 debug:    p1p1 - ping check started with proto icmp Timeout of 5s \n", 92) = 92

16:32:24.098547 [00007f4e1cced760] close(5) = 0



#sends ping to 8.8.8.8

16:32:24.098798 [00007f4e1ccedcc3] sendto(3, "\10\0\245HR\266\0\1\0", 9, 0, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("8.8.8.8")}, 16) = 9

16:32:24.098954 [00007f4e1ca06933] select(8, [3], NULL, NULL, {5, 0}) = 0 (Timeout)
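For anyone who wants to check what is actually on the wire here: the string in the sendto() call above can be decoded as an ICMP header. The following is only a rough sketch in Python, not anything taken from syswatch itself; the function names `icmp_checksum` and `decode_echo` are made up for illustration. The bytes are copied verbatim from the trace (strace prints them with octal escapes, which Python byte literals also accept).

```python
import struct

def icmp_checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071 style); odd lengths are padded with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    # Fold any carry bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def decode_echo(packet: bytes) -> dict:
    """Split an ICMP echo packet into header fields and verify its checksum."""
    icmp_type, code, checksum, ident, seq = struct.unpack("!BBHHH", packet[:8])
    # The checksum is computed with the checksum field itself set to zero.
    zeroed = packet[:2] + b"\x00\x00" + packet[4:]
    return {
        "type": icmp_type,
        "code": code,
        "id": ident,
        "seq": seq,
        "payload_len": len(packet) - 8,
        "checksum_ok": icmp_checksum(zeroed) == checksum,
    }

# Bytes copied from the sendto(3, ...) call to 8.8.8.8 in the trace.
small_ping = b"\10\0\245HR\266\0\1\0"
print(decode_echo(small_ping))
```

Decoded this way, the packet is a well-formed echo request (type 8, code 0, sequence 1) with a valid checksum and a single payload byte, which matches the "ping size: 1" in the log. So the request itself looks fine; the select() simply times out after 5 seconds without a reply.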



#writes "pingcheck failed" logline to file:

16:32:29.104244 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:29.104529 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 5

16:32:29.104733 [00007f4e1ccedd80] lseek(5, 0, SEEK_END) = 9524707

16:32:29.104987 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:29.105165 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = 9524707

16:32:29.105319 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFREG|0644, st_size=9524707, ...}) = 0

16:32:29.105457 [00007f4e1cced894] fcntl(5, F_SETFD, FD_CLOEXEC) = 0

16:32:29.105655 [00007f4e1cced6a0] write(5, "Sat Nov  9 16:32:29 2019 debug:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 1)\n", 98) = 98

16:32:29.105783 [00007f4e1cced760] close(5) = 0

16:32:29.105943 [00007f4e1ccedcc3] sendto(4, "\10\0\301DR\266\0\1\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\32\33\34\35\36\37 !\"#$%&'()*+,-./0123456789:;<=>?", 72, 0, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("8.8.8.8")}, 16) = 72

16:32:29.106091 [00007f4e1ca06933] select(8, [4], NULL, NULL, {5, 0}) = 0 (Timeout)



#writes "pingcheck2 to 8.8.8.8 failed" logline to file:

16:32:34.111363 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:34.111740 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 5

16:32:34.112068 [00007f4e1ccedd80] lseek(5, 0, SEEK_END) = 9524805

16:32:34.112322 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:34.112517 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = 9524805

16:32:34.112653 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFREG|0644, st_size=9524805, ...}) = 0

16:32:34.112790 [00007f4e1cced894] fcntl(5, F_SETFD, FD_CLOEXEC) = 0

16:32:34.112895 [00007f4e1cced6a0] write(5, "Sat Nov  9 16:32:34 2019  info:    p1p1 - ping check on server #1 failed - 8.8.8.8 (ping size: 64)\n", 99) = 99

16:32:34.113067 [00007f4e1cced760] close(5) = 0

16:32:34.113200 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:32:34.113317 [00007f4e1c9474bd] rt_sigaction(SIGCHLD, NULL, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:34.113421 [00007f4e1c9475e0] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0

16:32:34.113490 [00007f4e1c9d67f0] nanosleep({3, 0}, 0x7ffd4091bcf0) = 0



#sends ping to 1.1.1.1

16:32:37.113879 [00007f4e1ccedcc3] sendto(3, "\10\0\245GR\266\0\2\0", 9, 0, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("1.1.1.1")}, 16) = 9

16:32:37.114236 [00007f4e1ca06933] select(8, [3], NULL, NULL, {5, 0}) = 0 (Timeout)



#writes "ping to 1.1.1.1 failed" logline to file

16:32:42.119467 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:42.119808 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 5

16:32:42.120088 [00007f4e1ccedd80] lseek(5, 0, SEEK_END) = 9524904

16:32:42.120248 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:42.120404 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = 9524904

16:32:42.120521 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFREG|0644, st_size=9524904, ...}) = 0

16:32:42.120663 [00007f4e1cced894] fcntl(5, F_SETFD, FD_CLOEXEC) = 0

16:32:42.120818 [00007f4e1cced6a0] write(5, "Sat Nov  9 16:32:42 2019 debug:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 1)\n", 98) = 98

16:32:42.120934 [00007f4e1cced760] close(5) = 0



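#sends second, larger ping (64-byte payload) to 1.1.1.1; the select() below times out after 5s, so no response arrives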

16:32:42.121158 [00007f4e1ccedcc3] sendto(4, "\10\0\301CR\266\0\2\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\32\33\34\35\36\37 !\"#$%&'()*+,-./0123456789:;<=>?", 72, 0, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("1.1.1.1")}, 16) = 72

16:32:42.121314 [00007f4e1ca06933] select(8, [4], NULL, NULL, {5, 0}) = 0 (Timeout)



#writes "ping to 1.1.1.1 failed" logline to file

16:32:47.123934 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:47.124266 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 5

16:32:47.124541 [00007f4e1ccedd80] lseek(5, 0, SEEK_END) = 9525002

16:32:47.124774 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:47.124914 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = 9525002

16:32:47.125165 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFREG|0644, st_size=9525002, ...}) = 0

16:32:47.125381 [00007f4e1cced894] fcntl(5, F_SETFD, FD_CLOEXEC) = 0

16:32:47.125516 [00007f4e1cced6a0] write(5, "Sat Nov  9 16:32:47 2019  info:    p1p1 - ping check on server #2 failed - 1.1.1.1 (ping size: 64)\n", 99) = 99

16:32:47.125650 [00007f4e1cced760] close(5) = 0



#writes "ping check down count" logline to file

16:32:47.125744 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:47.125847 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 5

16:32:47.125925 [00007f4e1ccedd80] lseek(5, 0, SEEK_END) = 9525101

16:32:47.126001 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:47.126103 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = 9525101

16:32:47.126169 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFREG|0644, st_size=9525101, ...}) = 0

16:32:47.126278 [00007f4e1cced894] fcntl(5, F_SETFD, FD_CLOEXEC) = 0

16:32:47.126341 [00007f4e1cced6a0] write(5, "Sat Nov  9 16:32:47 2019 debug:    p1p1 - ping check down count - 1050\n", 71) = 71

16:32:47.126482 [00007f4e1cced760] close(5) = 0



#writes "connection is down" logline to file

16:32:47.126633 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:47.126737 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 5

16:32:47.126821 [00007f4e1ccedd80] lseek(5, 0, SEEK_END) = 9525172

16:32:47.126881 [00007f4e1ca05a19] ioctl(5, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:47.126973 [00007f4e1ccedd80] lseek(5, 0, SEEK_CUR) = 9525172

16:32:47.127043 [00007f4e1ca002c4] fstat(5, {st_mode=S_IFREG|0644, st_size=9525172, ...}) = 0

16:32:47.127117 [00007f4e1cced894] fcntl(5, F_SETFD, FD_CLOEXEC) = 0

16:32:47.127177 [00007f4e1cced6a0] write(5, "Sat Nov  9 16:32:47 2019  warn:    p1p1 - connection is down\n", 61) = 61

16:32:47.127314 [00007f4e1cced760] close(5) = 0

16:32:47.127481 [00007f4e1cced760] close(3) = 0

16:32:47.127564 [00007f4e1cced760] close(4) = 0



#writes "system connection summary" logline to file

16:32:47.127710 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:47.127801 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

16:32:47.127873 [00007f4e1ccedd80] lseek(3, 0, SEEK_END) = 9525233

16:32:47.127944 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:47.128018 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = 9525233

16:32:47.128073 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=9525233, ...}) = 0

16:32:47.128150 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:32:47.128220 [00007f4e1cced6a0] write(3, "Sat Nov  9 16:32:47 2019 debug:  system - connection summary (up:down:warn = total) - 0:1:0 = 1\n", 96) = 96

16:32:47.128319 [00007f4e1cced760] close(3) = 0



## what is this??##

16:32:47.128566 [00007f4e1c9d7817] geteuid() = 0

16:32:47.128686 [00007f4e1ca00710] open("/etc/protocols", O_RDONLY|O_CLOEXEC) = 3

16:32:47.128799 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=6545, ...}) = 0

16:32:47.128917 [00007f4e1ca09d4a] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4e1e21b000

16:32:47.129008 [00007f4e1ca00950] read(3, "# /etc/protocols:\n# $Id: protocols,v 1.11 2011/05/03 14:45:40 ovasik Exp $\n#\n# Internet (IP) protoco"..., 4096) = 4096

16:32:47.129166 [00007f4e1ca01000] close(3) = 0

16:32:47.129245 [00007f4e1ca09dd7] munmap(0x7f4e1e21b000, 4096) = 0

16:32:47.129443 [00007f4e1ca10a07] socket(AF_INET, SOCK_RAW, IPPROTO_ICMP) = 3

16:32:47.129527 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091bc50) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:47.129663 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:32:47.129743 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091bc50) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:47.129866 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:32:47.129968 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:32:47.130045 [00007f4e1ca109aa] setsockopt(3, SOL_SOCKET, SO_BINDTODEVICE, "p1p1\0", 5) = 0

16:32:47.130203 [00007f4e1ca10577] bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.0.5")}, 16) = 0

16:32:47.130364 [00007f4e1ccedcc3] sendto(3, "\10\0\301DR\266\0\1\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27\30\31\32\33\34\35\36\37 !\"#$%&'()*+,-./0123456789:;<=>?", 72, 0, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.0.1")}, 16) = 72

16:32:47.130518 [00007f4e1ca06933] select(8, [3], NULL, NULL, {2, 0}) = 0 (Timeout)

16:32:49.132844 [00007f4e1cced760] close(3) = 0



#writes "restarting DHCP connection" logline to file

16:32:49.133163 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:32:49.133526 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

16:32:49.133669 [00007f4e1ccedd80] lseek(3, 0, SEEK_END) = 9525329

16:32:49.133862 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:32:49.133988 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = 9525329

16:32:49.134070 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=9525329, ...}) = 0

16:32:49.134218 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:32:49.134303 [00007f4e1cced6a0] write(3, "Sat Nov  9 16:32:49 2019  info:    p1p1 - restarting DHCP connection\n", 69) = 69

16:32:49.134454 [00007f4e1cced760] close(3) = 0



## What all happens here?? Looks very interesting! ##

16:32:49.134596 [00007f4e1ca010e7] pipe([3, 4]) = 0

16:32:49.134725 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:32:49.134836 [00007f4e1c9d6922] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4e1e20ba10) = 20614

16:32:49.135178 [00007f4e1cced760] close(4) = 0

16:32:49.135284 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:49.135496 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:49.135602 [00007f4e1ccee14c] wait4(20614, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 20614

16:32:49.378266 [00007f4e1c9475e0] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0

16:32:49.378402 [00007f4e1c9475e0] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20614, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

16:32:49.378487 [00007f4e1ccee5f9] rt_sigreturn({mask=[]}) = 0

16:32:49.378555 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:32:49.378635 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:32:49.378746 [00007f4e1cced700] read(3, "", 4) = 0

16:32:49.378849 [00007f4e1cced760] close(3) = 0

16:32:49.378911 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:32:49.379019 [00007f4e1ccee6fd] rt_sigaction(SIGCHLD, NULL, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:49.379097 [00007f4e1ccee14c] wait4(-1, 0x7ffd4091ba54, 0, NULL) = -1 ECHILD (No child processes)

16:32:49.379162 [00007f4e1c9475e0] rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0

16:32:49.379222 [00007f4e1ca010e7] pipe([3, 4]) = 0

16:32:49.379271 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:32:49.379330 [00007f4e1c9d6922] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4e1e20ba10) = 20686

16:32:49.379582 [00007f4e1cced760] close(4) = 0

16:32:49.379661 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:49.379750 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:32:49.379831 [00007f4e1ccee14c] wait4(20686, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 20686

16:33:00.764698 [00007f4e1c9475e0] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0

16:33:00.764796 [00007f4e1c9475e0] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20686, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

16:33:00.764901 [00007f4e1ccee5f9] rt_sigreturn({mask=[]}) = 0

16:33:00.764973 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:33:00.765033 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:33:00.765078 [00007f4e1cced700] read(3, "", 4) = 0

16:33:00.765165 [00007f4e1cced760] close(3) = 0

16:33:00.765217 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:33:00.765280 [00007f4e1ccee6fd] rt_sigaction(SIGCHLD, NULL, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.765339 [00007f4e1ccee14c] wait4(-1, 0x7ffd4091ba54, 0, NULL) = -1 ECHILD (No child processes)

16:33:00.765410 [00007f4e1c9475e0] rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0

16:33:00.765526 [00007f4e1ca010e7] pipe([3, 4]) = 0

16:33:00.765571 [00007f4e1ca010e7] pipe([5, 6]) = 0

16:33:00.765619 [00007f4e1c9d6922] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4e1e20ba10) = 20857

16:33:00.765885 [00007f4e1cced760] close(6) = 0

16:33:00.765931 [00007f4e1cced760] close(4) = 0

16:33:00.765979 [00007f4e1cced700] read(5, "", 4) = 0

16:33:00.766279 [00007f4e1cced760] close(5) = 0

16:33:00.766334 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091bac0) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.766407 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:33:00.766464 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0

16:33:00.766549 [00007f4e1cced700] read(3, "3: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 100"..., 8192) = 157

16:33:00.769087 [00007f4e1cced700] read(3, "    inet 192.168.0.5/24 brd 192.168.0.255 scope global dynamic p1p1\n       valid_lft 86400sec prefer"..., 8192) = 117

16:33:00.769155 [00007f4e1cced700] read(3, "", 8192) = 0

16:33:00.769250 [00007f4e1cced700] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20857, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

16:33:00.769288 [00007f4e1ccee5f9] rt_sigreturn({mask=[]}) = 0

16:33:00.769333 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0

16:33:00.769406 [00007f4e1cced760] close(3) = 0

16:33:00.769451 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.769509 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.769554 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.769598 [00007f4e1ccee14c] wait4(20857, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 20857

16:33:00.769687 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:33:00.769730 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:33:00.769771 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:33:00.769823 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:33:00.769896 [00007f4e1ccee6fd] rt_sigaction(SIGCHLD, NULL, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.769959 [00007f4e1ccee14c] wait4(-1, 0x7ffd4091ba54, 0, NULL) = -1 ECHILD (No child processes)

16:33:00.770013 [00007f4e1c9475e0] rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0

16:33:00.770179 [00007f4e1ca010e7] pipe([3, 4]) = 0

16:33:00.770223 [00007f4e1ca010e7] pipe([5, 6]) = 0

16:33:00.770269 [00007f4e1c9d6922] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4e1e20ba10) = 20859

16:33:00.770528 [00007f4e1cced760] close(6) = 0

16:33:00.770571 [00007f4e1cced760] close(4) = 0

16:33:00.770643 [00007f4e1cced700] read(5, "", 4) = 0

16:33:00.770847 [00007f4e1cced760] close(5) = 0

16:33:00.770900 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091bac0) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.770969 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:33:00.771021 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0

16:33:00.771081 [00007f4e1cced700] read(3, "3: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 100"..., 8192) = 157

16:33:00.773578 [00007f4e1cced700] read(3, "    inet 192.168.0.5/24 brd 192.168.0.255 scope global dynamic p1p1\n       valid_lft 86400sec prefer"..., 8192) = 117

16:33:00.773642 [00007f4e1cced700] read(3, "", 8192) = 0

16:33:00.773757 [00007f4e1cced700] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20859, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

16:33:00.773797 [00007f4e1ccee5f9] rt_sigreturn({mask=[]}) = 0

16:33:00.773873 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0

16:33:00.773931 [00007f4e1cced760] close(3) = 0

16:33:00.773973 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.774055 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.774115 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.774158 [00007f4e1ccee14c] wait4(20859, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 20859

16:33:00.774254 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:33:00.774297 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:33:00.774385 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:33:00.774427 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:33:00.774496 [00007f4e1ccee6fd] rt_sigaction(SIGCHLD, NULL, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.774545 [00007f4e1ccee14c] wait4(-1, 0x7ffd4091ba54, 0, NULL) = -1 ECHILD (No child processes)

16:33:00.774591 [00007f4e1c9475e0] rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
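One thing that can be read out of the block above is how long each child process took, by comparing the timestamp of each wait4() call with the timestamp of the next traced line. A small Python sketch of that arithmetic follows; the helper name `elapsed` is made up, and the timestamps are copied from the trace. Note that the trace does not show what the children executed (strace was apparently not following forks), so which command is the slow one is not visible here.

```python
from datetime import datetime

def elapsed(start: str, end: str) -> float:
    """Seconds between two strace wall-clock timestamps (same day assumed)."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()

# (child pid, timestamp of the wait4() call, timestamp of the next traced line)
waits = [
    (20614, "16:32:49.135602", "16:32:49.378266"),
    (20686, "16:32:49.379831", "16:33:00.764698"),
]

for pid, start, end in waits:
    print(f"child {pid}: parent blocked in wait4 for about {elapsed(start, end):.3f} s")
```

The first child returns in roughly a quarter of a second, while the second keeps the parent waiting for more than eleven seconds. Re-running strace with `-f` would show which command that second child is.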



#writes "unable to determine IP address" logline to file

16:33:00.774739 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:33:00.774816 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

16:33:00.774885 [00007f4e1ccedd80] lseek(3, 0, SEEK_END) = 9525398

16:33:00.774947 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.775013 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = 9525398

16:33:00.775062 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=9525398, ...}) = 0

16:33:00.775110 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:33:00.775197 [00007f4e1cced6a0] write(3, "Sat Nov  9 16:33:00 2019 debug:    p1p1 - unable to determine IP address\n", 73) = 73

16:33:00.775264 [00007f4e1cced760] close(3) = 0



#writes "network IP address is 192.168..." logline to file

16:33:00.775343 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:33:00.775416 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

16:33:00.775474 [00007f4e1ccedd80] lseek(3, 0, SEEK_END) = 9525471

16:33:00.775585 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.775631 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = 9525471

16:33:00.775699 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=9525471, ...}) = 0

16:33:00.775806 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:33:00.775853 [00007f4e1cced6a0] write(3, "Sat Nov  9 16:33:00 2019 debug:    p1p1 - network - IP address - 192.168.0.5\n", 77) = 77

16:33:00.775921 [00007f4e1cced760] close(3) = 0
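The text read from the pipes earlier in this trace looks like the output of `ip addr show` for p1p1, and the address logged here (192.168.0.5) appears in it. As a rough illustration of how an address can be pulled out of such output, here is a Python sketch. It is not syswatch's own code, `first_ipv4` is a made-up name, and the sample string contains only the part of the output that is visible in the trace (the first line is cut short where strace truncated it).

```python
import re

def first_ipv4(ip_output: str):
    """Return (address, prefix length) of the first 'inet' line, or None if absent."""
    match = re.search(r"^\s*inet (\d+\.\d+\.\d+\.\d+)/(\d+)", ip_output, re.MULTILINE)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

# Abbreviated sample built from the read() calls in the trace.
sample = (
    "3: p1p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP\n"
    "    inet 192.168.0.5/24 brd 192.168.0.255 scope global dynamic p1p1\n"
)

print(first_ipv4(sample))
```

A parser along these lines finds the address in both copies of the output, so the "unable to determine IP address" message logged just before the address may come from a different check than this one.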



##writes out gateway IP to the routing mechanism??##

16:33:00.775976 [00007f4e1ca00275] stat("/var/lib/dhclient/p1p1.routers", {st_mode=S_IFREG|0644, st_size=12, ...}) = 0

16:33:00.776029 [00007f4e1ca010e7] pipe([3, 4]) = 0

16:33:00.776071 [00007f4e1ca010e7] pipe([5, 6]) = 0

16:33:00.776134 [00007f4e1c9d6922] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4e1e20ba10) = 20861

16:33:00.776410 [00007f4e1cced760] close(6) = 0

16:33:00.776448 [00007f4e1cced760] close(4) = 0

16:33:00.776493 [00007f4e1cced700] read(5, "", 4) = 0

16:33:00.776867 [00007f4e1cced760] close(5) = 0

16:33:00.776915 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091bac0) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.776970 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:33:00.777024 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0

16:33:00.777078 [00007f4e1cced700] read(3, "192.168.0.1\n", 8192) = 12

16:33:00.779354 [00007f4e1cced700] read(3, "", 8192) = 0

16:33:00.779527 [00007f4e1cced700] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=20861, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---

16:33:00.779563 [00007f4e1ccee5f9] rt_sigreturn({mask=[]}) = 0

16:33:00.779605 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0

16:33:00.779663 [00007f4e1cced760] close(3) = 0

16:33:00.779703 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.779761 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.779806 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f4e1ccee5f0}, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.779858 [00007f4e1ccee14c] wait4(20861, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 20861

16:33:00.779906 [00007f4e1ccee6fd] rt_sigaction(SIGHUP, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:33:00.779946 [00007f4e1ccee6fd] rt_sigaction(SIGINT, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:33:00.779986 [00007f4e1ccee6fd] rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f4e1ccee5f0}, NULL, 8) = 0

16:33:00.780048 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:33:00.780108 [00007f4e1ccee6fd] rt_sigaction(SIGCHLD, NULL, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.780171 [00007f4e1ccee14c] wait4(-1, 0x7ffd4091ba54, 0, NULL) = -1 ECHILD (No child processes)

16:33:00.780220 [00007f4e1c9475e0] rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0



#writes "gateway is 192.168..." logline to file

16:33:00.780353 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:33:00.780433 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

16:33:00.780498 [00007f4e1ccedd80] lseek(3, 0, SEEK_END) = 9525548

16:33:00.780547 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.780598 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = 9525548

16:33:00.780660 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=9525548, ...}) = 0

16:33:00.780704 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:33:00.780762 [00007f4e1cced6a0] write(3, "Sat Nov  9 16:33:00 2019 debug:    p1p1 - network - gateway - 192.168.0.1\n", 74) = 74

16:33:00.780821 [00007f4e1cced760] close(3) = 0



#writes "network type private IP range" logline to file

16:33:00.780881 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:33:00.780942 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

16:33:00.780988 [00007f4e1ccedd80] lseek(3, 0, SEEK_END) = 9525622

16:33:00.781030 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.781074 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = 9525622

16:33:00.781103 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=9525622, ...}) = 0

16:33:00.781160 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:33:00.781194 [00007f4e1cced6a0] write(3, "Sat Nov  9 16:33:00 2019 debug:    p1p1 - network - type - private IP range\n", 76) = 76

16:33:00.781236 [00007f4e1cced760] close(3) = 0



#writes "checking IP address for changes" logline to file

16:33:00.781323 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:33:00.781389 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

16:33:00.781452 [00007f4e1ccedd80] lseek(3, 0, SEEK_END) = 9525698

16:33:00.781496 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.781531 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = 9525698

16:33:00.781579 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=9525698, ...}) = 0

16:33:00.781641 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:33:00.781675 [00007f4e1cced6a0] write(3, "Sat Nov  9 16:33:00 2019 debug:    p1p1 - checking IP address for changes\n", 74) = 74

16:33:00.781736 [00007f4e1cced760] close(3) = 0



#writes "skipping check on get_ip request" logline to file

16:33:00.781807 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:33:00.781885 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

16:33:00.781930 [00007f4e1ccedd80] lseek(3, 0, SEEK_END) = 9525772

16:33:00.781960 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.782016 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = 9525772

16:33:00.782052 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=9525772, ...}) = 0

16:33:00.782099 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:33:00.782132 [00007f4e1cced6a0] write(3, "Sat Nov  9 16:33:00 2019 debug:    p1p1 - skipping check on get_ip request - count 9\n", 85) = 85

16:33:00.782172 [00007f4e1cced760] close(3) = 0



#writes "network is down" logline to file

16:33:00.782234 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:33:00.782286 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

16:33:00.782325 [00007f4e1ccedd80] lseek(3, 0, SEEK_END) = 9525857

16:33:00.782402 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.782440 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = 9525857

16:33:00.782470 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=9525857, ...}) = 0

16:33:00.782510 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:33:00.782544 [00007f4e1cced6a0] write(3, "Sat Nov  9 16:33:00 2019 debug:    p1p1 - network is down\n", 58) = 58

16:33:00.782609 [00007f4e1cced760] close(3) = 0



#writes "no WANS available for Multi-WAN" logline to file

16:33:00.782681 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:33:00.782731 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

16:33:00.782770 [00007f4e1ccedd80] lseek(3, 0, SEEK_END) = 9525915

16:33:00.782800 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.782835 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = 9525915

16:33:00.782863 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=9525915, ...}) = 0

16:33:00.782899 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:33:00.782931 [00007f4e1cced6a0] write(3, "Sat Nov  9 16:33:00 2019 debug:  system - no WANS available for multiwan\n", 73) = 73

16:33:00.782991 [00007f4e1cced760] close(3) = 0



#writes "using failed interval time 10s" logline to file

16:33:00.783048 [00007f4e1ca00275] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0

16:33:00.783099 [00007f4e1ccedea0] open("/var/log/syswatch", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

16:33:00.783144 [00007f4e1ccedd80] lseek(3, 0, SEEK_END) = 9525988

16:33:00.783175 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091ba60) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:00.783214 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = 9525988

16:33:00.783243 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=9525988, ...}) = 0

16:33:00.783282 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:33:00.783314 [00007f4e1cced6a0] write(3, "Sat Nov  9 16:33:00 2019 debug:    p1p1 - using failed interval time (sec) - 10\n", 80) = 80

16:33:00.783360 [00007f4e1cced760] close(3) = 0





##I think this is similar to, or the same as, the first block I posted above##

16:33:00.783403 [00007f4e1c9475e0] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0

16:33:00.783443 [00007f4e1c9474bd] rt_sigaction(SIGCHLD, NULL, {0x7f4e1dd0df90, [], SA_RESTORER, 0x7f4e1ccee5f0}, 8) = 0

16:33:00.783480 [00007f4e1c9475e0] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0

16:33:00.783516 [00007f4e1c9d67f0] nanosleep({10, 0}, 0x7ffd4091bcf0) = 0

16:33:10.783866 [00007f4e1c9d7817] geteuid() = 0

16:33:10.784093 [00007f4e1ca00710] open("/etc/protocols", O_RDONLY|O_CLOEXEC) = 3

16:33:10.784324 [00007f4e1ca002c4] fstat(3, {st_mode=S_IFREG|0644, st_size=6545, ...}) = 0

16:33:10.784498 [00007f4e1ca09d4a] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4e1e21b000

16:33:10.784576 [00007f4e1ca00950] read(3, "# /etc/protocols:\n# $Id: protocols,v 1.11 2011/05/03 14:45:40 ovasik Exp $\n#\n# Internet (IP) protoco"..., 4096) = 4096

16:33:10.784703 [00007f4e1ca01000] close(3) = 0

16:33:10.784788 [00007f4e1ca09dd7] munmap(0x7f4e1e21b000, 4096) = 0

16:33:10.784989 [00007f4e1ca10a07] socket(AF_INET, SOCK_RAW, IPPROTO_ICMP) = 3

16:33:10.785051 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091bc50) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:10.785133 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:33:10.785241 [00007f4e1ca05a19] ioctl(3, TCGETS, 0x7ffd4091bc50) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:10.785431 [00007f4e1ccedd80] lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:33:10.785519 [00007f4e1cced894] fcntl(3, F_SETFD, FD_CLOEXEC) = 0

16:33:10.785577 [00007f4e1ca109aa] setsockopt(3, SOL_SOCKET, SO_BINDTODEVICE, "p1p1\0", 5) = 0

16:33:10.785670 [00007f4e1ca10577] bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.0.5")}, 16) = 0

16:33:10.785783 [00007f4e1c9d7817] geteuid() = 0

16:33:10.785870 [00007f4e1ca00710] open("/etc/protocols", O_RDONLY|O_CLOEXEC) = 4

16:33:10.785990 [00007f4e1ca002c4] fstat(4, {st_mode=S_IFREG|0644, st_size=6545, ...}) = 0

16:33:10.786067 [00007f4e1ca09d4a] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4e1e21b000

16:33:10.786158 [00007f4e1ca00950] read(4, "# /etc/protocols:\n# $Id: protocols,v 1.11 2011/05/03 14:45:40 ovasik Exp $\n#\n# Internet (IP) protoco"..., 4096) = 4096

16:33:10.786256 [00007f4e1ca01000] close(4) = 0

16:33:10.786371 [00007f4e1ca09dd7] munmap(0x7f4e1e21b000, 4096) = 0

16:33:10.786484 [00007f4e1ca10a07] socket(AF_INET, SOCK_RAW, IPPROTO_ICMP) = 4

16:33:10.786548 [00007f4e1ca05a19] ioctl(4, TCGETS, 0x7ffd4091bc50) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:10.786656 [00007f4e1ccedd80] lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:33:10.786722 [00007f4e1ca05a19] ioctl(4, TCGETS, 0x7ffd4091bc50) = -1 ENOTTY (Inappropriate ioctl for device)

16:33:10.786786 [00007f4e1ccedd80] lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)

16:33:10.786841 [00007f4e1cced894] fcntl(4, F_SETFD, FD_CLOEXEC) = 0

16:33:10.786894 [00007f4e1ca109aa] setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "p1p1\0", 5) = 0

16:33:10.786961 [00007f4e1ca10577] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.0.5")}, 16) = 0

Thanks!


Accepted Answer
Nick Howitt

Offline
Sunday, November 10 2019, 09:15 AM - #Permalink
Resolved

0 votes

I think your Linux abilities are beyond mine. In terms of routing, isn't the local LAN at Layer 2, and so dependent on ARP? The local LAN, in this case, is the link between ClearOS and the router. The question then seems to become "why are you receiving a 00:00:00:00:00:00 MAC address from the router?" It may be an invalid address which Windows is handling and Linux is not. I think there is something similar with the network address between Windows and Linux. As an example, if your subnet is 192.168.0.0/24, Windows allows you to use the 192.168.0.0 address but Linux won't. In Windows, have a look at the ARP table and see what it is picking up. If it is also 00:00:00:00:00:00, is it worth seeing if there is a firmware update available for the router?
Accepted Answer

Marvin Martin

Offline
Tuesday, November 12 2019, 01:27 PM - #Permalink
Resolved

0 votes

Yes, it does seem to be a Layer 2 issue--for some reason ClearOS isn't caching the ARP of the gateway correctly. In the ARP cache, it just shows <incomplete>, and there's no route in the routing table for the External interface. I'm guessing this is what is causing syswatch to consider the connection down.

I've confirmed with tcpdump that the ClearOS unit is receiving a valid ARP reply from the modem. And for good measure, I connected a new ClearOS box with similar hardware (and an identical NIC) to the modem last night, and it immediately pulled an address and worked just fine, as has been my experience with other equipment (Linux and Windows) connected to that modem.

So it would seem to me it's a kernel-level problem. Next up is to figure out how the kernel handles the ARP reply from the modem and writes it to the ARP cache. If someone can give me advice on how to do this, I'd appreciate it very much.
Accepted Answer
Nick Howitt

Offline
Tuesday, November 12 2019, 02:47 PM - #Permalink
Resolved

0 votes

As a thought, have you configured your WAN as a DHCP Server (Webconfig > Network > Infrastructure > DHCP Server)? It seems to be a relatively common misconfiguration, so please delete it if you have.

The DHCP client is dhclient. This looks after obtaining an IP address and, presumably, receiving the MAC address.
Accepted Answer

Marvin Martin

Offline
Thursday, November 14 2019, 11:54 PM - #Permalink
Resolved

0 votes

Nick Howitt wrote: As a thought, have you configured your WAN as a DHCP Server (Webconfig > Network > Infrastructure > DHCP Server)?

Good point--I've seen it happen on systems too. Unfortunately (from the perspective of this problem), my system is set correctly: the DHCP server is only enabled for the LAN interface--it is not enabled for this problematic WAN interface.

---------------

Today I spent several hours trying to dig deeper and read up on how the inner workings actually function. However, I haven't been able to nail down the exact cause yet.

From what I can tell about the problem, about the only two possible culprits are:
1. a bug in the kernel ARP module in either the neighbor discovery or the ARP table input handling
2. a bug in dhclient regarding linking the gateway IP to the ARP and passing it on to the ARP table

BUT:
#1 seems unlikely because the ARP module has been around such a long time--I'd think that all the bugs would be worked out by now.
#2 seems questionable because I haven't been able to locate any section in the code that would indicate that it touches the ARP table. (But then, I'm not familiar with C.)

----------------

What I know about the situation so far:

* The ethernet cable on the link in question has been replaced, with zero change
* I put an old 10/100 hub between the modem and the WAN interface, with zero change
* The ARP / IP Neighbour table always shows INCOMPLETE for my modem IP address:
gateway (192.168.0.1) at <incomplete> on p1p1

* syswatch can't reach the outside world, so it keeps blowing the interface away every 30 seconds or so. This seems to be a direct result of the gateway (modem) not being reachable due to the ARP issue.
* The ClearOS unit is constantly requesting the MAC for the gateway (modem), and every time the modem instantly responds. Confirmed with Wireshark on another computer (when I had the hub installed), and via tcpdump on the ClearOS unit. (see earlier post)
* dhclient receives an address as lickety-split as you can ask for as soon as the interface comes back up each time [syswatch resets it]:
[root@system ~]# dhclient -v p1p1
Internet Systems Consortium DHCP Client 4.2.5
Copyright 2004-2013 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/
Listening on LPF/p1p1/68:05:ca:80:10:82
Sending on LPF/p1p1/68:05:ca:80:10:82
Sending on Socket/fallback
DHCPREQUEST on p1p1 to 255.255.255.255 port 67 (xid=0x1f740187)
DHCPACK from 192.168.0.1 (xid=0x1f740187)
bound to 192.168.0.5 -- renewal in 38086 seconds.

* The route for the gateway is set properly (layer 3):
192.168.0.0/24 dev p1p1 proto kernel scope link src 192.168.0.5

The big questions I have right now are: What sets the MAC for the gateway (192.168.0.1) in the ARP table: something with dhclient DHCP response parsing, or ARP Neighbor Discovery? And what debugging can I turn on or other tools can I use to watch this happen (or, in this case, Not happen) live?

Thoughts anyone?
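On the first question: as far as I know, the neighbour (ARP) entry is filled in by the kernel's own ARP code when a reply arrives, not by dhclient. A minimal sketch for checking the entry from a shell follows. The helper name `neigh_state` and the "MISSING" label are made up for illustration, and it assumes the iproute2 `ip neigh show` output format, where the state (REACHABLE, STALE, INCOMPLETE, FAILED, ...) is the last field on each line.

```shell
# Hypothetical helper: print the neighbour state recorded for one IP.
# Reads `ip neigh show` output on stdin; $1 is the IP to look for.
neigh_state() {
    awk -v ip="$1" '
        $1 == ip { state = $NF }           # state is the last field on the line
        END {
            if (state == "") state = "MISSING"   # no entry at all for this IP
            print state
        }
    '
}
```

Something like `ip neigh show dev p1p1 | neigh_state 192.168.0.1` could be run in a loop alongside `ip monitor neigh` or `tcpdump -e -n -i p1p1 arp`, to see whether a reply on the wire ever moves the entry out of INCOMPLETE.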
Accepted Answer
Nick Howitt

Offline
Friday, November 15 2019, 10:57 AM - #Permalink
Resolved

0 votes

Have you rebooted since ClearOS Community went to 7.7? If "uname -r" shows 3.10.0-957.21.3.v7.x86_64 then try rebooting to activate the latest kernel, which is 3.10.0-1062.1.2.el7.x86_64.

I can't help with C as I am not a coder.
Accepted Answer

Marvin Martin

Offline
Friday, November 15 2019, 01:48 PM - #Permalink
Resolved

0 votes

I'm not sure what version of the kernel is running on this problem machine, and unfortunately I'm not nearby to check. I'll have to get it later and post then.

I've been avoiding a reboot as I fear that will cause it to start working--and having seen this issue appear randomly before on other machines, I'm very happy to have one that's staying "broken" in hopes of getting to the very bottom of it.

My prime goal is to find the exact cause of the issue for good in order to get it nailed down, and I'm willing to invest significant time and energy to achieve this (even if it is "inefficient" as opposed to a reboot).

What would it take to get a ClearOS developer involved at this point? I'd be willing to open a paid support ticket, but as it's a Community system, there's not an option for that per se in the ClearCenter panel.
Accepted Answer
Nick Howitt

Offline
Friday, November 15 2019, 02:22 PM - #Permalink
Resolved

0 votes

The developers generally don't get involved in "low level" issues (i.e. deep down in the system). Their focus is more on creating the Webconfig and the glue that ties the packages together. They do very little to other packages beyond package configuration, so I am not sure they would be of much help.

You should be able to open a per-incident ticket. I believe anyone can, but I suspect it wouldn't be accepted.

Have you tried swapping the NICs round? Unfortunately they both use the e1000e driver, so it won't prove much, except if there is some sort of EEPROM update available.
Accepted Answer

Marvin Martin

Offline
Friday, November 15 2019, 02:44 PM - #Permalink
Resolved

0 votes

Ok, that's good to know.

Regarding switching the NICs: that's a good idea--I'll give it a try if I run out of other options.

Thanks a lot @Nick, for your prompt and helpful replies!
Accepted Answer

Marvin Martin

Offline
Tuesday, November 26 2019, 01:57 AM - #Permalink
Resolved

0 votes

A small update on this project. I haven't been giving it much attention, but it still is in the "broken" state.

Nick Howitt wrote: If "uname -r" shows 3.10.0-957.21.3.v7.x86_64 ...

Yes, this is the kernel I'm running. (And, as noted earlier, I'm leaving it in this state [not rebooting] in an attempt to get to the bottom of the problem.)

----

My latest thought is to capture a memory dump with gdb, but I'm having a problem reading the dump files without the proper debug information. Could someone please point me to the debug-info package for syswatch and dhclient? I was unable to find them with some searching I did a little bit ago here.
Accepted Answer
Nick Howitt

Offline
Tuesday, November 26 2019, 08:30 AM - #Permalink
Resolved

0 votes

Marvin Martin wrote:Could someone please point me to the debug-info package for syswatch and dhclient?
I don't think there are any. Syswatch is a program written by ClearOS. It is written in Perl, which you can see in /usr/sbin/syswatch. If you knew any Perl (I don't), you could insert your own debug code into it. Dhclient comes from the CentOS repos, but I don't see any debug packages at all. I am not sure what calls dhclient, but can you try starting dhclient with the -v switch for verbose logging? It may mean finding the calling program.
Accepted Answer

Marvin Martin

Offline
Saturday, December 14 2019, 05:54 PM - #Permalink
Resolved

0 votes

Time for an update on this. I have been busy and not spending much time on the problem recently.

My latest analysis is that the system is not "listening for" or "processing" ARP traffic on the p1p1 interface. This is presumably why it is not "getting" the ARP reply from the modem.

Here's how I came to that conclusion:
I manually set the correct ARP table entry for my gateway. But I still can't ping the gateway: the packets just get lost.

tcpdump now shows the modem asking constantly for my system's MAC address, but my system never replies (at least according to tcpdump running on my system).

Similarly, tcpdump shows as if the system is making outgoing ping requests, but never logs replies--presumably because it is not seeing/listening for them (let alone that IP is not working due to the foundational ARP issue).

I've stopped syswatch, the system firewall, as well as some other firewall-related processes I'm running -- snort and fail2ban. No change.

So, with all that said: could someone advise me on what process or kernel module could be failing, causing the system to stop listening for/processing ARP packets?
Accepted Answer

Marvin Martin

Offline
Tuesday, June 30 2020, 09:18 PM - #Permalink
Resolved

0 votes

I just ran into this issue again with a client today. New ClearOS server, in Gateway mode, with the WAN port connected to a fiber ONT/modem: the ClearOS unit would simply not pull an address from the ISP.

With the help of an onsite tech, I was able to get SSH access and discovered that syswatch was very rapidly restarting the interface:

Tue Jun 30 15:27:26 2020 info: system - WAN network is not up
Tue Jun 30 15:27:36 2020 info: p3p1 - ping check - no IP available
Tue Jun 30 15:27:36 2020 info: p3p1 - restarting DHCP connection
Tue Jun 30 15:27:42 2020 info: system - WAN network is not up
Tue Jun 30 15:27:52 2020 info: p3p1 - ping check - no IP available
Tue Jun 30 15:27:52 2020 info: p3p1 - restarting DHCP connection
Tue Jun 30 15:27:59 2020 info: system - WAN network is not up
Tue Jun 30 15:28:09 2020 info: p3p1 - ping check - no IP available
Tue Jun 30 15:28:09 2020 info: p3p1 - restarting DHCP connection
Tue Jun 30 15:28:15 2020 info: system - WAN network is not up
Tue Jun 30 15:28:25 2020 info: p3p1 - ping check - no IP available
Tue Jun 30 15:28:25 2020 info: p3p1 - restarting DHCP connection
Tue Jun 30 15:28:31 2020 info: system - WAN network is not up
Tue Jun 30 15:28:41 2020 info: p3p1 - ping check - no IP available
Tue Jun 30 15:28:41 2020 info: p3p1 - restarting DHCP connection

After stopping the syswatch service with "systemctl stop syswatch.service" and waiting a bit (around 30 seconds or so), the system successfully pulled an IP address from the ISP. After that, I restarted the syswatch service, and it was happy.

I'm rather puzzled over why syswatch was resetting the interface so rapidly. I tried running a tcpdump capture while the problem was in-progress, but I couldn't get anything meaningful before the interface was taken down. It would seem that it wasn't even getting the DHCP request sent out.

I did start a tcpdump capture after stopping syswatch, and did find some interesting traffic: basically the ONT unit was sending out a bunch of LOOP packets to some unknown MAC address. A short time after the DHCP traffic establishes IP connectivity, these LOOP packets stop. I don't know if this is a factor or not.
Accepted Answer
Nick Howitt

Offline
Wednesday, July 01 2020, 08:08 AM - #Permalink
Resolved

0 votes

Have a look in /etc/syswatch. There is an option "failed_interval" which you could consider increasing. Yours is retrying every 16-17s which is probably the 15s plus processing time.
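To put a number on that retry period, the gap between consecutive "restarting DHCP connection" lines can be computed from the log. A rough sketch, assuming the timestamp layout seen in the syswatch excerpts in this thread (time of day in the fourth field) and that all the lines fall on the same day; `restart_gaps` is a hypothetical helper name:

```shell
# Hypothetical helper: print the number of seconds between consecutive
# "restarting DHCP connection" entries. Reads syswatch log lines on stdin.
# Assumes every line starts like "Tue Jun 30 15:27:36 2020" and that the
# entries do not cross midnight.
restart_gaps() {
    awk '/restarting DHCP connection/ {
        split($4, t, ":")                       # HH:MM:SS
        now = t[1] * 3600 + t[2] * 60 + t[3]    # seconds since midnight
        if (seen) print now - prev
        prev = now
        seen = 1
    }'
}
```

It would be used as `restart_gaps < /var/log/syswatch`. For the five restarts in the log excerpt posted on June 30, it gives gaps of 16, 17, 16 and 16 seconds.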
Accepted Answer
Nick Howitt

Offline
Saturday, July 11 2020, 08:54 AM - #Permalink
Resolved

0 votes

As another thought, if you don't have the file /etc/dhcp/dhclient.conf, create one, then put in it:
supersede dhcp-server-identifier 255.255.255.255;
I used to do it for a different reason (where the ISP was sending back an invalid DHCP server in the reply), but another user has reported success where he was not getting an IP address at all.

[edit]
Have a look in this thread for more info. I used that setting for years until I changed ISP but, as noted in the thread, it caused me a problem once.
[/edit]
Accepted Answer

Marvin Martin

Offline
Monday, October 19 2020, 07:00 PM - #Permalink
Resolved

0 votes

Thanks for the advice, Nick.

I haven't tried your second suggestion, but I did have an instance a few months ago that was helped by increasing the syswatch "failed_interval".

However, just this month, I've had two boxes encounter this exact issue again. Both times there'd been a power outage. The ClearOS7 operating system would come up just fine, but the External interface would just cycle Yes/No on the Link status instead of properly pulling a DHCP lease from the ISP.

The first situation like this I worked with this month was the same box I referenced above that I'd already increased the "failed_interval" on. I spent considerable time on the issue and was able to perform a packet capture on the interface while the problem was active. It appeared that the usual DHCP communication simply wasn't happening. After much more fiddling (and getting it working), I discovered the dhclient commands. ("dhclient -r [interface]" to release and "dhclient [interface]" to renew; use the "-v" option to get detailed output)

So today when this second situation popped up, I figured I had a quick fix: just run dhclient to get an address. But no, it didn't quite work that easily: though dhclient did successfully pull an address, apparently syswatch didn't get the note, because the interface was promptly taken down again.

Next up was to turn off syswatch (stop and disable), which I did. But then the link was stuck in the "down" state. So I used "ip link set dev [interface] down && ip link set dev [interface] up" to resolve this. Finally, I re-ran the dhclient commands, and the system immediately pulled an address from the ISP, which showed up in Webconfig, and the system started to route traffic.

FWIW, though, it seems that the "ip link set" commands can interfere with the firewall configuration; it was necessary to trigger a firewall reload to get ports in Incoming Firewall to open again. (Probably "systemctl restart firewall.service" would have done the trick as well.)
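For reference, the manual steps above can be collected into one function. This is only a sketch of what I did by hand, not a tested fix: the function names and the DRY_RUN switch are made up for illustration, and the final firewall restart is the "probably would have done the trick" guess from above rather than something I have verified.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: manual recovery of a DHCP interface, $1 = interface.
recover_dhcp_iface() {
    iface="$1"
    run systemctl stop syswatch.service            # keep syswatch from bouncing the link
    run ip link set dev "$iface" down || return 1  # clear the stuck "down" state
    run ip link set dev "$iface" up || return 1
    run dhclient -r "$iface"                       # release any existing lease
    run dhclient -v "$iface" || return 1           # request a lease, verbose output
    run systemctl restart firewall.service         # assumption: reopens the incoming ports
}
```

Calling it with `DRY_RUN=1` set first prints the six commands without touching the system, which is a cheap way to check the order before running it for real as root. Note that it leaves syswatch stopped.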

Takeaway:
While I don't completely understand why this issue keeps happening, I'm going to speculate it could be resolved by
1. increasing the syswatch failed_interval just a bit, and/or
2. automatically triggering dhclient when syswatch is restarting a DHCP interface.

That said, for Multi-WAN failover setups, it's perfectly understandable why you'd want as low a "failed_interval" time as possible to minimize downtime. Thankfully, very few of our deployments use Multi-WAN, so disabling syswatch is a viable option for us most of the time.

Eager to hear your thoughts on this. Thanks for your time!
Accepted Answer

Marvin Martin

Offline
Monday, October 19 2020, 07:59 PM - #Permalink
Resolved

0 votes

Just thinking about this a bit more: I think this is the real question: Is dhclient getting automatically triggered every time an interface state changes to UP? I'd think that it should be, but these experiences seem to indicate that it's not.
Accepted Answer
Nick Howitt

Offline
Monday, October 19 2020, 08:07 PM - #Permalink
Resolved

0 votes

Hi Marvin, I appreciate your follow-up. I don't really know syswatch or MultiWAN much so I have to defer to the experts. I think the routines for bringing an interface up (the ifup command) come from upstream but syswatch is ours. I believe you had Peter looking at this and I'll draw your reply to his attention.
Accepted Answer

Marvin Martin

Offline
Monday, December 21 2020, 10:09 PM - #Permalink
Resolved

0 votes

Thanks for your replies on this issue, Nick, and for the tip about the location of the place where syswatch triggers an interface restart.

Just had another system (a network gateway) today that had "no internet" after a power loss. Going off my previous findings, I had a packet capture started for the sake of evidence, and then ran the dhclient script for the External interface. It immediately pulled an address and internet was up (though I needed to restart the firewall to get the Incoming Ports opened as shown in Webconfig).

Note: I'd disabled syswatch on this particular system over a month ago because of the problems it was having, so this system was not cycling Link Yes/No.

With this experience, I'm fully persuaded that the Link Yes/No cycling encountered earlier is due to the lack of an IP address, which for all purposes seems to be due to the dhclient script not getting called when it should be. We know that dhclient isn't getting called for two reasons: 1. dhclient logs all of its activity to /var/log/messages, and there were zero DHCP requests while this problem was occurring. (Given that syswatch was disabled, the only one that should have existed in messages would have been from when ClearOS first brought up the interface on boot.) 2. The tcpdump captures didn't show any DHCP activity until I manually triggered it.
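The log check in point 1 can be sketched as a one-line count. It assumes dhclient's log lines contain the same "DHCPREQUEST on <interface> ..." / "DHCPDISCOVER on <interface> ..." wording that `dhclient -v` prints to the terminal; `dhcp_request_count` is a hypothetical helper name.

```shell
# Hypothetical helper: count DHCPDISCOVER/DHCPREQUEST lines for an interface.
# $1 = interface, $2 = log file (defaults to /var/log/messages).
dhcp_request_count() {
    grep -E "DHCP(DISCOVER|REQUEST) on $1 " "${2:-/var/log/messages}" | wc -l | tr -d ' '
}
```

A result of 0 from `dhcp_request_count p1p1` over the period in question would match what I saw: no DHCP traffic attempted at all.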

I attempted to replicate the "no-internet" problem in my lab, but was not able to reproduce it. With some research, I found that when syswatch restarts an interface, it first takes it down with /sbin/ifdown and then calls /sbin/ifup. When things are working, this (apparently) calls /etc/sysconfig/network-scripts/ifup-eth ifcfg-<interface>, which calls dhclient for that interface.
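As far as I can tell, ifup-eth only starts dhclient for interfaces whose config asks for DHCP, so one precondition worth ruling out is the BOOTPROTO setting in the interface's ifcfg file. A small sketch, assuming the usual KEY=value layout of ifcfg files with optional quotes; `ifcfg_bootproto` is a hypothetical helper name:

```shell
# Hypothetical helper: print the BOOTPROTO value from an ifcfg file,
# lower-cased and with any quotes removed. Reads the file on stdin.
ifcfg_bootproto() {
    sed -n 's/^BOOTPROTO=//p' | tr -d "\"'" | tr '[:upper:]' '[:lower:]' | tail -n 1
}
```

For example, `ifcfg_bootproto < /etc/sysconfig/network-scripts/ifcfg-p1p1` should print `dhcp` on an interface that is meant to pull its address from the ISP.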

This leaves a big question: why isn't dhclient getting called in these random situations? Perhaps the sysconfig/network-scripts/ifup script isn't getting called? (because /sbin/ifup has zero references to dhclient that I could find).

The boot log for this system does indicate that starting LSB-networking failed on boot. Again, I'm guessing, this was due to the lack of an IP address.

So what can be done? Theoretically we'd keep exploring until we ferret out the precise reason that dhclient isn't getting called. This would indeed be the preferred method. However, in the interim, I'd recommend that Clear update the syswatch program to call dhclient as part of a DHCP-configured interface restart routine (along with a firewall restart), or possibly call /etc/sysconfig/network-scripts/ifup-eth ifcfg-<interface> directly, rather than relying on /sbin/ifup , and then we could consider it case closed.

Does anyone have further thoughts? What would be the next step to get the dhclient trigger accepted into syswatch? A Git pull request?

Side note: while doing lab testing, I discovered that dhclient will eventually fall back to a cached dhcp lease file if it doesn't get a response to its DHCP requests.

Nick Howitt
Saturday, December 26 2020, 11:00 AM

I honestly don't know how to help more. I thought obtaining a DHCP IP is done by dhclient and so is outside ClearOS. It is pure CentOS. Beyond that, syswatch monitors for a change in IP then does its magic. If it does not have an IP, presumably it is repeatedly resetting its connection. Somehow it must be possible to debug syswatch by printing variables in various parts of the program, but you need a failing system, which I don't have. I also don't know Perl but, from what I saw of it, it may be quite easy to add some debugging. I did a little when it was last updated.

As an idea, when dhclient updates its IP, it writes to a file in /var/lib/dhclient. There is some horrible logic in syswatch to try to pick up the correct file. Perhaps it is picking up the wrong file. There is further logic in syswatch which then picks up the latest lease as multiple leases are held in the file and you need the latest, which is probably the last. It would be worth inspecting the file.
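To make the lease-file inspection concrete, here is a rough sketch that prints only the last "lease { ... }" block from a dhclient lease file. This is not syswatch's own logic (syswatch is written in Perl); last_lease is a made-up helper name, and it assumes the usual dhclient.leases layout where each lease starts with "lease {" at the beginning of a line and ends with a line starting with "}".

```shell
# Print the last lease block found in a dhclient lease file.
# Usage: last_lease <lease-file>
# (hypothetical helper; assumes the standard "lease { ... }" layout)
last_lease() {
    awk '
        /^lease \{/ { block = ""; inblock = 1 }   # a new lease starts: discard the previous one
        inblock     { block = block $0 "\n" }     # collect every line of the current lease
        /^\}/       { inblock = 0 }               # closing brace ends the lease
        END         { printf "%s", block }        # whatever is left is the last lease in the file
    ' "$1"
}

# On a live system, list the lease files first, newest at the top:
#   ls -lt /var/lib/dhclient/
# then run last_lease on the file that belongs to the DHCP interface.
```

Comparing the interface and fixed-address lines of that last block with what the interface actually has (or should have) would show whether the wrong file or a stale lease is being picked up.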

Nick Howitt
Tuesday, December 29 2020, 07:11 PM

A customer has reported that, with IPv6 enabled, syswatch keeps killing the interface and he had to disable it. I didn't think syswatch was IPv6 enabled, but it may be worth investigating. See method 2 here.

Marvin Martin
Thursday, January 21 2021, 03:53 PM

Thanks for your thoughts, Nick. Regarding your last two posts:

dhclient lease picking problem: this is a fascinating idea; however, as noted under Problem #2 below, I think we can permanently overcome whatever lease struggles may come into play on a DHCP interface reset by simply doing a release/renew by default.

IPv6: I don't think Syswatch is IPv6-aware either, so it would make sense that IF a customer has an IPv6-only connection, despite having connectivity through it, syswatch would reset the interface because of not being able to get ICMP responses on IPv4. (But maybe IPv6-only isn't the type of situation you're referencing.)

--------------------

I think it's time to wrap up this long thread as I'm feeling fairly confident we're basically at the end of the rabbit hole.

I feel that I've covered what is likely more than one problem on this long thread. Here's a summary of the current state of affairs:

Problem #1: Random connectivity loss on an interface even though an IP address was present. Status: I haven't worked with any of these situations in a long time. I'm going to assume that they were one-off issues and/or there were underlying hardware issues and/or the issue was fixed in a software update along the way, and so for now I'm going to say this is "resolved".

Problem #2: External interfaces configured via DHCP fail to pull an address and the interface Link Status cycles Yes/No in Webconfig. Status: This has been the focus of this thread since my June 30, 2020 post.

Re-stating the Problem: A ClearOS unit is unable to pull an IP address via DHCP; typically from ISP CPE equipment.
- (However, if another router or computer is connected to the ISP CPE equipment, it instantly pulls an address via DHCP.*)
- The Link Status displayed in Webconfig for the interface in question will show "Yes" the majority of the time, but will change to "No" briefly on a regular basis. This corresponds with the Link light blinking out briefly on the NIC.
- However, even after waiting several minutes, the interface does not display an IP address in Webconfig. This state continues indefinitely, frequently even persisting after a reboot.

To summarize my findings on this for everyone's benefit:
- The Link Status cycling Yes/No is the work of Syswatch: if it can't detect upstream connectivity within the time it is configured for, it resets the network interface. So, if you have an interface that is cycling Yes/No for the Link Status in Webconfig (or the Link status light blinks out on the physical interface every 15 seconds or so), you can know that Syswatch can't see upstream connectivity and is resetting the interface.
- The log for the Syswatch process is at: /var/log/syswatch
- The configuration for the Syswatch process is at: /etc/syswatch
- To enable debug-level logging, set "debug=7" in the Syswatch configuration file and restart the process: "systemctl restart syswatch.service"
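The debug-logging step above could be scripted roughly as follows. This is only a sketch: enable_syswatch_debug is a made-up helper name, and it assumes the configuration file uses plain "key=value" lines with the debug setting written as "debug=..." (possibly commented out). Check your own /etc/syswatch before relying on it.

```shell
# Set debug=7 in a syswatch-style key=value config file.
# Usage: enable_syswatch_debug [config-file]   (defaults to /etc/syswatch)
# (hypothetical helper; assumes a "debug=<n>" line format)
enable_syswatch_debug() {
    conf="${1:-/etc/syswatch}"
    cp -p "$conf" "$conf.bak"    # keep a backup of the original file
    if grep -q '^[#[:space:]]*debug[[:space:]]*=' "$conf"; then
        # Replace an existing (or commented-out) debug line.
        sed -i 's/^[#[:space:]]*debug[[:space:]]*=.*/debug=7/' "$conf"
    else
        # No debug line present: append one.
        echo 'debug=7' >> "$conf"
    fi
}

# On a live system (as root):
#   enable_syswatch_debug
#   systemctl restart syswatch.service
#   tail -f /var/log/syswatch
```

Restoring the .bak copy and restarting the service again returns Syswatch to its previous logging level.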

As mentioned in my October 19, 2020 post, I became suspicious that whatever was responsible for pulling the DHCP address at the time of the interface reset was not doing so. When a manual dhclient pull immediately fetched an address, that strengthened this hypothesis.
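For reference, a manual release/renew of this kind can be sketched as below. This is not the syswatch patch itself, just a by-hand equivalent; renew_dhcp and the DRY_RUN switch are made up for illustration, and p1p1 is the interface from this thread. Note that if dhclient was started by the network scripts with its own pid/lease file options, a plain "dhclient -r" may not find that running instance.

```shell
# Release and re-request a DHCP lease on one interface.
# Usage: renew_dhcp <interface>
# Set DRY_RUN=1 to print the commands instead of running them.
# (hypothetical helper, not the code used inside syswatch)
renew_dhcp() {
    iface="$1"
    run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    $run dhclient -r "$iface"    # -r: release the current lease and stop the running client
    $run dhclient "$iface"       # start a fresh client, which requests a new lease
}

# Preview first, then run for real (as root):
#   DRY_RUN=1; renew_dhcp p1p1
#   DRY_RUN=0; renew_dhcp p1p1
```

If the second command returns promptly and the interface then shows an address, that matches the behaviour described above.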

Nick then suggested adding dhclient release/renew lines to the "Restart for DHCP connection" code in Syswatch.

I've done this on a few systems, and it's looking very likely to be The Patch for this problem.

Based on this, I've opened this Issue on the Syswatch repo, and will see about getting it accepted there.

Many thanks to Nick for his faithfulness in replying and giving ideas!

--------------------

* In theory, MAC Locks by ISPs can enter in here. However, for the purposes of this post, assume there are no MAC locks. IF your ISP does do MAC locks, be advised no amount of DHCP re-requesting will resolve the "no address" issue until your ISP releases the MAC lock.

Nick Howitt
Friday, January 22 2021, 04:00 PM

I've patched syswatch and there is a version now available for testing. Note this only affects people with an interface which gets its IP address by DHCP. To test, please do a:
yum update syswatch --enablerepo=clearos-updates-testing