Hello all,

I recently enabled QoS on my home ClearOS firewall.
I started with a very simple, single-rule configuration for testing purposes and I can't seem to make it work, even though it looks to be configured properly.

To start, I am just trying to set all traffic from a single source IP to priority 1.
I see the tc classes and qdiscs, and I can see the iptables "mark" rules it generates.
The problem is, even though the hit counter on the mark rule is increasing correctly, the traffic does not seem to be hitting the correct traffic class/queue.
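
For reference, this is roughly how I have been checking things (a sketch; enp1s0 is just my interface name, substitute your own):
iptables -t mangle -L -n -v | grep -i mark   # confirm the mark rule is matching (hit counters)
tc -s filter show dev enp1s0                 # confirm a filter maps that mark to a class
tc -s class show dev enp1s0                  # watch the per-class byte/packet counters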

Can anyone please take a look at the attached screenshots and point me in the right direction?
My knowledge of QoS on Linux is decent (not great, but not terrible) and, from what I can see, this looks like it should be working... but I am clearly missing something.

Any assistance would be greatly appreciated!

--Sean
Sunday, January 05 2020, 12:13 AM
Responses (15)
  • Accepted Answer

    Sunday, January 05 2020, 12:26 PM
    I always wondered how to test this, as there was a big related change partway through 7.6, IIRC. For a long time ClearOS had been patching the kernel to add an IMQ device used for QoS, but the stock kernel comes with native IFB support. The aim was for ClearOS to move away from a patched kernel to an upstream kernel, as the overhead of patching the kernel was quite high, but this meant moving QoS over to IFB.

    Anyway, can you do me a favour and try the same test using IMQ to see if you get the same results? To do this you will need to reboot ClearOS to a v7 kernel (so not the 1062 line, as those are all el7). You get about 5 seconds during boot in which you can select a different kernel.
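
    If it helps, you can list the kernels GRUB will offer with something like this (assumes a BIOS install with the config at /boot/grub2/grub.cfg; EFI installs keep it elsewhere):
    awk -F\' '/^menuentry /{print $2}' /boot/grub2/grub.cfg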

    Once you have booted to the older kernel, go to /etc/clearos/qos.conf and change:
    QOS_ENABLE_IFB="yes"
    to:
    QOS_ENABLE_IFB="no"
    Then restart the firewall with a:
    systemctl restart firewall
    This will flip the QoS mechanism to IMQ. Then can you repeat your test?
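
    As a one-shot sketch of the same edit (assuming the key appears exactly once in the file):
    sed -i 's/^QOS_ENABLE_IFB="yes"/QOS_ENABLE_IFB="no"/' /etc/clearos/qos.conf
    systemctl restart firewall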

    Do not try running with IMQ on an el7 kernel as the firewall will panic.
  • Accepted Answer

    Sunday, January 05 2020, 10:43 PM
    I'm running 7.7.2
    I have two non-el7 kernels and neither will boot. After selecting them in the boot menu, the screen goes blank (as it does with the working kernels) but never comes back to life.
  • Accepted Answer

    Monday, January 06 2020, 12:47 PM
    I am not sure why you can't boot to the old kernels. I've just tested on my test box and I had no problem.

    I really don't know how to troubleshoot this. I can try and contact the dev who did this but he has moved on. Otherwise it is off to the LARTC mailing lists.

    One thing I am curious about is why you changed your Rate to Quantum. It is probably not relevant, as I get similar output to you.
  • Accepted Answer

    Monday, January 06 2020, 03:58 PM
    Apparently the implementation may only be working for upstream.... :(
  • Accepted Answer

    Tuesday, January 07 2020, 02:38 PM
    Nick Howitt wrote:

    Apparently the implementation may only be working for upstream.... :(


    I'm not sure what you mean... I don't see it as working for either upstream or downstream.
    Also, do you know why the ClearOS implementation would put the iptables reference in the POSTROUTING chain as opposed to PREROUTING?

    I tried leaving the quantum as auto and also setting it manually to approx. 90% of my link speed up/down. Neither worked, but the issue isn't the size of the queue; it is that the packets are not correctly hitting the right queue.

    It seems to me like the entire QoS implementation is fundamentally broken...
  • Accepted Answer

    Tuesday, January 07 2020, 02:42 PM
    Nick Howitt wrote:

    I am not sure why you can't boot to the old kernels. I've just tested on my test box and I had no problem.
    .


    Is it possible to get the kernel source packages that ClearOS is built on, so I can try rolling a custom kernel with IMQ enabled?
  • Accepted Answer

    Tuesday, January 07 2020, 03:18 PM
    It is looking like the upstream rules go in the POSTROUTING chain and the downstream rules in the PREROUTING chain. Where would you expect the rules to be?

    With respect to the kernel sources, we no longer compile our own and get them straight from CentOS, so they will have the sources. You have made me think, however: if you disable QoS before you reboot, can you boot to the old kernel? The reason I ask is that iptables also needs to be patched for IMQ. I wonder if a mistake was made there and whether we should have still patched iptables to allow people to boot to old kernels with QoS. I'll ask some questions.
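
    If you want to see exactly where the rules landed, something like this shows both chains with their hit counters:
    iptables -t mangle -L PREROUTING -n -v --line-numbers
    iptables -t mangle -L POSTROUTING -n -v --line-numbers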
  • Accepted Answer

    Tuesday, January 07 2020, 03:22 PM
    Nick Howitt wrote:

    It is looking like the upstream rules go in the POSTROUTING chain and the downstream rules in the PREROUTING chain. Where would you expect the rules to be?

    With respect to the kernel sources, we no longer compile our own and get them straight from CentOS, so they will have the sources. You have made me think, however: if you disable QoS before you reboot, can you boot to the old kernel? The reason I ask is that iptables also needs to be patched for IMQ. I wonder if a mistake was made there and whether we should have still patched iptables to allow people to boot to old kernels with QoS. I'll ask some questions.


    Other implementations I have seen put the mark rules directly in the PREROUTING chain for both upstream and downstream.
    I will try disabling QoS and booting one of the other kernels when I get home tonight, and will let you know how it goes.
  • Accepted Answer

    Tuesday, January 07 2020, 03:28 PM
    I just noticed something interesting... while there are no packets hitting the 1:10 queue, as I would expect given the mark rule, there ARE packets hitting the 1:11 queue, which I don't understand.
    Could this be a simple "off-by-one" typo somewhere, putting the packets in the wrong queue?
    Is it possible to packet-capture only a specific tc queue with tcpdump, to confirm the packets hitting 1:11 are the ones being marked by the lone iptables rule setting 0xa (which should go to 1:10)?
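
    One idea I had, though it is untested (and assumes this tcpdump build supports the nflog pseudo-interface): log the marked packets to an NFLOG group and capture that instead, e.g.:
    iptables -t mangle -A POSTROUTING -m mark --mark 0xa -j NFLOG --nflog-group 5
    tcpdump -i nflog:5 -nn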
  • Accepted Answer

    Tuesday, January 07 2020, 07:55 PM
    Rather than waste your time, a couple of builds have been pushed through this afternoon. Can you do a:
    yum update app-qos --enablerepo=clearos-updates-testing
    You are trying to update to app-qos-2.5.5-1.v7. If it is not there, wait an hour or so. The dev says this does not fix all your issues (reading between the lines, the wrong bucket possibly being used), but it is a better starting point.

    If you go back to an old kernel, you may need to try with QoS disabled, and you may also need to downgrade iptables, but there is no way of doing that unless I make the package specifically available to you. The person I needed to speak to tonight about iptables is not around, so I wouldn't even try testing IMQ.
  • Accepted Answer

    Tuesday, January 07 2020, 08:16 PM
    Understood.
    I'll try updating the QoS package and let you know if that has any effect.
    Did the dev seem to think the "off-by-one" typo theory on the 1:10 vs 1:11 queue was a possibility?
    And, if so, what would a fix for that look like from a turnaround standpoint?

    Thanks for your assistance thus far. Much appreciated
  • Accepted Answer

    Tuesday, January 07 2020, 08:36 PM
    So good news and bad news...

    Bad news is, after looking at the "/etc/clearos/qos.conf" file, I realized the two included rules (for <=64-byte packets) are where the packets in 1:11 were coming from.
    QOS_PRIOMARK4_CUSTOM="\
    TCP_ACK_Up|*|1|0|1|-p tcp -m length --length :64
    TCP_ACK_Down|*|1|1|1|-p tcp -m length --length :64"

    Good news is, after upgrading to 2.5.5-1.v7, the 1:10 queue is working as expected now.

    Thank you very much for the help!



    Every 1.0s: tc -s -g class show dev enp1s0 Tue Jan 7 15:36:33 2020

    +---(1:1) htb rate 6Mbit ceil 6Mbit burst 1599b cburst 1599b
    | Sent 7843405 bytes 87944 pkt (dropped 0, overlimits 0 requeues 0)
    | backlog 0b 0p requeues 0
    |
    +---(1:11) htb prio 1 rate 900Kbit ceil 6Mbit burst 1599b cburst 1599b
    | Sent 5490979 bytes 83506 pkt (dropped 0, overlimits 0 requeues 0)
    | backlog 0b 0p requeues 0
    |
    +---(1:10) htb prio 0 rate 900Kbit ceil 6Mbit burst 1599b cburst 1599b
    | Sent 33768 bytes 144 pkt (dropped 0, overlimits 0 requeues 0) <<<<<<<<<<<<<<<<<<<<<<<<
    | backlog 0b 0p requeues 0
    |
    +---(1:13) htb prio 3 rate 840Kbit ceil 6Mbit burst 1599b cburst 1599b
    | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
    | backlog 0b 0p requeues 0
    |
    +---(1:12) htb prio 2 rate 840Kbit ceil 6Mbit burst 1599b cburst 1599b
    | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
    | backlog 0b 0p requeues 0
    |
    +---(1:15) htb prio 5 rate 840Kbit ceil 6Mbit burst 1599b cburst 1599b
    | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
    | backlog 0b 0p requeues 0
    |
    +---(1:14) htb prio 4 rate 840Kbit ceil 6Mbit burst 1599b cburst 1599b
    | Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
    | backlog 0b 0p requeues 0
    |
    +---(1:16) htb prio 6 rate 840Kbit ceil 6Mbit burst 1599b cburst 1599b
    Sent 2318658 bytes 4294 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0
  • Accepted Answer

    Tuesday, January 07 2020, 09:14 PM
    The small packets are typically the SYN and ACK packets, so they are given the highest priority. SSH is next because, if there is a problem, you need SSH to sort it out. For these reasons it is not recommended to use these two priority classes for your own rules.

    I've no idea about any turnaround time. I've had only a couple of e-mails from the dev. He is in Canada and I'm in the UK, so it is not like wandering down to chat to him.

    You are welcome to have a look at the code. It is in /usr/clearos/apps/qos/deploy/libqos.lua. I am not really a coder and I don't find Lua easy to read, especially its "if" statements and loops. The guts of it come from RunBandwidthExternal(). There is an offset in some of the loops which may need adjusting, but I'm guessing.
  • Accepted Answer

    Tuesday, January 14 2020, 09:55 PM
    Hi Sean,
    There is now an update available, or syncing to the mirrors over the next couple of hours. Can you try:
    yum update app-firewall app-qos *netify-* --enablerepo=clearos-updates-testing,clearos-contribs-testing
    If you don't have any of the netify suite, including the Application and Protocol Filters, then you can drop *netify-* from the command. You must update app-firewall at the same time as app-qos.
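
    Afterwards, you can confirm which versions you ended up with using:
    rpm -q app-firewall app-qos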
  • Accepted Answer

    Wednesday, January 15 2020, 05:11 PM
    There is a further update to app-qos, to app-qos-2.5.7-1.v7, as downstream marking was not taking place for packets to a specified LAN IP. To install, do a:
    yum update app-qos --enablerepo=clearos-updates-testing


    The next bit is from the devs:
    When testing, to monitor the upstream classes, your tc command works:
    tc -s -g class show dev ppp0
    Obviously change your external interface to suit.

    To monitor downstream classes you need to monitor the ifbX device e.g:
    tc -s -g class show dev ifb0
    For single-WAN environments, the IFB interface will always be ifb0.
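
    If you want to watch both directions at once, something like this works (interface names are just examples):
    watch -n1 'tc -s -g class show dev ppp0; echo; tc -s -g class show dev ifb0'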

    For multi-WAN, running firewall-start in debug mode will display the mapping, such as:
    # /usr/sbin/firewall-start -d
    ...
    firewall: Running external QoS bandwidth manager: app-qos/libqos
    firewall: 0: 1 => ppp0
    firewall: 1: 2 => ppp1
    firewall: 2: 3 => ens34
    ...
    This is a little ugly, but it tells us that:

    ifb0 is mapped to ppp0.
    ifb1 is mapped to ppp1.
    ifb2 is mapped to ens34.
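
    You can also simply list the IFB devices present with:
    ip link show type ifb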