Live Monitoring of Web Traffic in Proxy and Content Filter
This guide demonstrates how an admin can use the log files for the proxy server and content filter to monitor web traffic. Using the log files, you can dynamically watch user connections and filter your view of the traffic by IP, username, or even website.
ClearOS lets you monitor log files from the command line as they are populated. Log into your ClearOS server and we will take a look at one of the two log files associated with web traffic: the squid log (proxy) and the dansguardian-av log (content filter).
If you are only using the proxy, you will use the squid log file. If you are using the content filter you can use either. For the most part, they are the same. A key difference is that the squid file uses Unix epoch time for the entries and the dansguardian-av log uses a more human-friendly format.
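Because the squid log stores its timestamp as Unix epoch seconds, it can help to translate that first field while reading. The following is a minimal sketch, not something shipped with ClearOS: the function name to_readable and the sample log line are hypothetical, and it assumes a date command that accepts -d @SECONDS (as GNU date does).

```shell
# Hypothetical helper: replace the leading epoch timestamp on each line
# with a human-readable UTC date. Assumes GNU-style `date -d @SECONDS`.
to_readable() {
  while read -r stamp rest; do
    # Drop the fractional part (".853") before handing the value to date.
    printf '%s %s\n' "$(date -u -d "@${stamp%%.*}" '+%Y-%m-%d %H:%M:%S')" "$rest"
  done
}

# Made-up line in squid's native log layout, for illustration only.
sample='1380779973.853 42 192.168.1.101 TCP_MISS/200 982 GET http://www.google.com/favicon.ico user1 DIRECT/203.0.113.10 image/x-icon'

converted=$(printf '%s\n' "$sample" | to_readable)
echo "$converted"
# The line now starts with: 2013-10-03 05:59:33
```

To use it on live traffic, pipe the output of tail on the squid log through the function. The times are printed in UTC because of the -u flag; drop that flag to get the server's local time instead.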
There are a number of ways to view these files, including opening them in an editor such as nano or vi, or using 'cat' to print the whole file to the screen. For this demonstration we will be using 'tail'.
Tailing a file in Linux
Tailing a file in Linux means viewing only the end of the file. By default this is the last ten lines, but you can specify more. For our use, we will follow the file instead of just printing the most recent ten lines. To follow the file, issue the following command (using the appropriate log file):
tail -f /var/log/dansguardian-av/access.log
This command follows the log file: as the file grows, each new line is printed to the screen. It will continue to do so until you cancel it with Ctrl+C. While the content filter or proxy is running, it will show you every link that is hit, in real time, as your users browse the internet.
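To get a feel for the difference between printing the end of a file and following it, you can experiment on a scratch file first. This is only a sketch: the temporary file and its numbered lines are stand-ins for a real log.

```shell
# Scratch file standing in for a real log (hypothetical data: 25 numbered lines).
log=$(mktemp)
seq 1 25 > "$log"

# With no options, tail prints the last 10 lines (16 through 25 here).
tail "$log"

# -n sets how many lines to print; here only the last 3.
last3=$(tail -n 3 "$log")
echo "$last3"

# On the server, -n and -f can be combined: show the last 50 entries,
# then keep following new ones until Ctrl+C.
# tail -n 50 -f /var/log/dansguardian-av/access.log

rm -f "$log"
```

The commented-out line is the live variant; it is left commented here because a follow never ends on its own.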
Searching the results using grep
The utility 'grep' is a regular expression matcher. It displays every line that contains what you are searching for and ignores all other lines. Grep can match on 'regex' patterns or on simple words. To use tail and grep together, we pipe the standard output of tail (the data that would otherwise go to the screen) into the standard input of grep, give grep our search term, and it displays only the matching lines.
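The same filtering can be tried offline before pointing it at a live log. The sketch below assumes a few sample lines in the dansguardian-av layout; the second user (user2) and the address 192.168.1.102 are made up for illustration.

```shell
# Hypothetical sample lines in the dansguardian-av access.log layout.
log=$(mktemp)
cat > "$log" <<'EOF'
2013.10.3 7:59:37 user1 192.168.1.101 http://www.google.com/favicon.ico GET 982 0 2 200 image/x-icon sales -
2013.10.3 7:59:38 user2 192.168.1.102 http://www.bing.com/s/wlflag.ico GET 894 0 2 200 image/x-icon sales -
2013.10.3 7:59:39 user1 192.168.1.101 http://www.facebook.com/favicon.ico GET 1150 0 2 200 image/x-icon sales -
EOF

# Filter by IP: -F treats the dots literally, -w matches whole words only,
# so a search for 192.168.1.10 would not also match 192.168.1.101.
by_ip=$(grep -wF '192.168.1.102' "$log")
echo "$by_ip"

# Filter by website, ignoring case.
by_site=$(grep -i 'facebook\.com' "$log")
echo "$by_site"

# Count matching lines instead of printing them.
user1_hits=$(grep -c ' user1 ' "$log")
echo "$user1_hits"   # prints 2

# Live variant: swap the file argument for a pipe from tail -f, e.g.
# tail -f /var/log/dansguardian-av/access.log | grep -wF '192.168.1.102'

rm -f "$log"
```

If you chain a second grep after the first on a live follow, GNU grep's --line-buffered option on the first grep keeps matches from being held back in a buffer.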
For example, if you wanted to monitor a user named 'user1', you would issue the following:
tail -f /var/log/dansguardian-av/access.log | grep user1
You may get results like this:
2013.10.3 7:59:33 user1 192.168.1.101 http://ad-emea.doubleclick.net/activity;src=4228629;met=1;v=1;pid=103167436;aid=276091750;ko=0;cid=55817230;rid=55706519;rv=2;&timestamp=1380779973853;eid1=2;ecn1=0;etm1=30; GET 42 0 2 200 image/gif sales -
2013.10.3 7:59:34 user1 192.168.1.101 http://ad-emea.doubleclick.net/activity;src=4228629;met=1;v=1;pid=103167436;aid=276090300;ko=0;cid=55817226;rid=55706515;rv=2;&timestamp=1380779974910;eid1=2;ecn1=0;etm1=30; GET 42 0 2 200 image/gif sales -
2013.10.3 7:59:37 user1 192.168.1.101 http://www.google.com/favicon.ico GET 982 0 2 200 image/x-icon sales -
2013.10.3 7:59:38 user1 192.168.1.101 http://www.bing.com/s/wlflag.ico GET 894 0 2 200 image/x-icon sales -
2013.10.3 7:59:39 user1 192.168.1.101 http://www.facebook.com/favicon.ico GET 1150 0 2 200 image/x-icon sales -
There are several key data points here.