Whitelisting http/https traffic on Linux
Given a network interface, I want to restrict any http(s) traffic going through it to a limited list of domains. An additional constraint is that I do not want any specialised client configuration. This is possible with a mitm-capable http proxy server (Squid), a DNS server (Unbound) and some firewall rules (iptables). The iptables part is what makes this post specific to Linux.
My motivation for doing this came from running tcpdump and seeing how much crap my Android phone is spewing out over the internet. It would be nice to have a WiFi access point that transparently proxies only the connections I whitelist beforehand.
Setup
For testing we’ll just use a bridge with network namespaces. When we are ready to deploy we can change the interface to a real wlan or eth device.
brctl addbr br-squid
ifconfig br-squid 192.168.50.1
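On systems where brctl and ifconfig are not installed, the same bridge can most likely be created with iproute2. This is only a sketch: the /24 prefix is my assumption (it is what ifconfig picks by default for this address), and run, setup_bridge and DRY_RUN are illustrative names. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: create the test bridge with iproute2 instead of brctl/ifconfig.
# DRY_RUN, run() and setup_bridge() are illustrative helpers, not part of any tool.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"          # only show what would be executed
    else
        "$@"               # really execute it (needs root)
    fi
}

setup_bridge() {
    # $1 = bridge name, $2 = address with prefix length
    run ip link add name "$1" type bridge
    run ip addr add "$2" dev "$1"
    run ip link set dev "$1" up
}

setup_bridge br-squid 192.168.50.1/24
```

Run it with DRY_RUN=0 as root to actually create the bridge.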
Don’t forget to enable ip forwarding or none of this will work.
echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward
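Since a forgotten forwarding flag makes every later test fail in confusing ways, a quick check can help. The helper name ip_forward_enabled is hypothetical, and its optional argument exists only so the check can be pointed at a different file.

```shell
#!/bin/sh
# Sketch: verify that IPv4 forwarding is on before testing further.
# ip_forward_enabled is a hypothetical helper name.
ip_forward_enabled() {
    # succeeds only if the file contains exactly "1"
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}

if ip_forward_enabled; then
    echo "ip forwarding is enabled"
else
    echo "ip forwarding is disabled"
fi
```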
firejail --private --net=br-squid -- curl http://google.com -o /dev/null
curl: (6) Couldn't resolve host 'google.com'
We’ll ignore DNS for now by allowing all DNS requests and forwarding them to Google DNS with iptables. Later we will use a local DNS server to do whitelisting.
iptables -t nat -A POSTROUTING -s 192.168.50.0/24 -p udp -d 8.8.8.8 \
--dport 53 -o eno1 -j MASQUERADE
iptables -I FORWARD -i eno1 -o br-squid -m state \
--state RELATED,ESTABLISHED -j ACCEPT
firejail --private --net=br-squid --dns=8.8.8.8 -- \
curl --connect-timeout 10 http://google.com -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:10 --:--:-- 0
curl: (28) Connection timed out after 10000 milliseconds
Squid http intercept
Squid supports transparent proxying under the name of “intercept”.
Here is a minimal squid.conf file:
acl http_whitelist dstdomain .google.com
acl http_whitelist dstdomain .google.co.uk
http_access allow http_whitelist
acl client_src src 192.168.50.0/24
http_access allow client_src
http_access deny all
http_port 192.168.50.1:3128 intercept
The intercept keyword on http_port is important to enable Squid to handle http requests redirected by iptables from a client oblivious to the presence of the proxy. In this configuration we allow 2 Google related domains and block everything else. Note that it is also important to whitelist the source address that our requests are originating from (the address of our network bridge).
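The leading dot in the dstdomain entries above means "this domain and all of its subdomains". Here is a rough shell model of that rule, only to illustrate which hostnames the whitelist covers. matches_dstdomain is a hypothetical helper, not something Squid provides, and it ignores details such as case-insensitivity.

```shell
#!/bin/sh
# Rough model of Squid's dstdomain matching for a single entry:
# ".google.com" covers "google.com" and every subdomain of it,
# while an entry without the leading dot matches only the exact name.
# matches_dstdomain is a hypothetical helper for illustration only.
matches_dstdomain() {
    host="$1"
    entry="$2"
    case "$entry" in
        .*)
            bare="${entry#.}"
            case "$host" in
                "$bare" | *".$bare") return 0 ;;
            esac
            ;;
        *)
            [ "$host" = "$entry" ] && return 0
            ;;
    esac
    return 1
}

for h in google.com www.google.com notgoogle.com bing.com; do
    if matches_dstdomain "$h" .google.com; then
        echo "$h allowed"
    else
        echo "$h denied"
    fi
done
```

Only the first two hostnames in the loop are reported as allowed.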
Firewall rule for doing the redirect looks like this
iptables -t nat -A PREROUTING -i br-squid -p tcp --dport 80 -j DNAT --to \
192.168.50.1:3128
Start squid like this
squid -f ./squid.conf -N -d1
Where -f is the path to the local config file, -N is no daemonize and -d1 is setting the debug level.
firejail --private --net=br-squid --dns=8.8.8.8 -- \
curl --connect-timeout 10 http://google.com -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 271 100 271 0 0 271 0 0:00:01 --:--:-- 0:00:01
Squid https intercept
SSL is more tricky because the hostname the client is trying to connect to is not readily available like in plain http. Squid needs to either parse the hostname from the SNI extension sent by the client, or start an SSL session with the server and parse the hostname from the returned certificate. Squid calls this hostname identification step peek. After that, if the hostname matches our whitelist we want to splice the connection, as Squid calls it, which means blindly forwarding encrypted packets as if the proxy were not there. For all other hostnames we want to terminate the connection. Squid also has a mitm mode called bump, but that’s outside the scope of this post.
Here are additional ssl specific lines to add to our squid.conf file. Note that we need a special port for ssl intercept (https_port) and the acl keywords are different (ssl::server_name).
https_port 192.168.50.1:3129 intercept ssl-bump cert=./myCA.pem
acl https_whitelist ssl::server_name .google.com
acl https_whitelist ssl::server_name .google.co.uk
ssl_bump splice https_whitelist
ssl_bump peek all
ssl_bump terminate all
Note also that the order of the commands splice, peek and terminate is important because intercept is a multi-stage process, and at each stage Squid will match the first appropriate command and ignore the rest. In our case we want to splice any connection that Squid can determine is allowed by our whitelist acl. If Squid cannot make this decision yet then the next command to try is peek, to gather the necessary information. If peeking is done and the connection is not spliced because of the acl then the only option left is terminate.
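To make the stage-by-stage evaluation easier to follow, here is a toy model of the three ssl_bump lines in shell. It is a simplification under my own assumptions, not Squid's real logic: the server name is treated as unknown at the first stage and known from the SNI from the second stage on, and peek is treated as unavailable at the last stage. ssl_bump_trace and in_whitelist are hypothetical helper names.

```shell
#!/bin/sh
# Toy model of how the three ssl_bump lines are evaluated stage by stage.
# Assumptions (simplified): the server name is unknown at step 1, known
# from the SNI at step 2, and peek is no longer possible at step 3.
WHITELIST=".google.com .google.co.uk"

in_whitelist() {
    for entry in $WHITELIST; do
        bare="${entry#.}"
        case "$1" in
            "$bare" | *".$bare") return 0 ;;
        esac
    done
    return 1
}

# Prints the action chosen at each step for the SNI given in $1.
ssl_bump_trace() {
    sni="$1"
    for step in 1 2 3; do
        name=""
        if [ "$step" -ge 2 ]; then
            name="$sni"
        fi
        if [ -n "$name" ] && in_whitelist "$name"; then
            echo "step$step: splice"       # ssl_bump splice https_whitelist
            return 0
        elif [ "$step" -lt 3 ]; then
            echo "step$step: peek"         # ssl_bump peek all
        else
            echo "step$step: terminate"    # ssl_bump terminate all
            return 0
        fi
    done
}

ssl_bump_trace www.google.com
ssl_bump_trace bing.com
```

In this model a whitelisted name is spliced as soon as the name is known, while any other name is peeked at until no peeking is left and is then terminated.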
Even though we are not doing mitm, Squid won’t start ssl intercept without a self-signed certificate. This is how we can make one:
openssl req -new -newkey rsa:2048 -sha256 -days 365 -nodes -x509 \
-extensions v3_ca -keyout myCA.pem -out myCA.pem
Redirect port 443 traffic to Squid like this
iptables -t nat -A PREROUTING -i br-squid -p tcp --dport 443 -j DNAT --to \
192.168.50.1:3129
For testing we can see that https://google.com (whitelisted) works and https://bing.com (not whitelisted) fails.
firejail --private --net=br-squid --dns=8.8.8.8 -- \
curl --connect-timeout 10 https://google.com -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 222 100 222 0 0 905 0 --:--:-- --:--:-- --:--:-- 902
firejail --private --net=br-squid --dns=8.8.8.8 -- \
curl --connect-timeout 10 https://bing.com -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl: (35) Unknown SSL protocol error in connection to bing.com:443
You may encounter failures on whitelisted domains, with lines like the following in the Squid log:
SECURITY ALERT: Host header forgery detected on local=216.58.198.110:443 remote=192.168.50.41:40534 FD 13 flags=33 (local IP does not match any domain IP)
This happens because the client does a DNS lookup and Squid also does an independent DNS lookup. Different DNS servers may return different addresses for the same name. Even a single DNS service like 8.8.8.8 may return different addresses on different queries because of load balancing. We can fix this problem by pointing both the client and Squid at the same local caching DNS server, which is our next topic.
DNS
We can use the DNS server Unbound to do DNS based whitelisting and fix our SSL errors above.
A minimal Unbound config looks like this:
server:
interface: 192.168.50.1
access-control: 192.168.50.0/24 allow
Even though we bind to a non-localhost address, Unbound will reject requests from non-localhost clients by default.
debug: refused query from ip4 192.168.50.67 port 50214
So that’s why we need the access control line.
Remove the old DNS firewall rule with iptables -D and add a new one redirecting all DNS requests on our target interface to Unbound.
iptables -t nat -D POSTROUTING -s 192.168.50.0/24 -d 8.8.8.8/32 -o eno1 -p udp -m udp \
--dport 53 -j MASQUERADE
iptables -t nat -A PREROUTING -i br-squid -p udp --dport 53 -j DNAT --to \
192.168.50.1:53
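Because the plan is to swap the test bridge for a real wlan or eth device later, it may be convenient to collect the three PREROUTING redirects from this post in one function that takes the interface as a parameter. A sketch: redirect_rules, run and DRY_RUN are illustrative names, and --to-destination is the full spelling of the --to option used above. By default it only prints the rules.

```shell
#!/bin/sh
# Sketch: the three PREROUTING redirects from this post, parameterised by
# interface so they can be reused for a real wlan or eth device.
# DRY_RUN, run() and redirect_rules() are illustrative names.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"          # only show what would be executed
    else
        "$@"               # really execute it (needs root)
    fi
}

redirect_rules() {
    iface="$1"      # e.g. br-squid
    proxy_ip="$2"   # address Squid and Unbound listen on
    run iptables -t nat -A PREROUTING -i "$iface" -p tcp --dport 80 \
        -j DNAT --to-destination "$proxy_ip:3128"
    run iptables -t nat -A PREROUTING -i "$iface" -p tcp --dport 443 \
        -j DNAT --to-destination "$proxy_ip:3129"
    run iptables -t nat -A PREROUTING -i "$iface" -p udp --dport 53 \
        -j DNAT --to-destination "$proxy_ip:53"
}

redirect_rules br-squid 192.168.50.1
```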
Add a dns_nameservers line to squid.conf so that Squid also uses Unbound, which avoids the host header forgery errors discussed above:
dns_nameservers 192.168.50.1
Run unbound and check it works
unbound -d -c ./unbound.conf
firejail --private --net=br-squid -- \
curl --connect-timeout 10 https://bing.com -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl: (35) Unknown SSL protocol error in connection to bing.com:443
firejail --private --net=br-squid -- \
curl --connect-timeout 10 https://google.com -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 272 100 272 0 0 479 0 --:--:-- --:--:-- --:--:-- 479
By default Unbound serves all DNS requests because that’s what most people want in a DNS server. To make Unbound behave like a whitelist, we refuse lookups for all domains and then set our whitelisted ones to transparent. The order of the lines does not matter; Unbound will match the most specific line (so "google.com" transparent will win over "." refuse).
local-zone: "." refuse
local-zone: "google.com" transparent
local-zone: "google.co.uk" transparent
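The "most specific zone wins" behaviour can be modelled in a few lines of shell. This is a toy illustration of the rule as described above, not how Unbound is implemented. ZONES mirrors the three local-zone lines, and zone_action is a hypothetical helper name.

```shell
#!/bin/sh
# Toy model of the "most specific local-zone wins" rule.
# ZONES mirrors the three local-zone lines; zone_action is a
# hypothetical helper for illustration, not an Unbound command.
ZONES='. refuse
google.com transparent
google.co.uk transparent'

zone_action() {
    qname="${1%.}"                 # drop a trailing dot, if any
    best_len=-1
    best_action=""
    while read -r zone action; do
        if [ "$zone" = "." ]; then
            len=0                  # the root zone matches every name
        else
            case "$qname" in
                "$zone" | *".$zone") len=${#zone} ;;
                *) continue ;;     # this zone does not cover the name
            esac
        fi
        # keep the longest (most specific) matching zone seen so far
        if [ "$len" -gt "$best_len" ]; then
            best_len=$len
            best_action=$action
        fi
    done <<EOF
$ZONES
EOF
    echo "$best_action"
}

zone_action www.google.com   # prints: transparent
zone_action bing.com         # prints: refuse
```

Reordering the lines in ZONES does not change the result, which matches the point that line order does not matter.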
When we test we can see the DNS lookup for our non-whitelisted domain bing.com is instantly rejected.
firejail --private --net=br-squid -- \
curl --connect-timeout 10 https://bing.com -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl: (6) Could not resolve host: bing.com
firejail --private --net=br-squid -- \
curl --connect-timeout 10 https://google.com -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 272 100 272 0 0 1066 0 --:--:-- --:--:-- --:--:-- 1062
Maintaining a different whitelist for Squid and Unbound may seem excessive and cumbersome. The value of Squid whitelisting is to ensure clients cannot connect directly to IP addresses and avoid the Unbound whitelist. The value of the Unbound whitelist is to ensure clients cannot tunnel their traffic through unrestricted DNS lookups and avoid the Squid whitelist.
If your threat model does not merit having 2 whitelists, keep the Squid one: DNS tunnelling is less likely than clients attempting to connect to hardcoded IP addresses, so the Squid whitelist is more valuable.
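If you do keep both, one way to make it less cumbersome is to generate the Squid and Unbound lines from a single list of domains. A minimal sketch, assuming one bare domain per line on stdin. gen_squid_acls and gen_unbound_zones are hypothetical helper names, and the acl names are the ones used earlier in this post.

```shell
#!/bin/sh
# Sketch: derive both whitelists from one list of domains.
# gen_squid_acls and gen_unbound_zones are hypothetical helper names.
# Input: one bare domain per line on stdin; blank lines and lines
# starting with # are skipped.
gen_squid_acls() {
    while read -r domain; do
        case "$domain" in ''|'#'*) continue ;; esac
        echo "acl http_whitelist dstdomain .$domain"
        echo "acl https_whitelist ssl::server_name .$domain"
    done
}

gen_unbound_zones() {
    echo 'local-zone: "." refuse'
    while read -r domain; do
        case "$domain" in ''|'#'*) continue ;; esac
        echo "local-zone: \"$domain\" transparent"
    done
}

DOMAINS='google.com
google.co.uk'

printf '%s\n' "$DOMAINS" | gen_squid_acls
printf '%s\n' "$DOMAINS" | gen_unbound_zones
```

The output can be pasted into squid.conf and unbound.conf in place of the hand-written lines, so both lists always stay in sync.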