Given a network interface, I want to restrict any http(s) traffic going through it to a limited list of domains. An additional constraint is that I do not want any specialised client configuration. This is possible with a mitm capable http proxy server (Squid), a DNS server (Unbound) and some firewall rules (iptables). The iptables part is what makes this post specific to Linux.

My motivation for doing this came from running tcpdump and seeing how much crap my Android phone is spewing out over the internet. It would be nice to have a WiFi access point that will transparently proxy only connections I whitelist beforehand.

Setup

For testing we’ll just use a bridge with network namespaces. When we are ready to deploy we can change the interface to a real wlan or eth device.

brctl addbr br-squid
ifconfig br-squid 192.168.50.1

Don’t forget to enable ip forwarding or none of this will work.

echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward

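A first test from inside a new network namespace fails because the client cannot resolve names yet:

firejail --private --net=br-squid -- curl http://google.com -o /dev/null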
curl: (6) Couldn't resolve host 'google.com'

We’ll sidestep DNS for now by allowing all DNS requests and forwarding them to Google DNS with iptables. Later we will use a local DNS server to do whitelisting.

iptables -t nat -A POSTROUTING -s 192.168.50.0/24 -p udp -d 8.8.8.8 \
  --dport 53 -o eno1 -j MASQUERADE
iptables -I FORWARD -i eno1 -o br-squid -m state \
  --state RELATED,ESTABLISHED -j ACCEPT

firejail --private --net=br-squid --dns=8.8.8.8 -- \
  curl --connect-timeout 10 http://google.com -o /dev/null
% Total    % Received % Xferd  Average Speed   Time    Time     Time Current
                               Dload  Upload   Total   Spent    Left Speed
0     0    0     0    0     0      0      0 --:--:--  0:00:10 --:--:--     0
curl: (28) Connection timed out after 10000 milliseconds

Squid http intercept

Squid supports transparent proxying under the name of “intercept”.

Here is a minimal squid.conf file:

acl http_whitelist dstdomain .google.com
acl http_whitelist dstdomain .google.co.uk

http_access allow http_whitelist
acl client_src src 192.168.50.0/24
http_access allow client_src
http_access deny all

http_port 192.168.50.1:3128 intercept

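The intercept keyword on http_port is what lets Squid handle http requests that iptables redirects from a client oblivious to the presence of the proxy. In this configuration we whitelist 2 Google related domains. Note that it is also important to allow the source network that our requests originate from (the subnet of our network bridge). One caveat: Squid stops at the first http_access line that matches, and the standalone allow client_src line matches every request from that subnet, so as written the dstdomain whitelist is not what rejects plain http requests from our clients. Putting both ACLs on one line (http_access allow client_src http_whitelist) requires both to match, but check how that interacts with the https intercept below, where Squid initially sees only a destination IP address.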
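To make the leading dot in the dstdomain entries concrete, here is a small Python sketch of the matching rule as I understand it: .google.com covers google.com itself and every subdomain, while an entry without the dot covers only that exact host. The names dstdomain_matches and allowed are made up for illustration, and this is a model of the rule, not Squid's actual code.

```python
def dstdomain_matches(host: str, entry: str) -> bool:
    """Approximate Squid's dstdomain rule for a single ACL entry."""
    host = host.lower().rstrip(".")
    entry = entry.lower()
    if entry.startswith("."):
        base = entry[1:]
        # A leading dot covers the domain itself and every subdomain.
        return host == base or host.endswith(entry)
    # Without the dot only the exact host name matches.
    return host == entry


# Same entries as the http_whitelist acl in squid.conf.
WHITELIST = [".google.com", ".google.co.uk"]


def allowed(host: str) -> bool:
    return any(dstdomain_matches(host, entry) for entry in WHITELIST)


print(allowed("www.google.com"))  # True
print(allowed("notgoogle.com"))   # False
```

The endswith check includes the dot, which is why notgoogle.com does not slip through as a fake suffix match.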

The firewall rule that does the redirect looks like this

iptables -t nat -A PREROUTING -i br-squid -p tcp --dport 80 -j DNAT --to \
  192.168.50.1:3128

Start squid like this

squid -f ./squid.conf -N -d1

Here -f gives the path to the local config file, -N keeps Squid in the foreground (no daemonize) and -d1 sets the debug level.

firejail --private --net=br-squid --dns=8.8.8.8 -- \
   curl --connect-timeout 10 http://google.com -o /dev/null
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   271  100   271    0     0    271      0  0:00:01 --:--:--  0:00:01

Squid https intercept

SSL is more tricky because the hostname the client is trying to connect to is not readily available like in plain http. Squid needs to either parse the hostname from the SNI extension sent by the client, or start an SSL session with the server and parse the hostname from the returned certificate. Squid calls this hostname identification step peek. After that, if the hostname matches our whitelist we want to do what Squid calls splice the connection, which is to blindly forward encrypted packets as if the proxy was not there. For all other hostnames we want to terminate the connection. Squid also has a mitm mode called bump but that’s outside the scope of this post.
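To show that the hostname really does travel in the clear at the start of the handshake, here is a Python sketch that builds a genuine ClientHello in memory (no network involved) and pulls the SNI hostname out of it. The helper names client_hello_for and extract_sni are my own, and the parser assumes the whole ClientHello fits in a single TLS record. It illustrates the kind of information peek works with, not how Squid implements it.

```python
import ssl
import struct
from typing import Optional


def client_hello_for(hostname: str) -> bytes:
    """Produce a real ClientHello in memory, without touching the network."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the handshake stalls waiting for a ServerHello
    return outgoing.read()


def extract_sni(data: bytes) -> Optional[str]:
    """Return the SNI hostname from a TLS ClientHello record, or None."""
    try:
        if data[0] != 0x16 or data[5] != 0x01:  # handshake record, ClientHello
            return None
        pos = 5 + 4    # record header (5 bytes) + handshake header (4 bytes)
        pos += 2 + 32  # client version + random
        pos += 1 + data[pos]                               # session id
        pos += 2 + struct.unpack_from("!H", data, pos)[0]  # cipher suites
        pos += 1 + data[pos]                               # compression methods
        end = pos + 2 + struct.unpack_from("!H", data, pos)[0]
        pos += 2
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", data, pos)
            pos += 4
            if ext_type == 0:  # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                name_len = struct.unpack_from("!H", data, pos + 3)[0]
                return data[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, ValueError):
        pass  # truncated or malformed input
    return None


print(extract_sni(client_hello_for("google.com")))  # google.com
```

Anything that is not a ClientHello, or a ClientHello without the extension, yields None. That is the case where a proxy has to fall back to the server certificate.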

Here are additional ssl specific lines to add to our squid.conf file. Note that we need a special port for ssl intercept (https_port) and the acl keywords are different (ssl::server_name).

https_port 192.168.50.1:3129 intercept ssl-bump cert=./myCA.pem

acl https_whitelist ssl::server_name .google.com
acl https_whitelist ssl::server_name .google.co.uk

ssl_bump splice https_whitelist
ssl_bump peek all
ssl_bump terminate all

Note also that the order of the splice, peek and terminate lines is important: intercept is a multi stage process, and at each stage Squid applies the first line that matches and ignores the rest. In our case we want to splice any connection that Squid can determine is allowed by our whitelist acl. If Squid cannot make this decision yet, the next line to try is peek, which gathers the necessary information. Once peeking is done and the connection has still not been spliced because of the acl, the only option left is terminate.
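The stage by stage evaluation is easier to follow as a toy model. The Python sketch below is a simplification under my own assumptions, not Squid's real logic: there are three stages, the name known at stage 1 is nothing (only an IP address), at stage 2 it is the SNI and at stage 3 the certificate name, and peek is no longer available at stage 3. The names RULES, whitelisted and decide are invented for the example.

```python
from typing import List, Optional, Tuple

WHITELIST = (".google.com", ".google.co.uk")


def whitelisted(name: Optional[str]) -> bool:
    """True if the server name (when known) falls under a whitelist entry."""
    if name is None:
        return False
    return any(name == w[1:] or name.endswith(w) for w in WHITELIST)


# Same order as the ssl_bump lines in squid.conf.
RULES = [
    ("splice", whitelisted),
    ("peek", lambda name: True),
    ("terminate", lambda name: True),
]


def decide(names_by_step: List[Optional[str]]) -> Tuple[str, int]:
    """Walk the stages; names_by_step[i] is the name known at stage i + 1."""
    for step, name in enumerate(names_by_step, start=1):
        for action, acl in RULES:
            if action == "peek" and step == 3:
                continue  # nothing left to peek at in the last stage
            if not acl(name):
                continue
            if action == "peek":
                break  # learn more, then re-run the rules at the next stage
            return action, step
    return "terminate", len(names_by_step)


print(decide([None, "www.google.com", "www.google.com"]))  # ('splice', 2)
print(decide([None, "bing.com", "bing.com"]))              # ('terminate', 3)
```

If the splice line came after peek all, the model would keep peeking until the last stage before it ever considered the whitelist.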

Even though we are not doing mitm, Squid won’t start ssl intercept without a certificate. This is how we can make a self signed one.

openssl req -new -newkey rsa:2048 -sha256 -days 365 -nodes -x509 \
      -extensions v3_ca -keyout myCA.pem  -out myCA.pem

Redirect port 443 traffic to Squid like this

iptables -t nat -A PREROUTING -i br-squid -p tcp --dport 443 -j DNAT --to \
  192.168.50.1:3129

For testing we can see that https://google.com (whitelisted) works and https://bing.com (not whitelisted) fails.

firejail --private --net=br-squid --dns=8.8.8.8 -- \
  curl --connect-timeout 10 https://google.com -o /dev/null
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                               Dload  Upload   Total   Spent    Left  Speed
100   222  100   222    0     0    905      0 --:--:-- --:--:-- --:--:--   902

firejail --private --net=br-squid --dns=8.8.8.8 -- \
  curl --connect-timeout 10 https://bing.com -o /dev/null
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (35) Unknown SSL protocol error in connection to bing.com:443

You may encounter failures on whitelisted domains, with lines like the following in the Squid log

SECURITY ALERT: Host header forgery detected on local=216.58.198.110:443 remote=192.168.50.41:40534 FD 13 flags=33 (local IP does not match any domain IP)

This happens because the client does a DNS lookup and Squid does its own independent DNS lookup. Different DNS servers may return different addresses for the same name. Even a single DNS service like 8.8.8.8 may return different addresses on different queries simply because of load balancing. We can fix this problem by pointing both the client and Squid at the same local caching DNS server, which is our next topic.
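The check behind that log line can be pictured with a few lines of Python. This is only a sketch of the idea: the function name forgery_suspected is mine, the client address is the one from the log line above, and the addresses in proxy_answers are placeholders from a documentation range, not real lookup results.

```python
from typing import Iterable


def forgery_suspected(dest_ip: str, proxy_lookup: Iterable[str]) -> bool:
    """The client's destination must be among the proxy's own DNS answers."""
    return dest_ip not in set(proxy_lookup)


client_dest = "216.58.198.110"                   # from the log line above
proxy_answers = ["203.0.113.10", "203.0.113.11"]  # hypothetical answer set

# Client and proxy asked different resolvers and got disjoint answers.
print(forgery_suspected(client_dest, proxy_answers))                  # True
# Both use the same cache, so the proxy also sees the client's address.
print(forgery_suspected(client_dest, proxy_answers + [client_dest]))  # False
```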

DNS

We can use the DNS server Unbound to do DNS based whitelisting and fix our SSL errors above.

A minimal Unbound config looks like this:

server:
  interface: 192.168.50.1
  access-control: 192.168.50.0/24 allow

Even though we bind to a non-localhost address, Unbound will reject requests from non-localhost clients by default.

debug: refused query from ip4 192.168.50.67 port 50214

So that’s why we need the access control line.

Remove the old DNS firewall rule with iptables -D and add a new one redirecting all DNS requests on our target interface to Unbound.

iptables -t nat -D POSTROUTING -s 192.168.50.0/24 -d 8.8.8.8/32 -o eno1 -p udp -m udp \
  --dport 53 -j MASQUERADE
iptables -t nat -A PREROUTING -i br-squid -p udp --dport 53 -j DNAT --to \
  192.168.50.1:53

Set the dns_nameservers line in squid.conf to point at Unbound, to avoid the host header forgery errors discussed above

dns_nameservers 192.168.50.1

Run unbound and check it works

unbound -d -c ./unbound.conf

firejail --private --net=br-squid -- \
  curl --connect-timeout 10 https://bing.com -o /dev/null
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                               Dload  Upload   Total   Spent    Left  Speed
0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (35) Unknown SSL protocol error in connection to bing.com:443

firejail --private --net=br-squid -- \
  curl --connect-timeout 10 https://google.com -o /dev/null
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   272  100   272    0     0    479      0 --:--:-- --:--:-- --:--:--   479

By default Unbound serves all DNS requests because that’s what most people want in a DNS server. To make Unbound behave like a whitelist, we refuse lookups for all domains and then set our whitelisted ones to transparent. These lines go in the server: section of the config. Their order does not matter: Unbound will match the most specific zone (so "google.com" transparent will win over "." refuse).

local-zone: "." refuse
local-zone: "google.com" transparent
local-zone: "google.co.uk" transparent
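The most specific match rule can be sketched in Python as a longest suffix match on whole labels. This is a model of the behaviour described above rather than Unbound's code, and the names ZONES, labels and zone_action are invented for the example.

```python
from typing import Dict, List

# Same zones as the local-zone lines in unbound.conf.
ZONES: Dict[str, str] = {
    ".": "refuse",
    "google.com": "transparent",
    "google.co.uk": "transparent",
}


def labels(name: str) -> List[str]:
    """Split a DNS name into labels; the root zone "." has none."""
    name = name.lower().strip(".")
    return name.split(".") if name else []


def zone_action(qname: str, zones: Dict[str, str] = ZONES) -> str:
    """Return the action of the most specific zone enclosing qname."""
    q = labels(qname)
    best_len, best_action = -1, "transparent"  # no zone: resolve normally
    for zone, action in zones.items():
        z = labels(zone)
        # A zone encloses the query if its labels are a suffix of the query's.
        if len(z) <= len(q) and q[len(q) - len(z):] == z and len(z) > best_len:
            best_len, best_action = len(z), action
    return best_action


print(zone_action("www.google.com"))  # transparent
print(zone_action("bing.com"))        # refuse
```

Because the winner is chosen by label count and not by position, listing the zones in a different order gives the same answers.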

When we test we can see the DNS lookup for our non-whitelisted domain bing.com is instantly rejected.

firejail --private --net=br-squid -- \
  curl --connect-timeout 10 https://bing.com -o /dev/null
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (6) Could not resolve host: bing.com

firejail --private --net=br-squid -- \
  curl --connect-timeout 10 https://google.com -o /dev/null
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   272  100   272    0     0   1066      0 --:--:-- --:--:-- --:--:--  1062

Maintaining separate whitelists for Squid and Unbound may seem excessive and cumbersome. The value of the Squid whitelist is that clients cannot bypass the Unbound whitelist by connecting directly to IP addresses. The value of the Unbound whitelist is that clients cannot bypass the Squid whitelist by tunnelling their traffic through unrestricted DNS lookups.

If your threat model does not merit having 2 whitelists, keep the Squid one: DNS tunnelling is less likely than clients attempting to connect to hardcoded IP addresses, so the Squid whitelist is more valuable.
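If you do keep both, one way to stop them drifting apart is to generate the lines for both configs from a single list of domains. The Python sketch below is one possible approach. The function names squid_lines and unbound_lines are made up, and how you get the output into squid.conf and unbound.conf is up to you.

```python
from typing import Iterable, List


def squid_lines(domains: Iterable[str]) -> List[str]:
    """acl lines for the http and https whitelists in squid.conf."""
    lines = []
    for d in domains:
        lines.append(f"acl http_whitelist dstdomain .{d}")
        lines.append(f"acl https_whitelist ssl::server_name .{d}")
    return lines


def unbound_lines(domains: Iterable[str]) -> List[str]:
    """local-zone lines for the server: section of unbound.conf."""
    lines = ['local-zone: "." refuse']
    lines += [f'local-zone: "{d}" transparent' for d in domains]
    return lines


DOMAINS = ["google.com", "google.co.uk"]

print("\n".join(squid_lines(DOMAINS)))
print("\n".join(unbound_lines(DOMAINS)))
```

For the two domains used in this post, the output reproduces the acl and local-zone lines shown earlier.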