Squid (web browsing) circuit
Web browsing involves the following subsystems:
- The squid proxy cache:
Used for caching web pages. It can also be used for logging and
restricting access to certain pages.
- Caching web pages: When a user requests a web page for
the first time, squid fetches it from the original web server and
caches it on disk. If the page is requested again, squid supplies it
from its cache, thus saving a slow request to an external web
site. N.B. Even for accesses served out of its cache, squid still
sends a query to the web server to verify the last-modified date of
the page. Such a date verification query is significantly
faster than a full query. Thus, no modifications are missed, while
the advantage of having a cache is kept.
- Logging: Page URLs, IP addresses of requesting machines,
and user names are logged. Only the resource part of the URL (up to
the question mark) is logged; query parameters (if any) are not
logged, for privacy reasons.
- Access control: It is possible, using ltnb10's webmin, to
restrict access for certain users or to certain pages. For
technical reasons, URLs which contain raw IP addresses (for
example http://22.214.171.124/ rather than
http://www.pt.lu) are restricted if the address cannot be
resolved back to a hostname. This is done to prevent users from
connecting to blocked web sites by manually supplying the IP
address rather than the name. Legitimate sites which use such
addresses are extremely rare; when a legitimate site is affected,
it is usually due to a misconfiguration at the remote site,
which is usually resolved within days.
- Local browser: Squid serves web pages to the local
browsers; when a user logs in from a Windows workstation in the
classrooms, the user is asked for a proxy password for authentication.
- Samba: As squid has no direct access to the password
database, it cannot itself verify the username and
password. Instead it connects to samba, and attempts to mount the
proxyauth share from ltnb0. If this succeeds, it means
that the password is correct, and the user is granted access. This
action is performed by the /usr/bin/smb_auth and
- Identd: If the user connects from a Unix workstation
(athos/aramis/portos/torr), password authentication is not
necessary, as Unix has a service (identd) to query for the owner of
a network connection.
- DNS: Squid uses DNS to resolve the external web servers'
names to IP addresses (and vice versa, if the user supplied an IP
address).
- Remote Web server: Squid communicates with the remote
web server to fetch its pages, and to verify the dates of already
cached pages.
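Three of the squid behaviours described above (date revalidation of cached pages, logging URLs without their query parameters, and restricting raw-IP URLs that do not resolve back to a hostname) can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not squid's code or configuration: all function names, the dict-based cache, and the injected callables are hypothetical and chosen only for illustration.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def serve(url, cache, fetch_full, fetch_last_modified):
    """Toy model of caching with date revalidation.

    cache maps url -> (body, last_modified). fetch_full(url) returns
    (body, last_modified); fetch_last_modified(url) is the cheap date query.
    Both callables are hypothetical stand-ins for requests to the web server.
    """
    entry = cache.get(url)
    if entry is not None:
        body, cached_date = entry
        if fetch_last_modified(url) == cached_date:
            return body  # unchanged: served from cache after the date check
    body, last_modified = fetch_full(url)  # first request, or page changed
    cache[url] = (body, last_modified)
    return body


def url_for_log(url):
    """Keep only the part of the URL up to the question mark."""
    return url.split("?", 1)[0]


def is_raw_ip_url(url):
    """True if the host part of the URL is a literal IP address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def reverse_resolves(ip):
    """True if the IP address resolves back to a hostname (uses real DNS)."""
    try:
        socket.gethostbyaddr(ip)
    except OSError:
        return False
    return True


def raw_ip_rule_allows(url, resolves=reverse_resolves):
    """Named hosts pass this rule; raw IPs pass only if they resolve back."""
    if not is_raw_ip_url(url):
        return True
    return resolves(urlsplit(url).hostname)
```

The resolver is passed in as a parameter so that the rule can be tried out without touching the real DNS, for example with `raw_ip_rule_allows(url, resolves=lambda ip: False)`.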
- The Apache webserver
This contains the school's own web page and the pages of the
students. It also contains the browser's "automatic" proxy
configuration, which is stored in
/home/admin/public_html/proxy.pac. In addition, it serves empty
images and empty JavaScript files for certain blocked ad servers (such as
ad.doubleclick.net). This block works by having the DNS
answer with www.ltnb.lu's address rather than the advertisement server's
real address. Configuration of this feature is in
/etc/httpd/spamcontrol.conf (on ltnb0) and
/etc/named.conf (on ltnb10).
These advertisements are blocked for two reasons:
- They attempt to track users via cookies, which may be contrary
to our data protection legislation. Many of these services actively
try to circumvent protection built into the browser such as
Only accept cookies originating from the same server the page
- Their sites are very often overloaded, and slow down the display of
pages, especially since, in the name of tracking, they often
disable caching of their ads.
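The ad server block described above (the DNS answering with www.ltnb.lu's address instead of the advertisement server's real one) can be pictured with a short Python sketch. This only models the name substitution; the real setup lives in /etc/named.conf and /etc/httpd/spamcontrol.conf, whose contents are not shown here. The set BLOCKED_AD_HOSTS and the function effective_host are hypothetical names, and exact-name matching is an assumption.

```python
# Hypothetical list of blocked ad servers; the text names only this example.
BLOCKED_AD_HOSTS = {"ad.doubleclick.net"}

# Host whose address is handed out instead of the ad server's real address.
LOCAL_WEB_SERVER = "www.ltnb.lu"


def effective_host(hostname):
    """Return the host whose address the DNS would answer with."""
    name = hostname.lower().rstrip(".")
    if name in BLOCKED_AD_HOSTS:
        return LOCAL_WEB_SERVER  # apache then serves an empty image or script
    return name
```

A request for an ad thus ends up at the local apache, which answers immediately with empty content instead of waiting for an overloaded remote server.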
- The firewall
Filters any direct connection from browsers to outside
servers, in order to prevent users from changing their browser
configuration to go directly to the remote site rather than through squid.
- The local browser
Connects to apache (local pages, proxy config, ad redirects) and
squid (remote pages).
- Samba, identd
Used for user authentication.
- DNS
Supplies IP-hostname mappings of remote sites to squid.
Needed Cisco ports
The following ports need to be open in the Cisco for outside access:
- TCP port 80 from ltnb10 to outside
- TCP port 443 from ltnb10 to outside
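Taken together with the firewall behaviour described earlier, the effect of these two rules on web traffic can be summarised in a small Python predicate. This is a sketch under assumptions, not the Cisco configuration: the names PROXY_HOST, ALLOWED_TCP_PORTS and outbound_web_allowed are hypothetical, and the sketch considers only this circuit (other circuits may need further ports).

```python
PROXY_HOST = "ltnb10"          # the only machine allowed to reach outside web servers
ALLOWED_TCP_PORTS = {80, 443}  # the two ports listed above


def outbound_web_allowed(source_host, protocol, dest_port):
    """True if the connection matches one of the two rules listed above."""
    return (
        source_host == PROXY_HOST
        and protocol == "tcp"
        and dest_port in ALLOWED_TCP_PORTS
    )
```

A browser on a workstation that tries to bypass the proxy fails this check, because its connection does not originate from ltnb10.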