HTTP referrer

The referrer, or HTTP referrer (also known by the common misspelling referer, which is the spelling used for the HTTP header field), identifies, from the point of view of a webpage or resource, the address (commonly the URL, the more generic URI, or the internationalized IRI) of the webpage or resource that links to it. By checking the referrer, the new page can see where the request came from. Referrer logging allows websites and web servers to identify where visitors are coming from, for promotional or security purposes. Checking the referrer is a popular defense against cross-site request forgery, but such security mechanisms do not work when the referrer is disabled. Referrer data is also widely used for statistical purposes.

A dereferrer is a means of stripping the details of the referring website from a link request, so that the target website cannot identify the page on which the link was clicked.

Origin of the term referer

The misspelling referer originated in the original proposal by computer scientist Phillip Hallam-Baker to incorporate the field into the HTTP specification.[1] The misspelling was set in stone by the time of its incorporation into the standards document RFC 1945; document co-author Roy Fielding has remarked that neither “referrer” nor the misspelling “referer” was recognized by the standard Unix spell checker of the period. “Referer” has since become a widely used spelling in industry when discussing HTTP referrers. Usage of the misspelling is not universal, though: the correct spelling “referrer” is used in some web specifications, such as the Document Object Model.

Details

When visiting a webpage, the referrer or referring page is the URL of the previous webpage from which a link was followed. Server-side code can read it from the HTTP_REFERER variable: CGI programs receive it as an environment variable, and PHP exposes it as $_SERVER['HTTP_REFERER'].
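
As a minimal sketch of the CGI case, the following Python function reads HTTP_REFERER from a CGI-style environment. The function name get_referrer and the example URL are illustrative assumptions, not part of any standard API; the only convention relied on is that the web server places the header value in the HTTP_REFERER environment variable.

```python
import os


def get_referrer(environ=None):
    """Return the referring URL from a CGI-style environment, or None.

    `environ` defaults to the process environment, which is where a
    CGI-capable web server places HTTP_REFERER. The header is optional,
    so callers must be prepared for None.
    """
    env = os.environ if environ is None else environ
    value = env.get("HTTP_REFERER", "").strip()
    return value or None


if __name__ == "__main__":
    # Hypothetical environment, as a server might pass it to a CGI script.
    fake_env = {"HTTP_REFERER": "http://example.com/links.html"}
    print(get_referrer(fake_env))
```

Because the browser may omit the field, and can also send any value it likes, the result is best treated as a hint rather than as trusted input.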

More generally, a referrer is the URL of a previous item which led to this request. The referrer for an image, for example, is generally the HTML page on which it is to be displayed. The referrer field is an optional part of the HTTP request sent by the browser program to the web server.

Many web sites log referrers as part of their attempt to track their users. Most web log analysis software can process this information. As referrer information can violate privacy, some browsers allow the user to disable the sending of referrer information. Some proxy and firewall software will also filter out referrer information, to avoid leaking the location of non-public websites. This can in turn cause problems: some servers block parts of their site to browsers that don’t send the right referrer information, in an attempt to prevent deep linking or unauthorised use of images (bandwidth theft). Some proxy software has the ability to give the top-level address of the target site as the referrer, which usually prevents these problems while still not divulging the user’s last visited site.
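
The two behaviours described above can be sketched in Python: a server-side check that only serves content to requests whose referrer names an approved host, and a proxy-side helper that substitutes the top-level address of the target site. The host names and both function names are assumptions made for this illustration only.

```python
from urllib.parse import urlsplit

# Hypothetical set of hosts allowed to embed or link to protected content.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def referrer_allowed(referrer, allowed_hosts=ALLOWED_HOSTS):
    """Server side: accept a request only if its referrer names an approved host.

    A missing or empty referrer is rejected, which is exactly why browsers
    that suppress the header run into problems with such sites.
    """
    if not referrer:
        return False
    host = urlsplit(referrer).hostname  # lowercased by urlsplit; None if absent
    return host is not None and host in allowed_hosts


def site_root_referrer(target_url):
    """Proxy side: build a referrer that reveals only the target site's root."""
    parts = urlsplit(target_url)
    return f"{parts.scheme}://{parts.netloc}/"
```

A proxy that sends site_root_referrer(target_url) in place of the real referrer passes the check above for the target's own host, while the page the user actually came from is never disclosed.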

Many blogs publish referrer information in order to link back to people who are linking to them, and hence broaden the conversation. This has led, in turn, to the rise of referrer spam: the sending of fake referrer information in order to popularize the spammer’s site.

Many pornographic paysites use referrer information to secure their materials: only browsers arriving from a small set of approved (login) pages are given access, which facilitates the sharing of materials among a group of cooperating paysites. Referrer spoofing is often used to gain free access to these sites.

Referrer hiding

Most web servers maintain logs of all traffic and record the HTTP referrer sent by the browser for each request. This raises a number of privacy concerns, and as a result a number of systems have been developed to prevent servers from being sent the real referring URL. These systems work either by blanking the referrer header or by replacing it with inaccurate data. Generally, internet security suites blank the referrer data, while web-based services replace it with a false URL, usually their own; this, of course, raises the problem of referrer spam. The technical details of both methods are fairly consistent: software applications act as a proxy server and manipulate the HTTP request, while web-based methods load websites within frames, causing the browser to send a referrer URL of their own website address. Some web browsers give their users the option to turn off referrer headers.
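
The two approaches (blanking the header, or replacing it with other data) can be illustrated with a small Python sketch of what a filtering proxy might do to the request headers before forwarding them. The function name rewrite_referrer and the representation of headers as a plain dictionary are assumptions made for this example, not the behaviour of any particular product.

```python
def rewrite_referrer(headers, replacement=None):
    """Return a copy of `headers` with the referrer blanked or replaced.

    HTTP header names are case-insensitive, so every spelling of
    "Referer" is dropped. If `replacement` is given, it is sent instead
    of the real referring URL; otherwise no referrer is sent at all.
    """
    cleaned = {
        name: value
        for name, value in headers.items()
        if name.lower() != "referer"
    }
    if replacement is not None:
        cleaned["Referer"] = replacement
    return cleaned


if __name__ == "__main__":
    # Hypothetical outgoing request headers.
    request_headers = {
        "Host": "example.org",
        "Referer": "http://intranet.example/private/page.html",
    }
    print(rewrite_referrer(request_headers))                         # blanked
    print(rewrite_referrer(request_headers, "http://example.org/"))  # replaced
```

The function returns a new dictionary rather than modifying its argument, so the original request can still be logged locally if desired.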

Most browsers do not send the referrer header when they are instructed to redirect using the “Refresh” HTTP header. Some versions of Opera and many mobile browsers are exceptions. However, this method of redirection is discouraged by the W3C.

If a website is accessed over an HTTP Secure (HTTPS) connection and a link on it points to a non-secure connection, then the referrer header is not sent.
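
The rule about secure connections can be expressed as a simple scheme comparison. The following Python sketch is only an approximation of what a browser does by default; the function name should_send_referrer is hypothetical, and real browsers may apply additional policies.

```python
from urllib.parse import urlsplit


def should_send_referrer(source_url, target_url):
    """Return False when following the link would leak a secure page's URL.

    The referrer is withheld only for the downgrade case: the current
    page was loaded over HTTPS and the link target is not HTTPS.
    """
    source_scheme = urlsplit(source_url).scheme
    target_scheme = urlsplit(target_url).scheme
    is_downgrade = source_scheme == "https" and target_scheme != "https"
    return not is_downgrade
```

Under this sketch, links from one secure page to another, and links between non-secure pages, still carry the referrer; only a link from a secure page to a non-secure one suppresses it.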
