Robust Hotlink Protection Strategies
If your website and all of its resources are in the same domain, add the `Cross-Origin-Resource-Policy: same-site` response header to your resources. If you use a CDN or serve some resources from an external domain, add the `Cross-Origin-Resource-Policy: same-origin` *and* `Access-Control-Allow-Origin: https://yourbusiness.example` response headers to your (external) resources and force a CORS request by using the `crossorigin` attribute.
Hotlinking refers to the practice of third-party web properties loading resources (most commonly images) directly from your own server.
For example, if you operate the website `yourbusiness.example`, you may have an image at `https://yourbusiness.example/infographic.png` that you use within your website. A hotlink is when an unrelated property (for example, the site `anotherbusiness.test`) embeds that image directly on their website by reference to your server, for example using the following HTML code:
<img src="https://yourbusiness.example/infographic.png" alt="Widget purchases per capita">
Unauthorised hotlinks are generally undesirable, not only because they can facilitate reproducing your content without permission but also because, since the resources are being loaded directly from your server, they can burden you with additional server costs in computing and bandwidth.
Older Approach to Hotlink Protection
Hotlink protection has historically relied on the HTTP `Referer` header, which indicates the source of the document loading the resource.
When a request is made to `https://yourbusiness.example/infographic.png` from your website, this referrer header would look something like `Referer: https://yourbusiness.example/statistics.html`, whereas when it’s embedded in a third-party website, it might look like `Referer: https://anotherbusiness.test/widgets-consumption/`.
This approach is usually implemented in various web servers as follows.
Example Configuration for Referer-based Protection
Apache: .htaccess
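RewriteEngine on
# Let requests without a Referer through
RewriteCond %{HTTP_REFERER} !^$
# Let the site itself and its subdomains through; [^/]+ keeps the match within the host part
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?yourbusiness\.example(/|$) [NC]
RewriteRule \.(jpe?g|png|gif)$ - [NC,F,L]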
nginx
location ~ \.(jpe?g|png|gif)$ {
    valid_referers none blocked server_names *.yourbusiness.example yourbusiness.example;
    if ($invalid_referer) {
        return 403;
    }
}
Microsoft IIS
<rule name="Hotlinking Prevention" stopProcessing="true">
    <match url=".*\.(jpe?g|png|gif)$" />
    <conditions>
        <!-- Let requests without a Referer through -->
        <add input="{HTTP_REFERER}" pattern="^$" negate="true" />
        <!-- Let the site itself and its subdomains through -->
        <add input="{HTTP_REFERER}" pattern="^https?://([^/]+\.)?yourbusiness\.example(/.*)?$" negate="true" />
    </conditions>
    <action type="CustomResponse" statusCode="403" />
</rule>
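The decision these server rules encode can be sketched in a few lines of Python. This is only an illustration of the logic, not a replacement for the server configuration: the function name `is_blocked` and its parameters are hypothetical, and the sketch parses the host out of the `Referer` value instead of matching it with a regular expression.

```python
from urllib.parse import urlsplit

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")

def is_blocked(path, referer, site="yourbusiness.example"):
    """Return True when a Referer-based rule would answer with a 403.

    `is_blocked` and `site` are illustrative names, not part of any server API.
    """
    if not path.lower().endswith(IMAGE_EXTENSIONS):
        return False  # only image files are protected
    if not referer:
        return False  # a missing or empty Referer is let through
    host = (urlsplit(referer).hostname or "").lower()
    # Allow the site itself and any of its subdomains
    return not (host == site or host.endswith("." + site))
```

Note that a request without a `Referer` header is never blocked here, which matches the first condition in each of the configurations above.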
Issues With Referer-based Protection
Hotlink protection using the `Referer` header has a number of issues that make it inappropriate in many scenarios.
Hotlinking vs. Direct Linking
Perhaps the most compelling argument against relying on the `Referer` header is that doing so also breaks regular direct links, which may not always be desirable. This is because the `Referer` header is sent not only when loading resources but also when a navigation action happens. Hence, a citation like `From YourBusiness' <a href=https://yourbusiness.example/infographic.png>widget purchases per capita</a> have increased steadily over the last century` would result in a `Referer` header being sent from an external domain, which will trigger hotlink protection and result in blocked content. In most cases, this is a bad user experience.
Referer Omission
Another reason why this kind of protection is ineffective is that the `Referer` header is not sent along with requests in all cases.
One common case is that resources loaded from a plain HTTP location (i.e., with the URL starting with `http://`) will generally not include a `Referer` header when they are loaded from an HTTPS site. Therefore, a request for an image at `http://yourbusiness.example/infographic.png` embedded in a page at `https://example.com` will typically not have a `Referer` header.
Although the above scenario is not as relevant nowadays, as most sites use HTTPS, it is just one example of how the `Referer` header can be omitted. A much more relevant argument is that the embedding site is in full control of whether a `Referer` header is sent along, which is as simple as adding the `referrerpolicy="no-referrer"` attribute to the `img` tag.
It may be tempting to address this issue by requiring that a `Referer` header be present. However, there are other reasons why a `Referer` header may not be sent, such as direct navigation by a user, browser configuration and extensions that block this header, or proxies that remove it. Thus, simply blocking incoming requests missing this header could result in poor user experience for some users, as your page will look ‘broken’ since resources won’t load.
Additional Processing
A third disadvantage of `Referer`-based protection is that it’s dynamic by nature, meaning that every incoming request needs to be evaluated by your server to check whether the `Referer` header contains an allowed value. This is a relatively small concern, but it introduces a tiny amount of latency to the response and results in more complex cache management.
As the web is moving more and more towards static or pre-rendered content and serverless platforms, effective protection that involves as little server logic as possible is ideal.
A More Robust Approach
Web standards and browsers have come a long way in the last few decades, and they now include all of the tools needed for effective and robust protection against hotlinking in the most common scenarios. Specifically, developments in the Fetch standard regarding cross-origin requests arm us with request and response headers that we can use to implement browser-enforced hotlink protection.
Some Relevant Headers
Origin
The `Origin` HTTP request header is in many respects similar to the `Referer` header, with some enhancements that make it more suitable for requests across different web properties.
One key difference between `Referer` and `Origin` is that the former typically is a full URL (for example, `https://www.example.com/page/`) whilst the latter is always just an origin, or the first part of the URL, like `https://www.example.com`.
The `Origin` request header is sent for requests with methods other than `GET` or `HEAD`, or for requests that are explicitly marked as cross-origin (i.e., from one site to another).
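To make the relationship between a full URL and its origin concrete, here is a minimal Python sketch that derives the origin (scheme, host and any non-default port) from a URL. The helper name `origin_of` is hypothetical, and the sketch deliberately ignores corner cases such as IPv6 literals and opaque origins.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def origin_of(url):
    """Return the origin part of a URL, e.g. 'https://www.example.com'.

    Illustrative helper only; IPv6 hosts and opaque origins are not handled.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # The default port for the scheme is not part of the serialised origin
    if port is None or port == DEFAULT_PORTS.get(scheme):
        return f"{scheme}://{host}"
    return f"{scheme}://{host}:{port}"
```

For instance, `origin_of("https://www.example.com/page/")` yields the same `https://www.example.com` value used in the example above.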
Cross-Origin-Resource-Policy
The `Cross-Origin-Resource-Policy` HTTP response header is used to define a policy for cross-origin requests made in `no-cors` mode.
In practical terms, this header alone is sufficient in most cases for effective hotlink protection.
`Cross-Origin-Resource-Policy` can take one of three values: `same-origin`, `same-site` or `cross-origin`. All we need to do is include the header `Cross-Origin-Resource-Policy` with either `same-origin` or `same-site`, and embedding of your resources by external websites will be blocked.
Difference Between same-origin and same-site
The value `same-origin` specifies a stricter policy than `same-site`.
Consider the site at `https://example.com` and the resources `https://example.com/a.png` and `https://images.example.com/b.png`. While both resources are part of the same site (i.e., `example.com`), each resource has a different origin: `https://example.com` and `https://images.example.com`, respectively. While both `same-origin` and `same-site` will allow the site to use the first resource, only `same-site` will allow it to embed the second resource, as the origins are different.
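The distinction can be sketched in Python as two comparisons. This is a simplified model under a loud assumption: the "site" is taken to be the last two labels of the host name, whereas browsers consult the Public Suffix List to find the registrable domain. All function names are hypothetical.

```python
from urllib.parse import urlsplit

def origin(url):
    """Scheme, host and port together identify an origin."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)

def site(url):
    """ASSUMPTION: the site is the last two host labels (no Public Suffix List)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])

def same_origin(a, b):
    return origin(a) == origin(b)

def same_site(a, b):
    return site(a) == site(b)

page = "https://example.com/"
print(same_origin(page, "https://example.com/a.png"))         # first resource
print(same_origin(page, "https://images.example.com/b.png"))  # second resource
print(same_site(page, "https://images.example.com/b.png"))    # second resource
```

Under this model, the first resource passes both checks, while the second one passes only the same-site check.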
Access-Control-Allow-Origin
The `Access-Control-Allow-Origin` HTTP response header is relevant for so-called CORS requests, which are requests that include an `Origin` header. Furthermore, the CORS protocol defines some versatile mechanisms that allow sites to define policies for cross-origin resource sharing.
Normally, most resources that we would like to protect against hotlinking (for example, images) are not loaded using CORS requests. However, CORS requests can be made explicitly by adding the `crossorigin` or `crossorigin="anonymous"` attribute to the resource tag, for example like this: `<img crossorigin alt=Example src=sample.jpg>`.
The `Access-Control-Allow-Origin` header tells the browser that a certain origin is actually allowed to make a cross-origin request, and it should take either of two values: `*`, indicating that the resource can be requested by all origins, or the value of the `Origin` header included with the request. Other values or absence of this header tell the browser that the request is not allowed in the context of the CORS protocol.
Hotlink Protection for Sites Using a Single Origin
Many smaller sites serve all of their content from a single origin. For example, if you use WordPress, you may have the site `https://yourbusiness.example` and have most of your images under the `https://yourbusiness.example/wp-content/uploads` directory.
For this simple case, an effective way to implement hotlink protection is to include the header `Cross-Origin-Resource-Policy: same-origin` along with your responses, and this will prevent hotlinking by any other sites.
It is important that the `Access-Control-Allow-Origin` header not be sent, or if it is, that it be set to the origin of your site (e.g., `https://yourbusiness.example`). Sending `Access-Control-Allow-Origin` with any other value, especially `*`, or echoing back the `Origin` header included in the request will allow hotlinking if a CORS request is made.
Configuration
Apache .htaccess
This policy can be implemented in Apache by using the `.htaccess` file (or alternatively the main configuration file) with something along these lines:
<FilesMatch "\.(jpe?g|png|gif)$">
    <IfModule mod_headers.c>
        Header set Cross-Origin-Resource-Policy "same-origin"
    </IfModule>
</FilesMatch>
The `FilesMatch` directive can be adjusted as needed depending on the files that require hotlink protection. Because this technique does not have many of the limitations of the `Referer`-based one, it is even possible to skip this check and include the header with all responses.
nginx
In the relevant `server` block, this policy can be implemented as follows, adjusting the `location` part as needed:
location ~ \.(jpe?g|png|gif)$ {
    add_header Cross-Origin-Resource-Policy same-origin;
}
Hotlink Protection for Sites Using Subdomains
Oftentimes sites serve resources from a subdomain. For instance, the main site could be available at `https://yourbusiness.example` while images and other static resources are hosted at `https://static.yourbusiness.example`. It may also be the case that certain parts of the site reside in a subdomain (for example, `https://store.yourbusiness.example`) and subdomains share resources.
For these scenarios, hotlink protection can still use the `Cross-Origin-Resource-Policy` header, except that the `same-site` value (instead of `same-origin`) is likely the most appropriate choice.
Hotlink Protection for Sites Using External Resources
It is increasingly common for sites to use CDNs to serve static resources, which often are accessed through a separate domain. For example, `https://yourbusiness.example` might load images from the origin `https://yourbusiness.cdnprovider.example`.
Hotlink protection in this scenario is slightly more involved than when all resources are part of the same site: requests are then by definition cross-origin and cross-site, and CORS policies are only enforced for requests that use the CORS protocol.
Since we want to load resources from a different origin, we can’t rely on the `Cross-Origin-Resource-Policy` header alone. This is because the only value that would seem appropriate is `cross-origin`, which does not provide any hotlink protection whatsoever: adding the header `Cross-Origin-Resource-Policy: cross-origin` to CDN responses would result in anyone being able to load the resource in question from any origin, which is exactly the situation that we are trying to avoid.
Fortunately, we can force requests to use the CORS protocol and this way have more granular control over who has access.
Counter-intuitively, an appropriate value for the `Cross-Origin-Resource-Policy` header in CDN responses is `same-origin`. With this value, non-CORS requests to the CDN will fail, which leaves CORS requests as the only way to load resources. This in turn equips us with more granular ways of defining an access policy through the `Access-Control-Allow-Origin` header.
For the CDN or external domain case, we hence need three elements for hotlink protection:
- `Cross-Origin-Resource-Policy: same-origin` in the external resource response. This blocks non-CORS requests.
- `Access-Control-Allow-Origin: https://yourbusiness.example` in the external resource response (where `https://yourbusiness.example` is the origin the resource will be loaded from). This tells the browser that the response is intended for use by this origin and this origin only.
- The `crossorigin` (or `crossorigin="anonymous"`) attribute in the document referencing the resource. This means that `<img src=https://yourbusiness.cdnprovider.example/image.webp>` becomes `<img crossorigin src=https://yourbusiness.cdnprovider.example/image.webp>`.
This approach will effectively block hotlinking from origins other than `https://yourbusiness.example`, which is exactly what we are looking for. As an added bonus, this same configuration also works for the single-origin case discussed earlier and, with the caveats that follow, for the single-site case.
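How the three elements interact can be modelled with a small Python sketch of the browser's decision. It is a deliberately simplified model, not the actual Fetch algorithm: the function `browser_allows` is hypothetical, origins are compared as plain strings, credentials are ignored, and `same-site` is treated like `same-origin` (which is only reasonable here because the CDN in the example lives on a different site).

```python
CORP = "Cross-Origin-Resource-Policy"
ACAO = "Access-Control-Allow-Origin"

def browser_allows(page_origin, resource_origin, mode, headers):
    """Simplified model of whether a browser lets a page use a resource.

    mode is "no-cors" (plain <img src=...>) or "cors" (<img crossorigin ...>).
    ASSUMPTION: same-site is handled like same-origin in this sketch.
    """
    if page_origin == resource_origin:
        return True  # same-origin loads are never restricted
    if mode == "no-cors":
        # Only an absent policy or an explicit cross-origin value lets it through
        return headers.get(CORP) in (None, "cross-origin")
    # CORS request: the response must name the requesting origin or use "*"
    return headers.get(ACAO) in ("*", page_origin)

cdn = "https://yourbusiness.cdnprovider.example"
cdn_headers = {CORP: "same-origin", ACAO: "https://yourbusiness.example"}

print(browser_allows("https://yourbusiness.example", cdn, "cors", cdn_headers))
print(browser_allows("https://anotherbusiness.test", cdn, "cors", cdn_headers))
print(browser_allows("https://anotherbusiness.test", cdn, "no-cors", cdn_headers))
```

In this model, only the first call (your own site making a CORS request) succeeds; the third-party site is refused in both request modes.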
Advantages, Caveats, and Limitations
The solution presented is simple (it requires adding a few HTTP headers to the response and a small change to the HTML markup), robust, and has good browser support. Because the policy is enforced by the browser itself, it also degrades gracefully, meaning that the resources will still load normally in the few browsers still in use that don’t support these headers.
Moreover, this solution for hotlink protection is in many cases stateless, meaning that no conditional logic is required in the server, as the values for the headers are determined in advance.
The main caveat applies to sites that make use of multiple domains or subdomains and that:
- load content served from an external domain (such as a CDN), or
- use the `crossorigin` attribute.
Since the `Access-Control-Allow-Origin` header can only contain a single origin (and there is no syntax for allowing subdomains), these sites will require some server-side logic to give an appropriate value to the `Access-Control-Allow-Origin` header. The header must contain the value of the `Origin` header sent in the original request, and this value must be validated first to ensure that it’s an allowed value. Moreover, it’s likely that these sites will need to add `Origin` to their `Vary` response header to allow for proper caching, as the response now depends on that request header.
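A minimal Python sketch of that server-side logic might look as follows. The allow-list contents and the function name `cors_headers` are assumptions made for illustration. Since `Vary` lists request headers, it is `Origin` that is named there, so that caches keep responses for different origins apart.

```python
# Hypothetical allow-list: every origin that may embed the resources
ALLOWED_ORIGINS = frozenset({
    "https://yourbusiness.example",
    "https://store.yourbusiness.example",
})

def cors_headers(request_origin):
    """Build the hotlink-protection response headers for one request.

    request_origin is the value of the Origin request header, or None.
    """
    headers = {
        "Cross-Origin-Resource-Policy": "same-origin",  # blocks non-CORS requests
        "Vary": "Origin",  # the response depends on the Origin request header
    }
    # Echo the origin back only after validating it against the allow-list
    if request_origin in ALLOWED_ORIGINS:
        headers["Access-Control-Allow-Origin"] = request_origin
    return headers
```

Requests from origins outside the allow-list simply receive no `Access-Control-Allow-Origin` header, which the browser treats as a refusal.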
Direct Linking Protection
In certain scenarios, it may be desirable to prevent direct access or direct linking to certain resources. The newer `Sec-Fetch-*` request headers allow for controlling these actions.
Images and Other Embeddable Content
The `Sec-Fetch-Dest` request header is useful in these cases to have fine-grained control over when a browser is allowed to download a certain resource.
For example, to prevent direct access to an image, you can check that this header is set to `image`. To allow direct access and embedding as an image, but not other uses (for example, a `fetch` or `XHR` request), only the `document` and `image` values could be allowed. To block direct access, just `image` would be allowed.
Note that this header is not yet supported by all browsers (most notably Safari), so it’s advised that, if you decide to make decisions based on this header, you allow requests that do not have it set.
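As a rough Python sketch of such a check, the hypothetical function below accepts a request when the `Sec-Fetch-Dest` value is in an allowed set and, following the advice above, lets requests without the header through.

```python
def allow_by_dest(sec_fetch_dest, allowed=("document", "image")):
    """Decide whether to serve a resource based on Sec-Fetch-Dest.

    sec_fetch_dest is the header value, or None when the browser did not send it.
    Function and parameter names are illustrative only.
    """
    if sec_fetch_dest is None:
        return True  # header unsupported or stripped: do not break the page
    return sec_fetch_dest in allowed
```

Passing `allowed=("image",)` corresponds to the stricter variant that also blocks direct access.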
External Direct Links
You may want to prevent or discourage external websites directly linking to certain files on your site (for instance, a large PDF file) while still allowing your users to access this content by directly linking files internally.
The `Sec-Fetch-Site` request header can be helpful for taking different actions based on where a user came from. For example, if this header is set to `cross-site`, then you might decide to issue a `303 See Other` redirect to a page discussing the resource in question.
Like `Sec-Fetch-Dest`, `Sec-Fetch-Site` does not yet have wide enough support and may not be present in all requests. It is recommended to let requests without this header through as usual.
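The redirect idea can be sketched in Python as a function returning a status code and extra headers. The function name and the landing page path are made up for the example.

```python
def respond_to_file_request(sec_fetch_site, landing_url="/reports/about.html"):
    """Return (status, headers) for a request to a protected file.

    sec_fetch_site is the Sec-Fetch-Site value, or None when absent.
    landing_url is a hypothetical page that discusses the file.
    """
    if sec_fetch_site == "cross-site":
        # External link: send the visitor to the page about the resource
        return 303, {"Location": landing_url}
    # Internal links and requests without the header are served normally
    return 200, {}
```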
As of the time of writing, `Cross-Origin-Resource-Policy` is supported by over 93% of global users.
Published at DZone with permission of Ricardo Ivan Vieitez Parra. See the original article here.