About Using Regexp in Nginx Map
Nginx has a wonderful map directive that allows you to significantly simplify and shorten configuration files.
The essence of the Nginx map directive is that it creates a new variable whose value depends on the values of one or more source variables. The directive becomes even more powerful when regular expressions are used. However, one important point is easy to forget. Excerpt from the manual:
Since variables are evaluated only when they are used, the mere declaration even of a large number of map variables does not add any extra costs to request processing.
The important factor is not only that "map does not add any extra costs to request processing", but also that "variables are evaluated only when they are used".
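To make the second point concrete, here is a minimal sketch. The $is_mobile variable, the user-agent pattern, the paths, and both locations are hypothetical and serve only as an illustration; the fragment is assumed to sit inside the http block.

```nginx
# Hypothetical map: $is_mobile is only a formula until something reads it.
map $http_user_agent $is_mobile {
    default                    0;
    "~*(?:android|iphone)"     1;
}

server {
    listen 80;
    server_name example.com;

    # $is_mobile is never referenced here, so the regex above is not run
    # for requests handled by this location.
    location /static/ {
        root /var/www;
    }

    # $is_mobile is referenced here, so the map is evaluated for these
    # requests, while the request is being processed.
    location / {
        add_header X-Is-Mobile $is_mobile;
        root /var/www;
    }
}
```

Declaring the map costs nothing for /static/ requests; the lookup happens only where the variable is actually read.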
As you know, Nginx configuration is mostly declarative, and this also applies to the map directive. Although map is declared in the http context, it is not evaluated until a request is processed. That is, when we use the resulting variable in the server, location, if, and other contexts, we specify not the result of the evaluation but the formula by which the result will be computed when it is required. This subtlety causes no problems until we use regular expressions with captures or, to be more precise, regular expressions with unnamed captures. An example makes this clearer.
Let’s say we have a domain example.com with many third-level subdomains, e.g. en.example.com, de.example.com, etc., and we want to redirect them to new subdomains en.example.org, de.example.org, etc. Instead of writing hundreds of redirect rules, we will do this:
map $host $redirect_host {
    default "example.org";
    "~^(\S+)\.example\.com$" $1.example.org;
}
server {
    listen *:80;
    server_name .example.com;
    location / {
        rewrite ^(.*)$ https://$redirect_host$1 permanent;
    }
}
We expect that when en.example.com is requested, the regex in the map will be evaluated first. Accordingly, by the time processing reaches the location, the $redirect_host variable should contain the value en.example.org. However, in reality, everything looks different:
$ GET -Sd en.example.com
GET http://en.example.com
301 Moved Permanently
GET https://en.example.orgen
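It turns out that when the request is processed, the redirect target becomes https://en.example.orgen. This is because we overlooked the warning "variables are evaluated only when they are used": the map regex runs only when rewrite reads $redirect_host, that is, after the rewrite regex has already matched, and it overwrites the $1 that rewrite captured. In effect, one regex is now nested inside another.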
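The sequence can be traced step by step. The annotated fragment below is a sketch of the evaluation order inferred from the output above, not a statement about nginx internals:

```nginx
# Request: GET http://en.example.com/
location / {
    # 1. rewrite matches the URI "/" against ^(.*)$, so $1 is "/".
    # 2. Building the replacement string reads $redirect_host. Only now is
    #    the map evaluated: its regex matches $host, and its unnamed
    #    capture overwrites $1 with "en".
    # 3. $redirect_host expands to "en.example.org".
    # 4. $1 is substituted next, but it now holds "en" instead of "/".
    # Result: https://en.example.org + en = https://en.example.orgen
    rewrite ^(.*)$ https://$redirect_host$1 permanent;
}
```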
The simplest solution is not to use regexp both in the map and in the place where the variable is evaluated. For example, for this particular case it might look like this:
map $host $redirect_host {
    default "example.org";
    "~^(\S+)\.example\.com$" $1.example.org;
}
server {
    listen *:80;
    server_name .example.com;
    location / {
        return 301 https://$redirect_host$request_uri;
    }
}
But what if there is no alternative solution without regexes (or you just really want to use them)?
Let’s try to use named captures in the map:
map $host $redirect_host {
    default "example.org";
    "~^(?<domain3>\S+)\.example\.com$" $domain3.example.org;
}
server {
    listen *:80;
    server_name .example.com;
    location / {
        rewrite ^(.*)$ https://$redirect_host$1 permanent;
    }
}
The attempt is unsuccessful:
$ GET -Sd en.example.com
GET http://en.example.com
301 Moved Permanently
GET https://en.example.orgen
The unnamed capture $1 still receives the value of the named capture $domain3, because a named group also occupies a numbered position in the regex. This means you need to use named captures in both regexes:
map $host $redirect_host {
    default "example.org";
    "~^(?<domain3>\S+)\.example\.com$" $domain3.example.org;
}
server {
    listen *:80;
    server_name .example.com;
    location / {
        rewrite ^(?<requri>.*)$ https://$redirect_host$requri permanent;
    }
}
And now everything works as expected:
$ GET -Sd en.example.com
GET http://en.example.com
301 Moved Permanently
GET https://en.example.org/