
Nginx URL Rewrite Explained

Differences between break and last

If the directives of this module are given at the server level, then they are carried out before the location of the request is determined. If in that selected location there are further rewrite directives, then they also are carried out. If the URI changed as a result of the execution of directives inside location, then location is again determined for the new URI.

This cycle can be repeated up to 10 times, after which Nginx returns a 500 error.

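Rewrite (URL rewriting) directives can appear under server {} as well as under location {}, and the two behave differently. Rewrite directives under server {} run before location matching. Rewrite directives under location {} naturally run after location matching, but because they change the request URI, the rewritten URI has to be matched against the locations again, as if it were a new HTTP request. (This repeated matching caused by rewrites inside location {} can happen at most 10 times; beyond that nginx returns a 500 error.) In short, when both server {} and location {} contain rewrite directives, the server {} ones run first, then a location is matched, and then the rewrite directives inside the matched location {} run. If those change the URI, the resulting URI is treated as a new request and matched against the locations again (presumably this also means re-running the rewrite directives under server {}).

Difference between the last and break flags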

  • last - completes processing of rewrite directives, after which searches for corresponding URI and location

  • break - completes processing of rewrite directives

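Both flags evidently mean "do not run the rewrite directives that follow", but the documentation does not spell out how they differ.

The experiments below show the difference. Preparation:

  1. Install nginx. If you are not familiar with installation or with location, see http://eyesmore.iteye.com/blog/1141660
  2. In the html subdirectory of the nginx installation directory, create four files named aaa.html, bbb.html, ccc.html and ddd.html. Each file contains its own name (for example, aaa.html contains "aaa html file").
  3. The initial nginx configuration is: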

error_log logs/error.log info; # rewrite activity is written to this log

server {

    listen 9090;
    server_name localhost;

    root html;
    rewrite_log on; # enable the rewrite log so that entries go to error_log

    location /aaa.html {
        rewrite "^/aaa\.html$" /bbb.html;
        rewrite "^/bbb\.html$" /ddd.html;
    }

    location /bbb.html {
        rewrite "^/bbb\.html$" /ccc.html;
    }

}

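Two points to note about this configuration:

  1. The rewrite log is switched on so that rewrite processing is recorded in error_log. (Rewrite entries are logged at the notice level, so the error_log level must be set accordingly; info is used here.)
  2. Two locations are defined, /aaa.html and /bbb.html. Inside /aaa.html, /aaa.html is rewritten to /bbb.html and then /bbb.html is rewritten to /ddd.html. Inside /bbb.html, /bbb.html is rewritten to /ccc.html.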
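
The test pages from preparation step 2 can also be generated with a short script. This is only a convenience sketch: the function name create_test_pages and the directory argument are made up for illustration, and the directory has to be pointed at the html directory of your own nginx installation.

```python
from pathlib import Path


def create_test_pages(html_dir, names=("aaa", "bbb", "ccc", "ddd")):
    """Write <name>.html files whose content is '<name> html file'."""
    html_dir = Path(html_dir)
    html_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name in names:
        page = html_dir / f"{name}.html"
        page.write_text(f"{name} html file\n", encoding="utf-8")
        created.append(page)
    return created


# Example call (the path is a placeholder for your nginx html directory):
# create_test_pages("/usr/local/nginx/html")
```

The later experiments in the "Rewrite iteration" section also serve an eee.html page, so passing names=("aaa", "bbb", "ccc", "ddd", "eee") covers those as well.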

Requesting /aaa.html without last or break

  • Example 1: requesting /aaa.html with neither the last nor the break flag

  • Configuration file

location /aaa.html {
    rewrite "^/aaa\.html$" /bbb.html;
    rewrite "^/bbb\.html$" /ddd.html;
}

location /bbb.html {
    rewrite "^/bbb\.html$" /ccc.html;
}
  • Request and response
[root@web108 ~]# curl http://localhost:9090/aaa.html

ddd html file
  • Rewrite Log

2011/08/07 22:13:23 [notice] 9066#0: *85 "^/aaa\.html$" matches "/aaa.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/07 22:13:23 [notice] 9066#0: *85 rewritten data: "/bbb.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/07 22:13:23 [notice] 9066#0: *85 "^/bbb\.html$" matches "/bbb.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/07 22:13:23 [notice] 9066#0: *85 rewritten data: "/ddd.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/07 22:13:23 [info] 9066#0: *85 client 127.0.0.1 closed keepalive connection
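  • Walkthrough

The rewrite log shows how the HTTP request "GET /aaa.html" was rewritten:

  1. First /aaa.html is rewritten to /bbb.html (rewritten data: /bbb.html).
  2. Processing continues with the next rewrite directive, which rewrites the URI to /ddd.html (rewritten data: /ddd.html).
  3. No further rewriting takes place. (In fact /ddd.html has to be matched against the locations once more at this point; the log simply does not show it, but the second experiment, which follows, makes it visible.) The content of /ddd.html is then returned.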
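
Reading the chain of rewrites off the log by eye gets tedious with longer logs. The following is a minimal sketch that extracts it from "rewritten data" entries in the format shown above. It assumes that all lines passed in belong to a single request; the function name rewrite_chain and the two regular expressions are illustrative, not part of nginx.

```python
import re

# Captures the URI in: rewritten data: "/bbb.html" , args: "" ...
REWRITTEN = re.compile(r'rewritten data: "([^"]*)"')
# Captures the requested URI in: request: "GET /aaa.html HTTP/1.1"
REQUEST = re.compile(r'request: "\S+ (\S+) [^"]*"')


def rewrite_chain(log_lines):
    """Return the requested URI followed by every rewritten URI, in log order."""
    chain = []
    for line in log_lines:
        hit = REWRITTEN.search(line)
        if not hit:
            continue  # "matches" / "does not match" / info lines
        if not chain:
            request = REQUEST.search(line)
            if request:
                chain.append(request.group(1))
        chain.append(hit.group(1))
    return chain


sample = [
    r'2011/08/07 22:13:23 [notice] 9066#0: *85 "^/aaa\.html$" matches "/aaa.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"',
    r'2011/08/07 22:13:23 [notice] 9066#0: *85 rewritten data: "/bbb.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"',
    r'2011/08/07 22:13:23 [notice] 9066#0: *85 rewritten data: "/ddd.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"',
]
print(" -> ".join(rewrite_chain(sample)))  # /aaa.html -> /bbb.html -> /ddd.html
```

On a busy server the lines of different connections are interleaved, so the input would first have to be filtered by the connection number (the *85 field).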

Requesting /aaa.html with the last flag

  • Configuration file: add the last flag to the first rewrite
location /aaa.html {
    rewrite "^/aaa\.html$" /bbb.html last; # last flag added
    rewrite "^/bbb\.html$" /ddd.html;
}

location /bbb.html {
    rewrite "^/bbb\.html$" /ccc.html;
}
  • Request and response
[root@web108 ~]# curl http://localhost:9090/aaa.html

ccc html file
  • Rewrite Log:
2011/08/07 22:24:31 [notice] 18569#0: *86 "^/aaa\.html$" matches "/aaa.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/07 22:24:31 [notice] 18569#0: *86 rewritten data: "/bbb.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/07 22:24:31 [notice] 18569#0: *86 "^/bbb\.html$" matches "/bbb.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/07 22:24:31 [notice] 18569#0: *86 rewritten data: "/ccc.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/07 22:24:31 [info] 18569#0: *86 client 127.0.0.1 closed keepalive connection
  • Walkthrough

Readers may be surprised that GET /aaa.html returns "ccc html file" and wonder why the result is not "bbb html file".

Here is the whole process. First, /aaa.html matches location /aaa.html {}, so rewrite "^/aaa\.html$" /bbb.html last is executed and /aaa.html is rewritten to /bbb.html. Because of the last flag, the following rewrite directive (rewrite "^/bbb\.html$" /ddd.html) is not executed.

It might seem that "bbb html file" should be returned at this point, but look at the official nginx explanation: "last - completes processing of rewrite directives, after which searches for corresponding URI and location". In other words, last stops any further rewrite directives from being processed, but the rewritten URI is then matched against the locations again.

Now recall the official statement: "If the directives of this module are given at the server level, then they are carried out before the location of the request is determined. If in that selected location there are further rewrite directives, then they also are carried out. If the URI changed as a result of the execution of directives inside location, then location is again determined for the new URI. This cycle can be repeated up to 10 times, after which Nginx returns a 500 error." So on the second match the request lands in location /bbb.html {}, where rewrite "^/bbb\.html$" /ccc.html is executed, and the final content is "ccc html file".

Requesting /aaa.html with the break flag

Change location /aaa.html {} above to use the break flag:

  • Configuration file: add the break flag to the first rewrite
location /aaa.html {
    rewrite "^/aaa\.html$" /bbb.html break; # break flag added
    rewrite "^/bbb\.html$" /ddd.html;
}

location /bbb.html {
    rewrite "^/bbb\.html$" /ccc.html;
}
  • Request and response
[root@web108 ~]# curl http://localhost:9090/aaa.html

bbb html file
  • Rewrite Log
2011/08/07 22:37:49 [notice] 21069#0: *89 "^/aaa\.html$" matches "/aaa.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/07 22:37:49 [notice] 21069#0: *89 rewritten data: "/bbb.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/07 22:37:49 [info] 21069#0: *89 client 127.0.0.1 closed keepalive connection
  • Walkthrough

This result needs little explanation; it shows the difference between break and last clearly:

  • last - completes processing of rewrite directives, after which searches for corresponding URI and location
  • break - completes processing of rewrite directives

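Both break and last prevent the following rewrite directives from being executed. However, when last is used inside a location, the rewritten URI is matched against the locations again, whereas with break it is not.

Put simply, break terminates processing more thoroughly than last. An analogy with a while loop in programming: break is just break, it leaves the loop immediately; last is like continue, it ends the current iteration and goes straight to the next one.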
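
The three experiments can be replayed with a small model of the matching loop. This is a sketch under simplifying assumptions, not nginx's implementation: locations are plain prefixes, a matching rule replaces the whole URI, server-level directives are left out, and the names apply_rules, find_location, serve and config are invented for illustration.

```python
import re

MAX_URI_CHANGES = 10  # limit quoted from the nginx documentation above


def apply_rules(uri, rules):
    """Apply one location's rewrite rules in order.

    Each rule is (pattern, replacement, flag); flag is None, "last" or "break".
    Returns the resulting URI and the flag that stopped processing, if any.
    """
    for pattern, replacement, flag in rules:
        if re.search(pattern, uri):
            uri = replacement
            if flag is not None:
                return uri, flag  # both flags skip the remaining rules
    return uri, None


def find_location(uri, locations):
    """Pick the longest matching prefix (simplified location matching)."""
    matches = [prefix for prefix in locations if uri.startswith(prefix)]
    return max(matches, key=len) if matches else None


def serve(uri, locations):
    """Return the URI that is finally served, or "500" past the limit."""
    changes = 0
    while True:
        location = find_location(uri, locations)
        if location is None:
            return uri
        new_uri, flag = apply_rules(uri, locations[location])
        if flag == "break" or new_uri == uri:
            return new_uri  # break: no new location search
        changes += 1
        if changes > MAX_URI_CHANGES:
            return "500"
        uri = new_uri  # plain rewrite or last: search for a location again


def config(first_flag):
    """The two locations of the experiments; first_flag goes on the first rewrite."""
    return {
        "/aaa.html": [(r"^/aaa\.html$", "/bbb.html", first_flag),
                      (r"^/bbb\.html$", "/ddd.html", None)],
        "/bbb.html": [(r"^/bbb\.html$", "/ccc.html", None)],
    }


for flag in (None, "last", "break"):
    print(flag, "->", serve("/aaa.html", config(flag)))
# None -> /ddd.html
# last -> /ccc.html
# break -> /bbb.html
```

The printed URIs correspond to the three curl responses above: without a flag both rules of the first location run, last triggers a new location search that ends in /bbb.html's rule, and break serves /bbb.html directly.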

Rewrite iteration

Example 1

  • Nginx configuration
error_log logs/error.log info;

server {

    listen 9090;
    server_name localhost;
    root html;

    rewrite_log on;
    rewrite "^/aaa\.html$" /bbb.html;

    location /ccc.html {
        rewrite "^/ccc\.html$" /eee.html;
    }

    location /bbb.html {
        rewrite "^/bbb\.html$" /ccc.html;
        rewrite "^/ccc\.html$" /ddd.html;
    }

}
  • Request and response
[root@web108 ~]# curl http://localhost:9090/aaa.html

ddd html file
  • Rewrite Log
2011/08/08 10:05:41 [notice] 31592#0: *90 "^/aaa\.html$" matches "/aaa.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:05:41 [notice] 31592#0: *90 rewritten data: "/bbb.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:05:41 [notice] 31592#0: *90 "^/bbb\.html$" matches "/bbb.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:05:41 [notice] 31592#0: *90 rewritten data: "/ccc.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:05:41 [notice] 31592#0: *90 "^/ccc\.html$" matches "/ccc.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:05:41 [notice] 31592#0: *90 rewritten data: "/ddd.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:05:41 [notice] 31592#0: *90 "^/aaa\.html$" does not match "/ddd.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:05:41 [info] 31592#0: *90 client 127.0.0.1 closed keepalive connection
  • Walkthrough

For the request GET /aaa.html, the server-level rewrite directive runs first and rewrites the URI to /bbb.html. The request then matches location /bbb.html {}, whose location-level rewrite directives rewrite the URI to /ccc.html and then to /ddd.html. Because the URI was changed by location-level rewrite directives, location matching has to be repeated: the rewritten URI is treated as a new request, so the server-level rewrite is executed again and then a location is matched again. The log line 2011/08/08 10:05:41 [notice] 31592#0: *90 "^/aaa\.html$" does not match "/ddd.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090" shows this second pass.

Example 2

  • Nginx configuration
error_log logs/error.log info;

server {

    listen 9090;
    server_name localhost;
    root html;

    rewrite_log on;
    rewrite "^/aaa\.html$" /bbb.html;
    rewrite "^/ccc\.html$" /ddd.html;

    location /bbb.html {
        rewrite "^/bbb\.html$" /ccc.html;
    }

    location /ddd.html {
        rewrite "^/ddd\.html$" /eee.html;
    }

}
  • Request and response
[root@web108 ~]# curl http://localhost:9090/aaa.html

eee html file
  • Rewrite Log
2011/08/08 10:21:00 [notice] 2218#0: *91 "^/aaa\.html$" matches "/aaa.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [notice] 2218#0: *91 rewritten data: "/bbb.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [notice] 2218#0: *91 "^/ccc\.html$" does not match "/bbb.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [notice] 2218#0: *91 "^/bbb\.html$" matches "/bbb.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [notice] 2218#0: *91 rewritten data: "/ccc.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [notice] 2218#0: *91 "^/aaa\.html$" does not match "/ccc.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [notice] 2218#0: *91 "^/ccc\.html$" matches "/ccc.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [notice] 2218#0: *91 rewritten data: "/ddd.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [notice] 2218#0: *91 "^/ddd\.html$" matches "/ddd.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [notice] 2218#0: *91 rewritten data: "/eee.html" , args: "", client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [notice] 2218#0: *91 "^/aaa\.html$" does not match "/eee.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [notice] 2218#0: *91 "^/ccc\.html$" does not match "/eee.html" , client: 127.0.0.1, server: localhost, request: "GET /aaa.html HTTP/1.1", host: "localhost:9090"

2011/08/08 10:21:00 [info] 2218#0: *91 client 127.0.0.1 closed keepalive connection
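  • Walkthrough

    • First iteration of location matching

      For GET /aaa.html, the server-level rewrites run first. rewrite "^/aaa\.html$" /bbb.html rewrites /aaa.html to /bbb.html; /bbb.html does not match rewrite "^/ccc\.html$" /ddd.html, so the URI stays /bbb.html. Next, location /bbb.html {} is matched and its location-level rewrite directive rewrites /bbb.html to /ccc.html. Because the URI was changed by a location-level rewrite, location matching has to be iterated again.

    • Second iteration of location matching

      For /ccc.html, the result of the first iteration, the server-level rewrite directives again run first. rewrite "^/aaa\.html$" /bbb.html; does not match /ccc.html, but rewrite "^/ccc\.html$" /ddd.html; rewrites /ccc.html to /ddd.html. After the server-level rewrites, location matching follows: /ddd.html matches location /ddd.html {}, whose location-level rewrite directive rewrites /ddd.html to /eee.html. Again the URI was changed by a location-level rewrite directive, so location matching is iterated once more.

    • Third iteration of location matching

      For /eee.html, the result of the second iteration, the server-level rewrite directives rewrite "^/aaa\.html$" /bbb.html; and rewrite "^/ccc\.html$" /ddd.html; still run first, but neither matches /eee.html. Location matching for /eee.html finds no matching location either, so the final result is /eee.html and the "eee html file" page is returned.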
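
The three iterations can be reproduced with a small trace function. It is a sketch that models what the log in this article shows, namely server-level rules being evaluated again on every pass; other nginx versions may behave differently in that respect. Flags are left out, a dictionary lookup by exact name stands in for location matching, and run_rules and trace are hypothetical names.

```python
import re

MAX_URI_CHANGES = 10  # limit stated in the article


def run_rules(uri, rules):
    """Apply (pattern, replacement) rules in configuration order, without flags."""
    for pattern, replacement in rules:
        if re.search(pattern, uri):
            uri = replacement
    return uri


def trace(uri, server_rules, locations):
    """Return (final URI, or None for a 500 error) and the list of passes.

    Each pass is (URI at start, URI after server-level rules,
    URI after location-level rules).
    """
    passes = []
    changes = 0
    while True:
        after_server = run_rules(uri, server_rules)
        # Exact-name lookup stands in for real location matching.
        after_location = run_rules(after_server, locations.get(after_server, []))
        passes.append((uri, after_server, after_location))
        if after_location == after_server:
            return after_location, passes  # location rules left the URI alone
        changes += 1
        if changes > MAX_URI_CHANGES:
            return None, passes
        uri = after_location


# Example 2 from the article.
server_rules = [(r"^/aaa\.html$", "/bbb.html"), (r"^/ccc\.html$", "/ddd.html")]
locations = {
    "/bbb.html": [(r"^/bbb\.html$", "/ccc.html")],
    "/ddd.html": [(r"^/ddd\.html$", "/eee.html")],
}
final, passes = trace("/aaa.html", server_rules, locations)
for number, step in enumerate(passes, start=1):
    print(number, " -> ".join(step))
print("served:", final)
# 1 /aaa.html -> /bbb.html -> /ccc.html
# 2 /ccc.html -> /ddd.html -> /eee.html
# 3 /eee.html -> /eee.html -> /eee.html
# served: /eee.html
```

Two locations that rewrite to each other, for example /a to /b and /b back to /a, never settle; in this model trace then returns None once the URI has changed more than ten times, which corresponds to the 500 error.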

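Order of server-level and location-level directives does not matter

Finally, suppose the configuration above is changed so that the server-level rewrite directives and the locations appear in a different order: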

server {

    listen 9090;
    server_name localhost;
    root html;

    rewrite_log on;

    location /bbb.html {
        rewrite "^/bbb\.html$" /ccc.html;
    }
    location /ddd.html {
        rewrite "^/ddd\.html$" /eee.html;
    }

    rewrite "^/aaa\.html$" /bbb.html;
    rewrite "^/ccc\.html$" /ddd.html;
}

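The result is unaffected. In other words, each iteration always runs the server-level rewrite directives first, then matches a location, then runs the location-level rewrite directives; if a location-level rewrite changes the URI, another iteration follows. The total number of iterations cannot exceed 10, otherwise nginx returns a 500 error.

Rewrite logic in pseudocode

A simple pseudocode description of rewrite processing: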

int uri_changes = 0; // how often a location-level rewrite has changed the URI

while (true) {

    (1) execute the server-level rewrite directives in configuration order;
    (2) match a location against the resulting URI;
    (3)
    String uri_before_location = uri;
    execute the rewrite directives of the matched location in configuration order,
    stopping after the first matching one that carries a last or break flag;

    if (the rewrite that stopped processing has the break flag) {
        break; // skip the remaining rewrites and leave the loop
    }
    if (uri.equals(uri_before_location)) {
        break; // the location did not change the URI, matching is finished
    }

    // URI changed, with or without the last flag: go to the next iteration
    uri_changes++;
    if (uri_changes > 10) {
        return HTTP_500;
    }
}

return the response for uri;

References