Backend anti-crawler with nginx + Lua + Redis (unfinished)
1. Anti-crawler by checking the User-Agent in nginx

Go to the conf directory under the nginx installation directory and save the code below as agent_deny.conf:

    cd /usr/local/nginx/conf
    vim agent_deny.conf

Contents of agent_deny.conf:

    # Block scraping tools such as Scrapy
    if ($http_user_agent ~* (Scrapy|Curl|HttpClient)) {
        return 403;
    }
    # Block the listed UAs and requests with an empty UA
    if ($http_user_agent ~ "FeedDemon|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|Feedly|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|YisouSpider|HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|^$" ) {
        return 403;
    }
    # Block request methods other than GET, HEAD and POST
    if ($request_method !~ ^(GET|HEAD|POST)$) {
        return 403;
    }
Then, in the site's configuration, insert the following line immediately after "location / {":

    include agent_deny.conf;
After saving, run the following command to reload nginx gracefully:

    /usr/local/nginx/sbin/nginx -s reload

from: http://support.huawei.com/ecommunity/bbs/10231865.html
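
Before reloading, it can help to sanity-check which requests the rules above would reject. The following is only a rough local approximation in shell, not nginx itself. The function name expected_status is hypothetical, the second UA list is abbreviated for readability, and grep -E stands in for nginx's regex matching, which should behave the same for simple alternations like these.

```shell
#!/bin/sh
# Local approximation of agent_deny.conf: prints the status nginx would be
# expected to return (403 or 200) for a given request method and User-Agent.
# expected_status is a hypothetical helper, not part of nginx.
expected_status() {
    method=$1
    ua=$2

    # Rule 1: case-insensitive match, like nginx's ~* operator
    if printf '%s' "$ua" | grep -Eiq 'Scrapy|Curl|HttpClient'; then
        echo 403
        return
    fi

    # Rule 2: case-sensitive match, like nginx's ~ operator.
    # The list is abbreviated here; the ^$ alternative in the original
    # pattern corresponds to the empty-UA test below.
    if [ -z "$ua" ] || printf '%s' "$ua" | grep -Eq 'FeedDemon|AhrefsBot|ApacheBench|Python-urllib|MJ12bot'; then
        echo 403
        return
    fi

    # Rule 3: only GET, HEAD and POST are allowed
    case $method in
        GET|HEAD|POST) ;;
        *) echo 403; return ;;
    esac

    echo 200
}
```

For example, expected_status GET 'Scrapy/2.11' prints 403, while an ordinary browser UA with GET prints 200. Against a running server, the same check can be made with curl -I -A 'Scrapy' followed by the URL your site answers on.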
2. Anti-crawler with Lua + nginx + Redis

2.1. Install nginx, Lua and the Redis libraries

2.1.1. Download the modules directly (I cloned them into /usr/local/)

    git clone https://github.com/simpl/ngx_devel_kit.git
    git clone https://github.com/chaoslawful/lua-nginx-module.git
    git clone https://github.com/agentzh/redis2-nginx-module.git
    git clone https://github.com/agentzh/set-misc-nginx-module.git
    git clone https://github.com/agentzh/echo-nginx-module.git
    yum -y install pcre pcre-dev*

from: http://www.tuicool.com/articles/6NbEbeV
2.1.2. Install the LuaJIT library

    wget http://luajit.org/download/LuaJIT-2.0.0-beta9.tar.gz
    tar zxvf LuaJIT-2.0.0-beta9.tar.gz
    cd LuaJIT-2.0.0-beta9
    make
    sudo make install PREFIX=/usr/local/luajit
    # the binary is installed under the PREFIX above, so link it by absolute path
    sudo ln -sf /usr/local/luajit/bin/luajit-2.0.0-beta9 /usr/local/bin/luajit

    # tell nginx's build system where to find LuaJIT:
    export LUAJIT_LIB=/path/to/luajit/lib
    export LUAJIT_INC=/path/to/luajit/include/luajit-2.0

from: http://www.tuicool.com/articles/6NbEbeV
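
The /path/to/luajit placeholders above depend on the PREFIX passed to make install. As a small sketch, assuming the LuaJIT 2.0 layout (libraries in PREFIX/lib, headers in PREFIX/include/luajit-2.0), the two exports can be derived from the prefix. The helper name luajit_env is hypothetical.

```shell
#!/bin/sh
# Prints the two export lines nginx's build system needs, derived from the
# LuaJIT install prefix. luajit_env is a hypothetical helper; the default
# prefix is the one used in the install step above.
luajit_env() {
    prefix=${1:-/usr/local/luajit}
    printf 'export LUAJIT_LIB=%s/lib\n' "$prefix"
    printf 'export LUAJIT_INC=%s/include/luajit-2.0\n' "$prefix"
}
```

Running eval "$(luajit_env /usr/local/luajit)" in the shell that will later run ./configure sets both variables for that session.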
2.1.3. Recompile nginx

    ./configure \
        --prefix=/usr/local/nginx \
        --with-debug \
        --with-ld-opt="-Wl,-rpath,$LUAJIT_LIB" \
        --add-module=/usr/local/ngx_devel_kit \
        --add-module=/usr/local/echo-nginx-module/ \
        --add-module=/usr/local/lua-nginx-module/ \
        --add-module=/usr/local/set-misc-nginx-module/ \
        --add-module=/usr/local/redis2-nginx-module
    make -j2
    make install
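
After the rebuild it is worth confirming that the new binary really contains the extra modules. nginx -V prints the configure arguments (on stderr), so a small filter can list what was added. This is a minimal sketch; the function name list_added_modules is hypothetical.

```shell
#!/bin/sh
# Reads `nginx -V` output on stdin and prints the directory name of every
# module passed via --add-module, one per line.
# list_added_modules is a hypothetical helper.
list_added_modules() {
    grep -o -- '--add-module=[^ ]*' |
        sed -e 's|^--add-module=||' -e 's|/*$||' -e 's|.*/||'
}
```

Usage, with the install prefix from the configure step: /usr/local/nginx/sbin/nginx -V 2>&1 | list_added_modules. If lua-nginx-module is missing from the output, the binary being run is probably still the old one.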
2.2. Install lua-redis-parser

    git clone https://github.com/agentzh/lua-redis-parser.git
    cd lua-redis-parser
    export LUA_INCLUDE_DIR=/usr/include/lua5.1
    make CC=gcc
    make install CC=gcc
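Then edit nginx/conf/nginx.conf:

1) In the http block, add

    lua_package_path "/usr/local/lua-resty-redis/lib/?.lua;;";

so that require "resty.redis" can locate the lua-resty-redis library.

2) In the server block, add

    lua_code_cache off;

Purpose: with the code cache off, Lua script files are loaded again on every request, so changes to them take effect without restarting nginx (Lua written inline in nginx.conf still needs nginx -s reload). This costs performance, so it is only suitable during development.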
2.3. Install luarocks

Install it directly with yum:

    yum -y install luarocks
To be continued...