An interesting way to play wget
in Note with 0 comment

Many Linux users rely on wget to download files when configuring or installing software. Others use it to recursively download the entire content of a website. This note shows how to configure Nginx so that others cannot mirror your site with wget, and how to get around a wget or curl ban that was set up in Nginx or Apache.

Prevent Wget recursive download

Assume that the Nginx site configuration files live in the directory /usr/local/nginx/conf/vhost
By default, wget identifies itself with a User-Agent of the form Wget/version (for example Wget/1.21.3). Therefore, we only need to match that UA and return 403.

Nginx is configured as follows:

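 # Place these inside the relevant server { } block.
 # ~* is a case-insensitive regex match, so "Wget" also covers "wget".
 # ApacheBench (ab) sends "ApacheBench/x.y"; a bare "ab" would also match
 # unrelated UAs that merely contain those two letters, such as "Tablet".
 if ($http_user_agent ~* "(Wget|ApacheBench)") {
     return 403;
 }
 if ($http_user_agent ~* "(LWP::Simple|BBBike)") {
     return 403;
 }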
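Before reloading Nginx, it can help to check which User-Agent strings such a rule would actually catch. The following is only a rough sketch in Python: re.IGNORECASE stands in for Nginx's ~* operator, the sample UA strings are illustrative, and the names BLOCKED, TOO_SHORT and status_for are made up for this example. It also shows why a very short token such as ab is risky, since it matches any UA that happens to contain those two letters.

```python
import re

# Approximation of the Nginx rules: "~*" is a case-insensitive,
# unanchored regex match, which re.search + IGNORECASE mimics here.
BLOCKED = re.compile(r"Wget|ApacheBench|LWP::Simple|BBBike", re.IGNORECASE)

# A deliberately short pattern, to show accidental matches.
TOO_SHORT = re.compile(r"Wget|ab", re.IGNORECASE)

# Illustrative sample User-Agent strings (assumptions for this sketch).
SAMPLES = {
    "wget": "Wget/1.21.3",
    "apachebench": "ApacheBench/2.3",
    "firefox": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:43.0) "
               "Gecko/20100101 Firefox/43.0",
    "tablet": "Mozilla/5.0 (Android 4.4; Tablet; rv:41.0) "
              "Gecko/41.0 Firefox/41.0",
}


def status_for(user_agent, pattern=BLOCKED):
    """Return 403 if the pattern occurs anywhere in the UA, else 200."""
    return 403 if pattern.search(user_agent) else 200


if __name__ == "__main__":
    for name, ua in SAMPLES.items():
        print(name, status_for(ua), status_for(ua, TOO_SHORT))
```

The second column of the output is the status under the short pattern; the tablet browser is refused there only because "Tablet" contains "ab".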

Extension:
Nginx has a special, non-standard status code, 444. If return 444 is configured, Nginx closes the connection without sending any response, so the other party receives no error page at all. From the client's side it looks as if the server dropped the connection or cannot be reached.
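To get a feel for what the client sees in that case, here is a small local simulation in Python. It does not use Nginx at all: a throwaway socket on 127.0.0.1 accepts one connection, reads the request, and closes without replying, which is an assumption about how such a silent drop looks from the outside. The function names are invented for this sketch.

```python
import http.client
import socket
import threading


def serve_once_and_drop(listener):
    """Accept one connection, read the request, close without replying."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(4096)  # consume the request before closing


def fetch(port):
    """Return the HTTP status as text, or 'no response' if none arrived."""
    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/", headers={"User-Agent": "Wget/1.21.3"})
        return str(client.getresponse().status)
    except ConnectionError:
        # http.client.RemoteDisconnected is a ConnectionError subclass.
        return "no response"
    finally:
        client.close()


def demo():
    """Run the fake server and one client request against it."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(
        target=serve_once_and_drop, args=(listener,), daemon=True
    )
    worker.start()
    try:
        return fetch(port)
    finally:
        worker.join(timeout=5)
        listener.close()


if __name__ == "__main__":
    print(demo())
```

The client never gets a status line, only a closed connection, which is why a blocked downloader cannot tell a deliberate drop from a network problem.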

Getting around a wget or curl download ban

Some websites configure their web server or iptables to block access from wget and curl. What should we do when a download is refused? In fact, most of them only block the default User-Agent of wget or curl, so we only need to send a normal browser UA instead.
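The same idea applies to any HTTP client, not just wget and curl. As a minimal sketch, this is how a request with a browser User-Agent could be prepared with Python's standard library, whose default UA (Python-urllib/x.y) is just as easy to block. The URL is a placeholder and build_request is a name invented for this example; nothing is sent over the network until urlopen is called.

```python
import urllib.request

# The browser UA used throughout this note.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:43.0) "
    "Gecko/20100101 Firefox/43.0"
)


def build_request(url, user_agent=BROWSER_UA):
    """Build (but do not send) a GET request with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    req = build_request("http://www.example.com/")
    # urllib stores header names capitalised, hence "User-agent".
    print(req.get_header("User-agent"))
    # To actually fetch the page:
    # with urllib.request.urlopen(req, timeout=10) as resp:
    #     body = resp.read()
```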

Temporarily change the UA of wget

Pass the -U option (short for --user-agent) to wget to set the User-Agent for a single run:

 wget www.google.com -U 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'

Permanently change the UA of Wget

Add the following line to /etc/wgetrc (or to ~/.wgetrc to change it for a single user only):

 header = User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0

Change the UA of curl

Use the following parameters:

 curl www.google.com --user-agent "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0"

curl can also send a Referer header, which gets past some hotlink-protection (anti-leech) settings:

 curl -e http://www.google.com http://www.linpx.com

This makes the request look as if the visitor followed a link from Google to our website.
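Why this works becomes clearer when you picture the check on the server side. The sketch below is a hypothetical, simplified hotlink check written in Python; the allow-list and the function name referer_allowed are assumptions, not taken from any real server configuration. Because the Referer is supplied by the client, any value on the allow-list passes.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: hosts whose pages may link to our files.
ALLOWED_REFERER_HOSTS = {"www.google.com", "www.linpx.com"}


def referer_allowed(referer, allowed=ALLOWED_REFERER_HOSTS):
    """Mimic a simple hotlink check based only on the Referer host."""
    if not referer:
        return False  # this sketch rejects requests without a Referer
    host = urlsplit(referer).hostname  # None if the value has no scheme
    return host in allowed


if __name__ == "__main__":
    # What "curl -e http://www.google.com ..." would present to the check.
    print(referer_allowed("http://www.google.com"))
    # A Referer from a host that is not on the list.
    print(referer_allowed("http://other.example/page"))
```

A check like this only filters casual hotlinking; it cannot tell a forged Referer from a real one.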

OK, that's it.
