Scraping through Tor
Perl February 6th, 2008
If, for whatever reason, you want to scrape a website through a proxy, this is pretty easy in Perl, using WWW::Mechanize and Vidalia / Tor:
PERL:
    #!/usr/bin/perl -w
    use strict;
    use WWW::Mechanize;

    my $mech = WWW::Mechanize->new( agent => 'Whatever bot' );

    # Send HTTP and FTP requests to the local Privoxy proxy (port 8118),
    # which forwards them to Tor
    $mech->proxy(['http', 'ftp'], 'http://127.0.0.1:8118/');

    my $url = "http://www.example.com";
    $mech->get($url);

    # do some stuff...
Obviously the Tor/Privoxy/Vidalia trio should be running on your system when you execute the script. It's damn slow (naturally), but pretty damn effective.
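If the proxy isn't running, the script above just dies on the first request. Here's a rough sketch of how you might handle that a bit more gracefully. The helper names `build_mech` and `fetch_via_proxy` are made up for this example, and the 60-second timeout is an arbitrary guess at what a slow Tor circuit needs, so tune it to taste:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: build a Mechanize object that sends its
# requests to a local HTTP proxy (Privoxy in the setup above).
sub build_mech {
    my ($proxy_url) = @_;
    my $mech = WWW::Mechanize->new(
        agent     => 'Whatever bot',
        autocheck => 0,     # report failures instead of dying
        timeout   => 60,    # assumed value; Tor can be very slow
    );
    $mech->proxy(['http', 'ftp'], $proxy_url);
    return $mech;
}

# Hypothetical helper: fetch a URL and return its content,
# or undef (with a warning) if the request failed.
sub fetch_via_proxy {
    my ($mech, $url) = @_;
    $mech->get($url);
    return $mech->content if $mech->success;
    warn "Could not fetch $url: ", $mech->response->status_line, "\n";
    return;
}

# Only run the demo when executed as a script, not when loaded.
unless (caller) {
    my $mech = build_mech('http://127.0.0.1:8118/');
    my $html = fetch_via_proxy($mech, 'http://www.example.com');
    if (defined $html) {
        print length($html), " bytes fetched\n";
    }
    else {
        print "Request failed. Is Privoxy running on port 8118?\n";
    }
}
```

With `autocheck` turned off, a failed `get` no longer aborts the script, so you can retry or skip the URL and carry on scraping.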






