Scraping through Tor

Perl February 6th, 2008

If, for whatever reason, you want to scrape a website through a proxy, it's pretty easy in Perl using WWW::Mechanize and the Vidalia/Tor bundle:

[perl]
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new( agent => 'Whatever bot' );
$mech->proxy(['http', 'ftp'], 'http://127.0.0.1:8118/');

my $url = "http://www.example.com";
$mech->get($url);
# do some stuff…

[/perl]

Obviously the Tor/Privoxy/Vidalia trio should be running on your system when you execute the script: port 8118 is where Privoxy listens by default, and Privoxy in turn hands the traffic to Tor. It’s damn slow (naturally), but pretty damn effective 8-)
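
If Privoxy or Tor isn't running, the request just fails, so it can help to check the response rather than assume success. Here's a rough sketch of the same setup with a timeout and an explicit failure check. The helper names proxied_mech and fetch, the 120-second timeout and the autocheck => 0 setting are my own choices for illustration, and the proxy address is assumed to be the same local Privoxy one used in the script above:

[perl]
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Assumption: Privoxy listens locally on port 8118 and forwards to Tor.
my $PROXY = 'http://127.0.0.1:8118/';

# Build a Mechanize object that sends http and ftp requests via the proxy.
# (proxied_mech is a hypothetical helper, not part of WWW::Mechanize.)
sub proxied_mech {
    my ($proxy) = @_;
    my $mech = WWW::Mechanize->new(
        agent     => 'Whatever bot',
        autocheck => 0,      # hand back failed responses instead of dying
        timeout   => 120,    # Tor is slow, so allow plenty of time
    );
    $mech->proxy(['http', 'ftp'], $proxy);
    return $mech;
}

# Fetch a URL; return the page body, or nothing (after a warning) on failure.
# (fetch is a hypothetical helper as well.)
sub fetch {
    my ($mech, $url) = @_;
    my $res = $mech->get($url);
    return $res->decoded_content if $res->is_success;
    warn "GET $url failed: ", $res->status_line, "\n";
    return;
}

my $mech = proxied_mech($PROXY);
my $html = fetch($mech, 'http://www.example.com');
print length($html), " bytes fetched\n" if defined $html;
[/perl]

With autocheck switched off, a dead proxy shows up as a warning with the status line instead of killing the script, which is handy if you are looping over a list of URLs.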
