Scraping through Tor

Perl February 6th, 2008

If, for whatever reason, you want to scrape a website through a proxy, it's pretty easy in Perl: use WWW::Mechanize and point it at the local Privoxy port (8118 by default) that sits in front of Vidalia / Tor:

PERL:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use WWW::Mechanize;

    # The agent string is what the remote site sees as the User-Agent
    my $mech = WWW::Mechanize->new( agent => 'Whatever bot' );
    # Send HTTP and FTP requests to Privoxy on localhost, which forwards them to Tor
    $mech->proxy(['http', 'ftp'], 'http://127.0.0.1:8118/');

    my $url = "http://www.example.com";
    $mech->get($url);
    # do some stuff...

Obviously the Tor/Privoxy/Vidalia trio should be running on your system when you execute the script. It's damn slow (naturally), but pretty damn effective 8-)
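
Because requests through Tor are slow and sometimes just time out, it can help to give the script a longer timeout and a few retries. Here is a minimal sketch of that idea, not part of the original script: the helper names new_proxied_mech and fetch_with_retries, the 60 second timeout and the retry count are all my own assumptions, and the proxy address is the same Privoxy default as above.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Assumed Privoxy address, same as in the script above
my $PROXY = 'http://127.0.0.1:8118/';

# Build a Mechanize object that sends its requests through the proxy.
# autocheck => 0 makes get() return the response instead of dying on
# errors, so the caller can decide whether to retry.
sub new_proxied_mech {
    my $mech = WWW::Mechanize->new(
        agent     => 'Whatever bot',
        autocheck => 0,
        timeout   => 60,    # seconds; an arbitrary but generous value
    );
    $mech->proxy(['http', 'ftp'], $PROXY);
    return $mech;
}

# Try a URL up to $tries times (hypothetical helper).
# Returns the page content on success, undef if every attempt failed.
sub fetch_with_retries {
    my ($mech, $url, $tries) = @_;
    for my $attempt (1 .. $tries) {
        my $res = $mech->get($url);
        return $mech->content if $res->is_success;
        warn "Attempt $attempt failed: ", $res->status_line, "\n";
    }
    return;
}
```

With those two helpers, something like my $html = fetch_with_retries(new_proxied_mech(), 'http://www.example.com', 3); either gives you the page or undef after three failed attempts, instead of killing the script on the first timeout.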
