tor-ip-rotation-python-example's Introduction

Tor IP Rotation in Python

A simple Python script that requests new IPs from the Tor network.

Article: https://medium.com/@amine.btt/a-crawler-that-beats-bot-detection-879888f470eb

Adapted from:

Requirements

Note: These are the requirements for Mac OS X. You can find the requirements for Linux in PyTorStemPrivoxy.

Tor

brew update
brew install tor

Note that Tor's SOCKS listener is on port 9050.

Next, do the following:

  • Enable the ControlPort listener on port 9051. This is the port on which Tor listens for commands from applications that talk to the Tor controller.
  • Set a hashed password so that other processes cannot use the control port without authenticating.
  • Enable cookie authentication as well.

You can create a hashed password out of your password using:

tor --hash-password my_password

Then update /usr/local/etc/tor/torrc with the port, the hashed password, and cookie authentication.

# content of torrc
ControlPort 9051
# hashed password below is obtained via `tor --hash-password my_password`
HashedControlPassword 16:E600ADC1B52C80BB6022A0E999A7734571A451EB6AE50FED489B72E3DF
CookieAuthentication 1

Restart Tor so that the configuration changes are applied.

brew services restart tor
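
After the restart, it can be useful to confirm that both listeners are actually accepting connections before going further. The sketch below is only a reachability check with plain TCP sockets; it does not authenticate or speak the Tor control protocol. The helper names port_is_open and check_tor_ports are made up for this example, and the ports are the ones configured above (9050 for SOCKS, 9051 for the ControlPort).

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_tor_ports(host="127.0.0.1", ports=(9050, 9051)):
    """Map each port to whether something is listening on it.

    9050 is the SOCKS listener and 9051 the ControlPort set in torrc above.
    """
    return {port: port_is_open(host, port) for port in ports}
```

Calling check_tor_ports() should return True for both ports once Tor is running with the torrc shown above. A False for 9051 usually means the ControlPort line was not picked up.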

Privoxy

Tor itself is not an HTTP proxy. To reach the Tor network from an HTTP client, use privoxy as an HTTP proxy that forwards traffic to Tor through SOCKS5.

Install privoxy via the following command:

brew install privoxy

Now, tell privoxy to use Tor by routing all traffic through the SOCKS server at localhost port 9050. To do that, append the following line to /usr/local/etc/privoxy/config:

forward-socks5t / 127.0.0.1:9050 . # the dot at the end is important

Restart privoxy after making the change to the configuration file.

brew services restart privoxy
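
To see what "using privoxy as an HTTP proxy" looks like from Python, here is a minimal sketch with the standard library's urllib. It assumes privoxy is listening on its default address, 127.0.0.1:8118. The names build_privoxy_opener and fetch_via_privoxy are illustrative only and are not taken from the repository's script.

```python
import urllib.request

# Assumption: privoxy runs locally on its default port 8118.
PRIVOXY_PROXY = "http://127.0.0.1:8118"


def build_privoxy_opener(proxy=PRIVOXY_PROXY):
    """Build a urllib opener that sends plain-HTTP requests through privoxy."""
    handler = urllib.request.ProxyHandler({"http": proxy})
    return urllib.request.build_opener(handler)


def fetch_via_privoxy(url, timeout=30):
    """Fetch url through privoxy (and therefore Tor) and return the body as text."""
    opener = build_privoxy_opener()
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

With Tor and privoxy both running, fetch_via_privoxy('http://icanhazip.com/') should return the address of a Tor exit node rather than your own IP.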

Stem

Next, install stem, a Python module for interacting with the Tor controller. It lets us send commands to the Tor control port and read the responses programmatically.

pip install stem
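
As a rough illustration of what talking to the control port through stem looks like, the sketch below authenticates and sends the NEWNYM signal, which asks Tor to use fresh circuits for new connections. The function name renew_identity is made up for this example, and the password you pass must be the plain-text password whose hash is in torrc (my_password is only the placeholder used above). This is a sketch under those assumptions, not necessarily how the repository's script is written.

```python
def renew_identity(password, port=9051):
    """Ask Tor for a new identity through the control port.

    Raises an exception if stem is not installed, the control port is
    unreachable, or authentication fails.
    """
    # Imported inside the function so this module can still be loaded
    # on a machine where stem has not been installed yet.
    from stem import Signal
    from stem.control import Controller

    # port must match the ControlPort value in torrc (9051 above).
    with Controller.from_port(port=port) as controller:
        controller.authenticate(password=password)
        controller.signal(Signal.NEWNYM)
```

Note that NEWNYM does not guarantee a different exit IP, and Tor rate-limits how often it honours the signal, so a caller typically has to wait and re-check the address afterwards.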

Example Script

In the script below, urllib sends its requests to privoxy, which listens on port 8118 by default and forwards the traffic to port 9050, where Tor's SOCKS listener is running.

Additionally, the renew_connection() function sends a signal to the Tor controller to change the identity, so you get new identities without restarting Tor. This comes in handy when crawling a website and you don't want to be blocked based on your IP address.

...

wait_time = 2
number_of_ip_rotations = 3
tor_handler = TorHandler()

ip = tor_handler.open_url('http://icanhazip.com/')
print('My first IP: {}'.format(ip))

# Cycle through the specified number of IP addresses via TOR
for i in range(0, number_of_ip_rotations):
    old_ip = ip
    seconds = 0

    tor_handler.renew_connection()

    # Loop until the new IP address differs from the old one.
    # It may take the Tor network some time to switch to a different IP address.
    while ip == old_ip:
        time.sleep(wait_time)
        seconds += wait_time
        print('{} seconds elapsed awaiting a different IP address.'.format(seconds))

        ip = tor_handler.open_url('http://icanhazip.com/')

    print('My new IP: {}'.format(ip))

Execute the Python 3 script above via the following command:

python main.py

When the above script is executed, one should see that the IP address is changing every few seconds.
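
The wait loop in the script has no upper bound, so it keeps polling for as long as Tor hands back the same exit IP. One possible variation is to cap the total waiting time. The helper below is a hypothetical sketch, not part of the repository: wait_for_new_ip and its max_wait parameter are names invented here, and get_ip stands for any callable that returns the current IP as a string, for example a lambda around tor_handler.open_url.

```python
import time


def wait_for_new_ip(get_ip, old_ip, wait_time=2, max_wait=60, sleep=time.sleep):
    """Poll get_ip() until it returns something other than old_ip.

    Returns the new IP, or raises TimeoutError once max_wait seconds of
    sleeping have passed without a change. The sleep parameter exists only
    so the delay can be replaced in tests.
    """
    old_ip = old_ip.strip()
    waited = 0
    while waited < max_wait:
        sleep(wait_time)
        waited += wait_time
        # Services such as icanhazip.com end the response with a newline.
        ip = get_ip().strip()
        if ip != old_ip:
            return ip
    raise TimeoutError(
        "IP address still {} after {} seconds".format(old_ip, waited)
    )
```

In the script, the inner while loop could then be replaced by a call such as ip = wait_for_new_ip(lambda: tor_handler.open_url('http://icanhazip.com/'), old_ip), with the TimeoutError handled however suits the crawler.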

Changes from PyTorStemPrivoxy

  • Requirements for Mac OS X
  • Python 3
  • Coding style
