How can I get a new IP from Tor for every request in threads?

我怕爱的太早我们不能终老 submitted on 2019-12-05 14:02:49

If you want different IPs for each connection, you can also use Stream Isolation over SOCKS by specifying a different proxy username:password combination for each connection.

With this method, you only need one Tor instance and each requests client can use a different stream with a different exit node.

In order to set this up, add unique proxy credentials for each requests.session object like so: socks5h://username:password@localhost:9050

import random
from multiprocessing import Pool
import requests

def check_ip():
    session = requests.session()
    # Random credentials make Tor put this session on its own stream.
    creds = str(random.randint(10000, 0x7fffffff)) + ":" + "foobar"
    proxy = 'socks5h://{}@localhost:9050'.format(creds)
    session.proxies = {'http': proxy, 'https': proxy}
    r = session.get('http://httpbin.org/ip')
    print(r.text)


# The guard keeps worker processes from re-running the pool setup on import.
if __name__ == '__main__':
    with Pool(processes=8) as pool:
        for _ in range(9):
            pool.apply_async(check_ip)
        pool.close()
        pool.join()

Tor Browser isolates streams on a per-domain basis by setting the credentials to firstpartydomain:randompassword, where randompassword is a random nonce for each unique first party domain.

If you're crawling the same site and you want random IPs, use a random username:password combination for each session. If you are crawling random domains and want to use the same circuit for requests to a domain, use Tor Browser's method of domain:randompassword for credentials.
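
The two credential strategies above can be sketched as small helpers. This is a minimal sketch, not Tor Browser's actual implementation: the names `proxies_for`, `random_proxies`, and `DomainIsolation` are hypothetical, the proxy address localhost:9050 is taken from the example earlier, and the URL's hostname stands in for the first-party domain.

```python
import secrets
from urllib.parse import quote, urlsplit

PROXY_HOST = "localhost:9050"  # assumed Tor SOCKS address, as in the example above


def proxies_for(username, password):
    """Build a requests-style proxies dict with SOCKS credentials."""
    auth = "{}:{}".format(quote(username, safe=""), quote(password, safe=""))
    url = "socks5h://{}@{}".format(auth, PROXY_HOST)
    return {"http": url, "https": url}


def random_proxies():
    """New random credentials on every call, so each session gets its own stream."""
    return proxies_for(secrets.token_hex(8), secrets.token_hex(8))


class DomainIsolation:
    """Reuse one random password per hostname, so requests to the same
    host share credentials while different hosts get different ones."""

    def __init__(self):
        self._nonces = {}

    def proxies_for_url(self, url):
        host = urlsplit(url).hostname
        # Keep the first nonce generated for this host; later calls reuse it.
        nonce = self._nonces.setdefault(host, secrets.token_hex(8))
        return proxies_for(host, nonce)
```

For the same-site case you would set `session.proxies = random_proxies()` on each new session; for the per-domain case, keep one `DomainIsolation` instance around and set `session.proxies = iso.proxies_for_url(url)` before each request.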

You only have one proxy, which is listening on port 9050. All 3 processes send their requests in parallel through that proxy, so they share the same IP.

What is happening is:

  1. All 3 processes ask the proxy to get a new IP
  2. The proxy either requests a new IP 3 times, receives 3 responses, and applies the last one, or it recognizes that it is already waiting for a new IP and disregards 2 of the requests, answering all 3 together. Which of the two happens depends on the proxy implementation.
  3. The processes send their requests through the proxy, which results in the same IP.
  4. The processes are completed and another 3 processes are initiated. Rinse and repeat.

That is why the IPs are the same for every block of 3 requests.
You'll need 3 independent proxies to have 3 different IPs at the same time.
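
One way to get three independent proxies is to start three Tor processes, each with its own SOCKS port, control port, and data directory. The sketch below only generates the configuration text. `SocksPort`, `ControlPort`, and `DataDirectory` are standard torrc options, while the helper name `make_torrc`, the port numbers, and the /tmp directory layout are assumptions for illustration. Control-port authentication is left out.

```python
def make_torrc(socks_port, control_port, data_dir):
    """Return torrc text for one Tor instance (hypothetical helper)."""
    return "\n".join([
        "SocksPort {}".format(socks_port),
        "ControlPort {}".format(control_port),
        # Each instance needs its own data directory.
        "DataDirectory {}".format(data_dir),
    ]) + "\n"


# Assumed layout: SOCKS on 9050/9060/9070, control on the next port up.
INSTANCES = [(9050, 9051), (9060, 9061), (9070, 9071)]

configs = {
    socks: make_torrc(socks, control, "/tmp/tor-{}".format(socks))
    for socks, control in INSTANCES
}

for socks, text in configs.items():
    print("# torrc for the instance on port {}".format(socks))
    print(text)
```

Each generated text could be saved to its own file and passed to Tor with `tor -f <file>`, giving one running instance per file.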


EDIT:

Possible solution using locks, assuming 3 proxies are running in the background:

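import contextlib
import threading
import time
from multiprocessing.pool import ThreadPool

import requests
from stem import Signal
from stem.control import Controller

_controller_ports = [
    # (controller lock, connection port, management port)
    (threading.Lock(), 9050, 9051),
    (threading.Lock(), 9060, 9061),
    (threading.Lock(), 9070, 9071),
]

def get_new_ip_for(port):
    # Ask the Tor instance behind this management port for a new identity.
    with Controller.from_port(port=port) as controller:
        controller.authenticate(password="password")
        controller.signal(Signal.NEWNYM)
        # Wait out the period during which Tor would ignore another NEWNYM.
        time.sleep(controller.get_newnym_wait())

@contextlib.contextmanager
def get_port_with_new_ip():
    while True:
        for lock, con_port, manage_port in _controller_ports:
            if lock.acquire(blocking=False):
                try:
                    get_new_ip_for(manage_port)
                    yield con_port
                finally:
                    # Release the proxy even if the request raised.
                    lock.release()
                # Return so the generator yields exactly once.
                return
        # All proxies are busy; wait before trying again.
        time.sleep(1)

def check_ip():
    with get_port_with_new_ip() as port:
        session = requests.session()
        session.proxies = {'http': f'socks5h://localhost:{port}', 'https': f'socks5h://localhost:{port}'}
        r = session.get('http://httpbin.org/ip')
        print(r.text)

# threading.Lock is only shared between threads, so use a thread pool here.
with ThreadPool(processes=3) as pool:
    for _ in range(9):
        pool.apply_async(check_ip)
    pool.close()
    pool.join()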