Python Asynchronous Reverse DNS Lookups

Submitted by 大兔子大兔子 on 2019-12-04 12:10:29

I discovered that my main issue was IPs failing to resolve: the sockets did not obey their configured timeouts and only failed after 30 seconds. See Python 2.6 urlib2 timeout issue.

adns-python was a no-go because of its lack of support for IPv6 (without patches).

After searching around I found this: Reverse DNS Lookups with dnspython. I implemented a similar version in my code (that code also uses an optional thread pool and implements a timeout).

In the end I used dnspython with a concurrent.futures thread pool for asynchronous reverse DNS lookups (see Python: Reverse DNS Lookup in a shared hosting and Dnspython: Setting query timeout/lifetime). With a timeout of 1 second, this cut the runtime for 2500 IP addresses from about 22 minutes to about 16 seconds. The large difference can probably be attributed to the Global Interpreter Lock on sockets and the 30-second timeouts.

Code Snippet:

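import concurrent.futures

from dns import resolver, reversename

# One shared resolver; each query gives up after 1 second.
dns_resolver = resolver.Resolver()
dns_resolver.timeout = 1   # seconds to wait for a single nameserver
dns_resolver.lifetime = 1  # total seconds allowed per query

# Defined before the pool runs so the name exists when pool.map() is called.
def get_hostname_from_ip(ip):
    """Return the PTR hostname for ip, or "" if the lookup fails."""
    try:
        reverse_name = reversename.from_address(ip)
        # On dnspython >= 2.0, resolve() replaces the deprecated query().
        answer = dns_resolver.query(reverse_name, "PTR")
        # Strip the trailing dot of the fully qualified name.
        return answer[0].to_text().rstrip(".")
    except Exception:
        return ""

ips = [...]

# Run up to 16 lookups at a time; results keep the order of ips.
with concurrent.futures.ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(get_hostname_from_ip, ips))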
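
For readers wondering what reversename.from_address produces: a PTR query is sent for a specially formed name under in-addr.arpa (IPv4) or ip6.arpa (IPv6). The sketch below uses only the standard-library ipaddress module to show that name; the helper reverse_pointer_name is a hypothetical name chosen for illustration and is not part of dnspython or of the original answer.

```python
import ipaddress


def reverse_pointer_name(ip):
    """Return the DNS name that a PTR lookup for ip would query."""
    # ip_address() accepts both IPv4 and IPv6 text and raises ValueError
    # for anything else.
    return ipaddress.ip_address(ip).reverse_pointer


# IPv4: octets reversed under in-addr.arpa.
print(reverse_pointer_name("192.0.2.1"))    # 1.2.0.192.in-addr.arpa
# IPv6: all 32 hex digits reversed, one label each, under ip6.arpa.
print(reverse_pointer_name("2001:db8::1"))
```

Unlike the dnspython name object, this is a plain string without a trailing dot, so it is only meant to illustrate the naming scheme.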

Please use asynchronous DNS; everything else will give you very poor performance.
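
As a rough sketch of what that could look like with asyncio, assuming dnspython 2.0 or later (which ships an asyncio-capable resolver in dns.asyncresolver): the function names ptr_lookup and reverse_lookup_all, the lookup parameter, and the limit of 16 concurrent queries are assumptions made for this example, not something taken from the answers above.

```python
import asyncio


async def ptr_lookup(ip, timeout=1.0):
    """Resolve one PTR record with dnspython's asyncio resolver."""
    # Imported lazily so the rest of the sketch loads without dnspython.
    from dns import asyncresolver, reversename

    # A resolver per call keeps the sketch short; reuse one in real code.
    res = asyncresolver.Resolver()
    res.timeout = timeout
    res.lifetime = timeout
    answer = await res.resolve(reversename.from_address(ip), "PTR")
    return answer[0].to_text().rstrip(".")


async def reverse_lookup_all(ips, lookup=ptr_lookup, limit=16):
    """Look up all IPs concurrently, at most `limit` at once.

    Failed lookups yield "", and results keep the order of ips.
    """
    gate = asyncio.Semaphore(limit)

    async def one(ip):
        async with gate:
            try:
                return await lookup(ip)
            except Exception:
                return ""

    return await asyncio.gather(*(one(ip) for ip in ips))
```

It would be driven with something like asyncio.run(reverse_lookup_all(ips)). The lookup parameter exists so that a different coroutine (for example a stub in tests) can be swapped in without touching the concurrency logic.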
