Curl slow multithreading dns

て烟熏妆下的殇ゞ submitted on 2019-12-24 20:28:29

Question


The program is written in C++ and indexes web pages, so all the domains are random domain names from the web. The strange part is that the percentage of DNS failures/not-found results is small (>5%).

Here is the pmp (poor man's profiler) stack trace:

   3886 __GI___poll,send_dg,buf=0xADDRESS,__libc_res_nquery,__libc_res_nquerydomain,__libc_res_nsearch,_nss_dns_gethostbyname3_r,gaih_inet,__GI_getaddrinfo,Curl_getaddrinfo_ex
    601 __GI___poll,Curl_socket_check,waitconnect,singleipconnect,Curl_connecthost,ConnectPlease,protocol_done=protocol_done@entry=0xADDRESS),Curl_connect,connect_host,at
    534 __GI___poll,Curl_socket_check,Transfer,at,getweb,athread,start_thread,clone,??
    498 nanosleep,__sleep,athread,start_thread,clone,??
     50 __GI___poll,Curl_socket_check,Transfer,at,getweb,getweb,athread,start_thread,clone,??
     15 __GI___poll,Curl_socket_check,Transfer,at,getweb,getweb,getweb,athread,start_thread,clone
      7 nanosleep,usleep,main

Why are there so many threads at _nss_dns_gethostbyname3_r? What could I do to speed it up?

Could it be because I'm using curl's default synchronous DNS resolver with CURLOPT_NOSIGNAL?

The program is running on an Intel i7 (8 cores with HT), 16 GB RAM, Ubuntu 12.10.

The bandwidth varies from 6 MB/s (the ISP limit) down to 2 MB/s at irregular intervals, and it sometimes even drops to a few hundred KB/s.


Answer 1:


The threads you are seeing are probably waiting for DNS answers. One way to speed that up is to do the lookups beforehand, so the results are already cached in your nearby recursive DNS server. Also make sure nothing is asking for authoritative answers, as that is always slow.
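One way to read the "look up beforehand" advice is a warm-up pass: a small, bounded pool of threads resolves the host names before the crawl threads need them. The following is only a sketch under assumptions: it uses POSIX getaddrinfo and C++11 threads, and prewarm_dns and its worker count are hypothetical names, not part of curl.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

#include <netdb.h>
#include <sys/socket.h>

// Hypothetical helper: resolve every host once, using at most `workers`
// threads, and return how many lookups succeeded. The addresses are thrown
// away; the only goal is to have the answers cached upstream.
std::size_t prewarm_dns(const std::vector<std::string>& hosts, unsigned workers) {
    assert(workers > 0);
    std::atomic<std::size_t> next(0);  // index of the next host to resolve
    std::atomic<std::size_t> ok(0);    // number of successful lookups

    auto work = [&]() {
        for (;;) {
            const std::size_t i = next.fetch_add(1);  // claim one host
            if (i >= hosts.size()) return;
            addrinfo hints = {};
            hints.ai_socktype = SOCK_STREAM;
            addrinfo* res = nullptr;
            if (getaddrinfo(hosts[i].c_str(), nullptr, &hints, &res) == 0) {
                ++ok;
                freeaddrinfo(res);
            }
        }
    };

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) pool.emplace_back(work);
    for (auto& t : pool) t.join();
    return ok.load();
}
```

A call such as prewarm_dns(hosts, 16) would run ahead of the crawl. Keeping the pool small matters here: the warm-up should not itself flood the resolver. Whether it helps depends on the recursive server (or a local caching daemon) actually keeping the answers until the crawl threads ask for them.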




Answer 2:


I've found that the solution was to change curl's default DNS resolver to c-ares and to ask specifically for IPv4, as IPv6 is not yet supported by my network.
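As a rough illustration of what an IPv4-only restriction means at the resolver level, the sketch below passes AF_INET as the address family to POSIX getaddrinfo, so only IPv4 results are requested. It is not how c-ares works internally, and resolve_ipv4 is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <vector>

#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>

// Hypothetical helper: resolve `host` to IPv4 addresses only and return
// them in dotted-decimal form. Returns an empty vector on failure.
std::vector<std::string> resolve_ipv4(const std::string& host) {
    assert(!host.empty());
    addrinfo hints = {};
    hints.ai_family = AF_INET;        // IPv4 only; AF_UNSPEC would allow IPv6 too
    hints.ai_socktype = SOCK_STREAM;  // one entry per address, not per socket type

    std::vector<std::string> out;
    addrinfo* res = nullptr;
    if (getaddrinfo(host.c_str(), nullptr, &hints, &res) != 0) return out;

    for (addrinfo* p = res; p != nullptr; p = p->ai_next) {
        char buf[INET_ADDRSTRLEN];
        const sockaddr_in* sa = reinterpret_cast<const sockaddr_in*>(p->ai_addr);
        if (inet_ntop(AF_INET, &sa->sin_addr, buf, sizeof buf) != nullptr) {
            out.push_back(buf);
        }
    }
    freeaddrinfo(res);
    return out;
}
```

On a network without IPv6, skipping the IPv6 lookups typically removes queries whose answers would never be used.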

Changing to c-ares also allowed me to configure additional DNS servers and to cycle through them in order to increase the number of DNS queries per second.

The outcome:

// Resolve to IPv4 addresses only
curl_easy_setopt(curl, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);

// Cycle through the DNS servers. The shared index is both read and
// advanced while holding the lock, so two threads never race on it.
pthread_mutex_lock(&running_mutex);
dns_index = DNS_SERVER_I;
DNS_SERVER_I = (DNS_SERVER_I + 1) % DNS_SERVERS.size();
pthread_mutex_unlock(&running_mutex);

// Build a comma-separated list of three consecutive servers, wrapping around
string dns_servers_string = DNS_SERVERS.at(dns_index % DNS_SERVERS.size()) + "," +
                            DNS_SERVERS.at((dns_index + 1) % DNS_SERVERS.size()) + "," +
                            DNS_SERVERS.at((dns_index + 2) % DNS_SERVERS.size());

// Set curl's DNS servers (option available only when curl is built with c-ares)
curl_easy_setopt(curl, CURLOPT_DNS_SERVERS, dns_servers_string.c_str());
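For readers who want to try the rotation outside the crawler, here is a self-contained sketch of the same idea. It is not the code from this answer: it swaps the pthread mutex for a std::atomic counter, and DnsRotator and the server names used with it are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: hands out comma-separated windows of DNS servers,
// moving the starting position forward by one on every call.
class DnsRotator {
public:
    explicit DnsRotator(std::vector<std::string> servers)
        : servers_(std::move(servers)), next_(0) {
        assert(!servers_.empty());
    }

    // Return up to `count` consecutive servers, e.g. "a,b,c", wrapping around.
    std::string next(std::size_t count = 3) {
        // fetch_add is atomic, so concurrent callers get distinct start indices.
        const std::size_t start = next_.fetch_add(1);
        if (count > servers_.size()) count = servers_.size();  // no duplicates
        std::string out;
        for (std::size_t i = 0; i < count; ++i) {
            if (i != 0) out += ',';
            out += servers_[(start + i) % servers_.size()];
        }
        return out;
    }

private:
    const std::vector<std::string> servers_;
    std::atomic<std::size_t> next_;
};
```

Each worker thread would call next() once per transfer and pass the result to CURLOPT_DNS_SERVERS. With four servers a, b, c and d, successive calls return "a,b,c", "b,c,d", "c,d,a" and so on, which spreads the first-choice server evenly across transfers.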


Source: https://stackoverflow.com/questions/15956328/curl-slow-multithreading-dns
