Question
The program is written in C++ and indexes webpages, so all domains are random domain names from the web. The strange part is that the DNS fail/not found percentage is small (>5%).
Here is the pmp stack trace:
3886 __GI___poll,send_dg,buf=0xADDRESS,__libc_res_nquery,__libc_res_nquerydomain,__libc_res_nsearch,_nss_dns_gethostbyname3_r,gaih_inet,__GI_getaddrinfo,Curl_getaddrinfo_ex
601 __GI___poll,Curl_socket_check,waitconnect,singleipconnect,Curl_connecthost,ConnectPlease,protocol_done=protocol_done@entry=0xADDRESS),Curl_connect,connect_host,at
534 __GI___poll,Curl_socket_check,Transfer,at,getweb,athread,start_thread,clone,??
498 nanosleep,__sleep,athread,start_thread,clone,??
50 __GI___poll,Curl_socket_check,Transfer,at,getweb,getweb,athread,start_thread,clone,??
15 __GI___poll,Curl_socket_check,Transfer,at,getweb,getweb,getweb,athread,start_thread,clone
7 nanosleep,usleep,main
Why are there so many threads at _nss_dns_gethostbyname3_r? What could I do to speed it up?
Could it be because I'm using curl's default synchronous DNS resolver with CURLOPT_NOSIGNAL?
The program is running on an Intel i7 (8 cores, HT), 16 GB RAM, Ubuntu 12.10.
The bandwidth varies at irregular intervals between 6 MB/s (the ISP limit) and 2 MB/s, and it sometimes even drops to a few hundred KB/s.
Answer 1:
The threads you are seeing are probably waiting for DNS answers. One way to speed that up would be to do the lookups beforehand, so the results are already cached in your nearby recursive DNS server. Also make sure nothing is asking for authoritative answers, since those are always slow.
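A minimal sketch of the "look up beforehand" idea, assuming a POSIX system and a list of hostnames known ahead of the fetch. The helper name warm_dns_cache is hypothetical and not from the thread. It resolves each host once with getaddrinfo and discards the result, so the recursive resolver has the answer cached when the crawler threads ask later.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (an assumption for illustration): resolve every host
// once so that the answers end up in the recursive resolver's cache.
// Returns how many hosts resolved successfully.
std::size_t warm_dns_cache(const std::vector<std::string>& hosts) {
    std::size_t resolved = 0;
    for (const std::string& host : hosts) {
        addrinfo hints{};                 // zero-initialised hints
        hints.ai_family = AF_INET;        // IPv4 only, matching the thread
        hints.ai_socktype = SOCK_STREAM;  // avoid one entry per socket type

        addrinfo* result = nullptr;
        if (getaddrinfo(host.c_str(), nullptr, &hints, &result) == 0) {
            ++resolved;
            freeaddrinfo(result);         // only the cache side effect matters
        }
    }
    return resolved;
}
```

This call is itself blocking, so it would need to run in a separate stage or in dedicated threads ahead of the fetchers. Whether it helps depends on the TTLs of the records and on how far ahead the URL queue is known.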
Answer 2:
I've found that the solution was to change the default curl DNS resolver to c-ares and to specifically ask for IPv4, as IPv6 is not supported yet by my network.
Changing to c-ares also allowed me to set additional DNS servers and to cycle through them in order to improve the number of DNS queries/s.
The outcome:
//set to ipv4 only
curl_easy_setopt(curl, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);
// cycle through the DNS servers; read the shared index while holding the lock
pthread_mutex_lock(&running_mutex);
dns_index=DNS_SERVER_I;
if(DNS_SERVER_I>DNS_SERVERS.size())
{
DNS_SERVER_I=1;
}else
{
DNS_SERVER_I++;
}
pthread_mutex_unlock(&running_mutex);
string dns_servers_string=DNS_SERVERS.at(dns_index%DNS_SERVERS.size())+","+DNS_SERVERS.at((dns_index+1)%DNS_SERVERS.size())+","+DNS_SERVERS.at((dns_index+2)%DNS_SERVERS.size());
// set curl DNS (option available only when curl is built with c-ares)
curl_easy_setopt(curl, CURLOPT_DNS_SERVERS, dns_servers_string.c_str());
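The rotation above could also be wrapped in a small self-contained helper. The following is only a sketch: the class name DnsRotator, its window parameter, and the placeholder addresses in the usage note are assumptions for illustration, and it uses std::mutex instead of the pthread mutex from the answer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: each call returns a comma-separated window of
// servers and advances the starting position by one, wrapping around.
class DnsRotator {
public:
    DnsRotator(std::vector<std::string> servers, std::size_t window)
        : servers_(std::move(servers)), window_(window) {
        assert(!servers_.empty());  // rotating an empty list makes no sense
    }

    std::string next() {
        std::size_t start;
        {
            // Read and advance the shared index under a single lock.
            std::lock_guard<std::mutex> lock(mutex_);
            start = index_;
            index_ = (index_ + 1) % servers_.size();
        }
        // Never list the same server twice in one window.
        const std::size_t count = std::min(window_, servers_.size());
        std::string out;
        for (std::size_t i = 0; i < count; ++i) {
            if (i != 0) out += ',';
            out += servers_[(start + i) % servers_.size()];
        }
        return out;
    }

private:
    const std::vector<std::string> servers_;
    const std::size_t window_;
    std::mutex mutex_;
    std::size_t index_ = 0;
};
```

With a rotator built from, say, four placeholder addresses and a window of 3, each thread would pass rotator.next().c_str() to CURLOPT_DNS_SERVERS before starting its transfer. libcurl copies string options, so the temporary does not need to outlive the call.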
Source: https://stackoverflow.com/questions/15956328/curl-slow-multithreading-dns