wget

SSL connection fails with wget and curl, but succeeds with Firefox and Lynx

Submitted by 霸气de小男生 on 2019-12-24 04:41:12
Question: I'm having trouble accessing this website from an automated script: https://mydtac.dtac.co.th/EserviceLogin/Login?page=N&lang=en If I view it from a browser (Chrome, Firefox, even Lynx works), it's all OK. If I try to load it from PHP (fsockopen), wget, or curl, it complains: Warning: stream_socket_enable_crypto(): SSL operation failed with code 1. OpenSSL Error messages: error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record mac in Also the openssl check fails: openssl s_client
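A hedged workaround sketch (not taken from the thread): pin the TLS version on the command line, since a "bad record mac" alert often points to a protocol-negotiation mismatch between the local OpenSSL and the server. The exact flags depend on the curl/wget/OpenSSL versions installed:
$ curl -v --tlsv1.2 "https://mydtac.dtac.co.th/EserviceLogin/Login?page=N&lang=en"
# newer wget builds accept a specific TLS version here; older ones only know TLSv1
$ wget --secure-protocol=TLSv1_2 "https://mydtac.dtac.co.th/EserviceLogin/Login?page=N&lang=en"
# repeat the openssl check with the protocol forced
$ openssl s_client -tls1_2 -connect mydtac.dtac.co.th:443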

Use wget with Hadoop?

Submitted by 梦想的初衷 on 2019-12-24 04:19:11
Question: I have a dataset (~31 GB, a zipped file with a .gz extension) hosted at a web location, and I want to run my Hadoop program on it. The program is a slight modification of the original WordCount example that ships with Hadoop. In my case, Hadoop is installed on a remote machine (to which I connect via ssh and then run my jobs). The problem is that I can't transfer this large dataset to my home directory on the remote machine (due to a disk usage quota). So I tried searching for
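One approach worth sketching (an assumption, not the answer from the thread): stream the file straight from the web into HDFS so it never touches the quota-limited home directory. The source URL and HDFS path below are placeholders, since neither appears in the excerpt:
# -qO- writes the download to stdout; "hadoop fs -put -" reads stdin into HDFS
$ wget -qO- "http://example.com/dataset.gz" | hadoop fs -put - /user/me/dataset.gz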

HTML file fetched using 'wget' reported as binary by 'less'

Submitted by 瘦欲@ on 2019-12-24 02:17:06
Question: If I use wget to download this page: wget http://www.aqr.com/ResearchDetails.htm -O page.html and then try to view it in less, less reports the file as binary: less page.html "page.html" may be a binary file. See it anyway? These are the response headers: Accept-Ranges: bytes Cache-Control: private Content-Encoding: gzip Content-Length: 8295 Content-Type: text/html Cteonnt-Length: 44064 Date: Sun, 25 Sep 2011 12:15:53 GMT ETag: "c0859e4e785ecc1:6cd" Last-Modified: Fri, 19 Aug 2011
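The Content-Encoding: gzip header suggests the server sent the body compressed and wget saved it as-is, so less sees gzip bytes rather than HTML. A hedged sketch for checking and unpacking it (assuming the file really is gzip data):
$ file page.html                                      # should report "gzip compressed data" if so
$ mv page.html page.html.gz && gunzip page.html.gz    # page.html is then plain HTML
Alternatively, curl's --compressed option requests gzip but decodes it transparently before writing the file.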

Why does Wget include a Host header in its HTTP request?

Submitted by 隐身守侯 on 2019-12-24 00:28:47
Question: The main difference between HTTP/1.0 and HTTP/1.1 is that HTTP/1.1 makes the Host header mandatory (source: HTTP Pocket Reference, O'Reilly). So why does Wget, which uses the HTTP/1.0 protocol, send a Host header? My output of Wget captured with netcat: GET / HTTP/1.0 User-Agent: Wget/1.12 (linux-gnu) Accept: */* Host: 127.0.0.1:10101 Connection: Keep-Alive Since Wget clearly uses HTTP/1.0, how can it have a Host header? Or am I going wrong somewhere with
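Host is mandatory in HTTP/1.1 but was never forbidden in HTTP/1.0; clients are free to send it so that name-based virtual hosting works, and Wget does. A quick sketch to see the same behaviour from another HTTP/1.0 client (port 10101 reused from the question; some netcat variants need "nc -l -p 10101" instead):
# terminal 1: listen and print whatever arrives
$ nc -l 10101
# terminal 2: curl forced to HTTP/1.0 also includes a Host header
$ curl --http1.0 http://127.0.0.1:10101/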

error: Autoconf version 2.67 or higher is required

Submitted by 回眸只為那壹抹淺笑 on 2019-12-23 13:08:54
error: Autoconf version 2.67 or higher is required. I ran into this error on Linux today, so I'm noting the fix here.
# rpm -qf /usr/bin/autoconf       (check which package provides the current autoconf)
# rpm -e --nodeps autoconf-2.63   (remove the current version)
Then download and install the latest version:
# wget ftp://ftp.gnu.org/gnu/autoconf/autoconf-2.68.tar.gz
# tar zxvf autoconf-2.68.tar.gz
# cd autoconf-2.68
# ./configure --prefix=/usr/
# make && make install
Source: https://www.cnblogs.com/tonyY/p/4817792.html

Possible to assign a new IP address on every HTTP request?

Submitted by 时光毁灭记忆、已成空白 on 2019-12-23 07:45:43
Question: Is it possible to change or assign my server a new IP address every time it makes an HTTP request with commands such as wget? Thanks all. Update: The reason for this is exactly what the Tor project is trying to achieve. I do not want to leave a trace of the requests my server makes, and I thought constantly changing my IP address could help me and my users use the internet without being followed around. :) Answer 1: If you have a large pool of proxies you can use, then I suppose you
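A hedged sketch of how this is usually approximated without Tor: route each outgoing request through the next proxy in a pool rather than changing the server's own IP. proxies.txt is a hypothetical file with one host:port per line; wget honours the http_proxy environment variable:
# send each request through a different proxy from the list
while read proxy; do
    http_proxy="http://$proxy" wget -q -O /dev/null "http://example.com/"
done < proxies.txt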

download.file in R including pre-requisites

Submitted by 爷,独闯天下 on 2019-12-23 04:47:48
Question: I'm trying to use download.file to fetch some web pages, including embedded images, etc. I think the wget equivalent is the -p -k options, but I can't see how to do this. If I do: download.file("http://guardian.co.uk","test.html") that obviously works, but I get this error: Warning messages: 1: running command 'wget -p -k "http://guardian.co.uk" -O "test.html"' had status 1 2: In download.file("http://guardian.co.uk", "test.html", method = "wget", : download had nonzero exit status
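The status-1 warning is consistent with a flag clash: -p -k makes wget write a whole directory tree of page requisites, while download.file() always appends -O for a single output file, as the warning above shows. A hedged sketch of running wget outside download.file() (for example via system() in R), using -P to name a scratch directory instead of -O:
# -p fetches page requisites, -k rewrites links, -P sets the download directory
$ wget -p -k -P test_dir "http://guardian.co.uk"
# the page and its assets then sit under test_dir/<hostname>/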

Bulk download of PDFs with Scrapy and Python 3

Submitted by 与世无争的帅哥 on 2019-12-23 04:30:05
Question: I would like to bulk-download the free-to-download PDFs (copies of an old newspaper called Gaceta, published from 1843 to 1900) from this website of the Nicaraguan National Assembly with Python 3 / Scrapy. I am an absolute beginner in programming and Python, but I tried to start with an (unfinished) script:
#!/usr/bin/env python3
from urllib.parse import urlparse
import scrapy
from scrapy.http import Request
class gaceta(scrapy.Spider):
    name = "gaceta"
    allowed_domains = ["digesto.asamblea.gob.ni"]
    start_urls =
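Since this listing is about wget anyway, here is a hedged alternative sketch rather than a finished Scrapy spider: wget's recursive mode can often collect every linked PDF, assuming the documents are reachable through ordinary href links (which may not hold if this site builds its links with JavaScript). The start URL below is a placeholder because it is cut off in the excerpt:
# -r recurse, -l 2 limit depth, -np stay below the start path, -A pdf keep only PDFs
$ wget -r -l 2 -np -A pdf "http://digesto.asamblea.gob.ni/<listing-page>"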

What is the wget command to submit data to this form?

Submitted by ⅰ亾dé卋堺 on 2019-12-23 04:01:47
Question: I am trying to post a name, email address, and message to this page: http://zenlyzen.com/test1/index.php?main_page=contact_us using wget. This command: wget --post-data 'contactname=test&email=a@a.com&enquiry=testmessage' http://www.zenlyzen.com/test1/index.php?main_page=contact_us\&action=send%20method="post" saves this page: http://www.zenlyzen.com/wgettest.html I've poked around with cookies and session cookies, to no avail. Thanks in advance, Mark. Answer 1: Using curl: $ mech-dump --forms
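A hedged sketch of things worth checking (not the accepted answer): the %20method="post" fragment tacked onto the URL is not a query parameter and can be dropped, since --post-data already makes wget send a POST, and many PHP contact forms also reject a POST that arrives without the session cookie set on the form page. Fetching the form first and reusing the cookie jar looks like this:
# 1. fetch the form page to pick up the session cookie (cookies.txt is a scratch file)
$ wget --save-cookies cookies.txt --keep-session-cookies -O form.html \
      "http://zenlyzen.com/test1/index.php?main_page=contact_us"
# 2. post the fields with the same cookie jar; the whole URL is quoted this time
$ wget --load-cookies cookies.txt \
      --post-data 'contactname=test&email=a@a.com&enquiry=testmessage' \
      -O result.html "http://zenlyzen.com/test1/index.php?main_page=contact_us&action=send"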
