urllib2.HTTPError: HTTP Error 403: Forbidden

无人及你 2020-11-22 10:44

I am trying to automate the download of historical stock data using Python. The URL I am trying to open responds with a CSV file, but I am unable to open it using urllib2. I have tri

3 answers
  •  暗喜
    暗喜 (original poster)
    2020-11-22 11:18

    The NSE website has changed, and older scripts are only partly suited to the current site. This snippet gathers the daily details of a security: symbol, security type, previous close, open price, high price, low price, average price, traded quantity, turnover, number of trades, deliverable quantity, and the ratio of delivered to traded quantity as a percentage. Each day's details are conveniently presented as a dictionary.

    Python 3.X version with requests and BeautifulSoup

    from requests import get
    from csv import DictReader
    from bs4 import BeautifulSoup as Soup
    from datetime import date
    from io import StringIO 
    
    SECURITY_NAME = "3MINDIA"      # Change this to get a quote for another stock
    START_DATE = date(2017, 1, 1)  # Start date of the quote data, as date(year, month, day)
    END_DATE = date(2017, 9, 14)   # End date of the quote data, as date(year, month, day)
    
    
    BASE_URL = "https://www.nseindia.com/products/dynaContent/common/productsSymbolMapping.jsp?symbol={security}&segmentLink=3&symbolCount=1&series=ALL&dateRange=+&fromDate={start_date}&toDate={end_date}&dataType=PRICEVOLUMEDELIVERABLE"
    
    
    
    
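    def getquote(symbol, start, end):
        # Build D-M-YYYY without zero padding; the "%-d" strftime flag is not portable (it fails on Windows)
        start = f"{start.day}-{start.month}-{start.year}"
        end = f"{end.day}-{end.month}-{end.year}"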
    
        hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
             'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
             'Referer': 'https://cssspritegenerator.com',
             'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
             'Accept-Encoding': 'none',
             'Accept-Language': 'en-US,en;q=0.8',
             'Connection': 'keep-alive'}
    
        url = BASE_URL.format(security=symbol, start_date=start, end_date=end)
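        d = get(url, headers=hdr)
        d.raise_for_status()  # fail early on a 403 or any other HTTP error
        soup = Soup(d.content, 'html.parser')
        # The CSV text sits in a div; this code treats ':' as the row separator
        payload = soup.find('div', {'id': 'csvContentDiv'}).text.replace(':', '\n')
        reader = DictReader(StringIO(payload))
        for row in reader:
            print({k: v.strip() for k, v in row.items()})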
    
    
    if __name__ == '__main__':
        getquote(SECURITY_NAME, START_DATE, END_DATE)
    

    Besides, this is a relatively modular, ready-to-use snippet.
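If you would rather collect the rows than print them, the date formatting and the parsing step can be pulled out into small functions that run without any network access. This is only a sketch: `format_date` and `parse_payload` are hypothetical helper names, and the sample string is made-up data in the shape the snippet assumes (quoted fields, rows separated by ':'), not real NSE output.

```python
from csv import DictReader
from datetime import date
from io import StringIO


def format_date(d):
    """Return the date as D-M-YYYY without zero padding, on any platform."""
    return f"{d.day}-{d.month}-{d.year}"


def parse_payload(payload):
    """Turn the ':'-separated CSV text into a list of dictionaries."""
    reader = DictReader(StringIO(payload.replace(':', '\n')))
    # Strip stray whitespace from both the column names and the values
    return [{k.strip(): v.strip() for k, v in row.items()} for row in reader]


# Made-up sample in the assumed shape; not real NSE data
sample = '"Symbol","Series","Close Price":"ABC","EQ"," 100.50":"ABC","EQ"," 101.25"'
rows = parse_payload(sample)

print(format_date(date(2017, 1, 1)))
for row in rows:
    print(row)
```

Inside getquote you could then return parse_payload(...) on the text of the csvContentDiv element instead of printing each row, which gives the caller a list of dictionaries to work with.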
