Parsing EDGAR filings

耶瑟儿~ 2020-12-30 10:03

I would like to use Python 2.7 to remove anything that isn't the documents' text from EDGAR filings (which are available online as .txt files). An example of what the file

3 answers
  •  无人及你
    2020-12-30 10:55

    The library linked below parses EDGAR filings into a SQLite DB. It can pull Form10k and Form8Qk filings from the EDGAR FTP site for the years you specify and load them into normalized SQLite tables. Because filers adhere poorly to the filing standard, writing your own parsing script would be a significant undertaking. With that library, code similar to the following loads the filings for the quarters you want; from there you can simply query the tables for the data you are seeking.

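    import edgar.database  # assumed import path, inferred from the package URL below

    edgar.database.create()
    # Load quarterly master index files into the local SQLite DB.
    # Quarters are given here as (year, quarter) tuples; the exact format
    # that load() expects is an assumption.
    quarters = []
    quarters.append((2009, 3))  # Q3 2009
    quarters.append((2008, 3))  # Q3 2008
    edgar.database.load(quarters)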
    

    http://rf-contrib.googlecode.com/svn/trunk/ha/src/main/python/edgar/
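
If that library is not available, the "load the master index into SQLite, then query it" idea can be sketched with the standard library alone. This is not the library's API: the function names, the `filings` table, and the two sample rows are made up for illustration. The only thing assumed about EDGAR is the usual layout of a quarterly master index (a free-text header, a pipe-delimited column line, a dashed rule, then one row per filing).

```python
import sqlite3

# Hypothetical sample in the layout of an EDGAR quarterly master index.
# The companies, CIKs and paths are invented for this example.
SAMPLE_INDEX = """\
Description:           Master Index of EDGAR Dissemination Feed
Last Data Received:    September 30, 2009

CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1000001|EXAMPLE CORP|10-K|2009-08-14|edgar/data/1000001/0000000000-09-000001.txt
1000002|SAMPLE HOLDINGS INC|8-K|2009-09-01|edgar/data/1000002/0000000000-09-000002.txt
"""


def parse_master_index(text):
    """Yield (cik, company, form_type, date_filed, filename) tuples."""
    in_body = False
    for line in text.splitlines():
        if not in_body:
            # Rows start after the dashed rule that follows the column line.
            if line.startswith("-----"):
                in_body = True
            continue
        parts = line.strip().split("|")
        if len(parts) == 5:  # skip blank or malformed lines
            yield (int(parts[0]),) + tuple(parts[1:])


def load_index(conn, text):
    """Create the filings table if needed and insert the parsed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS filings ("
        "cik INTEGER, company TEXT, form_type TEXT, "
        "date_filed TEXT, filename TEXT)"
    )
    conn.executemany(
        "INSERT INTO filings VALUES (?, ?, ?, ?, ?)",
        parse_master_index(text),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_index(conn, SAMPLE_INDEX)

# Query the table for the filings of interest, here every 10-K.
rows = conn.execute(
    "SELECT company, filename FROM filings WHERE form_type = ?", ("10-K",)
).fetchall()
print(rows)
```

The `filename` column holds the path of each filing's .txt file, so a query like the one above gives the list of documents to download and clean. Extracting the document text itself is a separate step that this sketch does not attempt.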
