twisted

Python Scrapy - mimetype based filter to avoid non-text file downloads

杀马特。学长 韩版系。学妹 submitted on 2019-12-22 06:47:24
Question: I have a running Scrapy project, but it is bandwidth-intensive because it tries to download a lot of binary files (zip, tar, mp3, etc.). I think the best solution is to filter the requests based on the MIME type (the Content-Type HTTP header). I looked at the Scrapy code and found this setting:
    DOWNLOADER_HTTPCLIENTFACTORY = 'scrapy.core.downloader.webclient.ScrapyHTTPClientFactory'
I changed it to:
    DOWNLOADER_HTTPCLIENTFACTORY = 'myproject.webclients.ScrapyHTTPClientFactory'
And played a
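The check the question describes boils down to classifying a Content-Type header value. Below is a minimal sketch of that step only, assuming a hypothetical helper named is_text_content_type and an illustrative allow-list; neither is part of Scrapy, and where to call it (wherever the response headers first become available) is left open.

```python
# Hypothetical helper and allow-list; these names are illustrative, not Scrapy APIs.
TEXT_MIME_PREFIXES = ("text/",)
TEXT_MIME_TYPES = {"application/xhtml+xml", "application/xml", "application/json"}


def is_text_content_type(header_value):
    """Return True if a Content-Type header value names a text-like MIME type."""
    if not header_value:
        # Missing header: treated as non-text here (a policy choice, not a rule).
        return False
    if isinstance(header_value, bytes):
        # Header values often arrive as bytes; latin-1 decoding never fails.
        header_value = header_value.decode("latin-1")
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = header_value.split(";", 1)[0].strip().lower()
    return mime.startswith(TEXT_MIME_PREFIXES) or mime in TEXT_MIME_TYPES
```

For example, is_text_content_type("text/html; charset=utf-8") is true, while "application/zip" and "audio/mpeg" are rejected.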

Python : why a method from super class not seen?

会有一股神秘感。 submitted on 2019-12-22 05:14:03
Question: I am trying to implement my own version of a DailyLogFile: from twisted.python.logfile import DailyLogFile class NDailyLogFile(DailyLogFile): def __init__(self, name, directory, rotateAfterN = 1, defaultMode=None): DailyLogFile.__init__(self, name, directory, defaultMode) # why do not use super. here? lisibility maybe? # self.rotateAfterN = rotateAfterN def shouldRotate(self): """Rotate when N days have passed since file creation""" delta = datetime.date(*self.toDate()) - datetime.date(*self
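The date arithmetic that the truncated shouldRotate appears to be building can be tried in isolation. The sketch below is a standalone illustration, not the asker's method: should_rotate and its parameters are hypothetical names, and it assumes the dates are (year, month, day) tuples, which is the shape the toDate() calls in the excerpt seem to produce.

```python
import datetime


def should_rotate(created, today, rotate_after_n=1):
    """Return True once at least rotate_after_n days separate the two dates.

    created and today are (year, month, day) tuples (an assumption about
    what the question's toDate() calls return).
    """
    # Subtracting two date objects yields a timedelta; .days is the whole-day gap.
    delta = datetime.date(*today) - datetime.date(*created)
    return delta.days >= rotate_after_n
```

Note that the expression needs the datetime module to be imported; the excerpt shows only the twisted.python.logfile import, so whether the original module imports it cannot be told from here.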

Running scrapy from script not including pipeline

喜夏-厌秋 submitted on 2019-12-22 04:43:45
Question: I'm running Scrapy from a script, but all it does is activate the spider. It doesn't go through my item pipeline. I've read http://scrapy.readthedocs.org/en/latest/topics/practices.html but it doesn't say anything about including pipelines. My setup:
    Scraper/
        scrapy.cfg
        ScrapyScript.py
        Scraper/
            __init__.py
            items.py
            pipelines.py
            settings.py
            spiders/
                __init__.py
                my_spider.py
My script:
    from twisted.internet import reactor
    from scrapy.crawler import Crawler
    from scrapy.settings import Settings

Python twisted reactor - address already in use

末鹿安然 submitted on 2019-12-22 04:08:35
Question: I'm following the tutorial at http://www.raywenderlich.com/3932/how-to-create-a-socket-based-iphone-app-and-server to create a sample using socket programming in a Mac OS X environment. I'm using port 80 in reactor.listenTCP(80, factory). When I run the server.py file, I get an error:
    File "server.py", line 10, in <module>
        reactor.listenTCP(6, factory)
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/posixbase.py", line 436, in listenTCP
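The "address already in use" condition named in the title can be reproduced without Twisted. The sketch below uses plain standard-library sockets instead of reactor.listenTCP, and second_bind_errno is a hypothetical name: it binds one listening socket to a free loopback port, then tries to bind a second socket to the same port and reports the resulting error code.

```python
import errno
import socket


def second_bind_errno():
    """Bind two TCP sockets to the same loopback port; return the second bind's errno."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for any free port, so the demo needs no privileges.
        first.bind(("127.0.0.1", 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            return exc.errno
        return None  # the second bind unexpectedly succeeded
    finally:
        first.close()
        second.close()
```

Comparing the return value with errno.EADDRINUSE shows the same failure a server hits when another process is already listening on its port.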

“SyntaxError: unexpected EOF while parsing” while iterating a dictionary in PDB

允我心安 submitted on 2019-12-22 04:08:22
Question: I have a pdb trace set inside a GET request. I want to print all the attributes of the request object. I am trying the following in pdb:
    (Pdb) request
    <GET /foo HTTP/1.1>
    (Pdb) for d in dir(request):
    *** SyntaxError: unexpected EOF while parsing (<stdin>, line 1)
I am sure there is something fundamental I am missing here.
Answer 1: You can't enter multi-line statements in pdb. You can use the commands command if the code block is to be executed on a break point, though; help commands for more
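Since pdb evaluates one line at a time, one way around the limitation is to keep the loop inside a function and call it with a single expression. The helper below is a hypothetical sketch (describe is not a pdb or Twisted API); it would have to be defined or importable in the program being debugged.

```python
def describe(obj):
    """Return (name, value) pairs for every attribute listed by dir(obj)."""
    pairs = []
    for name in dir(obj):
        try:
            value = getattr(obj, name)
        except Exception as exc:
            # Some properties raise on access; record the error instead of aborting.
            value = "<error: %r>" % (exc,)
        pairs.append((name, value))
    return pairs
```

At the prompt this becomes a one-liner such as pp describe(request). A single-line comprehension like [d for d in dir(request)] also fits on one pdb line without any helper.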

Async query database for keys to use in multiple requests

那年仲夏 submitted on 2019-12-22 01:22:20
Question: I want to asynchronously query a database for keys, then make requests to several URLs for each key. I have a function that returns a Deferred from the database whose value is the key for several requests. Ideally, I would call this function and return a generator of Deferreds from start_requests.
    @inlineCallbacks
    def get_request_deferred(self):
        d = yield engine.execute(select([table])) # async
        d.addCallback(make_url)
        d.addCallback(Request)
        return d
    def start_requests(self):
        ????
But

Iron Python Twisted

混江龙づ霸主 submitted on 2019-12-21 21:46:43
Question: Is there an Iron Python .NET port of the Twisted libraries, or can Iron Python use the standard one?
Answer 1: Twisted will not currently run in IronPython, but it is being worked on. Stay tuned.
Answer 2: I've not used it myself, but you may get some mileage out of Ironclad; it supposedly lets you use CPython from IronPython...
Source: https://stackoverflow.com/questions/676681/iron-python-twisted

Determine the current number of backlogged connections in TCP listen() queue

大城市里の小女人 submitted on 2019-12-21 10:46:25
Question: Is there a way to find out the current number of connection attempts awaiting accept() on a TCP socket on Linux? I suppose I could count the number of accept() calls that succeed before hitting EWOULDBLOCK on each event loop, but I'm using a high-level library (Python/Twisted) that hides these details. Also, it's using epoll() rather than an old-fashioned select()/poll() loop. I am trying to get a general sense of the load on a high-performance non-blocking network server, and I think this number

How can I write tests for code using twisted.web.client.Agent and its subclasses?

梦想的初衷 submitted on 2019-12-21 05:31:25
Question: I read the official tutorial on test-driven development, but it hasn't been very helpful in my case. I've written a small library that makes extensive use of twisted.web.client.Agent and its subclasses (BrowserLikeRedirectAgent, for instance), but I've been struggling to adapt the tutorial's code to my own test cases. I had a look at twisted.web.test.test_web, but I don't understand how to make all the pieces fit together. For instance, I still have no idea how to get a Protocol object

Twisted Python script on Raspberry Pi (Debian) to communicate with Arduino via USB

拜拜、爱过 submitted on 2019-12-21 05:25:05
Question: I have been working on an Arduino/Raspberry Pi project where I have found myself learning not just Python but Twisted Python as well, so I apologize in advance for my newbness. I am trying to keep it simple for now and just send one char at a time between the two devices. So far I am able to send from the Raspberry Pi to the Arduino and effectively turn its LED off/on just as expected. However, I cannot seem to generate Twisted code which will detect anything coming from the