Question
When I run my spider from a Python script, I get an ImportError:
ImportError: No module named app.models
My items.py looks like this:
from scrapy.item import Item, Field
from scrapy.contrib.djangoitem import DjangoItem
from app.models import Person
class aqaqItem(DjangoItem):
    django_model = Person
    pass
My settings.py looks like this:
#
# For simplicity, this file contains only the most important settings by
# default. All the other settings are documented here:
#
# http://doc.scrapy.org/topics/settings.html
#
BOT_NAME = 'aqaq'
BOT_VERSION = '1.0'
SPIDER_MODULES = ['aqaq.spiders']
NEWSPIDER_MODULE = 'aqaq.spiders'
USER_AGENT = '%s/%s' % (BOT_NAME, BOT_VERSION)
ITEM_PIPELINES = [
    'aqaq.pipelines.JsonWithEncodingPipeline']
import sys
import os
c=os.getcwd()
os.chdir("../../myweb")
d=os.getcwd()
os.chdir(c)
sys.path.insert(0, d)
# Setting up django's settings module name.
# This module is located at /home/rolando/projects/myweb/myweb/settings.py.
import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'myweb.settings'
The Python script I use to call the spider looks like this:
from twisted.internet import reactor
from scrapy.crawler import Crawler
from scrapy import log, signals
from final.aqaq.aqaq.spiders.spider import aqaqspider
from scrapy.utils.project import get_project_settings
def stop_reactor():
    reactor.stop()
spider = aqaqspider(domain='aqaq.com')
settings = get_project_settings()
crawler = Crawler(settings)
crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
crawler.configure()
crawler.crawl(spider)
crawler.start()
log.start()
reactor.run()
My directory structure looks like this:
.
|-- aqaq
|   |-- aqaq
|   |   |-- call.py
|   |   |-- __init__.py
|   |   |-- __init__.pyc
|   |   |-- items.py
|   |   |-- items.pyc
|   |   |-- pipelines.py
|   |   |-- pipelines.pyc
|   |   |-- settings.py
|   |   |-- settings.pyc
|   |   `-- spiders
|   |       |-- aqaq.json
|   |       |-- __init__.py
|   |       |-- __init__.pyc
|   |       |-- item.json
|   |       |-- spider.py
|   |       |-- spider.pyc
|   |       `-- url
|   |-- call.py
|   |-- call_spider.py
|   `-- scrapy.cfg
|-- mybot
|   |-- mybot
|   |   |-- __init__.py
|   |   |-- items.py
|   |   |-- pipelines.py
|   |   |-- settings.py
|   |   `-- spiders
|   |       |-- example.py
|   |       `-- __init__.py
|   `-- scrapy.cfg
`-- myweb
    |-- app
    |   |-- admin.py
    |   |-- admin.pyc
    |   |-- __init__.py
    |   |-- __init__.pyc
    |   |-- models.py
    |   |-- models.pyc
    |   |-- tests.py
    |   `-- views.py
    |-- manage.py
    `-- myweb
        |-- file
        |-- __init__.py
        |-- __init__.pyc
        |-- settings.py
        |-- settings.pyc
        |-- urls.py
        |-- urls.pyc
        |-- wsgi.py
        `-- wsgi.pyc
Please help me, as I am new to Scrapy.
I am really confused. I tried adding
import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'myweb.settings'
at the top of my script, but then I got a new error saying that
get_project_settings is invalid.
Also, my Scrapy version is 0.18.
Thank you all, I found the solution.
Answer 1:
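Perhaps your problem is that you are importing the spider before the settings have been loaded. The ImportError most likely comes from the line from app.models import Person in your items.py, which presumably gets imported along with the spider: at that point settings.py has not yet run, so the myweb directory has not been added to sys.path and the app package cannot be found.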
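To illustrate why the order matters, here is a minimal sketch that uses only the standard library (it is not Scrapy code). The package name demo_app and the temporary directory are made up for the demonstration; they stand in for your app package and the myweb directory. The same import fails before the directory is on sys.path and works afterwards:

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for myweb/app/models.py, created in a temp directory.
project_dir = tempfile.mkdtemp()
package_dir = os.path.join(project_dir, "demo_app")
os.makedirs(package_dir)
open(os.path.join(package_dir, "__init__.py"), "w").close()
with open(os.path.join(package_dir, "models.py"), "w") as f:
    f.write("class Person(object):\n    pass\n")


def try_import(name):
    """Return the imported module, or None if it cannot be found."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Before the path is configured (that is, before settings.py has run),
# the import fails.
before = try_import("demo_app.models")

# This mirrors the side effect of sys.path.insert(0, d) in settings.py.
sys.path.insert(0, project_dir)
importlib.invalidate_caches()

# After the path is configured, the same import succeeds.
after = try_import("demo_app.models")

print("import worked before path setup:", before is not None)
print("import worked after path setup:", after is not None)
```

In your script, importing the spider at the top plays the role of the first try_import call, because it happens before anything has extended sys.path.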
So, import your spider after you set up the settings:
crawler.configure()
from final.aqaq.aqaq.spiders.spider import aqaqspider
spider = aqaqspider(domain='aqaq.com')
crawler.crawl(spider)
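One more thing that may be worth checking, although this is an assumption on my part and not something confirmed in the question: settings.py finds the Django project with os.chdir("../../myweb"), which is resolved against the current working directory, so the result depends on where you start the script from. A sketch of a variant that resolves the path relative to the settings file instead could look like this. The helper name add_project_to_path and the /tmp/final/... path in the demonstration are hypothetical:

```python
import os
import sys


def add_project_to_path(settings_file, relative_project_dir):
    """Put a directory, located relative to settings_file, on sys.path."""
    base_dir = os.path.dirname(os.path.abspath(settings_file))
    project_dir = os.path.normpath(os.path.join(base_dir, relative_project_dir))
    if project_dir not in sys.path:
        sys.path.insert(0, project_dir)
    return project_dir


# Demonstration with a made-up location; inside settings.py you would
# pass __file__ as the first argument.
example = add_project_to_path("/tmp/final/aqaq/aqaq/settings.py", "../../myweb")
print(example)
```

With the directory layout from the question, ../../myweb relative to aqaq/aqaq/settings.py points at the myweb directory next to the outer aqaq directory, regardless of the directory the script is launched from.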
Source: https://stackoverflow.com/questions/19164482/scrapy-and-django-import-error