Split on comma using Python and Scrapy

Submitted by 末鹿安然 on 2020-06-29 03:40:18

Question


I am using Scrapy to extract data from a website. One field I extract returns both the city and the region. I want to split the returned data on the comma, storing the first part in the city field and the second part in the region field. The code I am using to extract the data:

 loader.add_css('region','.seller-box__seller-address__label::text')

The output is a column named region containing, for example, this value:

Elbląg, Warmińsko-mazurskie

The desired output is two columns: city with the value Elbląg, and region with the value Warmińsko-mazurskie.

UPDATE:

Apparently the loader accepts an additional argument for regular expressions. I was able to split the data by passing:

loader.add_css('region','.seller-box__seller-address__label::text',re='([^,]+)$')

This captures only the text after the last comma (the region), so everything before the comma, including the city, is dropped.
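To see what that pattern captures outside of Scrapy, here is a minimal sketch using Python's standard `re` module on the sample value from the question. The second pattern, `^([^,]+)`, is an assumption added for illustration (it is not in the original post) and grabs the part before the first comma. Note that the captured region starts with a space, so it is stripped here; whether the loader strips it for you depends on the processors configured on your item.

```python
import re

raw = "Elbląg, Warmińsko-mazurskie"

# Pattern from the update: one or more non-comma characters at the end of the string.
region_match = re.search(r"([^,]+)$", raw)

# Hypothetical complementary pattern: non-comma characters at the start of the string.
city_match = re.search(r"^([^,]+)", raw)

# The region capture includes the space that follows the comma, so strip both parts.
region = region_match.group(1).strip()
city = city_match.group(1).strip()

print(city)
print(region)
```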


Answer 1:


I don't know whether the loader has a dedicated method for splitting a value into two fields.

Normally I would do this:

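# Extract the raw "City, Region" string and trim surrounding whitespace
text = response.css('.seller-box__seller-address__label::text').extract_first().strip()

# Split on the comma separator into the two parts
city, region = text.split(', ')

# Store each part in its own field
loader.add_value('city', city)
loader.add_value('region', region)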
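The snippet above assumes the selector always matches and that the text always contains exactly one ", " separator. `extract_first()` returns `None` when nothing matches, and unpacking a split result fails when the comma is missing. A slightly more defensive variant might look like the sketch below; `split_city_region` is a hypothetical helper name, not part of Scrapy, and returning empty strings for missing parts is just one possible choice.

```python
def split_city_region(text):
    """Split 'City, Region' into (city, region).

    Returns empty strings for parts that are missing, so the caller
    can always unpack the result into two values.
    """
    if text is None:
        # Nothing was extracted from the page.
        return "", ""
    # partition splits on the first comma only and never raises;
    # if there is no comma, region ends up as an empty string.
    city, _, region = text.partition(",")
    return city.strip(), region.strip()


city, region = split_city_region("Elbląg, Warmińsko-mazurskie")
print(city)
print(region)
```

In a spider, the result could then be passed to `loader.add_value('city', city)` and `loader.add_value('region', region)` exactly as in the answer.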


Source: https://stackoverflow.com/questions/62595620/split-on-comma-using-python-and-scrapy
