Question
I have a function that gets the titles of all the products on my website. I don't want the titles of certain products: those containing the words "OLP NL", "Arcserve", "LicSAPk" or "symantec". Is this the right way to do it?
def get_title ( u ):
    html = requests.get ( u )
    bsObj = BeautifulSoup ( html.content, 'xml' )
    title = str ( bsObj.title ).replace ( '<title>', '' ).replace ( '</title>', '' )
    if (title.find ( 'Arcserve' ) or title.find ( 'OLP NL' ) or title.find ( 'LicSAPk' ) or title.find ( 'Symantec' ) is not -1):
        return 'null'
    else:
        return title
if (title != 'null'):
    ws1 [ 'B1' ] = title
    meta_desc = get_metaDesc ( u )
    ws1 [ 'C1' ] = meta_desc
    meta_keyWrds = get_metaKeyWrds ( u )
    ws1 [ 'D1' ] = meta_keyWrds
    print ( "writing product no." + str ( i ) )
else:
    print("skipped product no. " + str ( i ))
    continue;
The problem is that the program excludes all my products, and all I see is "skipped product no.". Why? Not all of them contain these words.
Answer 1:
You can change the condition of the if statement to (title.find ( 'Arcserve' )!=-1 or title.find ( 'OLP NL' )!=-1 or title.find ('LicSAPk' )!=-1 or title.find ('Symantec' )!=-1), so that every find result is compared with -1,
or you can create a function that checks for the terms you want to find:
def TermFind(Title):
    terms=['Arcserve','OLP NL','LicSAPk','Symantec']
    disc=False
    for val in terms:
        if Title.find(val)!=-1:
            disc=True
            break
    return disc
When I used the original if statement, it always evaluated to True regardless of the title value. I couldn't find an explanation for this behavior, but you can try checking the questions 'Python != operation vs "is not"' and 'nested "and/or" if statements'. Hope it helps.
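One likely explanation, as a minimal sketch: str.find returns -1 when the substring is absent, and -1 is truthy in Python. In the original condition only the last find call is compared with -1, so the earlier calls are used as truth values directly, and a "not found" result of -1 already makes the whole `or` chain true. The helper names original_condition and fixed_condition are hypothetical, and the sketch uses `!= -1` in place of `is not -1` (comparing integers with `is` is unreliable and newer Python versions warn about it).

```python
TERMS = ['Arcserve', 'OLP NL', 'LicSAPk', 'Symantec']

def original_condition(title):
    # Mirrors the question's condition: only the last find() is compared
    # with -1; the first three results are tested for truthiness as-is.
    return bool(title.find('Arcserve') or title.find('OLP NL')
                or title.find('LicSAPk') or title.find('Symantec') != -1)

def fixed_condition(title):
    # Compare every find() result with -1 explicitly.
    return any(title.find(term) != -1 for term in TERMS)

title = "Some ordinary product"
print(title.find('Arcserve'))        # -1 means "not found"
print(bool(title.find('Arcserve')))  # but -1 is truthy
print(original_condition(title))     # so the title is treated as excluded
print(fixed_condition(title))        # the explicit comparison keeps it
```

With this reading, every title is skipped because the first find call almost always returns a non-zero value, whether or not the term is present.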
Answer 2:
A similar idea, using any:
import requests
from bs4 import BeautifulSoup
url = 'https://www.cdsoft.co.il/index.php?id_product=300610&controller=product'
html = requests.get(url)
bsObj = BeautifulSoup(html.content, 'lxml')
title = str ( bsObj.title ).replace ( '<title>', '' ).replace ( '</title>', '' )
items = ['Arcserve','OLP NL','LicSAPk','Symantec']
if not any(item in title for item in items):
    print(title)
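The question lists "symantec" in lowercase while the code checks for 'Symantec', so the match may need to ignore case. A small sketch of the same any-based check wrapped in a helper, assuming case-insensitive matching is wanted; the names EXCLUDED_TERMS and is_excluded and the sample titles are hypothetical and not part of the original answer.

```python
EXCLUDED_TERMS = ['Arcserve', 'OLP NL', 'LicSAPk', 'Symantec']

def is_excluded(title, terms=EXCLUDED_TERMS):
    """Return True if the title contains any excluded term, ignoring case."""
    lowered = title.lower()
    return any(term.lower() in lowered for term in terms)

# Made-up titles, used only to illustrate the check.
for sample in ["symantec Endpoint Protection", "Office Home and Business"]:
    if is_excluded(sample):
        print("skipped:", sample)
    else:
        print("kept:", sample)
```

Returning a boolean from the helper also avoids using the string 'null' as a marker for skipped titles.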
Source: https://stackoverflow.com/questions/55067248/how-to-exclude-all-title-with-find