Question
I have a text file like this:
Education:
askdjbnakjfbuisbrkjsbvxcnbvfiuregifuksbkvjb.iasgiufdsegiyvskjdfbsldfgd
Technical skills :
java,j2ee etc.,
work done:
oaugafiuadgkfjwgeuyrfvskjdfviysdvfhsdf,aviysdvwuyevfahjvshgcsvdfs,bvisdhvfhjsvjdfvshjdvhfjvxjhfvhjsdbvfkjsbdkfg
I would like to extract only the heading names, such as Education, Technical skills, etc.
This is my code so far:
with open("aks.txt") as infile, open("fffm",'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Technical Skills":
            copy =True
        elif line.strip() == "Workdone":
            copy = True
        elif line.strip() == "Education":
            copy = False
        elif copy:
            outfile.write(line)
fh = open("fffm.txt", 'r')
contents = fh.read()
len(contents)
Answer 1:
To get just the headings from your text file, you could use the following:
import re
with open('aks.txt') as f_input:
    # Capture whatever precedes each colon, dropping whitespace before the colon
    headings = re.findall(r'(.*?)\s*:', f_input.read())
print(headings)
This would display the following:
['Education', 'Technical skills', 'work done']
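One caveat: the pattern above treats any colon as the end of a heading, so a body line that contains a colon would be reported too. The sketch below is a stricter variant, assuming a heading is a line that holds nothing but a name followed by a colon. The helper name `find_headings` and the sample text are made up for illustration and are not part of the original answer.

```python
import re

# Hypothetical helper: a heading is assumed to be a whole line of the form "name :".
# re.MULTILINE makes ^ and $ match at the start and end of every line.
HEADING = re.compile(r'^[ \t]*([^:\n]+?)[ \t]*:[ \t]*$', re.MULTILINE)

def find_headings(text):
    """Return the names of all lines that consist only of a name and a colon."""
    return HEADING.findall(text)

# Illustrative sample; the second line has a colon but is not a heading.
sample = (
    "Education:\n"
    "studied at: some college\n"
    "Technical skills :\n"
    "java,j2ee etc.,\n"
    "work done:\n"
    "built a parser\n"
)
print(find_headings(sample))
```

With this sample, the line `studied at: some college` is skipped because text follows the colon. For a real file, `find_headings(open('aks.txt').read())` would be the equivalent call.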
Answer 2:
If you are sure that the heading names always occur before a colon (:), you can write a regex that searches for that pattern.
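import re
with open("aks.txt") as infile:
    # re.MULTILINE lets ^ match at the start of every line, including the first one;
    # a (?<=\n) lookbehind would skip a heading on the first line of the file
    for s in re.finditer(r'^.*?(?=\s*:)', infile.read(), re.MULTILINE):
        print(s.group())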
The output will look like this:
Education
Technical skills
work done
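If a regex feels heavier than needed, the same idea can be expressed line by line. This is only a sketch under the assumption that every heading sits on its own line and ends with a colon. The function name `iter_headings` and the in-memory sample are hypothetical stand-ins for the real file.

```python
import io

def iter_headings(lines):
    """Yield the text before the colon for every line that ends with a colon."""
    for line in lines:
        stripped = line.strip()
        if stripped.endswith(":"):
            name = stripped[:-1].strip()  # drop the colon and any space before it
            if name:                      # ignore lines that are just ":"
                yield name

# io.StringIO stands in for open("aks.txt"); both can be iterated line by line.
sample = io.StringIO(
    "Education:\n"
    "some text about schools\n"
    "Technical skills :\n"
    "java,j2ee etc.,\n"
    "work done:\n"
    "some text about projects\n"
)
headings = list(iter_headings(sample))
print(headings)
```

Because the function accepts any iterable of lines, passing an open file object works the same way, and the results could be written to an output file with `outfile.write(name + "\n")`.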
Source: https://stackoverflow.com/questions/34004631/how-can-i-get-only-heading-names-from-the-text-file