Compare two strings and Extract value of variable data in Python

落爺英雄遲暮 submitted on 2021-02-11 06:16:25

Question


In my Python script, I have a list of strings like this:

birth_year = ["my birth year is *","i born in *","i was born in *"]

I want to compare an input sentence against the list above and get the birth year as output.

The input sentence is like:

Example1: My birth year is 1994.
Example2: I born in 1995

The output will be:

Example1: 1994
Example2: 1995

I tried many approaches using regex, but I didn't find a good solution.


Answer 1:


If you change birth_year to a list of regexes you could match more easily with your input string. Use a capturing group for the year.

Here's a function that does what you want:

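import re

def match_year(birth_year, text):
    # Try each pattern in turn; print the text before the match plus the captured year.
    for pattern in birth_year:
        m = re.search(pattern, text, re.IGNORECASE)
        if m:
            output = f'{text[:m.start(0)]}{m[1]}'
            print(output)
            break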

Example:

birth_year = [r"my birth year is (\d{4})", r"i born in (\d{4})", r"i was born in (\d{4})"]

match_year(birth_year, "Example1: My birth year is 1994.")
match_year(birth_year, "Example2: I born in 1995")

Output:

Example1: 1994
Example2: 1995

You need at least Python 3.6 for f-strings.
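
If you would rather keep the original list with its "*" placeholders, the regexes can be derived from it instead of being written by hand. The following is only a sketch under some assumptions: the helper names wildcard_to_regex and extract_year are made up for illustration, "*" is taken to always stand for a 4-digit year, and the function returns the year instead of printing it.

```python
import re

def wildcard_to_regex(template):
    # Escape the literal text, then swap the escaped "*" for a 4-digit capture group.
    # Assumption: "*" always stands for a 4-digit year.
    return re.escape(template).replace(r"\*", r"(\d{4})")

def extract_year(templates, sentence):
    # Return the year captured by the first matching template, or None.
    for template in templates:
        m = re.search(wildcard_to_regex(template), sentence, re.IGNORECASE)
        if m:
            return int(m.group(1))
    return None

birth_year = ["my birth year is *", "i born in *", "i was born in *"]

print(extract_year(birth_year, "My birth year is 1994."))  # 1994
print(extract_year(birth_year, "I born in 1995"))          # 1995
```

Returning None when nothing matches lets the caller decide how to handle sentences that fit none of the templates.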




Answer 2:


str1 = "My birth year is 1994."
str2 = str1.replace('My birth year is ', '')  # str2 is now "1994."

You can try something like this: replace the part of the sentence you don't need with an empty string.

For the lists you shared, you can do something like:

for x in examples:
    for y in birth_year:
        phrase = y.rstrip('*')  # drop the trailing "*" placeholder
        if phrase in x.lower():  # check whether the phrase occurs in the example
            # str.replace returns a new string, so use its result
            print(x.lower().replace(phrase, ''))

I think the code above should work.




Answer 3:


If you can guarantee that those strings always contain exactly one 4-digit number, which is the year of birth, somewhere in there, I'd say just use a regex to grab whatever 4 digits are in there, surrounded by non-digits. Rather dumb, but hey, it works with your data.

import re

examples = ["My birth year is 1993.", "I born in 1995", "я родился в 1976м году"]
for s in examples:
    # Exactly one run of 4 digits, with only non-digits before and after it.
    y = int(re.findall(r"^[^\d]*([\d]{4})[^\d]*$", s)[0])
    print(y)
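
Note that indexing the findall result with [0] raises IndexError when a sentence has no match. A slightly more forgiving variant, sketched here with a hypothetical helper named find_year, uses lookarounds to pick the first run of exactly four digits and returns None when there is none. Unlike the pattern above, it does not require the rest of the sentence to be free of other digits.

```python
import re

# Four digits that are neither preceded nor followed by another digit.
YEAR = re.compile(r"(?<!\d)\d{4}(?!\d)")

def find_year(sentence):
    # Return the first isolated 4-digit number as an int, or None if absent.
    m = YEAR.search(sentence)
    return int(m.group()) if m else None

examples = ["My birth year is 1993.", "I born in 1995", "я родился в 1976м году"]
for s in examples:
    print(find_year(s))
```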


Source: https://stackoverflow.com/questions/60665158/compare-two-strings-and-extract-value-of-variable-data-in-python
