determine “type of value” from a string in python

难免孤独 2021-01-14 02:39

I'm trying to write a function in Python that determines what type of value a string contains; for example:

if the string is 1, 0, True, or False, the value is BIT

4 answers
  •  长情又很酷
    2021-01-14 03:24

    You said that you used these for input:

    • 2010-00-10 (classified as INT, expected text)
    • 20.90 (classified as INT, expected FLOAT)

    Your original code:

    def dataType(string):
    
     odp=''
     patternBIT=re.compile('[01]')
     patternINT=re.compile('[0-9]+')
     patternFLOAT=re.compile('[0-9]+\.[0-9]+')
     patternTEXT=re.compile('[a-zA-Z0-9]+')
     if patternTEXT.match(string):
         odp= "text"
     if patternFLOAT.match(string):
         odp= "FLOAT"
     if patternINT.match(string):
         odp= "INT"
     if patternBIT.match(string):
         odp= "BIT"
    
     return odp 
    

    The Problem

    Your if statements are all executed in sequence, so a later match overwrites the result of an earlier one - that is:

    if patternTEXT.match(string):
        odp= "text"
    if patternFLOAT.match(string):
        odp= "FLOAT"
    if patternINT.match(string):
        odp= "INT"
    if patternBIT.match(string):
        odp= "BIT"
    

    "2010-00-10" matches your text pattern, but the code then goes on to try the float pattern (which fails because there is no "."), and then the int pattern. That one succeeds, because re.match only requires the pattern to match at the start of the string, and the leading "2010" satisfies [0-9]+. The result is therefore overwritten with "INT".

    You should use:

    if patternTEXT.match(string):
        odp = "text"
    elif patternFLOAT.match(string):
        ...
    

    For your situation, though, you probably want to test from most specific to least specific, because as you've seen, input that matches the text pattern may also match the int pattern (and vice versa). You would also need to improve your regular expressions: your 'text' pattern only matches alphanumeric input and does not match special symbols.
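
    As a rough illustration of both points, here is a minimal sketch of a regex-based version. It assumes Python 3 and uses re.fullmatch so that the whole string has to match, not just a prefix. The name data_type and the PATTERNS table are hypothetical, and anything that is not a bit, an unsigned int, or an unsigned float is simply treated as text:

```python
import re

# re.match only anchors at the start, so a digit prefix is enough to match.
print(bool(re.match(r'[0-9]+', "2010-00-10")))      # True
# re.fullmatch requires the entire string to match the pattern.
print(bool(re.fullmatch(r'[0-9]+', "2010-00-10")))  # False

# Hypothetical pattern table, ordered from most specific to least specific.
PATTERNS = [
    ("BIT", re.compile(r'[01]')),
    ("INT", re.compile(r'[0-9]+')),
    ("FLOAT", re.compile(r'[0-9]+\.[0-9]+')),
]

def data_type(string):
    # Return the label of the first pattern that matches the whole string.
    for label, pattern in PATTERNS:
        if pattern.fullmatch(string):
            return label
    # Everything else, including dates and symbols, falls through to text.
    return "TEXT"

print(data_type("1"))           # BIT
print(data_type("20.90"))       # FLOAT
print(data_type("2010-00-10"))  # TEXT
```

    Returning on the first hit has the same effect as an if/elif chain. Signed numbers and exponent notation are not covered by these patterns and would come back as text.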

    I will offer my own suggestion, though I do like the AST solution more:

    def get_type(string):
    
        if len(string) == 1 and string in ['0', '1']:
            return "BIT"
    
        # int has to come before float, because any integer string
        # also parses as a float.
        try:
            int(string)
            return "INT"
        except ValueError:
            pass
    
        try:
            float(string)
            return "FLOAT"
        except ValueError:
            pass
    
        return "TEXT"
    

    Run example:

    In [27]: get_type("034")
    Out[27]: 'INT'
    
    In [28]: get_type("3-4")
    Out[28]: 'TEXT'
    
    
    In [29]: get_type("20.90")
    Out[29]: 'FLOAT'
    
    In [30]: get_type("u09pweur909ru20")
    Out[30]: 'TEXT'
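
    The AST-based answer mentioned above is not reproduced here. As an assumption about what such an approach might look like, the sketch below parses the string with ast.literal_eval and inspects the resulting Python value. The function name get_type_ast is hypothetical, and the labels follow the question:

```python
import ast

def get_type_ast(string):
    # Hypothetical helper: classify a string by parsing it as a Python literal.
    try:
        value = ast.literal_eval(string)
    except (ValueError, SyntaxError, TypeError):
        # Not a valid Python literal, e.g. "u09pweur909ru20".
        return "TEXT"

    # bool is checked first because True and False are also ints in Python.
    if isinstance(value, bool):
        return "BIT"
    if isinstance(value, int):
        return "BIT" if value in (0, 1) else "INT"
    if isinstance(value, float):
        return "FLOAT"
    # Strings, lists, None and other literals are treated as text here.
    return "TEXT"

print(get_type_ast("True"))   # BIT
print(get_type_ast("42"))     # INT
print(get_type_ast("20.90"))  # FLOAT
```

    This variant handles True and False, which the try/except version does not. Because it follows Python's literal syntax, an input such as "034" is rejected in Python 3 and ends up as text.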
    
