Getting a file extension using pattern matching in Python

逝去的感伤 2020-12-11 19:58

I am trying to find the extension of a file, given its name as a string. I know I can use the function os.path.splitext but it does not work as expected in case

6 answers
  • 2020-12-11 20:07

    Starting from phihag's answer:

    import os

    DOUBLE_EXTENSIONS = ('tar.gz', 'tar.bz2')  # Add extra extensions where desired.

    def guess_extension(filename):
        """
        Guess the extension of the given filename; returns (root, ext).
        """
        root, ext = os.path.splitext(filename)
        # str.endswith accepts a tuple of suffixes
        if filename.endswith(DOUBLE_EXTENSIONS):
            root, first_ext = os.path.splitext(root)
            ext = first_ext + ext
        return root, ext
    
  • 2020-12-11 20:11
    >>> import re
    >>> print(re.compile(r'^.*[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.gz').group('ext'))
    gz
    >>> print(re.compile(r'^.*?[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.gz').group('ext'))
    tar.gz
    

    The ? after .* makes the quantifier non-greedy, so it matches as little as possible. Instead of .* consuming "a.tar" and leaving only "gz" for the group, .*? stops at the shortest prefix that still allows tar.gz to be matched.
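
The greedy versus lazy behavior described above can be wrapped in a small helper. This is only a minimal sketch and not part of the original answer: the names `KNOWN_DOUBLE` and `extension_by_regex` are hypothetical, and the pattern only knows the two double extensions mentioned in this thread.

```python
import re

# Hypothetical list for illustration; extend as needed.
KNOWN_DOUBLE = ("tar.gz", "tar.bz2")

# Lazy prefix (.*?) so the known double extensions get a chance to match
# before the generic \w+ alternative does.
_EXT_RE = re.compile(
    r"^.*?[.](?P<ext>" + "|".join(re.escape(e) for e in KNOWN_DOUBLE) + r"|\w+)$"
)

def extension_by_regex(filename):
    """Return the extension without its leading dot, or '' if none is found."""
    match = _EXT_RE.match(filename)
    return match.group("ext") if match else ""
```

Calling `extension_by_regex('a.tar.gz')` should give `'tar.gz'`, while a name without any dot gives an empty string instead of raising on a failed match.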

  • 2020-12-11 20:21
    import os

    root, ext = os.path.splitext('a.tar.gz')
    if ext in ['.gz', '.bz2']:
        ext = os.path.splitext(root)[1] + ext
    

    Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

  • 2020-12-11 20:21

    This is simple and works with both single and multiple extensions. Note that it returns the base name (everything before the first dot), not the extension:

    In [1]: '/folder/folder/folder/filename.tar.gz'.split('/')[-1].split('.')[0]
    Out[1]: 'filename'
    
    In [2]: '/folder/folder/folder/filename.tar'.split('/')[-1].split('.')[0]
    Out[2]: 'filename'
    
    In [3]: 'filename.tar.gz'.split('/')[-1].split('.')[0]
    Out[3]: 'filename'
    
  • 2020-12-11 20:21

    Continuing from phihag's answer: to generically collect all double or triple extensions, as in CropQDS275.jpg.aux.xml, loop while '.' is in the remaining name:

    import os

    tempfilename, file_extension = os.path.splitext(filename)
    while '.' in tempfilename:
        tempfilename, tempfile_extension = os.path.splitext(tempfilename)
        if not tempfile_extension:
            break  # e.g. '.bashrc' or a dotted directory name: nothing left to split
        file_extension = tempfile_extension + file_extension
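
For comparison, the standard library's `pathlib` exposes the same information through `PurePath.suffixes`. The sketch below is an alternative to the loop above rather than the answerer's method, and the helper name `all_extensions` is hypothetical. Like the loop, it treats every dotted part of the file name as an extension, so it is best suited to names where that assumption holds.

```python
from pathlib import PurePath

def all_extensions(filename):
    """Join every suffix of the final path component, e.g. '.jpg.aux.xml'."""
    # PurePath.suffixes only looks at the last path component, so dots in
    # directory names are ignored, and a leading dot is not a suffix.
    return "".join(PurePath(filename).suffixes)
```

With this, `all_extensions('CropQDS275.jpg.aux.xml')` should return `'.jpg.aux.xml'`.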
    
  • 2020-12-11 20:23

    I have an idea that is much easier than racking your brain over regex, though it might sound stupid:
    name="filename.tar.gz"
    extensions=('.tar.gz','.py')
    [x for x in extensions if name.endswith(x)]
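
The `endswith` idea above can be extended to return both parts of the name. This is a rough sketch under assumptions: `KNOWN_EXTENSIONS` and `split_known_extension` are hypothetical names, the longest matching entry is preferred, and anything not in the list falls back to `os.path.splitext`.

```python
import os

# Hypothetical list of multi-part extensions; extend as needed.
KNOWN_EXTENSIONS = (".tar.gz", ".tar.bz2")

def split_known_extension(name):
    """Split name into (root, ext), preferring the longest known extension."""
    matches = [x for x in KNOWN_EXTENSIONS if name.endswith(x)]
    if matches:
        ext = max(matches, key=len)
        return name[:-len(ext)], ext
    # Fall back to the standard single-extension split.
    return os.path.splitext(name)
```

For `'filename.tar.gz'` this should return `('filename', '.tar.gz')`, and for a name such as `'script.py'` it behaves exactly like `os.path.splitext`.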
