Cheap way to search a large text file for a string

后端 未结 9 1083
隐瞒了意图╮
隐瞒了意图╮ 2020-11-27 04:15

I need to search a pretty large text file for a particular string. Its a build log with about 5000 lines of text. Whats the best way to go about doing that? Using regex sho

9条回答
  •  执笔经年
    2020-11-27 04:57

    I like the solution of Javier. I did not try it, but it sounds cool!

    For reading through a arbitary large text and wanting to know it a string exists, replace it, you can use Flashtext, which is faster than Regex with very large files.

    Edit:

    From the developer page:

    >>> from flashtext import KeywordProcessor
    >>> keyword_processor = KeywordProcessor()
    >>> # keyword_processor.add_keyword(, )
    >>> keyword_processor.add_keyword('Big Apple', 'New York')
    >>> keyword_processor.add_keyword('Bay Area')
    >>> keywords_found = keyword_processor.extract_keywords('I love Big Apple and Bay Area.')
    >>> keywords_found
    >>> # ['New York', 'Bay Area']
    

    Or when extracting the offset:

    >>> from flashtext import KeywordProcessor
    >>> keyword_processor = KeywordProcessor()
    >>> keyword_processor.add_keyword('Big Apple', 'New York')
    >>> keyword_processor.add_keyword('Bay Area')
    >>> keywords_found = keyword_processor.extract_keywords('I love big Apple and Bay Area.', span_info=True)
    >>> keywords_found
    >>> # [('New York', 7, 16), ('Bay Area', 21, 29)]
    

提交回复
热议问题