Python Sliding Window on sentence string

渐次进展 2021-01-15 10:53

I'm looking for a sliding-window splitter for a string made up of words, with window size N.

Input: "I love food and I like drink", window size 3

3 Answers
  •  情歌与酒
    2021-01-15 11:46

    An approach based on subscripting the string sequence:

    def split_on_window(sequence="I love food and I like drink", limit=4):
        results = []
        split_sequence = sequence.split()
        # Number of full windows: one per valid starting index.
        iteration_length = len(split_sequence) - (limit - 1)
        max_window_indices = range(iteration_length)
        for index in max_window_indices:
            # Slice out `limit` consecutive words starting at `index`.
            results.append(split_sequence[index:index + limit])
        return results
    

    Sample Output:

    >>> for window in split_on_window("I love food and I like drink", 3):
    ...     print(window)
    ...
    ['I', 'love', 'food']
    ['love', 'food', 'and']
    ['food', 'and', 'I']
    ['and', 'I', 'like']
    ['I', 'like', 'drink']
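
The window count in the subscripting approach follows from simple arithmetic: a sentence of W words has W - N + 1 windows of size N (7 - 3 + 1 = 5 for the sample), and none when N exceeds W. A minimal sketch that makes this explicit, assuming a hypothetical helper name `windows` and a guard for non-positive sizes that the original answer does not include:

```python
def windows(sentence, size):
    """Return every run of `size` consecutive words (hypothetical helper)."""
    if size < 1:
        # Assumption: reject non-positive sizes instead of returning odd slices.
        raise ValueError("size must be at least 1")
    words = sentence.split()
    # range() is empty when size > len(words), so short sentences give [].
    return [words[start:start + size] for start in range(len(words) - size + 1)]


sample = windows("I love food and I like drink", 3)
too_short = windows("I love", 3)
```

Here `sample` holds five three-word lists and `too_short` is an empty list, since two words cannot fill a window of three.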
    

    Here's an alternative answer inspired by @SuperSaiyan:

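    try:
        from itertools import izip  # Python 2
    except ImportError:
        izip = zip  # Python 3: the built-in zip is already lazy
    
    def split_on_window(sequence, limit):
        split_sequence = sequence.split()
        # One iterator per offset; zipping them yields consecutive windows
        # and stops once the shortest (last) iterator is exhausted.
        iterators = [iter(split_sequence[index:]) for index in range(limit)]
        return izip(*iterators)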
    

    Sample Output:

    >>> s = "I love food and I like drink"
    >>> list(split_on_window(s, 4))
    [('I', 'love', 'food', 'and'), ('love', 'food', 'and', 'I'), 
    ('food', 'and', 'I', 'like'), ('and', 'I', 'like', 'drink')]
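
The staggered-iterator idea can also be written without copying a slice of the word list for every offset. The following is only a sketch for Python 3, assuming a hypothetical function name `iter_windows`; it swaps the slices for `itertools.tee` and `itertools.islice` and relies on the built-in `zip` stopping at the shortest iterator:

```python
from itertools import islice, tee


def iter_windows(sequence, limit):
    """Lazily yield tuples of `limit` consecutive words (hypothetical helper)."""
    words = sequence.split()
    # One independent iterator over the words per position in the window.
    iterators = tee(words, limit)
    # Advance the i-th iterator by i items so the iterators are staggered.
    staggered = [islice(it, offset, None) for offset, it in enumerate(iterators)]
    # zip stops as soon as the most advanced iterator runs out of words.
    return zip(*staggered)


result = list(iter_windows("I love food and I like drink", 4))
```

As with the `izip` version, the windows come back as tuples and nothing is computed until the result is iterated.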
    

    Benchmarks:

    Sequence = I love food and I like drink, limit = 3
    Repetitions = 1000000
    Using subscripting -> 3.8326420784
    Using izip -> 5.41380286217 # Modified to return a list for the benchmark.
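
The answer does not show how these numbers were measured. One plausible way to reproduce a comparison of this kind, assuming the standard `timeit` module and Python 3 (so the built-in `zip` stands in for `izip`), is sketched below. The names `windows_by_subscript`, `windows_by_zip`, and `run_benchmark` are hypothetical, and timings will differ by machine and interpreter version:

```python
import timeit

SENTENCE = "I love food and I like drink"


def windows_by_subscript(sequence, limit):
    words = sequence.split()
    return [words[i:i + limit] for i in range(len(words) - limit + 1)]


def windows_by_zip(sequence, limit):
    words = sequence.split()
    # list() forces the lazy zip so both functions do comparable work.
    return list(zip(*(words[i:] for i in range(limit))))


def run_benchmark(repetitions=1000000, limit=3):
    """Return total elapsed seconds per approach (hypothetical harness)."""
    return {
        "subscripting": timeit.timeit(
            lambda: windows_by_subscript(SENTENCE, limit), number=repetitions),
        "zip": timeit.timeit(
            lambda: windows_by_zip(SENTENCE, limit), number=repetitions),
    }
```

Calling `run_benchmark()` returns a dict of total seconds for each approach; pass a smaller `repetitions` value for a quick check.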
    
