How to split/partition a dataset into training and test datasets for, e.g., cross validation?

醉话见心 2020-11-27 10:42

What is a good way to split a NumPy array randomly into training and testing/validation datasets? Something similar to MATLAB's cvpartition or crossvalind functions?

12 Answers
  •  醉酒成梦
    2020-11-27 11:15

    I wrote a function for my own project to do this (it doesn't use numpy, though):

    def partition(seq, chunks):
        """Split the sequence into `chunks` roughly equal-sized chunks and return them as a list."""
        result = []
        for i in range(chunks):
            # Take every `chunks`-th element starting at offset i, so
            # chunk sizes differ by at most one element.
            chunk = list(seq[i:len(seq):chunks])
            result.append(chunk)
        return result
    

    If you want the chunks to be randomized, just shuffle the list before passing it in.
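A minimal sketch of the shuffle-then-partition idea, extended to produce train/test pairs for cross validation. It uses only the standard library, like the function above. The helper names `shuffled_folds` and `train_test_pairs` and the `seed` parameter are made up for illustration and are not part of the original answer.

```python
import random

def partition(seq, chunks):
    """Split seq into `chunks` interleaved lists of near-equal size."""
    return [list(seq[i::chunks]) for i in range(chunks)]

def shuffled_folds(data, k, seed=0):
    """Hypothetical helper: shuffle a copy of data, then split it into k folds."""
    items = list(data)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(items)  # seeded for reproducible splits
    return partition(items, k)

def train_test_pairs(folds):
    """Hypothetical helper: yield (train, test), holding out one fold at a time."""
    for i, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

folds = shuffled_folds(range(10), k=3)
for train, test in train_test_pairs(folds):
    print(len(train), len(test))
```

Each element lands in exactly one test fold, so every sample is used for testing once across the k rounds. For a single train/test split, take one fold as the test set and concatenate the rest.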
