What is a good way to split a NumPy array randomly into training and testing/validation dataset? Something similar to the cvpartition or crossvalind
I wrote a function for my own project to do this (it doesn't use numpy, though):
def partition(seq, chunks):
"""Splits the sequence into equal sized chunks and them as a list"""
result = []
for i in range(chunks):
chunk = []
for element in seq[i:len(seq):chunks]:
chunk.append(element)
result.append(chunk)
return result
If you want the chunks to be randomized, just shuffle the list before passing it in.