Order in which to use the training, validation and test sets
I am trying to understand the process of model evaluation and validation in machine learning. Specifically, in which order, and how, must the training, validation and test sets be used?

Let's say I have a dataset and I want to use linear regression. I am hesitating among various polynomial degrees (hyper-parameters).

This Wikipedia article seems to imply that the sequence should be:

1. Split the data into a training set, a validation set and a test set.
2. Use the training set to fit the model (find the best parameters: the coefficients of the polynomial).
3. Afterwards, use the validation set to find the best