I am using InstanceQuery (SQL queries) to construct my Instances. But my query results do not always come in the same order, as is normal in SQL. Because of this Instance
I solved a similar problem with the Add filter, which allows adding attributes to Instances. You need to add a correct Attribute with the proper list of values to both datasets (in my case, to the test dataset only):
Load train and test data:
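import java.io.File;
import weka.core.Instances;
import weka.core.converters.CSVLoader;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Add;

/* "train" contains labels and data */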
/* "test" contains data only */
CSVLoader csvLoader = new CSVLoader();
csvLoader.setFile(new File(trainFile));
Instances training = csvLoader.getDataSet();
csvLoader.reset();
csvLoader.setFile(new File(predictFile));
Instances test = csvLoader.getDataSet();
Set a new attribute with the Add filter:
Add add = new Add();
/* the name of the attribute must be the same as in "train"*/
add.setAttributeName(training.attribute(0).name());
/* getValues returns a String with comma-separated values of the attribute */
add.setNominalLabels(getValues(training.attribute(0)));
/* put the new attribute in the 1st position, the same as in "train" */
add.setAttributeIndex("1");
add.setInputFormat(test);
/* result: a dataset compatible with "train" */
test = Filter.useFilter(test, add);
As a result, the headers of "train" and "test" are the same (compatible for Weka machine learning).
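
The getValues helper used above is not shown in the answer. A minimal sketch of what it might do is below, assuming it only needs to join the attribute's labels into one comma-separated String in their original order. The class name LabelJoiner and the method joinLabels are hypothetical names, and the sketch takes a plain java.util.Enumeration so it compiles without Weka on the classpath.

```java
import java.util.Enumeration;
import java.util.StringJoiner;

/* Hypothetical stand-in for the getValues helper; not part of Weka. */
public class LabelJoiner {

    /* Joins the labels into "a,b,c", keeping the order of the enumeration. */
    public static String joinLabels(Enumeration<?> labels) {
        StringJoiner joiner = new StringJoiner(",");
        while (labels.hasMoreElements()) {
            joiner.add(String.valueOf(labels.nextElement()));
        }
        return joiner.toString();
    }
}
```

With Weka available, the call would presumably look like add.setNominalLabels(LabelJoiner.joinLabels(training.attribute(0).enumerateValues())). Keeping the label order identical to "train" matters, because nominal values are stored by index. Labels that themselves contain commas would likely need extra handling that this sketch does not attempt.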