arff

Import ARFF dataset using RWeka in RStudio (dependency error: rJava)

痞子三分冷 submitted on 2020-01-24 22:13:09
Question: I am currently using R for Windows version 3.5.3 and RStudio version 1.2.1335. My goal is to import an ARFF dataset (.arff) using the RWeka package in order to do some association analysis, more specifically to apply the Apriori algorithm, one of the associators available in that package. The package requires some dependencies (RWekajars and rJava) and they

How to deal with data from arff file with python?

生来就可爱ヽ(ⅴ<●) submitted on 2020-01-24 12:42:21
Question: I am pretty new to Python. I am using Python to read an ARFF file:

    import arff
    for row in arff.load('cpu.arff'):
        x = row
        print(x)

Part of the sample output looks like this:

    <Row(125.0,256.0,6000.0,256.0,16.0,128.0,198.0)>
    <Row(29.0,8000.0,32000.0,32.0,8.0,32.0,269.0)>
    <Row(29.0,8000.0,32000.0,32.0,8.0,32.0,220.0)>
    <Row(29.0,8000.0,32000.0,32.0,8.0,32.0,172.0)>
    <Row(29.0,8000.0,16000.0,32.0,8.0,16.0,132.0)>
    <Row(26.0,8000.0,32000.0,64.0,8.0,32.0,318.0)>
    <Row(23.0,16000.0,32000.0,64.0

How to deal with data from arff file with python?

旧巷老猫 submitted on 2020-01-24 12:42:17
Question: I am pretty new to Python. I am using Python to read an ARFF file:

    import arff
    for row in arff.load('cpu.arff'):
        x = row
        print(x)

Part of the sample output looks like this:

    <Row(125.0,256.0,6000.0,256.0,16.0,128.0,198.0)>
    <Row(29.0,8000.0,32000.0,32.0,8.0,32.0,269.0)>
    <Row(29.0,8000.0,32000.0,32.0,8.0,32.0,220.0)>
    <Row(29.0,8000.0,32000.0,32.0,8.0,32.0,172.0)>
    <Row(29.0,8000.0,16000.0,32.0,8.0,16.0,132.0)>
    <Row(26.0,8000.0,32000.0,64.0,8.0,32.0,318.0)>
    <Row(23.0,16000.0,32000.0,64.0

How to use StringToWordVector (Weka) in Java?

核能气质少年 submitted on 2020-01-03 20:04:58
Question: This is my ARFF file:

    @relation hamspam
    @attribute text string
    @attribute class {ham,spam}
    @data
    'good',ham
    'very good',ham
    'bad',spam
    'very bad',spam
    'very bad, very bad',spam

What I want to do is classify it with a Weka classifier in my Java program, but I don't know how to use StringToWordVector and then classify it. This is my code:

    Classifier j48tree = new J48();
    Instances train = new Instances(new BufferedReader(new FileReader("data.arff")));
    StringToWordVector filter = new

"Train and test set are not compatible" error in Weka?

两盒软妹~` submitted on 2020-01-01 09:03:10
Question: I'm trying to test my model with a new dataset. I applied the same preprocessing steps that I used when building the model. I have compared the two files and found no differences: the attributes in the train and test datasets are in the same order, with the same names and data types. Still, I am not able to resolve the issue. The train and test files appear to match, but the Weka Explorer gives me an error saying "Train and test set are not compatible". How do I resolve this error? Is there
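When two ARFF files look identical by eye, a mechanical comparison of the headers can surface small differences, for example a nominal attribute whose values are declared in a different order. The sketch below is only an illustration, not Weka's own check: the helper names `arff_header` and `header_differences` are hypothetical, and it assumes simple attribute names without spaces or quotes.

```python
def arff_header(text):
    """Return the ordered (name, type) pairs declared in an ARFF header."""
    attrs = []
    for raw in text.splitlines():
        line = raw.strip()
        low = line.lower()
        if low.startswith('@data'):
            break  # the header ends where the data section starts
        if low.startswith('@attribute'):
            # Split into keyword, attribute name, and type declaration.
            _, name, decl = line.split(None, 2)
            decl = decl.strip()
            if decl.startswith('{') and decl.endswith('}'):
                # Nominal attribute: keep the declared values in order.
                values = tuple(v.strip() for v in decl[1:-1].split(','))
                attrs.append((name, ('nominal', values)))
            else:
                attrs.append((name, (decl.lower(), ())))
    return attrs


def header_differences(train_text, test_text):
    """List human-readable differences between two ARFF headers."""
    train, test = arff_header(train_text), arff_header(test_text)
    diffs = []
    if len(train) != len(test):
        diffs.append(f"attribute count differs: {len(train)} vs {len(test)}")
    for i, (a, b) in enumerate(zip(train, test)):
        if a != b:
            diffs.append(f"attribute {i} differs: {a} vs {b}")
    return diffs


# Hypothetical sample files: same attributes, but the class values are reordered.
train = "@relation r\n@attribute text string\n@attribute class {ham,spam}\n@data\n'good',ham\n"
test = "@relation r\n@attribute text string\n@attribute class {spam,ham}\n@data\n'bad',spam\n"
diffs = header_differences(train, test)
for d in diffs:
    print(d)
```

If the list comes back empty, the cause probably lies elsewhere; the sketch only compares attribute declarations, not the data rows.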

"Train and test set are not compatible" error in Weka?

谁说胖子不能爱 submitted on 2020-01-01 09:02:14
Question: I'm trying to test my model with a new dataset. I applied the same preprocessing steps that I used when building the model. I have compared the two files and found no differences: the attributes in the train and test datasets are in the same order, with the same names and data types. Still, I am not able to resolve the issue. The train and test files appear to match, but the Weka Explorer gives me an error saying "Train and test set are not compatible". How do I resolve this error? Is there

'generator' object has no attribute 'data': problems loading a file with scipy?

不羁的心 submitted on 2019-12-25 14:48:12
Question: I'm new to Python and I'm trying to load an .arff file. This is what I tried:

    import arff, numpy as np
    file1 = open('/Users/user/Desktop/example.arff')
    dataset = arff.load(file1)
    print dataset
    data = np.array(dataset.data)
    print data

The problem is the following output:

    data = np.array(dataset.data)
    AttributeError: 'generator' object has no attribute 'data'

Why is this happening, and how should I avoid it? This is the .arff: @relation foo @attribute width numeric @attribute height
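The error message says that `dataset` is a generator: it yields rows one at a time and has no `.data` attribute to read. A generator has to be consumed, for example with `list()`, before the rows can be used as a whole. The sketch below shows this with `load_rows`, a hypothetical stand-in written for this example (it is not the `arff` module's function), and with made-up data values, since the original file is cut off.

```python
import io


def load_rows(fileobj):
    """Hypothetical loader that yields numeric data rows lazily, like a generator-based API."""
    in_data = False
    for raw in fileobj:
        line = raw.strip()
        if not line or line.startswith('%'):
            continue  # skip blank lines and ARFF comments
        if in_data:
            yield [float(v) for v in line.split(',')]
        elif line.lower().startswith('@data'):
            in_data = True


# Header taken from the question; the two data rows are invented for illustration.
text = (
    "@relation foo\n"
    "@attribute width numeric\n"
    "@attribute height numeric\n"
    "@data\n"
    "5.0,3.25\n"
    "4.5,3.75\n"
)

dataset = load_rows(io.StringIO(text))
has_data_attr = hasattr(dataset, 'data')  # False: generators expose no .data
data = list(dataset)                      # consume the generator into a list of rows
print(has_data_attr)
print(data)
```

The resulting list of rows can be passed to `np.array(data)` if a NumPy array is needed. Since the title mentions scipy: `scipy.io.arff.loadarff` is a different loader that returns a `(data, meta)` tuple rather than a generator.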

Classifying unlabelled data in Weka

邮差的信 submitted on 2019-12-23 20:47:58
Question: I'm currently using various classifiers in Weka. My testing data is labelled, e.g.:

    @relation bmwreponses
    @attribute IncomeBracket {0,1,2,3,4,5,6,7}
    @attribute FirstPurchase numeric
    @attribute LastPurchase numeric
    @attribute responded {1,0}
    @data
    4,200210,200601,0
    5,200301,200601,1
    6,200411,200601,0
    5,199609,200603,0
    6,200310,200512,1
    ...

The last value in each row is the class attribute, i.e. responded. But if I try unlabelled test data, e.g.: @relation bmwreponses @attribute IncomeBracket {0,1,2
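In ARFF, a missing value is written as `?`. A common way to supply unlabelled instances is therefore to keep the class attribute in the header and put `?` in the class position of every row. The sketch below assumes that this is what the question needs; the function name `unlabelled_arff` is hypothetical, and the header is copied from the labelled example above.

```python
# Header copied from the labelled file, so both files declare the same attributes.
HEADER = """@relation bmwreponses
@attribute IncomeBracket {0,1,2,3,4,5,6,7}
@attribute FirstPurchase numeric
@attribute LastPurchase numeric
@attribute responded {1,0}
@data
"""


def unlabelled_arff(rows):
    """Build ARFF text where each row's class value (last column) is the missing marker '?'."""
    lines = [",".join(str(v) for v in row) + ",?" for row in rows]
    return HEADER + "\n".join(lines) + "\n"


# Feature values taken from the first two labelled rows, without their labels.
text = unlabelled_arff([(4, 200210, 200601), (5, 200301, 200601)])
print(text)
```

Because the header is unchanged, the unlabelled file keeps the same structure as the training data, and only the class column is left unknown.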

How to change attribute type to String (Weka - CSV to ARFF)

可紊 submitted on 2019-12-23 02:57:19
Question: I'm trying to make an SMS spam classifier using the Weka library. I have a CSV file with "label" and "text" headings. When I use the code below, it creates an ARFF file with two attributes:

    @attribute label {ham,spam}
    @attribute text {'Go until jurong point','Ok lar...', etc.}

Currently, the text attribute seems to be formatted as a nominal attribute with each message's text as a value. But I need the text attribute to be a String attribute, not a list of all of the text from all
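One way to sidestep the nominal conversion is to write the ARFF file directly and declare the text column as `string`. The sketch below is a Python illustration of that idea, not the Weka API the question is asking about. The function names, the relation name `sms`, and the third sample message are assumptions made for this example, and it assumes each message fits on one line.

```python
import csv
import io


def quote(value):
    """Wrap a value in single quotes, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def csv_to_arff(csv_file, relation="sms"):
    """Convert a CSV with 'label' and 'text' columns into ARFF text with a string attribute."""
    rows = list(csv.DictReader(csv_file))
    labels = sorted({row["label"] for row in rows})
    lines = [
        f"@relation {relation}",
        "@attribute label {" + ",".join(labels) + "}",
        "@attribute text string",  # declared as string, not as a nominal value list
        "@data",
    ]
    for row in rows:
        lines.append(f"{row['label']},{quote(row['text'])}")
    return "\n".join(lines) + "\n"


# Small in-memory CSV; the labels and the last message are invented for illustration.
sample = io.StringIO(
    'label,text\n'
    'ham,"Go until jurong point"\n'
    'ham,"Ok lar..."\n'
    'spam,"It\'s free"\n'
)
out = csv_to_arff(sample)
print(out)
```

Within Weka itself, the `NominalToString` filter is another route worth checking for converting an already-loaded nominal attribute.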

Too many attributes for ARFF format in Weka

风流意气都作罢 submitted on 2019-12-20 02:58:42
Question: I am working with a dataset that has more than 10,000 dimensions. To use Weka I need to convert a text file into ARFF format, but because there are so many attributes, the file is too large even with the sparse ARFF format. Is there a method, similar to the sparse format for data, that avoids writing so many attribute declarations in the header of the ARFF file? For example:

    @attribute A1 NUMERICAL
    @attribute A2 NUMERICAL
    ...
    ...
    @attribute A10000 NUMERICAL

Answer 1: I coded a script in AWK to format the following lines (in a TXT
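As far as I know, ARFF has no shorthand for declaring a range of attributes, so the header has to list each one. It can be generated by a script rather than typed by hand. The sketch below is a Python alternative to the AWK approach mentioned in the answer (whose details are cut off here), with hypothetical helper names. It uses the type keyword `numeric` and writes rows in sparse notation, which lists only non-zero values by 0-based attribute index.

```python
def numeric_header(relation, n_attributes, prefix="A"):
    """Generate an ARFF header declaring n numeric attributes named A1..An."""
    lines = [f"@relation {relation}"]
    lines += [f"@attribute {prefix}{i} numeric" for i in range(1, n_attributes + 1)]
    lines.append("@data")
    return "\n".join(lines)


def sparse_row(values):
    """Format one row in sparse ARFF notation: '{index value, ...}' for non-zero entries."""
    pairs = [f"{i} {v}" for i, v in enumerate(values) if v != 0]
    return "{" + ", ".join(pairs) + "}"


# Tiny demonstration with 5 attributes instead of 10,000.
header = numeric_header("big", 5)
row = sparse_row([0, 3, 0, 0, 1.5])
print(header)
print(row)
```

The header still grows linearly with the number of attributes, so the saving comes only from the sparse data rows.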