classification

BERT: get sentence-level embedding after fine-tuning

南楼画角 submitted on 2020-04-13 06:34:08
Question: I came across this page. 1) I would like to get the sentence-level embedding (the embedding given by the [CLS] token) after fine-tuning is done. How could I do it? 2) I also noticed that the code on that page takes a lot of time to return results on the test data. Why is that? When I trained the model it took less time than when I tried to get test predictions. From the code on that page, I didn't use the blocks below:

test_InputExamples = test.apply(lambda x: bert.run_classifier
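A common way to obtain the sentence-level embedding after fine-tuning is to take the encoder's hidden state at position 0, which is where the [CLS] token always sits. The question does not show which BERT library is in use, so the sketch below simulates the encoder output with numpy; only the indexing carries over: given a hidden-state tensor of shape (batch, seq_len, hidden), the sentence vector is hidden[:, 0, :].

```python
import numpy as np

# Stand-in for the output of a fine-tuned BERT encoder:
# shape (batch_size, sequence_length, hidden_size).
batch_size, seq_len, hidden_size = 2, 128, 768
hidden_states = np.random.rand(batch_size, seq_len, hidden_size)

# The [CLS] token is always the first token of the sequence, so the
# sentence-level embedding is the hidden state at position 0.
cls_embeddings = hidden_states[:, 0, :]

print(cls_embeddings.shape)  # (2, 768)
```

The same slice works whatever framework produced the tensor, as long as the [CLS] token was prepended during tokenization.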

How to efficiently connect features to LSTM model?

▼魔方 西西 submitted on 2020-04-07 08:30:49
Question: I have the following LSTM model where I input my time series to the LSTM layer. The other input (which goes through a dense layer) contains the 10 features I manually extracted from the time series.

input1 = Input(shape=(26,6))
x1 = LSTM(100)(input1)
input2 = Input(shape=(10,1))
x2 = Dense(50)(input2)
x = concatenate([x1,x2])
x = Dense(200)(x)
output = Dense(1, activation='sigmoid')(x)
model = Model(inputs=[input1,input2], outputs=output)

I thought that the performance of my model would hugely
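The merge itself is easiest to reason about in terms of shapes: the LSTM collapses the (26, 6) time series to a fixed 100-dimensional vector per sample, and the dense branch maps the hand-crafted features to 50 dimensions, so the concatenated representation is 150-dimensional. The numpy sketch below simulates the two branch outputs rather than running Keras, and assumes the feature input is a flat vector of shape (10,), so that Dense(50) yields one 50-dim vector per sample (with shape=(10,1) as written, Dense would instead produce a (10, 50) tensor that cannot be concatenated with the LSTM output).

```python
import numpy as np

batch = 4
# Simulated LSTM output: the (26, 6) time series is collapsed
# to one fixed-size vector of 100 units per sample.
x1 = np.random.rand(batch, 100)
# Simulated Dense(50) output on the 10 hand-crafted features,
# assuming the features enter as a flat (10,) vector per sample.
x2 = np.random.rand(batch, 50)

# Concatenation along the feature axis: 100 + 50 = 150 dims.
merged = np.concatenate([x1, x2], axis=-1)
print(merged.shape)  # (4, 150)
```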

How do I use a knn model for new data in R?

烂漫一生 submitted on 2020-03-25 19:27:52
Question: I've just written a knn model in R. However, I don't know how to use the output to predict new data.

# split into train (treino) and test (teste)
treino_index <- sample(seq_len(nrow(iris)), size = round(0.75*nrow(iris)))
treino <- iris[treino_index, ]
teste <- iris[-treino_index, ]
# take a look at the sample
head(treino)
head(teste)
# save species for later
treino_especie = treino$Species
teste_especie = teste$Species
# exclude species from train and test datasets
treino = treino[-5]
teste =
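The key point is that knn is a lazy learner: there is no fitted object to reuse, so "predicting new data" means running the same neighbour search again with the new observations in place of the test set (in R, another call to class::knn with the new rows as the test argument). The question's code is truncated before that call, so as a language-neutral illustration here is a minimal k-nearest-neighbours predictor in plain Python; every name in it is made up for the example.

```python
from collections import Counter

def knn_predict(train_X, train_y, new_point, k=3):
    """Predict the label of new_point by majority vote among the
    k nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, new_point)), label)
        for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy training data: two well-separated classes.
train_X = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
train_y = ["setosa", "setosa", "virginica", "virginica"]

# "New data" is simply another point passed to the same function.
print(knn_predict(train_X, train_y, (0.05, 0.1)))  # setosa
print(knn_predict(train_X, train_y, (5.1, 5.0)))   # virginica
```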

How to combine two LSTM layers with different input sizes in Keras?

纵然是瞬间 submitted on 2020-03-23 10:24:10
Question: I have two types of input sequences, where input1 contains 50 values and input2 contains 25 values. I tried to combine these two sequence types using an LSTM model in the functional API. However, since the lengths of my two input sequences are different, I am wondering whether what I am currently doing is the right way. My code is as follows:

input1 = Input(shape=(50,1))
x1 = LSTM(100)(input1)
input2 = Input(shape=(25,1))
x2 = LSTM(50)(input2)
x = concatenate([x1,x2])
x = Dense(200)(x)
output = Dense
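What makes this combination legitimate is that a recurrent layer's final state has a fixed size regardless of how many timesteps it consumes: LSTM(100) over 50 steps and LSTM(50) over 25 steps yield 100-dim and 50-dim vectors respectively, so the branches concatenate cleanly. The toy recurrence below (a stand-in, not a real LSTM cell) makes that length-independence concrete.

```python
import numpy as np

def last_state(seq, hidden_size, rng):
    """Toy recurrence: fold a (timesteps, features) sequence into a
    fixed-size hidden state, like an RNN returning its last state."""
    W = rng.standard_normal((seq.shape[1], hidden_size))
    U = rng.standard_normal((hidden_size, hidden_size))
    h = np.zeros(hidden_size)
    for x_t in seq:
        h = np.tanh(x_t @ W + h @ U)
    return h

rng = np.random.default_rng(0)
h1 = last_state(rng.standard_normal((50, 1)), 100, rng)  # 50 steps -> 100 dims
h2 = last_state(rng.standard_normal((25, 1)), 50, rng)   # 25 steps -> 50 dims

# Different sequence lengths, but fixed-size states: safe to concatenate.
merged = np.concatenate([h1, h2])
print(merged.shape)  # (150,)
```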

Error in eval(predvars, data, env) : object 'Rm' not found

风格不统一 submitted on 2020-03-19 04:01:08
Question:

dataset = read.csv('dataset/housing.header.binary.txt')
dataset1 = dataset[6] # highest positive correlation
dataset2 = dataset[13] # lowest negative correlation
dependentVal = dataset[14] # dependent value
new_dataset = cbind(dataset1, dataset2, dependentVal) # new matrix
# split dataset
# install.packages('caTools')
library(caTools)
set.seed(123) # this is needed to guarantee that every run will produce the same output
split = sample.split(new_dataset, SplitRatio = 0.75)
train_set = subset(new
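The usual cause of "object 'Rm' not found" is that the model formula references a column name (here Rm) that is not present under that name in the data frame handed to predict(): R resolves formula variables by name at prediction time, so selecting or rebinding columns in a way that drops or renames Rm breaks the lookup. The plain-Python sketch below mimics that name-based resolution; all names in it are illustrative, not a real R API.

```python
# Toy stand-in for formula-based prediction: the "model" remembers the
# column names it was trained with and looks them up by name at
# prediction time, as R's predict() resolves formula variables.
def make_model(coefs):
    def predict(row):
        # A KeyError here is the analogue of "object 'Rm' not found".
        return sum(coef * row[name] for name, coef in coefs.items())
    return predict

model = make_model({"Rm": 2.0, "Lstat": -1.0})

ok_row = {"Rm": 6.5, "Lstat": 4.0}
print(model(ok_row))  # 9.0

bad_row = {"V6": 6.5, "Lstat": 4.0}  # column renamed: 'Rm' is missing
try:
    model(bad_row)
except KeyError as e:
    print("missing column:", e)
```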

Apple turicreate always returns the same label

牧云@^-^@ submitted on 2020-03-01 05:12:38
Question: I'm test-driving turicreate to resolve a classification issue in which the data consists of 10-tuples (q,w,e,r,t,y,u,i,o,p,label), where 'q..p' is a sequence of characters (for now of 2 types, + and -), like this:

q,w,e,r,t,y,u,i,o,p,label
-,+,+,e,e,e,e,e,e,e,type2
+,+,e,e,e,e,e,e,e,e,type1
-,+,e,e,e,e,e,e,e,e,type2

'e' is just a padding character, so that vectors have a fixed length of 10. Note: the data is significantly tilted toward one label (90% of it), and the dataset is small, < 100 points. I use
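With 90% of fewer than 100 points carrying one label, the behaviour described is consistent with the classifier collapsing to the majority-class baseline: always answering the dominant label already scores 90% accuracy, so a model trained on so few examples has little incentive to learn the minority class. The baseline is easy to quantify in plain Python (the skewed dataset below is a made-up stand-in):

```python
from collections import Counter

# Toy stand-in for the skewed dataset: 90 type2, 10 type1.
labels = ["type2"] * 90 + ["type1"] * 10

# A "classifier" that always predicts the majority class.
majority = Counter(labels).most_common(1)[0][0]
predictions = [majority for _ in labels]

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(majority, accuracy)  # type2 0.9
```

Any real model has to beat this 0.9 baseline before it can be said to have learned anything about the minority label, which typically calls for class weighting, resampling, or simply more data.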

Spark Decision Tree

烂漫一生 submitted on 2020-02-29 14:39:35
Decision tree induction learns a decision tree from class-labeled training tuples. A decision tree is a flowchart-like tree structure in which each internal node denotes a test on an attribute, each branch represents one outcome of that test, each leaf node holds a class label, and the topmost node is the root.

1. Algorithm logic

D: the training tuples
attribute_list: the list of candidate attributes
attribute_selection_method: a procedure that selects the attribute that best discriminates the given tuples by class

The tree starts as a single node N representing the training tuples in D.
If the tuples in D are all of the same class, N becomes a leaf labeled with that class.
Otherwise, attribute_selection_method is called to determine how to split, with the goal of making each output partition as "pure" as possible.
For each outcome of the splitting criterion, a branch is grown from node N, and the tuples in D are partitioned accordingly. Suppose A is the splitting attribute; based on the training data, there are three possible cases:
(1) A is discrete-valued: the test outcomes at node N correspond directly to the known values of A, and a branch is created for each known value.
(2) A is continuous-valued: there are two possible outcomes, corresponding to the conditions A <= split and A > split.
(3) A is discrete-valued and a binary tree must be produced: the split condition is membership in a subset of A's values.
The algorithm then recurses on each resulting partition of D to grow the decision tree.

Termination conditions:
(1) All tuples in partition D belong to the same class.
(2) No remaining attributes can be used to further partition the tuples; majority voting is used.
Return the decision tree.

2. Attribute selection measures

An attribute selection measure is a heuristic for choosing the splitting criterion
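The attribute selection measure sketched above is commonly instantiated as information gain: pick the attribute whose split produces the largest drop in class entropy, i.e. the "purest" partitions. A short sketch computing entropy and information gain over a toy set of labeled tuples (the dataset is made up for illustration):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a class-label multiset, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr_index):
    """Entropy reduction from splitting on the attribute at attr_index."""
    n = len(rows)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder

# Toy training tuples: attribute 0 determines the class exactly,
# attribute 1 carries no information about the class.
rows = [("-", "+"), ("+", "+"), ("-", "-"), ("+", "-")]
labels = ["type2", "type1", "type2", "type1"]

print(info_gain(rows, labels, 0))  # 1.0  (perfect split)
print(info_gain(rows, labels, 1))  # 0.0  (uninformative)
```

The induction algorithm would choose attribute 0 here, since its partitions are pure; a gain of zero means the split leaves the class distribution unchanged.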