prediction

Incorporating user feedback in a ML model

故事扮演 posted on 2019-12-03 05:13:37
Question: I have developed an ML model for a binary (0/1) NLP classification task and deployed it in a production environment. The model's prediction is displayed to users, who can give feedback on whether the prediction was right or wrong. How can I continuously incorporate this feedback into my model? From a UX standpoint, you don't want a user to have to correct or teach the system more than two or three times for a specific input, so the system should learn quickly, i.e. the feedback should be incorporated fast.
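
One possible way to absorb feedback quickly is online learning, where each feedback event triggers a single incremental update instead of a full retrain. The sketch below is not from the original post: it is a minimal pure-Python logistic regression over hashed bag-of-words features, and the names `featurize` and `OnlineLogisticRegression`, the learning rate, and the sample text are all illustrative assumptions. scikit-learn estimators that expose `partial_fit` (for example `SGDClassifier`) follow the same incremental pattern.

```python
import math
import zlib
from collections import defaultdict


def featurize(text):
    """Bag-of-words counts keyed by a stable hash of each token (hypothetical helper)."""
    counts = defaultdict(float)
    for token in text.lower().split():
        counts[zlib.crc32(token.encode("utf-8"))] += 1.0
    return counts


class OnlineLogisticRegression:
    """Binary classifier that is updated one labelled example at a time."""

    def __init__(self, learning_rate=0.5):
        self.weights = defaultdict(float)
        self.bias = 0.0
        self.learning_rate = learning_rate  # assumed value, tune for real data

    def predict_proba(self, features):
        z = self.bias + sum(self.weights[k] * v for k, v in features.items())
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, features):
        return int(self.predict_proba(features) >= 0.5)

    def update(self, features, label):
        """Take one gradient step on the log loss for a single feedback event."""
        error = self.predict_proba(features) - label
        for k, v in features.items():
            self.weights[k] -= self.learning_rate * error * v
        self.bias -= self.learning_rate * error


model = OnlineLogisticRegression()
x = featurize("please cancel my subscription")
before = model.predict(x)         # untrained model outputs probability 0.5
corrected_label = 1 - before      # the user reports that the prediction was wrong
model.update(x, corrected_label)  # incorporate the feedback immediately
after = model.predict(x)
```

A larger learning rate makes a single correction stick faster for that input, at the cost of noisier behaviour on other inputs; in practice feedback is often also logged so the model can be periodically retrained in full.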

Predicting missing values with scikit-learn's Imputer module

我是研究僧i posted on 2019-12-03 05:05:44
I am writing a very basic program to predict missing values in a dataset using scikit-learn's Imputer class. I have made a NumPy array, created an Imputer object with strategy='mean', and called fit_transform() on the NumPy array. When I print the array after calling fit_transform(), the 'NaN' entries remain, and I don't get any prediction. What am I doing wrong here? How do I go about predicting the missing values? import numpy as np from sklearn.preprocessing import Imputer X = np.array([[23.56],[53.45],['NaN'],[44.44],[77.78],['NaN'],[234.44],[11.33],[79.87]]) print X imp = Imputer(missing
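
As a hedged illustration of what mean imputation does, the NumPy-only sketch below fills each missing entry with its column mean. It assumes the missing entries are encoded as `np.nan` rather than the string 'NaN', and it is not the poster's code. Note that the result is a new array and the input is left unchanged; scikit-learn's imputers behave the same way by default, so the return value of `fit_transform()` has to be captured. In current scikit-learn releases the `Imputer` class has been replaced by `SimpleImputer` in `sklearn.impute`.

```python
import numpy as np

# Same values as in the question, with missing entries as np.nan (assumption).
X = np.array([[23.56], [53.45], [np.nan], [44.44], [77.78],
              [np.nan], [234.44], [11.33], [79.87]])

# Column means computed while ignoring the missing entries.
col_means = np.nanmean(X, axis=0)

# Build a new array in which every NaN is replaced by its column mean.
X_imputed = np.where(np.isnan(X), col_means, X)

print(X_imputed)
```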

How to train a RNN with LSTM cells for time series prediction

故事扮演 posted on 2019-12-03 03:47:01
Question: I'm currently trying to build a simple model for predicting time series. The goal is to train the model on a sequence so that it can predict future values. I'm using TensorFlow and LSTM cells to do so. The model is trained with truncated backpropagation through time. My question is how to structure the data for training. For example, let's assume we want to learn the given sequence: [1,2,3,4,5,6,7,8,9,10,11,...] and we unroll the network for num_steps=4. Option 1 input
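
The excerpt is cut off before its options are listed, so the following is only a sketch of two common layouts for next-step prediction, not a reconstruction of the post. The helper `make_windows` and its `stride` parameter are hypothetical: `stride=1` produces overlapping windows, while `stride=num_steps` produces consecutive non-overlapping chunks, which is the layout usually paired with carrying the LSTM state from one chunk to the next.

```python
import numpy as np


def make_windows(seq, num_steps, stride=1):
    """Cut a 1-D sequence into (input, target) windows for next-step prediction.

    Each target window is the input window shifted forward by one element.
    """
    seq = np.asarray(seq)
    starts = range(0, len(seq) - num_steps, stride)
    inputs = np.stack([seq[s:s + num_steps] for s in starts])
    targets = np.stack([seq[s + 1:s + num_steps + 1] for s in starts])
    return inputs, targets


sequence = np.arange(1, 12)  # [1, 2, ..., 11]

# Overlapping windows: every position starts a new training example.
overlap_x, overlap_y = make_windows(sequence, num_steps=4, stride=1)

# Non-overlapping chunks: each chunk continues where the previous one ended.
chunk_x, chunk_y = make_windows(sequence, num_steps=4, stride=4)

print(chunk_x)
print(chunk_y)
```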

Prediction: Time-series prediction of future events using SVR module

Anonymous (unverified) posted on 2019-12-03 01:58:03
Question: I want to perform time-series prediction of future events using the SVR module from scikit-learn. Here is the source code I am trying to work with: import csv import numpy as np from sklearn.svm import SVR import matplotlib.pyplot as plt plt.switch_backend('newbackend') seq_num=[] win=[] def get_data(filename): with open(filename, 'r') as csvfile: csvFileReader = csv.reader(csvfile) next(csvFileReader) # skipping column names for row in csvFileReader: seq_num.append(int(row[0]) win.append(int(row[6])) return def predict_win(X, y, x): win = np

Prediction failed: contents must be scalar

Anonymous (unverified) posted on 2019-12-03 00:56:02
Question: I have successfully trained, exported and uploaded my 'retrained_graph.pb' to ML Engine. My export script is as follows: import tensorflow as tf from tensorflow.python.saved_model import signature_constants from tensorflow.python.saved_model import tag_constants from tensorflow.python.saved_model import builder as saved_model_builder input_graph = 'retrained_graph.pb' saved_model_dir = 'my_model' with tf.Graph().as_default() as graph: # Read in the export graph with tf.gfile.FastGFile(input_graph, 'rb') as f: graph_def = tf.GraphDef() graph

What does negative %IncMSE in RandomForest package mean?

ぐ巨炮叔叔 posted on 2019-12-03 00:06:53
I used RandomForest for a regression problem. I used importance(rf,type=1) to get the %IncMSE for the variables, and one of them has a negative %IncMSE. Does this mean that this variable is bad for the model? I searched the Internet for answers but didn't find a clear one. I also found something strange in the model's summary (attached below): it seems that only one tree was used although I set ntree to 800. model: rf<-randomForest(var1~va2+var3+..+var35,data=d7depo,ntree=800,keep.forest=FALSE, importance=TRUE) summary(rf) Length Class Mode call 6 -none- call type 1 -none-
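
To make the idea behind a permutation-based importance score concrete, here is a simplified analogue in Python. It is not the randomForest package's exact computation (which works on out-of-bag samples and, by default, scales the result); the synthetic data, the least-squares model, and all variable names are assumptions made for illustration. The score for a variable is the change in held-out MSE after shuffling that variable. A value near zero or below zero means shuffling did not hurt (or happened to help), which is usually read as the variable carrying little or no predictive signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): only column 0 influences the target.
n = 400
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=n)

train, test = slice(0, 300), slice(300, None)

# Ordinary least squares with an intercept, fitted on the training split.
A_train = np.column_stack([X[train], np.ones(300)])
coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)


def mse(X_part, y_part):
    """Mean squared error of the fitted linear model on the given rows."""
    A = np.column_stack([X_part, np.ones(len(X_part))])
    return float(np.mean((A @ coef - y_part) ** 2))


baseline = mse(X[test], y[test])

# Percent increase in held-out MSE after permuting one column at a time.
pct_increase = []
for j in range(X.shape[1]):
    X_perm = X[test].copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    pct_increase.append(100.0 * (mse(X_perm, y[test]) - baseline) / baseline)

print(pct_increase)
```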

Explain onehotencoder using python

痞子三分冷 posted on 2019-12-03 00:03:01
I am new to the scikit-learn library and have been trying to use it to predict stock prices. I was going through its documentation and got stuck at the part where it explains OneHotEncoder(). Here is the code used there: >>> from sklearn.preprocessing import OneHotEncoder >>> enc = OneHotEncoder() >>> enc.fit([[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]]) OneHotEncoder(categorical_features='all', dtype=<... 'numpy.float64'>, handle_unknown='error', n_values='auto', sparse=True) >>> enc.n_values_ array([2, 3, 4]) >>> enc.feature_indices_ array([0, 2, 5, 9]) >>> enc.transform
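
The relationship between the quoted `n_values_` and `feature_indices_` arrays and the encoded output can be reproduced by hand. The NumPy-only sketch below is an illustration, not scikit-learn's implementation, and `one_hot_row` is a hypothetical helper. It assumes each feature takes integer values from 0 up to its maximum in the training data, which is how the legacy `n_values='auto'` setting behaved. Recent scikit-learn releases no longer expose these two attributes and describe the fitted encoder through `categories_` instead.

```python
import numpy as np

# Training data from the quoted documentation example.
X = np.array([[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]])

# Number of distinct values per feature: max value + 1.
n_values = X.max(axis=0) + 1

# Start offset of each feature's block of output columns.
feature_indices = np.concatenate(([0], np.cumsum(n_values)))


def one_hot_row(row):
    """Encode one sample: set a single 1 inside each feature's column block."""
    out = np.zeros(feature_indices[-1])
    for j, value in enumerate(row):
        out[feature_indices[j] + value] = 1.0
    return out


encoded = one_hot_row([0, 1, 1])
print(n_values, feature_indices, encoded)
```

Feature 0 occupies output columns 0 to 1, feature 1 occupies columns 2 to 4, and feature 2 occupies columns 5 to 8, so the sample [0, 1, 1] sets columns 0, 3 and 6.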

difference between speculation and prediction

倾然丶 夕夏残阳落幕 posted on 2019-12-02 23:10:58
In computer architecture, what is the difference between (branch) prediction and speculation? They seem very similar, but I think there is a subtle distinction between them. Guffa: Branch prediction is done by the processor to try to determine where execution will continue after a conditional jump, so that it can read the next instruction(s) from memory. Speculative execution goes one step further and determines what the result would be from executing the next instruction(s). If the branch prediction was correct, the result is used; otherwise it is discarded. Note that speculative execution
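
The prediction half of this distinction can be illustrated with a small simulation. The sketch below models a two-bit saturating counter, a classic textbook prediction scheme that the answer itself does not mention; the function name and the loop-shaped branch history are assumptions. It only guesses the branch direction. Speculative execution would be the separate step of actually running instructions along the guessed path and discarding their results on a misprediction.

```python
def simulate_two_bit_predictor(outcomes, state=2):
    """Count correct guesses of a two-bit saturating counter.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    Each actual outcome moves the counter one step toward that outcome.
    """
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


# A loop branch: taken 9 times, then falls through once; the loop runs 3 times.
loop_branch = ([True] * 9 + [False]) * 3
hits = simulate_two_bit_predictor(loop_branch)
print(hits, "of", len(loop_branch))  # 27 of 30
```

Only the three loop exits are mispredicted here, because a single "not taken" outcome moves the counter from state 3 to state 2, which still predicts "taken" when the loop is entered again.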

Assessing and Improving Prediction and Classification (free download, no points required)

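Anonymous (unverified) posted on 2019-12-02 23:04:42
Book description: Assess the quality of prediction and classification models in ways that accurately reflect their real-world performance, then improve that performance using state-of-the-art algorithms such as committee-based decision making, resampling the dataset, and boosting. This book presents many important techniques for building powerful models and for quantifying their expected behavior when put to use in an application. Considerable attention is given to information theory, especially as it relates to discovering and exploiting relationships between the variables used by a model. The presentation of this often confusing subject avoids advanced mathematics and focuses instead on concepts that are easily understood by readers with a modest background in mathematics. All algorithms include an intuitive explanation of their operation, the essential equations, references to more rigorous theory, and commented C++ source code. Many of these techniques are recent developments that are still not in widespread use. Others are standard algorithms given a fresh look. In every case, the emphasis is on practical applicability, and all code is written so that it can easily be included in any program. What you will learn: compute entropy to detect problematic predictors; improve numeric predictions using constrained and unconstrained combinations, variance-weighted interpolation, and kernel-regression smoothing; carry out classification decisions using Borda counts, MinMax and MaxMin rules, union and intersection rules, logistic regression, selection by local accuracy, maximization of the fuzzy integral, and pairwise coupling; use information-theoretic techniques to rapidly screen large numbers of candidate predictors and identify those that are especially promising; use Monte Carlo permutation methods to assess the role of good luck in performance results; compute confidence and tolerance intervals for predictions, as well as confidence levels for classification decisions. Who this book is for: anyone who creates prediction or classification models will find a wealth of useful algorithms in this book. Although all code examples are written in C++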

Non-linear multivariate time-series response prediction using RNN

对着背影说爱祢 posted on 2019-12-02 18:39:21
I am trying to predict the hygrothermal response of a wall, given the interior and exterior climate. Based on literature research, I believe this should be possible with RNN but I have not been able to get good accuracy. The dataset has 12 input features (time-series of exterior and interior climate data) and 10 output features (time-series of hygrothermal response), both containing hourly values for 10 years. This data was created with hygrothermal simulation software, there is no missing data. Dataset features: Dataset targets: Unlike most time-series prediction problems, I want to predict