data-analysis

A question on cross-correlation & correlation coefficient [duplicate]

半腔热情 posted on 2019-12-03 03:21:35
Question: Possible Duplicate: Matlab Cross correlation vs Correlation Coefficient question. When I cross-correlate two data sets a and b (each 73 points long) in MATLAB and plot the result, it looks like a triangle with 145 points. I'm confused about how the correlation coefficient relates to this triangle-like graph of the cross-correlation output, which ranges between -1 and +1.
Answer 1: I seriously think you need to read up more on cross-correlation functions

How to find the closest word to a vector using word2vec

浪子不回头ぞ posted on 2019-12-03 02:28:36
I have just started using Word2vec and I was wondering how we can find the closest word to a given vector. I have this vector, which is the average of a set of vectors: array([-0.00449447, -0.00310097, 0.02421786, ...], dtype=float32) Is there a straightforward way to find the most similar word in my training data to this vector? Or is the only solution to calculate the cosine similarity between this vector and the vector of each word in my training data, then select the closest one? Thanks.
For the gensim implementation of word2vec there is a most_similar() function that lets you find
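The brute-force option described in the question can be sketched as a small cosine-similarity search. This is a minimal illustration in plain Python: the `toy_vectors` dictionary and the `closest_word` helper are made up for the example and stand in for a trained model's vocabulary. In gensim, `most_similar()` and `similar_by_vector()` run this kind of search against the model's own vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def closest_word(query, vectors):
    """Return the word whose vector has the highest cosine similarity to query."""
    return max(vectors, key=lambda word: cosine(query, vectors[word]))

# Hypothetical 3-dimensional "embeddings", for illustration only.
toy_vectors = {
    "king": [0.9, 0.1, 0.0],
    "queen": [0.8, 0.3, 0.1],
    "apple": [0.0, 0.2, 0.9],
}

# Average of two word vectors, as in the question.
average = [(a + b) / 2 for a, b in zip(toy_vectors["king"], toy_vectors["queen"])]

print(closest_word(average, toy_vectors))
print(closest_word([0.0, 0.0, 1.0], toy_vectors))
```

The scan is linear in the vocabulary size, which is usually acceptable for a single query; library implementations vectorize the same computation.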

A question on cross-correlation & correlation coefficient [duplicate]

余生长醉 posted on 2019-12-02 17:21:10
Possible Duplicate: Matlab Cross correlation vs Correlation Coefficient question. When I cross-correlate two data sets a and b (each 73 points long) in MATLAB and plot the result, it looks like a triangle with 145 points. I'm confused about how the correlation coefficient relates to this triangle-like graph of the cross-correlation output, which ranges between -1 and +1.
I seriously think you need to read up more on cross-correlation functions and the correlation coefficient from a statistics book, because your confusion here is more fundamental than related to MATLAB. Unless you know what you're dealing with, you
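The 145 points and the single coefficient can be related with a short numerical sketch. The Python below is an illustration on made-up sine-wave data, not the asker's data sets, and `xcorr_full` and `pearson` are hypothetical helper names. It assumes the inputs are mean-centered before normalizing; under that assumption the zero-lag value of the normalized cross-correlation equals the correlation coefficient, while the other 144 values describe shifted versions of the signals.

```python
import math

def xcorr_full(a, b):
    """Raw cross-correlation sum(a[i] * b[i - lag]) for every possible lag."""
    n, m = len(a), len(b)
    result = []
    for lag in range(-(m - 1), n):
        total = 0.0
        for i in range(n):
            j = i - lag
            if 0 <= j < m:
                total += a[i] * b[j]
        result.append(total)
    return result

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den

# Made-up data: two 73-point sine waves with a phase offset.
a = [math.sin(0.2 * i) for i in range(73)]
b = [math.sin(0.2 * i + 0.5) for i in range(73)]

# Two length-N inputs give 2N - 1 lags: 73 + 73 - 1 = 145.
print(len(xcorr_full(a, b)))

# Mean-center, then scale by the product of the vector norms.
a0 = [x - sum(a) / len(a) for x in a]
b0 = [y - sum(b) / len(b) for y in b]
scale = math.sqrt(sum(x * x for x in a0) * sum(y * y for y in b0))
xc_coeff = [v / scale for v in xcorr_full(a0, b0)]

zero_lag = xc_coeff[len(b) - 1]  # lag 0 sits in the middle of the output
print(zero_lag, pearson(a, b))
```

The triangle-like envelope comes from the overlap: at the extreme lags only one pair of samples overlaps, and the number of overlapping samples grows linearly toward lag 0.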

R and SPSS difference

那年仲夏 posted on 2019-12-02 14:17:57
I will shortly be analysing a vast amount of network-traffic data, and will pre-process the data before analysing it. I have found that R and SPSS are among the most popular tools for statistical analysis. I will also be generating quite a lot of graphs and charts. Therefore, I was wondering what the basic difference between these two tools is. I am not asking which one is better; I just want to know how the workflow differs between the two (besides the fact that SPSS has a GUI). I will be mostly working with scripts in either case anyway so I wanted to

mapping matching word count on a column using pandas in python

流过昼夜 posted on 2019-12-02 14:01:12
Question: I have a df,
Name     Step  Description
Ram      1     Ram is oNe of the good cricketer
Ram      2     gopal one
Sri      1     Sri is one of the member
Sri      2     ravi good
Kumar    1     Kumar is a keeper
Madhu    1     good boy
Vignesh  1     oNe little
Pechi    1     one book
mario    1     good randokm
Roger    1     one milita good
bala     1     looks good
raj      1     more one
venk     1     likes good
and a list, my_list=["one","good"]
I am trying to get the rows that contain at least one keyword from my_list. I tried, mask=df["Description"].str.contains("|".join(my_list),na

Plotting multiple segments with colors based on some variable with matplotlib

回眸只為那壹抹淺笑 posted on 2019-12-02 10:27:56
Following the answers to both topics, Matplotlib: Plotting numerous disconnected line segments with different colors and matplotlib: how to change data points color based on some variable, I am trying to plot a set of segments given by a list, for instance: data = [(-118, -118), (34.07, 34.16), (-117.99, -118.15), (34.07, 34.16), (-118, -117.98), (34.16, 34.07)] and I would like to plot each segment with a color based on a second list, for instance: color_param = [9, 2, 21], using a colormap. So far I am using this line to display the segments: plt.plot(*data) I was expecting that something like
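One common way to colour segments by a per-segment value is matplotlib's `LineCollection`, which maps one number per segment through a colormap. The sketch below assumes `data` alternates x pairs and y pairs, as the `plt.plot(*data)` call implies; the colormap name and the non-interactive backend are arbitrary choices made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection

data = [(-118, -118), (34.07, 34.16),
        (-117.99, -118.15), (34.07, 34.16),
        (-118, -117.98), (34.16, 34.07)]
color_param = [9, 2, 21]

# Split the alternating x/y pairs and build one [(x0, y0), (x1, y1)] list per segment.
xs, ys = data[0::2], data[1::2]
segments = [list(zip(x, y)) for x, y in zip(xs, ys)]

fig, ax = plt.subplots()
lc = LineCollection(segments, cmap="viridis")
lc.set_array(np.asarray(color_param, dtype=float))  # one value per segment
ax.add_collection(lc)
ax.autoscale()  # collections do not rescale the axes on their own
fig.colorbar(lc, ax=ax, label="color_param")
# Call fig.savefig(...) or switch to an interactive backend to view the result.
```

By default the colormap is normalized to the range of `color_param`, so here 2 maps to one end of the colormap and 21 to the other.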

how to combine two bar chart of two files in one diagram in matplotlib pandas

烂漫一生 posted on 2019-12-02 07:20:56
Question: I have two dataframes with the same columns but different content. I have plotted the dffinal dataframe. Now I want to plot another dataframe, dffinal_no, on the same diagram so that the two can be compared: for example, one bar chart in blue and the same bar chart in another colour, differing only in the y-axis values. This is the part of the code in which I plotted the first dataframe. dffinal = df[['6month','final-formula','numPatients6month']].drop_duplicates().sort_values(['6month']) ax=dffinal.plot

mapping matching word count on a column using pandas in python

江枫思渺然 posted on 2019-12-02 05:15:57
I have a df,
Name     Step  Description
Ram      1     Ram is oNe of the good cricketer
Ram      2     gopal one
Sri      1     Sri is one of the member
Sri      2     ravi good
Kumar    1     Kumar is a keeper
Madhu    1     good boy
Vignesh  1     oNe little
Pechi    1     one book
mario    1     good randokm
Roger    1     one milita good
bala     1     looks good
raj      1     more one
venk     1     likes good
and a list, my_list=["one","good"]
I am trying to get the rows that contain at least one keyword from my_list. I tried, mask=df["Description"].str.contains("|".join(my_list),na=False)
I am getting the output_df,
Name  Description
Ram   Ram is one of the good cricketer
Sri   Sri is one of
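Going by the title, the goal appears to be a per-row count of matching keywords; that reading is an assumption, since the question is cut off. A minimal sketch on a subset of the rows, matching case-insensitively so that "oNe" counts as "one" (the `Count` column name is made up for the example):

```python
import re

import pandas as pd

df = pd.DataFrame({
    "Name": ["Ram", "Ram", "Kumar", "Vignesh", "Roger"],
    "Step": [1, 2, 1, 1, 1],
    "Description": [
        "Ram is oNe of the good cricketer",
        "gopal one",
        "Kumar is a keeper",
        "oNe little",
        "one milita good",
    ],
})
my_list = ["one", "good"]

# Escape each keyword so regex metacharacters are matched literally.
pattern = "|".join(re.escape(word) for word in my_list)

# Rows that contain at least one keyword, ignoring case.
mask = df["Description"].str.contains(pattern, case=False, na=False)

# Number of keyword occurrences in each description, ignoring case.
df["Count"] = df["Description"].str.count(pattern, flags=re.IGNORECASE)

output_df = df[mask]
print(output_df)
```

This matches substrings, so a word such as "money" would also count as a hit for "one"; wrapping each keyword in `\b` word boundaries would restrict it to whole words.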

how to append two or more dataframes in pandas and do some analysis

落爺英雄遲暮 posted on 2019-12-02 02:37:16
I have 3 df's:
df1=pd.DataFrame({"Name":["one","two","three"],"value":[4,5,6]})
df2=pd.DataFrame({"Name":["four","one","three"],"value":[8,6,2]})
df3=pd.DataFrame({"Name":["one","four","six"],"value":[1,1,1]})
I can append them one by one, but I want to append all three dataframes at once and do some analysis. For each name, I am trying to compute the number of dataframes that contain it divided by the total number of dataframes (dataframes containing the name / total dataframes). My desired output is,
Name   value  Count
one    11     1
two    5      0.333
three  8      0.666
four   9      0.666
six    1      0.333
Please help, thanks in advance!
Use: first concat
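One way to get from the three frames to the desired table is sketched below. It is not necessarily the approach the truncated answer goes on to describe, and it assumes each name appears at most once per dataframe, so that the number of rows per name equals the number of dataframes containing it.

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["one", "two", "three"], "value": [4, 5, 6]})
df2 = pd.DataFrame({"Name": ["four", "one", "three"], "value": [8, 6, 2]})
df3 = pd.DataFrame({"Name": ["one", "four", "six"], "value": [1, 1, 1]})

frames = [df1, df2, df3]

# Stack all frames at once instead of appending them one by one.
combined = pd.concat(frames, ignore_index=True)

# Per name: total value and number of rows; sort=False keeps first-seen order.
result = (
    combined.groupby("Name", sort=False)["value"]
    .agg(["sum", "size"])
    .rename(columns={"sum": "value", "size": "Count"})
    .reset_index()
)

# Turn the row count into the share of dataframes that contain the name.
result["Count"] = result["Count"] / len(frames)
print(result)
```

If a name could repeat within one dataframe, dropping duplicate names per frame before the `concat` would keep the share correct.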

Replace each element equal to zero of a matrix with the corresponding element of the row above

时间秒杀一切 posted on 2019-12-02 01:42:51
Question: I'm using R. I have a matrix and I want to replace each element equal to zero with the corresponding element of the row above. For example, I created the following matrix: AA <- matrix(c(1,2,3,1,4,5,1,0,2), ncol=3, nrow=3)
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    4    0
[3,]    3    5    2
I want to replace the 0 with the element AA[1,3]. I would like a function capable of doing this for each element of a matrix.
Answer 1: We could find the row/column index of elements that are 0 in the matrix ('i1'), then extract the
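The replacement rule can also be sketched in Python with NumPy (not R, and not the answer's own code): walk the rows from top to bottom and, wherever the current row holds a zero, copy the value from the row above. Leaving zeros in the first row untouched is an assumption, since they have no row above, and `fill_zeros_from_above` is a hypothetical helper name.

```python
import numpy as np

def fill_zeros_from_above(matrix):
    """Return a copy where each zero is replaced by the value in the row above."""
    result = np.array(matrix, copy=True)
    # Going top to bottom means a run of zeros in a column
    # inherits the last non-zero value above it.
    for r in range(1, result.shape[0]):
        zero_cols = result[r] == 0
        result[r, zero_cols] = result[r - 1, zero_cols]
    return result

# Same matrix as AA in the question.
AA = np.array([[1, 1, 1],
               [2, 4, 0],
               [3, 5, 2]])

filled = fill_zeros_from_above(AA)
print(filled)
```

The function works on a copy, so the original matrix is left unchanged.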