data-analysis

Fourier transform with python

被刻印的时光 ゝ submitted on 2019-12-23 01:50:13
Question: I have a set of data that obviously has some periodic nature. I want to find its frequency using the Fourier transform and plot the result. Here is my attempt, but the result does not look good. This is the corresponding code; I don't know why it fails:

    import numpy
    from pylab import *
    from scipy.fftpack import fft,fftfreq
    import matplotlib.pyplot as plt
    dataset = numpy.genfromtxt(fname='data.txt',skip_header=1)
    t = dataset[:,0]
    signal = dataset[:,1]
    npts=len(t)
    FFT = abs(fft
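One way the frequency search described above could be approached is sketched here. The file data.txt is not available, so the sketch uses a synthetic signal; the sample rate, the 5 Hz tone, and all variable names are assumptions for illustration. It also assumes uniformly spaced samples and removes the mean first, since a constant offset otherwise dominates the zero-frequency bin.

```python
import numpy as np

# Synthetic stand-in for the two columns of data.txt (assumption: uniform sampling).
sample_rate = 100.0                          # samples per second (hypothetical)
t = np.arange(1000) / sample_rate            # 10 seconds of time stamps
signal = 2.0 + np.sin(2 * np.pi * 5.0 * t)   # 5 Hz tone on a constant offset

# Remove the mean so the zero-frequency bin does not dominate the spectrum.
centered = signal - signal.mean()

# FFT for real input, plus the matching frequency axis in Hz.
spectrum = np.abs(np.fft.rfft(centered))
freqs = np.fft.rfftfreq(len(centered), d=t[1] - t[0])

# The largest peak marks the dominant frequency.
dominant = freqs[np.argmax(spectrum)]
print(f"dominant frequency: {dominant:.2f} Hz")
```

Plotting `freqs` against `spectrum` (for example with `matplotlib.pyplot.plot`) then shows the peak directly.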

how to merge two dataframes based on a column in pandas [duplicate]

帅比萌擦擦* submitted on 2019-12-22 12:24:07
Question: This question already has answers here: Pandas Merging 101 (2 answers). Closed last year.

I have two data frames:

    df1=pd.DataFrame({"Req":["Req 1","Req 2","Req 3"],"Count":[1,2,1]})

         Req  Count
    0  Req 1      1
    1  Req 2      2
    2  Req 3      1

    df2=pd.DataFrame({"Req":["Req 1","Req 2"],"Count":[0,1]})

         Req  Count
    0  Req 1      0
    1  Req 2      1

I am trying to merge these data frames on the "Req" column. My desired output is:

      Req  total  from_1  from_2
    Req 1      1       1       0
    Req 2      3       2       1
    Req 3      1       1       0

I tried pd.merge(df1, df2, on = "Req", ) but it is
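A possible way to reach the desired output is sketched below, using the two frames from the question. Treating a request that is absent from one frame as a zero count is an assumption, as is the choice of an outer join.

```python
import pandas as pd

df1 = pd.DataFrame({"Req": ["Req 1", "Req 2", "Req 3"], "Count": [1, 2, 1]})
df2 = pd.DataFrame({"Req": ["Req 1", "Req 2"], "Count": [0, 1]})

# An outer join keeps "Req 3", which exists only in df1.
# The suffixes distinguish the two Count columns after the merge.
merged = pd.merge(df1, df2, on="Req", how="outer", suffixes=("_1", "_2"))
merged = merged.rename(columns={"Count_1": "from_1", "Count_2": "from_2"})

# Rows missing from one frame come back as NaN; treat them as zero counts (assumption).
merged[["from_1", "from_2"]] = merged[["from_1", "from_2"]].fillna(0).astype(int)
merged["total"] = merged["from_1"] + merged["from_2"]

result = merged[["Req", "total", "from_1", "from_2"]]
print(result)
```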

Python : How to use Multinomial Logistic Regression using SKlearn

风格不统一 submitted on 2019-12-22 04:36:08
Question: I have a test dataset and a training dataset as shown below. I have provided sample data with a minimal number of records, but my real data has thousands of records. Here E is my target variable, which I need to predict using an algorithm. It takes only four categories: 1, 2, 3, 4.

Training Dataset:

     A   B   C   D  E
     1  20  30   1  1
     2  22  12  33  2
     3  45  65  77  3
    12  43  55  65  4
    11  25  30   1  1
    22  23  19  31  2
    31  41  11  70  3
     1  48  23  60  4

Test Dataset: A B C D E 11 21 12 11 1 2 3 4 5 6 7 8 99 87 65 34 11 21 24
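A minimal sketch of a multiclass fit with scikit-learn follows, using the eight training rows from the question. Two things are assumptions: the test rows are read as four feature values each (A to D) with E left to predict, and the features are standardized before fitting. In recent scikit-learn versions, `LogisticRegression` with its default `lbfgs` solver fits a multinomial model when the target has more than two classes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Training rows from the question: features A-D, target E in the last column.
train = np.array([
    [1, 20, 30, 1, 1],
    [2, 22, 12, 33, 2],
    [3, 45, 65, 77, 3],
    [12, 43, 55, 65, 4],
    [11, 25, 30, 1, 1],
    [22, 23, 19, 31, 2],
    [31, 41, 11, 70, 3],
    [1, 48, 23, 60, 4],
])
X_train, y_train = train[:, :4], train[:, 4]

# Assumption: each test row carries only the four feature columns.
X_test = np.array([
    [11, 21, 12, 11],
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [99, 87, 65, 34],
])

# Scaling helps the solver converge; the classifier handles the four classes jointly.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

pred = model.predict(X_test)          # predicted class per test row
proba = model.predict_proba(X_test)   # one probability per class per row
print(pred)
```

With only eight training rows the predictions are not meaningful; the point is the shape of the workflow.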

Python : How to use Multinomial Logistic Regression using SKlearn

本秂侑毒 submitted on 2019-12-22 04:36:06
Question: I have a test dataset and a training dataset as shown below. I have provided sample data with a minimal number of records, but my real data has thousands of records. Here E is my target variable, which I need to predict using an algorithm. It takes only four categories: 1, 2, 3, 4.

Training Dataset:

     A   B   C   D  E
     1  20  30   1  1
     2  22  12  33  2
     3  45  65  77  3
    12  43  55  65  4
    11  25  30   1  1
    22  23  19  31  2
    31  41  11  70  3
     1  48  23  60  4

Test Dataset: A B C D E 11 21 12 11 1 2 3 4 5 6 7 8 99 87 65 34 11 21 24

How do you deal with missing data using numpy/scipy?

孤街醉人 submitted on 2019-12-22 03:25:32
Question: One of the things I deal with most in data cleaning is missing values. R deals with this well using its "NA" missing data label. In Python, it appears that I'll have to deal with masked arrays, which seem to be a major pain to set up and don't seem to be well documented. Any suggestions on making this process easier in Python? This is becoming a deal-breaker in moving into Python for data analysis. Thanks.

Update: It's obviously been a while since I've looked at the methods in the numpy.ma
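As a rough illustration of two NumPy options for missing values, the sketch below marks gaps with `np.nan` and then handles them either with the NaN-aware reductions or with a masked array built by `np.ma.masked_invalid`. The sample values are made up, and filling gaps with the mean is just one possible choice.

```python
import numpy as np

# Made-up sample where np.nan plays the role of R's NA.
values = np.array([1.0, np.nan, 3.0, 4.0, np.nan])

# Option 1: boolean mask plus NaN-aware reductions.
missing = np.isnan(values)
mean_ignoring_nan = np.nanmean(values)

# Option 2: a masked array that hides the invalid entries.
masked = np.ma.masked_invalid(values)
masked_mean = masked.mean()

# Replace the gaps with the mean of the observed values (one possible imputation).
filled = np.where(missing, mean_ignoring_nan, values)
print(mean_ignoring_nan, masked_mean, filled)
```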

How do you deal with missing data using numpy/scipy?

此生再无相见时 submitted on 2019-12-22 03:25:06
Question: One of the things I deal with most in data cleaning is missing values. R deals with this well using its "NA" missing data label. In Python, it appears that I'll have to deal with masked arrays, which seem to be a major pain to set up and don't seem to be well documented. Any suggestions on making this process easier in Python? This is becoming a deal-breaker in moving into Python for data analysis. Thanks.

Update: It's obviously been a while since I've looked at the methods in the numpy.ma

How to convert rows values in dataframe to columns labels in Python after groupby?

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-21 23:12:36
Question: I have a specific case where I want to convert this df:

    print df

       Schoolname    Attribute  Value
    0  xyz School         Safe   3.44
    1  xyz School  Cleanliness   2.34
    2  xyz School        Money   4.65
    3  abc School         Safe   4.40
    4  abc School  Cleanliness   4.50
    5  abc School        Money   4.90
    6  lmn School         Safe   2.34
    7  lmn School  Cleanliness   3.89
    8  lmn School        Money   4.65

I need to get it into the following format so that I can convert it to a numpy array for linear regression modelling.

required_df: Schoolname Safe Cleanliness Money 0 xyz School 3.44 2
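The long-to-wide reshaping described above can be sketched with `DataFrame.pivot`, using the nine rows from the question. Restoring the original school order and the column order Safe, Cleanliness, Money are choices made here to match the start of required_df; the names `wide` and `features` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Schoolname": ["xyz School"] * 3 + ["abc School"] * 3 + ["lmn School"] * 3,
    "Attribute": ["Safe", "Cleanliness", "Money"] * 3,
    "Value": [3.44, 2.34, 4.65, 4.40, 4.50, 4.90, 2.34, 3.89, 4.65],
})

# Turn each distinct Attribute into its own column, one row per school.
wide = df.pivot(index="Schoolname", columns="Attribute", values="Value")

# pivot sorts the index; restore the order in which schools first appear.
wide = wide.reindex(index=df["Schoolname"].unique())
wide = wide[["Safe", "Cleanliness", "Money"]].reset_index()
wide.columns.name = None

# Numeric matrix suitable as regression input.
features = wide[["Safe", "Cleanliness", "Money"]].to_numpy()
print(wide)
```

If a school had several rows for the same attribute, `pivot` would raise an error; `pivot_table` with an aggregation function would be the alternative in that case.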

date range for six monthly in pandas

感情迁移 submitted on 2019-12-21 20:59:36
Question: So, this is my data frame.

        PatientNumber  QT              Answer  Answerdate  DiagnosisDate
    1   1              transferring    No      2017-03-03  2018-05-03
    2   1              preparing food  No      2017-03-03  2018-05-03
    3   1              medications     Yes     2017-03-03  2018-05-03
    4   2              transferring    No      2011-05-10  2012-05-04
    5   2              preparing food  No      2011-05-10  2012-05-04
    6   2              medications     No      2011-05-10  2012-05-04
    7   2              transferring    Yes     2011-15-03  2012-05-04
    8   2              preparing food  Yes     2011-15-03  2012-05-04
    9   2              medications     No      2011-15-03  2012-05-04
    10  2              transferring    Yes     2010-15-12  2012

Customizing rolling_apply function in Python pandas

只愿长相守 submitted on 2019-12-21 20:04:03
Question:

Setup

I have a DataFrame with three columns:

    "Category" contains True and False, and I have done df.groupby('Category') to group by these values.
    "Time" contains timestamps (measured in seconds) at which values have been recorded.
    "Value" contains the values themselves.

At each time instance, two values are recorded: one has category "True", and the other has category "False".

Rolling apply question

Within each category group, I want to compute a number and store it in column Result for each
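The excerpt is cut off before it says which number should be computed, so the sketch below uses a rolling mean over the last two rows purely as a stand-in. The sample data and the window size are also assumptions. It shows the general pattern of a rolling computation kept separate per category, written with `.rolling(...)`, which replaced the older `pd.rolling_apply` function in later pandas versions.

```python
import pandas as pd

# Made-up data: at each time stamp one True row and one False row.
df = pd.DataFrame({
    "Category": [True, False] * 4,
    "Time": [0, 0, 1, 1, 2, 2, 3, 3],
    "Value": [1.0, 10.0, 2.0, 20.0, 3.0, 30.0, 4.0, 40.0],
})

# transform returns a result aligned with the original rows,
# and the rolling window never crosses from one category into the other.
df["Result"] = (
    df.groupby("Category")["Value"]
      .transform(lambda s: s.rolling(window=2, min_periods=1).mean())
)
print(df)
```

A custom function can be substituted by calling `.rolling(window=2).apply(func)` inside the lambda in place of `.mean()`.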

R, relating columns to row

纵然是瞬间 submitted on 2019-12-20 05:37:06
Question: I have five columns, each named after a candidate (say can1, can2, can3, can4, can5), and each column holds binary data (TRUE or FALSE). I have another column, CANDIDATES, a factor with 5 levels that holds the names of the same five candidates. So the layout is something like: can1 can2 can3 can4 can5 CANDIDATES. I want to create a binary column in which a row is TRUE if the candidate column named by that row's CANDIDATES value is TRUE.