dataframe

Difference between two dates in Pandas DataFrame

情到浓时终转凉″ submitted on 2021-02-11 07:09:18
Question: I have a data frame with many columns, and I need to find the time difference between two columns named in_time and out_time and store it in a new column of the same data frame. The times are formatted like 2015-09-25T01:45:34.372Z. I am using a Pandas DataFrame. I want to do something like this: df.days = df.out_time - df.in_time. The data frame already has many columns; I need to add one more, named days, and put the differences there. Answer 1: You need to convert the strings to datetime dtype, you can then
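The answer excerpt above is cut off, so the following is only a minimal sketch of the conversion it describes, not the answerer's own code. The sample rows and the intermediate `duration` column are assumptions made for illustration; `in_time`, `out_time` and `days` come from the question.

```python
import pandas as pd

# Hypothetical sample rows in the timestamp format quoted in the question.
df = pd.DataFrame({
    "in_time": ["2015-09-25T01:45:34.372Z", "2015-09-26T12:00:00.000Z"],
    "out_time": ["2015-09-27T03:45:34.372Z", "2015-09-26T18:30:00.000Z"],
})

# Parse the ISO 8601 strings into timezone-aware datetime columns.
df["in_time"] = pd.to_datetime(df["in_time"], utc=True)
df["out_time"] = pd.to_datetime(df["out_time"], utc=True)

# Subtracting two datetime columns yields a timedelta column.
df["duration"] = df["out_time"] - df["in_time"]

# Whole days only; bracket assignment is used so a real column is created.
df["days"] = df["duration"].dt.days
```

Note that assigning with `df.days = ...`, as written in the question, sets a Python attribute rather than adding a column when `days` does not exist yet, which is why the sketch uses `df["days"]`.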

perform operation on column of data frame based on condition given to column in another data frame in pandas

ぃ、小莉子 submitted on 2021-02-11 07:05:05
Question: I have a data frame df1:

    df1 = pd.DataFrame([[40, 23, 22, 31], [12, 3, 11, 23], [42, 16, 32, 34], [42, 13, 26, 34]], columns=['A', 'B', 'C', 'D'])

I have another data frame df2:

    df2 = pd.DataFrame([["B", "<20"], ["A", ">30"], ["C", "<40"], ["D", "<15"]], columns=["Column", "Condition"])

How can I filter df1 according to any of the conditions listed in df2? Please help. Expected output example: For the B condition: B_df = pd.DataFrame([3,16,13],columns=["B"]) For the C condition: C_df
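No answer is included in this excerpt, so here is one possible approach as a hedged sketch. It assumes every condition is a single comparison operator followed by a number, as in the df2 shown. The helper `select_by_condition`, the `OPS` table and the `results` dictionary are hypothetical names introduced for illustration.

```python
import operator
import re

import pandas as pd

df1 = pd.DataFrame(
    [[40, 23, 22, 31], [12, 3, 11, 23], [42, 16, 32, 34], [42, 13, 26, 34]],
    columns=["A", "B", "C", "D"],
)
df2 = pd.DataFrame(
    [["B", "<20"], ["A", ">30"], ["C", "<40"], ["D", "<15"]],
    columns=["Column", "Condition"],
)

# Map comparison symbols to functions, which avoids calling eval on strings.
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def select_by_condition(data, column, condition):
    """Hypothetical helper: keep the values of `column` that satisfy `condition`."""
    match = re.fullmatch(r"\s*(<=|>=|==|!=|<|>)\s*(-?\d+(?:\.\d+)?)\s*", condition)
    if match is None:
        raise ValueError(f"Unsupported condition: {condition!r}")
    compare, threshold = OPS[match.group(1)], float(match.group(2))
    mask = compare(data[column], threshold)
    return data.loc[mask, [column]].reset_index(drop=True)


# One filtered single-column frame per row of df2, keyed by column name.
results = {row.Column: select_by_condition(df1, row.Column, row.Condition)
           for row in df2.itertuples(index=False)}
```

Here `results["B"]` plays the role of the B_df in the expected output. A condition that no row satisfies, such as D < 15 for this data, gives an empty frame.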

Pandas custom function to find whether it is the 1st, 2nd etc Monday, Tuesday, etc - all suggestions welcome

独自空忆成欢 submitted on 2021-02-11 06:50:30
Question: I have the following code, which reads in five columns (date plus OHLC) and then creates a column 'dow' to hold the day of the week. So far so good:

    import numpy as np
    import pandas as pd

    df = pd.read_csv('/content/drive/MyDrive/Forex/EURUSD-2018_12_18-2020_11_01.csv', parse_dates=True, names=['date', '1', '2', '3', '4'])
    df['date'] = pd.to_datetime(df['date'])
    df.index = df['date']
    df['dow'] = df['date'].dt.dayofweek
    #df['downum'] = df.apply(lambda x: downu(x['date']))
    df

Producing the following output: date
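The excerpt ends before any answer, so the following is only one suggestion, sketched under assumptions: the occurrence number of a weekday within its month can be computed from the day of the month alone, without a custom function. The sample dates are hypothetical stand-ins for the CSV's date column; `downum` is the column name from the commented-out line in the question.

```python
import pandas as pd

# Hypothetical dates standing in for the 'date' column read from the CSV.
df = pd.DataFrame({"date": pd.to_datetime(
    ["2020-11-02", "2020-11-09", "2020-11-10", "2020-11-30", "2020-12-01"])})

# Day of week as in the question: Monday=0 ... Sunday=6.
df["dow"] = df["date"].dt.dayofweek

# Days 1-7 always hold the 1st occurrence of their weekday, days 8-14 the 2nd,
# and so on, so integer division of (day - 1) by 7 gives the occurrence index.
df["downum"] = (df["date"].dt.day - 1) // 7 + 1
```

Read together, `dow` and `downum` say, for example, that 2020-11-30 is the 5th Monday of its month. The arithmetic is vectorized, so no `apply` call is needed.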

How can I specifically add .0 to integers in a column containing both integers and decimals?

六眼飞鱼酱① submitted on 2021-02-11 06:45:17
Question: How can I add .0 to all df$x values that do not contain decimals? Take the following data frame:

    df <- data.frame(x=c("2.8","9","0.5","1.2","4","12"))
    > head(df)
        x
    1 2.8
    2   9
    3 0.5
    4 1.2
    5   4
    6  12

The desired result looks like this:

    > head(df)
         x
    1  2.8
    2  9.0
    3  0.5
    4  1.2
    5  4.0
    6 12.0

EDIT: I want to automatically add .0 to all integers in ee$x, which is included below. I tried ee$x <- as.numeric(as.character(ee$x)) but that didn't seem to work. Could anything be wrong with my

Creating a dataframe with text from a website

核能气质少年 submitted on 2021-02-11 06:38:10
Question: I've been asked to create a data frame in R using information copied from a website; the data is not contained in a file. The full data list is at: https://www.npr.org/2012/12/07/166400760/hollywood-heights-the-ups-downs-and-in-betweens Here is a portion of the data:

    Leading Men (Average American male: 5 feet 9.5 inches)
    Dolph Lundgren — 6 feet 5 inches
    John Cleese — 6 feet 5 inches
    Michael Clarke Duncan — 6 feet 5 inches
    Vince Vaughn — 6 feet 5 inches
    Clint Eastwood — 6 feet 4 inches
    Jimmy

Variable number of unwanted white spaces resulting into distorted column

China☆狼群 submitted on 2021-02-11 06:35:34
Question: Recently I asked the following question, "Unwanted white spaces resulting into distorted column", and the answer by @sharathnatraj was satisfactory and worked like a charm. The answer was:

    import io  # needed for io.StringIO below
    import re

    import pandas as pd

    with open('trial1.txt', 'r') as f:
        lines = f.readlines()
    l = [re.sub(r"([a-z]{5,})\s([a-z]{5,})", r"\1\2", line) for line in lines]
    df = pd.read_csv(io.StringIO('\n'.join(l)), delim_whitespace=True)

Sample data set: 1 CAgF3O3S silver trifluoromethanesulfonate 2923-28-6 256.937 629.15 1 --- --- --- ---
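The rest of the question is cut off, so the exact failure case is unknown. As a hedged alternative to the regex (it is not the quoted answerer's method), a row can be split on runs of whitespace and the free-text name rebuilt from whatever tokens remain in the middle. This assumes the name is the only field that may contain spaces and that the number of fields on either side of it is fixed. The field counts, the second sample line and the helper `split_row` are all hypothetical, because the real column layout is not shown.

```python
import pandas as pd

# Hypothetical lines modelled on the visible part of the sample row. The real
# file's full column layout is not shown, so the counts below are assumptions.
lines = [
    "1 CAgF3O3S silver   trifluoromethanesulfonate 2923-28-6 256.937 629.15 1 --- --- --- ---",
    "2 C2H6O ethanol 64-17-5 46.069 159.05 1 --- --- --- ---",
]
N_LEADING = 2   # fields before the free-text name (assumed)
N_TRAILING = 8  # fields after the free-text name (assumed)


def split_row(line):
    """Split on whitespace runs and rejoin the middle tokens as the name."""
    tokens = line.split()  # str.split() with no argument collapses any run of whitespace
    head = tokens[:N_LEADING]
    tail = tokens[-N_TRAILING:]
    name = " ".join(tokens[N_LEADING:-N_TRAILING])
    return head + [name] + tail


df = pd.DataFrame([split_row(line) for line in lines])
```

Because `str.split()` treats any run of whitespace as a single separator, the number of stray spaces inside the name no longer matters. The trade-off is that the fixed field counts must actually hold for every row of the file.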
