dataframe

How to shift data by a factor of two months in R?

拈花ヽ惹草 submitted on 2021-02-20 19:08:17
Question: I would like to shift my entire data set forward by two months. For example, if my data starts on Jan 01, I want to move it so that it corresponds to March 01. Likewise, November data would become January data of the next year. Here is my sample code:
DF <- data.frame(seq(as.Date("2001-01-01"), to = as.Date("2003-12-31"), by = "day"), A = runif(1095, 0, 10), D = runif(1095, 5, 15))
colnames(DF) <- c("Date", "A", "B")
I tried DF$Date <- DF$Date + 61 but this moved the

Filtering rows of a dataframe based on values in columns

时光总嘲笑我的痴心妄想 submitted on 2021-02-20 19:08:01
Question: I want to keep only the rows of a dataframe that contain a value less than, say, 10.
import numpy as np
import pandas as pd
from pprint import pprint
df = pd.DataFrame(np.random.randint(0, 100, size=(10, 4)), columns=list('ABCD'))
df = df[df < 10]
gives:
     A    B    C    D
0  5.0  NaN  NaN  NaN
1  NaN  NaN  NaN  NaN
2  0.0  NaN  6.0  NaN
3  NaN  NaN  NaN  NaN
4  NaN  NaN  NaN  NaN
5  6.0  NaN  NaN  NaN
6  NaN  NaN  NaN  NaN
7  NaN  NaN  NaN  7.0
8  NaN  NaN  NaN  NaN
9  NaN  NaN  NaN  NaN
Expected:
0   5  57  87  95
2   0  80   6  82
5   6  33  74  75
7  71  44  60   7
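One way to get from the `df[df < 10]` result to the expected output is to reduce the boolean frame to one flag per row and index with that. A minimal sketch follows; the frame is a small hand-written stand-in (rows 0, 2 and 3 reuse values from the expected output above, row 1 is made up), not the asker's random data.

```python
import pandas as pd

# Illustrative stand-in for the random frame in the question.
df = pd.DataFrame(
    {
        "A": [5, 42, 0, 71],
        "B": [57, 18, 80, 44],
        "C": [87, 63, 6, 60],
        "D": [95, 29, 82, 7],
    }
)

# df < 10 is a boolean frame; any(axis=1) is True for a row
# when at least one of its columns is below 10.
row_mask = (df < 10).any(axis=1)

# Indexing with a boolean Series keeps whole rows, so no NaN are introduced.
filtered = df[row_mask]
print(filtered)
```

Indexing with the full boolean frame, as in `df[df < 10]`, masks cell by cell, which is why every value of 10 or more turned into NaN in the question.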

Pandas Dataframe - for each row, return count of other rows with overlapping dates

非 Y 不嫁゛ submitted on 2021-02-20 19:05:33
Question: I've got a dataframe with projects, start dates, and end dates. For each row I would like to return the number of other projects in progress when the project started. How do you nest loops when using df.apply()? I've tried using a for loop, but my dataframe is large and it takes way too long.
import datetime as dt
data = {'project': ['A', 'B', 'C'],
        'pr_start_date': [dt.datetime(2018, 9, 1), dt.datetime(2019, 4, 1), dt.datetime(2019, 6, 8)],
        'pr_end_date': [dt.datetime(2019, 6, 15), dt.datetime
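A possible loop-free approach is to compare every start date against every (start, end) interval with NumPy broadcasting. The sketch below is only an illustration: the excerpt cuts off after the first end date, so the end dates for projects B and C are invented, the `overlap_count` column name is hypothetical, and "in progress" is assumed to mean start <= date <= end, inclusive.

```python
import datetime as dt

import numpy as np
import pandas as pd

data = {
    "project": ["A", "B", "C"],
    "pr_start_date": [dt.datetime(2018, 9, 1), dt.datetime(2019, 4, 1), dt.datetime(2019, 6, 8)],
    # Only the first end date appears in the excerpt; the other two are made up.
    "pr_end_date": [dt.datetime(2019, 6, 15), dt.datetime(2019, 12, 31), dt.datetime(2019, 9, 30)],
}
df = pd.DataFrame(data)

start = df["pr_start_date"].to_numpy()
end = df["pr_end_date"].to_numpy()

# active[i, j] is True when project j is running on the day project i starts.
active = (start[None, :] <= start[:, None]) & (start[:, None] <= end[None, :])

# A project should not count itself.
np.fill_diagonal(active, False)

df["overlap_count"] = active.sum(axis=1)
print(df[["project", "overlap_count"]])
```

The comparison builds an n-by-n boolean matrix, so memory grows quadratically with the number of rows. For a very large frame it may be necessary to process the rows in blocks.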

Merging Dataframe chunks in Pandas

纵然是瞬间 submitted on 2021-02-20 18:54:42
Question: I currently have a script that combines multiple CSV files into one. The script works fine, except that we run out of RAM really quickly once larger files are used. This is a problem because the script runs on an AWS server, and running out of RAM means a server crash. Currently the file size limit is around 250 MB each, which limits us to 2 files. However, the company I work for is in biotech and we use genetic sequencing files, so the files we use can range in size from
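If "combine" here means stacking the rows of files that share the same columns (an assumption, since the excerpt is cut off), memory use can be bounded by streaming each file in chunks and appending every chunk to the output as it is read. A minimal sketch; the `combine_csvs` helper, the file names, and the chunk size are all hypothetical.

```python
import os
import tempfile

import pandas as pd


def combine_csvs(paths, out_path, chunksize=100_000):
    """Append every input CSV to out_path, holding one chunk in memory at a time."""
    write_header = True
    for path in paths:
        # With chunksize set, read_csv yields DataFrames of at most chunksize rows.
        for chunk in pd.read_csv(path, chunksize=chunksize):
            chunk.to_csv(
                out_path,
                mode="w" if write_header else "a",  # create the file once, then append
                header=write_header,                # write the header only once
                index=False,
            )
            write_header = False


# Small demo with two throwaway files.
tmp_dir = tempfile.mkdtemp()
path_a = os.path.join(tmp_dir, "a.csv")
path_b = os.path.join(tmp_dir, "b.csv")
out_path = os.path.join(tmp_dir, "combined.csv")
pd.DataFrame({"id": [1, 2, 3], "value": [10, 20, 30]}).to_csv(path_a, index=False)
pd.DataFrame({"id": [4, 5], "value": [40, 50]}).to_csv(path_b, index=False)

combine_csvs([path_a, path_b], out_path, chunksize=2)
combined = pd.read_csv(out_path)
print(combined)
```

Peak memory then depends on the chunk size rather than on the size of the input files. This does not cover a key-based join between files, which would need a different strategy.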

Python Pandas - use Multiple Character Delimiter when writing to_csv

本小妞迷上赌 submitted on 2021-02-20 18:48:20
Question: It appears that the pandas to_csv function only allows single-character delimiters/separators. Is there some way to use a string of characters such as "::" or "%%" instead? I tried:
df.to_csv(local_file, sep='::', header=None, index=False)
and got:
TypeError: "delimiter" must be a 1-character string
Answer 1: Use numpy.savetxt. Example:
np.savetxt('file.csv', np.char.decode(chunk_data.values.astype(np.bytes_), 'UTF-8'), delimiter='~|', fmt='%s', encoding=None)
np.savetxt('file.dat',
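The `np.savetxt` suggestion can be reduced to a small self-contained sketch. The frame and the output path below are made up for illustration, and the call writes neither header nor index, in line with `header=None, index=False` in the question.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical two-row frame standing in for the asker's data.
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})
out_path = os.path.join(tempfile.mkdtemp(), "out.txt")

# fmt="%s" formats every cell with str(); the delimiter may be any string.
np.savetxt(out_path, df.to_numpy(), fmt="%s", delimiter="::")

with open(out_path) as fh:
    lines = fh.read().splitlines()
print(lines)
```

Unlike `to_csv`, `np.savetxt` does no quoting or escaping, so a value that itself contains the delimiter would make the output ambiguous.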
