dataframe

I Get TypeError: cannot use a string pattern on a bytes-like object when using to_sql on dataframe python 3

可紊 submitted on 2021-02-10 22:19:09
Question: Hi, I am trying to write a dataframe to my SQL database using df.to_sql, but I am getting the error message: TypeError: cannot use a string pattern on a bytes-like object. I am using Python 3. I am reading from a path on my drive which I unfortunately cannot share, but it works fine when I just open the CSV file using df = pd.read_csv(path, delimiter=';', engine='python', low_memory=True, encoding='utf-8-sig'). I am using the encoding argument because otherwise there is a strange object at my
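The excerpt is cut off before the to_sql call and the connection setup, so the cause of the TypeError is not visible here. As a way to narrow it down, the following is a minimal sketch that runs the same read_csv options against an in-memory CSV and writes the result to an in-memory SQLite database. The sample data, the table name people, and the use of sqlite3 are assumptions for illustration only, not the asker's setup. If this runs cleanly, the problem is more likely in the database driver or connection object than in the CSV encoding.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical sample data: a semicolon-delimited CSV that starts with a UTF-8 BOM.
csv_bytes = "\ufeffid;name\n1;Ann\n2;Bob\n".encode("utf-8")

# Same read options as in the question; 'utf-8-sig' strips the leading BOM,
# so the first column is named 'id' rather than '\ufeffid'.
df = pd.read_csv(
    io.BytesIO(csv_bytes),
    delimiter=";",
    engine="python",
    low_memory=True,
    encoding="utf-8-sig",
)

# Assumption: an in-memory SQLite database stands in for the real SQL database.
conn = sqlite3.connect(":memory:")
df.to_sql("people", conn, index=False, if_exists="replace")

rows = conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
print(list(df.columns))
print(rows)
```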

How to reshape a wider data.frame to longer data.frame in R? [duplicate]

半世苍凉 submitted on 2021-02-10 22:15:51
Question: This question already has answers here: Transpose and Merge columns in R [duplicate] (3 answers), Reshaping data.frame from wide to long format (9 answers). Closed 7 months ago. I was playing with pivot_longer and pivot_wider but am probably missing something. I have a data.frame like D_Wider and would like to convert it to something like D_Longer. Is there any way forward? library(tidyverse) D_Wider <- data.frame(A = 15, S = 10, D = 25, Z = 16) Desired Output D_Longer <- data.frame(Stations = c("A",

Count events before a specific time for a series of items in R

丶灬走出姿态 submitted on 2021-02-10 20:40:07
Question: I have a dataframe of items with a certain number of different events which occur at different times. For example, say I had the times of events (goal, corner, red card, etc.) in various games of football. I want to count the number of each event which occurred before a certain time for each team in each game (where the time is different for each game). So I could have a dataframe of events (where C is corner, G is goal and R is red card) as follows: events <- data.frame( game_id = c(1, 1, 1, 1, 1, 1

Count events before a specific time for a series of items in R

二次信任 submitted on 2021-02-10 20:38:39
Question: I have a dataframe of items with a certain number of different events which occur at different times. For example, say I had the times of events (goal, corner, red card, etc.) in various games of football. I want to count the number of each event which occurred before a certain time for each team in each game (where the time is different for each game). So I could have a dataframe of events (where C is corner, G is goal and R is red card) as follows: events <- data.frame( game_id = c(1, 1, 1, 1, 1, 1

Is there syntactic sugar to define a data frame in R

偶尔善良 submitted on 2021-02-10 20:36:35
Question: I want to regroup US states by region, so I need to define a "US state" -> "US Region" mapping function, which is done by setting up an appropriate data frame. The basis is this exercise (apparently this is a map of the "Commonwealth of the Fallout"): One starts off with an original list in raw form: Alabama = "Gulf" Arizona = "Four States" Arkansas = "Texas" California = "South West" Colorado = "Four States" Connecticut = "New England" Delaware = "Columbia" which eventually leads to

Expanding R Matrix on Date

扶醉桌前 submitted on 2021-02-10 20:01:45
Question: I have the following R matrix: Date MyVal 2016 1 2017 2 2018 3 .... 2026 10 What I want to do is "blow it up" so that it looks like this (where monthly values are linearly interpolated): Date MyVal 01/01/2016 1 02/01/2016 .. .... 01/01/2017 2 .... 01/01/2026 10 I realize I can easily generate the sequence using: DateVec <- seq(as.Date(paste(minYear,"/01/01", sep = "")), as.Date(paste(maxYear, "/01/01", sep = "")), by = "month") And I can use that to make a large matrix and then fill things in

Long to wide format with several duplicates. Circumvent with unique combo of columns

徘徊边缘 submitted on 2021-02-10 19:53:24
Question: I have a dataset similar to this (the real one is much bigger). It is in long format and I need to change it to wide format with one row per id. My problem is that there are many different combinations of time, drug, unit and admin. Only a combination of time, drug, unit and admin is unique and should occur only once per id. I could not find a solution to this. I would like R to create unique combinations of columns so the data can be transformed to wide format. I have tried melt.data

Use keywords from dataframe to detect if any present in another dataframe or string

佐手、 submitted on 2021-02-10 18:22:46
Question: I have two problems. The first is: I have one dataframe with categories and keywords like this: Category Keywords 0 Fruit ['apple', 'pear', 'plum', 'grape'] 1 Color ['red', 'purple', 'green'] Another dataframe like this: Summary 0 This is a basket of red apples. They are sour. 1 We found a bushel of fruit. They are red. 2 There is a peck of pears that taste sweet. 3 We have a box of plums. I want the end result like this: Category Summary 0 Fruit, Color This is a basket of red apples. They are
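One possible way to approach the first problem, sketched with the two dataframes from the excerpt. The names categories, summaries, and matching_categories are hypothetical, and the matching rule (case-insensitive substring search) is an assumption, since the excerpt is cut off before the full desired output is shown.

```python
import pandas as pd

# Data copied from the question excerpt.
categories = pd.DataFrame({
    "Category": ["Fruit", "Color"],
    "Keywords": [["apple", "pear", "plum", "grape"], ["red", "purple", "green"]],
})
summaries = pd.DataFrame({"Summary": [
    "This is a basket of red apples. They are sour.",
    "We found a bushel of fruit. They are red.",
    "There is a peck of pears that taste sweet.",
    "We have a box of plums.",
]})


def matching_categories(text):
    """Return a comma-separated list of categories with a keyword found in text."""
    lowered = text.lower()
    hits = [
        row.Category
        for row in categories.itertuples(index=False)
        # Assumption: plain substring matching, so 'apple' also matches 'apples'.
        if any(keyword in lowered for keyword in row.Keywords)
    ]
    return ", ".join(hits)


summaries["Category"] = summaries["Summary"].apply(matching_categories)
print(summaries[["Category", "Summary"]])
```

Substring matching is deliberately loose: it catches plurals such as "apples", but it would also match "red" inside a word like "hundred". If that matters, a word-boundary regular expression per keyword would be a stricter alternative.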

How do I efficiently apply pandas.Timestamp functions to a full dataframe/column?

一个人想着一个人 submitted on 2021-02-10 18:22:03
Question: Pandas is a great tool for a number of data tasks. Many functions have been streamlined so that they can be applied efficiently to columns rather than individual cells/rows. One such function is the to_datetime() function, which I use as an example later in this question. However, there are a number of commands in pandas that, as best I can tell from the documentation, do not directly relate to dataframes. The specific function I am interested in is the pandas.Timestamp.isocalendar() function, but there
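For the isocalendar() case specifically, a minimal sketch of two options, assuming pandas 1.1 or later, where the .dt accessor on a datetime column provides a column-wise isocalendar(). The sample dates and the column name date are made up for illustration. The second option, Series.apply, works for any Timestamp method but calls it once per row, so it is typically slower on large columns.

```python
import pandas as pd

# Hypothetical sample column of date strings.
df = pd.DataFrame({"date": ["2021-01-01", "2021-02-10", "2021-12-31"]})
df["date"] = pd.to_datetime(df["date"])  # vectorized conversion to Timestamps

# Option 1: column-wise ISO calendar via the .dt accessor (pandas >= 1.1).
# Returns a DataFrame with 'year', 'week' and 'day' columns.
iso = df["date"].dt.isocalendar()

# Option 2: generic fallback that calls the Timestamp method row by row.
per_row = df["date"].apply(lambda ts: tuple(ts.isocalendar()))

print(iso)
print(per_row.tolist())
```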

How do I efficiently apply pandas.Timestamp functions to a full dataframe/column?

久未见 submitted on 2021-02-10 18:20:53
Question: Pandas is a great tool for a number of data tasks. Many functions have been streamlined so that they can be applied efficiently to columns rather than individual cells/rows. One such function is the to_datetime() function, which I use as an example later in this question. However, there are a number of commands in pandas that, as best I can tell from the documentation, do not directly relate to dataframes. The specific function I am interested in is the pandas.Timestamp.isocalendar() function, but there