dataframe | 易学教程

I Get TypeError: cannot use a string pattern on a bytes-like object when using to_sql on dataframe python 3


Question: Hi, I am trying to write a dataframe to my SQL database using df.to_sql; however, I am getting the error message: TypeError: cannot use a string pattern on a bytes-like object. I am using Python 3. I am using a path on my drive which I unfortunately cannot share. But it works fine when I just want to open the CSV file using: df = pd.read_csv(path, delimiter=';', engine='python', low_memory=True, encoding='utf-8-sig') I am using the encoding argument because otherwise there is a strange object at my

How to reshape a wider data.frame to longer data.frame in R? [duplicate]


Question: I was playing with pivot_longer and pivot_wider but am probably missing something. I have a data.frame like D_Wider and would like to convert it to something like D_Longer. Is there any way forward? library(tidyverse) D_Wider <- data.frame(A = 15, S = 10, D = 25, Z = 16) Desired Output D_Longer <- data.frame(Stations = c("A",

Count events before a specific time for a series of items in R


Question: I have a dataframe of items with a number of different events which occur at different times. For example, say I had the times of events (goal, corner, red card, etc.) in various games of football. I want to count the number of each event which occurred before a certain time for each team in each game (where the time is different for each game). So I could have a dataframe of events (where C is corner, G is goal and R is red card) as follows: events <- data.frame( game_id = c(1, 1, 1, 1, 1, 1

Is there syntactic sugar to define a data frame in R


Question: I want to regroup US states by regions, and thus I need to define a "US state" -> "US Region" mapping function, which is done by setting up an appropriate data frame. The basis is this exercise (apparently this is a map of the "Commonwealth of the Fallout"): One starts off with an original list in raw form: Alabama = "Gulf" Arizona = "Four States" Arkansas = "Texas" California = "South West" Colorado = "Four States" Connecticut = "New England" Delaware = "Columbia" which eventually leads to

Expanding R Matrix on Date


Question: I have the following R matrix:

Date MyVal
2016 1
2017 2
2018 3
....
2026 10

What I want to do is "blow it up" so that it goes like this (where monthly values are linearly interpolated):

Date MyVal
01/01/2016 1
02/01/2016 ..
....
01/01/2017 2
....
01/01/2026 10

I realize I can easily generate the sequence using: DateVec <- seq(as.Date(paste(minYear,"/01/01", sep = "")), as.Date(paste(maxYear, "/01/01", sep = "")), by = "month") And I can use that to make a large matrix and then fill things in

Long to wide format with several duplicates. Circumvent with unique combo of columns


Question: I have a dataset similar to this (the real one is much bigger). It is in long format and I need to change it to wide format with one row per id. My problem is that there are a lot of different combinations of time, drug, unit and admin. Only a combination of time, drug, unit and admin will be unique and should only occur once per id. I could not find a solution to this. I would like R to create unique combinations of columns so the data can be transformed to wide format. I have tried melt.data

Use keywords from dataframe to detect if any present in another dataframe or string


Question: I have two problems. The first is this: I have one dataframe with category and keywords like this:

  Category Keywords
0 Fruit ['apple', 'pear', 'plum', 'grape']
1 Color ['red', 'purple', 'green']

Another dataframe like this:

  Summary
0 This is a basket of red apples. They are sour.
1 We found a bushel of fruit. They are red.
2 There is a peck of pears that taste sweet.
3 We have a box of plums.

I want the end result like this:

  Category Summary
0 Fruit, Color This is a basket of red apples. They are
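
One possible way to approach the keyword matching described in this excerpt is sketched below. It is not the asker's code or an accepted answer. It assumes that case-insensitive substring matching is acceptable (so "apples" counts as a hit for "apple"), and the names `categories`, `summaries` and `matching_categories` are hypothetical.

```python
import pandas as pd

# Sample data copied from the question excerpt.
categories = pd.DataFrame({
    "Category": ["Fruit", "Color"],
    "Keywords": [["apple", "pear", "plum", "grape"],
                 ["red", "purple", "green"]],
})
summaries = pd.DataFrame({
    "Summary": [
        "This is a basket of red apples. They are sour.",
        "We found a bushel of fruit. They are red.",
        "There is a peck of pears that taste sweet.",
        "We have a box of plums.",
    ]
})


def matching_categories(text):
    """Return a comma-separated list of categories whose keywords occur in text.

    Assumption: plain substring matching on the lower-cased text.
    """
    lowered = text.lower()
    hits = [
        category
        for category, keywords in zip(categories["Category"], categories["Keywords"])
        if any(keyword in lowered for keyword in keywords)
    ]
    return ", ".join(hits)


summaries["Category"] = summaries["Summary"].apply(matching_categories)
print(summaries[["Category", "Summary"]])
```

Substring matching is simple but can over-match (a keyword such as "red" would also hit "bored"); matching on whole words would need tokenisation or a regular expression with word boundaries.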

How do I efficiently apply pandas.Timestamp functions to a full dataframe/column?


Question: Pandas is a great tool for a number of data tasks. Many functions have been streamlined to be applied efficiently to columns rather than individual cells/rows. One such function is the to_datetime() function, which I use as an example later in this question. However, there are a number of commands in pandas that, as best I can tell from the documentation, do not directly relate to dataframes. The specific function I am interested in is the pandas.Timestamp.isocalendar() function, but there
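
As a hedged illustration of the column-wise approach this excerpt is asking about: in pandas 1.1 and later, the `.dt` accessor on a datetime column exposes `isocalendar()`, which returns the ISO year, week and day for every row at once. The sketch below assumes such a pandas version, and the column names `date`, `iso_year` and `iso_week` are made up for the example.

```python
import pandas as pd

# Hypothetical sample data; the column name "date" is an assumption.
df = pd.DataFrame({"date": ["2021-01-01", "2021-01-04", "2021-12-31"]})

# Vectorised conversion of the whole column, as mentioned in the question.
df["date"] = pd.to_datetime(df["date"])

# Series.dt.isocalendar() (pandas >= 1.1) returns a DataFrame with
# the columns "year", "week" and "day", one row per input timestamp.
iso = df["date"].dt.isocalendar()
df["iso_year"] = iso["year"]
df["iso_week"] = iso["week"]

print(df)
```

For Timestamp methods that have no `.dt` counterpart, falling back to `Series.apply` with the method is an option, at the cost of a Python-level loop over the rows.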
