dataframe

Value is trying to be set on a copy of a slice from DF

风格不统一 submitted on 2021-01-29 02:42:05
Question: I'm doing some work with pandas and Python. I have the following code:

    df = pd.read_csv("Request.csv", keep_default_na=False)
    df1 = df.loc[(df["Request Status"] == "Closed")]
    df1["Request Close-Down Actual"] = pd.to_datetime(df1["Request Close-Down Actual"], errors='coerce')
    df3 = df1.loc[(df1["Request Close-Down Actual"] < '2016-11-01') | (df1["Request Close-Down Actual"].isnull())]
    df3.set_index("Request ID", inplace=True)
    df3.to_csv("Request1.csv")

The issue is that when I run the code, I
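The excerpt is cut off, but the title points at pandas' SettingWithCopyWarning. A common way to avoid that warning is to take an explicit copy of the filtered frame before assigning to one of its columns. The sketch below assumes that is the situation here; the in-memory CSV text is a hypothetical stand-in for Request.csv, and `set_index` is reassigned rather than called with `inplace=True`.

```python
import io

import pandas as pd

# Hypothetical stand-in for Request.csv (the real file is not shown in the question).
csv_text = """Request ID,Request Status,Request Close-Down Actual
1,Closed,2016-10-15
2,Open,2016-09-01
3,Closed,2016-12-01
4,Closed,
"""
df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

# .copy() makes df1 an independent DataFrame, so the column assignment
# below is unambiguous and does not touch (or appear to touch) df.
df1 = df.loc[df["Request Status"] == "Closed"].copy()
df1["Request Close-Down Actual"] = pd.to_datetime(
    df1["Request Close-Down Actual"], errors="coerce"
)

# Keep rows closed before 2016-11-01 or with no parseable close-down date.
closed_early = df1["Request Close-Down Actual"] < "2016-11-01"
no_date = df1["Request Close-Down Actual"].isnull()
df3 = df1.loc[closed_early | no_date].set_index("Request ID")
```

Writing the result out with `df3.to_csv("Request1.csv")` then works as in the original snippet.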

Calculation within Pandas dataframe group

我只是一个虾纸丫 submitted on 2021-01-29 02:32:21
Question: I have a pandas DataFrame as shown below. What I'm trying to do is partition (or group by) BlockID, LineID, WordID, and then within each group use current WordStartX - previous (WordStartX + WordWidth) to derive another column, e.g., WordDistance, to indicate the distance between this word and the previous word. The post "Row operations within a group of a pandas dataframe" is very helpful, but in my case multiple columns are involved (WordStartX and WordWidth). *BlockID LineID WordID WordStartX
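One possible way to compute such a column is to build the "previous word end" series first and shift it within each group. The sample data in the excerpt is cut off, so the values below are hypothetical. The sketch also assumes that WordID numbers the words within a line, which makes BlockID and LineID the grouping keys and WordID only the sort key.

```python
import pandas as pd

# Hypothetical sample data; the table in the question is truncated.
words = pd.DataFrame(
    {
        "BlockID": [1, 1, 1, 1, 1],
        "LineID": [1, 1, 1, 2, 2],
        "WordID": [1, 2, 3, 1, 2],
        "WordStartX": [10, 50, 95, 12, 60],
        "WordWidth": [30, 40, 20, 35, 25],
    }
)

# Make sure words are in reading order inside each line.
words = words.sort_values(["BlockID", "LineID", "WordID"])

# Right edge of every word, then shifted by one row within each (block, line)
# group so that each row sees the right edge of the previous word.
word_end = words["WordStartX"] + words["WordWidth"]
prev_end = word_end.groupby([words["BlockID"], words["LineID"]]).shift()

# The first word of each line has no predecessor, so its distance is NaN.
words["WordDistance"] = words["WordStartX"] - prev_end
```

Because the shift happens inside each group, the last word of one line never leaks into the first word of the next.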

Weird inconsistency between df.drop() and df.idxmin()

烂漫一生 submitted on 2021-01-29 02:06:45
Question: I am encountering a weird issue with pandas. After some careful debugging I have found the problem, but I would like a fix, and an explanation as to why this is happening. I have a dataframe which consists of a list of cities with some distances. I have to iteratively find the city which is closest to some "Seed" city (details are not too important here). To locate the "closest" city to my seed city, I use: id_new_point = df["Time from seed"].idxmin(skipna=True) Then, I want to remove the city
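The excerpt stops before the inconsistency is described, so the following does not diagnose it. It only illustrates one fact that often matters in this kind of loop: `idxmin` returns an index label, not a row position, and `drop` and `loc` both work with labels. The frame, the column values, and the non-default index below are hypothetical.

```python
import pandas as pd

# Hypothetical data with a non-default index, so labels and positions differ.
cities = pd.DataFrame(
    {"City": ["A", "B", "C", "D"], "Time from seed": [12.0, 5.0, None, 8.0]},
    index=[10, 20, 30, 40],
)

# idxmin returns the index LABEL of the smallest non-NaN value.
id_new_point = cities["Time from seed"].idxmin(skipna=True)

# Labels go with .loc (and with drop); .iloc would expect a position instead.
closest_city = cities.loc[id_new_point, "City"]

# drop also takes labels; it returns a new frame unless reassigned.
remaining = cities.drop(index=id_new_point)
```

If the index contains duplicate labels, `drop` removes every row carrying that label, which is one situation where the two calls can appear to disagree.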

Filter rows of one column which is alphabet, numbers or hyphen in Pandas

荒凉一梦 submitted on 2021-01-29 00:12:58
Question: Given a dataframe as follows, I need to check the room column:

       id    room
    0   1   A-102
    1   2     201
    2   3    B309
    3   4   C·102
    4   5  E_1089

Valid values in this column contain only digits, letters, or hyphens; otherwise, fill the check column with incorrect. The expected result is like this:

       id    room      check
    0   1   A-102        NaN
    1   2     201        NaN
    2   3    B309        NaN
    3   4   C·102  incorrect
    4   5  E_1089  incorrect

An informal version of the syntax could be: df.loc[<filter1> | (<filter2>) | (<filter3>), 'check'] = 'incorrect' Thanks in advance for your help. Answer 1:
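The original answer is cut off in this excerpt. As a hedged sketch, not a reconstruction of that answer, the three filters can be collapsed into a single regular expression that accepts only ASCII letters, digits, and hyphens. It assumes the room values are strings (or can be cast to strings) and a pandas version that provides `Series.str.fullmatch`.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4, 5],
        "room": ["A-102", "201", "B309", "C·102", "E_1089"],
    }
)

# True where the whole value consists only of letters, digits, or hyphens.
valid = df["room"].astype(str).str.fullmatch(r"[A-Za-z0-9-]+")

# Assigning through .loc creates the check column; rows not selected stay NaN.
df.loc[~valid, "check"] = "incorrect"
```

Using an explicit character class rather than `\w` matters here, because `\w` would also accept the underscore in E_1089.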

How do you combine two data frames with quantities of items in R?

跟風遠走 submitted on 2021-01-28 23:40:54
Question: I am working in R with data frames containing quantities of items (which are non-negative integers). Here is an example of two data frames called BASKET1 and BASKET2. In both cases, an item appears in the data frame only if it has a quantity of at least one. Items appear in each data frame in alphabetical order.

    BASKET1
      Vegetable Quantity
    1   Carrots        3
    2 Cucumbers        2
    3  Parsnips        5
    4    Celery        1
    5    Onions       12

    BASKET2
      Vegetable Quantity
    1   Carrots       10
    2    Onions        6
    3   Rhubarb        2

I am trying to create a

How to convert DFM into dataframe BUT keeping docvars?

青春壹個敷衍的年華 submitted on 2021-01-28 22:16:32
Question: I am using the quanteda package, and the very good tutorials that have been written about it, to perform various operations on paper articles. I obtained the frequency of specific words over time by selecting them in a mainwordsDFM and using textstat_frequency(mainwordsDFM, group = "Date"), then converted the result into a dataframe and plotted it with ggplot. However, I am now trying to plot the frequency of a word over time and by paper. The solution I used for my previous operation does not work in

Write list of dataframes to multiple Excel files

筅森魡賤 submitted on 2021-01-28 22:13:11
Question: I have a list of dataframes. It is conveniently named list.df, and its elements, which are dataframes, are simply list.df[[1]], list.df[[2]], and list.df[[3]]. I am trying to use lapply to write each of the list.df objects to a separate Excel sheet. I can't use the xlsx library because my workplace disables everything Java... so I've been trying write_xlsx. I've tried the following:

    lapply(names(list.df), function (x) write_xlsx(list.df[[x]], file=paste(x, "xlsx", sep=".")))

But nothing happens. Any