series

How do I turn a dataframe into a series of lists?

冷暖自知 posted on 2019-11-30 04:45:35
I have had to do this several times and I'm always frustrated. I have a dataframe:

    df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], ['a', 'b'], ['A', 'B', 'C', 'D'])
    print(df)

       A  B  C  D
    a  1  2  3  4
    b  5  6  7  8

I want to turn df into:

    pd.Series([[1, 2, 3, 4], [5, 6, 7, 8]], ['a', 'b'])

    a    [1, 2, 3, 4]
    b    [5, 6, 7, 8]
    dtype: object

I've tried df.apply(list, axis=1), which just gets me back the same df. What is a convenient and effective way to do this?

If you need a faster solution, you can first convert the DataFrame to a NumPy array with values, then convert it to a list, and finally create a new Series with the index from df: print (pd
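The three steps the answer describes (take values, convert to a list, rebuild a Series on the original index) can be sketched as follows. This is a minimal illustration that reuses the question's data; the variable name rows is only an assumption made for the example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]],
                  index=['a', 'b'], columns=['A', 'B', 'C', 'D'])

# Convert the underlying NumPy array to a list of row lists,
# then rebuild a Series that reuses the DataFrame's index.
rows = pd.Series(df.values.tolist(), index=df.index)
print(rows)
```

Each element of rows is a plain Python list, so the resulting Series has dtype object.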

Pandas mask / where methods versus NumPy np.where

白昼怎懂夜的黑 posted on 2019-11-30 04:33:05
I often use the Pandas mask and where methods for cleaner logic when updating values in a series conditionally. However, for relatively performance-critical code I notice a significant performance drop relative to numpy.where. While I'm happy to accept this for specific cases, I'm interested to know: Do the Pandas mask / where methods offer any additional functionality, apart from the inplace / errors / try_cast parameters? I understand those 3 parameters but rarely use them. For example, I have no idea what the level parameter refers to. Is there any non-trivial counter-example where mask / where
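For readers unfamiliar with the three calls being compared, here is a small sketch of how they relate. The sample data and variable names are made up for illustration, and the sketch says nothing about relative speed.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, -2, 3, -4], index=list('abcd'))

# Series.where keeps values where the condition is True and replaces the rest.
kept = s.where(s > 0, 0)

# Series.mask is the inverse: it replaces values where the condition is True.
masked = s.mask(s <= 0, 0)

# np.where returns a plain ndarray, so the index has to be reattached by hand.
via_numpy = pd.Series(np.where(s > 0, s, 0), index=s.index)
```

All three produce the same values here; the pandas methods keep the index automatically, whereas the np.where result has to be wrapped back into a Series.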

Floor or ceiling of a pandas series in python?

我与影子孤独终老i posted on 2019-11-30 04:18:38
I have a pandas series called series. If I want to get the element-wise floor or ceiling, is there a built-in method, or do I have to write the function and use apply? I ask because the data is big, so I appreciate efficiency. Also, this question has not been asked with respect to the Pandas package.

You can use NumPy's built-in methods to do this: np.ceil(series) or np.floor(series). Both return a Series object (not an array), so the index information is preserved.

You could do something like this using NumPy's floor, for instance, with a dataframe: floored_data = data.apply(np.floor) Can't test it
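A short sketch of the first answer's suggestion, using a made-up three-element Series, shows that the NumPy functions return a Series with the original index:

```python
import numpy as np
import pandas as pd

series = pd.Series([1.2, 2.5, -0.7], index=['x', 'y', 'z'])

# NumPy ufuncs applied to a Series return a Series with the same index.
floored = np.floor(series)
ceiled = np.ceil(series)

print(floored)
print(ceiled)
```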

From TimeDelta to float days in Pandas

一个人想着一个人 posted on 2019-11-30 02:29:19
Question: I have a TimeDelta column with values that look like this: 2 days 21:54:00.000000000. I would like to have a float representing the number of days, let's say here 2+21/24 = 2.875, neglecting the minutes. Is there a simple way to do this? I saw an answer suggesting res['Ecart_lacher_collecte'].apply(lambda x: float(x.item().days+x.item().hours/24.)) but I get "AttributeError: 'str' object has no attribute 'item'". Numpy version is '1.10.4'. Pandas version is u'0.17.1'. The columns has originally
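One possible way to get the arithmetic the question describes (whole days plus whole hours divided by 24) is sketched below. It assumes the column may hold strings, as the error message suggests, and parses them with pd.to_timedelta first; the sample values and variable names are illustrative, and this is written against a current pandas rather than the 0.17.1 mentioned in the question.

```python
import pandas as pd

# Parse strings such as '2 days 21:54:00' into real timedeltas.
td = pd.Series(pd.to_timedelta(['2 days 21:54:00', '0 days 06:30:00']))

# Whole days plus whole hours, ignoring minutes and seconds.
parts = td.dt.components
days_ignoring_minutes = parts['days'] + parts['hours'] / 24.0

# Exact fractional days, in case the minutes should count after all.
exact_days = td.dt.total_seconds() / 86400.0
```

For the first value this gives 2 + 21/24 = 2.875 when minutes are ignored.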

Extract values in Pandas value_counts()

爷,独闯天下 posted on 2019-11-29 20:20:25
Say we have used pandas' dataframe[column].value_counts(), which outputs:

    apple      5
    sausage    2
    banana     2
    cheese     1

How do you extract the values from this in the order shown above, e.g. max to min? [apple,sausage,banana,cheese]

Try this:

    dataframe[column].value_counts().index.tolist()
    ['apple', 'sausage', 'banana', 'cheese']

    #!/usr/bin/env python
    import pandas as pd

    # Make example dataframe
    df = pd.DataFrame([(1, 'Germany'), (2, 'France'), (3, 'Indonesia'), (4, 'France'), (5, 'France'), (6, 'Germany'), (7, 'UK'), ], columns=['groupid', 'country'], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

    # What
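As a self-contained illustration of the first answer, the sketch below rebuilds counts like those in the question from a made-up Series and pulls out both the labels and the counts. The names fruits, labels and values are assumptions for the example.

```python
import pandas as pd

fruits = pd.Series(['apple'] * 5 + ['sausage'] * 2 + ['banana'] * 2 + ['cheese'])

# value_counts sorts by count in descending order by default.
counts = fruits.value_counts()

labels = counts.index.tolist()  # category names, most frequent first
values = counts.tolist()        # the matching counts

print(labels)
print(values)
```

The relative order of tied labels (sausage and banana here) should not be relied on.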

Combine BarChart and PointChart

帅比萌擦擦* posted on 2019-11-29 17:01:06
I have a little "problem": I want to create a chart looking like this:

So basically, Series 1 is a normal bar chart, colored green if it ends before the "time max" (Series 2). Series 2 is just a DataPoint / marker on top of the Series 1 items. I am struggling with this though. My code:

    chart_TimeChart.Series.Clear();
    string series_timeneeded = "Time Needed";
    chart_TimeChart.Series.Add(series_timeneeded);
    chart_TimeChart.Series[series_timeneeded]["PixelPointWidth"] = "5";
    chart_TimeChart.ChartAreas[0].AxisY.ScrollBar.Size = 10;
    chart_TimeChart.ChartAreas[0].AxisY.ScrollBar.ButtonStyle =

Are there functions to retrieve the histogram counts of a Series in pandas?

我与影子孤独终老i posted on 2019-11-29 16:57:43
Question: There is a method to plot Series histograms, but is there a function to retrieve the histogram counts to do further calculations on top of it? I keep using numpy's functions to do this and converting the result to a DataFrame or Series when I need this. It would be nice to stay with pandas objects the whole time.

Answer 1: If your Series is discrete you could use value_counts:

    In [11]: s = pd.Series([1, 1, 2, 1, 2, 2, 3])

    In [12]: s.value_counts()
    Out[12]:
    2    3
    1    3
    3    1
    dtype: int64

You can see
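The two routes mentioned here (value_counts for discrete data, and NumPy's histogram wrapped back into a pandas object) can be sketched side by side. The choice of two bins and the decision to index the result by each bin's left edge are assumptions made for the example, not something the source prescribes.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 1, 2, 1, 2, 2, 3])

# Discrete data: value_counts already returns a Series of counts.
discrete_counts = s.value_counts().sort_index()

# Binned data: np.histogram returns the counts and the bin edges.
counts, edges = np.histogram(s, bins=2)

# Wrap the counts in a Series indexed by the left edge of each bin.
binned_counts = pd.Series(counts, index=edges[:-1])
```

From here binned_counts behaves like any other Series, so further calculations stay within pandas.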

Convert datetime to another format without changing dtype

蹲街弑〆低调 posted on 2019-11-29 15:37:14
I'm just learning Pandas myself and I have run into a few problems. In a DataFrame that was read from a csv file, one column contains dates in different formats (like '%m/%d/%Y' and '%Y-%m-%d', and possibly blank), and I want to unify the format of this column. But I don't know whether there are any other formats, so when I use pd.to_datetime(), it raises errors such as the format not matching or the data not being time-like. How can I unify the format of this column? I have converted part of that column into datetime dtype, and it's in YYYY-mm-dd format. Can I keep the datetime dtype, and change the
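One way to approach the mixed-format column is sketched below, under the assumption that only the two formats named in the question occur. Each format is parsed separately with errors='coerce', which turns non-matching entries into NaT, and the results are then combined. The sample data and variable names are made up for illustration.

```python
import pandas as pd

raw = pd.Series(['01/31/2019', '2019-02-15', '', 'not a date'])

# Try each known format in turn; entries that do not match become NaT.
as_mdy = pd.to_datetime(raw, format='%m/%d/%Y', errors='coerce')
as_ymd = pd.to_datetime(raw, format='%Y-%m-%d', errors='coerce')

# Fill the gaps left by the first format with results from the second.
dates = as_mdy.fillna(as_ymd)

# strftime produces strings for display; dates itself stays datetime64.
display = dates.dt.strftime('%m/%d/%Y')
```

A datetime64 column has no display format of its own, so a different textual layout means producing strings, as the last line does, while the datetime column is kept for calculations.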

Using replace efficiently in pandas

倾然丶 夕夏残阳落幕 posted on 2019-11-29 15:27:31
I am looking to use the replace function in an efficient way in python3. The code I have achieves the task, but is much too slow, as I am working with a large dataset. Thus, my priority is efficiency over elegance whenever there is a tradeoff. Here is a toy example of what I would like to do:

    import pandas as pd
    df = pd.DataFrame([[1,2],[3,4],[5,6]], columns = ['1st', '2nd'])

       1st  2nd
    0    1    2
    1    3    4
    2    5    6

    idxDict = dict()
    idxDict[1] = 'a'
    idxDict[3] = 'b'
    idxDict[5] = 'c'
    for k, v in idxDict.items():
        df['1st'] = df['1st'].replace(k, v)

Which gives

      1st  2nd
    0   a    2
    1   b    4
    2   c    6

as I desire, but it takes
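A common alternative to the per-key loop is to hand the whole dictionary to pandas in a single call. The sketch below shows two such calls on the toy data; whether either is fast enough for the asker's dataset is not something the source confirms, and the names replaced and mapped are assumptions for the example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=['1st', '2nd'])
idxDict = {1: 'a', 3: 'b', 5: 'c'}

# Single replace call with the whole dict; values without a key are left unchanged.
replaced = df['1st'].replace(idxDict)

# Series.map does a dictionary lookup per element; values without a key become NaN.
mapped = df['1st'].map(idxDict)

df['1st'] = mapped
print(df)
```

The two calls differ in how they treat values missing from the dictionary, which matters when the mapping does not cover the whole column.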

Adding new HighChart Series

让人想犯罪 __ posted on 2019-11-29 14:41:21
Question: JavaScript gives an error at this code:

    $.each(JSON, function(i, array) {
        chart.series[i].name = array.teamName;
        chart.series[i].setData(array.teamPower, true);
    });

I must define chart.series[i], because it says "Cannot set property 'name' of undefined", but I can't find a way to do this. This function runs with requestData, so it comes after the chart is defined with its options:

    function showGraph() {
        chart = new Highcharts.Chart(option);
    }

    chart: { renderTo: 'graphicShow', type: 'spline',