series | 易学教程

Extract all numeric characters from a pandas series (all groups)

Question: I am trying to use the str.extract('(\d+)') method on a pandas Series to get the digits of a phone number that looks like (123) 456-7890. This method returns only 123, but I want the output to be 1234567890. In general, I want to know how to get all the digits from a string without having to worry about groups. Thanks.

Answer 1: Source DF:

In [66]: x
Out[66]:
              phone
0    (123) 456-7890
1   +321 / 555-7890
2  (111) - 666 7890

In this case it is much easier to remove all non-digits using the '\D+' regex, as it
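The idea in Answer 1 (deleting every run of non-digits with '\D+') can be sketched as follows. This is a minimal illustration, not the answer's full code, which is cut off in the excerpt; the variable names `phones` and `digits` are assumptions, and the three sample values are the ones shown in the source DF.

```python
import pandas as pd

# Sample values copied from the source DF in the excerpt.
phones = pd.Series(["(123) 456-7890", "+321 / 555-7890", "(111) - 666 7890"])

# Replace every run of non-digit characters with the empty string.
# regex=True is passed explicitly because newer pandas versions treat
# the pattern as a literal string by default.
digits = phones.str.replace(r"\D+", "", regex=True)

print(digits.tolist())
```

Unlike str.extract, which returns only the first match of a capture group, this removes what is not wanted, so the number of digit groups in the input does not matter.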

Combine BarChart and PointChart

Question: I have a small problem: I want to create a chart that looks like this:

So basically, Series 1 is a normal bar chart, colored green if it ends before the "time max" (Series 2). Series 2 is just a data point / marker on top of the Series 1 items. I am struggling with this, though. My code:

chart_TimeChart.Series.Clear();
string series_timeneeded = "Time Needed";
chart_TimeChart.Series.Add(series_timeneeded);
chart_TimeChart.Series[series_timeneeded]["PixelPointWidth"] = "5";
chart_TimeChart.ChartAreas[0]

how do I calculate a rolling idxmax

Consider the pd.Series s:

import pandas as pd
import numpy as np

np.random.seed([3,1415])
s = pd.Series(np.random.randint(0, 10, 10), list('abcdefghij'))
s

a    0
b    2
c    7
d    3
e    8
f    7
g    0
h    6
i    8
j    6
dtype: int64

I want to get the index of the max value for each rolling window of 3.

s.rolling(3).max()

a    NaN
b    NaN
c    7.0
d    7.0
e    8.0
f    8.0
g    8.0
h    7.0
i    8.0
j    8.0
dtype: float64

What I want is:

a    None
b    None
c       c
d       c
e       e
f       e
g       e
h       f
i       i
j       i
dtype: object

What I've done:

s.rolling(3).apply(np.argmax)

a    NaN
b    NaN
c    2.0
d    1.0
e    2.0
f    1.0
g    0.0
h    0.0
i    2.0
j    1.0
dtype: float64

which is obviously not what I
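One possible way to turn the per-window argmax offsets into index labels is sketched below. It is not taken from the original thread (the excerpt ends before any answer); the helper name `rolling_idxmax` is hypothetical, and it assumes NumPy 1.20 or later for `sliding_window_view`. The series is built from the literal values printed in the question rather than from the random seed.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

# Values copied from the printed series in the question.
s = pd.Series([0, 2, 7, 3, 8, 7, 0, 6, 8, 6], index=list("abcdefghij"))


def rolling_idxmax(series, window):
    """Return, for each full window, the index label of its maximum."""
    values = series.to_numpy()
    # One row per full window: shape (len(values) - window + 1, window).
    windows = sliding_window_view(values, window)
    # Offset of the (first) maximum inside each window.
    offsets = windows.argmax(axis=1)
    # Window k starts at position k, so add k to get absolute positions.
    positions = offsets + np.arange(len(windows))
    labels = series.index[positions]
    # The first full window ends at position window - 1.
    result = pd.Series(labels, index=series.index[window - 1:], dtype=object)
    # Leading positions without a full window become missing values.
    return result.reindex(series.index)


result = rolling_idxmax(s, 3)
print(result)
```

The key step is adding the window's start position to the offset that argmax reports, which is exactly what the s.rolling(3).apply(np.argmax) attempt is missing. The first two entries come out as NaN rather than None.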

Dynamic Flot graph - show hide series by clicking on legend text or box on graph

I am working on a dynamic Flot graph with 3 series. I need to hide or show a series when its legend entry is clicked. I have seen various examples that work fine for static graphs, but with a dynamic graph it works only the first time: when the graph is updated with new data values, everything is displayed with the default options again. Once I hide a series, I want it to stay hidden until I click again to show it.

Here's a quick example I put together for you.

somePlot = null;
togglePlot = function(seriesIdx) {
  var someData = somePlot.getData();
  someData[seriesIdx].lines.show = !someData[seriesIdx].lines.show;

Convert datetime to another format without changing dtype

Question: I'm just learning pandas and have run into a few problems. In a DataFrame that was read from a CSV file, one column contains dates in different formats (such as '%m/%d/%Y' and '%Y-%m-%d'; it may also be blank), and I want to unify the format of this column. However, I don't know whether there are any other formats. When I use pd.to_datetime(), it raises errors such as the format not matching or the data not being time-like. How can I unify the format of this column? I have converted part of that
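One way to approach this, sketched under assumptions, is to try each known format in turn with errors="coerce" and keep the first successful parse per row. The sample values, the `raw` and `parsed` names, and the list of formats are illustrative only; the excerpt does not show the original data or any answer.

```python
import pandas as pd

# Hypothetical sample column with the two formats named in the question,
# plus a blank, a missing value, and a value that is not a date.
raw = pd.Series(["01/31/2020", "2020-02-15", "", None, "not a date"])

# Formats to try, in order. Extend this tuple if other formats turn up.
formats = ("%m/%d/%Y", "%Y-%m-%d")

parsed = None
for fmt in formats:
    # errors="coerce" turns anything that does not match into NaT
    # instead of raising.
    attempt = pd.to_datetime(raw, format=fmt, errors="coerce")
    # Keep earlier successes and fill remaining gaps with this attempt.
    parsed = attempt if parsed is None else parsed.fillna(attempt)

print(parsed)
```

Rows that are still NaT afterwards matched none of the listed formats, so `raw[parsed.isna()]` is a quick way to discover formats that were not anticipated. The result keeps a datetime dtype; formatting it with `.dt.strftime` would turn it into strings.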

How can I create a series of months to join sparse data to?

I think this is a pretty common issue, but I don't know what the process is called, so I'll describe it with an example. The concept is that I want to join a sparse dataset to a complete series, such as the days of the week, months of the year, or any ordered set (for example, for ranking). Empty positions in the sparse data will show as NULL alongside the complete series.

Let's say I run the following query in SQL Server to find out monthly sales.

SELECT YEAR([timestamp]), MONTH([timestamp]), COUNT(*)
FROM table1
WHERE YEAR([timestamp]) = YEAR(GETDATE())
GROUP BY YEAR([timestamp]), MONTH(
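The concept described above (aligning sparse counts against a complete ordered series so that gaps show up as missing values) can be illustrated in pandas. This is only an analogy to the SQL Server query, not a translation of any answer from the thread; the `sales` data and all variable names are made up for the example.

```python
import pandas as pd

# Hypothetical sparse data: sales exist only in January, March, and July.
sales = pd.DataFrame(
    {"timestamp": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-03-11", "2024-07-02"]
    )}
)

# Sparse result: one row per month that actually has sales.
monthly = sales.groupby(sales["timestamp"].dt.month).size()

# Complete series: every month of the year, 1 through 12.
all_months = pd.Index(range(1, 13), name="month")

# Align the sparse counts to the complete series. Months without sales
# become NaN, which plays the role of NULL in a LEFT JOIN.
complete = monthly.reindex(all_months)

print(complete)
```

In SQL the same effect is usually obtained by building the complete list of months first and LEFT JOINing the aggregated query to it.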

Using replace efficiently in pandas

Question: I am looking to use the replace function efficiently in Python 3. The code I have achieves the task but is much too slow, as I am working with a large dataset. Thus, my priority is efficiency over elegance whenever there is a trade-off. Here is a toy example of what I would like to do:

import pandas as pd
df = pd.DataFrame([[1,2],[3,4],[5,6]], columns = ['1st', '2nd'])

   1st  2nd
0    1    2
1    3    4
2    5    6

idxDict = dict()
idxDict[1] = 'a'
idxDict[3] = 'b'
idxDict[5] = 'c'
for k,v in idxDict.items():
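A common alternative to looping over the dictionary is a single vectorized lookup with Series.map. The sketch below assumes the goal is to translate the '1st' column through the dictionary; the excerpt is cut off before the loop body, so that target column is a guess, and `idx_dict` and `mapped` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=["1st", "2nd"])

# Same mapping as idxDict in the question.
idx_dict = {1: "a", 3: "b", 5: "c"}

# One pass over the column: each value is looked up in the dictionary.
# Values that are not keys of the dictionary would become NaN.
mapped = df["1st"].map(idx_dict)

print(mapped.tolist())
```

This does one lookup per row instead of one scan of the column per dictionary key, which is where a loop over idxDict.items() tends to get slow.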

Is there a simple way to change a column of yes/no to 1/0 in a Pandas dataframe?

I read a CSV file into a pandas DataFrame and would like to convert the columns with binary answers from the strings yes/no to the integers 1/0. Below I show one such column ("sampleDF" is the pandas DataFrame).

In [13]: sampleDF.housing[0:10]
Out[13]:
0     no
1     no
2    yes
3     no
4     no
5     no
6     no
7     no
8    yes
9    yes
Name: housing, dtype: object

Help is much appreciated!

method 1
sample.housing.eq('yes').mul(1)

method 2
pd.Series(np.where(sample.housing.values == 'yes', 1, 0), sample.index)

method 3
sample.housing.map(dict(yes=1, no=0))

method 4
pd.Series(map(lambda x: dict(yes=1, no=0)[x], sample
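Methods 1 and 3 can be tried on the ten sample values from the question. This is a small self-contained sketch; the standalone `housing` series stands in for the sampleDF.housing column, and the result names are illustrative.

```python
import pandas as pd

# The ten values shown in the question's output.
housing = pd.Series(
    ["no", "no", "yes", "no", "no", "no", "no", "no", "yes", "yes"],
    name="housing",
)

# Method 1: compare to 'yes' (giving booleans), then multiply by 1
# to turn True/False into 1/0.
by_compare = housing.eq("yes").mul(1)

# Method 3: look each value up in an explicit mapping.
by_map = housing.map({"yes": 1, "no": 0})

print(by_compare.tolist())
print(by_map.tolist())
```

The two differ on unexpected input: the comparison turns anything other than 'yes' into 0, while the mapping turns values outside the dictionary into NaN, which makes stray values easier to spot.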

Assign values to multiple columns in Pandas

Question: I have the following simple DataFrame, df:

   0
0  1
1  2
2  3

When I try to create new columns and assign values to them, as in the example below:

df['col2', 'col3'] = [(2,3), (2,3), (2,3)]

I get the following structure:

   0 (col2, col3)
0  1       (2, 3)
1  2       (2, 3)
2  3       (2, 3)

However, I am looking for a way to get this:

   0 col2, col3
0  1       2, 3
1  2       2, 3
2  3       2, 3

Answer 1: Looks like the solution is simple:

df['col2'], df['col3'] = zip(*[(2,3), (2,3), (2,3)])

Answer 2: There is a convenient solution to joining multiple series to a
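Answer 1's zip approach can be run end to end as below. The DataFrame is rebuilt from the values shown in the question, and the `pairs` name is an assumption introduced for readability.

```python
import pandas as pd

# The single-column DataFrame from the question (column label is the int 0).
df = pd.DataFrame({0: [1, 2, 3]})

pairs = [(2, 3), (2, 3), (2, 3)]

# zip(*pairs) transposes the row tuples into one tuple per column:
# (2, 2, 2) and (3, 3, 3). Each is then assigned to its own column.
df["col2"], df["col3"] = zip(*pairs)

print(df)
```

The original attempt, df['col2', 'col3'] = ..., creates one column whose label is the tuple ('col2', 'col3'), which is why each cell ended up holding a whole (2, 3) pair.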

Sum of series: 1^1 + 2^2 + 3^3 + … + n^n (mod m)

Question: Can someone give me an idea of an efficient algorithm for large n (say 10^10) to find the sum of the above series? My code is getting killed for n = 100000 and m = 200000.

#include<stdio.h>
int main() {
    int n,m,i,j,sum,t;
    scanf("%d%d",&n,&m);
    sum=0;
    for(i=1;i<=n;i++) {
        t=1;
        for(j=1;j<=i;j++)
            t=((long long)t*i)%m;
        sum=(sum+t)%m;
    }
    printf("%d\n",sum);
}

Answer 1: Two notes:

(a + b + c) % m is equivalent to (a % m + b % m + c % m) % m

and

(a * b * c) % m is equivalent to ((a % m) * (b % m) * (c % m)) % m

As
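The two identities in Answer 1 mean every intermediate value can be reduced modulo m. Combined with fast modular exponentiation, that already removes the inner loop of the question's code. The sketch below uses Python's built-in three-argument pow for this; it is an assumption about where the truncated answer is heading, and the function name `power_sum_mod` is hypothetical.

```python
def power_sum_mod(n, m):
    """Return (1^1 + 2^2 + ... + n^n) mod m."""
    total = 0
    for i in range(1, n + 1):
        # pow(i, i, m) computes i**i % m by repeated squaring,
        # using about log2(i) multiplications instead of i.
        total = (total + pow(i, i, m)) % m
    return total


print(power_sum_mod(3, 1000))
print(power_sum_mod(100000, 200000))
```

This brings the cost down from roughly n^2 multiplications to roughly n log n, which is enough for n = 100000. It still loops over every i, so on its own it does not make n = 10^10 practical.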