series

Python Pandas removing substring using another column

本秂侑毒 submitted on 2019-11-29 06:48:14
I've tried searching around and can't figure out an easy way to do this, so I'm hoping your expertise can help. I have a pandas data frame with two columns:

    import numpy as np
    import pandas as pd
    pd.options.display.width = 1000
    testing = pd.DataFrame({'NAME': ['FIRST', np.nan, 'NAME2', 'NAME3', 'NAME4', 'NAME5', 'NAME6'],
                            'FULL_NAME': ['FIRST LAST', np.nan, 'FIRST LAST', 'FIRST NAME3', 'FIRST NAME4 LAST', 'ANOTHER NAME', 'LAST NAME']})

which gives me

              FULL_NAME   NAME
    0        FIRST LAST  FIRST
    1               NaN    NaN
    2        FIRST LAST  NAME2
    3       FIRST NAME3  NAME3
    4  FIRST NAME4 LAST  NAME4
    5      ANOTHER NAME  NAME5
    6         LAST NAME  NAME6
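The excerpt stops before the desired output is stated, so the following is only a sketch under one assumption: that the goal is to delete each row's NAME from the same row's FULL_NAME wherever it occurs. The helper `strip_name` and the column `STRIPPED` are hypothetical names introduced for illustration.

```python
import numpy as np
import pandas as pd

testing = pd.DataFrame({
    'NAME': ['FIRST', np.nan, 'NAME2', 'NAME3', 'NAME4', 'NAME5', 'NAME6'],
    'FULL_NAME': ['FIRST LAST', np.nan, 'FIRST LAST', 'FIRST NAME3',
                  'FIRST NAME4 LAST', 'ANOTHER NAME', 'LAST NAME'],
})


def strip_name(row):
    """Remove the row's NAME from its FULL_NAME (hypothetical helper)."""
    full, name = row['FULL_NAME'], row['NAME']
    # Leave rows with a missing value untouched.
    if pd.isna(full) or pd.isna(name):
        return full
    # Drop the substring, then collapse any whitespace left behind.
    return ' '.join(full.replace(name, '').split())


# axis=1 passes one row at a time, so both columns are visible to the helper.
testing['STRIPPED'] = testing.apply(strip_name, axis=1)
print(testing)
```

Rows where NAME does not occur in FULL_NAME come back unchanged, because `str.replace` is a no-op when the substring is absent.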

Python Reindex Producing Nan

耗尽温柔 submitted on 2019-11-29 06:47:04
Question: Here is the code that I am working with:

    import pandas as pd
    test3 = pd.Series([1,2,3], index = ['a','b','c'])
    test3 = test3.reindex(index = ['f','g','z'])

So originally everything is fine, and test3 has an index of 'a', 'b', 'c' and values 1, 2, 3. But when I go to reindex test3, my values 1, 2, 3 are lost. Why is that? The desired output would be:

    f    1
    g    2
    z    3

Answer 1: The docs are clear on this behaviour: Conform Series to new index with optional filling logic, placing NA/NaN in
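A minimal sketch of the difference the answer points at: `reindex` aligns on labels that already exist, so labels that are new get NaN, whereas relabelling keeps the values and swaps the index. The variable names `aligned`, `relabelled` and `renamed` are illustrative only.

```python
import pandas as pd

test3 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

# reindex looks up the existing labels. None of 'f', 'g', 'z' exist,
# so every position is filled with NaN.
aligned = test3.reindex(['f', 'g', 'z'])

# To relabel instead, keep the values and assign a new index.
relabelled = test3.copy()
relabelled.index = ['f', 'g', 'z']

# The same relabelling through an explicit old-to-new mapping.
renamed = test3.rename({'a': 'f', 'b': 'g', 'c': 'z'})

print(aligned)
print(relabelled)
```

Assigning to `.index` requires the new list to have the same length as the Series; `rename` with a mapping is the safer choice when only some labels change.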

highcharts: dynamically define colors in pie chart

随声附和 submitted on 2019-11-29 06:11:51
Question: I'm trying to dynamically define a color for each series depending on its type. Below is my code, which doesn't work but shows what I'm trying to do. I would like to define a colour for a certain type, e.g. if type = 'IS' then color = '#FFCACA'. I cannot expect that ret will contain all types, so I need to know which types are returned in ret and then associate a color with each of them. How can I do that? This is the code from the point where the data is received: success: function (ret) { $(function () { var chart; $(document).ready

Python Pandas iterate over rows and access column names

眉间皱痕 submitted on 2019-11-29 03:02:16
I am trying to iterate over the rows of a Python Pandas dataframe. Within each row of the dataframe, I am trying to refer to each value along the row by its column name. Here is what I have:

    import numpy as np
    import pandas as pd
    df = pd.DataFrame(np.random.rand(10,4),columns=list('ABCD'))
    print(df)

              A         B         C         D
    0  0.351741  0.186022  0.238705  0.081457
    1  0.950817  0.665594  0.671151  0.730102
    2  0.727996  0.442725  0.658816  0.003515
    3  0.155604  0.567044  0.943466  0.666576
    4  0.056922  0.751562  0.135624  0.597252
    5  0.577770  0.995546  0.984923  0.123392
    6  0.121061  0.490894  0.134702  0.358296
    7  0.895856  0.617628  0
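The excerpt is cut off before any answer, so this is only a sketch of two standard ways to reach a value by column name while looping. Adding columns A and D is a made-up task chosen for illustration, and the seeded generator is an assumption so the run is repeatable.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the example is repeatable
df = pd.DataFrame(rng.random((10, 4)), columns=list('ABCD'))

# iterrows yields (index label, row as a Series); values are read by column name.
sums_iterrows = [row['A'] + row['D'] for _, row in df.iterrows()]

# itertuples yields namedtuples; each column becomes an attribute.
sums_itertuples = [row.A + row.D for row in df.itertuples(index=False)]

# When the per-row work is plain arithmetic, no loop is needed at all.
sums_vectorised = (df['A'] + df['D']).tolist()

print(sums_iterrows[:3])
```

`itertuples` only exposes a column as an attribute when its name is a valid Python identifier, which holds for A to D here.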

Pandas mask / where methods versus NumPy np.where

老子叫甜甜 submitted on 2019-11-29 01:46:03
Question: I often use the Pandas mask and where methods for cleaner logic when updating values in a series conditionally. However, for relatively performance-critical code I notice a significant performance drop relative to numpy.where. While I'm happy to accept this for specific cases, I'm interested to know: Do the Pandas mask / where methods offer any additional functionality, apart from the inplace / errors / try-cast parameters? I understand those 3 parameters but rarely use them. For example, I have no idea
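For readers comparing the three calls, here is a small sketch of how they line up on the same condition. It says nothing about speed; it only shows that `where` keeps values where the condition holds, `mask` replaces them where it holds, and `np.where` returns a bare array without the index. The sample data and variable names are made up.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, -2, 3, -4], index=list('wxyz'))
cond = s > 0

kept = s.where(cond, 0)      # keep values where cond is True, otherwise use 0
masked = s.mask(~cond, 0)    # replace values where ~cond is True with 0
raw = np.where(cond, s, 0)   # same values, but a plain ndarray with no index

print(kept)
print(raw)
```

To get a labelled result back from `np.where`, one option is to wrap it as `pd.Series(raw, index=s.index)`.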

How to get the number of the most frequent value in a column?

拟墨画扇 submitted on 2019-11-28 20:50:32
Question: I have a data frame and I would like to know how many times the most frequent value occurs in a given column. I try to do it in the following way:

    items_counts = df['item'].value_counts()
    max_item = items_counts.max()

As a result I get:

    ValueError: cannot convert float NaN to integer

As far as I understand, the first line gives me a series in which the values from the column are used as keys and the frequencies of these values are used as values. So, I just need to find the largest value in the series and,
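The excerpt does not show what caused the ValueError, so the sketch below does not try to reproduce it. It only shows the approach from the question working on a small made-up frame, with `idxmax` added to recover which value is the most frequent.

```python
import numpy as np
import pandas as pd

# Made-up data; the NaN is there to show that value_counts skips it by default.
df = pd.DataFrame({'item': ['a', 'b', 'a', np.nan, 'c', 'a', 'b']})

items_counts = df['item'].value_counts()  # dropna=True is the default
top_count = int(items_counts.max())       # how often the most frequent value occurs
top_value = items_counts.idxmax()         # which value that is

print(top_value, top_count)
```

Passing `dropna=False` to `value_counts` would count the missing entries as their own group.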

assigning column names to a pandas series

心不动则不痛 submitted on 2019-11-28 19:24:46
I have a pandas series object x

    Ezh2    2
    Hmgb    7
    Irf1    1

I want to save this as a dataframe with the column names Gene and Count respectively. I tried

    x_df = pd.DataFrame(x,columns = ['Gene','count'])

but it does not work. The final form I want is

    Gene    Count
    Ezh2    2
    Hmgb    7
    Irf1    1

Can you suggest how to do this?

You can create a dict and pass this as the data param to the dataframe constructor:

    In [235]: df = pd.DataFrame({'Gene':s.index, 'count':s.values})
              df
    Out[235]:
       Gene  count
    0  Ezh2      2
    1  Hmgb      7
    2  Irf1      1

Alternatively you can create a df from the series; you need to call reset_index as the index will be
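The answer's second suggestion is cut off, so the following is one possible shape for it rather than the answer's own code: name the index, then let `reset_index` turn it into a column while `name=` labels the values column.

```python
import pandas as pd

s = pd.Series([2, 7, 1], index=['Ezh2', 'Hmgb', 'Irf1'])

# rename_axis labels the index 'Gene'; reset_index moves it into a column
# and name= sets the header of the column holding the values.
x_df = s.rename_axis('Gene').reset_index(name='Count')

print(x_df)
```

This gives the same frame as the dict-based constructor, with the column headers set in one chained expression.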

Extract values in Pandas value_counts()

我的梦境 submitted on 2019-11-28 15:37:59
Question: Say we have used pandas dataframe[column].value_counts(), which outputs:

    apple      5
    sausage    2
    banana     2
    cheese     1

How do you extract the values in the same order as shown above, from max to min? e.g. [apple,sausage,banana,cheese]

Answer 1: Try this:

    dataframe[column].value_counts().index.tolist()
    ['apple', 'sausage', 'banana', 'cheese']

Answer 2:

    #!/usr/bin/env python
    import pandas as pd
    # Make example dataframe
    df = pd.DataFrame([(1, 'Germany'), (2, 'France'), (3, 'Indonesia'), (4, 'France'), (5, 'France'),
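A self-contained version of the first answer, built on made-up data that reproduces the counts from the question. The labels come from the index of the result and the frequencies from its values.

```python
import pandas as pd

items = ['apple'] * 5 + ['sausage'] * 2 + ['banana'] * 2 + ['cheese']
counts = pd.Series(items).value_counts()  # sorted by frequency, descending

labels = counts.index.tolist()  # the distinct values, most frequent first
freqs = counts.tolist()         # the matching counts

print(labels)
print(freqs)
```

sausage and banana are tied here; I would not rely on the relative order of tied values staying the same across pandas versions.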

How to get the first column of a pandas DataFrame as a Series?

▼魔方 西西 submitted on 2019-11-28 14:23:52
Question: I tried:

    x=pandas.DataFrame(...)
    s = x.take([0], axis=1)

And s gets a DataFrame, not a Series.

Answer 1:

    >>> import pandas as pd
    >>> df = pd.DataFrame({'x' : [1, 2, 3, 4], 'y' : [4, 5, 6, 7]})
    >>> df
       x  y
    0  1  4
    1  2  5
    2  3  6
    3  4  7
    >>> s = df.ix[:,0]
    >>> type(s)
    <class 'pandas.core.series.Series'>

UPDATE: If you're reading this after June 2017, ix has been deprecated in pandas 0.20.2, so don't use it. Use loc or iloc
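A short sketch of the `iloc` route that the update recommends, using the same frame as the answer. It also shows why the original attempt returned a DataFrame: a list selector keeps the result two-dimensional. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [4, 5, 6, 7]})

s_by_position = df.iloc[:, 0]   # scalar column position -> Series
s_by_label = df['x']            # scalar column label -> Series
still_frame = df.iloc[:, [0]]   # list of positions -> one-column DataFrame

print(type(s_by_position))
print(type(still_frame))
```

If a one-column DataFrame is already in hand, `still_frame.squeeze(axis=1)` is another way to collapse it to a Series.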

Highcharts - Get crossing point of crossing series

本小妞迷上赌 submitted on 2019-11-28 12:59:58
I am currently trying to extract the points where multiple series (a, b, c, d) cross a specific series (x). I can't seem to find any function that can aid me in this task. My best bet is to measure the distance from every single point in x to every single point in a, b, c, d... and assume that when the distance drops below some threshold, the point must be a crossing point. I think this approach is far too computationally heavy and seems "dirty". I believe there must be easier or better ways, perhaps even functions within Highcharts' own API. I have searched various sources and sites, but I can't