apply

Working with Excel in pandas, part 07: filtering data

末鹿安然 submitted on 2020-02-29 14:22:10
import pandas as pd

def age_18_to_30(a):
    return 18 <= a < 30

def level_a(s):
    return 85 <= s <= 100

students = pd.read_excel('D:/output.xlsx', index_col='idx')

# Select students aged 18 to 30 (30 excluded) with scores between 85 and 100
students = students.loc[students['Age'].apply(age_18_to_30)].loc[students['Score'].apply(level_a)]

# The previous line can also be written with attribute access
students = students.loc[students.Age.apply(age_18_to_30)].loc[students.Score.apply(level_a)]

# Lambda expression version
students = students.loc[students.Age.apply(lambda a: 18 <= a < 30)] \
    .loc[students.Score.apply(lambda s: 85 <= s <= 100)]

print(students)
#books.to_excel('D:/output.xlsx')

Video link:
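The same filter can also be written with vectorized boolean masks rather than apply. This is a minimal sketch, assuming a small in-memory DataFrame (made-up rows standing in for D:/output.xlsx) that has the same Age and Score columns:

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet in the post.
students = pd.DataFrame(
    {
        "Name": ["s1", "s2", "s3", "s4"],
        "Age": [17, 18, 29, 30],
        "Score": [90, 85, 84, 100],
    },
    index=pd.Index([1, 2, 3, 4], name="idx"),
)

# Age in [18, 30): closed on the left, open on the right, as in age_18_to_30.
age_ok = (students["Age"] >= 18) & (students["Age"] < 30)
# Score in [85, 100]: between() is inclusive on both ends by default.
score_ok = students["Score"].between(85, 100)

# Combine both masks in a single .loc lookup.
selected = students.loc[age_ok & score_ok]
print(selected)
```

Combining the masks with & before indexing avoids the second chained .loc, which otherwise has to align a mask built on the unfiltered frame.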

Panda rolling window percentile rank

浪子不回头ぞ submitted on 2020-02-26 15:30:36
Question: I am trying to calculate the percentile rank of data by column within a rolling window.

test = pd.DataFrame(np.random.randn(20, 3), pd.date_range('1/1/2000', periods=20), ['A', 'B', 'C'])
test
Out[111]:
                   A         B         C
2000-01-01 -0.566992 -1.494799  0.462330
2000-01-02 -0.550769 -0.699104  0.767778
2000-01-03 -0.270597  0.060836  0.057195
2000-01-04 -0.583784 -0.546418 -0.557850
2000-01-05  0.294073 -2.326211  0.262098
2000-01-06 -1.122543 -0.116279 -0.003088
2000-01-07  0.121387  0.763100  3.503757
2000-01-08 0
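One possible reading of the question is sketched below: for each column, rank the newest value against the other values in its trailing window and report that rank as a fraction. The helper name rolling_pct_rank and the window length of 5 are assumptions made for illustration; the excerpt does not say which window the asker wants.

```python
import numpy as np
import pandas as pd

def rolling_pct_rank(df, window):
    """Percentile rank of each value within its trailing window, per column."""
    # raw=False passes each window as a Series, so Series.rank is available.
    return df.rolling(window).apply(
        lambda w: w.rank(pct=True).iloc[-1], raw=False
    )

test = pd.DataFrame(
    np.random.randn(20, 3),
    pd.date_range("1/1/2000", periods=20),
    ["A", "B", "C"],
)

# The first window - 1 rows are NaN because the window is not yet full.
ranked = rolling_pct_rank(test, window=5)
print(ranked.tail())
```

A value of 1.0 means the newest observation is the largest in its window; with a window of 5, the smallest possible value is 0.2.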

The difference between apply, call, and bind

南楼画角 submitted on 2020-02-22 00:56:08
The difference between apply and call

It is actually simple: the two differ only in how the arguments are passed when the function is invoked. Consider this code:

var i = 40;

var a = {
    i : 20,
    foo : function () {
        var i = 10;
        return this.i;
    },
    foo1 : function (param1, param2) {
        return param1 + param2 + i;
    }
}

var b = {
    i : 5
}

console.log(a.foo());                    //20
console.log((a.foo)());                  //20
console.log((a.foo,a.foo)());            //40
console.log(a.foo.apply(window));        //40
console.log(a.foo.apply(a));             //20
console.log(a.foo.apply(a.foo));         //undefined
console.log(a.foo1.apply(b, [2, 1]));    //43
console.log(a.foo1.call(b, 2, 1));       //43

Clearly

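JavaScript function execution context

只愿长相守 submitted on 2020-02-13 23:02:49
In JS, every function has an execution context, which we can access through this. For example, in a global function:

function test(){
    var local = this;
}

we find that local equals window (the global object in a browser, when the code is not in strict mode). In other words, a global function is actually a property of window. The same holds for global variables. For example, given

var name = 'phil';

we can access it through window['name'] or window.name. When a function is a property of an object, the function's context is that object.

var student = {};
student.age = 20;
student.getAge = function(){
    return this.age;
}

Things get slightly more complicated when functions are nested.

var seq = [1,2,3,4];
for(var i in seq){
    var name = 'phil' + i;
    window.setTimeout(function(){
        $('p').append(name);
    },i*1000)
}

Some might expect the output to be phil0phil1phil2phil3 (for...in iterates over the array indices 0 to 3, not the values), but the actual result is phil3phil3phil3phil3, because the context of the function window.setTimeout (we often omit window) is actually window, and name in the function body is actually window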

map-apply-applymap

大兔子大兔子 submitted on 2020-02-13 13:04:40
In [1]:
import warnings
import math
import pandas as pd
import numpy as np
import matplotlib
warnings.filterwarnings('ignore')
pd.options.display.max_rows = 100
pd.options.display.max_columns = 100
pd.set_option('max_colwidth', 500)
get_ipython().magic(u'matplotlib inline')
matplotlib.style.use('ggplot')
from matplotlib import pyplot as plt
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
myfont = matplotlib.font_manager.FontProperties(fname=u'simsun.ttc', size=14)

In [11]:
data = pd
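The notebook excerpt is cut off before it reaches the three methods named in its title. As a rough illustration of how they differ, here is a minimal sketch on a made-up DataFrame (the data and variable names are assumptions, not taken from the notebook):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Series.map: element-wise over a single column.
a_squared = df["a"].map(lambda v: v * v)

# DataFrame.apply: the function receives a whole column (default axis=0)
# or a whole row (axis=1).
col_ranges = df.apply(lambda col: col.max() - col.min())
row_sums = df.apply(lambda row: row["a"] + row["b"], axis=1)

# Element-wise over every cell. Newer pandas names this DataFrame.map;
# older versions only have DataFrame.applymap, so fall back to it.
elementwise = getattr(df, "map", None) or df.applymap
doubled = elementwise(lambda v: v * 2)

print(a_squared.tolist(), col_ranges.to_dict(), row_sums.tolist())
print(doubled)
```

In short, map works on one Series, apply works on whole rows or columns, and applymap works on each cell of a DataFrame.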

text processing

南笙酒味 submitted on 2020-02-05 22:59:27
Import libraries

from nltk.corpus import stopwords
from textblob import TextBlob
from textblob import Word

Lower casing and removing punctuation

df['Text'] = df['Text'].apply(lambda x: " ".join(x.lower() for x in x.split()))
df['Text'] = df['Text'].str.replace(r'[^\w\s]', '', regex=True)
df.Text.head(5)

Removal of stop words

stop = stopwords.words('english')
df['Text'] = df['Text'].apply(lambda x: " ".join(x for x in x.split() if x not in stop))

Spelling correction

df['Text'] = df['Text'].apply(lambda x: str(TextBlob(x).correct()))

Lemmatization

df['Text'] = df['Text'].apply(lambda x: " ".join([Word(word).
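The first two cleaning steps can be tried without downloading any corpora. The sketch below assumes a made-up two-row DataFrame and a tiny hand-written stop list in place of nltk's stopwords.words('english'); spelling correction and lemmatization are left out because they need textblob.

```python
import pandas as pd

# Hypothetical data and stop list, used only for illustration.
df = pd.DataFrame(
    {"Text": ["Hello, World! This is a TEST.", "The cat sat on the mat..."]}
)
stop = {"this", "is", "a", "the", "on"}

# Lower-case, then drop anything that is not a word character or whitespace.
cleaned = df["Text"].str.lower().str.replace(r"[^\w\s]", "", regex=True)

# Remove stop words token by token and re-join with single spaces.
cleaned = cleaned.apply(
    lambda s: " ".join(w for w in s.split() if w not in stop)
)
print(cleaned.tolist())
```

Passing regex=True explicitly matters here: without it, recent pandas versions treat the pattern as a literal string and the punctuation is left in place.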

update column value of pandas groupby().last()

会有一股神秘感。 submitted on 2020-02-04 02:33:03
Question: Given the dataframe:

dfd = pd.DataFrame({'A': [1, 1, 2, 2, 3, 3], 'B': [4, 5, 6, 7, 8, 9], 'C': ['a', 'b', 'c', 'c', 'd', 'e']})

I can find the last C value of each A group by using

dfd.groupby('A').last()['C']

However, I want to update the C values to np.nan, and I don't know how to do that. A method such as:

def replace(df):
    df['C'] = np.nan
    return replace

dfd.groupby('A').last().apply(lambda dfd: replace(dfd))

does not work. I want a result like:

dfd_result = pd.DataFrame({'A': [1, 1, 2,2,3,3], 'B': [4, 5, 6,7
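The expected result is cut off in the excerpt, so the following is a guess at the intent: set C to NaN in the last row of each A group while leaving every other row untouched. A minimal sketch under that assumption:

```python
import numpy as np
import pandas as pd

dfd = pd.DataFrame({
    "A": [1, 1, 2, 2, 3, 3],
    "B": [4, 5, 6, 7, 8, 9],
    "C": ["a", "b", "c", "c", "d", "e"],
})

# groupby().last() returns a new aggregated frame, so assigning to it never
# touches dfd. tail(1) keeps the original row labels of each group's last row.
last_rows = dfd.groupby("A").tail(1).index

# Assign through .loc on the original frame to update it in place.
dfd.loc[last_rows, "C"] = np.nan
print(dfd)
```

The key point is to recover the row labels of the last rows and write through dfd.loc, rather than modifying the aggregated copy that last() produces.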

update column value of pandas groupby().last()

痴心易碎 submitted on 2020-02-04 02:29:46
Question: Given the dataframe:

dfd = pd.DataFrame({'A': [1, 1, 2, 2, 3, 3], 'B': [4, 5, 6, 7, 8, 9], 'C': ['a', 'b', 'c', 'c', 'd', 'e']})

I can find the last C value of each A group by using

dfd.groupby('A').last()['C']

However, I want to update the C values to np.nan, and I don't know how to do that. A method such as:

def replace(df):
    df['C'] = np.nan
    return replace

dfd.groupby('A').last().apply(lambda dfd: replace(dfd))

does not work. I want a result like:

dfd_result = pd.DataFrame({'A': [1, 1, 2,2,3,3], 'B': [4, 5, 6,7

Correct use of sapply with Anova on multiple subsets in R

孤街醉人 submitted on 2020-02-03 02:07:48
Question: I am trying to run a two-way ANOVA on multiple subsets of a data frame without having to actually subset the data, as this is inefficient.

Example data:

DF <- structure(list(
  Sample = c(666L, 676L, 686L, 667L, 677L, 687L, 822L, 832L, 842L, 824L, 834L, 844L),
  Time = c(300L, 300L, 300L, 300L, 300L, 300L, 400L, 400L, 400L, 400L, 400L, 400L),
  Ploidy = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("2n", "3n"), class = "factor"),
  Tissue = c("muscle", "muscle", "muscle", "liver