pivot

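Recommended Python pandas Tutorial

我的梦境 submitted on 2020-07-29 07:50:08
Introduction: The following is a set of pandas study notes by a GitHub author. It is the best material I have seen so far, bar none. Reposted from: https://github.com/datawhalechina/joyful-pandas/blob/master/README.md
Joyful-Pandas [This tutorial is kept in sync with the latest official Pandas release; current version: v-1.0.3] [Note] Before using the tutorial, be sure to upgrade Pandas to the latest version, otherwise the code may raise errors when run.
1. Why I wrote this: Before I used Pandas, almost every large table-processing problem was handled with xlrd/xlwt and Python loops. That approach can meet almost any requirement, but its drawbacks are obvious: first, it is slow; second, the code is almost impossible to reuse.
I had tried to learn Pandas piecemeal, but the package is so large that every use felt like the blind men and the elephant. Each function takes many parameters, and the learning curve is not gentle. If you are just getting started with Pandas, errors come up often during piecemeal learning and are hard to fix (because you do not understand what happens internally). Even after fixing one, you cannot fix it the next time, which is discouraging.
In autumn 2019 I came across a book devoted entirely to Pandas, Pandas Cookbook by Theodore Petrou. A PDF can be found online, but there is no Chinese edition yet. Once the winter break started, I went through it quickly and found that many concepts I had never understood were answered well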

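Three Advanced Excel Features, Each in a Single Line of Code!

不问归期 submitted on 2020-07-26 23:35:38
Excel is a ubiquitous tool for data processing and analysis. Most people have used it to some degree, and once you master its techniques, it opens another window for you! Some people also find Python, for all its unlimited potential, very challenging. In this article we look at three things that can be done in Excel but are easier to do in Python!
We start by importing pandas and loading two dataframes from the worksheets we need in the workbook. The two dataframes are named sales and states.
    import pandas as pd
    sales = pd.read_excel('https://github.com/datagy/mediumdata/raw/master/pythonexcel.xlsx', sheet_name = 'sales')
    states = pd.read_excel('https://github.com/datagy/mediumdata/raw/master/pythonexcel.xlsx', sheet_name = 'states')
Importing our datasets into Pandas dataframes.
Suppose we run the .head() method on the dataframe, as follows:
    print(sales.head())
We can compare this with the data in Excel: the data is displayed in much the same way as in Excel, but there are some key differences: Excel starts at row 1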

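LeetCode -- Binary Search

只愿长相守 submitted on 2020-07-25 00:01:00
As mentioned earlier in the data structures chapter on searching, binary (half-interval) search is generally used for lookups in sorted lists, but the details of the algorithm need care when you write it. I have solved eight LeetCode problems that use binary search, and here I summarize which details to watch for.
Contents: Basic binary search · Minding the search position · Partially sorted arrays · Summary
Basic binary search
LeetCode problem 704, Binary Search: Given a sorted (in ascending order) integer array nums of n elements and a target value, write a function to search target in nums. If target exists, then return its index, otherwise return -1.
Example 1: Input: nums = [-1,0,3,5,9,12], target = 9 Output: 4 Explanation: 9 exists in nums and its index is 4
This problem is the simplest form of binary search, and my solution at the time was binary search as well:
    public int search(int[] nums, int target) { int start = 0, end = nums.length - 1; while(start <= end)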
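The Java solution above is cut off in this excerpt. As a minimal sketch of the same idea in Python (not the author's code; the variable names start and end follow the excerpt, everything else is an assumption), a closed-interval binary search for problem 704 could look like this:

```python
from typing import List


def search(nums: List[int], target: int) -> int:
    """Return the index of target in the ascending list nums, or -1."""
    start, end = 0, len(nums) - 1  # search the closed interval [start, end]
    while start <= end:
        # Midpoint written this way mirrors the overflow-safe form used in Java.
        mid = start + (end - start) // 2
        if nums[mid] == target:
            return mid
        if nums[mid] < target:
            start = mid + 1  # target can only be to the right of mid
        else:
            end = mid - 1    # target can only be to the left of mid
    return -1  # interval became empty: target is absent


print(search([-1, 0, 3, 5, 9, 12], 9))
```

The loop condition start <= end pairs with the updates mid + 1 and mid - 1; mixing a closed interval with a half-open update is the usual source of off-by-one bugs and infinite loops.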

create pivot table with aggregates without a join

自作多情 submitted on 2020-07-22 21:38:59
Question: I think I am trying to do something that cannot be done. I am trying to create a pivot table, simultaneously doing two pivots by aggregating off two different columns. I have created a much simplified example to make the point more understandable.
    CREATE TABLE two_aggregate_pivot ( ID INT, category CHAR(1), value INT )
    INSERT INTO dbo.two_aggregate_pivot ( ID, category, value )
    VALUES (1, 'A', 100), (1, 'B', 97), (1, 'D', NULL), (2, 'A', 86), (2, 'C', 83), (2, 'D', 81)
I can pivot to get the
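The question is about SQL Server and is cut off before it states which two aggregates are wanted. As a rough pandas analogue only, assuming the two aggregates are the maximum and the non-null count of value per ID and category (both choices are assumptions made for illustration), one grouped aggregation followed by a reshape yields both pivots side by side without a join:

```python
import pandas as pd

# Same rows as the INSERT statement in the question; NULL becomes None/NaN.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2],
    "category": ["A", "B", "D", "A", "C", "D"],
    "value": [100, 97, None, 86, 83, 81],
})

# One pass over the data computes both aggregates per (ID, category) ...
summary = df.groupby(["ID", "category"])["value"].agg(["max", "count"])

# ... and unstack turns the category level into columns for both at once.
wide = summary.unstack("category")
print(wide)
```

The result has two column levels, the aggregate name and the category, so wide["max"] and wide["count"] are the two individual pivots.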

Getting the top 6 items in a column to pivot to a row in SQL

独自空忆成欢 submitted on 2020-07-15 10:00:17
Question: I'm having trouble getting a column to pivot in SQL. I'd like to pivot the top 6 results from one column into a row. The column I'm pivoting can have fewer or more than 6 results to start with, but I want to ignore anything beyond the top 6. My Table1 looks like this:
    ID | GroupID | CodeNum
    ----------------------
    1  | 1       | 111
    2  | 1       | 222
    3  | 1       | 333
    4  | 1       | 444
    5  | 1       | 555
    6  | 1       | 666
    7  | 1       | 777
    8  | 2       | 111
    9  | 2       | 888
    10 | 3       | 999
And I want my output to look like this:
    GroupID | Code1 |
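The question targets SQL, and the desired output is truncated here. As a sketch of the same reshaping in pandas, assuming "top 6" means the first six rows of each group in ID order and that the output columns continue as Code1 through Code6 (both are assumptions), rows can be numbered within each group, filtered, and pivoted:

```python
import pandas as pd

# Table1 from the question.
table1 = pd.DataFrame({
    "ID": list(range(1, 11)),
    "GroupID": [1, 1, 1, 1, 1, 1, 1, 2, 2, 3],
    "CodeNum": [111, 222, 333, 444, 555, 666, 777, 111, 888, 999],
})

ordered = table1.sort_values(["GroupID", "ID"]).copy()
# Position of each row inside its group, starting at 1.
ordered["slot"] = ordered.groupby("GroupID").cumcount() + 1

# Keep only the first six rows per group, then spread the slots into columns.
top6 = ordered[ordered["slot"] <= 6]
wide = top6.pivot(index="GroupID", columns="slot", values="CodeNum")

# Guarantee exactly six code columns even if no group has six rows.
wide = wide.reindex(columns=range(1, 7))
wide.columns = [f"Code{i}" for i in wide.columns]
wide = wide.reset_index()
print(wide)
```

Groups with fewer than six codes get missing values in the remaining columns, and the seventh code of group 1 is dropped.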

Dynamic Pivot Needed with Row_Number()

人盡茶涼 submitted on 2020-07-13 01:24:19
Question: I am using Microsoft SQL Server Management Studio 2008. I have data that looks like this:
    Client ID    Value
    -------------------------------
    12345        Did Not Meet
    12345        Did Not Meet
    12345        Partially Met
    12346        Partially Met
    12346        Partially Met
    12346        Partially Met
    12347        Partially Met
    12347        Partially Met
    12347        Did Not Meet
    12347        Met
and I'd like the results to appear like this:
    Client ID    Value1           Value2           Value3           Value4
    12345        Did Not Meet     Did Not Meet     Partially Met    NULL
    12346        Partially Met    Partially Met
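The title asks for a dynamic pivot, meaning the number of Value columns is not known in advance. A pandas sketch of that idea, assuming the rows keep the order shown in the question (an assumption; in SQL, ROW_NUMBER() would need an explicit ORDER BY), numbers the rows per client and lets unstack create as many columns as the longest client needs:

```python
import pandas as pd

data = pd.DataFrame({
    "Client ID": [12345] * 3 + [12346] * 3 + [12347] * 4,
    "Value": [
        "Did Not Meet", "Did Not Meet", "Partially Met",
        "Partially Met", "Partially Met", "Partially Met",
        "Partially Met", "Partially Met", "Did Not Meet", "Met",
    ],
})

# Per-client row number, playing the role of
# ROW_NUMBER() OVER (PARTITION BY [Client ID]).
data["n"] = data.groupby("Client ID").cumcount() + 1

# unstack creates one column per distinct row number, however many there are.
wide = data.set_index(["Client ID", "n"])["Value"].unstack("n")
wide.columns = [f"Value{n}" for n in wide.columns]
wide = wide.reset_index()
print(wide)
```

Clients with fewer rows than the maximum get missing values in the trailing columns, which corresponds to the NULL in the desired output.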

How to pivot a dataframe

早过忘川 submitted on 2020-07-10 07:34:22
Question: What is pivot? How do I pivot? Is this a pivot? Long format to wide format? I've seen a lot of questions that ask about pivot tables. Even if they don't know that they are asking about pivot tables, they usually are. It is virtually impossible to write a canonical question and answer that encompasses all aspects of pivoting.... ... But I'm going to give it a go. The problem with existing questions and answers is that often the question is focused on a nuance that the OP has trouble
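To make "long format to wide format" concrete, here is a small sketch with made-up data (the column names row, col, and val are illustrative and not taken from the question). It also shows one nuance that often trips people up: DataFrame.pivot refuses duplicate index/column pairs, while pivot_table aggregates them.

```python
import pandas as pd

# Long format: one observation per line; ("r2", "b") appears twice.
long_df = pd.DataFrame({
    "row": ["r1", "r1", "r2", "r2", "r2"],
    "col": ["a", "b", "a", "b", "b"],
    "val": [1, 2, 3, 4, 6],
})

# pivot only reshapes, so the duplicated pair makes it raise ValueError.
try:
    long_df.pivot(index="row", columns="col", values="val")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table reshapes and aggregates, here by averaging the duplicates.
wide = long_df.pivot_table(index="row", columns="col", values="val", aggfunc="mean")
print(pivot_failed)
print(wide)
```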

How to solve 'The column label 'Avg_Threat_Score' is not unique.'? using pandas

微笑、不失礼 submitted on 2020-06-28 05:53:11
Question: When running the code I get the following error: The column label 'Avg_Threat_Score' is not unique. I was creating a pivot table and wanted to sort the values from high to low.
    pt = df.pivot_table(index = 'User Name',values = ['Threat Score', 'Score'], aggfunc = { 'Threat Score': np.mean, 'Score' :[np.mean, lambda x: len(x.dropna())] }, margins = False)
    new_col =['User Name Count', 'AVG_TH_Score', 'Avg_Threat_Score']
    pt.columns = [new_col]
    #before this code is working, after that now
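The excerpt ends before any answer, so the following is only a likely explanation, not the accepted one. Assigning [new_col], a list wrapped in another list, makes pandas build a one-level MultiIndex for the columns, and sorting by a plain string label on such a frame is where the "not unique" message is typically reported. The sketch below uses made-up data and the string aggregates "mean" and "count" in place of np.mean and the lambda (both substitutions are assumptions) and assigns flat names instead:

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({
    "User Name": ["ann", "ann", "bob", "bob", "cat"],
    "Threat Score": [10.0, 30.0, 5.0, 15.0, 50.0],
    "Score": [1.0, None, 2.0, 4.0, 3.0],
})

pt = df.pivot_table(
    index="User Name",
    values=["Threat Score", "Score"],
    aggfunc={"Threat Score": "mean", "Score": ["mean", "count"]},
)

# A list inside a list is read as "one array per level", giving a MultiIndex.
broken = pt.copy()
broken.columns = [list(range(len(pt.columns)))]
print(type(broken.columns).__name__)

# Flat names: map each (value, aggregate) pair to a single string label.
names = {
    ("Score", "count"): "User Name Count",
    ("Score", "mean"): "AVG_TH_Score",
    ("Threat Score", "mean"): "Avg_Threat_Score",
}
pt.columns = [names[col] for col in pt.columns]

# With a flat column index, sorting by label works as expected.
sorted_pt = pt.sort_values("Avg_Threat_Score", ascending=False)
print(sorted_pt)
```

Mapping the names by column key rather than by position avoids depending on the order in which pivot_table emits its columns.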