pandas

Reassign index of a dataframe

空扰寡人 Submitted on 2021-02-05 12:25:05
Question: I have the following dataframe:

Month
1     -0.075844
2     -0.089111
3      0.042705
4      0.002147
5     -0.010528
6      0.109443
7      0.198334
8      0.209830
9      0.075139
10    -0.062405
11    -0.211774
12    -0.109167
1     -0.075844
2     -0.089111
3      0.042705
4      0.002147
5     -0.010528
6      0.109443
7      0.198334
8      0.209830
9      0.075139
10    -0.062405
11    -0.211774
12    -0.109167
Name: Passengers, dtype: float64

As you can see, the numbers are listed twice and the index runs 1-12 / 1-12; instead, I would like to change the index to 1-24. The problem is that when plotting …
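One way to replace the duplicated 1-12 index with a single 1-24 range is to assign a fresh RangeIndex. A minimal sketch, reconstructing the Series from the values shown above:

import pandas as pd

# Reconstruct the Series from the listing above: the same twelve values
# appear twice, indexed 1-12 both times.
values = [-0.075844, -0.089111, 0.042705, 0.002147, -0.010528, 0.109443,
          0.198334, 0.209830, 0.075139, -0.062405, -0.211774, -0.109167]
s = pd.Series(values * 2, index=list(range(1, 13)) * 2, name="Passengers")

# Replace the repeated index with a single 1-24 range so plotting keeps
# all 24 points in order instead of overlaying two 1-12 blocks.
s.index = pd.RangeIndex(1, len(s) + 1, name="Month")
print(s)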

Pandas concat flips all my values in the DataFrame

断了今生、忘了曾经 Submitted on 2021-02-05 12:23:41
Question: I have a dataframe called 'running_tally':

  list  jan_to  jan_from
0   LA    True     False
1   NY   False      True

I am trying to append new data to it in the form of a single-column dataframe called 'new_data':

  list
0  HOU
1   LA

I concat these two dfs based on their 'list' column for further processing, but immediately after I do that all the boolean values unexpectedly flip.

running_tally = pd.concat([running_tally, new_data]).groupby('list', as_index=False).first()

The above statement will produce:

  list  jan_to  jan…
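A minimal sketch of the setup described above, together with one way of adding the new labels that leaves the existing boolean columns untouched; the merge-based alternative is a suggestion, not the original poster's code:

import pandas as pd

running_tally = pd.DataFrame(
    {"list": ["LA", "NY"], "jan_to": [True, False], "jan_from": [False, True]}
)
new_data = pd.DataFrame({"list": ["HOU", "LA"]})

# An outer merge keeps the existing rows (and their True/False values)
# as-is and adds a row for any genuinely new label, with NaN in the
# boolean columns until it is filled in later.
combined = running_tally.merge(new_data, on="list", how="outer")
print(combined)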

For loop doesn't work for web scraping Google search in Python

混江龙づ霸主 Submitted on 2021-02-05 12:21:26
Question: I'm working on web-scraping Google search with a list of keywords. The nested for loop that scrapes a single page works well. However, the outer for loop that goes through the keywords in the list does not work as I intended, which is to scrape the data for each search. The results don't include the search outcome of the first two keywords, only the result of the last keyword. Here is the code:

browser = webdriver.Chrome(r"C:\...\chromedriver.exe")
df = pd.DataFrame(columns = ['ceo', 'value'…
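When only the last keyword survives, the usual cause is that the per-keyword results are built in a variable that is overwritten on every pass instead of accumulated. A minimal sketch of the accumulation pattern, assuming Selenium and pandas as in the original snippet; the keyword list, the CSS selector and the 'ceo'/'value' column meanings are placeholders, not taken from the question:

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By

keywords = ["first ceo", "second ceo", "third ceo"]
browser = webdriver.Chrome()  # assumes chromedriver is available on PATH

rows = []  # collected across all keywords, not reset inside the loop
for kw in keywords:
    browser.get("https://www.google.com/search?q=" + kw.replace(" ", "+"))
    # 'h3' is only a placeholder for the elements the inner loop scrapes.
    for el in browser.find_elements(By.CSS_SELECTOR, "h3"):
        rows.append({"ceo": kw, "value": el.text})

browser.quit()
df = pd.DataFrame(rows, columns=["ceo", "value"])
print(df)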

Transform Pandas string column containing unicodes to ascii to load urls

孤街浪徒 Submitted on 2021-02-05 12:12:31
Question: I have a pandas DataFrame containing a column with Wikipedia URLs that I want to load. However, some strings won't load because they contain percent-encoded Unicode characters. For example, 'Kruskal%E2%80%93Wallis_one-way_analysis_of_variance' raises the following PageError:

Page id "Cauchy%E2%80%93Schwarz_inequality" does not match any pages. Try another id!

Is there a way to turn all the Unicode escapes into ASCII? So in this case, I need a function that can create a new column:

old column                          new column
Cauchy%E2%80%93Schwarz…
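The percent-escapes can be decoded with urllib.parse.unquote. A minimal sketch using the value shown above; the explicit en-dash replacement at the end is only needed if a strictly ASCII string is required:

from urllib.parse import unquote

import pandas as pd

df = pd.DataFrame({"old column": ["Cauchy%E2%80%93Schwarz_inequality",
                                  "Kruskal%E2%80%93Wallis_one-way_analysis_of_variance"]})

# Decode the percent-escapes: '%E2%80%93' becomes the en dash character.
df["new column"] = df["old column"].map(unquote)

# If the consumer truly needs ASCII, swap the en dash for a plain hyphen.
df["ascii column"] = df["new column"].str.replace("\u2013", "-", regex=False)
print(df)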

Iterate and extract tables from web saving as excel file in Python

╄→尐↘猪︶ㄣ Submitted on 2021-02-05 11:30:12
Question: I want to iterate over and extract the tables from the link here, then save them as an Excel file. How can I do that? Thank you. My code so far:

import pandas as pd
import requests
from bs4 import BeautifulSoup
from tabulate import tabulate

url = 'http://zjj.sz.gov.cn/ztfw/gcjs/xmxx/jgysba/'
res = requests.get(url)
soup = BeautifulSoup(res.content, 'lxml')
print(soup)

New update:

from requests import post
import json
import pandas as pd
import numpy as np

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0;…
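A minimal sketch of the generic pattern for this kind of task, reusing the URL from the snippet; it assumes the listing is rendered as plain HTML tables (if the page builds its table via JavaScript or a JSON endpoint, as the "new update" hints, the tables list will come back empty and that endpoint would have to be queried instead):

import pandas as pd
import requests

url = 'http://zjj.sz.gov.cn/ztfw/gcjs/xmxx/jgysba/'
res = requests.get(url)
res.raise_for_status()

# read_html returns one DataFrame per <table> element found on the page.
tables = pd.read_html(res.text)

if tables:
    combined = pd.concat(tables, ignore_index=True)
    combined.to_excel('tables.xlsx', index=False)  # requires openpyxl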

Saving a Pandas dataframe in fixed format with different column widths

佐手、 Submitted on 2021-02-05 11:29:06
Question: I have a pandas dataframe (df) that looks like this:

   A   B     C
0  1  10  1234
1  2  20     0

I want to save this dataframe in a fixed format. The fixed format I have in mind has a different width for each column, as follows: "one space for column A's value, then a comma, then four spaces for column B's values, then a comma, and then five spaces for column C's values". Or symbolically:

-,----,-----

My dataframe above (df) would look like the following in my desired fixed format:

1,  10, 1234
2,  20,    0

How can I write a…
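A minimal sketch using Python format specifiers with the widths described (1, 4 and 5 characters); the output file name is an assumption:

import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [10, 20], 'C': [1234, 0]})

# Right-align each value into its fixed width: 1 for A, 4 for B, 5 for C.
lines = df.apply(lambda r: f"{r.A:1d},{r.B:4d},{r.C:5d}", axis=1)

with open('fixed_width.txt', 'w') as fh:
    fh.write("\n".join(lines) + "\n")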

Combining columns of dataframe [duplicate]

南楼画角 Submitted on 2021-02-05 11:26:05
Question: This question already has answers here: how to collapse columns in pandas on null values? (6 answers). Closed 7 months ago. I have a dataframe like this:

    c1   c2   c3
0    a  NaN  NaN
1  NaN    b  NaN
2  NaN  NaN    c
3  NaN    b  NaN
4    a  NaN  NaN

I want to combine these three columns like this:

  c4
0  a
1  b
2  c
3  b
4  a

Here is the code to make the above data frame:

a = pd.DataFrame({
    'c1': ['a', np.NaN, np.NaN, np.NaN, 'a'],
    'c2': [np.NaN, 'b', np.NaN, 'b', np.NaN],
    'c3': [np.NaN, np.NaN, 'c', np.NaN, np.NaN]
})

Answer 1: bfill-ing is…
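The cut-off answer points at bfill; a minimal sketch of that approach, back-filling across the columns so the first column collects the single non-null value in each row:

import numpy as np
import pandas as pd

a = pd.DataFrame({
    'c1': ['a', np.nan, np.nan, np.nan, 'a'],
    'c2': [np.nan, 'b', np.nan, 'b', np.nan],
    'c3': [np.nan, np.nan, 'c', np.nan, np.nan],
})

# bfill along axis=1 pulls each row's non-null value into 'c1';
# keeping just that column gives the collapsed result.
result = a.bfill(axis=1)['c1'].to_frame('c4')
print(result)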

Pandas first 5 and last 5 rows in single iloc operation

拥有回忆 Submitted on 2021-02-05 11:18:26
Question: I need to check df.head() and df.tail() many times. When using df.head() and df.tail(), Jupyter Notebook displays the two results as separate, ugly output. Is there any single-line command so that we can select only the first 5 and last 5 rows, something like df.iloc[:5 | -5:]?

Test example:

df = pd.DataFrame(np.random.rand(20, 2))
df.iloc[:5]

Update. Ugly but working ways:

df.iloc[(np.where((df.index < 5) | (df.index > len(df) - 5)))[0]]

or,

df.iloc[np.r_[np.arange(5), np.arange(df.shape[0] - 5, df.shape[0])]]

Answer 1: Try look at…
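A compact variant of the np.r_ idiom already in the question: np.r_ accepts negative slices, and iloc understands the resulting negative positions, so the first 5 and last 5 rows fit in one expression:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(20, 2))

# 0:5 selects the first five positions, -5:0 the last five.
print(df.iloc[np.r_[0:5, -5:0]])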
