pandas

Creating a pandas DataFrame from a database query that uses bind variables

Submitted by 大憨熊 on 2021-02-07 03:29:01
Question: I'm working with an Oracle database. I can do this much:

    import pandas as pd
    import pandas.io.sql as psql
    import cx_Oracle as odb

    conn = odb.connect(_user + '/' + _pass + '@' + _dbenv)
    sqlStr = "SELECT * FROM customers"
    df = psql.frame_query(sqlStr, conn)

But I don't know how to handle bind variables, like so:

    sqlStr = """SELECT * FROM customers WHERE id BETWEEN :v1 AND :v2 """

I've tried these variations:

    params = (1234, 5678)
    params2 = {"v1": 1234, "v2": 5678}
    df = psql.frame_query((sqlStr,params
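A sketch of one way to do this, assuming a pandas version where frame_query has been replaced by read_sql and that conn is the cx_Oracle connection from the question (the credentials below are placeholders): pandas forwards the params argument to the DBAPI cursor, so cx_Oracle's named bind variables can be supplied as a dict.

    import pandas as pd
    import cx_Oracle as odb

    _user, _pass, _dbenv = "scott", "tiger", "localhost/XEPDB1"  # placeholder credentials
    conn = odb.connect(_user + '/' + _pass + '@' + _dbenv)

    sqlStr = "SELECT * FROM customers WHERE id BETWEEN :v1 AND :v2"

    # read_sql passes params through to cursor.execute(), which binds :v1 and :v2
    df = pd.read_sql(sqlStr, conn, params={"v1": 1234, "v2": 5678})

Passing the values as a dict keyed by the bind names keeps the mapping to :v1 and :v2 explicit.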

Writing a formatted binary file from a Pandas DataFrame

Submitted by 試著忘記壹切 on 2021-02-07 03:28:38
Question: I've seen some ways to read a formatted binary file into pandas in Python; namely, I'm using this code, which reads with NumPy fromfile using a structure given by a dtype:

    import numpy as np
    import pandas as pd

    input_file_name = 'test.hst'
    input_file = open(input_file_name, 'rb')
    header = input_file.read(96)
    dt_header = np.dtype([('version', 'i4'), ('copyright', 'S64'), ('symbol', 'S12'),
                          ('period', 'i4'), ('digits', 'i4'), ('timesign', 'i4'),
                          ('last_sync', 'i4')])
    header = np
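For the writing direction the title asks about, one common pattern is to build a structured NumPy array with the desired binary layout and dump it with tofile. This is a sketch; the dt_record layout below is a made-up example, not the actual .hst record format.

    import numpy as np
    import pandas as pd

    # hypothetical record layout: an i4 timestamp and four f8 price fields
    dt_record = np.dtype([('time', 'i4'), ('open', 'f8'), ('high', 'f8'),
                          ('low', 'f8'), ('close', 'f8')])

    df = pd.DataFrame({'time': [1, 2], 'open': [1.0, 1.1], 'high': [1.2, 1.3],
                       'low': [0.9, 1.0], 'close': [1.1, 1.2]})

    # copy the DataFrame column by column into the structured array,
    # then write the raw bytes
    records = np.zeros(len(df), dtype=dt_record)
    for name in dt_record.names:
        records[name] = df[name].to_numpy()
    records.tofile('out.bin')

    # reading it back with the same dtype round-trips into a DataFrame
    back = pd.DataFrame(np.fromfile('out.bin', dtype=dt_record))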

Pandas pivot table: columns order and subtotals

Submitted by …衆ロ難τιáo~ on 2021-02-07 03:28:34
Question: I'm using pandas 0.19. Considering the following data frame:

    FID  admin0  admin1  admin2  windspeed  population
    0    cntry1  state1  city1   60km/h     700
    1    cntry1  state1  city1   90km/h     210
    2    cntry1  state1  city2   60km/h     100
    3    cntry1  state2  city3   60km/h     70
    4    cntry1  state2  city4   60km/h     180
    5    cntry1  state2  city4   90km/h     370
    6    cntry2  state3  city5   60km/h     890
    7    cntry2  state3  city6   60km/h     120
    8    cntry2  state3  city6   90km/h     420
    9    cntry2  state3  city6   120km/h    360
    10   cntry2  state4  city7   60km/h     740

How can I create a table
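The excerpt is cut off, but given the title, here is a sketch of one way to get an ordered set of windspeed columns plus per-country subtotals. It is written against a current pandas rather than 0.19, and the exact layout wanted by the question is an assumption.

    import pandas as pd

    rows = [
        ('cntry1', 'state1', 'city1', '60km/h', 700),
        ('cntry1', 'state1', 'city1', '90km/h', 210),
        ('cntry1', 'state1', 'city2', '60km/h', 100),
        ('cntry1', 'state2', 'city3', '60km/h', 70),
        ('cntry1', 'state2', 'city4', '60km/h', 180),
        ('cntry1', 'state2', 'city4', '90km/h', 370),
        ('cntry2', 'state3', 'city5', '60km/h', 890),
        ('cntry2', 'state3', 'city6', '60km/h', 120),
        ('cntry2', 'state3', 'city6', '90km/h', 420),
        ('cntry2', 'state3', 'city6', '120km/h', 360),
        ('cntry2', 'state4', 'city7', '60km/h', 740),
    ]
    df = pd.DataFrame(rows, columns=['admin0', 'admin1', 'admin2',
                                     'windspeed', 'population'])
    order = ['60km/h', '90km/h', '120km/h']

    # detail pivot: admin levels as rows, windspeed as columns,
    # reindexed so the columns come out in 60/90/120 order
    detail = df.pivot_table(values='population',
                            index=['admin0', 'admin1', 'admin2'],
                            columns='windspeed', aggfunc='sum',
                            fill_value=0).reindex(columns=order)

    # per-country subtotals from a coarser pivot on admin0 only
    subtotals = df.pivot_table(values='population', index='admin0',
                               columns='windspeed', aggfunc='sum',
                               fill_value=0).reindex(columns=order)

Interleaving the subtotal rows back into the detail table (for example with pd.concat and a sort on the index) is then a presentation choice.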

Blank line below headers created when using MultiIndex and to_excel in Python

Submitted by 家住魔仙堡 on 2021-02-07 03:27:53
Question: I am trying to save a pandas dataframe to an Excel file using the to_excel function with XlsxWriter. When I print the dataframe to the terminal it reads as it should, but when I save it to Excel and open the file, there is an extra blank line below the headers which shouldn't be there. This only happens when using a MultiIndex for the headers, but I need the layered headers it offers and I can't find a solution. Below is code from an online MultiIndex example which produces the same
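The blank row is the spot pandas reserves for the index name when the columns are a MultiIndex, and to_excel does not expose an option to suppress it. One workaround, sketched below with XlsxWriter as the engine and a small made-up two-level frame, is to write the body without headers and emit the header rows by hand:

    import pandas as pd

    # small example with two header levels, similar to the online MultiIndex demos
    columns = pd.MultiIndex.from_product([['2023', '2024'], ['sales', 'costs']])
    df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=columns)

    with pd.ExcelWriter('report.xlsx', engine='xlsxwriter') as writer:
        # write only the data, leaving the first two rows free; dropping the
        # MultiIndex for this write sidesteps pandas' restrictions on
        # MultiIndex columns with index=False
        body = df.copy()
        body.columns = range(df.shape[1])
        body.to_excel(writer, sheet_name='Sheet1', startrow=2,
                      header=False, index=False)

        # emit the two header rows manually, so no blank spacer row appears
        worksheet = writer.sheets['Sheet1']
        for col, (top, bottom) in enumerate(df.columns):
            worksheet.write(0, col, top)
            worksheet.write(1, col, bottom)

Merging the repeated top-level labels with worksheet.merge_range would be a possible refinement.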

Cleaning big data using Python

Submitted by £可爱£侵袭症+ on 2021-02-07 03:13:59
Question: I have to clean an input data file in Python. Due to typo errors, a data field may contain strings instead of numbers. I would like to identify all fields which are strings and fill them with NaN using pandas. I would also like to log the index of those fields. One of the crudest ways is to loop through each and every field and check whether it is a number or not, but this consumes a lot of time if the data is big. My csv file contains data similar to the following table: Country Count Sales
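A vectorised way to do this is pd.to_numeric with errors='coerce', comparing the null masks before and after the conversion to log which rows were coerced. This is a sketch: the small frame and the column names Count and Sales stand in for the real CSV, whose table is truncated above.

    import pandas as pd

    # stand-in for the CSV: 'x3' and '12o' are the typo-style string entries
    df = pd.DataFrame({'Country': ['IN', 'US', 'UK'],
                       'Count': [10, 20, 'x3'],
                       'Sales': [100.5, '12o', 95.0]})

    for col in ['Count', 'Sales']:
        before = df[col].isna()
        df[col] = pd.to_numeric(df[col], errors='coerce')   # non-numbers become NaN
        bad_rows = df.index[df[col].isna() & ~before]       # only the newly-coerced rows
        print(f'{col}: non-numeric entries at index {list(bad_rows)}')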

How to extract a keyword (string) from a column in a pandas dataframe in Python

Submitted by 点点圈 on 2021-02-07 03:10:47
Question: I have a dataframe df and it looks like this:

       id      Type                                       agent_id  created_at
    0  44525   Stunning 6 bedroom villa in New Delhi      184       2018-03-09
    1  44859   Villa for sale in Amritsar                 182       2017-02-19
    2  45465   House in Faridabad                         154       2017-04-17
    3  50685   5 Hectre land near New Delhi               113       2017-09-01
    4  130728  Duplex in Mumbai                           157       2017-02-07
    5  130856  Large plot with fantastic views in Mumbai  137       2018-01-16
    6  130857  Modern Design Penthouse in Bangalore       199       2017-03-24

I have this tabular data and I'm trying to clean
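The excerpt is cut off, but given the title, a sketch of extracting a location keyword from the Type column (assuming the set of cities to look for is known up front) can use Series.str.extract with an alternation pattern:

    import pandas as pd

    df = pd.DataFrame({
        'id': [44525, 44859, 45465, 50685, 130728, 130856, 130857],
        'Type': ['Stunning 6 bedroom villa in New Delhi',
                 'Villa for sale in Amritsar',
                 'House in Faridabad',
                 '5 Hectre land near New Delhi',
                 'Duplex in Mumbai',
                 'Large plot with fantastic views in Mumbai',
                 'Modern Design Penthouse in Bangalore'],
    })

    cities = ['New Delhi', 'Amritsar', 'Faridabad', 'Mumbai', 'Bangalore']
    pattern = '(' + '|'.join(cities) + ')'

    # str.extract returns the first matching group per row, NaN when nothing matches
    df['city'] = df['Type'].str.extract(pattern, expand=False)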
