Nullify the matched string contained in one Data frame column if that is a match with another Data frame column

Submitted by 偶尔善良 on 2020-01-25 04:02:14

Question


I need to write a script that reads a CSV and deletes from one cell the characters that also appear in another cell. For example:

[Image: example]

In line 4, the "calle" column contains '28011', which also appears in the "cod_postal" column. I need to delete '28011' from the "calle" column but keep the rest untouched.

I tried some simple scripts and did some research, but I can't get what I need.

EDIT: Yes, the image is just an example; the full CSV has 2k lines.

EDIT2: I tried something like this but I can't get it to work..

#-*-coding: latin1 -*-
import csv
import pandas

with open ('C:/trabajos/dani_cliente.csv') as csvfile:
    readcsv = csv.reader (csvfile, delimiter = ';')
    for row in readcsv:
        df ['cod_postal'] = np.where(df["cod_postal"]) < threshold, 
0,alt_value)
        print (row)    

EDIT 3: I also tried this. It works, but only for one hard-coded string, and I need it to handle every "cod_postal" in the CSV:

#-*-coding: latin1 -*-

with open("C:/trabajos/extraccion_copia2.csv", 'r') as infile, \
     open("C:/trabajos/dani_cliente.test.csv", 'w') as outfile:


   # for row in infile
    #readcsv = csv.reader(infile, delimiter=';')
    data = infile.read()
    data = data.replace("28011", " ")
    outfile.write(data)

But using the full CSV instead of the sample one, I get the following error:

Traceback (most recent call last):
  File "C:/Users/dalonso/PycharmProjects/untitled/switchtest.py", line 18, in <module>
    data = infile.read()
  File "C:\Users\dalonso\AppData\Local\Programs\Python\Python37\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 577860: character maps to <undefined>


Answer 1:


I think I understand the question... If it's just one value, you can simply use

df.loc[4,'cod_postal'] = 0
# NaN also works if you prefer, but I suggest just keeping 0.

or

# positional equivalent; a single .iloc call avoids chained assignment
df.iloc[4, df.columns.get_loc('cod_postal')] = 0

If there's a specific rule to apply, use np.where() or DataFrame.where() / Series.where():

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.where.html

np.where(condition, true_val, false_val)
np.where(condition, true_val, df['cod_postal'])  # pass the original column as false_val to leave the other rows untouched

df['cod_postal'] = np.where(df["cod_postal"] < threshold, 0, alt_value)
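The snippets above overwrite cod_postal itself. For the task as stated in the question (strip each row's cod_postal out of calle and keep the rest), a row-wise replace may be closer. This is only a minimal sketch: the column names come from the question, while the sample rows and the helper name strip_postal_code are made up for illustration.

```python
import pandas as pd

# Made-up sample rows; only the column names come from the question.
df = pd.DataFrame({
    "calle": ["Calle Mayor 1", "Av. Oporto 12 28011", "Plaza Mayor 3"],
    "cod_postal": ["28001", "28011", "28019"],
})

def strip_postal_code(row):
    """Return calle with this row's cod_postal removed (hypothetical helper)."""
    calle, code = row["calle"], row["cod_postal"]
    # Leave the cell alone when either value is missing or the code is empty.
    if pd.isna(calle) or pd.isna(code) or str(code) == "":
        return calle
    # Replace the code with a space, then collapse leftover whitespace.
    return " ".join(str(calle).replace(str(code), " ").split())

# axis=1 passes one row at a time, so each calle is cleaned with its own cod_postal.
df["calle"] = df.apply(strip_postal_code, axis=1)
print(df)
```

When loading the real file, something like pd.read_csv(path, sep=";", dtype=str) should keep the postal codes as strings, so a code with a leading zero is not turned into a number.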

Next time you ask, please include the dataframe and your code in the question.
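On the UnicodeDecodeError in the question: open() without an encoding argument uses the platform default (cp1252 here, according to the traceback), and byte 0x90 has no mapping in cp1252. The "#-*-coding: latin1 -*-" line only declares the encoding of the script file, not of the files it opens. Below is a sketch using the csv module that passes the encoding explicitly. It assumes the file is latin-1, which is a guess based on that comment: latin-1 can decode every byte, so it will not raise, but characters may come out wrong if the real encoding is something else. The function name strip_codes is hypothetical, and rows are assumed to have the same number of fields as the header.

```python
import csv

def strip_codes(in_path, out_path, encoding="latin-1"):
    """Copy a ';'-separated CSV, removing each row's cod_postal from calle.

    The default encoding is an assumption; change it if the file is known
    to use something else (for example "utf-8").
    """
    with open(in_path, newline="", encoding=encoding) as infile, \
         open(out_path, "w", newline="", encoding=encoding) as outfile:
        reader = csv.DictReader(infile, delimiter=";")
        writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames,
                                delimiter=";")
        writer.writeheader()
        for row in reader:
            code = (row.get("cod_postal") or "").strip()
            calle = row.get("calle") or ""
            if code:
                # Replace the code with a space, then collapse whitespace.
                row["calle"] = " ".join(calle.replace(code, " ").split())
            writer.writerow(row)
```

With the paths from the question, the call would look like strip_codes("C:/trabajos/extraccion_copia2.csv", "C:/trabajos/dani_cliente.test.csv").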



Source: https://stackoverflow.com/questions/58222415/nullify-the-matched-string-contained-in-one-data-frame-column-if-that-is-a-match
