Adding Gaussian noise to a dataset of floating-point values and saving it (Python)

Posted by 百般思念 on 2020-12-29 03:02:39

Question


I'm working on a classification problem where I need to add different levels of Gaussian noise to my dataset and run classification experiments until my ML algorithms can no longer classify the dataset. Unfortunately, I have no idea how to do that. Any advice or coding tips on how to add the Gaussian noise?


Answer 1:


You can follow these steps:

  1. Load the data into a pandas DataFrame: clean_signal = pd.read_csv("data_file_name")
  2. Use NumPy to generate Gaussian noise with the same shape as the dataset.
  3. Add the Gaussian noise to the clean signal: signal = clean_signal + noise

Here's a reproducible example:

import pandas as pd
# create a sample dataset with dimension (2,2)
# in your case you need to replace this with 
# clean_signal = pd.read_csv("your_data.csv")   
clean_signal = pd.DataFrame([[1,2],[3,4]], columns=list('AB'), dtype=float) 
print(clean_signal)
"""
print output: 
    A    B
0  1.0  2.0
1  3.0  4.0
"""
import numpy as np 
mu, sigma = 0, 0.1 
# create noise with the same shape as the dataset, here (2, 2)
noise = np.random.normal(mu, sigma, clean_signal.shape)
print(noise)

"""
print output: 
[[-0.11114313  0.25927152]
 [ 0.06701506 -0.09364186]]
"""
signal = clean_signal + noise
print(signal)
"""
print output: 
          A         B
0  0.888857  2.259272
1  3.067015  3.906358
""" 

Complete code without the print statements:

import numpy as np
import pandas as pd
# clean_signal = pd.read_csv("your_data.csv")
clean_signal = pd.DataFrame([[1,2],[3,4]], columns=list('AB'), dtype=float)
mu, sigma = 0, 0.1
noise = np.random.normal(mu, sigma, clean_signal.shape)
signal = clean_signal + noise
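
Since the question is about classification, the dataset probably has a label column that should stay untouched. Here is a minimal sketch of one way to handle that, assuming the features are float columns and the class lives in a column called "label". The helper name add_gaussian_noise, the column names, and the seed value are all hypothetical and only there for illustration. It also uses a seeded NumPy Generator so that the same noise can be reproduced between runs.

```python
import numpy as np
import pandas as pd


def add_gaussian_noise(df, feature_cols, sigma, seed=None):
    """Return a copy of df with zero-mean Gaussian noise added to feature_cols only."""
    rng = np.random.default_rng(seed)  # a fixed seed makes the noise repeatable
    noisy = df.copy()                  # leave the caller's DataFrame unchanged
    noise = rng.normal(0.0, sigma, size=(len(df), len(feature_cols)))
    noisy[feature_cols] = df[feature_cols].to_numpy() + noise
    return noisy


# Hypothetical dataset: two float features and an integer class label.
data = pd.DataFrame({"A": [1.0, 3.0], "B": [2.0, 4.0], "label": [0, 1]})
noisy = add_gaussian_noise(data, ["A", "B"], sigma=0.1, seed=42)
print(noisy)
```

Calling the function again with the same seed gives the same noisy values, which helps when comparing classifiers on identical data.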

To save the noisy data back to a CSV file:

signal.to_csv("output_filename.csv", index=False)
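
The question asks for different levels of noise, so the same steps can be repeated in a loop over several values of sigma, writing one CSV file per level. This is only a sketch: the sigma values, the file name pattern, and the temporary output directory are assumptions chosen for illustration, and the small DataFrame stands in for your real data.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in for the real dataset; replace with pd.read_csv("your_data.csv").
clean_signal = pd.DataFrame([[1, 2], [3, 4]], columns=list("AB"), dtype=float)

rng = np.random.default_rng(0)   # fixed seed so every run writes the same files
sigmas = [0.01, 0.1, 0.5, 1.0]   # illustrative noise levels, not recommendations
out_dir = tempfile.mkdtemp()     # hypothetical output location

written = []
for sigma in sigmas:
    # Draw fresh noise with the dataset's shape for this noise level.
    noise = rng.normal(0.0, sigma, clean_signal.shape)
    path = os.path.join(out_dir, f"noisy_sigma_{sigma}.csv")
    (clean_signal + noise).to_csv(path, index=False)
    written.append(path)

print(written)
```

Each file can then be loaded and passed to the classifier in turn, increasing sigma until the accuracy drops to the level you consider a failure.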


Source: https://stackoverflow.com/questions/46093073/adding-gaussian-noise-to-a-dataset-of-floating-points-and-save-it-python
