Add random dates to a 400K-row pandas DataFrame

久未见 submitted on 2020-08-05 18:51:23

Question


I am trying to append a fourth column to the following DataFrame, which has 465017 rows.

     0        1     2
0   228055  231908  1
1   228056  228899  1

Running the following code

x["Fake_date"]= fake.date(pattern="%Y-%m-%d", end_datetime=None)

returns

     0        1    2    Fake_date
0   228055  231908  1   1980-10-12
1   228056  228899  1   1980-10-12

but I want a different random date on each of the 465017 rows, for instance:

      0       1    2    Fake_date
0   228055  231908  1   1980-10-11
1   228056  228899  1   1980-09-12

How do I randomize this?


Answer 1:


Without the faker package, you can do this:

import numpy as np
import pandas as pd

# Draw len(x) dates, with replacement, from every day in the range (both ends included)
x["Fake_date"] = np.random.choice(pd.date_range('1980-01-01', '2000-01-01'), len(x))

>>> x
        0       1  2  Fake_date
0  228055  231908  1 1999-12-08
1  228056  228899  1 1989-01-25

Replace the two date strings in pd.date_range() with the earliest and latest dates you want to sample from.
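
A variation on the same idea, offered as a sketch and not as part of the original answer: draw random integer day offsets and add them to the start date, instead of sampling from a materialized date range. The names `start`, `end`, `n_days`, `rng` and `offsets`, the seed value, and the two-row stand-in for `x` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for the 465017-row frame from the question (assumption).
x = pd.DataFrame({0: [228055, 228056], 1: [231908, 228899], 2: [1, 1]})

start = pd.Timestamp("1980-01-01")
end = pd.Timestamp("2000-01-01")
n_days = (end - start).days + 1  # +1 so the end date itself can be drawn

# The seed is arbitrary; it only makes the draw reproducible.
rng = np.random.default_rng(seed=0)

# One integer offset per row; the upper bound of integers() is exclusive.
offsets = rng.integers(0, n_days, size=len(x))

# Adding a day offset per row to the start date gives one date per row.
x["Fake_date"] = start + pd.to_timedelta(offsets, unit="D")
```

The column holds datetime values. If strings in the question's %Y-%m-%d pattern are preferred, `x["Fake_date"].dt.strftime("%Y-%m-%d")` should convert them.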



Source: https://stackoverflow.com/questions/49522397/add-random-dates-in-400k-pandas-dataframe
