Question
I have a date column in a pandas DataFrame that holds Julian dates. How can I convert these Julian dates to mm-dd-yyyy format?
Sample data
ORG CHAIN_NBR SEQ_NBR INT_STATUS BLOCK_CODE_1 DATA_BLOCK_CODE_1
0 523 1 0 A C 2012183
1 523 2 1 I A 2013025
2 521 3 1 A H 2007067
3 513 4 1 D H 2001046
4 513 5 1 8 I 2006075
I was using the jd2gcal function, but it's not working. I also tried writing code like this, but to no avail.
for i, row in amna.iterrows():
    amna['DATE_BLOCK_CODE_1'] = datetime.datetime.strptime(row['DATE_BLOCK_CODE_1'], '%Y%j')
Desired output:
ORG CHAIN_NBR SEQ_NBR INT_STATUS BLOCK_CODE_1 DATA_BLOCK_CODE_1
0 523 1 0 A C mm-dd-yyyy
1 523 2 1 I A mm-dd-yyyy
2 521 3 1 A H mm-dd-yyyy
3 513 4 1 D H mm-dd-yyyy
4 513 5 1 8 I mm-dd-yyyy
Please help me with this.
Answer 1:
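# Day-of-year part (characters after the year) with leading zeros stripped.
# expand=False makes extract return a Series rather than a one-column DataFrame.
julian = df.DATA_BLOCK_CODE_1.str[4:].str.extract("([1-9][0-9]?[0-9]?)", expand=False)
df["DATA_BLOCK_CODE_1"] = df.DATA_BLOCK_CODE_1.str[:4] + "-" + julian
df['DATA_BLOCK_CODE_1'] = pd.to_datetime(df['DATA_BLOCK_CODE_1'], format='%Y-%j')
The regular expression requires the day-of-year digits to start with any digit except 0, so leading zeros are dropped (025 becomes 25). (I assume DATA_BLOCK_CODE_1 is of string type.)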
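The %j directive also accepts the zero-padded day of year, so the seven-digit value can be parsed in one step and then formatted as the mm-dd-yyyy text the question asks for. This is a minimal sketch, assuming every value is a YYYYDDD code (four-digit year followed by a three-digit day of year); the sample frame is rebuilt here only for illustration.

```python
import pandas as pd

# Sample values from the question, assumed to be YYYYDDD codes.
df = pd.DataFrame({"DATA_BLOCK_CODE_1": [2012183, 2013025, 2007067, 2001046, 2006075]})

# astype(str) lets this work whether the column holds integers or strings.
parsed = pd.to_datetime(df["DATA_BLOCK_CODE_1"].astype(str), format="%Y%j")

# strftime returns plain strings, which is what an mm-dd-yyyy display needs.
df["DATA_BLOCK_CODE_1"] = parsed.dt.strftime("%m-%d-%Y")
print(df)
```

Keep the intermediate datetime column instead of the formatted strings if you still need date arithmetic or sorting afterwards.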
Answer 2:
Let's try something like this (DATA_BLOCK_CODE_1 must be an integer column):
# January 1 of the year, plus (day of year - 1) days, because day 001 is January 1.
df['New Date'] = (pd.to_datetime((df.DATA_BLOCK_CODE_1 // 1000).astype(str), format='%Y') +
                  pd.to_timedelta(df.DATA_BLOCK_CODE_1 % 1000 - 1, unit='D'))
print(df)
ORG CHAIN_NBR SEQ_NBR INT_STATUS BLOCK_CODE_1 DATA_BLOCK_CODE_1 New Date
0 523 1 0 A C 2012183 2012-07-01
1 523 2 1 I A 2013025 2013-01-25
2 521 3 1 A H 2007067 2007-03-08
3 513 4 1 D H 2001046 2001-02-15
4 513 5 1 8 I 2006075 2006-03-16
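One detail is easy to get wrong with the integer arithmetic: day 001 is January 1 itself, so the day of year has to be reduced by one before it is added as an offset to the start of the year. The sketch below cross-checks the arithmetic against the standard library's %j directive; it assumes the codes are YYYYDDD integers, and the variable names are only illustrative.

```python
import datetime
import pandas as pd

# Sample values from the question, assumed to be YYYYDDD integers.
codes = pd.Series([2012183, 2013025, 2007067, 2001046, 2006075])

# January 1 of each year, plus (day of year - 1) days.
year_start = pd.to_datetime((codes // 1000).astype(str), format="%Y")
dates = year_start + pd.to_timedelta(codes % 1000 - 1, unit="D")

# Independent check using strptime's day-of-year directive.
expected = [pd.Timestamp(datetime.datetime.strptime(str(c), "%Y%j")) for c in codes]
assert dates.tolist() == expected
print(dates.dt.strftime("%Y-%m-%d").tolist())
```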
Answer 3:
The pandas to_datetime function has an origin parameter for converting Julian Day numbers to datetimes:
amna['Date'] = pd.to_datetime(amna['DATA_BLOCK_CODE_1'], unit='D', origin='julian')
Note that you must include the unit parameter, and it must be set to days ('D').
Be aware that origin='julian' interprets each value as a Julian Day number (a count of days from noon on January 1, 4713 BC), not as a YYYYDDD year-plus-day-of-year code. Read that way, sample values such as 2012183 fall around the year 797, outside the range that nanosecond-resolution pandas timestamps can represent, so this call raises OutOfBoundsDatetime for the data in the question instead of returning dates in 2012.
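To make the meaning of origin='julian' concrete, here is a small sketch that feeds it a true Julian Day number and then one of the YYYYDDD codes from the question. The value 2451545 is the Julian Day number for noon on January 1, 2000; the out-of-range behaviour shown assumes a pandas version that uses nanosecond-resolution timestamps.

```python
import pandas as pd

# 2451545 is the Julian Day number for noon on 2000-01-01.
ts = pd.to_datetime(2451545, unit="D", origin="julian")
print(ts)

# A YYYYDDD code is not a Julian Day number. Read as one, 2012183 falls
# before the earliest representable timestamp, so pandas rejects it.
try:
    pd.to_datetime(2012183, unit="D", origin="julian")
    out_of_range = False
except ValueError:  # OutOfBoundsDatetime is a subclass of ValueError
    out_of_range = True
print(out_of_range)
```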
Edit 2020-06-27
In response to Sami Navesi's comment: if the DATA_BLOCK_CODE_1 column holds YYYYDDD values as integers or strings, you can split out the year (converting to string first if needed) and convert the day of year separately, as shown in this answer. This example assumes DATA_BLOCK_CODE_1 is a string column. Subtracting 1 is needed because day 001 is January 1, so 2012183 becomes 2012-07-01.
amna['Date'] = [pd.to_datetime(e[:4]) + pd.to_timedelta(int(e[4:]) - 1, unit='D') for e in amna['DATA_BLOCK_CODE_1']]
Source: https://stackoverflow.com/questions/48587595/convert-julian-dates-to-normal-dates-in-a-dataframe