Say I have a dataframe with 3 columns: Date, Ticker, Value (no index, at least to start with). I have many dates and many tickers, but each (ticker, date) tupl
You can use pivot to reshape the dataframe into a date-by-ticker table. Here is an example.
Create the test data first:
import pandas as pd
import numpy as np
import random
from itertools import product
# to_native_types() was removed in pandas 2.0; strftime yields the same date strings
dates = pd.date_range(start="2013-12-01", periods=10).strftime("%Y-%m-%d")
ticks = "ABCDEF"
pairs = list(product(dates, ticks))
# shuffle, then drop 5 pairs so that some (date, tick) combinations are missing
random.shuffle(pairs)
pairs = pairs[:-5]
values = np.random.rand(len(pairs))
dates, ticks = zip(*pairs)
df = pd.DataFrame({"date":dates, "tick":ticks, "value":values})
Convert the dataframe to a wide table with pivot:
df2 = df.pivot(index="date", columns="tick", values="value")
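Forward-fill the NaN values left by the missing pairs (fillna(method="ffill") is deprecated in recent pandas, ffill() is the equivalent):
df2 = df2.ffill()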
Call the diff() method to get the day-over-day change per ticker:
df2.diff()
Here is what df2 looks like after the forward fill:
tick               A         B         C         D         E         F
date
2013-12-01  0.077260  0.084008  0.711626  0.071267  0.811979  0.429552
2013-12-02  0.106349  0.141972  0.457850  0.338869  0.721703  0.217295
2013-12-03  0.330300  0.893997  0.648687  0.628502  0.543710  0.217295
2013-12-04  0.640902  0.827559  0.243816  0.819218  0.543710  0.190338
2013-12-05  0.263300  0.604084  0.655723  0.299913  0.756980  0.135087
2013-12-06  0.278123  0.243264  0.907513  0.723819  0.506553  0.717509
2013-12-07  0.960452  0.243264  0.357450  0.160799  0.506553  0.194619
2013-12-08  0.670322  0.256874  0.637153  0.582727  0.628581  0.159636
2013-12-09  0.226519  0.284157  0.388755  0.325461  0.957234  0.810376
2013-12-10  0.958412  0.852611  0.472012  0.832173  0.957234  0.723234
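
Because the test data above is random, the numbers change on every run. As a rough sanity check, the same pivot, forward-fill, diff sequence can be traced by hand on a tiny fixed dataset. The values below are made up for illustration, and the names wide, filled and changes are hypothetical; ticker B deliberately has no row for 2013-12-02.

```python
import pandas as pd

# Hypothetical toy data: ticker B is missing on 2013-12-02.
df = pd.DataFrame({
    "date": ["2013-12-01", "2013-12-01", "2013-12-02", "2013-12-03", "2013-12-03"],
    "tick": ["A", "B", "A", "A", "B"],
    "value": [1.0, 10.0, 3.0, 6.0, 14.0],
})

# One row per date, one column per ticker; the missing pair becomes NaN.
wide = df.pivot(index="date", columns="tick", values="value")

# Carry the last known value forward, so B on 2013-12-02 becomes 10.0.
filled = wide.ffill()

# Row-wise difference against the previous date, computed per column.
changes = filled.diff()
print(changes)
```

The first row of changes is NaN because there is no earlier date to subtract. A forward-filled cell shows a change of 0.0, so B on 2013-12-02 reads as "no change" rather than "no data". If that distinction matters, calling diff() on wide instead of filled keeps the gap as NaN.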