I have stock data and I am trying to adjust share count of a hypothetical portfolio using a dividend reinvestment plan. Every time there is a dividend distribution the current share count needs to be multiplied by the dividend amount, that product then needs to be divided by the current price of the stock to return the reinvested shares.
See the sample dataframe below:
vals = [38.46429771,
46.01712878,
38.89251387,
33.89977144,
27.66171419,
38.03103652,
31.26521149,
35.82193894,
49.47433244,
33.60148345,]
divi = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
shares = [30,0,0,0,0,0,0,0,0,0]
sample = pd.DataFrame(data=zip(vals, divi,shares),\
index = range(0,10), columns=['Open', 'Dividends', 'share_count'])
Open Dividends share_count
0 38.464298 0 30
1 46.017129 0 0
2 38.892514 1 0
3 33.899771 0 0
4 27.661714 0 0
5 38.031037 1 0
6 31.265211 0 0
7 35.821939 0 0
8 49.474332 1 0
9 33.601483 0 0
using the sample above we have a starting share count of 30 which would carry to iloc[1], iloc[2] would have a new value of 30.7713 (((30*1)/38.892)+(share_count) at the time), which would carry through to iloc[3] and iloc[4]. iloc[5] would calculate a new value and carry forward until iloc[8] where a new value would be calculated again.
I’m using this loop at the moment to calculate the share_count:
cum_val = []
x = max(sample.share_count)
t = []
for d, sc, op in zip(sample.Dividends, sample.share_count, sample.Open):
divi = (x*d)/op
t = divi+x
cum_val.append(t)
if t > x:
x = t
else:
pass
but I assume there is a more pythonic method to solve this problem. any suggestions would be greatly appreciated.
When I look at your example, I see that you need to compute new value only if the Dividends != 0
, so you can leverage that:
def fn(share_count):
def _inner(x):
nonlocal share_count
rv = share_count * x["Dividends"] / x["Open"] + share_count
share_count = rv
return rv
return _inner
starting_value = sample.loc[sample["share_count"].idxmax(), "share_count"]
mask = sample["Dividends"] > 0
sample.loc[mask, "res"] = sample[mask].apply(fn(starting_value), axis=1)
sample = sample.ffill().fillna(starting_value)
print(sample)
Prints:
Open Dividends share_count res
0 38.464298 0 30 30.000000
1 46.017129 0 0 30.000000
2 38.892514 1 0 30.771357
3 33.899771 0 0 30.771357
4 27.661714 0 0 30.771357
5 38.031037 1 0 31.580468
6 31.265211 0 0 31.580468
7 35.821939 0 0 31.580468
8 49.474332 1 0 32.218789
9 33.601483 0 0 32.218789
Assuming only the first share_count is initially non-zero, this can be changed into a vectorized cumprod
:
df['out'] = (df['Open']
.add(1).div(df['Open'])
.mask(df['Dividends'].eq(0), 1)
.cumprod()
.mul(df['share_count'].iloc[0])
)
Output:
Open Dividends share_count out
0 38.464298 0 30 30.000000
1 46.017129 0 0 30.000000
2 38.892514 1 0 30.771357
3 33.899771 0 0 30.771357
4 27.661714 0 0 30.771357
5 38.031037 1 0 31.580468
6 31.265211 0 0 31.580468
7 35.821939 0 0 31.580468
8 49.474332 1 0 32.218789
9 33.601483 0 0 32.218789