Skip to content Skip to sidebar Skip to footer

Re-index Dataframe By New Range Of Dates

I have a data frame containing a number of observations: date colour orders 2014-10-20 red 7 2014-10-21 red 10 2014-10-20 yellow 3 I would like

Solution 1:

Starting from your exampe dataframe:

In [51]:dfOut[51]:datecolourorders02014-10-20     red712014-10-21     red1022014-10-20  yellow3

If you want to reindex on both 'date' and 'colour', one possibility is to set both as the index (a multi-index):

In [52]: df = df.set_index(['date', 'colour'])

In [53]: df
Out[53]:
                   orders
date       colour
2014-10-20 red          7
2014-10-21 red         10
2014-10-20 yellow       3

You can now reindex this dataframe, after you constructed to desired index:

In [54]: index = pd.date_range('20/10/2014', '22/10/2014')

In [55]: multi_index = pd.MultiIndex.from_product([index, ['red', 'yellow']])

In [56]: df.reindex(multi_index)
Out[56]:
                   orders
2014-10-20 red          7
           yellow       32014-10-21 red         10
           yellow     NaN
2014-10-22 red        NaN
           yellow     NaN

To have the same output as your example output, the index should be sorted in the second level (level=1 as it is 0-based):

In [60]:df2=df.reindex(multi_index)In [64]:df2.sortlevel(level=1)Out[64]:orders2014-10-20 red72014-10-21 red102014-10-22 redNaN2014-10-20 yellow32014-10-21 yellowNaN2014-10-22 yellowNaN

A possible way to generate the multi-index automatically would be (with your original frame):

pd.MultiIndex.from_product([pd.date_range(df['date'].min(), df['date'].max(), freq='D'), 
                            df['colour'].unique()])

Another way would be to use resample for each group of colors:

In [77]: df = df.set_index('date')

In [78]: df.groupby('colour').resample('D')

This is simpler, but this does not give you the full range of dates for each colour, only the range of dates that is available for that colour group.

Post a Comment for "Re-index Dataframe By New Range Of Dates"