Sunday, February 10, 2019

Pandas Practice: Timeseries



In [1]:
import warnings
warnings.filterwarnings('ignore')
In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
The following code walks through some of the time series functionality in the Python pandas library. I learned this from the book Python for Data Analysis (2nd ed.): https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1491957662

1 Date and Time Data Types

In [3]:
from datetime import datetime
In [4]:
datetime.now()
Out[4]:
datetime.datetime(2019, 2, 10, 16, 34, 9, 225043)
In [5]:
now = datetime.now()
now.year, now.month, now.day
Out[5]:
(2019, 2, 10)
In [6]:
delta = datetime.now() - datetime(1992,7,1)
In [7]:
delta
Out[7]:
datetime.timedelta(9720, 59649, 242102)
In [8]:
delta.days, delta.seconds, delta.microseconds
Out[8]:
(9720, 59649, 242102)
In [9]:
from datetime import timedelta
In [10]:
datetime.now() + timedelta(20)
Out[10]:
datetime.datetime(2019, 3, 2, 16, 34, 9, 265499)
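Note that timedelta's first positional argument is days, which is why timedelta(20) above adds 20 days. Keyword arguments make the intent explicit; a small sketch:

```python
from datetime import datetime, timedelta

# timedelta(20) means 20 days; keywords spell out each component
step = timedelta(days=20, hours=3, minutes=30)
later = datetime(2019, 2, 10) + step
print(later)  # 2019-03-02 03:30:00
```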

2 Converting between String and Datetime

In [11]:
t = datetime(2019,7,1,8,30,0)
t, str(t)
Out[11]:
(datetime.datetime(2019, 7, 1, 8, 30), '2019-07-01 08:30:00')
In [12]:
t.strftime("%Y-%m-%d")
Out[12]:
'2019-07-01'
In [13]:
t.strftime("%Y-%m-%d:%H:%M:%S")
Out[13]:
'2019-07-01:08:30:00'
Now, let's convert the string '01/07/1992' to a datetime object
In [14]:
d = '01/07/1992'
datetime.strptime(d, '%d/%m/%Y')
Out[14]:
datetime.datetime(1992, 7, 1, 0, 0)
Another easy way is to use parser.parse from the dateutil library
In [15]:
from dateutil.parser import parse
In [16]:
parse('July 1 1992 8:30 PM')
Out[16]:
datetime.datetime(1992, 7, 1, 20, 30)
In [17]:
parse('01/07/1992', dayfirst=True)
Out[17]:
datetime.datetime(1992, 7, 1, 0, 0)
We can also use the pandas to_datetime function
In [18]:
test_dates =['July 1 1992 8:30 pm', '31/7/1992 8:30:00 pm']
In [19]:
pd.to_datetime(test_dates)
Out[19]:
DatetimeIndex(['1992-07-01 20:30:00', '1992-07-31 20:30:00'], dtype='datetime64[ns]', freq=None)
If some dates are missing, pandas assigns them the value NaT (Not a Time)
In [20]:
test_dates =['July 1 1992 8:30 pm', '31/7/1992 8:30:00 pm', None]
In [21]:
pd.to_datetime(test_dates)
Out[21]:
DatetimeIndex(['1992-07-01 20:30:00', '1992-07-31 20:30:00', 'NaT'], dtype='datetime64[ns]', freq=None)
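A related point (not shown above): a string that cannot be parsed at all makes to_datetime raise by default, but passing errors='coerce' turns unparseable entries into NaT as well. A quick sketch:

```python
import pandas as pd

# 'not a date' is unparseable; coerce it to NaT instead of raising
idx = pd.to_datetime(['July 1 1992', 'not a date'], errors='coerce')
print(idx)
```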

3 Timeseries basics

In [22]:
from datetime import datetime
In [23]:
dates = pd.date_range('2019-01-01', '2019-01-15')
In [24]:
dates
Out[24]:
DatetimeIndex(['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04',
               '2019-01-05', '2019-01-06', '2019-01-07', '2019-01-08',
               '2019-01-09', '2019-01-10', '2019-01-11', '2019-01-12',
               '2019-01-13', '2019-01-14', '2019-01-15'],
              dtype='datetime64[ns]', freq='D')
In [25]:
ts = pd.Series(np.random.randn(len(dates)), index = dates)
In [26]:
ts
Out[26]:
2019-01-01    0.447572
2019-01-02   -0.685998
2019-01-03   -0.420128
2019-01-04    0.961194
2019-01-05    0.796045
2019-01-06   -0.828637
2019-01-07    0.414431
2019-01-08    0.604626
2019-01-09    1.666236
2019-01-10   -0.781232
2019-01-11    0.933723
2019-01-12   -0.865820
2019-01-13   -0.260901
2019-01-14   -0.124710
2019-01-15   -0.800516
Freq: D, dtype: float64
In [27]:
ts + ts[::2]
Out[27]:
2019-01-01    0.895144
2019-01-02         NaN
2019-01-03   -0.840257
2019-01-04         NaN
2019-01-05    1.592090
2019-01-06         NaN
2019-01-07    0.828862
2019-01-08         NaN
2019-01-09    3.332471
2019-01-10         NaN
2019-01-11    1.867446
2019-01-12         NaN
2019-01-13   -0.521801
2019-01-14         NaN
2019-01-15   -1.601031
dtype: float64

Indexing, selecting, subsetting

In [28]:
ts[0]
Out[28]:
0.44757216134019756
In [29]:
ts['Jan 10 2019']
Out[29]:
-0.7812322721521562
In [30]:
ts[:4]
Out[30]:
2019-01-01    0.447572
2019-01-02   -0.685998
2019-01-03   -0.420128
2019-01-04    0.961194
Freq: D, dtype: float64
In [31]:
ts['Jan 1 2019':'Jan 10 2019']
Out[31]:
2019-01-01    0.447572
2019-01-02   -0.685998
2019-01-03   -0.420128
2019-01-04    0.961194
2019-01-05    0.796045
2019-01-06   -0.828637
2019-01-07    0.414431
2019-01-08    0.604626
2019-01-09    1.666236
2019-01-10   -0.781232
Freq: D, dtype: float64
In [32]:
ts['Jan 2019']
Out[32]:
2019-01-01    0.447572
2019-01-02   -0.685998
2019-01-03   -0.420128
2019-01-04    0.961194
2019-01-05    0.796045
2019-01-06   -0.828637
2019-01-07    0.414431
2019-01-08    0.604626
2019-01-09    1.666236
2019-01-10   -0.781232
2019-01-11    0.933723
2019-01-12   -0.865820
2019-01-13   -0.260901
2019-01-14   -0.124710
2019-01-15   -0.800516
Freq: D, dtype: float64
In [33]:
ts.truncate(before='Jan 10 2019')
Out[33]:
2019-01-10   -0.781232
2019-01-11    0.933723
2019-01-12   -0.865820
2019-01-13   -0.260901
2019-01-14   -0.124710
2019-01-15   -0.800516
Freq: D, dtype: float64

Timeseries with duplicate indices

In [34]:
d = pd.DatetimeIndex(['1/1/2019', '1/1/2019', '1/2/2019', '1/2/2019', '1/3/2019', '1/4/2019', '1/4/2019'])
In [35]:
ts = pd.Series(np.arange(len(d)), d)
In [36]:
ts
Out[36]:
2019-01-01    0
2019-01-01    1
2019-01-02    2
2019-01-02    3
2019-01-03    4
2019-01-04    5
2019-01-04    6
dtype: int64
In [37]:
ts['1/1/2019']
Out[37]:
2019-01-01    0
2019-01-01    1
dtype: int64
In [38]:
ts['1/3/2019']
Out[38]:
4
In [39]:
ts.groupby(level=0).mean()
Out[39]:
2019-01-01    0.5
2019-01-02    2.5
2019-01-03    4.0
2019-01-04    5.5
dtype: float64
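Before aggregating, you can check whether an index actually holds duplicates with the is_unique attribute; a small sketch mirroring the index above:

```python
import pandas as pd
import numpy as np

d = pd.DatetimeIndex(['1/1/2019', '1/1/2019', '1/2/2019'])
ts = pd.Series(np.arange(len(d)), index=d)
# is_unique reports whether any index label repeats
print(ts.index.is_unique)  # False
```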

4 Date Ranges, Frequencies and Shifting

In [40]:
pd.date_range('1 Jan 2019', '1/31/2019')
Out[40]:
DatetimeIndex(['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04',
               '2019-01-05', '2019-01-06', '2019-01-07', '2019-01-08',
               '2019-01-09', '2019-01-10', '2019-01-11', '2019-01-12',
               '2019-01-13', '2019-01-14', '2019-01-15', '2019-01-16',
               '2019-01-17', '2019-01-18', '2019-01-19', '2019-01-20',
               '2019-01-21', '2019-01-22', '2019-01-23', '2019-01-24',
               '2019-01-25', '2019-01-26', '2019-01-27', '2019-01-28',
               '2019-01-29', '2019-01-30', '2019-01-31'],
              dtype='datetime64[ns]', freq='D')
In [41]:
pd.date_range(start = '1 Jan 2019', periods = 10)
Out[41]:
DatetimeIndex(['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04',
               '2019-01-05', '2019-01-06', '2019-01-07', '2019-01-08',
               '2019-01-09', '2019-01-10'],
              dtype='datetime64[ns]', freq='D')
In [42]:
pd.date_range(end = '31 Jan 2019', periods = 10)
Out[42]:
DatetimeIndex(['2019-01-22', '2019-01-23', '2019-01-24', '2019-01-25',
               '2019-01-26', '2019-01-27', '2019-01-28', '2019-01-29',
               '2019-01-30', '2019-01-31'],
              dtype='datetime64[ns]', freq='D')

Frequency and date offsets

In [43]:
pd.date_range('Jan 1 2019','Jan 3 2019', freq='4h')
Out[43]:
DatetimeIndex(['2019-01-01 00:00:00', '2019-01-01 04:00:00',
               '2019-01-01 08:00:00', '2019-01-01 12:00:00',
               '2019-01-01 16:00:00', '2019-01-01 20:00:00',
               '2019-01-02 00:00:00', '2019-01-02 04:00:00',
               '2019-01-02 08:00:00', '2019-01-02 12:00:00',
               '2019-01-02 16:00:00', '2019-01-02 20:00:00',
               '2019-01-03 00:00:00'],
              dtype='datetime64[ns]', freq='4H')
In [44]:
pd.date_range('Jan 1 2019','Jan 3 2019', freq='4h30min')
Out[44]:
DatetimeIndex(['2019-01-01 00:00:00', '2019-01-01 04:30:00',
               '2019-01-01 09:00:00', '2019-01-01 13:30:00',
               '2019-01-01 18:00:00', '2019-01-01 22:30:00',
               '2019-01-02 03:00:00', '2019-01-02 07:30:00',
               '2019-01-02 12:00:00', '2019-01-02 16:30:00',
               '2019-01-02 21:00:00'],
              dtype='datetime64[ns]', freq='270T')
Let's get the date of the second Friday of every month in 2019
In [45]:
pd.date_range('Jan 1 2019','Dec 31 2019', freq='WOM-2FRI')
Out[45]:
DatetimeIndex(['2019-01-11', '2019-02-08', '2019-03-08', '2019-04-12',
               '2019-05-10', '2019-06-14', '2019-07-12', '2019-08-09',
               '2019-09-13', '2019-10-11', '2019-11-08', '2019-12-13'],
              dtype='datetime64[ns]', freq='WOM-2FRI')

Shifting data

In [46]:
ts = pd.Series(np.random.randn(5), index = pd.date_range(start ='Jan 1 2019', periods=5, freq ='MS'))
ts
Out[46]:
2019-01-01    0.910424
2019-02-01   -0.862415
2019-03-01   -0.415291
2019-04-01   -1.487511
2019-05-01    0.458960
Freq: MS, dtype: float64
In [47]:
ts.shift(2)
Out[47]:
2019-01-01         NaN
2019-02-01         NaN
2019-03-01    0.910424
2019-04-01   -0.862415
2019-05-01   -0.415291
Freq: MS, dtype: float64
In [48]:
ts.shift(-2)
Out[48]:
2019-01-01   -0.415291
2019-02-01   -1.487511
2019-03-01    0.458960
2019-04-01         NaN
2019-05-01         NaN
Freq: MS, dtype: float64
In [49]:
ts.shift(2, freq='M')
Out[49]:
2019-02-28    0.910424
2019-03-31   -0.862415
2019-04-30   -0.415291
2019-05-31   -1.487511
2019-06-30    0.458960
Freq: M, dtype: float64
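A common use of shifting is computing period-over-period change: dividing a series by its shifted self gives the fractional change. A minimal sketch with made-up values:

```python
import pandas as pd

ts = pd.Series([100.0, 110.0, 99.0],
               index=pd.date_range('2019-01-01', periods=3, freq='MS'))
# fractional change relative to the previous period (first entry is NaN)
change = ts / ts.shift(1) - 1
print(change)
```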
In [50]:
ts = pd.Series(np.random.randn(5), index = pd.date_range(start ='Jan 1 2019', periods=5, freq ='MS'))
ts2 = pd.Series(np.random.randn(5), index = pd.date_range(start ='Jan 1 2019', periods=5, freq ='MS'))
In [51]:
t = pd.DataFrame({"d":ts, "e":ts2})
In [52]:
t
Out[52]:
d e
2019-01-01 1.207880 -1.173963
2019-02-01 1.764176 -0.453463
2019-03-01 0.050691 -0.075505
2019-04-01 -0.764371 -0.763209
2019-05-01 1.028606 1.610696
In [53]:
t["d"] = t["d"].shift(1,freq='2MS')
In [54]:
t
Out[54]:
d e
2019-01-01 NaN -1.173963
2019-02-01 NaN -0.453463
2019-03-01 1.207880 -0.075505
2019-04-01 1.764176 -0.763209
2019-05-01 0.050691 1.610696

Shifting dates with offsets

In [55]:
from pandas.tseries.offsets import Day, MonthEnd
In [56]:
now = datetime.now()
now
Out[56]:
datetime.datetime(2019, 2, 10, 16, 34, 9, 703492)
In [57]:
now + 3* Day()
Out[57]:
Timestamp('2019-02-13 16:34:09.703492')
In [58]:
offset = MonthEnd()
In [59]:
offset.rollforward(now)
Out[59]:
Timestamp('2019-02-28 16:34:09.703492')
In [60]:
offset.rollback(now)
Out[60]:
Timestamp('2019-01-31 16:34:09.703492')
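A neat trick from the same book: because rollforward maps any date to its month end, it can serve as a groupby key, averaging all observations within each month. A sketch with synthetic data:

```python
import pandas as pd
import numpy as np
from pandas.tseries.offsets import MonthEnd

ts = pd.Series(np.arange(8, dtype=float),
               index=pd.date_range('2019-01-15', periods=8, freq='4D'))
# roll every date forward to its month end, then average within each month
monthly = ts.groupby(MonthEnd().rollforward).mean()
print(monthly)
```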

5 Time Zone Handling

In [61]:
import pytz
In [62]:
pytz.common_timezones[-5:]
Out[62]:
['US/Eastern', 'US/Hawaii', 'US/Mountain', 'US/Pacific', 'UTC']
In [63]:
tz = pytz.timezone('America/New_York')
tz
Out[63]:
<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>
In pandas, the default time zone is None
In [64]:
dates = pd.date_range('Jan 1 2019', periods=6, freq='D')
ts = pd.Series(np.random.randn(len(dates)), index = dates)
ts
Out[64]:
2019-01-01   -0.783826
2019-01-02    0.073117
2019-01-03    0.257225
2019-01-04    2.238624
2019-01-05   -1.263806
2019-01-06   -0.691353
Freq: D, dtype: float64
In [65]:
print(ts.index.tz)
None
In [66]:
pd.date_range('Jan 1 2019 8:30 pm', periods=6, freq='D', tz = 'UTC')
Out[66]:
DatetimeIndex(['2019-01-01 20:30:00+00:00', '2019-01-02 20:30:00+00:00',
               '2019-01-03 20:30:00+00:00', '2019-01-04 20:30:00+00:00',
               '2019-01-05 20:30:00+00:00', '2019-01-06 20:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')
In [67]:
ts_utc = ts.tz_localize('UTC')
ts_utc
Out[67]:
2019-01-01 00:00:00+00:00   -0.783826
2019-01-02 00:00:00+00:00    0.073117
2019-01-03 00:00:00+00:00    0.257225
2019-01-04 00:00:00+00:00    2.238624
2019-01-05 00:00:00+00:00   -1.263806
2019-01-06 00:00:00+00:00   -0.691353
Freq: D, dtype: float64
Once the time series is localized to a particular time zone, we can convert it to another time zone using tz_convert
In [68]:
ts_utc.tz_convert('America/New_York')
Out[68]:
2018-12-31 19:00:00-05:00   -0.783826
2019-01-01 19:00:00-05:00    0.073117
2019-01-02 19:00:00-05:00    0.257225
2019-01-03 19:00:00-05:00    2.238624
2019-01-04 19:00:00-05:00   -1.263806
2019-01-05 19:00:00-05:00   -0.691353
Freq: D, dtype: float64
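One more property worth knowing: when two series in different time zones are combined, pandas aligns them on their underlying UTC instants and the result comes back in UTC. A small sketch:

```python
import pandas as pd
import numpy as np

rng = pd.date_range('2019-01-01 09:30', periods=5, freq='D')
ts1 = pd.Series(np.arange(5, dtype=float), index=rng).tz_localize('Europe/London')
# same instants, different wall-clock labels
ts2 = ts1.tz_convert('America/New_York')
result = ts1 + ts2
print(result.index.tz)  # UTC
```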

6 Periods and Period Arithmetic

In [69]:
p = pd.Period(2019, freq = 'A-JAN')
p
Out[69]:
Period('2019', 'A-JAN')
In [70]:
p+10
Out[70]:
Period('2029', 'A-JAN')
In [71]:
rng = pd.period_range('Jan 1 2019', 'Dec 31 2019', freq = 'M')
In [72]:
rng
Out[72]:
PeriodIndex(['2019-01', '2019-02', '2019-03', '2019-04', '2019-05', '2019-06',
             '2019-07', '2019-08', '2019-09', '2019-10', '2019-11', '2019-12'],
            dtype='period[M]', freq='M')
In [73]:
pd.Series(np.random.randn(len(rng)), index=rng)
Out[73]:
2019-01   -0.065846
2019-02   -0.796582
2019-03    0.958372
2019-04   -1.274628
2019-05   -0.801418
2019-06   -1.076862
2019-07   -0.034526
2019-08    0.945753
2019-09    0.055703
2019-10   -1.663554
2019-11    1.101157
2019-12    1.013785
Freq: M, dtype: float64

Period conversion

In [74]:
p = pd.Period('2019', freq = 'A-Dec')
p
Out[74]:
Period('2019', 'A-DEC')
In [75]:
p.asfreq('M')
Out[75]:
Period('2019-12', 'M')
In [76]:
p.asfreq('M', how='start')
Out[76]:
Period('2019-01', 'M')
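The same conversion works for sub-annual periods; for instance, a fiscal quarter can be mapped to the daily periods at its boundaries. A quick sketch:

```python
import pandas as pd

p = pd.Period('2019Q2', freq='Q-DEC')
# daily periods at the quarter's start and end
start = p.asfreq('D', how='start')
end = p.asfreq('D', how='end')
print(start, end)  # 2019-04-01 2019-06-30
```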

7 Resampling and Frequency Conversion

In [77]:
rng = pd.date_range('Jan 1 2019', periods = 100, freq = 'D')
In [78]:
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts.head()
Out[78]:
2019-01-01   -1.311830
2019-01-02   -0.660061
2019-01-03   -0.834177
2019-01-04    1.115807
2019-01-05   -0.494158
Freq: D, dtype: float64
In [79]:
ts.resample('M').mean()
Out[79]:
2019-01-31    0.171568
2019-02-28   -0.187548
2019-03-31   -0.120110
2019-04-30    0.057148
Freq: M, dtype: float64
In [80]:
ts.resample('M', kind='period').mean()
Out[80]:
2019-01    0.171568
2019-02   -0.187548
2019-03   -0.120110
2019-04    0.057148
Freq: M, dtype: float64

Downsampling

In [81]:
rng = pd.date_range('Jan 1 2019', periods = 20, freq = 'T')
ts = pd.Series(np.arange(len(rng)), index=rng)
ts
Out[81]:
2019-01-01 00:00:00     0
2019-01-01 00:01:00     1
2019-01-01 00:02:00     2
2019-01-01 00:03:00     3
2019-01-01 00:04:00     4
2019-01-01 00:05:00     5
2019-01-01 00:06:00     6
2019-01-01 00:07:00     7
2019-01-01 00:08:00     8
2019-01-01 00:09:00     9
2019-01-01 00:10:00    10
2019-01-01 00:11:00    11
2019-01-01 00:12:00    12
2019-01-01 00:13:00    13
2019-01-01 00:14:00    14
2019-01-01 00:15:00    15
2019-01-01 00:16:00    16
2019-01-01 00:17:00    17
2019-01-01 00:18:00    18
2019-01-01 00:19:00    19
Freq: T, dtype: int64
In [82]:
ts.resample("5T").last()
Out[82]:
2019-01-01 00:00:00     4
2019-01-01 00:05:00     9
2019-01-01 00:10:00    14
2019-01-01 00:15:00    19
Freq: 5T, dtype: int64
In [83]:
ts.resample("5T", closed='left', label = 'right').last()
Out[83]:
2019-01-01 00:05:00     4
2019-01-01 00:10:00     9
2019-01-01 00:15:00    14
2019-01-01 00:20:00    19
Freq: 5T, dtype: int64

Open-High-Low-Close (OHLC) resampling

In [84]:
ts
Out[84]:
2019-01-01 00:00:00     0
2019-01-01 00:01:00     1
2019-01-01 00:02:00     2
2019-01-01 00:03:00     3
2019-01-01 00:04:00     4
2019-01-01 00:05:00     5
2019-01-01 00:06:00     6
2019-01-01 00:07:00     7
2019-01-01 00:08:00     8
2019-01-01 00:09:00     9
2019-01-01 00:10:00    10
2019-01-01 00:11:00    11
2019-01-01 00:12:00    12
2019-01-01 00:13:00    13
2019-01-01 00:14:00    14
2019-01-01 00:15:00    15
2019-01-01 00:16:00    16
2019-01-01 00:17:00    17
2019-01-01 00:18:00    18
2019-01-01 00:19:00    19
Freq: T, dtype: int64
In [85]:
ts.resample('5T').ohlc()
Out[85]:
open high low close
2019-01-01 00:00:00 0 4 0 4
2019-01-01 00:05:00 5 9 5 9
2019-01-01 00:10:00 10 14 10 14
2019-01-01 00:15:00 15 19 15 19

Upsampling and Interpolation

In [86]:
frame = pd.DataFrame(np.random.randn(2,4), index = pd.date_range('1 Jan 2019', periods = 2, freq = 'W-WED'),
                    columns = ['A','B','C','D'])
frame
Out[86]:
A B C D
2019-01-02 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-09 0.041277 1.194271 -0.238108 -0.412966
In [87]:
df_daily = frame.resample('D').asfreq()
df_daily
Out[87]:
A B C D
2019-01-02 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-03 NaN NaN NaN NaN
2019-01-04 NaN NaN NaN NaN
2019-01-05 NaN NaN NaN NaN
2019-01-06 NaN NaN NaN NaN
2019-01-07 NaN NaN NaN NaN
2019-01-08 NaN NaN NaN NaN
2019-01-09 0.041277 1.194271 -0.238108 -0.412966
In [88]:
frame.resample('D').ffill()
Out[88]:
A B C D
2019-01-02 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-03 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-04 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-05 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-06 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-07 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-08 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-09 0.041277 1.194271 -0.238108 -0.412966
ffill with a limit on how many gaps to fill
In [89]:
frame.resample('D').ffill(limit = 2)
Out[89]:
A B C D
2019-01-02 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-03 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-04 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-05 NaN NaN NaN NaN
2019-01-06 NaN NaN NaN NaN
2019-01-07 NaN NaN NaN NaN
2019-01-08 NaN NaN NaN NaN
2019-01-09 0.041277 1.194271 -0.238108 -0.412966
In [90]:
frame.resample('W-THU').ffill()
Out[90]:
A B C D
2019-01-03 -0.272316 -0.117718 -0.107153 -0.486061
2019-01-10 0.041277 1.194271 -0.238108 -0.412966
In [91]:
frame["A"] = frame["A"].resample('W-THU').ffill()
frame
Out[91]:
A B C D
2019-01-02 NaN -0.117718 -0.107153 -0.486061
2019-01-09 NaN 1.194271 -0.238108 -0.412966
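The all-NaN column above comes from index alignment: the resampled series is labeled on Thursdays, so assigning it back into a Wednesday-indexed frame matches no labels. Reindexing makes the mismatch visible; a small sketch with hypothetical values:

```python
import pandas as pd
import numpy as np

frame = pd.DataFrame(np.ones((2, 1)),
                     index=pd.date_range('2019-01-02', periods=2, freq='W-WED'),
                     columns=['A'])
shifted = frame['A'].resample('W-THU').ffill()
# Thursday labels share nothing with the Wednesday index,
# so label-based alignment yields NaN everywhere
aligned = shifted.reindex(frame.index)
print(aligned.isna().all())  # True
```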

8 Moving Window Functions

In [92]:
rng = pd.date_range('Jan 1 2019', periods = 50, freq = 'D')
ts = pd.DataFrame(np.arange(len(rng)) + 10* (np.random.randn(len(rng))), 
                                  index=rng, columns=["a"])
ts.head()
Out[92]:
a
2019-01-01 6.630017
2019-01-02 5.674378
2019-01-03 5.784299
2019-01-04 8.663564
2019-01-05 9.416056
In [93]:
ts.shape
Out[93]:
(50, 1)
In [94]:
ts["a"].plot()
Out[94]:
<matplotlib.axes._subplots.AxesSubplot at 0x11f6901d0>
In [95]:
ts.tail()
Out[95]:
a
2019-02-15 50.570837
2019-02-16 57.933594
2019-02-17 47.741517
2019-02-18 44.181876
2019-02-19 65.310513
In [96]:
ts["a"].rolling(window=10).mean()
Out[96]:
2019-01-01          NaN
2019-01-02          NaN
2019-01-03          NaN
2019-01-04          NaN
2019-01-05          NaN
2019-01-06          NaN
2019-01-07          NaN
2019-01-08          NaN
2019-01-09          NaN
2019-01-10     4.508384
2019-01-11     4.610411
2019-01-12     4.224876
2019-01-13     4.446583
2019-01-14     3.287941
2019-01-15     5.017485
2019-01-16     3.893725
2019-01-17     6.622753
2019-01-18     8.091227
2019-01-19    10.086082
2019-01-20    12.445596
2019-01-21    14.455557
2019-01-22    16.618212
2019-01-23    19.740139
2019-01-24    23.250862
2019-01-25    23.401914
2019-01-26    24.499964
2019-01-27    24.872275
2019-01-28    28.040213
2019-01-29    30.118726
2019-01-30    32.158406
2019-01-31    32.707782
2019-02-01    34.220996
2019-02-02    33.813478
2019-02-03    33.788801
2019-02-04    35.140125
2019-02-05    38.809873
2019-02-06    41.649079
2019-02-07    40.783787
2019-02-08    40.198782
2019-02-09    38.700378
2019-02-10    39.642599
2019-02-11    39.716043
2019-02-12    38.631803
2019-02-13    40.137793
2019-02-14    40.003597
2019-02-15    40.178308
2019-02-16    40.170663
2019-02-17    41.345069
2019-02-18    42.289741
2019-02-19    45.941021
Freq: D, Name: a, dtype: float64
In [97]:
ts["a"].plot(label = "Normal")
ts["a"].rolling(window=5).median().plot(label = "5-day rolling median")
plt.legend()
Out[97]:
<matplotlib.legend.Legend at 0x11f762b38>
In [98]:
ts["a"].plot(label = "Normal")
ts["a"].expanding().median().plot(label = "Expanding median")
plt.legend()
Out[98]:
<matplotlib.legend.Legend at 0x11f7909b0>
In [99]:
ts["a"].plot(label = "Normal")
ts["a"].rolling(window='7D').median().plot(label = "7-day rolling median")
plt.legend()
Out[99]:
<matplotlib.legend.Legend at 0x121bd1358>
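Besides rolling and expanding windows, pandas also offers exponentially weighted windows via ewm, which give recent observations more weight than older ones. A sketch (not covered in the cells above):

```python
import pandas as pd

s = pd.Series([0.0, 10.0, 0.0, 10.0])
# span controls the decay; smaller spans react faster to new values
smooth = s.ewm(span=3).mean()
print(smooth)
```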
