Indexing and slicing time series
Indexing
The indexing methods shown here for a time series also apply to a DataFrame. Because a time series is normally sorted in chronological order, ordering is not a concern when indexing it.
Basic positional indexing works the same way as it does for a list:
from datetime import datetime
import numpy as np
import pandas as pd

# Daily index from 2017-01-01 to 2017-03-01
rng = pd.date_range('2017/1','2017/3')
ts = pd.Series(np.random.rand(len(rng)), index = rng)
print(ts.head())
# Positional access; recent pandas versions expect .iloc here rather than ts[0]
print(ts.iloc[0])
print(ts.iloc[:2])
>>>
2017-01-01    0.107736
2017-01-02    0.887981
2017-01-03    0.712862
2017-01-04    0.920021
2017-01-05    0.317863
Freq: D, dtype: float64
0.107735945027
2017-01-01    0.107736
2017-01-02    0.887981
Freq: D, dtype: float64
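As a minimal sketch of positional access, the snippet below uses made-up values from `np.arange` so that the results are predictable. The variable names are illustrative only, and `.iloc` is used to state the positional intent explicitly.

```python
import numpy as np
import pandas as pd

# Illustrative daily series with predictable values 0.0, 1.0, 2.0, ...
rng = pd.date_range("2017-01-01", periods=5, freq="D")
ts = pd.Series(np.arange(5, dtype=float), index=rng)

first = ts.iloc[0]    # scalar at position 0
last = ts.iloc[-1]    # negative positions count from the end, as in a list
head2 = ts.iloc[:2]   # positions 0 and 1; the end position is excluded

print(first, last)
print(head2)
```

Position and label address the same element here: position 0 is the entry labelled 2017-01-01.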
In addition to basic positional indexing, a time series can be indexed by label. The label can be a date string in any of several formats or a datetime object:
from datetime import datetime
rng = pd.date_range('2017/1','2017/3')
ts = pd.Series(np.random.rand(len(rng)), index = rng)
print(ts['2017/1/2'])
print(ts['20170103'])
print(ts['1/10/2017'])
print(ts[datetime(2017,1,20)])
>>>
0.887980757812
0.712861778966
0.788336674948
0.93070380011
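Since the text says the same indexing applies to a DataFrame, here is a hedged sketch of what that might look like with `.loc`. The column names `a` and `b` and their values are invented for illustration, and ISO-style date strings are used for the labels.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Illustrative frame on a daily index from 2017-01-01 to 2017-03-01 (60 rows)
rng = pd.date_range("2017-01-01", "2017-03-01")
df = pd.DataFrame(
    {"a": np.arange(len(rng)), "b": np.arange(len(rng)) * 10},
    index=rng,
)

# A full date string or a datetime object selects a single row (as a Series)
row_from_string = df.loc["2017-01-03"]
row_from_datetime = df.loc[datetime(2017, 1, 3)]

# A month string selects every row in that month
january = df.loc["2017-01"]

print(row_from_string)
print(len(january))
```

Going through `.loc` keeps the row lookup unambiguous, because plain `df[...]` with a string is normally interpreted as a column name.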
Slicing
Slicing by position was already shown in the positional indexing example above. Slicing by label on a time series follows the same rule as label slicing on any Series: the end label is included in the result.
# One value every 12 hours; newer pandas versions spell this alias '12h'
rng = pd.date_range('2017/1','2017/3',freq = '12H')
ts = pd.Series(np.random.rand(len(rng)), index = rng)
print(ts['2017/1/5':'2017/1/10'])
>>>
2017-01-05 00:00:00    0.462085
2017-01-05 12:00:00    0.778637
2017-01-06 00:00:00    0.356306
2017-01-06 12:00:00    0.667964
2017-01-07 00:00:00    0.246857
2017-01-07 12:00:00    0.386956
2017-01-08 00:00:00    0.328203
2017-01-08 12:00:00    0.260853
2017-01-09 00:00:00    0.224920
2017-01-09 12:00:00    0.397457
2017-01-10 00:00:00    0.158729
2017-01-10 12:00:00    0.501266
Freq: 12H, dtype: float64

# Passing only a month returns the slice for that whole month
print(ts['2017/2'].head())
>>>
2017-02-01 00:00:00    0.243932
2017-02-01 12:00:00    0.220830
2017-02-02 00:00:00    0.896107
2017-02-02 12:00:00    0.476584
2017-02-03 00:00:00    0.515817
Freq: 12H, dtype: float64
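The endpoint behaviour can be checked with a small deterministic sketch. The values below are made up with `np.arange`, and the 12-hour step is passed as a `pd.Timedelta` so that the example does not depend on how a given pandas version spells the frequency alias.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Illustrative series: one value every 12 hours, 2017-01-01 to 2017-03-01
rng = pd.date_range("2017-01-01", "2017-03-01", freq=pd.Timedelta(hours=12))
ts = pd.Series(np.arange(len(rng)), index=rng)

# Date strings: the end label covers the whole of January 10,
# so the 12:00 entry on that day is included
window = ts.loc["2017-01-05":"2017-01-10"]

# datetime objects are exact: the slice stops at 2017-01-10 00:00
exact = ts.loc[datetime(2017, 1, 5):datetime(2017, 1, 10)]

# A month string selects the whole month
february = ts.loc["2017-02"]

print(len(window), len(exact), len(february))
```

The string slice returns 12 rows, matching the output shown above, while the slice with `datetime` endpoints returns one row fewer.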
Time series with duplicate indexes
dates = pd.DatetimeIndex(['1/1/2015','1/2/2015','1/3/2015','1/4/2015','1/1/2015','1/2/2015'])
ts = pd.Series(np.random.rand(6), index = dates)
print(ts)
# is_unique reports whether the values, or the index labels, are free of duplicates
print(ts.is_unique,ts.index.is_unique)
>>>
2015-01-01    0.300286
2015-01-02    0.603865
2015-01-03    0.017949
2015-01-04    0.026621
2015-01-01    0.791441
2015-01-02    0.526622
dtype: float64
True False
The results show that in this time series the index contains duplicates (ts.index.is_unique is False), while the values do not (ts.is_unique is True).
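To see which labels are repeated, one option is `Index.duplicated`. The sketch below uses the same dates as above with made-up values (10.0 to 60.0) so that the results are predictable.

```python
import pandas as pd

# Same dates as the example above; the values are invented for illustration
dates = pd.DatetimeIndex(
    ["2015-01-01", "2015-01-02", "2015-01-03", "2015-01-04",
     "2015-01-01", "2015-01-02"]
)
ts = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0, 60.0], index=dates)

# keep=False marks every occurrence of a repeated label, not just the later ones
dup_mask = ts.index.duplicated(keep=False)
print(ts[dup_mask])

# A repeated label returns a Series; a unique label returns a scalar
repeated = ts.loc[pd.Timestamp("2015-01-01")]
single = ts.loc[pd.Timestamp("2015-01-03")]
print(repeated)
print(single)
```

This difference in return type is a common source of surprises when an index is assumed to be unique.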
One way to resolve the duplicate index entries is to average the values that share the same timestamp:
# Group by the index (level 0) and replace each group of duplicates with its mean
print(ts.groupby(level = 0).mean())
>>>
2015-01-01    0.545863
2015-01-02    0.565244
2015-01-03    0.017949
2015-01-04    0.026621
dtype: float64
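Averaging is only one possible policy. As a hedged sketch with made-up values, the snippet below compares it with two alternatives that may suit other data: counting the entries per timestamp, and keeping only the first occurrence of each label.

```python
import pandas as pd

# Same dates as the example above; the values are invented for illustration
dates = pd.DatetimeIndex(
    ["2015-01-01", "2015-01-02", "2015-01-03", "2015-01-04",
     "2015-01-01", "2015-01-02"]
)
ts = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0, 60.0], index=dates)

# Average the values that share a timestamp (the approach used above)
mean_ts = ts.groupby(level=0).mean()

# Count how many values each timestamp has
count_ts = ts.groupby(level=0).count()

# Keep the first occurrence of each timestamp and drop the later ones
first_ts = ts[~ts.index.duplicated(keep="first")]

print(mean_ts)
print(count_ts)
print(first_ts)
```

All three results have a unique index; which one is appropriate depends on whether the repeated entries are separate measurements or accidental re-entries.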
Originally published: December 17, 2018
Author: the salt fish of Huangjin
This article comes from the yunqi community partner “Salted fish Plath”; you can follow the WeChat public account xianyuplus1995.