DatetimeIndex reverse-lookup memo as I’m not quite sure
- 1. Introduction
- 2. Data used for case studies and versions of pandas and Python
- 3. Case studies
- 3.1. Confirming DataFrame configuration; index, columns, values and name
- 3.2. 指定した行番号のindex名を表示, index
- 3.3. Print the index name of the specified line number; index
- 3.4. Print column name of specified column number; columns
- 3.5. Extract “year”; year
- 3.6. Extract “month”; month
- 3.7. Extract “day”; day
- 3.8. Extract “date”; date
- 3.9. Extract “week number of year”; week
- 3.10. Extract “weekday”; weekday
- 3.11. Extract “day of the week” as a string; day_name ()
- 3.12. Extract “hour”; hour
- 3.13. Extract “minute”; minute
- 3.14. Extract “second”; second
- 3.15. Extract “hour”, “minute” and “second”; time
- 3.16. Extract a row by specifying the time and datetime with datetime.time type; loc
- 3.17. Extract value by specifying index and column
- 3.18. Confirm and set the time zone; tzinfo(), tz_localize(), tz_convert()
- 3.19. Convert python datetime.datetime object to pandas datetime (Timestamp); to_datetime ()
Introduction
I almost always feel confused about pandas.DatetimeIndex when I set DatetimeIndex in pandas.
More, I tend to drown in the sea of superabundant information, so I decided to make personal notes here.
Click the table of contents to jump to the relevant item!
For more details, please check official pandas website.
Data used for case studies and versions of pandas and Python
This post uses the following DataFrame as variable name “df”.
First of all, let’s make sure the data type of index is DatetimeIndex.
print(df.index) DatetimeIndex(['2020-01-02 03:04:05', '2021-06-07 08:09:10','2022-11-12 13:14:15'], dtype='datetime64[ns]', name='YMDHMS', freq=None)
Good good.
Then versions of pandas and Python are 0.24.2 and 3.7.2 respectively.
* ‘Extract “day of the week” as a string; day_name ()’ was curried out in pandas 1.0.5 (added on June 26, 2020)
しときます。
Well then, prepare in advance;
import pandas as pd import datetime
Case studies
Confirming DataFrame configuration; index, columns, values and name
print(df.index) #confirm index DatetimeIndex(['2020-01-02 03:04:05', '2021-06-07 08:09:10', '2022-11-12 13:14:15'], dtype='datetime64[ns]', name='YMDHMS', freq=None)
print(df.columns) #confirm columns Index(['a', 'b', 'c'], dtype='object')df.values
print(df.index.name) #confirm index name 'YMDHMS'
print(df.values) #confirm values array([['a0', 'b0', 'c0'], ['a1', 'b1', 'c1'], ['a2', 'b2', 'c2']], dtype=object)
print(df.columns.name) # confirm column names None
指定した行番号のindex名を表示, index
Print the index name of the specified line number; index
print(df.index[0]) #row 1 Timestamp('2020-01-02 03:04:05')
print(df.index[1]) #row 2 Timestamp('2021-06-07 08:09:10')
print(df.index[0:2]) #row 1-2 DatetimeIndex(['2020-01-02 03:04:05', '2021-06-07 08:09:10'], dtype='datetime64[ns]', name='YMDHMS', freq=None)
Print column name of specified column number; columns
print(df.columns[0]) #column 1 'a'
print(df.columns[1]) #column 2 'b'
print(df.columns[0:2]) #columns 1-2 Index(['a', 'b'], dtype='object')
Extract “year”; year
print(df.index.year) #years in the DatetimeIndex Int64Index([2020, 2021, 2022], dtype='int64', name='YMDHMS')
print(df.index[0].year) #year at row 1 2020
Extract “month”; month
print(df.index.month) #months in the DatetimeIndex Int64Index([1, 6, 11], dtype='int64', name='YMDHMS')
print(df.index[0].month) #month at row 1 1
Extract “day”; day
print(df.index.day) #days in the DatetimeIndex Int64Index([2, 7, 12], dtype='int64', name='YMDHMS')
print(df.index[0].day) #day at row 1 2
Extract “date”; date
print(df.index.date) #date in the DatetimeIndex array([datetime.date(2020, 1, 2), datetime.date(2021, 6, 7), datetime.date(2022, 11, 12)], dtype=object)
display(df.index[0].date) #date at row 1. Oops, this doesn't work. <built-in method date of Timestamp object at 0x115d0fa48> display(df.index.date[0]) #The first item of the date in the DatatimeIndex. Hum! datetime.date(2020, 1, 2)
Extract “week number of year”; week
print(df.index.week) #week number of DatetimeIndex Int64Index([1, 23, 45], dtype='int64', name='YMDHMS')
display(df.index[0].week) #week number of row 1
Extract “weekday”; weekday
Monday;0, Tuesday;1, Wednesday;2, Thursday;3, Friday;4, Saturday;5, SUnday;6
print(df.index.weekday) #weekday in the DatetimeIndex(Thursday, Monday and Saturday in this case) Int64Index([3, 0, 5], dtype='int64', name='YMDHMS')
print(df.index[0].weekday) #weekday in row 1. Oops, this doesn't work. <built-in method weekday of Timestamp object at 0x115d0fa48>
print(df.index.weekday[0]) #the first item of weekdays in the DatetimeIndex. (3;Thursday in this case)
Extract “day of the week” as a string; day_name ()
*This was curried out in pandas 1.0.5 (added on June 26, 2020)
Use “day_name ()” to extract “day of the week” as a string when pandas version is 1.0.0 or more, since “weekday_name” was abolished after pandas 1.0.0.
print(df.index.day_name()) Index(['Thursday', 'Monday', 'Saturday'], dtype='object', name='YMDHMS')
print(df.index[0].day_name()) Thursday
Extract “hour”; hour
print(df.index.hour) #hours in the DatetimeIndex Int64Index([3, 8, 13], dtype='int64', name='YMDHMS')
print(df.index[0].hour) #hour in row 1
Extract “minute”; minute
print(df.index.minute) #minutes in the DatetimeIndex Int64Index([4, 9, 14], dtype='int64', name='YMDHMS')
print(df.index[0].minute) #minute in row 1
Extract “second”; second
print(df.index.second) #seconds in the DatetimeIndex Int64Index([5, 10, 15], dtype='int64', name='YMDHMS')
print(df.index[0].second) #second in row 1
Extract “hour”, “minute” and “second”; time
display(df.index[0].time) #Attempt to get "hour", "minute" and "second" in row 1, but this doesn't work. <built-in method time of Timestamp object at 0x115d0fa48> print(df.index.time[0]) #This way works. 03:04:05
Extract a row by specifying the time and datetime with datetime.time type; loc
print(df.loc[datetime.time(3,4,5)]) # Specify the datetime-type-time of (03:04:05) with datetime.time a b c YMDHMS 2020-01-02 03:04:05 a0 b0 c0
print(df.loc[datetime.datetime(2020, 1, 2, 3, 4, 5 )]) # Specify "2020-01-02 03:04:05" with datetime.datetime a a0 b b0 c c0 Name: 2020-01-02 03:04:05, dtype: object
Note if it is written in a way like the below;
print(df.loc[datetime.date(2020, 1, 2)])
or
print(df.loc[datetime.datetime(2020, 1, 2, 3, 4)])
those does not include all element of datetime, it gets an error…
In these cases, converting datetime.date into a string goes well even if not including all element of datetime.
print(df.loc[str(datetime.date(2020, 1, 2))])# converting datetime.date into a string goes well. a b c YMDHMS 2020-01-02 03:04:05 a0 b0 c0
But in the case of datetime.datetime, an error occurs when converting partial datetime into a string.
print(df.loc[str(datetime.datetime(2020, 1, 2, 3, 4))]) #this doesn't work
Extract a row by specifying the time and datetime with strings; loc
#Specify the date and time with strings print(df.loc['2020']) print(df.loc['2020-01']) print(df.loc['2020-01-02']) print(df.loc['2020-01-02 03']) print(df.loc['2020-01-02 03:04']) #without loc goes well.... print(df['2020']) print(df['2020-01']) print(df['2020-01-02']) print(df['2020-01-02 03']) print(df['2020-01-02 03:04']) #All of the above gives the following results. (Since only one row meets these specifications, results are the same) a b c YMDHMS 2020-01-02 03:04:05 a0 b0 c0 print(df.loc['2020-01-02 03:04:05']) #specify whole datetime #or print(df['2020-01-02 03:04:05']) a a0 b b0 c c0 Name: 2020-01-02 03:04:05, dtype: object
With time alone, it is not accepted.
print(df.loc['03:04:05']) #This gets an error.
So, when extracting just by specifying the time…
df.loc[datetime.date(3, 4, 5)]
This way works.
Extract value by specifying index and column
print(df['2020']['a']) #index and column including 2020 YMDHMS 2020-01-02 03:04:05 a0 Name: a, dtype: object
print(df.loc['2020']['a']) #loc, as well YMDHMS 2020-01-02 03:04:05 a0 Name: a, dtype: object
print(df.loc[datetime.datetime(2020, 1, 2, 3, 4,5)]['a']) #with datetime.datetime a0
display(df.iloc[0][0]) #iloc when specifying the position by row number and column number a0
display(df.iloc[0][0:2]) #with slice a a0 b b0 Name: 2020-01-02 03:04:05, dtype: object
display(df.at[datetime.datetime(2020,1,2, 3, 4, 5), 'a']) #with at a0
display(df[['b', 'c']]) #multiple columns work b c YMDHMS 2020-01-02 03:04:05 b0 c0 2021-06-07 08:09:10 b1 c1 2022-11-12 13:14:15 b2 c2
display(df.iat[0, 1]) #with iat, by row number and column number b0
display(df.at[df.index[0], 'a']) #row number and column name a0
display(df.at[df['2020'], 'a']) #This gives an error...
Confirm and set the time zone; tzinfo(), tz_localize(), tz_convert()
print(df.index.tzinfo) #confirm the time zone None #not set yet #set the time zone df.index = df.index.tz_localize('Asia/Tokyo') df.index = df.index.tz_convert('Asia/Tokyo') print(df.index.tzinfo) Asia/Tokyo #done! print(df) #You can see that +09:00 has been added. a b c YMDHMS 2020-01-02 03:04:05+09:00 a0 b0 c0 2021-06-07 08:09:10+09:00 a1 b1 c1 2022-11-12 13:14:15+09:00 a2 b2 c2
Convert python datetime.datetime object to pandas datetime (Timestamp); to_datetime ()
mae = datetime.datetime(2020,1,2,3,4,5) print(mae) 2020-01-02 03:04:05 print(type(mae)) <class 'datetime.datetime'> ato = pd.to_datetime(mae) print(ato) 2020-01-02 03:04:05 print(type(ato)) <class 'pandas._libs.tslibs.timestamps.Timestamp'>
ちょっと広告です
https://business.xserver.ne.jp/
https://www.xdomain.ne.jp/
★LOLIPOP★
.tokyo
MuuMuu Domain!