python - How to get a dataframe index based on subselection


I have a dataframe (szen_df) and select a portion of it, which I assign to a new dataframe (orb_df). When I do this, the index of the subselected dataframe still contains the whole index of the original dataframe. I want only the level 0 index of the new dataframe. E.g.:

from datetime import datetime

start = datetime(2007, 1, 25, 12, 49, 0)
end = datetime(2007, 1, 25, 14, 30, 0)
orb_df = szen_df.loc[start:end]

orb_df shows:

[Image: screenshot of the orb_df output]

But if I query the index of the new dataframe, it still has the dates of the old dataframe.

orb_df.index.levels[0] 

shows:

DatetimeIndex(['2007-01-25 00:00:00', '2007-01-25 00:10:00',
               '2007-01-25 00:20:00', '2007-01-25 00:30:00',
               '2007-01-25 00:40:00', '2007-01-25 00:50:00',
               '2007-01-25 01:00:00', '2007-01-25 01:10:00',
               '2007-01-25 01:20:00', '2007-01-25 01:30:00',
               ...
               '2007-01-25 22:20:00', '2007-01-25 22:30:00',
               '2007-01-25 22:40:00', '2007-01-25 22:50:00',
               '2007-01-25 23:00:00', '2007-01-25 23:10:00',
               '2007-01-25 23:20:00', '2007-01-25 23:30:00',
               '2007-01-25 23:40:00', '2007-01-25 23:50:00'],
              dtype='datetime64[ns]', name=u'time', length=144, freq=None, tz=None)

It has 144 elements. According to my subselection it should have only 11 elements. I need an index that starts at 2007-01-25 12:50:00 and ends at 2007-01-25 14:30:00, in other words the level 0 index of the new subselection.

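Here is one way to do it. First use reset_index(level='pos') to break down the multi-level index, then slice by time, and finally use set_index('pos', append=True) to rebuild the multi-level index. Because the index is rebuilt from the remaining rows, its levels contain only the timestamps that are still present.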

import pandas as pd
import numpy as np

# simulate data: 100 timestamps at 10-minute spacing, 3 positions each
np.random.seed(0)
multi_index = pd.MultiIndex.from_product(
    [pd.date_range('2007-02-01 00:00:00', periods=100, freq='10min'),
     ['left', 'center', 'right']],
    names=['time', 'pos'])
szen_df = pd.DataFrame(np.random.randn(300, 3), index=multi_index,
                       columns=['lat', 'lon', 'szen'])

Out[48]:
                               lat     lon    szen
time                pos
2007-02-01 00:00:00 left    1.7641  0.4002  0.9787
                    center  2.2409  1.8676 -0.9773
                    right   0.9501 -0.1514 -0.1032
2007-02-01 00:10:00 left    0.4106  0.1440  1.4543
                    center  0.7610  0.1217  0.4439
                    right   0.3337  1.4941 -0.2052
2007-02-01 00:20:00 left    0.3131 -0.8541 -2.5530
                    center  0.6536  0.8644 -0.7422
                    right   2.2698 -1.4544  0.0458
2007-02-01 00:30:00 left   -0.1872  1.5328  1.4694
                    center  0.1549  0.3782 -0.8878
                    right  -1.9808 -0.3479  0.1563
2007-02-01 00:40:00 left    1.2303  1.2024 -0.3873
                    center -0.3023 -1.0486 -1.4200
                    right  -1.7063  1.9508 -0.5097
...                            ...     ...     ...
2007-02-01 15:50:00 left   -0.4367 -1.6430 -0.4061
                    center -0.5353  0.0254  1.1542
                    right   0.1725  0.0211  0.0995
2007-02-01 16:00:00 left    0.2274 -1.0167 -0.1148
                    center  0.3088 -1.3708  0.8657
                    right   1.0814 -0.6314 -0.2413
2007-02-01 16:10:00 left   -0.8782  0.6994 -1.0612
                    center -0.2225 -0.8589  0.0510
                    right  -1.7942  1.3265 -0.9646
2007-02-01 16:20:00 left    0.0599 -0.2125 -0.7621
                    center -0.8878  0.9364 -0.5256
                    right   0.2712 -0.8015 -0.6472
2007-02-01 16:30:00 left    0.4722  0.9304 -0.1753
                    center -1.4219  1.9980 -0.8565
                    right  -1.5416  2.5944 -0.4040

[300 rows x 3 columns]

# break the 'pos' level out of the index, slice by time, then rebuild the MultiIndex
start_time = '2007-02-01 12:50:00'
end_time = '2007-02-01 14:30:00'
orb_df = szen_df.reset_index(level='pos').loc[start_time:end_time].set_index('pos', append=True)

orb_df.index

Out[50]:
MultiIndex(levels=[[2007-02-01 12:50:00, 2007-02-01 13:00:00, 2007-02-01 13:10:00, 2007-02-01 13:20:00, 2007-02-01 13:30:00, 2007-02-01 13:40:00, 2007-02-01 13:50:00, 2007-02-01 14:00:00, 2007-02-01 14:10:00, 2007-02-01 14:20:00, 2007-02-01 14:30:00], ['center', 'left', 'right']],
           labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10], [1, 0, 2, 1, 0, 2, 1, 0, 2, 1, 0, 2, 1, 0, 2, 1, 0, 2, 1, 0, 2, 1, 0, 2, 1, 0, 2, 1, 0, 2, 1, 0, 2]],
           names=['time', 'pos'])

orb_df.index.levels[0]

Out[59]:
DatetimeIndex(['2007-02-01 12:50:00', '2007-02-01 13:00:00',
               '2007-02-01 13:10:00', '2007-02-01 13:20:00',
               '2007-02-01 13:30:00', '2007-02-01 13:40:00',
               '2007-02-01 13:50:00', '2007-02-01 14:00:00',
               '2007-02-01 14:10:00', '2007-02-01 14:20:00',
               '2007-02-01 14:30:00'],
              dtype='datetime64[ns]', name='time', freq=None, tz=None)
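As a possible alternative to the reset_index / set_index round trip, pandas 0.20 and later provide MultiIndex.remove_unused_levels(), and get_level_values() reads the timestamps actually present without modifying the index at all. The snippet below is only a sketch on simulated data of the same shape as above; the names sliced, n_levels_before, n_levels_after and times_present are illustrative and not part of the original question or answer.

```python
import numpy as np
import pandas as pd

# Simulated data with the same layout as above (values are random).
np.random.seed(0)
times = pd.date_range('2007-02-01 00:00:00', periods=100, freq='10min')
multi_index = pd.MultiIndex.from_product(
    [times, ['left', 'center', 'right']], names=['time', 'pos'])
szen_df = pd.DataFrame(np.random.randn(300, 3), index=multi_index,
                       columns=['lat', 'lon', 'szen'])

start_time = pd.Timestamp('2007-02-01 12:50:00')
end_time = pd.Timestamp('2007-02-01 14:30:00')

# A plain slice keeps every original level value, even the unused ones.
sliced = szen_df.loc[start_time:end_time]
n_levels_before = len(sliced.index.levels[0])

# Option 1: rebuild the index so it drops level values that no row uses.
orb_df = sliced.copy()
orb_df.index = orb_df.index.remove_unused_levels()
n_levels_after = len(orb_df.index.levels[0])

# Option 2: leave the index alone and read the timestamps present in the rows.
times_present = sliced.index.get_level_values('time').unique()

print(n_levels_before, n_levels_after, len(times_present))
```

Option 2 is handy when only the list of timestamps is needed, since it does not copy or alter the dataframe's index.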
