python - Pandas: why is pandas.Series.std() quite different from numpy.std()?


I have the following two code snippets.

import numpy
numpy.std([766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346])
# 0.0

and

import pandas as pd
pd.Series([766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346]).std(ddof=0)
# 10.119288512538814

That's a huge difference.

May I ask why?

This issue is indeed under discussion (link); the problem seems to be that the algorithm pandas uses to calculate the standard deviation is not as numerically stable as the one used by numpy.
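To illustrate what "not numerically stable" can mean here, below is a minimal pure-Python sketch comparing two textbook ways of computing a population standard deviation. It is not the actual pandas or numpy implementation; the function names one_pass_std and two_pass_std are made up for this example. The one-pass version subtracts two very large, nearly equal numbers (the mean of the squares and the square of the mean), so rounding error in the squares can survive into the result. The two-pass version subtracts the mean first, so the values being squared are small.

```python
import math

def one_pass_std(values):
    # Hypothetical helper: textbook "sum of squares" formula,
    # var = E[x^2] - (E[x])^2. Prone to cancellation for large values.
    n = len(values)
    total = 0.0
    total_sq = 0.0
    for v in values:
        total += v
        total_sq += v * v  # ~5.9e17 here, beyond exact float64 integers
    mean = total / n
    var = total_sq / n - mean * mean
    return math.sqrt(max(var, 0.0))  # guard against a tiny negative variance

def two_pass_std(values):
    # Hypothetical helper: compute the mean first, then average the
    # squared deviations from it.
    n = len(values)
    total = 0.0
    for v in values:
        total += v
    mean = total / n
    sq_dev = 0.0
    for v in values:
        sq_dev += (v - mean) ** 2
    return math.sqrt(sq_dev / n)

data = [766897346.0] * 10
print(one_pass_std(data))  # affected by rounding in the large squares
print(two_pass_std(data))  # every deviation is exactly zero
```

For ten identical values the two-pass result is exactly zero, because each deviation from the mean is zero before anything is squared. The one-pass result depends on how the rounding errors in the squares accumulate, so it need not be zero.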

An easy workaround is to apply .values to the series first and then call std on those values; in that case numpy's std is used:

pd.Series([766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346, 766897346]).values.std()

which gives the expected value, 0.

