python - Fitting a Binomial distribution with pymc raises ZeroProbability error for certain FillValues -


i'm not sure if found bug in pymc. seems fitting binomial missing data can produce zeroprobability error depending on chosen fill_value masks missing data. maybe i'm using wrongly. tried following example current master branch github. i'm aware of bug concerning binomial distributions in pymc 2.3.4, seems different issue.

i fitted binomial distribution pymc , worked expected:

import scipy sp import pymc  def make_model(observed_values):     p = pymc.uniform('p', lower = 0.0, upper = 1.0, value = 0.1)     values = pymc.binomial('values', n = 10* sp.ones_like(observed_values), p = p * sp.ones_like(observed_values),\                              value = observed_values, observed = true, plot = false)     values = pymc.binomial('values', n = 10, p = p,\                              value = observed_values, observed = true, plot = false)     return locals()  sp.random.seed(0) observed_values = sp.random.binomial(n = 10.0, p = 0.1, size = 100)  m1 = pymc.mcmc(make_model(observed_values)) m1.sample(iter=10000, burn=1000, thin=10) pymc.matplot.plot(m1) m1.summary() 

output:

  [-----------------100%-----------------] 10000 of 10000 complete in 0.7 sec plotting p    p:            mean             sd               mc error        95% hpd interval           ------------------------------------------------------------------           0.093            0.007            0.0              [ 0.081  0.107]             posterior quantiles:            2.5             25              50              75             97.5           |---------------|===============|===============|---------------|           0.08             0.088           0.093          0.097         0.106 

now, tried similar situation difference 1 observed value missing:

mask = sp.zeros_like(observed_values) mask[0] = true masked_values = sp.ma.masked_array(observed_values, mask = mask, fill_value = 999999)  m2 = pymc.mcmc(make_model(masked_values)) m2.sample(iter=10000, burn=1000, thin=10) pymc.matplot.plot(m2) m2.summary() 

unexpectedly, got zeroprobability error:

--------------------------------------------------------------------------- zeroprobability                           traceback (most recent call last) <ipython-input-16-4f945f269628> in <module>() ----> 1 m2 = pymc.mcmc(make_model(masked_values))       2 m2.sample(iter=10000, burn=1000, thin=10)       3 pymc.matplot.plot(m2)       4 m2.summary()  <ipython-input-12-cb8707bb911f> in make_model(observed_values)       4 def make_model(observed_values):       5     p = pymc.uniform('p', lower = 0.0, upper = 1.0, value = 0.1) ----> 6     values = pymc.binomial('values', n = 10* sp.ones_like(observed_values), p = p * sp.ones_like(observed_values),                             value = observed_values, observed = true, plot = false)       7     values = pymc.binomial('values', n = 10, p = p,                             value = observed_values, observed = true, plot = false)       8     return locals()  /home/fabian/anaconda/lib/python2.7/site-packages/pymc/distributions.pyc in __init__(self, *args, **kwds)     318                     logp_partial_gradients=logp_partial_gradients,     319                     dtype=dtype, --> 320                     **arg_dict_out)     321      322     new_class.__name__ = name  /home/fabian/anaconda/lib/python2.7/site-packages/pymc/pymcobjects.pyc in __init__(self, logp, doc, name, parents, random, trace, value, dtype, rseed, observed, cache_depth, plot, verbose, isdata, check_logp, logp_partial_gradients)     773         if check_logp:     774             # check initial value --> 775             if not isinstance(self.logp, float):     776                 raise valueerror(     777                     "stochastic " +  /home/fabian/anaconda/lib/python2.7/site-packages/pymc/pymcobjects.pyc in get_logp(self)     930                     (self._value, self._parents.value))     931             else: --> 932                 raise zeroprobability(self.errmsg)     933      934         return logp  zeroprobability: stochastic values's value outside support, or forbids parents' current values. 

however, if change fill value in masked array 1, fitting works again:

masked_values2 = sp.ma.masked_array(observed_values, mask = mask, fill_value = 1)  m3 = pymc.mcmc(make_model(masked_values2)) m3.sample(iter=10000, burn=1000, thin=10) pymc.matplot.plot(m3) m3.summary() 

output:

[-----------------100%-----------------] 10000 of 10000 complete in 2.1 sec plotting p  p:          mean             sd               mc error        95% hpd interval         ------------------------------------------------------------------         0.092            0.007            0.0              [ 0.079  0.105]           posterior quantiles:          2.5             25              50              75             97.5         |---------------|===============|===============|---------------|         0.079            0.088           0.092          0.097         0.105   values:          mean             sd               mc error        95% hpd interval         ------------------------------------------------------------------         1.15             0.886            0.029                  [ 0.  3.]           posterior quantiles:          2.5             25              50              75             97.5         |---------------|===============|===============|---------------|         0.0              1.0             1.0            2.0           3.0 

is bug or there problem model? help!

i asked question pymc developers on github , there got answer fonnesbeck (https://github.com/pymc-devs/pymc/issues/47#issuecomment-129301002):

this because filled missing values 999999, outside support of variable. need give valid value, not 1 occurs in non-missing data."

to make sure missing value not in non-missing data following suggested:

you can give non-integer value in order avoid problem cite. example, if fill with, say, 1.5 should work.

(...) pymc calculates log-probability @ first iteration, , therefore values inserted missing values @ first iteration have valid. if give discrete values floating point value, should end getting truncated when converted integers, work."


Comments

Popular posts from this blog

Fail to load namespace Spring Security http://www.springframework.org/security/tags -

sql - MySQL query optimization using coalesce -

unity3d - Unity local avoidance in user created world -