ENH: Add the negative binomial distribution to rand_distr.#1296
ENH: Add the negative binomial distribution to rand_distr.#1296WarrenWeckesser wants to merge 6 commits into
Conversation
dhardy
left a comment
There was a problem hiding this comment.
The negative binomial makes no sense for r=0. But we can generalise to p=0, since our output is floating-point which has a representation for infinity?
The code style looks good.
As for the implementation, all I can say is that it correlates with the mentioned reference, 10.1007/978-1-4613-8643-8. I tried comparing with 10.1002/0471715816 (Johnson 2005, Univariate Discrete Distributions), but my understanding of statistics is lacking. Perhaps @saona-raimundo would be willing to take a look?
If p=0, the probability mass function of the distribution would be |
|
Hi! Note that Wikipedia accepts Regarding the implementation, I confirm it corresponds to the citation. @dhardy, I will check your reference 10.1002/0471715816 to see if there are improved algorithms proposed. |
| // and saved in the NegativeBinomial instance, because it | ||
| // depends on just the parameters `r` and `p`. We have to | ||
| // create a new Poisson instance for each variate generated. | ||
| Poisson::<F>::new(gamma.sample(rng)).unwrap().sample(rng) |
There was a problem hiding this comment.
Instead of unwrap, one should take care of the case where gamma.sample(rng) returns a float which should not be accepted.
I suggest introducing a loop which samples until one gets a finite sample.
The Float trait has the method is_finite for this.
There was a problem hiding this comment.
Can this happen? The Gamma distribution should return strictly positive values, so Poisson::new should never fail.
There was a problem hiding this comment.
Sorry, my point was about handling infinity as the result of simulating the gamma variable.
Nothing ensures that a Gamma samples always a positive and finite float.
Not is the signature of the sample method nor in its documentation.
At some point there was a discussion about who should handle infinity out of the simulation: the library or the user. I thought the decision was that the user should handle infinity floats, maybe I am wrong. This is why I suggest handling a possible infinite value here with a loop.
There was a problem hiding this comment.
I agree, I need to fix this. If p is extremely small (e.g. 1e-40), then the scale passed to Gamma is huge (1e+40), and with such a scale, Gamma will generate samples that are infinity.
A simple loop would not be safe if we don't have a bound on how frequently infinity is generated.
There was a problem hiding this comment.
Yeah, I was not even thinking on extreme values, just the unwrap.
The thing is, if gamma samples infinity, then the Poisson "should" also infinity.
Then, instead of a loop, one should introduce an if checking if the gamma samples infinity.
If it does, return infinity directly, if it does not, sample the Poisson (created with new and unwrap).
|
The reference 10.1002/0471715816 "Univariate Discrete Distributions" is really nice! The reference If I am not mistaken, |
|
For the normal distribution, we are using tables (Ziggurat algorithm). However, I agree that the approach here is fine. |
|
Great, and thanks for the review. Then are we agreed to merge this (once the above is corrected)? I didn't review in detail. |
vks
left a comment
There was a problem hiding this comment.
Looks good, thanks! We just need to update the changelog.
Co-authored-by: Vinzent Steinberg <Vinzent.Steinberg@gmail.com>
|
Looks ready to merge @WarrenWeckesser? |
|
@dhardy, I'm still looking into the issue that @saona-raimundo raised here. I'm checking how extreme values of the parameters might break things. |
|
@WarrenWeckesser are you still working on this? |
|
I've been away from this (and most of my other open source work) for much of last year, but I haven't forgotten about it. I have a project that I need to finish up before I can get back to this. That might take a few weeks. |
|
Closing as stale. Feel free to re-open. |
No description provided.