How to overlay a fitted “Negative Binomial” distribution on a histogram using ggplot2?

Question

I am working with a dataset that I believe follows a “Negative Binomial” distribution. However, when I fit a Negative Binomial distribution to it, the fit turns out to be poor. To explore further, I simulated Negative Binomial data, but even on the simulated data the overlaid distribution does not fit well.

Here is my simulated data:

library(ggplot2)
library(MASS)  
library(fitdistrplus)
# Generating negative binomial random numbers
n <- 1000  # Number of random numbers
size <- 5  # Number of successes
prob <- 0.3  # Probability of success

# Draw the sample and store it in a data frame
negative_binomial <- rnbinom(n, size, prob)
xx <- data.frame(negative_binomial)
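
As a quick sanity check on the simulated sample, its mean and variance can be compared with the theoretical values. Under R's parameterization, rnbinom(n, size, prob) counts the failures before `size` successes, so the mean is size * (1 - prob) / prob (about 11.67 here) and the variance is size * (1 - prob) / prob^2 (about 38.89). The following is only a minimal sketch; the seed and the names theo_mean and theo_var are added for illustration and are not part of the original code.

```r
set.seed(42)  # assumption: seed added only so the check is reproducible

n <- 1000
size <- 5
prob <- 0.3
negative_binomial <- rnbinom(n, size, prob)

# Theoretical moments for the failures-before-size-successes parameterization
theo_mean <- size * (1 - prob) / prob
theo_var  <- size * (1 - prob) / prob^2

# Compare sample moments with the theoretical ones
c(sample_mean = mean(negative_binomial), theoretical_mean = theo_mean)
c(sample_var  = var(negative_binomial),  theoretical_var  = theo_var)
```

If the sample moments land close to the theoretical ones, the simulation itself is unlikely to be the source of the poor fit.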

I want to create a histogram with an overlay of the “Negative Binomial” distribution on this data. Let’s assume that I was given this data, so I had to estimate the parameters of the distribution using fitdistr() from MASS.

# Estimate size and mu by maximum likelihood (MASS::fitdistr)
fit <- fitdistr(negative_binomial, densfun = "negative binomial")

# Density-scaled histogram with the fitted distribution drawn on top
ggplot(data = xx, aes(negative_binomial)) +
  geom_histogram(
    aes(y = after_stat(density)),
    bins = 18, color = "black", fill = "lightblue") +
  stat_function(fun = dnbinom,
    args = list(mu = fit$estimate[["mu"]], size = fit$estimate[["size"]]),
    color = "red", linewidth = 1)
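
To rule out a parameter mix-up, the estimates can be looked up by name rather than by position. For the negative binomial, fitdistr() returns a named vector with entries "size" and "mu", and the implied success probability is size / (size + mu). A minimal sketch, with the seed and the names size_hat, mu_hat and prob_hat added for illustration only:

```r
library(MASS)

set.seed(42)  # assumption: seed added only for reproducibility
negative_binomial <- rnbinom(1000, size = 5, prob = 0.3)

fit <- fitdistr(negative_binomial, densfun = "negative binomial")

# Look the estimates up by name instead of relying on their order
size_hat <- fit$estimate[["size"]]
mu_hat   <- fit$estimate[["mu"]]

# Convert the (size, mu) parameterization back to (size, prob)
prob_hat <- size_hat / (size_hat + mu_hat)

c(size = size_hat, mu = mu_hat, prob = prob_hat)
```

If these values come out near the simulation settings (size = 5, prob = 0.3), the estimation step is probably not what makes the overlay look wrong.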

Question: Given that the simulated data are Negative Binomial by construction, why does the overlaid distribution fit the data so poorly? What did I do wrong?
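
One possible explanation, offered as a sketch rather than a definitive answer: dnbinom() is a probability mass function defined only on non-negative integers, whereas stat_function() evaluates its function on an evenly spaced grid of mostly non-integer x values, where dnbinom() returns 0 with a warning. With bins = 18 the bars also do not line up one-to-one with the integer counts. The code below assumes ggplot2 3.4.0 or later (for after_stat() and linewidth), uses one bar per integer, and evaluates the fitted pmf only at integers. The seed and the data frame pmf are introduced here for illustration.

```r
library(ggplot2)
library(MASS)

set.seed(42)  # assumption: seed added only for reproducibility
xx <- data.frame(negative_binomial = rnbinom(1000, size = 5, prob = 0.3))

fit <- fitdistr(xx$negative_binomial, densfun = "negative binomial")

# Evaluate the fitted pmf only at the integers covered by the data
pmf <- data.frame(x = 0:max(xx$negative_binomial))
pmf$p <- dnbinom(pmf$x,
                 size = fit$estimate[["size"]],
                 mu   = fit$estimate[["mu"]])

# binwidth = 1 centred on the integers makes each bar's density equal to
# the observed proportion of that count, which is comparable to the pmf
p <- ggplot(xx, aes(negative_binomial)) +
  geom_histogram(aes(y = after_stat(density)),
                 binwidth = 1, center = 0,
                 color = "black", fill = "lightblue") +
  geom_line(data = pmf, aes(x = x, y = p), color = "red", linewidth = 1) +
  geom_point(data = pmf, aes(x = x, y = p), color = "red")

print(p)
```

Drawing the pmf from a precomputed data frame keeps control over exactly which x values are evaluated, which stat_function() does not guarantee by default.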
