200 people were tested, and 20 of them were infected. I want to get a posterior distribution for the uncertainty associated with the probability that a person is infected.
This is what I did:
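n <- 200                            # people tested
s <- 20                             # people infected
p <- seq(0, 0.3, 0.001)             # grid of candidate infection probabilities
dp <- dbeta(p, s + 1, n - s + 1)    # Beta(s+1, n-s+1) density at each grid point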
But then when I plot it, I don’t know how to interpret the y axis and summary results:
plot(p, dp, type="l")
> summary(dp)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
 0.000000  0.000011  0.032438  3.322259  3.841204 18.820899
So is there a 10% chance of... something being 18.82? What does this summary tell me?
Also, what is the difference between the first plot and the plot below?
plot(density(dp))
This question combines three things: getting a posterior distribution for the probability that a person is infected, interpreting densities, and telling kernel density estimates apart from theoretical densities. I think it would be better placed on stats.stackexchange.com/questions
Hello, so for clarification: your first plot shows the values returned by ‘dbeta(p, s+1, n-s+1)’ against the sequence p. So the x-axis is the sequence p (i.e. the values from 0 to 0.3 in steps of 0.001, 301 points in total). Accordingly, the y-axis shows the values of the vector dp: the density of the beta distribution at each value of p (imagine you have a normal distribution with mean 0 and sd 1 and ask for its density at 0, which is about 0.399). These are density heights, not probabilities, so they can be larger than 1. The second plot is something else: density(dp) computes a kernel density estimate of the numbers stored in dp, so it shows how the density heights themselves are distributed, not the probability density function of your beta distribution. The probability of a variable falling within a particular range of values is the area under the first curve over that range.
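To make the difference between the two plots concrete, here is a minimal sketch, not the only way to do it. It assumes that what the OP actually wants from a kernel density estimate is a smoothed picture of the posterior, which requires random draws from the posterior (via rbeta) rather than the vector dp. The names draws, kde_draws and kde_heights, the seed, and the number of draws are my own choices for illustration.

```r
n <- 200
s <- 20
p <- seq(0, 0.3, 0.001)
dp <- dbeta(p, s + 1, n - s + 1)          # theoretical density heights on the grid

# Assumption: 10000 random draws from the same Beta(s+1, n-s+1) posterior
set.seed(1)
draws <- rbeta(10000, s + 1, n - s + 1)

kde_draws   <- density(draws)             # KDE of the draws: approximates the posterior curve
kde_heights <- density(dp)                # KDE of the heights: what plot(density(dp)) shows

plot(p, dp, type = "l")                   # theoretical posterior density
lines(kde_draws, lty = 2)                 # dashed: KDE of the draws, should sit close to it

# The x-axis of density(dp) is in units of density height, not probability p
range(kde_heights$x)
```

The dashed curve should roughly follow the solid one, whereas the x-axis of density(dp) runs over the values of dp (up to about 18.8), which is why that plot has no direct interpretation in terms of the infection probability.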
This is at least vague in many places and lacks interpretation @Markus_J.
Remember that the area under a probability density curve must equal 1. Almost all of the probability here sits in a narrow range of x values (much less than 1 wide), which is why the y values have to be high.
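A quick numerical check of this, as a rough sketch: summing the density heights times the grid spacing approximates the area under the curve. The variable names step, area, peak_height and p_at_peak are mine, and the Riemann sum is only an approximation that ignores the (tiny) part of the curve beyond 0.3.

```r
n <- 200
s <- 20
step <- 0.001
p <- seq(0, 0.3, step)
dp <- dbeta(p, s + 1, n - s + 1)

peak_height <- max(dp)            # tallest point of the curve, well above 1
p_at_peak   <- p[which.max(dp)]   # where the curve peaks on the grid
area        <- sum(dp) * step     # Riemann-sum approximation of the area under the curve

peak_height
p_at_peak
area
```

So summary(dp) only summarises the heights of the curve at the 301 grid points. Its Max. is the peak height, not a probability, and the area stays close to 1 because the curve is tall but narrow.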
OP, the interpretation of the plot dp versus p is that the area under the curve over an interval (i.e., the integral of the function over the interval) is the probability, i.e. the rational degree of belief, that p falls in that interval. p must be somewhere between 0 and 1, so the integral over the interval (0, 1) must be exactly 1. One can talk about that stuff at greater length, but that’s the most concise summary.
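In R, those areas can be read off directly with pbeta, and checked numerically with integrate. The sketch below is only illustrative: the interval (0.05, 0.15) and the 95% level are arbitrary choices of mine, not something from the question, and the names a, b, prob_interval, check, cred_int and total are made up for the example.

```r
n <- 200
s <- 20
a <- s + 1        # first shape parameter, as in the OP's dbeta call
b <- n - s + 1    # second shape parameter

# Probability that p lies in an interval = area under the density over that interval
# (the interval 0.05 to 0.15 is an arbitrary illustration)
prob_interval <- pbeta(0.15, a, b) - pbeta(0.05, a, b)

# Same area by numerical integration of the density
check <- integrate(dbeta, 0.05, 0.15, shape1 = a, shape2 = b)$value

# Central 95% credible interval for p from the posterior quantiles
cred_int <- qbeta(c(0.025, 0.975), a, b)

# The total area over (0, 1) is 1
total <- pbeta(1, a, b) - pbeta(0, a, b)

prob_interval
check
cred_int
total
```

prob_interval and check should agree to several decimal places, and cred_int gives a range of p values that is usually more useful to report than anything in summary(dp).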