Lab Assignment 5
Which coin?
We have a box with three coins with different biases which are known. We pick a coin at random and flip it 100 times. Can we decide on the basis of the results which was the “most likely” coin which was picked?
To answer this question, let for
be the probability of getting Heads depending on the coin which was picked. Let
Heads and
Tails be the result of the experiment. The probability of that particular result with the
-th coin is
Note that the contant in front is the same, the only changing parameter is
. This number
is called the likelihood; it is the probability of obtaining the result which has been obtained if we had picked the
-th coin. The value of
for which
is maximum is called the maximum likelihood estimate (MLE) for which coin was picked. It is that choice of
for which the probability was highest was what actually occurred!
As usual, let us simulate and calculate.
coins <- c(0.3,0.5,0.7)
chosen <- sample(coins,1)
The vector coins
is that of the ’s. The chosen one is randomly picked. We carry out one observation of 100 coin flips:
heads <- rbinom(1,100,chosen)
Now, we calculate the probability of the given result for each of the coins:
lcoins <- sapply(coins,function(p){dbinom(heads,100,prob=p);})
result <- data.frame(coins,lcoins)
result
We see that the largest value is for the second coin, so our maximum likelihood estimate for the chosen coin is the second coin. Let’s check!
chosen
[1] 0.5
We also notice that the values of likelihood are between 0 and 1 (as they are probabilities) and these are often small. So one can also take the “log”-likelihood. Since is an order-preserving function, it is good enough. (Note that we will get negative values of logs, so the maximum is the one which is smallest in magnitude!)
result$llcoins <- sapply(coins,function(p){dbinom(heads,100,prob=p,log=TRUE)})
result
This shows a little more clearly that the second coin is the most likely.
We can also consider a situation where the possible values of vary continuously. In that case, how do we find the maximum likelihood estimate. We can plot:
curve(dbinom(heads,100,p=x),from=0.3,to=0.7,xlab="p",ylab="likelihood")
We see that the maximum occurs somewhere between and
.
We can also use calculus to calculate the value of where
takes a maximum where
is the value of
heads
. This is easily calculated, by solving for
to get
.
heads/100
[1] 0.57
Continuous distributions
The same problem can also be considered for (absolutely) continuous distributions. We can think of a distribution which has a continuous parameter . Now, we make a large number of observations. To find the maximum likelihood estimate for
we can calculate the probability
of the event that already took place for each value of
. The MLE is that
for which
is maximum. However, there is no “probability” for an actual value in this case! So we use the density function in this case. (This can be justified, but we will not do so here!)
The data set faithful
in R
gives the waiting times between eruptions of a water “geyser” called “Old Faithful” in USA.
faithful[1:5,]
We can try to model it using a waiting time distribution with a single parameter . The distribution density is given by
. This makes it easier to calculate the log-likelihood:
loglhood <- function(lambda){sum(dexp(faithful$waiting,rate=lambda,log=TRUE))}
Let us plot this.
x <- seq(0.001,0.1,0.001)
plot(x,sapply(x,loglhood),type='l',xlab="lambda",ylab="log-likelihood")
This seems to indicate that the MLE is somewhere in the range 0.01 to 0.02.
In fact, we can again use calculus to check that the MLE , where
is the average waiting time between eruptions.
1/mean(faithful$waiting)
[1] 0.01410496
We can do a similar analysis of faithful$eruptions
. This measures the quantum of eruption of the geyser.
hist(faithful$eruptions,breaks=15,main="Histogram of Eruptions",xlab="Duration")
This looks like a bimodal distribution with density something like where
. Let us try to fit it to such a distribution. We first define functions to calculate the log-likelihood.
bimod <- function(x,m1,m2){log(dnorm(x,m1,(m2-m1)/6)+dnorm(x,m2,(m2-m1)/6))}
loglik <- function(m1,m2){ sum(sapply(faithful$eruptions,function(x){lbimod(x,m1,m2)}))}
The problem is that this is a function of two variables and so it might appear that we need a 3-dimensional plot. However, since we are trying to locate “peak”, we can make do with a contour plot.
m1range <- seq(1.5,3.0,length=50)
m2range <- seq(4.0,5.0,length=50)
llik <- matrix(NA,50,50)
for(i in 1:50)
for(j in 1:50)
llik[i,j] <- loglik(m1range[i],m2range[j])
contour(m1range,m2range,llik,levels=10*(-30:-10),drawlabels=FALSE,xlab=expression(m[1]),ylab=expression(m[2]))
We see that the most likely estimate for is about 2.05 and for
is around 4.3. In such situations, solving the problem using calculus is possible, but somewhat more complicated than the visual method!
Note: This calculation does not say that the distribution (with
) fits the given data. It merely says that if this is the type of distribution, then
and
are (approximately) the MLE’s for these parameters.
Assignment
- Take 10 different coins with unknown biases and choose one at random.
mycoins <- runif(10,min=0.2,max=0.8)
mychosen <- sample(mycoins,1)
Now flip the coin a certain number of times.
flips <- 107
heads <- rbinom(1,flips,chosen)
Apply the method given above to find maximum likelihood estimate for which coin it is.
Do assignment 1 with a different number of flips like 50 or 200. How good are is the MLE now?
Copy the data file
/usr/local/lib/summer_data_ozone_temp_solar.csv
to your project directory and load it (or load it directly).
dat <- read.csv('/usr/local/lib/summer_data_ozone_temp_solar.csv')
dat[1:5,]
Now, isolate the data for noon time ozone levels.
ozo <- dat$o3_avg[dat$time==12]
summary(ozo)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
6.075 41.470 53.760 52.250 62.330 103.600 7
hist(ozo)
Assuming that this belongs to a normal distribution centred around the mean
, estimate the maximum likelihood estimate for
.
m <- mean(ozo)
ll(s) <- function(s){sum(dnorm(ozo,mean=m,sd=s,log=TRUE))}
Here ll
is the log likelihood function. So make an appropriate plot and “guess” the value.