Lab Assignment 4

Estimates

We can use R for some simple programming tasks, like estimating limits. Below we use it to estimate (1-x/n)^n for various values of n, with x=0.5.

x <- 0.5
# the n-th approximation (1 - x/n)^n
myfun <- function(n) (1 - x/n)^n
plot(sapply(1:20, myfun), xlab = "n", ylab = "approx")

We can compare this with the value of \exp(-x).

exp(-x)
[1] 0.6065307

We note that for n \geq 15 the approximation is quite close to this value.
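
To make “quite close” precise, one could also plot the absolute error directly. A minimal sketch, reusing myfun from above (the logarithmic scale is our choice, not part of the lab):

# absolute error of the n-th approximation, on a logarithmic scale
plot(abs(sapply(1:20, myfun) - exp(-x)), log = "y",
     xlab = "n", ylab = "absolute error")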

Next we can compute partial sums of a series to see how rapidly it converges. Here we sum the series

\sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{2^k\cdot (2k+1)\cdot k!}

which is the power series for the integral \int_0^{x} e^{-t^2/2}dt.
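
This follows by integrating the exponential series term by term: since

e^{-t^2/2} = \sum_{k=0}^{\infty} (-1)^k \frac{t^{2k}}{2^k\cdot k!},

integrating each term from 0 to x gives the series above.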

x <- 3
# the k-th term of the alternating series
myetrm <- function(k) {
  t <- x^(2*k + 1)/(2^k*(2*k + 1)*factorial(k))
  if (k %% 2 == 0) t else -t
}
# the partial sum of the terms for k = 0, ..., n
myexp <- function(n) sum(sapply(0:n, myetrm))
plot(sapply(1:20, myexp), ylab = "partial sum")

We see that as soon as we have about 8-9 terms, the approximation is quite good. Also note how the partial sums of the alternating series jump up and down!

Note: Our method of judging how good the approximation is only works for series that converge! To do a better job, we must estimate the error, as shown in class.
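
For this particular series we can check the error exactly, since \int_0^{x} e^{-t^2/2}dt = \sqrt{2\pi}\left(\Phi(x)-\tfrac{1}{2}\right), where \Phi is the standard Normal cumulative distribution function (pnorm in R). A sketch of such a check, reusing myexp from above:

# exact value of the integral via the standard Normal CDF
ref <- sqrt(2*pi)*(pnorm(x) - 0.5)
# absolute error of the first 20 partial sums
plot(abs(sapply(1:20, myexp) - ref), log = "y",
     xlab = "n", ylab = "absolute error")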

Normal vs Binomial

As we did in the previous lab session for the Poisson distribution, we can examine how well the Normal distribution approximates the Binomial distribution.

Let us define a function that compares these two distributions. Note that we are comparing the cumulative distribution functions in this case, instead of the probability mass or density functions, because the Central Limit Theorem is a statement about cumulative distribution functions.

comp <- function(p, N, ks) {
  m <- p*N                  # mean of Binomial(N, p)
  s <- sqrt(p*(1 - p)*N)    # standard deviation of Binomial(N, p)
  # plot the Binomial CDF against k, then overlay the Normal CDF
  plot(ks, pbinom(ks, N, p), xlab = "successes", ylab = "probs", col = "blue")
  points(ks, pnorm(ks, m, s), col = "red")
}

The value p represents the probability of success in a single Bernoulli trial. The Binomial distribution is the one associated with N independent trials of this type. We are comparing the probability of at most k successes for each value k in the vector ks. The blue points show the “real” probabilities coming from the Binomial distribution, and the red points show the Normal approximation.

The approximation is supposed to work well when N is large.

p <- 1/3
N <- 20
ks <- c(0:20)
comp(p,N,ks)

We note that the Binomial distribution appears to be a bit larger than the Normal distribution. However, in applications one is usually interested in the difference between the values at two points (the probability that the result lies in some range).

p <- 1/3
N <- 20
m <- p*N
s <- sqrt(p*(1-p)*N)
pbinom(8,N,p)-pbinom(5,N,p)
[1] 0.5122372
pnorm(8,m,s)-pnorm(5,m,s)
[1] 0.5218577
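
A standard refinement, not used in this lab, is the continuity correction: since the Binomial variable is integer-valued, the event of between 6 and 8 successes is better matched by the Normal probability of the interval from 5.5 to 8.5. A sketch, reusing m and s from above:

# Normal approximation of P(6 <= successes <= 8) with continuity correction
pnorm(8.5, m, s) - pnorm(5.5, m, s)

For moderate N this usually brings the Normal value closer to the Binomial one.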

Assignment

  1. Repeat the first approximation check with smaller and larger values of x.

  2. Repeat the second approximation check with other series, such as the following (a starter sketch for the geometric series appears after this list):

     a. The geometric series \sum_{k=1}^{\infty} x^k for 0<x<1. Note how the convergence slows down as x comes close to 1 and speeds up as x comes close to 0.

     b. The series \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!} for \sin(x).

  3. Compare the Normal and Binomial distributions for various values of p (between 0 and 1!) and check that the convergence is good for large values of N for all those values. See how it improves for larger N and becomes bad for small values of N.
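
As a starting point for item 2(a), the same pattern as myetrm and myexp applies; the names geomtrm and geomsum below are our own, and the choice x <- 0.9 is just an illustration:

x <- 0.9
# the k-th term of the geometric series
geomtrm <- function(k) x^k
# the partial sum of the terms for k = 1, ..., n
geomsum <- function(n) sum(sapply(1:n, geomtrm))
plot(sapply(1:50, geomsum), xlab = "n", ylab = "partial sum")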