Random Variables - Binomial, Hypergeometric, and Poisson

Code Examples

Web-r code to accompany the examples in the Stat20 Random Variables notes.

Bernoulli\((p)\) and Binomial\((n,p)\)

  1. dbinom computes the pmf of \(X\), \(f(k) = P(X = k)\), for \(k = 0, 1, \ldots, n\).
  • Arguments:
    • x: the value of \(k\) in \(f(k)\)
    • size: the parameter \(n\), the number of trials
    • prob: the parameter \(p\), the probability of success
  2. pbinom computes the cdf \(F(x) = P(X \le x)\).
  • Arguments:
    • q: the value of \(x\) in \(F(x)\)
    • size: the parameter \(n\), the number of trials
    • prob: the parameter \(p\), the probability of success
  3. rbinom generates a sample (random numbers) from the Binomial\((n,p)\) distribution.
  • Arguments:
    • n: the sample size
    • size: the parameter \(n\), the number of trials
    • prob: the parameter \(p\), the probability of success

Example

Suppose we consider \(n = 3\), \(p= 0.5\), that is, \(X\) is the number of successes in 3 independent Bernoulli trials.

  1. What is the probability that we see exactly 1 success, \(f(1)\)?
  2. What is the probability that we see at most 1 success, \(F(1) = f(0) + f(1)\)?

Let’s double check that this is true: \(F(1) = f(0) + f(1)\)
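In R, these computations look like the following (all three functions come from base R's stats package):

```r
# P(X = 1) for X ~ Binomial(3, 0.5): the pmf at k = 1
dbinom(x = 1, size = 3, prob = 0.5)    # 0.375

# P(X <= 1): the cdf at 1
pbinom(q = 1, size = 3, prob = 0.5)    # 0.5

# Double check that F(1) = f(0) + f(1)
dbinom(0, size = 3, prob = 0.5) + dbinom(1, size = 3, prob = 0.5)  # 0.5
```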

  3. Generate a sample of size 5, where each element in the sample represents the number of successes in 3 trials (like the number of heads in 3 tosses).
  4. Generate a sample of size 10 to simulate 10 tosses of a single fair coin.
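A sketch using rbinom; the draws are random, so your output will vary from run to run:

```r
# Five draws, each the number of successes in 3 trials with p = 0.5
rbinom(n = 5, size = 3, prob = 0.5)

# Ten tosses of a single fair coin: size = 1 makes each draw a Bernoulli(0.5) trial
rbinom(n = 10, size = 1, prob = 0.5)
```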
Now suppose we have a Binomial distribution that can be described as \(X \sim Bin(10, 0.4)\). Compute the following:

  • \(P(X = 5)\)

We can calculate \(P(X = 5)\) in one of two ways:

Using dbinom to calculate the pmf \(f(5) = P(X = 5)\) directly:
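In R:

```r
# pmf of X ~ Binomial(10, 0.4) at k = 5
dbinom(x = 5, size = 10, prob = 0.4)   # about 0.2007
```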

Using pbinom to calculate the cdf for \(X \le 5\) and subtracting the cdf for \(X \le 4\), since \(P(X = 5) = F(5) - F(4)\):
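In R:

```r
# F(5) - F(4) = P(X <= 5) - P(X <= 4) = P(X = 5)
pbinom(5, size = 10, prob = 0.4) - pbinom(4, size = 10, prob = 0.4)  # about 0.2007
```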

  • \(P(X \le 5)\)

This can also be calculated in two ways.

We can compute this by adding up the pmfs of \(X = 0\), \(X = 1\), \(X = 2\), \(X = 3\), \(X = 4\), \(X = 5\)
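In code:

```r
# P(X <= 5) as a sum of pmfs: f(0) + f(1) + ... + f(5)
sum(dbinom(0:5, size = 10, prob = 0.4))  # about 0.8338
```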

Or we can calculate the cdf:
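In code:

```r
# P(X <= 5) directly from the cdf
pbinom(5, size = 10, prob = 0.4)  # about 0.8338
```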

  • \(P(3 \le X \le 8)\)

This can be calculated by adding up the pmfs of \(X = 3\), \(X = 4\), \(X = 5\), \(X = 6\), \(X = 7\), and \(X = 8\).
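In code:

```r
# P(3 <= X <= 8) as a sum of pmfs
sum(dbinom(3:8, size = 10, prob = 0.4))  # about 0.8310
```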

Or by calculating the cdf of \(X \le 8\) and subtracting the cdf of \(X \le 2\), since \(P(3 \le X \le 8) = F(8) - F(2)\).
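In code:

```r
# P(3 <= X <= 8) = F(8) - F(2); subtract the cdf at 2, not 3,
# so that X = 3 stays included
pbinom(8, size = 10, prob = 0.4) - pbinom(2, size = 10, prob = 0.4)  # about 0.8310
```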

Exercise:

Suppose our process is \(X \sim Bin(20, 0.2)\). Fill in the blanks below to compute the following:

  • \(P(X = 4)\)

Hint: Your answer should come out to be 0.2181994

  • \(P(X \le 4)\)

Hint: Your answer should come out to be 0.6296483

  • \(P(7 \le X \le 10)\)

Hint: Your answer should come out to be 0.0861291. Use the help file to check the formulas used, and be careful with the less-than-or-equal-to signs.

Hypergeometric \((N, G, n)\)

The notation is a bit confusing, but just remember that x is the number \(k\) whose probability you want, and \(m + n = N\) is the total number of successes and failures, that is, the population size. Note that in these functions the argument n is the number of failures in the population, not the number of draws.

  1. dhyper computes the pmf of \(X\), \(f(k) = P(X = k)\), for \(k = 0, 1, \ldots, n\).
  • Arguments:
    • x: the value of \(k\) in \(f(k)\)
    • m: the parameter \(G\), the number of successes in the population
    • n: the value \(N-G\), the number of failures in the population
    • k: the sample size (number of draws \(n\), note that \(0 \le k \le m+n\))
  2. phyper computes the cdf \(F(x) = P(X \le x)\).
  • Arguments:
    • q: the value of \(x\) in \(F(x)\)
    • m: the parameter \(G\), the number of successes in the population
    • n: the value \(N-G\), the number of failures in the population
    • k: the sample size (number of draws \(n\))
  3. rhyper generates a sample (random numbers) from the Hypergeometric\((N, G, n)\) distribution.
  • Arguments:
    • nn: the number of random numbers desired
    • m: the parameter \(G\), the number of successes in the population
    • n: the value \(N-G\), the number of failures in the population
    • k: the sample size (number of draws \(n\))

Example

Suppose we consider \(N = 10\), \(G = 6\), \(n = 3\); that is, \(X\) is the number of successes in 3 draws without replacement from a box that has 6 tickets marked \(\fbox{1}\) and 4 tickets marked \(\fbox{0}\).

  1. The probability that we see exactly 1 success, \(f(1)\):

Try computing this by hand as well to check.
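In R, with the arguments named as above (m = G = 6 successes, n = N − G = 4 failures, k = 3 draws), along with the by-hand check:

```r
# pmf of X ~ Hypergeometric(N = 10, G = 6, n = 3) at k = 1
dhyper(x = 1, m = 6, n = 4, k = 3)           # 0.3

# By hand: (ways to pick 1 success and 2 failures) / (ways to pick 3 of 10)
choose(6, 1) * choose(4, 2) / choose(10, 3)  # 0.3
```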

  2. The probability that we see at most 1 success, \(F(1) = f(0) + f(1)\):

Using the cdf:
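In R:

```r
# P(X <= 1) from the cdf
phyper(q = 1, m = 6, n = 4, k = 3)  # about 0.3333
```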

Adding up the pmfs:
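Equivalently, summing the pmf values:

```r
# P(X <= 1) as f(0) + f(1)
dhyper(0, m = 6, n = 4, k = 3) + dhyper(1, m = 6, n = 4, k = 3)  # about 0.3333
```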

  3. Generate a sample of size 5, where each element in the sample represents the number of successes in 3 draws.
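A sketch with rhyper; your draws will vary from run to run:

```r
# Five draws; each is the number of successes in 3 draws without replacement
rhyper(nn = 5, m = 6, n = 4, k = 3)
```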

Poisson(\(\lambda\))

  1. dpois computes the pmf of \(X\), \(f(k) = P(X = k)\), for \(k = 0, 1, 2, \ldots\).
  • Arguments:
    • x: the value of \(k\) in \(f(k)\)
    • lambda: the parameter \(\lambda\)
  2. ppois computes the cdf \(F(x) = P(X \le x)\).
  • Arguments:
    • q: the value of \(x\) in \(F(x)\)
    • lambda: the parameter \(\lambda\)
  3. rpois generates a sample (random numbers) from the Poisson(\(\lambda\)) distribution.
  • Arguments:
    • n: the desired sample size
    • lambda: the parameter \(\lambda\)

Example

Suppose we consider \(\lambda = 1\); that is, \(X \sim\) Poisson\((1)\).

  1. What is the probability that we see exactly 1 event, \(f(1)\)?
  2. Let's check that \(f(1) = e^{-\lambda}\lambda = e^{-1} \cdot 1\).
  3. What is the probability that we see at most 1 event, \(F(1) = f(0) + f(1)\)?
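In R, the first two computations look like:

```r
# P(X = 1) for X ~ Poisson(1)
dpois(x = 1, lambda = 1)   # about 0.3679

# Check against the formula f(1) = exp(-lambda) * lambda
exp(-1) * 1                # about 0.3679
```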

Using the cdf:
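In R:

```r
# P(X <= 1) from the cdf
ppois(q = 1, lambda = 1)   # about 0.7358
```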

Adding up the pmfs
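Equivalently:

```r
# P(X <= 1) as f(0) + f(1)
dpois(0, lambda = 1) + dpois(1, lambda = 1)  # about 0.7358
```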

  4. Generate a sample of size 5, where each element in the sample represents a random count from the Poisson(1) distribution.
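A sketch with rpois; your counts will vary from run to run:

```r
# Five random counts from the Poisson(1) distribution
rpois(n = 5, lambda = 1)
```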