汪群超 Chun-Chao Wang

Dept. of Statistics, National Taipei University, Taiwan

Lesson 5: Probability Distributions and Their Shapes

Objective:

  1. Learn how to draw the PDF and CDF graphs of the discrete and continuous distributions you have learned, such as the Binomial, Poisson, Normal, t, \chi^2, and \beta distributions.
  2. Choose the plotting range and style that give the best view of a distribution.
  3. Get acquainted with how the shape of a distribution changes with its parameter(s).

Prerequisite:


Example 1: Normal Distribution

Draw the PDF of a normal distribution N(\mu, \sigma^2).

Note:

  1. In this example, the central 95% interval (the region between the 2.5% and 97.5% quantiles) is also shaded with color.
  2. The well-known “538” (fivethirtyeight) plot style is used here for your reference.
  3. Learn how to use the PDF function (norm.pdf) and the inverse CDF function (norm.ppf).
  4. Practice drawing the graph of the CDF with the command norm.cdf.
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

# print(norm.__doc__) # see basic information
mu, sigma = 0, 1
xlim = [mu - 5 * sigma, mu + 5 * sigma]
x = np.linspace(xlim[0], xlim[1], 1000)
y = norm.pdf(x, mu, sigma)

plt.style.use('fivethirtyeight') # 538 style; set it before plotting so it applies to the whole figure
plt.plot(x, y)

# shade the central 95% interval
conf = 0.95
ci = norm.ppf([(1-conf)/2, (1+conf)/2], mu, sigma) # inverse CDF function, using the same mu and sigma as the PDF
x_ci = np.linspace(ci[0], ci[1], 1000)
y_ci = norm.pdf(x_ci, mu, sigma)

plt.fill_between(x_ci, y_ci, 0, color = 'g', alpha = 0.5)
plt.grid(True, linestyle='--', which='major')
plt.title(r'$N({}, {}^2)$'.format(mu, sigma)) # second argument shown as sigma^2, matching N(mu, sigma^2)
plt.show()
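
Note 4 above suggests drawing the CDF as practice. The following is a minimal sketch of one way to do it, not the only one. It reuses mu and sigma from the example; the dashed vertical lines marking the 2.5% and 97.5% quantiles are an illustrative addition.

```python
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

mu, sigma = 0, 1
x = np.linspace(mu - 5 * sigma, mu + 5 * sigma, 1000)
F = norm.cdf(x, mu, sigma)  # CDF values, rising from about 0 to about 1

# quantiles that bound the central 95% of the probability
conf = 0.95
q_lo, q_hi = norm.ppf([(1 - conf) / 2, (1 + conf) / 2], mu, sigma)

plt.plot(x, F)
# mark each quantile with a vertical line from 0 up to the CDF curve
plt.vlines([q_lo, q_hi], 0, norm.cdf([q_lo, q_hi], mu, sigma),
           colors='g', linestyles='--')
plt.grid(True, linestyle='--', which='major')
plt.title('CDF of Normal({}, {})'.format(mu, sigma))
plt.show()
```

Because norm.ppf is the inverse of norm.cdf, the two dashed lines meet the curve at heights 0.025 and 0.975.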


Exercise: Normal Distribution

Draw the PDF graphs of a normal distribution with \mu fixed at 0 and \sigma varying from 1 to 5, i.e., the left figure below.

Note:

  1. To get a clear look at all the distribution shapes, choose the x range appropriately.
  2. If you are not yet familiar with the “for loop”, you may want to use 5 plot commands to show the result first.
  3. Next, draw N(\mu, 1) with \mu varying from 1 to 5.
  4. Practice using subplot to show both cases side by side, fixing one parameter and varying the other, i.e., the right figure below.
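
The steps in the notes above can be sketched as follows. This is one possible layout, not a reference solution: the 1 x 2 subplot arrangement, the x ranges, and the legends are assumptions chosen so that all five curves in each panel stay visible.

```python
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

params = np.arange(1, 6)  # 1, 2, 3, 4, 5
fig, ax = plt.subplots(1, 2, figsize=[10, 4])

# left panel: mu fixed at 0, sigma = 1..5 (range wide enough for sigma = 5)
x_left = np.linspace(-15, 15, 1000)
for sigma in params:
    ax[0].plot(x_left, norm.pdf(x_left, 0, sigma),
               label=r'$\sigma$ = {}'.format(sigma))
ax[0].set_title(r'$N(0, \sigma^2)$')
ax[0].legend()

# right panel: sigma fixed at 1, mu = 1..5 (the curve only shifts)
x_right = np.linspace(-3, 9, 1000)
for mu in params:
    ax[1].plot(x_right, norm.pdf(x_right, mu, 1),
               label=r'$\mu$ = {}'.format(mu))
ax[1].set_title(r'$N(\mu, 1)$')
ax[1].legend()

plt.show()
```

Changing \sigma flattens and widens the curve, while changing \mu moves it without altering its shape.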

Example 2: \chi^2 Distribution

Draw the PDF graphs of the \chi^2 distribution with degrees of freedom \nu from 4 to 32 in steps of 2, as shown below.

Note:

  1. The scipy PDF function for the \chi^2 distribution with \nu degrees of freedom is scipy.stats.chi2.pdf(x, \nu).
  2. Use plt.pause(0.5) to demonstrate how the shape changes as the degrees of freedom change. This does not work in the Jupyter Notebook environment.
import numpy as np
from scipy.stats import chi2
import matplotlib.pyplot as plt

xlim = [0, 50]
x = np.linspace(xlim[0], xlim[1], 1000)

# degrees of freedom: 4, 6, ..., 32 (the stop value of np.arange is exclusive)
df = np.arange(4, 33, 2)
# fix xlim before animation
plt.figure()
plt.axis([xlim[0], xlim[1], 0, 0.2])
for i in df:
    y=chi2.pdf(x, i)
    plt.plot(x,y, lw=1, color='blue')
    # plt.pause(0.5)

plt.title(r'$\chi^2$ Distribution')
plt.yticks([0, 0.1, 0.2])
plt.show()

Exercise: t Distribution

Draw the PDF graphs of the t distribution with degrees of freedom \nu from 0.1 to 1 in steps of 0.1, and then with \nu from 3 to 30 in steps of 3, as shown below.

Note:

  1. To demonstrate that the t distribution approaches the standard normal distribution as \nu increases, the standard normal PDF is also drawn.
  2. You need to collect the degrees of freedom [0.1 0.2 … 0.9 1] + [3 6 … 30] in a single array (vector) for use in the “for loop”.
  3. The scipy PDF function for the t distribution with \nu degrees of freedom is scipy.stats.t.pdf(x, \nu).
  4. You may want to use the plt.pause(0.5) command to show the asymptotic behavior.
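
A possible sketch of the notes above. It assumes the degrees-of-freedom array is assembled with np.concatenate (other constructions work equally well) and that x runs from -5 to 5; the colors and line widths are arbitrary choices.

```python
import numpy as np
from scipy.stats import t, norm
import matplotlib.pyplot as plt

# degrees of freedom: 0.1, 0.2, ..., 1.0 followed by 3, 6, ..., 30
df = np.concatenate([np.arange(1, 11) / 10, np.arange(3, 31, 3)])

x = np.linspace(-5, 5, 1000)
plt.figure()
for nu in df:
    plt.plot(x, t.pdf(x, nu), lw=1, color='blue')
    # plt.pause(0.5)  # uncomment in a script to watch the shape evolve

# standard normal PDF for comparison
plt.plot(x, norm.pdf(x), lw=2, color='red', label='N(0, 1)')
plt.legend()
plt.title('t Distribution')
plt.show()
```

Building the first part as np.arange(1, 11) / 10 avoids relying on a floating-point stop value in np.arange.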

Exercise: \beta Distribution

Draw the PDF graphs of the \beta(a, b) distribution with a = 9 and 1 \leq b \leq 30, as shown below.

Note:

  1. The scipy PDF function for the \beta(a, b) distribution is scipy.stats.beta.pdf(x, a, b).
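
One way to approach this exercise is sketched below. It assumes b takes the integer values 1, 2, …, 30; the peak_x list is an illustrative addition that records where each curve attains its maximum on the grid.

```python
import numpy as np
from scipy.stats import beta
import matplotlib.pyplot as plt

a = 9
b_values = np.arange(1, 31)  # 1, 2, ..., 30 (assumed integer steps)
x = np.linspace(0, 1, 1000)

peak_x = []  # grid location of the maximum of each curve
plt.figure()
for b in b_values:
    y = beta.pdf(x, a, b)
    peak_x.append(x[np.argmax(y)])
    plt.plot(x, y, lw=1, color='blue')

plt.title(r'$\beta$({}, b), b = 1, ..., 30'.format(a))
plt.show()
```

As b grows, the peak moves toward 0 and the curves become narrower, which peak_x makes easy to confirm numerically.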

Homework: \beta Distribution

Vary the two parameters a, b of the \beta(a, b) distribution and observe how the shape of its PDF changes, especially the direction of skew.
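
As a starting point, the sketch below compares three parameter pairs and reports the skewness from scipy.stats.beta.stats. The pairs (2, 8), (5, 5), and (8, 2) are hypothetical choices for illustration, not values prescribed by the lesson.

```python
import numpy as np
from scipy.stats import beta
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 500)
skews = {}  # (a, b) -> skewness
for a, b in [(2, 8), (5, 5), (8, 2)]:
    # moments='s' asks for the skewness only
    skews[(a, b)] = float(beta.stats(a, b, moments='s'))
    plt.plot(x, beta.pdf(x, a, b),
             label='a={}, b={}, skewness={:.2f}'.format(a, b, skews[(a, b)]))

plt.legend()
plt.title(r'$\beta(a, b)$ with different skew directions')
plt.show()
```

A positive skewness (a < b) means a longer right tail, a negative one (a > b) a longer left tail, and a = b gives a symmetric curve.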


Homework: F Distribution

Vary the two parameters n_1, n_2 of the F(n_1, n_2) distribution and observe how the shape of its PDF changes, especially the direction of skew.
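
A minimal sketch to get started, assuming the scipy function scipy.stats.f.pdf(x, n_1, n_2). The parameter pairs and the x range are hypothetical choices for illustration.

```python
import numpy as np
from scipy.stats import f
import matplotlib.pyplot as plt

x = np.linspace(0.01, 5, 1000)
pairs = [(2, 10), (5, 10), (10, 10), (10, 50), (50, 50)]  # (n_1, n_2)

curves = {}
plt.figure()
for n1, n2 in pairs:
    curves[(n1, n2)] = f.pdf(x, n1, n2)
    plt.plot(x, curves[(n1, n2)], label='F({}, {})'.format(n1, n2))

plt.legend()
plt.title('F Distribution')
plt.show()
```

All of these curves have a longer right tail; larger degrees of freedom concentrate the mass near 1.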


Example 2: Binomial Distribution

Draw the probability mass function (PMF) of the binomial distribution B(n, p) for n = 20, p = 0.7.

Note:

  1. The scipy function for the binomial PMF is scipy.stats.binom.pmf.
  2. The shape of the binomial distribution can be shown with a stem plot or a bar plot.
  3. In matplotlib, a stem plot is drawn with the “stem” function; a stepped outline can be drawn instead by passing drawstyle='steps-pre' to the plot function.
import numpy as np
from scipy.stats import binom
import matplotlib.pyplot as plt

n, p = 20, 0.7
x = np.arange(n + 1)
y = binom.pmf(x, n, p)
plt.stem(x, y, linefmt='g-', markerfmt='o', basefmt = 'C1--')
# plt.plot(x, y, drawstyle = 'steps-pre')
plt.title('$B(20, 0.7)$')
plt.show()
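
The same stem-plot approach carries over to the other discrete distribution named in the objectives, the Poisson. The sketch below assumes a rate of 5 and a plotting range of 0 to 20; both values are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.stats import poisson
import matplotlib.pyplot as plt

lam = 5               # Poisson rate (assumed value)
x = np.arange(0, 21)  # the support is unbounded; 0..20 covers almost all the mass
y = poisson.pmf(x, lam)

plt.stem(x, y, linefmt='g-', markerfmt='o', basefmt='C1--')
plt.title('Poisson({})'.format(lam))
plt.show()
```

Unlike the binomial, the Poisson has no upper bound on x, so the plotting range has to be picked to cover the visible part of the PMF.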

Example 3: Binomial Distribution

Draw the PMF of the binomial distribution B(n, p) for n = 20, p = 0.7 in three different ways. In addition, draw the CDF as a step plot.

Note:

  1. A 4 x 1 subplot is employed and set to share x ticks and ticklabels column-wise (only one column).
  2. The bar function lets you set the width, color, edgecolor, etc., of the bars.
  3. Step plot is usually used to represent CDF of a discrete distribution.
import numpy as np
from scipy.stats import binom
import matplotlib.pyplot as plt

n, p = 20, 0.7
x = np.arange(n + 1)
y = binom.pmf(x, n, p)
fig, ax = plt.subplots(4,1, sharex = 'col', figsize = [6, 9])
ax[0].stem(x, y, linefmt='k-', markerfmt='ko', basefmt = 'C2--')
ax[1].bar(x, y, width = 0.9, color = 'r', edgecolor = 'y' )
ax[2].vlines(x, 0, y, lw = 5, alpha = 0.5)
Y = binom.cdf(x, n, p)
# 'steps-post' holds each value until the next x, matching the right-continuous CDF
ax[3].plot(x, Y, drawstyle = 'steps-post')
plt.suptitle('The PMF and CDF of Bino({}, {})'.format(n, p))
plt.show()
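
The PMF and CDF drawn above are linked: for a discrete distribution, the CDF at x is the running sum of the PMF up to x. The short check below illustrates this for the same n and p; the variable names are chosen for this sketch only.

```python
import numpy as np
from scipy.stats import binom

n, p = 20, 0.7
x = np.arange(n + 1)
pmf = binom.pmf(x, n, p)
cdf = binom.cdf(x, n, p)

# the running sum of the PMF should reproduce the CDF
print(np.allclose(np.cumsum(pmf), cdf))
# total probability over the full support 0..n
print(pmf.sum())
# most probable count, and the mean n * p for comparison
print(x[np.argmax(pmf)], binom.mean(n, p))
```

The same check works for any discrete distribution in scipy.stats, as long as x covers the full support.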

Homework: