# 汪群超 Chun-Chao Wang

Dept. of Statistics, National Taipei University, Taiwan

# Lesson 5: Probability Distributions and Their Shapes

### Objective:

1. Learn how to draw the PDF and CDF graphs of the discrete and continuous distributions you have learned, such as the Binomial, Poisson, Normal, $t$, $\chi^2$, and $\beta$ distributions.
2. Choose the plotting range and style that give the best view of a distribution.
3. Get acquainted with how the shape of a distribution changes with its parameter(s).

### Prerequisite:

Draw the PDF of a normal distribution $N(\mu, \sigma^2)$.

Note:

1. In this example, the area over the 95% confidence interval (the central 95% of the probability) is also shaded with color.
2. The well-known plot style “538” (`fivethirtyeight`) is used for your reference.
3. Learn how to use the PDF function (norm.pdf) and the inverse CDF function (norm.ppf).
4. Practice drawing the graph of the CDF with the command norm.cdf.

```python
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

# print(norm.__doc__)  # see basic information
plt.style.use('fivethirtyeight')  # 538 style; set before plotting so it takes effect

mu, sigma = 0, 1
xlim = [mu - 5 * sigma, mu + 5 * sigma]
x = np.linspace(xlim[0], xlim[1], 1000)
y = norm.pdf(x, mu, sigma)

plt.plot(x, y)

# shade the area over the confidence interval
conf = 0.95
# inverse CDF function; pass mu and sigma so it also works for a non-standard normal
ci = norm.ppf([(1 - conf) / 2, (1 + conf) / 2], mu, sigma)
x_ci = np.linspace(ci[0], ci[1], 1000)
y_ci = norm.pdf(x_ci, mu, sigma)

plt.fill_between(x_ci, y_ci, 0, color='g', alpha=0.5)
plt.grid(True, linestyle='--', which='major')
plt.title('Normal({}, {})'.format(mu, sigma))
plt.show()
```
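
For the CDF practice mentioned in the notes, a minimal sketch might look like the following. The dashed lines marking the 2.5% and 97.5% quantiles are an illustrative addition, not part of the original exercise.

```python
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

mu, sigma = 0, 1
x = np.linspace(mu - 5 * sigma, mu + 5 * sigma, 1000)
Y = norm.cdf(x, mu, sigma)  # cumulative probability P(X <= x)

# quantiles that bound the central 95% of the probability (illustrative choice)
conf = 0.95
q = norm.ppf([(1 - conf) / 2, (1 + conf) / 2], mu, sigma)

plt.plot(x, Y)
# vertical dashed lines from 0 up to the CDF value at each quantile
plt.vlines(q, 0, norm.cdf(q, mu, sigma), colors='g', linestyles='--')
plt.grid(True, linestyle='--', which='major')
plt.title('CDF of Normal({}, {})'.format(mu, sigma))
plt.show()
```

The CDF values at the two quantiles differ by `conf`, which is the same area that the PDF plot shades.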



Draw the PDF graphs of a normal distribution with $\mu$ fixed at 0 and $\sigma$ varying from 1 to 5, i.e., the left figure below.

Note:

1. To see all the distribution shapes clearly, the x range must be selected appropriately.
2. If you are not yet familiar with the “for loop”, you may want to use 5 plot commands to show the result first.
3. Next, draw $N(\mu, 1)$ with $\mu$ varying from 1 to 5.
4. Practice using subplot to show both cases, fixing one parameter and varying the other, i.e., the right figure below.
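
One possible sketch of this exercise, placing the two cases side by side in a 1 x 2 subplot. The x ranges (-15 to 15 and -3 to 9) and the legends are assumptions chosen so that every curve is visible; other ranges may give a better view.

```python
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

values = np.arange(1, 6)  # 1, 2, ..., 5
fig, ax = plt.subplots(1, 2, figsize=[10, 4])

# left panel: mu fixed at 0, sigma varies from 1 to 5
x_left = np.linspace(-15, 15, 1000)  # covers 3 sigma of the widest curve
for sigma in values:
    ax[0].plot(x_left, norm.pdf(x_left, 0, sigma),
               label=r'$\sigma$ = {}'.format(sigma))
ax[0].set_title(r'$N(0, \sigma^2)$')
ax[0].legend()

# right panel: sigma fixed at 1, mu varies from 1 to 5
x_right = np.linspace(-3, 9, 1000)  # covers mu - 4 to mu + 4 for every mu
for mu in values:
    ax[1].plot(x_right, norm.pdf(x_right, mu, 1),
               label=r'$\mu$ = {}'.format(mu))
ax[1].set_title(r'$N(\mu, 1)$')
ax[1].legend()

plt.show()
```

In the left panel the curves flatten and widen as $\sigma$ grows; in the right panel the same curve shifts to the right as $\mu$ grows.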

Draw the PDF graphs of the $\chi^2$ distribution with degrees of freedom $\nu$ from 4 to 32 in steps of 2, as shown below.

Note:

1. The scipy PDF function for the $\chi^2$ distribution with degrees of freedom $\nu$ is scipy.stats.chi2.pdf(x, $\nu$).
2. Use plt.pause(0.5) to demonstrate how the shape changes as the degrees of freedom change. It does not work in the Jupyter Notebook environment.

```python
import numpy as np
from scipy.stats import chi2
import matplotlib.pyplot as plt

xlim = [0, 50]
x = np.linspace(xlim[0], xlim[1], 1000)

# degrees of freedom: 4, 6, ..., 32 (np.arange excludes the stop value)
df = np.arange(4, 33, 2)
# fix xlim before animation
plt.figure()
plt.axis([xlim[0], xlim[1], 0, 0.2])
for i in df:
    y = chi2.pdf(x, i)
    plt.plot(x, y, lw=1, color='blue')
    # plt.pause(0.5)

plt.title(r'$\chi^2$ Distribution')
plt.yticks([0, 0.1, 0.2])
plt.show()
```


Draw the PDF graphs of the $t$ distribution with degrees of freedom $\nu$ from 0.1 to 1 in steps of 0.1, and continue with $\nu$ from 3 to 30 in steps of 3, as shown below.

Note:

1. To demonstrate the asymptotic property of approaching the standard normal distribution, the standard normal PDF is also drawn.
2. The degrees of freedom [0.1 0.2 … 0.9 1] + [3 6 … 30] need to be organized in one array (vector) for use in the “for loop”.
3. The scipy PDF function for the $t$ distribution with degrees of freedom $\nu$ is scipy.stats.t.pdf(x, $\nu$).
4. You may want to use the plt.pause(0.5) command to show the asymptotic property.
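
A minimal sketch of this exercise, assuming an x range of -5 to 5 and a red reference curve for the standard normal. Both are illustrative choices rather than part of the original.

```python
import numpy as np
from scipy.stats import t, norm
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 1000)
# degrees of freedom: 0.1, 0.2, ..., 1 followed by 3, 6, ..., 30
df = np.concatenate([np.arange(1, 11) / 10, np.arange(3, 31, 3)])

plt.figure()
plt.axis([x[0], x[-1], 0, 0.45])  # fix the axes before drawing the curves
plt.plot(x, norm.pdf(x), color='red', lw=2, label='N(0, 1)')
for nu in df:
    plt.plot(x, t.pdf(x, nu), lw=1, color='blue', alpha=0.5)
    # plt.pause(0.5)  # uncomment outside Jupyter to animate

plt.title('$t$ Distribution')
plt.legend()
plt.show()
```

As $\nu$ increases, the blue curves rise toward the red standard normal curve; at $\nu = 30$ the two are already hard to tell apart.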

Draw the PDF graphs of the $\beta(a, b)$ distribution with $a = 9$ and $1 \leq b \leq 30$, as shown below.

Note:

1. The scipy PDF function for the $\beta(a, b)$ distribution with parameters $(a, b)$ is scipy.stats.beta.pdf(x, a, b).
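
The exercise can be sketched as follows, assuming integer values of $b$ from 1 to 30 (the original does not state the step size).

```python
import numpy as np
from scipy.stats import beta
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 1000)  # the beta distribution lives on [0, 1]
a = 9
b_values = np.arange(1, 31)  # assumed integer steps: 1, 2, ..., 30

plt.figure()
for b in b_values:
    plt.plot(x, beta.pdf(x, a, b), lw=1, color='blue')

plt.title(r'$\beta(9, b)$ Distribution, $1 \leq b \leq 30$')
plt.show()
```

With $a$ fixed, the peak moves from the right end of the interval toward the left as $b$ grows.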

Arrange the two parameters $a, b$ of the $\beta(a, b)$ distribution to observe the shape of its PDF, especially the direction of skewness.
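
As one way to approach this, the sketch below compares three hypothetical parameter pairs, $(2, 8)$, $(5, 5)$ and $(8, 2)$, and prints the skewness from `beta.stats` in each panel title. The pairs are assumptions picked to show one case each of $a < b$, $a = b$ and $a > b$.

```python
import numpy as np
from scipy.stats import beta
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 1000)
params = [(2, 8), (5, 5), (8, 2)]  # illustrative pairs: a < b, a = b, a > b

fig, ax = plt.subplots(1, 3, sharey=True, figsize=[12, 3.5])
skews = []
for axis, (a, b) in zip(ax, params):
    skew = float(beta.stats(a, b, moments='s'))  # theoretical skewness
    skews.append(skew)
    axis.plot(x, beta.pdf(x, a, b))
    axis.set_title(r'$\beta({}, {})$, skewness = {:.2f}'.format(a, b, skew))

plt.show()
```

When $a < b$ the long tail points to the right (positive skewness), when $a > b$ it points to the left, and when $a = b$ the PDF is symmetric about 0.5.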

Arrange the two parameters $n_1, n_2$ of the $F(n_1, n_2)$ distribution to observe the shape of its PDF, especially the direction of skewness.
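
A possible sketch for the $F$ distribution uses scipy.stats.f.pdf(x, $n_1$, $n_2$) and varies one parameter at a time. The parameter values (2, 5, 10, 30), the fixed values $n_2 = 20$ and $n_1 = 10$, and the x range are all assumptions for illustration.

```python
import numpy as np
from scipy.stats import f
import matplotlib.pyplot as plt

x = np.linspace(0.01, 4, 1000)  # F is defined for x >= 0
n_values = [2, 5, 10, 30]       # illustrative degrees of freedom

fig, ax = plt.subplots(1, 2, sharey=True, figsize=[10, 4])

# left panel: vary n1 with n2 fixed at 20
for n1 in n_values:
    ax[0].plot(x, f.pdf(x, n1, 20), label='$n_1$ = {}'.format(n1))
ax[0].set_title('$F(n_1, 20)$')
ax[0].legend()

# right panel: vary n2 with n1 fixed at 10
for n2 in n_values:
    ax[1].plot(x, f.pdf(x, 10, n2), label='$n_2$ = {}'.format(n2))
ax[1].set_title('$F(10, n_2)$')
ax[1].legend()

plt.show()
```

In these panels every curve has its long tail on the right, and the curves become more concentrated around 1 as the degrees of freedom increase.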

Draw the probability mass function (PMF) of the binomial distribution $B(n, p)$ for $n = 20, p = 0.7$.

Note:

1. The scipy function for the binomial PMF is scipy.stats.binom.pmf.
2. The shape of the binomial distribution can be represented by a stem plot or a bar plot.
3. In matplotlib, a stem plot is drawn with the “stem” function; a step-style alternative is the attribute drawstyle='steps-pre' in the plot function.

```python
import numpy as np
from scipy.stats import binom
import matplotlib.pyplot as plt

n, p = 20, 0.7
x = np.arange(n + 1)  # possible outcomes 0, 1, ..., n
y = binom.pmf(x, n, p)
plt.stem(x, y, linefmt='g-', markerfmt='o', basefmt='C1--')
# plt.plot(x, y, drawstyle='steps-pre')
plt.title('$B(20, 0.7)$')
plt.show()
```


Use three different ways to draw the PMF of the binomial distribution $B(n, p)$ for $n = 20, p = 0.7$. In addition, the CDF is drawn as a step plot.

Note:

1. A 4 x 1 subplot is employed and set to share x ticks and tick labels column-wise (there is only one column).
2. The bar function can set the width, color, edgecolor, etc. of the bars.
3. A step plot is usually used to represent the CDF of a discrete distribution.

```python
import numpy as np
from scipy.stats import binom
import matplotlib.pyplot as plt

n, p = 20, 0.7
x = np.arange(n + 1)
y = binom.pmf(x, n, p)
fig, ax = plt.subplots(4, 1, sharex='col', figsize=[6, 9])
# three ways to draw the PMF: stem, bar, and vertical lines
ax[0].stem(x, y, linefmt='k-', markerfmt='ko', basefmt='C2--')
ax[1].bar(x, y, width=0.9, color='r', edgecolor='y')
ax[2].vlines(x, 0, y, lw=5, alpha=0.5)
# the CDF jumps at each integer and stays flat until the next one,
# so 'steps-post' is used to hold each value to the right
Y = binom.cdf(x, n, p)
ax[3].plot(x, Y, drawstyle='steps-post')
plt.suptitle('The PMF and CDF of Bino({}, {})'.format(n, p))
plt.show()
```