The difference between a confidence interval and a prediction interval

One is an interval for a future observation, and the other is an interval for the mean response. I will give a more detailed answer to explain the difference, where it comes from, and why it makes prediction intervals wider than confidence intervals.

This example might illustrate the difference between confidence and prediction intervals: suppose we have a regression model that predicts the price of houses based on number of bedrooms, size, etc. There are two kinds of predictions we can make for a given $x_0$:

  1. We can predict the price of a specific new house that comes on the market with characteristics $x_0$ ("what is the predicted price for this house $x_0$?"). Its true price will be $$y = x_0^T\beta+\epsilon$$ Since $E(\epsilon)=0$, the predicted price will be $$\hat{y} = x_0^T\hat{\beta}$$ In assessing the variance of this prediction, we need to include our uncertainty about $\hat{\beta}$ as well as the variance of $\epsilon$, the random error of the new observation. This is typically called a prediction of a future value.

  2. We can also predict the average price of a house with characteristics $x_0$ ("what would be the average price for a house with characteristics $x_0$?"). The point estimate is still $$\hat{y} = x_0^T\hat{\beta}$$ but now only the variance of $\hat{\beta}$ needs to be accounted for. This is typically called prediction of the mean response.

Most of the time, what we really want is the first case. We know that $$\mathrm{var}(x_0^T\hat{\beta}) = x_0^T(X^TX)^{-1}x_0\,\sigma^2$$
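For readers who want the missing step, here is a sketch of where this variance comes from. It assumes the standard OLS setup, which the answer does not spell out: $y = X\beta + \epsilon$ with $\mathrm{var}(\epsilon) = \sigma^2 I$, a fixed design matrix $X$, and $\hat{\beta} = (X^TX)^{-1}X^Ty$.

$$\mathrm{var}(\hat{\beta}) = (X^TX)^{-1}X^T\,\mathrm{var}(y)\,X(X^TX)^{-1} = \sigma^2(X^TX)^{-1}$$

$$\mathrm{var}(x_0^T\hat{\beta}) = x_0^T\,\mathrm{var}(\hat{\beta})\,x_0 = x_0^T(X^TX)^{-1}x_0\,\sigma^2$$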

This is the variance of the estimated mean response (case 2). For a prediction of a future observation (case 1), recall that we need the variance of $x_0^T\hat{\beta} + \epsilon$; $\epsilon$ has variance $\sigma^2$ and is assumed to be independent of $\hat{\beta}$, so the two variances add, giving $\sigma^2\left(x_0^T(X^TX)^{-1}x_0 + 1\right)$. Replacing $\sigma$ with its estimate $\hat{\sigma}$ results in the following intervals:

  1. Prediction interval for a single future response at $x_0$: $$\hat{y}_0\pm t_{n-p}^{(\alpha/2)}\hat{\sigma}\sqrt{x_0^T(X^TX)^{-1}x_0 + 1}$$

  2. Confidence interval for the mean response at $x_0$: $$\hat{y}_0\pm t_{n-p}^{(\alpha/2)}\hat{\sigma}\sqrt{x_0^T(X^TX)^{-1}x_0}$$

Here $t_{n-p}^{(\alpha/2)}$ is the upper $\alpha/2$ critical value of the t distribution with $n-p$ degrees of freedom, where $n$ is the number of observations and $p$ the number of estimated coefficients.
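The two formulas can be tried out with a minimal sketch for simple linear regression (one predictor plus an intercept), using only the Python standard library. Everything here is an assumption made for illustration: the data are made up, the function name `ols_intervals` is hypothetical, and the critical value 2.306 is the tabulated 97.5% point of the t distribution with 8 degrees of freedom, hard-coded to avoid depending on a statistics package. With a single predictor, $x_0^T(X^TX)^{-1}x_0$ reduces to $1/n + (x_0-\bar{x})^2/S_{xx}$.

```python
import math


def ols_intervals(x, y, x0, t_crit):
    """Fit y = b0 + b1*x by least squares and return
    (y_hat0, confidence_interval, prediction_interval) at x0."""
    n = len(x)
    p = 2  # estimated coefficients: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Residual standard error, the estimate of sigma.
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - p))

    # x0^T (X^T X)^{-1} x0 for the one-predictor case.
    h0 = 1.0 / n + (x0 - x_bar) ** 2 / sxx

    y_hat0 = b0 + b1 * x0
    ci_half = t_crit * sigma_hat * math.sqrt(h0)        # mean response
    pi_half = t_crit * sigma_hat * math.sqrt(h0 + 1.0)  # single future response
    return (
        y_hat0,
        (y_hat0 - ci_half, y_hat0 + ci_half),
        (y_hat0 - pi_half, y_hat0 + pi_half),
    )


# Made-up data: house size (square metres) and price (thousands).
sizes = [50, 60, 70, 80, 90, 100, 110, 120, 130, 140]
prices = [152, 178, 215, 238, 262, 305, 331, 356, 395, 418]

T_CRIT_975_DF8 = 2.306  # t table value for a 95% interval, n - p = 8

y_hat0, conf_int, pred_int = ols_intervals(sizes, prices, 95, T_CRIT_975_DF8)
print(f"point estimate:      {y_hat0:.1f}")
print(f"confidence interval: ({conf_int[0]:.1f}, {conf_int[1]:.1f})")
print(f"prediction interval: ({pred_int[0]:.1f}, {pred_int[1]:.1f})")
```

Both intervals share the same centre $\hat{y}_0$; only the term under the square root differs, so the prediction interval always contains the confidence interval.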

Hopefully this makes it clearer why the prediction interval is always wider: it carries the extra $+1$ under the square root, which accounts for the variance of $\epsilon$. This example was adapted from Faraway, Linear Models with R, Sec. 4.1.

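Confidence interval

The range in which the mean response is likely to fall, given specified settings of the predictors.

Prediction interval

The range in which a single observation is likely to fall, given specified settings of the predictors.

Confidence interval and prediction interval: an example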

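For example, you build a regression model for the number of calls a call center receives each day. This number varies considerably with factors such as the day of the week, the month of the year, and market conditions, and with economic factors among other variables. You are confident that the model fits the data well, so you conclude that it is acceptable to use it to predict the daily number of calls in order to schedule a matching number of customer service agents.

For each day's forecast, you specify values for all the predictors and set the confidence level to 95%. The result is a 95% prediction interval of [230, 270]: you can be 95% confident that this range includes the value of the new observation. You further determine that the 95% confidence interval for the prediction is [240, 260]: you can be 95% confident that this range includes the mean response for all days that match these predictor values.

The prediction interval (PI) is always wider than the corresponding confidence interval (CI), because predicting a single response involves more uncertainty than predicting the mean response.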
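The numbers in the call-center example can be checked with a few lines of arithmetic. The sketch below takes the two quoted ranges, [230, 270] for a single new day and [240, 260] for the mean over matching days, and compares their centres and half-widths. The last step is an assumption the example itself does not state: if both ranges came from the OLS formulas given earlier, the ratio of half-widths would equal $\sqrt{(h_0+1)/h_0}$ with $h_0 = x_0^T(X^TX)^{-1}x_0$, which lets us back out $h_0$. The helper names are hypothetical.

```python
# Ranges quoted in the call-center example (calls per day).
new_observation_range = (230, 270)  # covers a single new day
mean_response_range = (240, 260)    # covers the mean over matching days


def center(interval):
    return (interval[0] + interval[1]) / 2


def half_width(interval):
    return (interval[1] - interval[0]) / 2


# Both ranges are centred on the same point estimate.
print(center(new_observation_range), center(mean_response_range))  # 250.0 250.0

# The range for a single observation is twice as wide as the one for the mean.
ratio = half_width(new_observation_range) / half_width(mean_response_range)
print(ratio)  # 2.0

# Assumption: both come from the OLS formulas, so ratio**2 = (h0 + 1) / h0.
implied_h0 = 1 / (ratio ** 2 - 1)
print(round(implied_h0, 4))  # 0.3333
```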

From a forum thread on the same question (bbs.pinggu.org, 2013):

Confidence interval estimate: using the estimated regression equation, an interval estimate of the mean value of the dependent variable y for a given value x0 of the independent variable x.
Prediction interval estimate: using the estimated regression equation, an interval estimate of an individual value of the dependent variable y for a given value x0 of the independent variable x.
