Hausman and Ramsey tests, overfitting, weak instruments and overidentification, model selection and resampling

Latest 09-14

《Main text》

1.Hausman specification test

The test evaluates the consistency of an estimator when compared to an alternative, less efficient estimator which is already known to be consistent. It helps one evaluate if a statistical model corresponds to the data.

Use 1: testing a variable for endogeneity. This test can be used to check for the endogeneity of a variable (by comparing instrumental variable (IV) estimates to ordinary least squares (OLS) estimates).

Use 2: testing the validity of extra instruments. It can also be used to check the validity of extra instruments by comparing IV estimates using a full set of instruments Z to IV estimates that use a proper subset of Z. Note that in order for the test to work in the latter case, we must be certain of the validity of the subset of Z and that subset must have enough instruments to identify the parameters of the equation.

Use 3: choosing between fixed effects and random effects in panel data. The Hausman test can also be used to differentiate between a fixed effects model and a random effects model in panel data. In this case, random effects (RE) is preferred under the null hypothesis due to its higher efficiency, while under the alternative fixed effects (FE) is at least consistent and thus preferred.
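Use 1 can be sketched in a few lines. The example below is a minimal illustration, not a full implementation: it assumes one regressor x, one instrument z, homoskedastic errors and simulated data, and the function name `hausman_ols_vs_iv` is made up for this note. Under the null that x is exogenous, OLS and IV are both consistent and OLS is efficient, so the squared gap between the two slopes, scaled by the difference of their variances, is approximately chi-squared with 1 degree of freedom.

```python
import math
import random

def hausman_ols_vs_iv(y, x, z):
    """Scalar Hausman statistic for one regressor x and one instrument z."""
    n = len(y)
    my, mx, mz = sum(y) / n, sum(x) / n, sum(z) / n
    yd = [v - my for v in y]          # demeaning absorbs the intercept
    xd = [v - mx for v in x]
    zd = [v - mz for v in z]
    sxx = sum(a * a for a in xd)
    szz = sum(a * a for a in zd)
    szx = sum(a * b for a, b in zip(zd, xd))
    b_ols = sum(a * b for a, b in zip(xd, yd)) / sxx
    b_iv = sum(a * b for a, b in zip(zd, yd)) / szx
    # One common error-variance estimate (from the IV residuals) is used for
    # both variances, which keeps var_iv - var_ols non-negative.
    s2 = sum((a - b_iv * b) ** 2 for a, b in zip(yd, xd)) / (n - 2)
    var_ols = s2 / sxx
    var_iv = s2 * szz / szx ** 2
    h = (b_iv - b_ols) ** 2 / (var_iv - var_ols)
    p_value = math.erfc(math.sqrt(h / 2))   # P(chi-squared(1) > h)
    return b_ols, b_iv, h, p_value

# Simulated data (hypothetical): x is endogenous because it shares u with y.
rng = random.Random(0)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + ui + 0.5 * rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1 + 2 * xi + ui for xi, ui in zip(x, u)]

b_ols, b_iv, h, p_value = hausman_ols_vs_iv(y, x, z)
print(f"OLS slope {b_ols:.3f}, IV slope {b_iv:.3f}, H = {h:.1f}, p = {p_value:.4f}")
```

A large H (small p-value) rejects exogeneity of x, in which case the IV estimate is the one to trust. With several regressors the same idea uses the vector of coefficient differences and the difference of the two covariance matrices.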

2.Ramsey RESET test

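Specification error occurs when an independent variable is correlated with the error term. The RESET test looks for it by adding powers of the fitted values to the regression and testing whether they are jointly significant, so it is most sensitive to the first cause listed here (functional form). There are several different causes of specification error: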

Use 1: checking whether an incorrect functional form was used. An incorrect functional form could be employed;

Use 2: checking whether an important variable was omitted. A variable omitted from the model may have a relationship with both the dependent variable and one or more of the independent variables (omitted-variable bias);

Use 3: checking whether an irrelevant variable was included. An irrelevant variable may be included in the model;

Use 4: checking for simultaneity bias. The dependent variable may be part of a system of simultaneous equations (simultaneity bias);

Use 5: checking for measurement error. Measurement errors may affect the independent variables.
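The mechanics of the RESET test can be sketched as follows. This is a simplified illustration under stated assumptions: a single regressor, only the squared fitted value added (the usual test often adds the cube as well), simulated data, and a small hand-written least-squares helper `ols` so that the example needs nothing beyond the standard library. The function names are made up for this note.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):                      # Gauss-Jordan with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def rss(X, y, b):
    """Residual sum of squares of the fit X b."""
    return sum((yi - sum(bj * xj for bj, xj in zip(b, row))) ** 2
               for row, yi in zip(X, y))

def reset_f(x, y):
    """F statistic for adding yhat^2 to a linear regression of y on x."""
    n = len(y)
    X0 = [[1.0, xi] for xi in x]
    b0 = ols(X0, y)
    yhat = [b0[0] + b0[1] * xi for xi in x]
    X1 = [[1.0, xi, yh ** 2] for xi, yh in zip(x, yhat)]
    b1 = ols(X1, y)
    rss0, rss1 = rss(X0, y, b0), rss(X1, y, b1)
    # One restriction tested; three parameters in the augmented model.
    return (rss0 - rss1) / (rss1 / (n - 3))

# Simulated data (hypothetical): one truly linear and one quadratic relationship.
rng = random.Random(1)
n = 200
x = [rng.uniform(0, 4) for _ in range(n)]
e = [rng.gauss(0, 0.5) for _ in range(n)]
y_linear = [1 + 2 * xi + ei for xi, ei in zip(x, e)]
y_quadratic = [1 + 2 * xi + xi ** 2 + ei for xi, ei in zip(x, e)]

f_linear = reset_f(x, y_linear)
f_quadratic = reset_f(x, y_quadratic)
print(f"RESET F, linear truth: {f_linear:.2f}; quadratic truth: {f_quadratic:.2f}")
```

The statistic is compared with an F(1, n-3) distribution. A large value, as in the quadratic case, signals that the linear functional form is inadequate.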

3.Overfitting

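Overfitting has two main causes:

1.The feature distributions of the training set and the test set differ (white swans vs. black swans)

2.Or the model is too complex (it has memorized every exercise) while the sample is too small

In regression, let n be the number of observations and p the number of parameters:

When n>p (and especially when n is much larger than p), least-squares regression has relatively low variance

When n=p (or n is only slightly larger than p), least squares reproduces the training data almost exactly and overfitting is likely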

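Explanatory power of the model: every model involves a trade-off between variance and bias. If the model tracks the training data very closely (very small in-sample error, i.e. low bias), it is likely to have high variance and therefore large errors in out-of-sample prediction. This is the problem we run into with overfitting.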
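The n=p case can be made concrete with a small deterministic sketch. The data below are invented for illustration: the true relationship is y = 2x, and the training noise alternates between +0.5 and -0.5. A polynomial with as many coefficients as observations is written in Lagrange form here; since that polynomial is unique, it is the same fit least squares would produce with p = n.

```python
def fit_line(xs, ys):
    """Least-squares straight line (p = 2 parameters)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)

def fit_interpolant(xs, ys):
    """Degree n-1 polynomial through all n points (p = n parameters)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            w = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    w *= (x - xj) / (xi - xj)   # Lagrange basis weight
            total += yi * w
        return total
    return predict

def mse(model, xs, ys):
    """Mean squared prediction error of model on the points (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Training data: y = 2x plus alternating noise; test data: noise-free midpoints.
train_x = [float(k) for k in range(8)]
train_y = [2 * x + (0.5 if k % 2 == 0 else -0.5) for k, x in enumerate(train_x)]
test_x = [k + 0.5 for k in range(7)]
test_y = [2 * x for x in test_x]

line = fit_line(train_x, train_y)
interp = fit_interpolant(train_x, train_y)
line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
interp_train, interp_test = mse(interp, train_x, train_y), mse(interp, test_x, test_y)

print(f"line:        train MSE {line_train:.3f}, test MSE {line_test:.3f}")
print(f"interpolant: train MSE {interp_train:.3f}, test MSE {interp_test:.3f}")
```

The interpolant has zero training error because it memorizes the noise, yet it predicts the held-out points far worse than the two-parameter line.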

In order to avoid overfitting, it is necessary to use additional techniques (e.g. cross-validation, regularization, early stopping, pruning, Bayesian priors on parameters, model comparison or dropout) that can indicate when further training is not resulting in better generalization. For more on these remedies for overfitting, see: http://dwz.cn/6uAcog (copy the link into your browser).

The basis of some techniques is either (1) to explicitly penalize overly complex models, or (2) to test the model's ability to generalize by evaluating its performance on a set of data not used for training, which is assumed to approximate the typical unseen data that a model will encounter.

4.Weak instruments and overidentification test

4.1.Weak instruments (weak instruments undermine the efficiency, and even the consistency, of the regression)

• If cov(z, x) is small (the instrument is weak), IV loses its desirable asymptotic properties

• IV estimates are not unbiased, and the bias tends to be larger when instruments are weak (even with very large datasets)

• Weak instruments tend to bias the results towards the OLS estimates

• Adding more and more instruments to improve asymptotic efficiency does not solve the problem

Recommendation: always test the strength of your instrument(s) by reporting the F-test on the instruments in the first-stage regression. As a rule of thumb, if the F statistic on the instruments Z in the first-stage regression of the endogenous variable X is greater than 10, the instruments are not considered weak.
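The first-stage F check can be sketched for the simplest case of one endogenous variable and one instrument, where the F statistic equals the squared t statistic on z and can be written in terms of the first-stage R-squared. The data are simulated and the function name `first_stage_f` is made up for this note.

```python
import random

def first_stage_f(x, z):
    """F statistic on a single instrument z in the regression of x on z."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    r2 = sxz ** 2 / (sxx * szz)          # first-stage R-squared
    return r2 / (1 - r2) * (n - 2)       # equals the squared t statistic on z

# Simulated data (hypothetical): the same z is a strong or a weak instrument
# depending on how much of x it explains.
rng = random.Random(2)
n = 500
z = [rng.gauss(0, 1) for _ in range(n)]
v = [rng.gauss(0, 1) for _ in range(n)]
x_strong = [1.0 * zi + vi for zi, vi in zip(z, v)]
x_weak = [0.02 * zi + vi for zi, vi in zip(z, v)]

f_strong = first_stage_f(x_strong, z)
f_weak = first_stage_f(x_weak, z)
for name, f in (("strong", f_strong), ("weak", f_weak)):
    verdict = "passes" if f > 10 else "fails"
    print(f"{name} instrument: first-stage F = {f:.1f} ({verdict} the F > 10 rule of thumb)")
```

With several instruments the same idea applies to the joint F test on all excluded instruments in the first stage.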

4.2.Overidentification test (when there are more instruments than endogenous variables, it tests whether the instruments are exogenous)

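The Sargan test builds a statistic that is approximately chi-squared under the null hypothesis that all instruments are exogenous, with degrees of freedom equal to the number of instruments minus the number of endogenous variables. If the null is violated, 2SLS is biased, the estimated disturbances are biased as well, and the statistic no longer follows a chi-squared distribution. The test only looks at the significance of the statistic under the null: if the chi-squared statistic is large, we reject the null and conclude that at least one instrument is endogenous; otherwise we cannot conclude that the instruments are endogenous (though we cannot be sure that they are exogenous either). Because exogeneity is the null hypothesis, the test cannot confirm that the instruments are exogenous.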
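A minimal sketch of the Sargan statistic, assuming one endogenous variable, two instruments (so 2 - 1 = 1 degree of freedom), homoskedastic errors and simulated data. The statistic is n times the R-squared from regressing the 2SLS residuals on all instruments. The helper `ols` is a small hand-written solver, and every name here is made up for this note.

```python
import math
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):                      # Gauss-Jordan with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def fitted(X, b):
    """Fitted values X b."""
    return [sum(bj * xj for bj, xj in zip(b, row)) for row in X]

def sargan(y, x, z1, z2):
    """Sargan statistic n*R^2 for one endogenous x and two instruments."""
    n = len(y)
    Z = [[1.0, a, b] for a, b in zip(z1, z2)]
    x_hat = fitted(Z, ols(Z, x))                              # first stage
    b = ols([[1.0, xh] for xh in x_hat], y)                   # second stage
    resid = [yi - b[0] - b[1] * xi for yi, xi in zip(y, x)]   # uses the actual x
    r_hat = fitted(Z, ols(Z, resid))            # residuals on all instruments
    m = sum(resid) / n
    tss = sum((r - m) ** 2 for r in resid)
    rss = sum((r - f) ** 2 for r, f in zip(resid, r_hat))
    stat = n * (1 - rss / tss)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2))   # P(chi-squared(1) > stat)
    return stat, p_value

# Simulated data (hypothetical). In y_invalid, z2 enters the outcome equation
# directly, so it is not a valid instrument.
rng = random.Random(3)
n = 1000
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [a + b + ui + rng.gauss(0, 1) for a, b, ui in zip(z1, z2, u)]
y_valid = [1 + 2 * xi + ui for xi, ui in zip(x, u)]
y_invalid = [1 + 2 * xi + ui + b for xi, ui, b in zip(x, u, z2)]

stat_valid, p_valid = sargan(y_valid, x, z1, z2)
stat_invalid, p_invalid = sargan(y_invalid, x, z1, z2)
print(f"valid instruments:   Sargan = {stat_valid:.2f}, p = {p_valid:.3f}")
print(f"invalid instrument:  Sargan = {stat_invalid:.2f}, p = {p_invalid:.3f}")
```

A large statistic rejects joint exogeneity. A small one only means the data do not contradict it.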

5.Criteria for model selection

Akaike information criterion

Bayes factor

Bayesian information criterion

Cross-validation

Deviance information criterion

False discovery rate

Focused information criterion

Likelihood-ratio test

Mallows's Cp

Minimum description length (Algorithmic information theory)

Minimum message length (Algorithmic information theory)

Structural Risk Minimization

Stepwise regression

The most commonly used criteria are (i) the Akaike information criterion and (ii) the Bayes factor and/or the Bayesian information criterion (which to some extent approximates the Bayes factor).
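For a linear model with Gaussian errors, both information criteria can be computed from the residual sum of squares. The sketch below assumes the usual definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, with the log-likelihood evaluated at the maximum-likelihood estimates. The RSS values and parameter counts are invented, and whether k counts the error variance is a convention to fix before comparing models.

```python
import math

def gaussian_aic_bic(rss, n, k):
    """AIC and BIC for a linear model with Gaussian errors.

    rss: residual sum of squares, n: number of observations,
    k: number of estimated parameters (assumed to include the error variance).
    """
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    aic = 2 * k - 2 * log_lik
    bic = k * math.log(n) - 2 * log_lik
    return aic, bic

# Hypothetical fits on the same n = 50 observations: (rss, k) per candidate.
candidates = {"small (k=3)": (12.0, 3), "large (k=6)": (11.5, 6)}
scores = {name: gaussian_aic_bic(rss, 50, k) for name, (rss, k) in candidates.items()}

best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
for name, (aic, bic) in scores.items():
    print(f"{name}: AIC = {aic:.2f}, BIC = {bic:.2f}")
print("preferred by AIC:", best_by_aic, "| preferred by BIC:", best_by_bic)
```

Lower is better for both criteria. The larger model fits slightly better here, but not by enough to pay for its three extra parameters, and BIC penalizes them more heavily than AIC once ln(n) exceeds 2.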

6.Bootstrap, Jackknife and Permutation test

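Bootstrap

In statistics, the bootstrap (bootstrap method, bootstrapping) can refer to any procedure based on uniform sampling with replacement: whenever an observation is drawn, it remains equally likely to be drawn again and added to the resample (the training set, in machine-learning usage) again. The bootstrap gives fairly good estimates of the accuracy of a sample estimate (standard error, confidence interval and bias), and it can be applied to the sampling distribution of essentially any statistic.

The bootstrap comes in two forms, nonparametric and parametric, but the basic idea of both is simulation. The parametric bootstrap assumes that the population distribution, or at least its functional form, is known; the parameters of the distribution are estimated from the sample and new samples are then drawn from the fitted distribution, much like Monte Carlo (MC) simulation. The nonparametric bootstrap resamples from the sample itself rather than from a distribution function.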
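Both forms can be sketched with the standard library. The data are invented, the normal distribution used for the parametric version is an assumption made for illustration, and the function names are made up for this note.

```python
import random
import statistics

def nonparametric_bootstrap(data, stat, n_boot=2000, seed=0):
    """Resample the observed data with replacement and recompute stat each time."""
    rng = random.Random(seed)
    n = len(data)
    return [stat(rng.choices(data, k=n)) for _ in range(n_boot)]

def parametric_bootstrap(data, stat, n_boot=2000, seed=0):
    """Fit a normal distribution (an assumed model), then simulate from the fit."""
    rng = random.Random(seed)
    n = len(data)
    mu, sigma = statistics.mean(data), statistics.stdev(data)
    return [stat([rng.gauss(mu, sigma) for _ in range(n)]) for _ in range(n_boot)]

# Hypothetical sample of 16 measurements.
data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4,
        5.5, 4.9, 6.3, 3.9, 5.0, 4.6, 5.7, 5.2]

np_reps = nonparametric_bootstrap(data, statistics.mean)
p_reps = parametric_bootstrap(data, statistics.mean)
se_nonparam = statistics.stdev(np_reps)     # bootstrap standard error of the mean
se_param = statistics.stdev(p_reps)
median_se = statistics.stdev(nonparametric_bootstrap(data, statistics.median))

# Percentile 95% confidence interval for the mean from the nonparametric replicates.
ordered = sorted(np_reps)
ci_95 = (ordered[int(0.025 * len(ordered))], ordered[int(0.975 * len(ordered)) - 1])

print(f"SE of mean: nonparametric {se_nonparam:.3f}, parametric {se_param:.3f}")
print(f"SE of median (nonparametric): {median_se:.3f}")
print(f"95% percentile interval for the mean: ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
```

For the mean, both standard errors should land close to the textbook value s/sqrt(n). The same functions work unchanged for statistics with no simple formula, such as the median.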

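Jackknife

"Jackknife" literally means a large folding knife. In statistical analysis it is an estimation method: a resampling-based inference procedure that uses the observations from a single sample to construct an unbiased estimator (or one with very small bias) of an unknown parameter. Each time, one observation is removed from the original sample, leaving a new sample of size n-1, called a jackknife sample. There are n such samples, and the estimate computed from each is called a jackknife estimate. The method was proposed by Quenouille in 1956. Because the resulting estimator of the unknown parameter has little or no bias, the method is very useful in fields that demand high precision.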
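The leave-one-out procedure described above can be sketched as follows, using the standard jackknife formulas for the bias correction and the standard error. The data are invented and the function names are made up for this note. The plug-in variance (divisor n) is a convenient test case because its jackknife-corrected version coincides with the usual unbiased sample variance (divisor n-1).

```python
import math
import statistics

def jackknife(data, stat):
    """Leave-one-out jackknife: bias-corrected estimate, bias and standard error."""
    n = len(data)
    theta_hat = stat(data)
    # The n jackknife samples, each with one observation removed.
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    theta_dot = sum(loo) / n
    bias = (n - 1) * (theta_dot - theta_hat)
    corrected = theta_hat - bias
    se = math.sqrt((n - 1) / n * sum((t - theta_dot) ** 2 for t in loo))
    return corrected, bias, se

def plug_in_variance(xs):
    """Variance with divisor n, a biased estimator of the population variance."""
    m = sum(xs) / len(xs)
    return sum((v - m) ** 2 for v in xs) / len(xs)

# Hypothetical sample of 10 measurements.
data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.5, 4.9]

var_corrected, var_bias, _ = jackknife(data, plug_in_variance)
_, _, mean_se = jackknife(data, statistics.mean)

print(f"plug-in variance:        {plug_in_variance(data):.4f}")
print(f"jackknife-corrected:     {var_corrected:.4f}")
print(f"unbiased sample variance: {statistics.variance(data):.4f}")
print(f"jackknife SE of the mean: {mean_se:.4f}")
```

For the sample mean, the jackknife standard error reduces to s/sqrt(n), which is a quick sanity check on the implementation.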

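Efron's 1979 paper spelled out the relationship between the bootstrap and the jackknife. First, the bootstrap builds a "bootstrap world" from the empirical distribution function, turning the ill-posed problem of estimating a probability distribution into resampling from the given sample. Second, the bootstrap can handle non-smooth statistics: the jackknife breaks down when the statistic is not smooth, whereas the bootstrap still works well for the median, for example. Third, expanding the bootstrap estimate in a Taylor series shows that the jackknife is a first-order approximation to the bootstrap. Fourth, for the variance of a linear statistic, the jackknife and the bootstrap give essentially the same result. For the variance of a nonlinear statistic, however, the jackknife depends heavily on how well the statistic is approximated by a linear one, so it is far less effective than the bootstrap.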

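Permutation test (a nonparametric test)

When the sample is not large enough and the distribution of the data is unknown, a permutation test simulates the distribution of the sample mean and compares the observed value against it.

In detail:

Two groups of data: A with sample size n and B with sample size m, so the pooled sample has n+m observations.

Draw n values at random from the n+m pooled observations and compute their sample mean; repeat this i times (i=1000) to obtain the distribution of the sample mean, then compare the mean of sample A with that distribution. This yields a hypothesis test.

Alternatively, randomly assign n of the n+m observations to A and the remaining m to B, compute the difference between the two groups, and repeat this i times to obtain the distribution of the difference; then compare the actual difference with that distribution.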
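The second variant (reshuffling group labels and comparing the difference in means) can be sketched as follows. The two samples are invented, the test is two-sided, and the function name `permutation_test` is made up for this note. Adding 1 to both the count and the number of permutations is a common convention that keeps the estimated p-value above zero.

```python
import random

def permutation_test(a, b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in means between a and b."""
    rng = random.Random(seed)
    n, m = len(a), len(b)
    observed = sum(a) / n - sum(b) / m
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                       # random relabeling of A and B
        diff = sum(pooled[:n]) / n - sum(pooled[n:]) / m
        # Small tolerance so that ties are not lost to floating-point rounding.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value

# Hypothetical measurements for two groups (n = 8, m = 7).
group_a = [12.1, 11.8, 13.0, 12.6, 12.9, 13.4, 12.2, 12.7]
group_b = [10.9, 11.2, 10.4, 11.6, 10.8, 11.1, 11.9]

observed, p_value = permutation_test(group_a, group_b)
print(f"observed difference in means: {observed:.3f}, permutation p-value: {p_value:.4f}")
```

Because the two groups barely overlap, very few relabelings produce a difference as large as the observed one, and the p-value is small.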

Note: simulating data is similar in spirit to a permutation test. It removes confounding factors.

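See also: (a summary of variable endogeneity and instrumental variables)

For instrumental-variable regression, 計量經濟圈 recommends this classic reading: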

https://pan.baidu.com/s/1c1OK37M

《END》
