Generalized linear models and interactions: the effect of "shifting" explanatory variables, as in standardization

2009.4.17
Since 2009.4.11

 

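 In a generalized linear model (GLM) or a general linear model (normal linear model), what happens to the estimated interaction effect when a constant is added to (or subtracted from) the explanatory variables? In this note, adding or subtracting a constant is provisionally called a "shift". Standardization usually means a "shift" plus a change of scale, so that the mean becomes 0 and the variance becomes 1, but to keep things simple no rescaling is done here. If you rescale as well, also account for the change in the regression coefficients that necessarily follows: multiplying a variable by c multiplies its coefficient by 1/c, and multiplying the two variables by c1 and c2 multiplies the interaction coefficient by 1/(c1 c2).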

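The conclusions are:

1. When the model contains only the interaction term: the estimated interaction effect is affected by a "shift" of the explanatory variables.

2. When the model contains the interaction term plus the main effects: the estimated interaction effect is not affected by a "shift" of the explanatory variables, but the main effects are.

The examples below are split according to whether the explanatory variables are quantitative or nominal.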
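Both conclusions follow from expanding the product term after the shift. The following is a short derivation on the linear predictor; the shift constants $a_1, a_2$, the shifted variables $x_1', x_2'$ and the coefficient symbols $\beta$ are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
x_1 &= x_1' + a_1, \qquad x_2 = x_2' + a_2 \\[4pt]
\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2
  &= (\beta_0 + \beta_1 a_1 + \beta_2 a_2 + \beta_{12} a_1 a_2) \\
  &\quad + (\beta_1 + \beta_{12} a_2)\, x_1'
         + (\beta_2 + \beta_{12} a_1)\, x_2'
         + \beta_{12}\, x_1' x_2' \\[4pt]
\beta_0 + \beta_{12} x_1 x_2
  &= (\beta_0 + \beta_{12} a_1 a_2)
     + \beta_{12} a_2\, x_1' + \beta_{12} a_1\, x_2'
     + \beta_{12}\, x_1' x_2'
\end{align*}
```

With main effects in the model, the shifted variables span the same four terms, so the fit is the same model with a new intercept and new main effects, while the coefficient of the product stays $\beta_{12}$. With the interaction term alone, the expansion contains linear terms in $x_1'$ and $x_2'$ that an interaction-only model in the shifted variables cannot represent, so the two fits are different models. Because the argument concerns only the linear predictor, it should not depend on the link function.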


Both explanatory variables quantitative

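 First, the data used in the example. The explanatory variables are x01 and x02, and the response variable is y01. (Everything below was computed in R; some of the output has been abbreviated.) The sample size is 20.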

> x01
[1] 0.7452506 1.6278334 1.3070735 1.0267938 2.7319448 2.3952411 0.1893610
[8] 3.8487420 3.0717137 2.5531784 2.0608768 2.4838042 2.9895419 2.8983844
[15] 0.8401353 1.2294717 2.0002388 1.8784546 2.1792990 1.8727374

> x02
[1] -0.8687907 0.2168528 -2.3573511 -4.0435968 -1.9759199 -1.4642176
[7] -1.0799398 -2.2423776 1.0092726 -1.5163328 -2.2651893 -1.4782773
[13] -2.2331311 -3.2363845 -2.1090825 -3.1086167 -2.9875158 -1.4077831
[19] -0.9521418 -1.1867783
> y01
[1] 3.5731850 10.5554735 8.6906047 4.6799267 4.1895945 2.1648574
[7] 6.0377956 6.4796139 4.4789133 0.9164376 4.8474837 1.2088529
[13] 5.4738635 7.6755598 4.1498920 7.8116349 2.7913180 4.7828944
[19] 4.8287341 0.6910759

x01cnt and x02cnt are x01 and x02 with a constant added so that each has mean 0.

> x01cnt
-1.251253213 -0.368670402  -0.689430345 -0.969710043 0.735441023
0.398737281 -1.807142782 1.852238187 1.075209896 0.556674545 
0.064372959  0.487300412  0.993038044  0.901880621 -1.156368528 
-0.767032082  0.003734955 -0.118049256  0.182795172 -0.123766444 

> x02cnt
0.8955743  1.9812179 -0.5929860 -2.2792317 -0.2115548 
0.3001475  0.6844252  -0.4780125 2.7736377  0.2480323 
-0.5008243  0.2860877 -0.4687661  -1.4720195  -0.3447175 
-1.3442516  -1.2231507  0.3565820  0.8122233  0.5775868
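The centered variables can be reproduced with a single subtraction. The sketch below copies x01 and x02 from the listings above (to the printed seven digits, so results can differ from the author's session in the last digits); the name x01cnt_alt is introduced here only to show the equivalent `scale()` call.

```r
# Data copied from the x01 and x02 listings above (rounded as printed).
x01 <- c(0.7452506, 1.6278334, 1.3070735, 1.0267938, 2.7319448,
         2.3952411, 0.1893610, 3.8487420, 3.0717137, 2.5531784,
         2.0608768, 2.4838042, 2.9895419, 2.8983844, 0.8401353,
         1.2294717, 2.0002388, 1.8784546, 2.1792990, 1.8727374)
x02 <- c(-0.8687907, 0.2168528, -2.3573511, -4.0435968, -1.9759199,
         -1.4642176, -1.0799398, -2.2423776, 1.0092726, -1.5163328,
         -2.2651893, -1.4782773, -2.2331311, -3.2363845, -2.1090825,
         -3.1086167, -2.9875158, -1.4077831, -0.9521418, -1.1867783)

# "Shift" only: subtract the sample mean, leave the scale untouched.
x01cnt <- x01 - mean(x01)
x02cnt <- x02 - mean(x02)

# Equivalent call; scale = FALSE suppresses the division by the
# standard deviation that full standardization would apply.
x01cnt_alt <- as.numeric(scale(x01, center = TRUE, scale = FALSE))

# Both means should be zero up to floating-point error.
print(c(mean(x01cnt), mean(x02cnt)))
```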

Interaction term only

> res12.int<-glm(y01~x01:x02,family=gaussian(link="identity"))
> res12.int

Call: glm(formula = y01 ~ x01:x02, family = gaussian(link = "identity"))

Coefficients:
(Intercept) x01:x02
4.70917 -0.02675

Degrees of Freedom: 19 Total (i.e. Null); 18 Residual
Null Deviance: 128.7
Residual Deviance: 128.6 AIC: 99.97
> summary(res12.int)

Call:
glm(formula = y01 ~ x01:x02, family = gaussian(link = "identity"))

Deviance Residuals:
Min 1Q Median 3Q Max
-4.07755 -1.38440 -0.06866 1.37726 5.85574

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.70917 0.92910 5.069 8e-05 ***
x01:x02 -0.02675 0.20637 -0.130 0.898
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 7.141667)

Null deviance: 128.67 on 19 degrees of freedom
Residual deviance: 128.55 on 18 degrees of freedom
AIC: 99.97

Next, the "shifted" version

> res12c.int<-glm(y01~x01cnt:x02cnt,family=gaussian(link="identity"))
> res12c.int

Call: glm(formula = y01 ~ x01cnt:x02cnt, family = gaussian(link = "identity"))

Coefficients:
(Intercept) x01cnt:x02cnt
4.8287 -0.3633

Degrees of Freedom: 19 Total (i.e. Null); 18 Residual
Null Deviance: 128.7
Residual Deviance: 125.9 AIC: 99.55
> summary(res12c.int)

Call:
glm(formula = y01 ~ x01cnt:x02cnt, family = gaussian(link = "identity"))

Deviance Residuals:
Min 1Q Median 3Q Max
-4.16360 -1.75673 0.03051 0.90212 5.46141

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.8287 0.5929 8.145 1.90e-07 ***
x01cnt:x02cnt -0.3633 0.5737 -0.633 0.535
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 6.992567)

Null deviance: 128.67 on 19 degrees of freedom
Residual deviance: 125.87 on 18 degrees of freedom
AIC: 99.547
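To see that the two interaction-only fits are genuinely different models and not a reparameterization of one another, their deviances can be compared directly. This is a minimal sketch using the data as printed above (rounded to seven digits); the object names fit_raw and fit_cnt are introduced here and are not the author's.

```r
# Data copied from the listings above (rounded as printed).
x01 <- c(0.7452506, 1.6278334, 1.3070735, 1.0267938, 2.7319448,
         2.3952411, 0.1893610, 3.8487420, 3.0717137, 2.5531784,
         2.0608768, 2.4838042, 2.9895419, 2.8983844, 0.8401353,
         1.2294717, 2.0002388, 1.8784546, 2.1792990, 1.8727374)
x02 <- c(-0.8687907, 0.2168528, -2.3573511, -4.0435968, -1.9759199,
         -1.4642176, -1.0799398, -2.2423776, 1.0092726, -1.5163328,
         -2.2651893, -1.4782773, -2.2331311, -3.2363845, -2.1090825,
         -3.1086167, -2.9875158, -1.4077831, -0.9521418, -1.1867783)
y01 <- c(3.5731850, 10.5554735, 8.6906047, 4.6799267, 4.1895945,
         2.1648574, 6.0377956, 6.4796139, 4.4789133, 0.9164376,
         4.8474837, 1.2088529, 5.4738635, 7.6755598, 4.1498920,
         7.8116349, 2.7913180, 4.7828944, 4.8287341, 0.6910759)

x01cnt <- x01 - mean(x01)
x02cnt <- x02 - mean(x02)

# Interaction-only models, before and after the shift.
fit_raw <- glm(y01 ~ x01:x02, family = gaussian(link = "identity"))
fit_cnt <- glm(y01 ~ x01cnt:x02cnt, family = gaussian(link = "identity"))

# The product coefficient and the residual deviance both change,
# so the shift has produced a different model.
print(rbind(raw = coef(fit_raw), centered = coef(fit_cnt)))
print(c(raw = deviance(fit_raw), centered = deviance(fit_cnt)))
```

If the data are entered as above, the printed values should be close to the outputs shown for res12.int and res12c.int.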



Main effects + interaction term

> res12.all<-glm(y01~x01*x02,family=gaussian(link="identity"))
> res12.all

Call: glm(formula = y01 ~ x01 * x02, family = gaussian(link = "identity"))

Coefficients:
(Intercept) x01 x02 x01:x02
6.5931 -0.9736 0.6062 -0.3544

Degrees of Freedom: 19 Total (i.e. Null); 16 Residual
Null Deviance: 128.7
Residual Deviance: 123.6 AIC: 103.2

> summary(res12.all)

Call:
glm(formula = y01 ~ x01 * x02, family = gaussian(link = "identity"))

Deviance Residuals:
Min 1Q Median 3Q Max
-4.14689 -2.03810 0.02302 1.50619 5.54091

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.5931 2.8748 2.293 0.0357 *
x01 -0.9736 1.2660 -0.769 0.4531 ← differs from the shifted fit
x02 0.6062 1.3430 0.451 0.6578 ← differs from the shifted fit
x01:x02 -0.3544 0.6035 -0.587 0.5652
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 7.721903)

Null deviance: 128.67 on 19 degrees of freedom
Residual deviance: 123.55 on 16 degrees of freedom
AIC: 103.18

Next, the "shifted" version

> res12c.all<-glm(y01~x01cnt*x02cnt,family=gaussian(link="identity"))
> res12c.all

Call: glm(formula = y01 ~ x01cnt * x02cnt, family = gaussian(link = "identity"))

Coefficients:
(Intercept) x01cnt x02cnt x01cnt:x02cnt
4.8280 -0.3484 -0.1013 -0.3544

Degrees of Freedom: 19 Total (i.e. Null); 16 Residual
Null Deviance: 128.7
Residual Deviance: 123.6 AIC: 103.2

> summary(res12c.all)

Call:
glm(formula = y01 ~ x01cnt * x02cnt, family = gaussian(link = "identity"))

Deviance Residuals:
Min 1Q Median 3Q Max
-4.14689 -2.03810 0.02302 1.50619 5.54091

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.8280 0.6230 7.749 8.36e-07 ***
x01cnt -0.3484 0.6964 -0.500 0.624 ← differs from the unshifted fit
x02cnt -0.1013 0.5456 -0.186 0.855 ← differs from the unshifted fit
x01cnt:x02cnt -0.3544 0.6035 -0.587 0.565
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 7.721903)

Null deviance: 128.67 on 19 degrees of freedom
Residual deviance: 123.55 on 16 degrees of freedom
AIC: 103.18
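The changed main effects and intercept can be checked by hand against the unshifted estimates. The sample means $\bar{x}_{01} \approx 1.9965$ and $\bar{x}_{02} \approx -1.7644$ used below are recovered from the listings above (for example $x_{01}[1] - x_{01\mathrm{cnt}}[1]$); the small discrepancies in the last digit come from rounding in the printed coefficients.

```latex
\begin{align*}
\hat\beta_{x01\mathrm{cnt}} &= \hat\beta_{x01} + \hat\beta_{x01:x02}\,\bar{x}_{02}
  = -0.9736 + (-0.3544)(-1.7644) \approx -0.3483 \\
\hat\beta_{x02\mathrm{cnt}} &= \hat\beta_{x02} + \hat\beta_{x01:x02}\,\bar{x}_{01}
  = 0.6062 + (-0.3544)(1.9965) \approx -0.1014 \\
\hat\beta_{0,\mathrm{cnt}} &= \hat\beta_0 + \hat\beta_{x01}\,\bar{x}_{01}
  + \hat\beta_{x02}\,\bar{x}_{02} + \hat\beta_{x01:x02}\,\bar{x}_{01}\bar{x}_{02} \\
  &= 6.5931 - 1.9438 - 1.0696 + 1.2484 \approx 4.8281
\end{align*}
```

These agree with the reported $-0.3484$, $-0.1013$ and $4.8280$, while the interaction estimate $-0.3544$, its standard error, the residuals and the deviance are identical in the two fits.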


One explanatory variable quantitative, the other nominal

 For the example data, x01 and y01 are the same as above, and x03 is as follows:
> x03
[1] 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1

x03cnt is x03 shifted by 0.5 so that its mean is 0, that is:
> x03cnt
[1] -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 0.5 0.5 0.5 0.5
[15] 0.5 0.5 0.5 0.5 0.5 0.5
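For reference, a minimal sketch of how these two vectors could be built. It assumes, as the listings suggest, that x03 is stored as a numeric 0/1 dummy and not as an R factor.

```r
# Two groups of 10 observations, coded 0 and 1 as in the listing above.
x03 <- rep(c(0, 1), each = 10)

# The mean of a balanced 0/1 dummy is 0.5, so centering gives -0.5 / +0.5.
x03cnt <- x03 - mean(x03)

print(x03cnt)
```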

Interaction term only

> res13.int<-glm(y01~x01:x03,family=gaussian(link="identity"))
> res13.int

Call: glm(formula = y01 ~ x01:x03, family = gaussian(link = "identity"))

Coefficients:
(Intercept) x01:x03
5.0899 -0.2824

Degrees of Freedom: 19 Total (i.e. Null); 18 Residual
Null Deviance: 128.7
Residual Deviance: 126.7 AIC: 99.68
> summary(res13.int)

Call:
glm(formula = y01 ~ x01:x03, family = gaussian(link = "identity"))

Deviance Residuals:
Min 1Q Median 3Q Max
-4.17346 -1.57096 -0.09325 1.26859 5.46557

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.0899 0.8044 6.327 5.81e-06 ***
x01:x03 -0.2824 0.5318 -0.531 0.602
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 7.038089)

Null deviance: 128.67 on 19 degrees of freedom
Residual deviance: 126.69 on 18 degrees of freedom
AIC: 99.677

Next, the "shifted" (mean-centered) version

> res13c.int<-glm(y01~x01cnt:x03cnt,family=gaussian(link="identity"))
> res13c.int

Call: glm(formula = y01 ~ x01cnt:x03cnt, family = gaussian(link = "identity"))

Coefficients:
(Intercept) x01cnt:x03cnt
4.7816 0.8456

Degrees of Freedom: 19 Total (i.e. Null); 18 Residual
Null Deviance: 128.7
Residual Deviance: 125.8 AIC: 99.54
> summary(res13c.int)

Call:
glm(formula = y01 ~ x01cnt:x03cnt, family = gaussian(link = "identity"))

Deviance Residuals:
Min 1Q Median 3Q Max
-4.038202 -1.801032 0.004257 0.989403 5.618004

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.7816 0.5920 8.077 2.14e-07 ***
x01cnt:x03cnt 0.8456 1.3234 0.639 0.531
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 6.989798)

Null deviance: 128.67 on 19 degrees of freedom
Residual deviance: 125.82 on 18 degrees of freedom
AIC: 99.54

Main effects + interaction term

> res13.all<-glm(y01~x01*x03,family=gaussian(link="identity"))
> res13.all

Call: glm(formula = y01 ~ x01 * x03, family = gaussian(link = "identity"))

Coefficients:
(Intercept) x01 x03 x01:x03
6.1727 -0.5109 -2.0733 0.6708

Degrees of Freedom: 19 Total (i.e. Null); 16 Residual
Null Deviance: 128.7
Residual Deviance: 122.6 AIC: 103
> summary(res13.all)

Call:
glm(formula = y01 ~ x01 * x03, family = gaussian(link = "identity"))

Deviance Residuals:
Min 1Q Median 3Q Max
-3.95190 -1.77565 -0.06104 1.24063 5.21439

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.1727 1.7897 3.449 0.0033 **
x01 -0.5109 0.8006 -0.638 0.5324 ← differs from the shifted fit
x03 -2.0733 3.4575 -0.600 0.5571 ← differs from the shifted fit
x01:x03 0.6708 1.5980 0.420 0.6802
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 7.664388)

Null deviance: 128.67 on 19 degrees of freedom
Residual deviance: 122.63 on 16 degrees of freedom
AIC: 103.03

Next, the "shifted" (mean-centered) version

> res13c.all<-glm(y01~x01cnt*x03cnt,family=gaussian(link="identity"))
> res13c.all

Call: glm(formula = y01 ~ x01cnt * x03cnt, family = gaussian(link = "identity"))

Coefficients:
(Intercept) x01cnt x03cnt x01cnt:x03cnt
4.7857 -0.1755 -0.7341 0.6708

Degrees of Freedom: 19 Total (i.e. Null); 16 Residual
Null Deviance: 128.7
Residual Deviance: 122.6 AIC: 103
> summary(res13c.all)

Call:
glm(formula = y01 ~ x01cnt * x03cnt, family = gaussian(link = "identity"))

Deviance Residuals:
Min 1Q Median 3Q Max
-3.95190 -1.77565 -0.06104 1.24063 5.21439

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.7857 0.6202 7.717 8.82e-07 ***
x01cnt -0.1755 0.7990 -0.220 0.829 ← differs from the unshifted fit
x03cnt -0.7341 1.2404 -0.592 0.562 ← differs from the unshifted fit
x01cnt:x03cnt 0.6708 1.5980 0.420 0.680
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 7.664388)

Null deviance: 128.67 on 19 degrees of freedom
Residual deviance: 122.63 on 16 degrees of freedom
AIC: 103.03
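The relationship between the two full fits can also be checked programmatically. The sketch below refits both models from the data as printed above (rounded to seven digits) and compares them. The object names fit_raw, fit_cnt, b and bc are introduced here for illustration, and the identities tested are the ones obtained by expanding the product (x01cnt + mean(x01)) * (x03cnt + mean(x03)).

```r
# Data copied from the listings above (rounded as printed).
x01 <- c(0.7452506, 1.6278334, 1.3070735, 1.0267938, 2.7319448,
         2.3952411, 0.1893610, 3.8487420, 3.0717137, 2.5531784,
         2.0608768, 2.4838042, 2.9895419, 2.8983844, 0.8401353,
         1.2294717, 2.0002388, 1.8784546, 2.1792990, 1.8727374)
y01 <- c(3.5731850, 10.5554735, 8.6906047, 4.6799267, 4.1895945,
         2.1648574, 6.0377956, 6.4796139, 4.4789133, 0.9164376,
         4.8474837, 1.2088529, 5.4738635, 7.6755598, 4.1498920,
         7.8116349, 2.7913180, 4.7828944, 4.8287341, 0.6910759)
x03 <- rep(c(0, 1), each = 10)

x01cnt <- x01 - mean(x01)
x03cnt <- x03 - mean(x03)

# Full models (main effects + interaction), before and after the shift.
fit_raw <- glm(y01 ~ x01 * x03, family = gaussian(link = "identity"))
fit_cnt <- glm(y01 ~ x01cnt * x03cnt, family = gaussian(link = "identity"))
b  <- coef(fit_raw)
bc <- coef(fit_cnt)

# 1) The interaction coefficient is unchanged by the shift.
print(all.equal(unname(b["x01:x03"]), unname(bc["x01cnt:x03cnt"])))

# 2) Each shifted main effect is the raw one plus the interaction
#    coefficient times the mean of the *other* variable.
print(all.equal(unname(bc["x01cnt"]),
                unname(b["x01"] + b["x01:x03"] * mean(x03))))
print(all.equal(unname(bc["x03cnt"]),
                unname(b["x03"] + b["x01:x03"] * mean(x01))))

# 3) Same model overall: identical deviance.
print(all.equal(deviance(fit_raw), deviance(fit_cnt)))
```

Each comparison should print TRUE. With the printed estimates, for instance, -0.5109 + 0.6708 * 0.5 is about -0.1755, matching the x01cnt coefficient reported above.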