Module 11 assignment

 When I ran the different model formulas, I noticed that they all use the same factors, but depending on how I write the model, R builds the design matrix differently. So a*b gives main effects and the interaction. a:b only gives the interaction cells. a + b only gives main effects. Same variables, but different structure.

The main thing I saw is that some of these versions actually run fine, and some end up with singularities. For example, lm(z ~ a*b) worked normally, but lm(z ~ a:b) gave me an NA because the interaction only model wasn’t full rank with the intercept. So it isn’t the data that causes NAs, it’s literally how the model is written.

The implication is just that the way you write the formula matters. It controls the math in the background, and that decides whether R can estimate all the coefficients or not.

> a <- gl(2,2,8)   # same thing as g1(2,2,8)
> b <- gl(2,4,8)   # same thing as g1(2,4,8)
> x <- 1:8
> y <- c(1:4,8:5)
> set.seed(1)
> z <- rnorm(8)
> > model.matrix(~ a*b)
  (Intercept) a2 b2 a2:b2
1           1  0  0     0
2           1  0  0     0
3           1  1  0     0
4           1  1  0     0
5           1  0  1     0
6           1  0  1     0
7           1  1  1     1
8           1  1  1     1
attr(,"assign")
[1] 0 1 2 3
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.treatment"

attr(,"contrasts")$b
[1] "contr.treatment"

> model.matrix(~ a:b)
  (Intercept) a1:b1 a2:b1 a1:b2 a2:b2
1           1     1     0     0     0
2           1     1     0     0     0
3           1     0     1     0     0
4           1     0     1     0     0
5           1     0     0     1     0
6           1     0     0     1     0
7           1     0     0     0     1
8           1     0     0     0     1
attr(,"assign")
[1] 0 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.treatment"

attr(,"contrasts")$b
[1] "contr.treatment"

> model.matrix(~ a + b)
  (Intercept) a2 b2
1           1  0  0
2           1  0  0
3           1  1  0
4           1  1  0
5           1  0  1
6           1  0  1
7           1  1  1
8           1  1  1
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.treatment"

attr(,"contrasts")$b
[1] "contr.treatment"

> model.matrix(~ a*b - 1)  # no intercept
  a1 a2 b2 a2:b2
1  1  0  0     0
2  1  0  0     0
3  0  1  0     0
4  0  1  0     0
5  1  0  1     0
6  1  0  1     0
7  0  1  1     1
8  0  1  1     1
attr(,"assign")
[1] 1 1 2 3
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.treatment"

attr(,"contrasts")$b
[1] "contr.treatment"

> > lm(z ~ a*b)

Call:
lm(formula = z ~ a * b)

Coefficients:
(Intercept)           a2           b2  
   -0.22141      0.60123     -0.02408  
      a2:b2  
    0.25713  

> lm(z ~ a:b)

Call:
lm(formula = z ~ a:b)

Coefficients:
(Intercept)        a1:b1        a2:b1  
     0.6129      -0.8343      -0.2331  
      a1:b2        a2:b2  
    -0.8584           NA  
Call:
lm(formula = vas ~ subject + period + treat, data = ash.long)

Residuals:
   Min     1Q Median     3Q    Max 
-48.94 -18.44   0.00  18.44  48.94 

Coefficients: (1 not defined because of singularities)
            Estimate Std. Error t value
(Intercept)  -113.06      27.39  -4.128
subject2       51.50      37.58   1.370
subject3      121.50      37.58   3.233
subject4       97.00      37.58   2.581
subject5      125.00      37.58   3.326
subject6       31.50      37.58   0.838
subject7      119.50      37.58   3.180
subject8      132.00      37.58   3.513
subject9       80.50      37.58   2.142
subject10     116.00      37.58   3.087
subject11     121.50      37.58   3.233
subject12     154.50      37.58   4.111
subject13     131.00      37.58   3.486
subject14     125.00      37.58   3.326
subject15      99.00      37.58   2.634
subject16      80.50      37.58   2.142
period2           NA         NA      NA
treat         -42.87      13.29  -3.227
            Pr(>|t|)    
(Intercept) 0.000895 ***
subject2    0.190721    
subject3    0.005573 ** 
subject4    0.020867 *  
subject5    0.004604 ** 
subject6    0.415070    
subject7    0.006215 ** 
subject8    0.003142 ** 
subject9    0.049003 *  
subject10   0.007518 ** 
subject11   0.005573 ** 
subject12   0.000925 ***
subject13   0.003318 ** 
subject14   0.004604 ** 
subject15   0.018768 *  
subject16   0.049003 *  
period2           NA    
treat       0.005644 ** 
---
Signif. codes:  
  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1
  ‘ ’ 1

Residual standard error: 37.58 on 15 degrees of freedom
Multiple R-squared:  0.7566,	Adjusted R-squared:  0.4969 
F-statistic: 2.914 on 16 and 15 DF,  p-value: 0.02229

> 

Comments

Popular posts from this blog

Module 10

basic graph

Module 3 assignemnt