Chapter 14 Week11_2: Lavaan Lab 11 Model Local Fitting and Model Modifications
In this lab, we will learn:
- how to examine SEM local fit using residuals
- how to modify SEM models for improved fit using modification indices
Load up the lavaan library:
library(lavaan)
14.1 PART I: Local Fit with Residuals
Let’s read in the new dataset ChiStatSimDat.csv:
cfaData <- read.csv("ChiStatSimDat.csv", header = TRUE)
Write out syntax for a two-factor CFA model:
fixedIndTwoFacSyntax <- "
# Factor specification
posAffect =~ glad + happy + cheerful
satisfaction =~ satisfied + content + comfortable
"
Fit the two-factor model:
fixedIndTwoFacRun <- lavaan::sem(model = fixedIndTwoFacSyntax,
                                 data = cfaData,
                                 fixed.x = FALSE)
14.1.1 Unstandardized residuals
resid(fixedIndTwoFacRun)$cov
## glad happy cherfl satsfd contnt cmfrtb
## glad 0.000
## happy 0.017 0.000
## cheerful -0.028 -0.010 0.000
## satisfied -0.001 0.011 0.038 0.000
## content -0.030 -0.114 0.113 0.017 0.000
## comfortable 0.078 0.063 0.344 -0.090 0.012 0.000
What does this mean? What is the metric?
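As a reference point for these questions: an unstandardized (raw) covariance residual is the observed sample covariance minus the covariance implied by the fitted model, so it is expressed in the covariance metric of the two indicators involved. The notation below is ours rather than the lab's, with $s_{ij}$ the sample covariance and $\hat{\sigma}_{ij}$ the model-implied covariance of indicators $i$ and $j$:

```latex
r_{ij} = s_{ij} - \hat{\sigma}_{ij}
```

A positive residual means the model under-predicts the observed covariance. Because the size of a raw residual depends on the scales of the indicators, it is hard to judge how large is "large", which is why the standardized and normalized versions are usually easier to interpret.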
14.1.2 Standardized residuals
resid(fixedIndTwoFacRun, type = "standardized")$cov
## glad happy cherfl satsfd contnt cmfrtb
## glad 0.000
## happy 3.950 0.000
## cheerful -3.732 -0.764 0.000
## satisfied -0.035 0.431 1.189 0.000
## content -2.164 -5.979 3.164 3.410 0.000
## comfortable 2.079 1.383 7.821 -4.755 0.762 0.000
14.1.3 Normalized residuals
resid(fixedIndTwoFacRun, type = "normalized")$cov
## glad happy cherfl satsfd contnt cmfrtb
## glad 0.000
## happy 0.117 0.000
## cheerful -0.254 -0.081 0.000
## satisfied -0.004 0.073 0.337 0.000
## content -0.214 -0.738 0.903 0.105 0.000
## comfortable 0.686 0.502 3.187 -0.751 0.087 0.000
- Different residuals, same story
- The covariance residual between cheerful and comfortable is the largest and positive
- The model under-predicts this covariance
- Fix!
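One way to scan a standardized residual matrix is to flag the entries whose absolute value exceeds a z-type cutoff. The sketch below re-enters, by hand, the standardized residuals printed in 14.1.2 and uses 1.96 as the cutoff. The cutoff and the object names (z, flagged) are assumptions for illustration and are not part of the lab.

```r
# Standardized residuals copied from the 14.1.2 output (lower triangle,
# entered column by column)
vars <- c("glad", "happy", "cheerful", "satisfied", "content", "comfortable")
z <- matrix(0, 6, 6, dimnames = list(vars, vars))
z[lower.tri(z)] <- c( 3.950, -3.732, -0.035, -2.164,  2.079,  # with glad
                     -0.764,  0.431, -5.979,  1.383,          # with happy
                      1.189,  3.164,  7.821,                  # with cheerful
                      3.410, -4.755,                          # with satisfied
                      0.762)                                  # with content

# Flag pairs whose |z| exceeds the assumed cutoff of 1.96
idx <- which(abs(z) > 1.96 & lower.tri(z), arr.ind = TRUE)
flagged <- data.frame(var1 = vars[idx[, "row"]],
                      var2 = vars[idx[, "col"]],
                      z    = z[idx])

# Largest absolute residuals first
flagged <- flagged[order(-abs(flagged$z)), ]
flagged
```

With these values, 9 of the 15 pairs are flagged, and the comfortable/cheerful pair (7.821) comes out on top, which agrees with the reading given in the bullets above.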
14.2 PART II: Modification Indices
modindices(fixedIndTwoFacRun)
Filter the output to show only rows with a modification index of 10 or higher:
modindices(fixedIndTwoFacRun, minimum.value = 10)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 17 posAffect =~ content 17.031 -0.718 -0.801 -0.587 -0.587
## 18 posAffect =~ comfortable 14.926 0.500 0.558 0.487 0.487
## 20 satisfaction =~ happy 10.239 -0.390 -0.450 -0.338 -0.338
## 21 satisfaction =~ cheerful 20.960 0.456 0.526 0.501 0.501
## 22 glad ~~ happy 20.960 0.256 0.256 1.245 1.245
## 23 glad ~~ cheerful 10.239 -0.106 -0.106 -0.443 -0.443
## 29 happy ~~ content 18.588 -0.132 -0.132 -0.525 -0.525
## 33 cheerful ~~ comfortable 45.215 0.253 0.253 0.521 0.521
## 34 satisfied ~~ content 14.926 0.303 0.303 1.151 1.151
## 35 satisfied ~~ comfortable 17.030 -0.181 -0.181 -0.414 -0.414
Sort the output by modification index, with the largest values first:
modindices(fixedIndTwoFacRun, minimum.value = 10, sort = TRUE)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 33 cheerful ~~ comfortable 45.215 0.253 0.253 0.521 0.521
## 21 satisfaction =~ cheerful 20.960 0.456 0.526 0.501 0.501
## 22 glad ~~ happy 20.960 0.256 0.256 1.245 1.245
## 29 happy ~~ content 18.588 -0.132 -0.132 -0.525 -0.525
## 17 posAffect =~ content 17.031 -0.718 -0.801 -0.587 -0.587
## 35 satisfied ~~ comfortable 17.030 -0.181 -0.181 -0.414 -0.414
## 18 posAffect =~ comfortable 14.926 0.500 0.558 0.487 0.487
## 34 satisfied ~~ content 14.926 0.303 0.303 1.151 1.151
## 20 satisfaction =~ happy 10.239 -0.390 -0.450 -0.338 -0.338
## 23 glad ~~ cheerful 10.239 -0.106 -0.106 -0.443 -0.443
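- op ~~ : a residual covariance (a correlation between two unique factors)
- op =~ : a cross-loading
- The parameters lavaan suggests freeing here are a mix of cross-loadings and residual covariances. The largest modification index (45.215) belongs to the residual covariance cheerful ~~ comfortable.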
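A modification index approximates how much the model chi-square would drop if that single parameter were freed, so it can be read against a chi-square distribution with 1 degree of freedom. The sketch below uses three MI values copied from the output above. The 0.05 level and the object names are assumptions for illustration.

```r
# Modification indices copied from the modindices() output
mi <- c("cheerful ~~ comfortable"  = 45.215,
        "satisfaction =~ cheerful" = 20.960,
        "glad ~~ cheerful"         = 10.239)

# Critical value of a 1-df chi-square at the (assumed) 0.05 level
crit <- qchisq(0.95, df = 1)

# Approximate p-value for each expected chi-square drop
p <- pchisq(mi, df = 1, lower.tail = FALSE)

data.frame(mi = mi, exceeds_crit = mi > crit, p = signif(p, 3))
```

The critical value is about 3.84, so the cutoff of 10 used in this lab is a stricter screen. A large MI is only a statistical signal: whether the parameter should be freed still depends on whether it makes substantive sense.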
14.2.1 Modified Model 1:
mod1 <- "
posAffect =~ glad + happy + cheerful
satisfaction =~ satisfied + content + comfortable
# residual covariance
cheerful ~~ comfortable
"
mod1_fit <- lavaan::sem(mod1, data = cfaData,
                        std.lv = TRUE, fixed.x = FALSE)
summary(mod1_fit, fit.measures = T, standardized = T)
## lavaan 0.6-12 ended normally after 29 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 14
##
## Number of observations 200
##
## Model Test User Model:
##
## Test statistic 60.048
## Degrees of freedom 7
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 1168.055
## Degrees of freedom 15
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.954
## Tucker-Lewis Index (TLI) 0.901
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1386.429
## Loglikelihood unrestricted model (H1) -1356.405
##
## Akaike (AIC) 2800.859
## Bayesian (BIC) 2847.035
## Sample-size adjusted Bayesian (BIC) 2802.682
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.195
## 90 Percent confidence interval - lower 0.151
## 90 Percent confidence interval - upper 0.241
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.058
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv
## posAffect =~
## glad 1.111 0.064 17.475 0.000 1.111
## happy 1.221 0.073 16.802 0.000 1.221
## cheerful 0.801 0.060 13.392 0.000 0.801
## satisfaction =~
## satisfied 1.161 0.071 16.389 0.000 1.161
## content 1.262 0.075 16.835 0.000 1.262
## comfortable 0.750 0.069 10.821 0.000 0.750
## Std.all
##
## 0.940
## 0.919
## 0.790
##
## 0.910
## 0.925
## 0.679
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv
## .cheerful ~~
## .comfortable 0.270 0.043 6.230 0.000 0.270
## posAffect ~~
## satisfaction 0.876 0.022 39.298 0.000 0.876
## Std.all
##
## 0.536
##
## 0.876
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv
## .glad 0.163 0.031 5.200 0.000 0.163
## .happy 0.274 0.043 6.406 0.000 0.274
## .cheerful 0.386 0.043 8.990 0.000 0.386
## .satisfied 0.281 0.046 6.106 0.000 0.281
## .content 0.271 0.050 5.365 0.000 0.271
## .comfortable 0.657 0.070 9.380 0.000 0.657
## posAffect 1.000 1.000
## satisfaction 1.000 1.000
## Std.all
## 0.116
## 0.155
## 0.375
## 0.172
## 0.145
## 0.539
## 1.000
## 1.000
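The global fit indices in this output can be recomputed from the two chi-square tests and the sample size. The sketch below uses the usual textbook formulas for RMSEA, CFI, and TLI with the values printed above. Using N rather than N - 1 in the RMSEA denominator is an assumption that happens to reproduce this output.

```r
# Values copied from the summary(mod1_fit) output
chisq   <- 60.048;   df   <- 7    # user model
chisq_b <- 1168.055; df_b <- 15   # baseline model
n       <- 200                    # number of observations

# RMSEA: misfit per degree of freedom, per observation
rmsea <- sqrt(max((chisq - df) / (df * n), 0))

# CFI: proportional reduction in noncentrality relative to the baseline
cfi <- 1 - max(chisq - df, 0) / max(chisq_b - df_b, chisq - df, 0)

# TLI: same idea, based on the chi-square/df ratios
tli <- ((chisq_b / df_b) - (chisq / df)) / ((chisq_b / df_b) - 1)

round(c(rmsea = rmsea, cfi = cfi, tli = tli), 3)
```

These reproduce the reported RMSEA = 0.195, CFI = 0.954, and TLI = 0.901. Note that the RMSEA remains high even though CFI and SRMR look acceptable, so Modified Model 1 still leaves room for improvement.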
Model comparison:
anova(mod1_fit, fixedIndTwoFacRun)
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff
## mod1_fit 7 2800.9 2847.0 60.048
## fixedIndTwoFacRun 8 2852.4 2895.3 113.638 53.59 1
## Pr(>Chisq)
## mod1_fit
## fixedIndTwoFacRun 2.47e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
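The chi-square difference test reported by anova() can be checked by hand: subtract the chi-squares and the degrees of freedom of the two nested models and refer the difference to a chi-square distribution. The sketch below uses the values from the table above; the object names are ours.

```r
# Chi-square and df copied from the anova() output
chisq_restricted <- 113.638; df_restricted <- 8   # fixedIndTwoFacRun
chisq_freed      <- 60.048;  df_freed      <- 7   # mod1_fit

chisq_diff <- chisq_restricted - chisq_freed      # 53.59
df_diff    <- df_restricted - df_freed            # 1

# Upper-tail probability of the difference under a chi-square(df_diff)
p_value <- pchisq(chisq_diff, df = df_diff, lower.tail = FALSE)
c(chisq_diff = chisq_diff, df_diff = df_diff, p_value = p_value)
```

The actual drop (53.59) is somewhat larger than the modification index for cheerful ~~ comfortable (45.215). That is expected, because the modification index is only an approximation of the change in chi-square.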
Keep modifying mod1:
modindices(mod1_fit, minimum.value = 10, sort = TRUE)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 22 satisfaction =~ cheerful 27.353 0.612 0.612 0.604 0.604
## 23 glad ~~ happy 27.353 0.275 0.275 1.304 1.304
## 34 satisfied ~~ content 23.194 0.369 0.369 1.340 1.340
## 19 posAffect =~ comfortable 23.194 0.711 0.711 0.644 0.644
## 30 happy ~~ content 22.974 -0.157 -0.157 -0.576 -0.576
## 33 cheerful ~~ content 16.477 0.113 0.113 0.348 0.348
## 24 glad ~~ cheerful 11.873 -0.098 -0.098 -0.392 -0.392
## 21 satisfaction =~ happy 11.872 -0.506 -0.506 -0.381 -0.381
## 35 satisfied ~~ comfortable 11.662 -0.127 -0.127 -0.297 -0.297
## 18 posAffect =~ content 11.661 -0.695 -0.695 -0.509 -0.509
14.2.2 Modified Model 2_1:
mod2_1 <- "
# cross loading
posAffect =~ glad + happy + cheerful
satisfaction =~ cheerful + satisfied + content + comfortable # cheerful also loads on satisfaction
# residual covariance
cheerful ~~ comfortable
"
mod2_1_fit <- lavaan::sem(mod2_1, data = cfaData,
                          std.lv = TRUE, fixed.x = FALSE)
summary(mod2_1_fit, fit.measures = T, standardized = T)
## lavaan 0.6-12 ended normally after 27 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 15
##
## Number of observations 200
##
## Model Test User Model:
##
## Test statistic 34.265
## Degrees of freedom 6
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 1168.055
## Degrees of freedom 15
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.975
## Tucker-Lewis Index (TLI) 0.939
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1373.538
## Loglikelihood unrestricted model (H1) -1356.405
##
## Akaike (AIC) 2777.076
## Bayesian (BIC) 2826.551
## Sample-size adjusted Bayesian (BIC) 2779.029
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.153
## 90 Percent confidence interval - lower 0.106
## 90 Percent confidence interval - upper 0.205
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.031
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv
## posAffect =~
## glad 1.128 0.063 17.786 0.000 1.128
## happy 1.223 0.073 16.752 0.000 1.223
## cheerful 0.389 0.084 4.624 0.000 0.389
## satisfaction =~
## cheerful 0.483 0.089 5.400 0.000 0.483
## satisfied 1.151 0.071 16.168 0.000 1.151
## content 1.284 0.074 17.339 0.000 1.284
## comfortable 0.815 0.072 11.361 0.000 0.815
## Std.all
##
## 0.954
## 0.921
## 0.374
##
## 0.465
## 0.902
## 0.941
## 0.712
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv
## .cheerful ~~
## .comfortable 0.262 0.043 6.055 0.000 0.262
## posAffect ~~
## satisfaction 0.835 0.027 30.624 0.000 0.835
## Std.all
##
## 0.529
##
## 0.835
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv
## .glad 0.126 0.035 3.618 0.000 0.126
## .happy 0.269 0.047 5.747 0.000 0.269
## .cheerful 0.380 0.041 9.270 0.000 0.380
## .satisfied 0.303 0.047 6.458 0.000 0.303
## .content 0.215 0.049 4.409 0.000 0.215
## .comfortable 0.646 0.070 9.287 0.000 0.646
## posAffect 1.000 1.000
## satisfaction 1.000 1.000
## Std.all
## 0.090
## 0.152
## 0.353
## 0.186
## 0.115
## 0.493
## 1.000
## 1.000
Model comparison:
anova(mod2_1_fit, mod1_fit)
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## mod2_1_fit 6 2777.1 2826.6 34.265
## mod1_fit 7 2800.9 2847.0 60.048 25.783 1 3.821e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
14.2.3 Modified Model 2_2:
mod2_2 <- "
posAffect =~ glad + happy + cheerful
satisfaction =~ satisfied + content + comfortable
# residual covariances
cheerful ~~ comfortable
glad ~~ happy
"
mod2_2_fit <- lavaan::sem(mod2_2, data = cfaData,
                          std.lv = TRUE, fixed.x = FALSE)
summary(mod2_2_fit, fit.measures = T, standardized = T)
## lavaan 0.6-12 ended normally after 31 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 15
##
## Number of observations 200
##
## Model Test User Model:
##
## Test statistic 34.265
## Degrees of freedom 6
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 1168.055
## Degrees of freedom 15
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.975
## Tucker-Lewis Index (TLI) 0.939
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1373.538
## Loglikelihood unrestricted model (H1) -1356.405
##
## Akaike (AIC) 2777.076
## Bayesian (BIC) 2826.551
## Sample-size adjusted Bayesian (BIC) 2779.029
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.153
## 90 Percent confidence interval - lower 0.106
## 90 Percent confidence interval - upper 0.205
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.031
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv
## posAffect =~
## glad 1.020 0.069 14.784 0.000 1.020
## happy 1.107 0.079 13.947 0.000 1.107
## cheerful 0.875 0.061 14.323 0.000 0.875
## satisfaction =~
## satisfied 1.151 0.071 16.168 0.000 1.151
## content 1.284 0.074 17.339 0.000 1.284
## comfortable 0.815 0.072 11.361 0.000 0.815
## Std.all
##
## 0.863
## 0.833
## 0.843
##
## 0.902
## 0.941
## 0.712
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv
## .cheerful ~~
## .comfortable 0.262 0.043 6.055 0.000 0.262
## .glad ~~
## .happy 0.250 0.055 4.563 0.000 0.250
## posAffect ~~
## satisfaction 0.923 0.020 46.285 0.000 0.923
## Std.all
##
## 0.584
##
## 0.569
##
## 0.923
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv
## .glad 0.356 0.053 6.679 0.000 0.356
## .happy 0.540 0.074 7.249 0.000 0.540
## .cheerful 0.312 0.043 7.197 0.000 0.312
## .satisfied 0.303 0.047 6.458 0.000 0.303
## .content 0.215 0.049 4.409 0.000 0.215
## .comfortable 0.646 0.070 9.287 0.000 0.646
## posAffect 1.000 1.000
## satisfaction 1.000 1.000
## Std.all
## 0.255
## 0.306
## 0.290
## 0.186
## 0.115
## 0.493
## 1.000
## 1.000
Model comparison:
anova(mod2_2_fit, mod1_fit)
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## mod2_2_fit 6 2777.1 2826.6 34.265
## mod1_fit 7 2800.9 2847.0 60.048 25.783 1 3.821e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
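Note that Modified Models 2_1 and 2_2 report identical fit (chi-square = 34.265 on 6 degrees of freedom), so fit alone cannot tell them apart and the choice between them has to rest on substantive grounds. Here we keep modifying 2_2: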
modindices(mod2_2_fit, minimum.value = 10, sort = TRUE)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 30 happy ~~ content 14.77 -0.117 -0.117 -0.344 -0.344
14.2.4 Modified Model 3:
mod3 <- "
posAffect =~ glad + happy + cheerful
satisfaction =~ satisfied + content + comfortable
# residual covariances
cheerful ~~ comfortable
glad ~~ happy
content ~~ happy
"
mod3_fit <- lavaan::sem(mod3, data = cfaData, std.lv = TRUE, fixed.x = FALSE)
summary(mod3_fit, fit.measures = T, standardized = T)
## lavaan 0.6-12 ended normally after 33 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 16
##
## Number of observations 200
##
## Model Test User Model:
##
## Test statistic 16.311
## Degrees of freedom 5
## P-value (Chi-square) 0.006
##
## Model Test Baseline Model:
##
## Test statistic 1168.055
## Degrees of freedom 15
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.990
## Tucker-Lewis Index (TLI) 0.971
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1364.561
## Loglikelihood unrestricted model (H1) -1356.405
##
## Akaike (AIC) 2761.122
## Bayesian (BIC) 2813.895
## Sample-size adjusted Bayesian (BIC) 2763.205
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.106
## 90 Percent confidence interval - lower 0.052
## 90 Percent confidence interval - upper 0.166
## P-value RMSEA <= 0.05 0.046
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.025
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv
## posAffect =~
## glad 1.017 0.069 14.710 0.000 1.017
## happy 1.152 0.077 14.869 0.000 1.152
## cheerful 0.871 0.061 14.298 0.000 0.871
## satisfaction =~
## satisfied 1.141 0.071 16.002 0.000 1.141
## content 1.293 0.074 17.585 0.000 1.293
## comfortable 0.814 0.071 11.427 0.000 0.814
## Std.all
##
## 0.860
## 0.873
## 0.838
##
## 0.895
## 0.948
## 0.711
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv
## .cheerful ~~
## .comfortable 0.266 0.042 6.285 0.000 0.266
## .glad ~~
## .happy 0.187 0.055 3.384 0.001 0.187
## .happy ~~
## .content -0.134 0.031 -4.358 0.000 -0.134
## posAffect ~~
## satisfaction 0.928 0.020 47.200 0.000 0.928
## Std.all
##
## 0.583
##
## 0.484
##
## -0.484
##
## 0.928
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv
## .glad 0.364 0.054 6.787 0.000 0.364
## .happy 0.412 0.073 5.661 0.000 0.412
## .cheerful 0.321 0.042 7.645 0.000 0.321
## .satisfied 0.325 0.046 7.060 0.000 0.325
## .content 0.187 0.048 3.901 0.000 0.187
## .comfortable 0.647 0.068 9.474 0.000 0.647
## posAffect 1.000 1.000
## satisfaction 1.000 1.000
## Std.all
## 0.260
## 0.237
## 0.298
## 0.200
## 0.101
## 0.494
## 1.000
## 1.000
Model comparison:
anova(mod3_fit, mod2_2_fit)
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## mod3_fit 5 2761.1 2813.9 16.311
## mod2_2_fit 6 2777.1 2826.6 34.265 17.954 1 2.263e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
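The AIC and BIC columns in these comparison tables can be recomputed from each model's log-likelihood and number of free parameters. The sketch below copies those values from the three summary() outputs above and applies the standard definitions. The data frame and its column names are ours.

```r
# Log-likelihoods (H0) and parameter counts copied from the summaries
fits <- data.frame(
  model  = c("mod1", "mod2_2", "mod3"),
  logLik = c(-1386.429, -1373.538, -1364.561),
  npar   = c(14, 15, 16)
)
n <- 200  # number of observations

# AIC penalizes each parameter by 2, BIC by log(n)
fits$AIC <- -2 * fits$logLik + 2 * fits$npar
fits$BIC <- -2 * fits$logLik + log(n) * fits$npar

fits[order(fits$AIC), ]
```

Up to rounding of the printed log-likelihoods, these match the reported values, and both criteria are smallest for mod3.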
Keep modifying mod3:
modindices(mod3_fit, minimum.value = 10, sort = TRUE)
## [1] lhs op rhs mi epc sepc.lv sepc.all
## [8] sepc.nox
## <0 rows> (or 0-length row.names)
No remaining modification is expected to decrease the model chi-square by 10 or more, so we stop here.