Chapter 14 Lavaan Lab 11: Model Local Fitting and Model Modifications
In this lab, we will learn:
- how to examine SEM local fit using residuals
- how to modify SEM models for improved fit using modification indices
Load up the lavaan library:
library(lavaan)
14.1 PART I: Local Fit with Residuals
Let’s read in the new dataset ChiStatSimDat.csv:
cfaData <- read.csv("ChiStatSimDat.csv", header = TRUE)
Write out syntax for a two-factor CFA model:
fixedIndTwoFacSyntax <- "
# Factor Specification
posAffect =~ glad + happy + cheerful
satisfaction =~ satisfied + content + comfortable
"
Fit the two-factor model:
fixedIndTwoFacRun <- lavaan::sem(model = fixedIndTwoFacSyntax,
                                 data = cfaData,
                                 fixed.x = FALSE)
# Optional alternative: add estimator = 'MLMV' to the call above
summary(fixedIndTwoFacRun, standardized = T, fit.measures = T)
## lavaan 0.6-12 ended normally after 29 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 13
##
## Number of observations 200
##
## Model Test User Model:
##
## Test statistic 113.638
## Degrees of freedom 8
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 1168.055
## Degrees of freedom 15
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.908
## Tucker-Lewis Index (TLI) 0.828
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1413.224
## Loglikelihood unrestricted model (H1) -1356.405
##
## Akaike (AIC) 2852.449
## Bayesian (BIC) 2895.327
## Sample-size adjusted Bayesian (BIC) 2854.141
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.257
## 90 Percent confidence interval - lower 0.216
## 90 Percent confidence interval - upper 0.300
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.070
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## posAffect =~
## glad 1.000 1.115 0.943
## happy 1.095 0.048 22.882 0.000 1.221 0.919
## cheerful 0.763 0.046 16.739 0.000 0.851 0.812
## satisfaction =~
## satisfied 1.000 1.152 0.902
## content 1.110 0.054 20.462 0.000 1.278 0.936
## comfortable 0.715 0.057 12.548 0.000 0.823 0.719
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## posAffect ~~
## satisfaction 1.104 0.131 8.425 0.000 0.859 0.859
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .glad 0.154 0.031 4.966 0.000 0.154 0.110
## .happy 0.274 0.043 6.398 0.000 0.274 0.155
## .cheerful 0.374 0.042 8.851 0.000 0.374 0.341
## .satisfied 0.302 0.047 6.492 0.000 0.302 0.186
## .content 0.230 0.048 4.742 0.000 0.230 0.123
## .comfortable 0.632 0.068 9.253 0.000 0.632 0.482
## posAffect 1.244 0.142 8.790 0.000 1.000 1.000
## satisfaction 1.326 0.164 8.093 0.000 1.000 1.000
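The global fit indices in this summary can be reproduced by hand from the two chi-square tests. The sketch below uses the usual textbook formulas and assumes lavaan's default ML scaling, which (as far as the numbers above suggest) divides by N rather than N - 1 in the RMSEA. The variable names are ours, not lavaan's.

```r
# Values copied from the summary() output above
chisq_m <- 113.638   # user model chi-square
df_m    <- 8
chisq_b <- 1168.055  # baseline model chi-square
df_b    <- 15
n       <- 200

# RMSEA: misfit per degree of freedom, scaled by sample size
rmsea <- sqrt(max(chisq_m - df_m, 0) / (df_m * n))

# CFI: proportional reduction in misfit relative to the baseline model
cfi <- 1 - max(chisq_m - df_m, 0) / max(chisq_b - df_b, chisq_m - df_m, 0)

# TLI: same idea, but based on chi-square / df ratios
tli <- (chisq_b / df_b - chisq_m / df_m) / (chisq_b / df_b - 1)

round(c(RMSEA = rmsea, CFI = cfi, TLI = tli), 3)
```

The three values should agree with the reported RMSEA (0.257), CFI (0.908), and TLI (0.828) up to rounding.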
14.1.1 Unstandardized residuals
resid(fixedIndTwoFacRun)$cov
## glad happy cherfl satsfd contnt cmfrtb
## glad 0.000
## happy 0.017 0.000
## cheerful -0.028 -0.010 0.000
## satisfied -0.001 0.011 0.038 0.000
## content -0.030 -0.114 0.113 0.017 0.000
## comfortable 0.078 0.063 0.344 -0.090 0.012 0.000
What does this mean? What is the metric?
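One way to answer this: an unstandardized residual is the observed covariance minus the model-implied covariance, so it is expressed in the raw covariance metric of the two variables involved. As a rough sketch using the rounded estimates printed in the summary above (so the result is only approximate), the implied covariance between two indicators on different factors is the product of their loadings and the factor covariance:

```r
# Rounded estimates taken from the two-factor model summary
lambda_cheerful    <- 0.763   # loading of cheerful on posAffect
lambda_comfortable <- 0.715   # loading of comfortable on satisfaction
cov_factors        <- 1.104   # covariance of posAffect and satisfaction

# Model-implied covariance between cheerful and comfortable
implied_cov <- lambda_cheerful * cov_factors * lambda_comfortable

# Reported residual = observed - implied, so observed = implied + residual
residual     <- 0.344
observed_cov <- implied_cov + residual

round(c(implied = implied_cov, observed = observed_cov), 3)
```

Because the size of such a residual depends on the scales of the two variables, raw residuals are hard to compare across pairs, which is why rescaled versions are useful.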
14.1.2 Standardized residuals
resid(fixedIndTwoFacRun, type = "standardized")$cov
## glad happy cherfl satsfd contnt cmfrtb
## glad 0.000
## happy 3.950 0.000
## cheerful -3.732 -0.764 0.000
## satisfied -0.035 0.431 1.189 0.000
## content -2.164 -5.979 3.164 3.410 0.000
## comfortable 2.079 1.383 7.821 -4.755 0.762 0.000
14.1.3 Normalized residuals
resid(fixedIndTwoFacRun, type = "normalized")$cov
## glad happy cherfl satsfd contnt cmfrtb
## glad 0.000
## happy 0.117 0.000
## cheerful -0.254 -0.081 0.000
## satisfied -0.004 0.073 0.337 0.000
## content -0.214 -0.738 0.903 0.105 0.000
## comfortable 0.686 0.502 3.187 -0.751 0.087 0.000
- Different residuals, same story
- In all three metrics, the largest residual is the positive one between cheerful and comfortable
- The model under-predicts this covariance
- Fix: free a parameter that lets the model reproduce it
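To check the "largest residual" claim mechanically rather than by eye, the normalized residuals printed above can be typed into a matrix and searched. This is a minimal base R sketch; with a fitted model in hand you would apply the same idea to the matrix returned by resid() directly.

```r
vars <- c("glad", "happy", "cheerful", "satisfied", "content", "comfortable")
res <- matrix(0, 6, 6, dimnames = list(vars, vars))

# Lower triangle of the normalized residuals, filled column by column
res[lower.tri(res)] <- c(
   0.117, -0.254, -0.004, -0.214,  0.686,  # column glad
  -0.081,  0.073, -0.738,  0.502,          # column happy
   0.337,  0.903,  3.187,                  # column cheerful
   0.105, -0.751,                          # column satisfied
   0.087                                   # column content
)

# Position of the largest absolute residual
idx <- which(abs(res) == max(abs(res)), arr.ind = TRUE)
largest <- c(row = rownames(res)[idx[1, "row"]],
             col = colnames(res)[idx[1, "col"]])
largest
```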
14.2 PART II: Modification Indices
modindices(fixedIndTwoFacRun)
Filter the output to show only rows with a modification index of 10 or higher:
modindices(fixedIndTwoFacRun, minimum.value = 10)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 17 posAffect =~ content 17.031 -0.718 -0.801 -0.587 -0.587
## 18 posAffect =~ comfortable 14.926 0.500 0.558 0.487 0.487
## 20 satisfaction =~ happy 10.239 -0.390 -0.450 -0.338 -0.338
## 21 satisfaction =~ cheerful 20.960 0.456 0.526 0.501 0.501
## 22 glad ~~ happy 20.960 0.256 0.256 1.245 1.245
## 23 glad ~~ cheerful 10.239 -0.106 -0.106 -0.443 -0.443
## 29 happy ~~ content 18.588 -0.132 -0.132 -0.525 -0.525
## 33 cheerful ~~ comfortable 45.215 0.253 0.253 0.521 0.521
## 34 satisfied ~~ content 14.926 0.303 0.303 1.151 1.151
## 35 satisfied ~~ comfortable 17.030 -0.181 -0.181 -0.414 -0.414
Sort the output by modification index, so that the largest values appear first:
modindices(fixedIndTwoFacRun, minimum.value = 10, sort = TRUE)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 33 cheerful ~~ comfortable 45.215 0.253 0.253 0.521 0.521
## 21 satisfaction =~ cheerful 20.960 0.456 0.526 0.501 0.501
## 22 glad ~~ happy 20.960 0.256 0.256 1.245 1.245
## 29 happy ~~ content 18.588 -0.132 -0.132 -0.525 -0.525
## 17 posAffect =~ content 17.031 -0.718 -0.801 -0.587 -0.587
## 35 satisfied ~~ comfortable 17.030 -0.181 -0.181 -0.414 -0.414
## 18 posAffect =~ comfortable 14.926 0.500 0.558 0.487 0.487
## 34 satisfied ~~ content 14.926 0.303 0.303 1.151 1.151
## 20 satisfaction =~ happy 10.239 -0.390 -0.450 -0.338 -0.338
## 23 glad ~~ cheerful 10.239 -0.106 -0.106 -0.443 -0.443
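- op ~~ : a residual covariance (a covariance between the unique factors of two indicators)
- op =~ : a cross-loading
- The parameters lavaan suggests freeing include both cross-loadings and residual covariances. The largest modification index (45.215) belongs to the residual covariance cheerful ~~ comfortable, so we free that one first.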
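A modification index estimates how much the model chi-square would drop if that single parameter were freed, so it can be read against a chi-square distribution with 1 degree of freedom. A small base R sketch follows; the 0.05 level is only a conventional choice, not something lavaan imposes.

```r
# Critical value of a 1-df chi-square test at alpha = .05
crit <- qchisq(0.95, df = 1)

# Largest modification index from the table above (cheerful ~~ comfortable)
mi <- 45.215

# Approximate p-value for freeing that one parameter
p_mi <- pchisq(mi, df = 1, lower.tail = FALSE)

c(critical = crit, mi = mi, p = p_mi)
```

The cutoff of 10 passed to minimum.value is therefore a much stricter screen than the nominal critical value of about 3.84. That is a reasonable precaution when many candidate parameters are inspected at once.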
14.2.1 Modified Model 1:
mod1 <- "
posAffect =~ glad + happy + cheerful
satisfaction =~ satisfied + content + comfortable
# residual covariance
cheerful ~~ comfortable
"
mod1_fit <- lavaan::sem(mod1, data = cfaData,
                        std.lv = TRUE, fixed.x = FALSE)
summary(mod1_fit, fit.measures = T, standardized = T)
## lavaan 0.6-12 ended normally after 29 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 14
##
## Number of observations 200
##
## Model Test User Model:
##
## Test statistic 60.048
## Degrees of freedom 7
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 1168.055
## Degrees of freedom 15
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.954
## Tucker-Lewis Index (TLI) 0.901
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1386.429
## Loglikelihood unrestricted model (H1) -1356.405
##
## Akaike (AIC) 2800.859
## Bayesian (BIC) 2847.035
## Sample-size adjusted Bayesian (BIC) 2802.682
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.195
## 90 Percent confidence interval - lower 0.151
## 90 Percent confidence interval - upper 0.241
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.058
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## posAffect =~
## glad 1.111 0.064 17.475 0.000 1.111 0.940
## happy 1.221 0.073 16.802 0.000 1.221 0.919
## cheerful 0.801 0.060 13.392 0.000 0.801 0.790
## satisfaction =~
## satisfied 1.161 0.071 16.389 0.000 1.161 0.910
## content 1.262 0.075 16.835 0.000 1.262 0.925
## comfortable 0.750 0.069 10.821 0.000 0.750 0.679
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .cheerful ~~
## .comfortable 0.270 0.043 6.230 0.000 0.270 0.536
## posAffect ~~
## satisfaction 0.876 0.022 39.298 0.000 0.876 0.876
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .glad 0.163 0.031 5.200 0.000 0.163 0.116
## .happy 0.274 0.043 6.406 0.000 0.274 0.155
## .cheerful 0.386 0.043 8.990 0.000 0.386 0.375
## .satisfied 0.281 0.046 6.106 0.000 0.281 0.172
## .content 0.271 0.050 5.365 0.000 0.271 0.145
## .comfortable 0.657 0.070 9.380 0.000 0.657 0.539
## posAffect 1.000 1.000 1.000
## satisfaction 1.000 1.000 1.000
Model comparison:
anova(mod1_fit, fixedIndTwoFacRun)
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## mod1_fit 7 2800.9 2847.0 60.048
## fixedIndTwoFacRun 8 2852.4 2895.3 113.638 53.59 1 2.47e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
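The chi-square difference test reported by anova() can be reproduced by hand, since the original model is nested within mod1 (it is mod1 with the residual covariance fixed to zero). A minimal sketch using the values printed above:

```r
# Chi-square and df of the two nested models
chisq_orig <- 113.638; df_orig <- 8   # fixedIndTwoFacRun
chisq_mod1 <- 60.048;  df_mod1 <- 7   # mod1_fit

# The difference is itself chi-square distributed under the null
chisq_diff <- chisq_orig - chisq_mod1
df_diff    <- df_orig - df_mod1
p_value    <- pchisq(chisq_diff, df = df_diff, lower.tail = FALSE)

c(chisq_diff = chisq_diff, df_diff = df_diff, p = p_value)
```

A significant difference means that freeing the residual covariance improves fit by more than chance alone would explain.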
Keep modifying mod1:
modindices(mod1_fit, minimum.value = 10, sort = TRUE)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 22 satisfaction =~ cheerful 27.353 0.612 0.612 0.604 0.604
## 23 glad ~~ happy 27.353 0.275 0.275 1.304 1.304
## 34 satisfied ~~ content 23.194 0.369 0.369 1.340 1.340
## 19 posAffect =~ comfortable 23.194 0.711 0.711 0.644 0.644
## 30 happy ~~ content 22.974 -0.157 -0.157 -0.576 -0.576
## 33 cheerful ~~ content 16.477 0.113 0.113 0.348 0.348
## 24 glad ~~ cheerful 11.873 -0.098 -0.098 -0.392 -0.392
## 21 satisfaction =~ happy 11.872 -0.506 -0.506 -0.381 -0.381
## 35 satisfied ~~ comfortable 11.662 -0.127 -0.127 -0.297 -0.297
## 18 posAffect =~ content 11.661 -0.695 -0.695 -0.509 -0.509
14.2.2 Modified Model 2_1:
mod2_1 <- "
# cross loading
posAffect =~ glad + happy + cheerful
satisfaction =~ cheerful + satisfied + content + comfortable # cheerful also loads on satisfaction
# residual covariance
cheerful ~~ comfortable
"
mod2_1_fit <- lavaan::sem(mod2_1, data = cfaData,
                          std.lv = TRUE, fixed.x = FALSE)
summary(mod2_1_fit, fit.measures = T, standardized = T)
## lavaan 0.6-12 ended normally after 27 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 15
##
## Number of observations 200
##
## Model Test User Model:
##
## Test statistic 34.265
## Degrees of freedom 6
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 1168.055
## Degrees of freedom 15
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.975
## Tucker-Lewis Index (TLI) 0.939
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1373.538
## Loglikelihood unrestricted model (H1) -1356.405
##
## Akaike (AIC) 2777.076
## Bayesian (BIC) 2826.551
## Sample-size adjusted Bayesian (BIC) 2779.029
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.153
## 90 Percent confidence interval - lower 0.106
## 90 Percent confidence interval - upper 0.205
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.031
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## posAffect =~
## glad 1.128 0.063 17.786 0.000 1.128 0.954
## happy 1.223 0.073 16.752 0.000 1.223 0.921
## cheerful 0.389 0.084 4.624 0.000 0.389 0.374
## satisfaction =~
## cheerful 0.483 0.089 5.400 0.000 0.483 0.465
## satisfied 1.151 0.071 16.168 0.000 1.151 0.902
## content 1.284 0.074 17.339 0.000 1.284 0.941
## comfortable 0.815 0.072 11.361 0.000 0.815 0.712
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .cheerful ~~
## .comfortable 0.262 0.043 6.055 0.000 0.262 0.529
## posAffect ~~
## satisfaction 0.835 0.027 30.624 0.000 0.835 0.835
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .glad 0.126 0.035 3.618 0.000 0.126 0.090
## .happy 0.269 0.047 5.747 0.000 0.269 0.152
## .cheerful 0.380 0.041 9.270 0.000 0.380 0.353
## .satisfied 0.303 0.047 6.458 0.000 0.303 0.186
## .content 0.215 0.049 4.409 0.000 0.215 0.115
## .comfortable 0.646 0.070 9.287 0.000 0.646 0.493
## posAffect 1.000 1.000 1.000
## satisfaction 1.000 1.000 1.000
Model comparison:
anova(mod2_1_fit, mod1_fit)
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## mod2_1_fit 6 2777.1 2826.6 34.265
## mod1_fit 7 2800.9 2847.0 60.048 25.783 1 3.821e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
14.2.3 Modified Model 2_2:
mod2_2 <- "
posAffect =~ glad + happy + cheerful
satisfaction =~ satisfied + content + comfortable
# residual covariances
cheerful ~~ comfortable
glad ~~ happy
"
mod2_2_fit <- lavaan::sem(mod2_2, data = cfaData,
                          std.lv = TRUE, fixed.x = FALSE)
summary(mod2_2_fit, fit.measures = T, standardized = T)
## lavaan 0.6-12 ended normally after 31 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 15
##
## Number of observations 200
##
## Model Test User Model:
##
## Test statistic 34.265
## Degrees of freedom 6
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 1168.055
## Degrees of freedom 15
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.975
## Tucker-Lewis Index (TLI) 0.939
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1373.538
## Loglikelihood unrestricted model (H1) -1356.405
##
## Akaike (AIC) 2777.076
## Bayesian (BIC) 2826.551
## Sample-size adjusted Bayesian (BIC) 2779.029
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.153
## 90 Percent confidence interval - lower 0.106
## 90 Percent confidence interval - upper 0.205
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.031
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## posAffect =~
## glad 1.020 0.069 14.784 0.000 1.020 0.863
## happy 1.107 0.079 13.947 0.000 1.107 0.833
## cheerful 0.875 0.061 14.323 0.000 0.875 0.843
## satisfaction =~
## satisfied 1.151 0.071 16.168 0.000 1.151 0.902
## content 1.284 0.074 17.339 0.000 1.284 0.941
## comfortable 0.815 0.072 11.361 0.000 0.815 0.712
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .cheerful ~~
## .comfortable 0.262 0.043 6.055 0.000 0.262 0.584
## .glad ~~
## .happy 0.250 0.055 4.563 0.000 0.250 0.569
## posAffect ~~
## satisfaction 0.923 0.020 46.285 0.000 0.923 0.923
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .glad 0.356 0.053 6.679 0.000 0.356 0.255
## .happy 0.540 0.074 7.249 0.000 0.540 0.306
## .cheerful 0.312 0.043 7.197 0.000 0.312 0.290
## .satisfied 0.303 0.047 6.458 0.000 0.303 0.186
## .content 0.215 0.049 4.409 0.000 0.215 0.115
## .comfortable 0.646 0.070 9.287 0.000 0.646 0.493
## posAffect 1.000 1.000 1.000
## satisfaction 1.000 1.000 1.000
Model comparison:
anova(mod2_2_fit, mod1_fit)
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## mod2_2_fit 6 2777.1 2826.6 34.265
## mod1_fit 7 2800.9 2847.0 60.048 25.783 1 3.821e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Keep modifying mod2_2:
modindices(mod2_2_fit, minimum.value = 10, sort = TRUE)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 30 happy ~~ content 14.77 -0.117 -0.117 -0.344 -0.344
14.2.4 Modified Model 3:
mod3 <- "
posAffect =~ glad + happy + cheerful
satisfaction =~ satisfied + content + comfortable
# residual covariances
cheerful ~~ comfortable
glad ~~ happy
content ~~ happy
glad ~~ content
"
mod3_fit <- lavaan::sem(mod3, data = cfaData,
                        std.lv = TRUE, fixed.x = FALSE)
## Warning in lav_object_post_check(object): lavaan WARNING: the covariance matrix of the residuals of the observed
## variables (theta) is not positive definite;
## use lavInspect(fit, "theta") to investigate.
summary(mod3_fit, fit.measures = T, standardized = T)
## lavaan 0.6-12 ended normally after 35 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 17
##
## Number of observations 200
##
## Model Test User Model:
##
## Test statistic 4.950
## Degrees of freedom 4
## P-value (Chi-square) 0.292
##
## Model Test Baseline Model:
##
## Test statistic 1168.055
## Degrees of freedom 15
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.999
## Tucker-Lewis Index (TLI) 0.997
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -1358.880
## Loglikelihood unrestricted model (H1) -1356.405
##
## Akaike (AIC) 2751.761
## Bayesian (BIC) 2807.832
## Sample-size adjusted Bayesian (BIC) 2753.975
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.034
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.117
## P-value RMSEA <= 0.05 0.524
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.013
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## posAffect =~
## glad 1.071 0.069 15.446 0.000 1.071 0.908
## happy 1.190 0.079 15.143 0.000 1.190 0.898
## cheerful 0.837 0.063 13.355 0.000 0.837 0.802
## satisfaction =~
## satisfied 1.101 0.074 14.921 0.000 1.101 0.863
## content 1.341 0.073 18.279 0.000 1.341 0.983
## comfortable 0.805 0.071 11.327 0.000 0.805 0.703
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .cheerful ~~
## .comfortable 0.299 0.046 6.433 0.000 0.299 0.590
## .glad ~~
## .happy 0.098 0.063 1.563 0.118 0.098 0.339
## .happy ~~
## .content -0.277 0.055 -4.998 0.000 -0.277 -1.896
## .glad ~~
## .content -0.166 0.050 -3.344 0.001 -0.166 -1.341
## posAffect ~~
## satisfaction 0.946 0.020 47.894 0.000 0.946 0.946
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .glad 0.246 0.062 3.962 0.000 0.246 0.176
## .happy 0.341 0.080 4.267 0.000 0.341 0.194
## .cheerful 0.389 0.048 8.016 0.000 0.389 0.357
## .satisfied 0.416 0.058 7.200 0.000 0.416 0.256
## .content 0.062 0.065 0.966 0.334 0.062 0.034
## .comfortable 0.662 0.069 9.660 0.000 0.662 0.505
## posAffect 1.000 1.000 1.000
## satisfaction 1.000 1.000 1.000
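The warning about theta deserves a second look before this model is accepted: two of the standardized residual covariances above (-1.896 and -1.341) lie outside the range of a correlation. As a rough check, the sketch below rebuilds the residual covariance matrix from the rounded estimates in the summary and inspects its eigenvalues. The helper set_cov is our own, and in practice lavInspect(mod3_fit, "theta") returns the exact matrix.

```r
vars <- c("glad", "happy", "cheerful", "satisfied", "content", "comfortable")

# Residual variances on the diagonal (rounded values from the summary)
theta <- diag(c(0.246, 0.341, 0.389, 0.416, 0.062, 0.662))
dimnames(theta) <- list(vars, vars)

# Hypothetical helper: set a symmetric off-diagonal element
set_cov <- function(m, a, b, value) {
  m[a, b] <- value
  m[b, a] <- value
  m
}
theta <- set_cov(theta, "cheerful", "comfortable", 0.299)
theta <- set_cov(theta, "glad", "happy", 0.098)
theta <- set_cov(theta, "happy", "content", -0.277)
theta <- set_cov(theta, "glad", "content", -0.166)

# A positive definite matrix has only positive eigenvalues
min_eig <- min(eigen(theta, symmetric = TRUE)$values)

# Implied residual correlations; values beyond +/-1 are inadmissible
resid_cor <- cov2cor(theta)
round(resid_cor["happy", "content"], 2)
min_eig < 0
```

A negative smallest eigenvalue confirms that the estimated residual matrix is not a valid covariance matrix, so the excellent global fit of mod3 should be interpreted with caution.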
Model comparison:
anova(mod3_fit, mod2_2_fit)
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## mod3_fit 4 2751.8 2807.8 4.9499
## mod2_2_fit 6 2777.1 2826.6 34.2652 29.315 2 4.308e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
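The AIC and BIC columns in this table follow directly from each model's loglikelihood and number of free parameters. A short sketch using the values reported in the two summaries, with the standard definitions AIC = -2 logL + 2k and BIC = -2 logL + k log(N):

```r
n <- 200

# Small helper (our own naming) returning both criteria
info_criteria <- function(loglik, k, n) {
  c(AIC = -2 * loglik + 2 * k,
    BIC = -2 * loglik + k * log(n))
}

ic_mod2_2 <- info_criteria(loglik = -1373.538, k = 15, n = n)
ic_mod3   <- info_criteria(loglik = -1358.880, k = 17, n = n)

round(rbind(mod2_2 = ic_mod2_2, mod3 = ic_mod3), 1)
```

Both criteria penalize the two extra parameters in mod3, and lower values indicate the preferred model.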
Keep modifying mod3:
modindices(mod3_fit, minimum.value = 1, sort = TRUE)
## Warning in lav_start_check_cov(lavpartable = lavpartable, start = START): lavaan WARNING: starting values imply a correlation larger than 1;
## variables involved are: happy content
## Warning in lav_start_check_cov(lavpartable = lavpartable, start = START): lavaan WARNING: starting values imply a correlation larger than 1;
## variables involved are: glad content
## [1] lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## <0 rows> (or 0-length row.names)
No remaining modification has an index of 1 or higher, so no single freed parameter is expected to lower the model chi-square by even 1 point.