Chapter 8 Lavaan Lab 5: One-factor CFA Model
In this lab, we will learn how to:
- Identify the One-factor CFA Model
- Scale the One-factor CFA Model
- Estimate the One-factor CFA Model
- Interpret the One-factor CFA Model
8.1 Data Prep
We will use cfaInClassData.csv in this lab.
This is a simulated dataset based on Todd Little’s positive affect example.
The hypothesis is that a latent variable ‘positive affect’ is measured by three indicators (glad, cheerful, and happy).
Let’s read this dataset in:
cfaData <- read.csv("cfaInclassData.csv", header = TRUE)
and examine the dataset:
head(cfaData)
## ID glad cheerful happy satisfied content comfortable
## 1 1 0.13521092 0.5413297 -0.1041445 -0.5777446 0.8645383 0.02935020
## 2 2 -0.29116043 0.2434081 0.6671535 2.0763730 -0.7382832 1.05439183
## 3 3 0.71975913 0.2218277 0.4722337 2.1685984 -0.2727574 0.09053090
## 4 4 0.44432030 0.9295414 0.8574083 -1.0575363 -1.3841364 -0.07940091
## 5 5 2.84476524 3.1710123 3.5145040 1.5725274 2.3406754 1.59866763
## 6 6 -0.03317526 -0.8434011 -0.1485924 -0.5469343 -1.5750953 -0.69629828
str(cfaData)
## 'data.frame': 1000 obs. of 7 variables:
## $ ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ glad : num 0.135 -0.291 0.72 0.444 2.845 ...
## $ cheerful : num 0.541 0.243 0.222 0.93 3.171 ...
## $ happy : num -0.104 0.667 0.472 0.857 3.515 ...
## $ satisfied : num -0.578 2.076 2.169 -1.058 1.573 ...
## $ content : num 0.865 -0.738 -0.273 -1.384 2.341 ...
## $ comfortable: num 0.0294 1.0544 0.0905 -0.0794 1.5987 ...
dim(cfaData) # n = 1000, 7 variables
## [1] 1000 7
Let’s examine their means and standard deviations:
round(apply(cfaData[,-1], 2, mean), 2) # mean-centered
## glad cheerful happy satisfied content comfortable
## 0.00 0.00 0.01 -0.04 -0.04 -0.04
round(apply(cfaData[,-1], 2, sd), 2)
## glad cheerful happy satisfied content comfortable
## 0.98 0.98 0.98 1.01 1.08 0.95
Let’s call up the lavaan library and run some CFAs!
library(lavaan)
8.2 PART I: One-Factor CFA, Fixed Loading
8.2.1 Fixed Loading, AKA Marker Variable method.
FYI, the three equations for the three indicators are:
- Glad = lambda1*posAffect(eta) + u1
- Cheerful = lambda2*posAffect(eta) + u2
- Happy = lambda3*posAffect(eta) + u3
Let’s first follow the equations above and write the syntax (disturbances are automatically included):
mod1.wrong<- "
glad ~ posAffect
cheerful ~ posAffect
happy ~ posAffect
"
fit1.wrong = lavaan::sem(model = mod1.wrong, data = cfaData, fixed.x=FALSE)
Oops - an error message!
Error in lav_data_full(data = data, group = group, cluster = cluster, :
lavaan ERROR: missing observed variables in dataset: posAffect
This is because posAffect is a latent variable and we have to use =~ to define a latent variable:
mod1.wrong<-'
posAffect =~ Glad + Cheerful + Happy
'
fit1.wrong = lavaan::sem(model = mod1.wrong, data = cfaData, fixed.x=FALSE)
Error in lavaan::lavaan(model = mod1.wrong, data = cfaData, fixed.x = FALSE, :
lavaan ERROR: missing observed variables in dataset: Glad Cheerful Happy
Another error. Why?
The variable names in the model syntax have to match the column names EXACTLY, including letter case.
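One way to catch this kind of mismatch before fitting is to compare the names typed in the model with the column names of the data. The sketch below uses a small stand-in data frame with the same column names as cfaData; `toy`, `wanted`, and `missing_names` are illustrative names, not part of the lab.

```r
# Stand-in for cfaData: same column names, made-up values (assumption).
toy <- data.frame(ID = 1:3,
                  glad = c(0.1, -0.3, 0.7),
                  cheerful = c(0.5, 0.2, 0.2),
                  happy = c(-0.1, 0.7, 0.5))

# Indicator names as typed in the failing model syntax.
wanted <- c("Glad", "Cheerful", "Happy")

# Names used in the syntax that are not columns of the data (case-sensitive).
missing_names <- setdiff(wanted, names(toy))
print(missing_names)

# After correcting the case, nothing is missing.
print(setdiff(tolower(wanted), names(toy)))
```

With the real data, replacing `toy` by `cfaData` should give the same kind of check.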
Let’s try again:
mod1<-'
posAffect =~ glad + cheerful + happy
'
Let’s explain the lavaan model syntax!
- mod1 is used to name our model.
- Since posAffect is a latent variable (it’s not in the data), we cannot follow the equations above and write syntax like glad ~ posAffect
- Instead, we specify a CFA measurement model in mod1.
- NEW SYNTAX ALERT: the =~ operator means “is manifested by”
- In the code above we can see that our latent construct ‘posAffect’ is manifested by glad, cheerful, and happy
- By default, the loading of glad is fixed at 1 (Fixed Loading Method)
Next we name the fitted object ‘fit1’ to see our output.
fit1 = lavaan::sem(mod1, data = cfaData, fixed.x=FALSE)
This summary will show us the loadings (I also requested standardized results):
summary(fit1, standardized = T)
## lavaan 0.6-12 ended normally after 20 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 6
##
## Number of observations 1000
##
## Model Test User Model:
##
## Test statistic 0.000
## Degrees of freedom 0
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## posAffect =~
## glad 1.000 0.693 0.705
## cheerful 1.117 0.059 18.782 0.000 0.774 0.787
## happy 1.066 0.057 18.786 0.000 0.739 0.757
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .glad 0.485 0.030 16.238 0.000 0.485 0.503
## .cheerful 0.367 0.030 12.062 0.000 0.367 0.380
## .happy 0.407 0.030 13.751 0.000 0.407 0.427
## posAffect 0.480 0.043 11.270 0.000 1.000 1.000
df = 0 (why?)
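One way to see why df = 0 is to count the pieces of information in the data against the free parameters in the model. The sketch below does that counting for three indicators under the marker-variable method; the variable names (`p`, `n_info`, `n_par`, `df_model`) are illustrative.

```r
p <- 3  # number of indicators: glad, cheerful, happy

# Unique elements of the sample covariance matrix:
# p variances plus p*(p-1)/2 covariances.
n_info <- p * (p + 1) / 2

# Free parameters under the marker-variable method:
# 2 free loadings (glad's loading is fixed at 1),
# 3 unique factor variances, and 1 latent factor variance.
n_par <- (p - 1) + p + 1

# Degrees of freedom = information minus free parameters.
df_model <- n_info - n_par
print(c(information = n_info, parameters = n_par, df = df_model))
```

The parameter count agrees with the "Number of model parameters" line in the output: the model is just-identified, so it reproduces the sample covariance matrix exactly and the test statistic is 0.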
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
posAffect =~
glad 1.000 0.693 0.705
cheerful 1.117 0.059 18.782 0.000 0.774 0.787
happy 1.066 0.057 18.786 0.000 0.739 0.757
What does this mean?
- A 1-unit change in posAffect produces:
- a 1-unit change in “glad” (marker indicator)
- a 1.117-unit change in “cheerful” (1.117 times the effect on “glad”)
- a 1.066-unit change in “happy” (1.066 times the effect on “glad”)
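The Std.lv and Std.all columns can be reproduced by hand from the unstandardized estimates. The sketch below copies the estimates from the summary(fit1) output and assumes the usual definitions: Std.lv standardizes only the latent variable, and Std.all also divides by the model-implied standard deviation of each indicator. The object names are illustrative.

```r
# Values copied from the summary(fit1) output.
lambda <- c(glad = 1.000, cheerful = 1.117, happy = 1.066)  # unstandardized loadings
theta  <- c(glad = 0.485, cheerful = 0.367, happy = 0.407)  # unique factor variances
phi    <- 0.480                                             # variance of posAffect

# Std.lv: loadings rescaled so that the factor has variance 1.
std_lv <- lambda * sqrt(phi)

# Std.all: additionally divide by the model-implied SD of each indicator.
implied_sd <- sqrt(lambda^2 * phi + theta)
std_all <- std_lv / implied_sd

print(round(rbind(Std.lv = std_lv, Std.all = std_all), 3))
```

Because the inputs are rounded to three decimals, the results should match the printed columns up to rounding.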
Variances:
Unique factor variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.glad 0.485 0.030 16.238 0.000 0.485 0.503
.cheerful 0.367 0.030 12.062 0.000 0.367 0.380
.happy 0.407 0.030 13.751 0.000 0.407 0.427
- The leftover unique factor variances remain substantial,
- meaning that none of the indicators is a perfect measure of posAffect,
- but they all contribute significantly to the measurement of posAffect (the standardized loadings above are all larger than 0.6)
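To put numbers on "substantial", the standardized loadings can be squared to get the share of each indicator's variance explained by posAffect; the remainder is the standardized unique variance. This is a small arithmetic sketch using values copied from the output, with illustrative object names.

```r
# Std.all loadings copied from the summary(fit1) output.
std_loading <- c(glad = 0.705, cheerful = 0.787, happy = 0.757)

# Share of each indicator's variance explained by the factor (R-squared).
r2 <- std_loading^2

# Share left to the unique factor; compare with the Std.all column
# of the unique factor variances.
unique_share <- 1 - r2

print(round(rbind(explained = r2, unique = unique_share), 3))
```

lavaan can also report these R-squared values directly, for example with summary(fit1, rsquare = TRUE).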
Followed by the latent factor variance.
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
posAffect 0.480 0.043 11.270 0.000 1.000 1.000
8.2.2 Change marker indicator
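If you’d like to fix the 2nd loading to 1, the following syntax won’t work, because the loading of glad is still fixed at 1 by default:
mod1b_wrong<-'
posAffect =~ glad + 1*cheerful + happy
'
You will get something like this, with the loadings of both glad and cheerful fixed at 1: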
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
posAffect =~
glad 1.000 0.734 0.733
cheerful 1.000 0.734 0.759
happy 1.009 0.046 22.052 0.000 0.741 0.759
You’ll have to change the order of the indicators to move cheerful to the front of the variable list:
mod1b<-'
posAffect =~ cheerful + glad + happy
'
Or use NA* to mark the loadings to be freely estimated and 1* to mark the marker variable whose loading is fixed at 1:
mod1b<-'
posAffect =~ NA*glad + 1*cheerful + NA*happy
'
Here we name the fitted object ‘fit1b’ to see our output.
fit1b = lavaan::sem(mod1b, data = cfaData, fixed.x=FALSE)
summary(fit1b, standardized = T)
## lavaan 0.6-12 ended normally after 19 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 6
##
## Number of observations 1000
##
## Model Test User Model:
##
## Test statistic 0.000
## Degrees of freedom 0
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## posAffect =~
## glad 0.895 0.048 18.782 0.000 0.693 0.705
## cheerful 1.000 0.774 0.787
## happy 0.954 0.050 19.130 0.000 0.739 0.757
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .glad 0.485 0.030 16.238 0.000 0.485 0.503
## .cheerful 0.367 0.030 12.062 0.000 0.367 0.380
## .happy 0.407 0.030 13.751 0.000 0.407 0.427
## posAffect 0.599 0.048 12.616 0.000 1.000 1.000
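- The loadings can be obtained by dividing those in fit1 by 1.117 (i.e., they change proportionally).
- The unique factor variances remain unchanged, but the latent factor variance changes from 0.480 to 0.599 (0.480 multiplied by 1.117 squared), because posAffect is now on the scale of cheerful.
- The standardized solutions (Std.lv and Std.all) are identical to those in fit1.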
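The relation between the two marker choices can be checked with a few lines of arithmetic. The sketch below takes the fit1 estimates (marker = glad), rescales them to the metric of cheerful, and compares the result with the fit1b output; object names are illustrative.

```r
# Estimates copied from summary(fit1), where glad is the marker.
lambda_fit1 <- c(glad = 1.000, cheerful = 1.117, happy = 1.066)
phi_fit1 <- 0.480

# Loading of the new marker (cheerful) in the old metric.
k <- unname(lambda_fit1["cheerful"])

# Switching markers divides every loading by k ...
lambda_fit1b <- lambda_fit1 / k

# ... and multiplies the latent factor variance by k squared.
phi_fit1b <- phi_fit1 * k^2

print(round(lambda_fit1b, 3))
print(round(phi_fit1b, 3))
```

Up to rounding of the inputs, these values should agree with the loadings and the posAffect variance printed for fit1b.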
8.3 PART II: One-Factor CFA, Fixed Factor Variance
8.3.1 Fixed Factor Method
Keep using the same syntax but assign a new name mod2:
mod2<-'
posAffect =~ glad + cheerful + happy
'
To fix the variance of the latent variable to 1, add std.lv=T to the sem() call:
fit2<-lavaan::sem(mod2, data = cfaData, fixed.x=FALSE, std.lv=T)
summary(fit2, standardized = TRUE)
## lavaan 0.6-12 ended normally after 11 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 6
##
## Number of observations 1000
##
## Model Test User Model:
##
## Test statistic 0.000
## Degrees of freedom 0
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## posAffect =~
## glad 0.693 0.031 22.540 0.000 0.693 0.705
## cheerful 0.774 0.031 25.233 0.000 0.774 0.787
## happy 0.739 0.030 24.226 0.000 0.739 0.757
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .glad 0.485 0.030 16.238 0.000 0.485 0.503
## .cheerful 0.367 0.030 12.062 0.000 0.367 0.380
## .happy 0.407 0.030 13.751 0.000 0.407 0.427
## posAffect 1.000 1.000 1.000
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
posAffect =~
glad 0.693 0.031 22.540 0.000 0.693 0.705
cheerful 0.774 0.031 25.233 0.000 0.774 0.787
happy 0.739 0.030 24.226 0.000 0.739 0.757
- A 1-SD change in the factor (posAffect) produces:
- 0.693-unit change in glad (on its raw scale)
- 0.774-unit change in cheerful (on its raw scale)
- 0.739-unit change in happy (on its raw scale)
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.glad 0.485 0.030 16.238 0.000 0.485 0.503
.cheerful 0.367 0.030 12.062 0.000 0.367 0.380
.happy 0.407 0.030 13.751 0.000 0.407 0.427
posAffect 1.000 1.000 1.000
- We see that posAffect now has a variance (and therefore an SD) of 1
- All loadings were freely estimated; no loading is fixed at 1
- The unique factor variances are the same as before
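The two scaling methods are reparameterizations of the same model, which can be illustrated by computing the model-implied covariance matrix (loadings times factor variance times loadings, plus unique variances) from each set of estimates. The sketch below uses values copied from the fit1 and fit2 output; the object names are illustrative.

```r
# Unique factor variances (identical in fit1 and fit2).
theta <- diag(c(0.485, 0.367, 0.407))

# Marker-variable estimates (fit1) and fixed-factor estimates (fit2).
lam_marker <- c(1.000, 1.117, 1.066); phi_marker <- 0.480
lam_fixed  <- c(0.693, 0.774, 0.739); phi_fixed  <- 1

# Model-implied covariance matrix: lambda * phi * lambda' + theta.
sigma_marker <- phi_marker * outer(lam_marker, lam_marker) + theta
sigma_fixed  <- phi_fixed  * outer(lam_fixed,  lam_fixed)  + theta

print(round(sigma_marker, 3))
print(round(sigma_fixed, 3))

# Largest discrepancy between the two matrices (rounding error only).
max_diff <- max(abs(sigma_marker - sigma_fixed))
print(max_diff)
```

Any remaining difference comes from the three-decimal rounding of the printed estimates, so the choice of scaling method changes the metric of the parameters but not the fit.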