Chapter 8 Week7_1: Lavaan Lab 5 One-factor CFA Model

In this lab, we will learn how to:

Identify the One-factor CFA Model
Scale the One-factor CFA Model
Estimate the One-factor CFA Model
Interpret the One-factor CFA Model

8.1 Data Prep

We will use cfaInclassData.csv in this lab.

This is a simulated dataset based on Todd Little’s positive affect example.

The hypothesis is that a latent variable ‘positive affect’ is measured by three indicators (glad, cheerful, and happy).

Let’s read this dataset in:

cfaData<- read.csv("cfaInclassData.csv", header = T)

and examine the dataset:

head(cfaData)

##   ID        glad   cheerful      happy  satisfied    content comfortable
## 1  1  0.13521092  0.5413297 -0.1041445 -0.5777446  0.8645383  0.02935020
## 2  2 -0.29116043  0.2434081  0.6671535  2.0763730 -0.7382832  1.05439183
## 3  3  0.71975913  0.2218277  0.4722337  2.1685984 -0.2727574  0.09053090
## 4  4  0.44432030  0.9295414  0.8574083 -1.0575363 -1.3841364 -0.07940091
## 5  5  2.84476524  3.1710123  3.5145040  1.5725274  2.3406754  1.59866763
## 6  6 -0.03317526 -0.8434011 -0.1485924 -0.5469343 -1.5750953 -0.69629828

str(cfaData)

## 'data.frame':    1000 obs. of  7 variables:
##  $ ID         : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ glad       : num  0.135 -0.291 0.72 0.444 2.845 ...
##  $ cheerful   : num  0.541 0.243 0.222 0.93 3.171 ...
##  $ happy      : num  -0.104 0.667 0.472 0.857 3.515 ...
##  $ satisfied  : num  -0.578 2.076 2.169 -1.058 1.573 ...
##  $ content    : num  0.865 -0.738 -0.273 -1.384 2.341 ...
##  $ comfortable: num  0.0294 1.0544 0.0905 -0.0794 1.5987 ...

dim(cfaData) #n = 1000, 7 variables

## [1] 1000    7

Let’s examine their means and standard deviations:

round(apply(cfaData[,-1], 2, mean), 2) # mean-centered

##        glad    cheerful       happy   satisfied     content comfortable 
##        0.00        0.00        0.01       -0.04       -0.04       -0.04

round(apply(cfaData[,-1], 2, sd), 2)

##        glad    cheerful       happy   satisfied     content comfortable 
##        0.98        0.98        0.98        1.01        1.08        0.95

Let’s load the lavaan library and run some CFAs!

library(lavaan)

8.2 PART I: One-Factor CFA, Fixed Loading

8.2.1 Fixed Loading, AKA Marker Variable method.

FYI, the three equations for the three indicators are:

Glad = lambda1*posAffect(eta) + u1
Cheerful = lambda2*posAffect(eta) + u2
Happy = lambda3*posAffect(eta) + u3

Let’s first follow the equations above and write the syntax (disturbances are automatically included):

mod1.wrong<- "
  glad ~ posAffect
  cheerful ~ posAffect
  happy ~ posAffect
"
fit1.wrong = lavaan::sem(model = mod1.wrong, data = cfaData, fixed.x=FALSE)

Oops - an error message!

Error in lav_data_full(data = data, group = group, cluster = cluster,  : 
  lavaan ERROR: missing observed variables in dataset: posAffect

This is because posAffect is a latent variable and we have to use =~ to define a latent variable:

mod1.wrong<-'
posAffect =~ Glad + Cheerful + Happy
'
fit1.wrong = lavaan::sem(model = mod1.wrong, data = cfaData, fixed.x=FALSE)

Error in lavaan::lavaan(model = mod1.wrong, data = cfaData, fixed.x = FALSE,  : 
  lavaan ERROR: missing observed variables in dataset: Glad Cheerful Happy

Another error. Why?

The variable names in the model syntax have to match the column names EXACTLY, including letter case.
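
A quick way to catch this kind of mismatch before fitting is to compare the names typed in the model against the column names of the data. The sketch below uses a tiny made-up data frame (`toy`) and a hypothetical vector `wanted`; neither is part of the lab materials.

```r
# Toy data frame with the same column names as the lab data (values are made up)
toy <- data.frame(glad = c(0.1, -0.3), cheerful = c(0.5, 0.2), happy = c(-0.1, 0.7))

# Indicator names as typed in the failing model syntax
wanted <- c("Glad", "Cheerful", "Happy")

# Names with no exact (case-sensitive) match among the columns
missing_vars <- setdiff(wanted, names(toy))
print(missing_vars)

# A case-insensitive lookup suggests which columns were probably intended
suggested <- names(toy)[match(tolower(missing_vars), tolower(names(toy)))]
print(suggested)
```

Here every name in `wanted` is reported as missing, and the case-insensitive lookup points to the lowercase columns.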

Let’s try again:

mod1<-'
posAffect =~ glad + cheerful + happy
'

Let’s explain the lavaan model syntax!

mod1 is the name we give to our model.
Since posAffect is a latent variable (it’s not in the data), we cannot follow the equations above and write syntax like glad ~ posAffect.
Instead, we specify a CFA measurement model in mod1.
NEW SYNTAX ALERT: the =~ operator means “is manifested by”.
In the code above, our latent construct ‘posAffect’ is manifested by glad, cheerful, and happy.
By default, lavaan fixes the loading of the first indicator listed (here, glad) at 1 (Fixed Loading Method).

Next we fit the model and store the result in ‘fit1’.

fit1 = lavaan::sem(mod1, data = cfaData, fixed.x=FALSE)

This summary will show us the loadings (I also requested standardized results):

summary(fit1, standardized = T)

## lavaan 0.6-12 ended normally after 20 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         6
## 
##   Number of observations                          1000
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   posAffect =~                                                          
##     glad              1.000                               0.693    0.705
##     cheerful          1.117    0.059   18.782    0.000    0.774    0.787
##     happy             1.066    0.057   18.786    0.000    0.739    0.757
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .glad              0.485    0.030   16.238    0.000    0.485    0.503
##    .cheerful          0.367    0.030   12.062    0.000    0.367    0.380
##    .happy             0.407    0.030   13.751    0.000    0.407    0.427
##     posAffect         0.480    0.043   11.270    0.000    1.000    1.000

df = 0 (why?)
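
One way to answer this is to count pieces of information against free parameters. The sketch below does the arithmetic for three indicators; the variable names are illustrative only.

```r
# Number of observed indicators
p <- 3

# Unique variances and covariances in the sample covariance matrix
n_moments <- p * (p + 1) / 2

# Free parameters under the fixed loading method:
# 2 free loadings (the marker loading is fixed at 1),
# 3 unique factor variances, and 1 latent factor variance
n_free <- 2 + 3 + 1

# Degrees of freedom = information available - parameters estimated
df <- n_moments - n_free
print(c(moments = n_moments, free = n_free, df = df))
```

With 6 moments and 6 free parameters the model is just-identified, so df = 0 and the test statistic is 0.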

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  posAffect =~                                                          
    glad              1.000                               0.693    0.705
    cheerful          1.117    0.059   18.782    0.000    0.774    0.787
    happy             1.066    0.057   18.786    0.000    0.739    0.757

What does this mean?

A 1-unit change in posAffect produces:
- a 1-unit change in “glad” (marker indicator)
- a 1.117-unit change in “cheerful” (1.117 times the effect on “glad”)
- a 1.066-unit change in “happy” (1.066 times the effect on “glad”)

Variances:
Unique factor variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .glad              0.485    0.030   16.238    0.000    0.485    0.503
   .cheerful          0.367    0.030   12.062    0.000    0.367    0.380
   .happy             0.407    0.030   13.751    0.000    0.407    0.427

The leftover unique factor variances remain substantial,
meaning that none of the indicators is a perfect measure of posAffect,
but they all contribute significantly to the measurement of posAffect (the standardized loadings above are all larger than 0.6).

The last row gives the latent factor variance:

                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    posAffect         0.480    0.043   11.270    0.000    1.000    1.000
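
The Std.lv and Std.all columns can be reproduced by hand from the unstandardized estimates. The sketch below copies the rounded values from the fit1 output above, so agreement is only up to rounding; the object names are illustrative.

```r
# Unstandardized estimates copied from the fit1 summary (rounded to 3 decimals)
lambda <- c(glad = 1.000, cheerful = 1.117, happy = 1.066)  # loadings
theta  <- c(glad = 0.485, cheerful = 0.367, happy = 0.407)  # unique factor variances
psi    <- 0.480                                             # latent factor variance

# Std.lv: loadings when only the latent variable is standardized
std_lv <- lambda * sqrt(psi)

# Model-implied variance of each indicator: lambda^2 * psi + theta
implied_var <- lambda^2 * psi + theta

# Std.all: loadings when both the latent variable and the indicators are standardized
std_all <- std_lv / sqrt(implied_var)

# Standardized unique variance = proportion of indicator variance left unexplained
std_theta <- theta / implied_var

print(round(rbind(std_lv, std_all, std_theta), 3))
```

Note that the squared Std.all loading and the standardized unique variance of each indicator add up to 1.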

8.2.2 Change marker indicator

If you’d like to fix the 2nd loading to 1:

mod1b_wrong<-'
posAffect =~ glad + 1*cheerful + happy
'

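This won’t work: lavaan still fixes the first loading (glad) at 1 by default, so both glad and cheerful end up fixed at 1.

You will get something like this: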

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  posAffect =~                                                          
    glad              1.000                               0.734    0.733
    cheerful          1.000                               0.734    0.759
    happy             1.009    0.046   22.052    0.000    0.741    0.759

You’ll have to change the order of the indicators to move cheerful to the front of the variable list:

mod1b<-'
posAffect =~ cheerful + glad + happy
'

Or use NA* to free a loading and 1* to fix the loading of the marker variable at 1 (NA*happy is redundant here, because only the first loading is fixed by default, but it does no harm):

mod1b<-'
posAffect =~ NA*glad + 1*cheerful + NA*happy
'

Here we store the fitted model in ‘fit1b’ and print its summary.

fit1b = lavaan::sem(mod1b, data = cfaData, fixed.x=FALSE)
summary(fit1b, standardized = T)

## lavaan 0.6-12 ended normally after 19 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         6
## 
##   Number of observations                          1000
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   posAffect =~                                                          
##     glad              0.895    0.048   18.782    0.000    0.693    0.705
##     cheerful          1.000                               0.774    0.787
##     happy             0.954    0.050   19.130    0.000    0.739    0.757
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .glad              0.485    0.030   16.238    0.000    0.485    0.503
##    .cheerful          0.367    0.030   12.062    0.000    0.367    0.380
##    .happy             0.407    0.030   13.751    0.000    0.407    0.427
##     posAffect         0.599    0.048   12.616    0.000    1.000    1.000

The loadings can be obtained by dividing those in fit1 by 1.117 (i.e., they change proportionally).
The unique factor variances remain unchanged, but the latent factor variance is rescaled: 0.480 × 1.117² ≈ 0.599.
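
The rescaling can be checked with a few lines of arithmetic. This sketch starts from the rounded fit1 estimates printed earlier and converts them to the cheerful-as-marker scale; the object names are illustrative.

```r
# Estimates from fit1 (glad is the marker), rounded as printed
loadings_fit1 <- c(glad = 1.000, cheerful = 1.117, happy = 1.066)
psi_fit1      <- 0.480

# Rescaling constant: the old loading of the new marker variable
k <- loadings_fit1[["cheerful"]]

# Loadings shrink by k, the factor variance grows by k^2
loadings_fit1b <- loadings_fit1 / k
psi_fit1b      <- psi_fit1 * k^2

print(round(loadings_fit1b, 3))
print(round(psi_fit1b, 3))
```

The products loading × loading × factor variance are the same on both scales, which is why the fit and the unique factor variances do not change.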

8.3 PART II: One-Factor CFA, Fixed Factor Variance

8.3.1 Fixed Factor Method

Keep using the same syntax but assign a new name mod2:

mod2<-'
posAffect =~ glad + cheerful + happy
'

To fix the variance of the latent variable at 1, add std.lv=T to the sem() call:

fit2<-lavaan::sem(mod2, data = cfaData, fixed.x=FALSE, std.lv=T)
summary(fit2, standardized = TRUE)

## lavaan 0.6-12 ended normally after 11 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         6
## 
##   Number of observations                          1000
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   posAffect =~                                                          
##     glad              0.693    0.031   22.540    0.000    0.693    0.705
##     cheerful          0.774    0.031   25.233    0.000    0.774    0.787
##     happy             0.739    0.030   24.226    0.000    0.739    0.757
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .glad              0.485    0.030   16.238    0.000    0.485    0.503
##    .cheerful          0.367    0.030   12.062    0.000    0.367    0.380
##    .happy             0.407    0.030   13.751    0.000    0.407    0.427
##     posAffect         1.000                               1.000    1.000
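
The same constraint can also be written directly in the model syntax instead of using the std.lv argument. The sketch below assumes lavaan is installed and uses simulated stand-in data (`toy`), not the lab dataset; the object names `mod_explicit`, `fit_arg`, and `fit_syntax` are illustrative.

```r
library(lavaan)

# Simulated stand-in data (NOT cfaInclassData.csv): one factor, three indicators
set.seed(123)
n   <- 500
eta <- rnorm(n)
toy <- data.frame(
  glad     = 0.70 * eta + rnorm(n, sd = 0.70),
  cheerful = 0.80 * eta + rnorm(n, sd = 0.60),
  happy    = 0.75 * eta + rnorm(n, sd = 0.65)
)

# Fixed factor method via the std.lv argument
mod <- 'posAffect =~ glad + cheerful + happy'
fit_arg <- sem(mod, data = toy, std.lv = TRUE)

# Fixed factor method written out in the model syntax:
# NA* frees the first loading, and ~~ with 1* fixes the factor variance at 1
mod_explicit <- '
posAffect =~ NA*glad + cheerful + happy
posAffect ~~ 1*posAffect
'
fit_syntax <- sem(mod_explicit, data = toy)

# Free parameter estimates from the two fits, side by side
print(round(cbind(std.lv = coef(fit_arg), syntax = coef(fit_syntax)), 3))
```

The two columns should agree up to optimizer tolerance, since both specifications describe the same model.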

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  posAffect =~                                                          
    glad              0.693    0.031   22.540    0.000    0.693    0.705
    cheerful          0.774    0.031   25.233    0.000    0.774    0.787
    happy             0.739    0.030   24.226    0.000    0.739    0.757

A 1-SD change in the factor (posAffect) produces:
- 0.693-unit change in glad (on its raw scale)
- 0.774-unit change in cheerful (on its raw scale)
- 0.739-unit change in happy (on its raw scale)

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .glad              0.485    0.030   16.238    0.000    0.485    0.503
   .cheerful          0.367    0.030   12.062    0.000    0.367    0.380
   .happy             0.407    0.030   13.751    0.000    0.407    0.427
    posAffect         1.000                               1.000    1.000

We see that posAffect now has a variance (and therefore an SD) of 1.
All loadings are freely estimated; none is fixed at 1.
The unique factor variances are the same as before.
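
The two scaling methods are equivalent because they imply the same covariance matrix for the indicators. As a rough check, the sketch below rebuilds the model-implied covariance matrix (factor variance × loadings × loadings' + diagonal unique variances) from the rounded fit1 and fit2 estimates printed above; `implied_cov` is a hypothetical helper written for this illustration, not a lavaan function.

```r
# Model-implied covariance matrix of a one-factor model:
# Sigma = psi * lambda %*% t(lambda) + diag(theta)
implied_cov <- function(lambda, psi, theta) {
  psi * tcrossprod(lambda) + diag(theta)
}

# Unique factor variances (identical in both fits)
theta <- c(0.485, 0.367, 0.407)

# Fixed loading method (fit1): glad loading fixed at 1, factor variance estimated
sigma_marker <- implied_cov(lambda = c(1.000, 1.117, 1.066), psi = 0.480, theta = theta)

# Fixed factor method (fit2): factor variance fixed at 1, all loadings estimated
sigma_factor <- implied_cov(lambda = c(0.693, 0.774, 0.739), psi = 1.000, theta = theta)

# Largest discrepancy between the two implied matrices (rounding error only)
max_diff <- max(abs(sigma_marker - sigma_factor))
print(round(sigma_marker, 3))
print(round(sigma_factor, 3))
print(max_diff)
```

The remaining differences come from using estimates rounded to three decimals; with full-precision estimates the two matrices would coincide.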

8.4 Exercise: One-factor CFA Model

Could you use the indicators satisfied, content, and comfortable to build a one-factor CFA model to measure a latent variable called Satisfaction?

Use the Fixed Loading and the Fixed Factor Methods and compare their estimates.

R Cookbook for Structural Equation Modeling