This sample solution was prepared by Peter Hähner (Ruhr-Universität Bochum).

(c) Luhmann: R für Einsteiger, 5th ed., Beltz, 2020

Preparation

Set a working directory or create a corresponding R project (Ch. 23).

Then load the file erstis.RData.

load("erstis.RData")

Load the required packages (you may need to install them first).

library(psych)
library(tidyverse)
library(GPArotation)
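
If it is unclear whether the packages are already installed, a check along the following lines could be run before the library() calls above. This is only a sketch: the object names needed, is_installed, and to_install are made up for illustration, and the install line is left commented out because it needs internet access.

```r
# Packages used in this solution
needed <- c("psych", "tidyverse", "GPArotation")

# requireNamespace() returns FALSE (instead of raising an error)
# when a package is not installed
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that still have to be installed (possibly none)
to_install <- needed[!is_installed]
to_install

# Uncomment to install whatever is missing
# if (length(to_install) > 0) install.packages(to_install)
```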

Exercise 1

Conduct an item analysis for the life satisfaction items (lz13 to lz17). Which of the five items could most readily be dropped?

# Store the variables in a data frame
lezu <- select(erstis, lz13, lz14, lz15, lz16, lz17)

# Run the item analysis
psych::alpha(lezu)
## 
## Reliability analysis   
## Call: psych::alpha(x = lezu)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd median_r
##       0.86      0.86    0.84      0.56 6.3 0.017  4.9 1.1     0.53
## 
##  lower alpha upper     95% confidence boundaries
## 0.82 0.86 0.89 
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se  var.r med.r
## lz13      0.80      0.81    0.77      0.51 4.2    0.023 0.0054  0.51
## lz14      0.83      0.83    0.80      0.56 5.0    0.021 0.0058  0.53
## lz15      0.82      0.82    0.79      0.54 4.7    0.022 0.0065  0.53
## lz16      0.83      0.84    0.81      0.57 5.3    0.021 0.0123  0.59
## lz17      0.85      0.86    0.82      0.60 6.0    0.017 0.0062  0.61
## 
##  Item statistics 
##        n raw.r std.r r.cor r.drop mean  sd
## lz13 189  0.86  0.86  0.84   0.77  4.8 1.4
## lz14 189  0.79  0.80  0.74   0.67  5.0 1.3
## lz15 189  0.81  0.83  0.78   0.71  5.3 1.2
## lz16 189  0.79  0.78  0.70   0.65  5.0 1.5
## lz17 188  0.76  0.74  0.63   0.59  4.5 1.7
## 
## Non missing response frequency for each item
##         1    2    3    4    5    6    7 miss
## lz13 0.02 0.05 0.10 0.18 0.28 0.32 0.05 0.01
## lz14 0.01 0.05 0.07 0.19 0.28 0.31 0.09 0.01
## lz15 0.01 0.03 0.06 0.13 0.20 0.46 0.11 0.01
## lz16 0.02 0.04 0.13 0.16 0.20 0.32 0.13 0.01
## lz17 0.04 0.12 0.10 0.24 0.19 0.19 0.12 0.02

Item lz17 is the most plausible candidate for removal: it has the lowest item discrimination (r.drop = .59), and dropping it affects Cronbach's alpha the least (raw_alpha = .85 without lz17, compared with .86 for the full scale).
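
To make the two criteria concrete, the following sketch computes the corrected item-total correlation (the quantity reported as r.drop) and alpha-if-item-dropped by hand in base R. It uses simulated data, not the erstis data, and it assumes complete cases; the names items, cronbach_alpha, r_drop, and alpha_if_dropped are illustrative only. psych::alpha() handles missing values and reports more statistics, so its numbers need not match a hand calculation exactly on real data.

```r
# Simulated stand-in for five items measuring one trait (NOT the erstis data)
set.seed(1)
n <- 200
trait <- rnorm(n)
items <- data.frame(
  item1 = trait + rnorm(n, sd = 0.6),
  item2 = trait + rnorm(n, sd = 0.8),
  item3 = trait + rnorm(n, sd = 0.8),
  item4 = trait + rnorm(n, sd = 1.0),
  item5 = trait + rnorm(n, sd = 1.5)  # noisiest item by construction
)

# Cronbach's alpha from the covariance matrix of the items
cronbach_alpha <- function(x) {
  k <- ncol(x)
  v <- var(x)
  (k / (k - 1)) * (1 - sum(diag(v)) / sum(v))
}

# Corrected item-total correlation: each item vs. the sum of the other items
r_drop <- sapply(seq_along(items), function(i) {
  cor(items[[i]], rowSums(items[, -i, drop = FALSE]))
})
names(r_drop) <- names(items)

# Alpha of the remaining scale when each item is left out in turn
alpha_if_dropped <- sapply(seq_along(items), function(i) {
  cronbach_alpha(items[, -i, drop = FALSE])
})
names(alpha_if_dropped) <- names(items)

round(cbind(r_drop, alpha_if_dropped), 2)
```

An item that has a low r_drop and whose removal leaves alpha nearly unchanged is the natural candidate for removal, which is the reasoning applied to lz17 above.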

Exercise 2

Examine the factor structure of the mood items (stim1 to stim12). First, collect these items in a new data frame that contains no missing values. Then determine the number of factors to extract using a parallel analysis, and run an exploratory factor analysis with maximum likelihood estimation and oblique rotation.

# Store the variables in a data frame and remove missing values
stimmung <- na.omit(select(erstis, contains("stim")))

# Parallel analysis
fa.parallel(stimmung)

## Parallel analysis suggests that the number of factors =  4  and the number of components =  3

The parallel analysis suggests four factors and three components, so a solution with either three or four factors is plausible.
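
The idea behind parallel analysis can be sketched in base R: eigenvalues of the observed correlation matrix are compared with eigenvalues obtained from random data of the same dimensions. The following is a simplified, component-based variant on simulated data with two built-in factors. It is not a reimplementation of fa.parallel(), which also reports a factor-based count; the names sim, obs_eigen, rand_eigen, threshold, and n_components and the choice of 100 iterations are assumptions made for illustration.

```r
# Simulated data: 8 variables driven by 2 uncorrelated factors (NOT the erstis data)
set.seed(2)
n <- 200
f1 <- rnorm(n)
f2 <- rnorm(n)
sim <- cbind(
  sapply(1:4, function(i) f1 + rnorm(n)),
  sapply(1:4, function(i) f2 + rnorm(n))
)

# Eigenvalues of the observed correlation matrix
obs_eigen <- eigen(cor(sim), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues of correlation matrices from pure noise of the same size
n_iter <- 100
rand_eigen <- replicate(n_iter, {
  noise <- matrix(rnorm(n * ncol(sim)), nrow = n)
  eigen(cor(noise), symmetric = TRUE, only.values = TRUE)$values
})

# 95th percentile of the random eigenvalues at each position
threshold <- apply(rand_eigen, 1, quantile, probs = 0.95)

# Keep components up to the first one that does not exceed its threshold
above <- obs_eigen > threshold
n_components <- if (all(above)) length(above) else which(!above)[1] - 1
n_components
```

With real data, the factor-based and component-based counts can disagree, as they do here for the mood items.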

Both solutions are presented below.

# 3-factor solution
faktoren.3 <- fa(stimmung, nfactors = 3, 
                 fm = "ml", 
                 rotate = "promax")

print(faktoren.3, digits = 2, cut = .3, sort = TRUE)
## Factor Analysis using method =  ml
## Call: fa(r = stimmung, nfactors = 3, rotate = "promax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##        item   ML3   ML1   ML2   h2   u2 com
## stim3     3  0.91             0.65 0.35 1.1
## stim9     9  0.80             0.60 0.40 1.1
## stim6     6 -0.62             0.42 0.58 1.3
## stim12   12 -0.53             0.50 0.50 1.4
## stim11   11  0.39 -0.34       0.43 0.57 2.0
## stim8     8        0.98       0.75 0.25 1.1
## stim1     1        0.79       0.60 0.40 1.1
## stim10   10        0.42 -0.38 0.36 0.64 2.2
## stim7     7              0.92 0.70 0.30 1.1
## stim5     5              0.68 0.43 0.57 1.1
## stim2     2             -0.59 0.55 0.45 1.3
## stim4     4                   0.38 0.62 2.9
## 
##                        ML3  ML1  ML2
## SS loadings           2.43 2.02 1.91
## Proportion Var        0.20 0.17 0.16
## Cumulative Var        0.20 0.37 0.53
## Proportion Explained  0.38 0.32 0.30
## Cumulative Proportion 0.38 0.70 1.00
## 
##  With factor correlations of 
##       ML3   ML1   ML2
## ML3  1.00 -0.60  0.43
## ML1 -0.60  1.00 -0.46
## ML2  0.43 -0.46  1.00
## 
## Mean item complexity =  1.5
## Test of the hypothesis that 3 factors are sufficient.
## 
## The degrees of freedom for the null model are  66  and the objective function was  4.82 with Chi Square of  873.13
## The degrees of freedom for the model are 33  and the objective function was  0.42 
## 
## The root mean square of the residuals (RMSR) is  0.04 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic number of observations is  187 with the empirical chi square  41.2  with prob <  0.15 
## The total number of observations was  187  with Likelihood Chi Square =  74.44  with prob <  4.9e-05 
## 
## Tucker Lewis Index of factoring reliability =  0.896
## RMSEA index =  0.082  and the 90 % confidence intervals are  0.057 0.107
## BIC =  -98.18
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                    ML3  ML1  ML2
## Correlation of (regression) scores with factors   0.93 0.93 0.91
## Multiple R square of scores with factors          0.86 0.87 0.83
## Minimum correlation of possible factor scores     0.72 0.74 0.67

# 4-factor solution
faktoren.4 <- fa(stimmung, 
                 nfactors = 4, 
                 fm = "ml", 
                 rotate = "promax")

print(faktoren.4, digits = 2, cut = .3, sort = TRUE)
## Factor Analysis using method =  ml
## Call: fa(r = stimmung, nfactors = 4, rotate = "promax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##        item   ML3   ML2   ML1   ML4   h2   u2 com
## stim8     8  0.90                   0.70 0.30 1.1
## stim1     1  0.90                   0.67 0.33 1.1
## stim4     4 -0.40                   0.43 0.57 2.9
## stim11   11 -0.40                   0.41 0.59 1.9
## stim7     7        0.92             0.69 0.31 1.1
## stim5     5        0.68             0.43 0.57 1.1
## stim2     2       -0.60             0.56 0.44 1.3
## stim10   10  0.35 -0.39             0.37 0.63 2.7
## stim3     3              0.96       0.80 0.20 1.0
## stim9     9              0.60       0.55 0.45 1.3
## stim12   12                    0.90 0.83 0.17 1.0
## stim6     6                    0.53 0.47 0.53 1.7
## 
##                        ML3  ML2  ML1  ML4
## SS loadings           1.99 1.90 1.70 1.32
## Proportion Var        0.17 0.16 0.14 0.11
## Cumulative Var        0.17 0.32 0.47 0.58
## Proportion Explained  0.29 0.28 0.25 0.19
## Cumulative Proportion 0.29 0.56 0.81 1.00
## 
##  With factor correlations of 
##       ML3   ML2   ML1   ML4
## ML3  1.00 -0.50 -0.50  0.59
## ML2 -0.50  1.00  0.41 -0.34
## ML1 -0.50  0.41  1.00 -0.60
## ML4  0.59 -0.34 -0.60  1.00
## 
## Mean item complexity =  1.5
## Test of the hypothesis that 4 factors are sufficient.
## 
## The degrees of freedom for the null model are  66  and the objective function was  4.82 with Chi Square of  873.13
## The degrees of freedom for the model are 24  and the objective function was  0.23 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.05 
## 
## The harmonic number of observations is  187 with the empirical chi square  22.07  with prob <  0.57 
## The total number of observations was  187  with Likelihood Chi Square =  40.62  with prob <  0.018 
## 
## Tucker Lewis Index of factoring reliability =  0.942
## RMSEA index =  0.061  and the 90 % confidence intervals are  0.025 0.093
## BIC =  -84.92
## Fit based upon off diagonal values = 0.99
## Measures of factor score adequacy             
##                                                    ML3  ML2  ML1  ML4
## Correlation of (regression) scores with factors   0.93 0.91 0.93 0.93
## Multiple R square of scores with factors          0.86 0.84 0.87 0.86
## Minimum correlation of possible factor scores     0.73 0.67 0.74 0.72

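In this case, one would probably choose the 4-factor solution because its loading pattern is easier to interpret. Fewer items load on two factors: with a cutoff of .30, only stim10 cross-loads in the 4-factor solution, whereas stim10 and stim11 cross-load in the 3-factor solution and stim4 has no loading above the cutoff on any factor.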
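
The comparison of the loading patterns can be checked with a short count. The sketch below re-enters the loadings printed above by hand; loadings suppressed by cut = .3 are not visible in the output and are therefore coded as NA. The helper count_salient and the object names are illustrative. In a live session one could presumably start from unclass(faktoren.3$loadings) instead of typing the values in.

```r
# Items in the order they are entered below
items <- paste0("stim", c(3, 9, 6, 12, 11, 8, 1, 10, 7, 5, 2, 4))

# Printed loadings of the 3-factor solution (NA = suppressed, i.e. |loading| < .30)
load3 <- matrix(NA_real_, nrow = 12, ncol = 3,
                dimnames = list(items, c("ML3", "ML1", "ML2")))
load3[c("stim3", "stim9", "stim6", "stim12", "stim11"), "ML3"] <-
  c(0.91, 0.80, -0.62, -0.53, 0.39)
load3[c("stim11", "stim8", "stim1", "stim10"), "ML1"] <-
  c(-0.34, 0.98, 0.79, 0.42)
load3[c("stim10", "stim7", "stim5", "stim2"), "ML2"] <-
  c(-0.38, 0.92, 0.68, -0.59)

# Printed loadings of the 4-factor solution
load4 <- matrix(NA_real_, nrow = 12, ncol = 4,
                dimnames = list(items, c("ML3", "ML2", "ML1", "ML4")))
load4[c("stim8", "stim1", "stim4", "stim11", "stim10"), "ML3"] <-
  c(0.90, 0.90, -0.40, -0.40, 0.35)
load4[c("stim7", "stim5", "stim2", "stim10"), "ML2"] <-
  c(0.92, 0.68, -0.60, -0.39)
load4[c("stim3", "stim9"), "ML1"] <- c(0.96, 0.60)
load4[c("stim12", "stim6"), "ML4"] <- c(0.90, 0.53)

# Number of salient loadings per item
count_salient <- function(L, cut = 0.3) rowSums(!is.na(L) & abs(L) >= cut)

n_salient3 <- count_salient(load3)
n_salient4 <- count_salient(load4)

# Items loading on more than one factor, and items loading on none
cross3 <- names(which(n_salient3 > 1))
none3  <- names(which(n_salient3 == 0))
cross4 <- names(which(n_salient4 > 1))
none4  <- names(which(n_salient4 == 0))

list(cross3 = cross3, none3 = none3, cross4 = cross4, none4 = none4)
```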