Description of Experiment and Data
Singmann and Klauer (2011) were interested in whether conditional reasoning can be explained by a single process or whether multiple processes are necessary to explain it. To provide evidence for multiple processes, we aimed to establish a double dissociation of two variables: instruction type and problem type. Instruction type was manipulated between-subjects: one group of participants received deductive instructions (i.e., to treat the premises as given and only draw necessary conclusions) and a second group received probabilistic instructions (i.e., to reason as in an everyday situation; we called this “inductive instruction” in the manuscript). Problem type consisted of two orthogonally crossed variables that were manipulated within-subjects: validity of the problem (formally valid or formally invalid) and plausibility of the problem (inferences that were consistent with background knowledge versus inferences that were inconsistent with background knowledge). The critical comparison across the two conditions was between problems that were valid and implausible and problems that were invalid and plausible. For example, the following problem was invalid and plausible:
If a person is wet, then the person fell into a swimming pool.
A person fell into a swimming pool.
How valid is the conclusion/How likely is it that the person is wet?
For those problems we predicted that under deductive instructions responses should be lower (as the conclusion does not necessarily follow from the premises) than under probabilistic instructions. For valid but implausible problems, an example of which is presented next, we predicted the opposite pattern:
If a person is wet, then the person fell into a swimming pool.
A person is wet.
How valid is the conclusion/How likely is it that the person fell into a swimming pool?
Our study also included valid and plausible and invalid and implausible problems.
In contrast to the analysis reported in the manuscript, we initially do not separate the analysis into affirmation and denial problems, but first report an analysis of the full set of inferences (MP, MT, AC, and DA), where MP and MT are valid and AC and DA are invalid. We report a reanalysis of our Experiment 1 only. Note that the factor plausibility is not present in the original manuscript; there it is the result of a combination of other factors.
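The relation between the four inferences and the two factors validity and what (affirmation versus denial) can be written down as a small lookup table. The sketch below uses base R only; the object name inference_map is made up for illustration, and the level names follow those shown in the str() output of the data.

```r
# Lookup table: how each inference maps onto validity and problem form.
# MP and MT are the valid inferences, AC and DA the invalid ones;
# MP and AC are affirmation problems, MT and DA are denial problems.
inference_map <- data.frame(
  inference = c("MP", "MT", "AC", "DA"),
  validity  = c("valid", "valid", "invalid", "invalid"),
  what      = c("affirmation", "denial", "affirmation", "denial")
)
inference_map

# Crossing validity and what recovers exactly the four inferences,
# one per cell.
table(inference_map$validity, inference_map$what)
```

This is why an analysis with the single four-level factor inference and an analysis with the two two-level factors validity and what describe the same cells of the design.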
Data and R Preparation
We begin by loading the packages we will be using throughout.
library("afex") # needed for ANOVA functions.
library("emmeans") # emmeans must now be loaded explicitly for follow-up tests.
library("multcomp") # for advanced control for multiple testing/Type 1 errors.
library("ggplot2") # for customizing plots.
afex_options(emmeans_model = "multivariate") # use multivariate model for all follow-up tests.
Note that for ANOVAs involving repeated-measures factors, follow-up tests based on the multivariate model are generally preferable to univariate follow-up tests. Consequently, we set this option globally. Future versions of afex will likely use the multivariate model as the default.
data(sk2011.1)
str(sk2011.1)
## 'data.frame': 640 obs. of 9 variables:
## $ id : Factor w/ 40 levels "8","9","10","12",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ instruction : Factor w/ 2 levels "deductive","probabilistic": 2 2 2 2 2 2 2 2 2 2 ...
## $ plausibility: Factor w/ 2 levels "plausible","implausible": 1 2 2 1 2 1 1 2 1 2 ...
## $ inference : Factor w/ 4 levels "MP","MT","AC",..: 4 2 1 3 4 2 1 3 4 2 ...
## $ validity : Factor w/ 2 levels "valid","invalid": 2 1 1 2 2 1 1 2 2 1 ...
## $ what : Factor w/ 2 levels "affirmation",..: 2 2 1 1 2 2 1 1 2 2 ...
## $ type : Factor w/ 2 levels "original","reversed": 2 2 2 2 1 1 1 1 2 2 ...
## $ response : int 100 60 94 70 100 99 98 49 82 50 ...
## $ content : Factor w/ 4 levels "C1","C2","C3",..: 1 1 1 1 2 2 2 2 3 3 ...
An important feature of the data is that each participant provided two responses for each cell of the design (the content differs between the two, and each participant saw all four contents). These two data points will be aggregated automatically by afex.
with(sk2011.1, table(inference, id, plausibility))
## , , plausibility = plausible
##
## id
## inference 8 9 10 12 13 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
## MP 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## MT 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## AC 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## DA 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## id
## inference 37 38 39 40 41 42 43 44 46 47 48 49 50
## MP 2 2 2 2 2 2 2 2 2 2 2 2 2
## MT 2 2 2 2 2 2 2 2 2 2 2 2 2
## AC 2 2 2 2 2 2 2 2 2 2 2 2 2
## DA 2 2 2 2 2 2 2 2 2 2 2 2 2
##
## , , plausibility = implausible
##
## id
## inference 8 9 10 12 13 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
## MP 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## MT 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## AC 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## DA 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## id
## inference 37 38 39 40 41 42 43 44 46 47 48 49 50
## MP 2 2 2 2 2 2 2 2 2 2 2 2 2
## MT 2 2 2 2 2 2 2 2 2 2 2 2 2
## AC 2 2 2 2 2 2 2 2 2 2 2 2 2
## DA 2 2 2 2 2 2 2 2 2 2 2 2 2
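The aggregation over the two responses per cell can be mimicked in base R. The following is a minimal sketch on hypothetical toy data (the data frame toy and its values are invented for illustration); it only shows the idea of averaging within each participant-by-condition cell and is not the code afex runs internally.

```r
# Toy data (hypothetical): 2 participants x 2 inferences x 2 contents,
# i.e., two responses per participant-by-inference cell.
toy <- data.frame(
  id        = rep(c("p1", "p2"), each = 4),
  inference = rep(rep(c("MP", "AC"), each = 2), times = 2),
  content   = rep(c("C1", "C2"), times = 4),
  response  = c(100, 90, 60, 40, 80, 70, 50, 30)
)

# Average over content so that each cell holds a single value.
cell_means <- aggregate(response ~ id + inference, data = toy, FUN = mean)
cell_means
```

After this step every participant contributes exactly one value per cell, which is the data layout a repeated-measures ANOVA expects.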
ANOVA
To get the full ANOVA table for the model, we simply pass the data to aov_ez, specifying the design as described above. We save the returned object for further analysis.
a1 <- aov_ez("id", "response", sk2011.1, between = "instruction",
             within = c("inference", "plausibility"))
## Warning: More than one observation per cell, aggregating the data using mean (i.e,
## fun_aggregate = mean)!
## Contrasts set to contr.sum for the following variables: instruction
a1 # the default print method prints a data.frame produced by nice
## Anova Table (Type 3 tests)
##
## Response: response
## Effect df MSE F ges p.value
## 1 instruction 1, 38 2027.42 0.31 .003 .58
## 2 inference 2.66, 101.12 959.12 5.81 ** .06 .002
## 3 instruction:inference 2.66, 101.12 959.12 6.00 ** .07 .001
## 4 plausibility 1, 38 468.82 34.23 *** .07 <.0001
## 5 instruction:plausibility 1, 38 468.82 10.67 ** .02 .002
## 6 inference:plausibility 2.29, 87.11 318.91 2.87 + .009 .06
## 7 instruction:inference:plausibility 2.29, 87.11 318.91 3.98 * .01 .02
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
##
## Sphericity correction method: GG
The equivalent calls (i.e., calls producing exactly the same output) for the other two ANOVA functions, aov_car and aov_4, are shown below.
aov_car(response ~ instruction + Error(id/inference*plausibility), sk2011.1)
aov_4(response ~ instruction + (inference*plausibility|id), sk2011.1)
As mentioned before, the two responses per cell of the design and participant are aggregated for the analysis, as indicated by the warning message. Furthermore, the degrees of freedom are Greenhouse-Geisser corrected by default for all effects involving inference, as inference is a within-subjects factor with more than two levels (i.e., MP, MT, AC, and DA). In line with our expectations, the three-way interaction is significant.
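The effect of the Greenhouse-Geisser correction can be read off the ANOVA table: both degrees of freedom of an affected effect are multiplied by the same factor epsilon. The sketch below recovers that factor from the rounded values printed for inference; the variable names are chosen for illustration only, and the result is approximate because the printed df are rounded.

```r
# Uncorrected degrees of freedom for a within-subjects factor with
# k = 4 levels and 38 error df from the between-subjects part.
k      <- 4
df_num <- k - 1          # 3
df_den <- (k - 1) * 38   # 114

# Corrected df as printed in the ANOVA table for 'inference'.
df_num_gg <- 2.66
df_den_gg <- 101.12

# The Greenhouse-Geisser epsilon is the common shrinkage factor.
eps_from_num <- df_num_gg / df_num
eps_from_den <- df_den_gg / df_den
round(c(eps_from_num, eps_from_den), 2)
```

An epsilon of 1 would indicate no correction; values below 1 reflect the estimated departure from sphericity.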
The table printed by default for afex_aov objects (produced by nice) can also be printed nicely using knitr:
knitr::kable(nice(a1))
| Effect | df | MSE | F | ges | p.value |
|:---|:---|:---|:---|:---|:---|
| instruction | 1, 38 | 2027.42 | 0.31 | .003 | .58 |
| inference | 2.66, 101.12 | 959.12 | 5.81 ** | .06 | .002 |
| instruction:inference | 2.66, 101.12 | 959.12 | 6.00 ** | .07 | .001 |
| plausibility | 1, 38 | 468.82 | 34.23 *** | .07 | <.0001 |
| instruction:plausibility | 1, 38 | 468.82 | 10.67 ** | .02 | .002 |
| inference:plausibility | 2.29, 87.11 | 318.91 | 2.87 + | .009 | .06 |
| instruction:inference:plausibility | 2.29, 87.11 | 318.91 | 3.98 * | .01 | .02 |
Alternatively, the anova method for afex_aov objects returns a data.frame of class anova that can be passed to, for example, xtable for nice formatting:
print(xtable::xtable(anova(a1), digits = c(rep(2, 5), 3, 4)), type = "html")

|  | num Df | den Df | MSE | F | ges | Pr(>F) |
|:---|---:|---:|---:|---:|---:|---:|
| instruction | 1.00 | 38.00 | 2027.42 | 0.31 | 0.003 | 0.5830 |
| inference | 2.66 | 101.12 | 959.12 | 5.81 | 0.063 | 0.0016 |
| instruction:inference | 2.66 | 101.12 | 959.12 | 6.00 | 0.065 | 0.0013 |
| plausibility | 1.00 | 38.00 | 468.82 | 34.23 | 0.068 | 0.0000 |
| instruction:plausibility | 1.00 | 38.00 | 468.82 | 10.67 | 0.022 | 0.0023 |
| inference:plausibility | 2.29 | 87.11 | 318.91 | 2.87 | 0.009 | 0.0551 |
| instruction:inference:plausibility | 2.29 | 87.11 | 318.91 | 3.98 | 0.013 | 0.0177 |
Post-Hoc Contrasts and Plotting
To further analyze the data, we pass the fitted object to the package emmeans, which offers extensive functionality for both plotting and contrasts of all kinds. A lot of information on emmeans can be obtained from its vignettes and FAQ. emmeans can work with afex_aov objects directly, as afex comes with the necessary methods for the generic functions defined in emmeans. When using the multivariate option as described above, emmeans uses the ANOVA model estimated via base R’s lm function (which in the case of a multivariate response is an object of class c("mlm", "lm")). In the default setting (i.e., emmeans_model = "univariate"), emmeans uses the object created by base R’s aov function, which for now is also part of an afex_aov object.
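The two kinds of model objects mentioned here can be illustrated with base R alone. The sketch below fits both to hypothetical toy data (two repeated measures y1 and y2 and a between-subjects factor g, all invented for illustration); it shows only which classes lm and aov return, not how afex constructs its internal models.

```r
# Hypothetical toy data: two repeated measures (y1, y2) and one
# between-subjects factor g, for 10 participants.
set.seed(1)
d <- data.frame(
  g  = factor(rep(c("a", "b"), each = 5)),
  y1 = rnorm(10),
  y2 = rnorm(10)
)

# A matrix on the left-hand side makes lm() fit a multivariate model.
fit_mlm <- lm(cbind(y1, y2) ~ g, data = d)
class(fit_mlm)

# The univariate alternative needs the data in long format and an
# Error() term that identifies the participant.
d_long <- data.frame(
  id = factor(rep(1:10, times = 2)),
  g  = rep(d$g, times = 2),
  rm = factor(rep(c("y1", "y2"), each = 10)),
  y  = c(d$y1, d$y2)
)
fit_aov <- aov(y ~ g * rm + Error(id / rm), data = d_long)
class(fit_aov)
```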
Some First Contrasts
Main Effects Only
The ANOVA object can now be passed to emmeans, for example to obtain the marginal means of the four inferences:
m1 <- emmeans(a1, ~ inference)
m1
## inference emmean SE df lower.CL upper.CL
## MP 87.51250 1.797265 38 83.87413 91.15087
## MT 76.68125 4.064950 38 68.45219 84.91031
## AC 69.41250 4.771297 38 59.75351 79.07149
## DA 82.95625 3.837620 38 75.18740 90.72510
##
## Results are averaged over the levels of: instruction, plausibility
## Confidence level used: 0.95
This object can now also be used to test for differences between the levels of the factor:
pairs(m1)
## contrast estimate SE df t.ratio p.value
## MP - MT 10.83125 4.331479 38 2.501 0.0759
## MP - AC 18.10000 5.017994 38 3.607 0.0047
## MP - DA 4.55625 4.196484 38 1.086 0.7002
## MT - AC 7.26875 3.983558 38 1.825 0.2778
## MT - DA -6.27500 4.702592 38 -1.334 0.5473
## AC - DA -13.54375 5.299024 38 -2.556 0.0672
##
## Results are averaged over the levels of: instruction, plausibility
## P value adjustment: tukey method for comparing a family of 4 estimates
To obtain more powerful p-value adjustments, we can furthermore pass it to multcomp (Bretz, Hothorn, & Westfall, 2011):
summary(as.glht(pairs(m1)), test=adjusted("free"))
##
## Simultaneous Tests for General Linear Hypotheses
##
## Linear Hypotheses:
## Estimate Std. Error t value Pr(>|t|)
## MP - MT == 0 10.831 4.331 2.501 0.05907 .
## MP - AC == 0 18.100 5.018 3.607 0.00457 **
## MP - DA == 0 4.556 4.196 1.086 0.31350
## MT - AC == 0 7.269 3.984 1.825 0.19414
## MT - DA == 0 -6.275 4.703 -1.334 0.31350
## AC - DA == 0 -13.544 5.299 -2.556 0.05907 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- free method)
A Simple Interaction
We might also be interested in the marginal means of the inferences across the two instruction types. emmeans offers two ways to do so. The first splits the contrasts across levels of the factor using the by argument.
m2 <- emmeans(a1, "inference", by = "instruction")
## equal: emmeans(a1, ~ inference|instruction)
m2
## instruction = deductive:
## inference emmean SE df lower.CL upper.CL
## MP 97.2875 2.541716 38 92.14206 102.43294
## MT 70.4000 5.748708 38 58.76235 82.03765
## AC 61.4875 6.747633 38 47.82763 75.14737
## DA 81.8125 5.427214 38 70.82568 92.79932
##
## instruction = probabilistic:
## inference emmean SE df lower.CL upper.CL
## MP 77.7375 2.541716 38 72.59206 82.88294
## MT 82.9625 5.748708 38 71.32485 94.60015
## AC 77.3375 6.747633 38 63.67763 90.99737
## DA 84.1000 5.427214 38 73.11318 95.08682
##
## Results are averaged over the levels of: plausibility
## Confidence level used: 0.95
Consequently, tests are also only performed within each level of the by factor:
pairs(m2)
## instruction = deductive:
## contrast estimate SE df t.ratio p.value
## MP - MT 26.8875 6.125636 38 4.389 0.0005
## MP - AC 35.8000 7.096515 38 5.045 0.0001
## MP - DA 15.4750 5.934724 38 2.608 0.0599
## MT - AC 8.9125 5.633601 38 1.582 0.4007
## MT - DA -11.4125 6.650469 38 -1.716 0.3297
## AC - DA -20.3250 7.493951 38 -2.712 0.0471
##
## instruction = probabilistic:
## contrast estimate SE df t.ratio p.value
## MP - MT -5.2250 6.125636 38 -0.853 0.8287
## MP - AC 0.4000 7.096515 38 0.056 0.9999
## MP - DA -6.3625 5.934724 38 -1.072 0.7084
## MT - AC 5.6250 5.633601 38 0.998 0.7512
## MT - DA -1.1375 6.650469 38 -0.171 0.9982
## AC - DA -6.7625 7.493951 38 -0.902 0.8036
##
## Results are averaged over the levels of: plausibility
## P value adjustment: tukey method for comparing a family of 4 estimates
The second version considers all factor levels together. Consequently, the number of pairwise comparisons is a lot larger:
m3 <- emmeans(a1, c("inference", "instruction"))
## equal: emmeans(a1, ~inference*instruction)
m3
## inference instruction emmean SE df lower.CL upper.CL
## MP deductive 97.2875 2.541716 38 92.14206 102.43294
## MT deductive 70.4000 5.748708 38 58.76235 82.03765
## AC deductive 61.4875 6.747633 38 47.82763 75.14737
## DA deductive 81.8125 5.427214 38 70.82568 92.79932
## MP probabilistic 77.7375 2.541716 38 72.59206 82.88294
## MT probabilistic 82.9625 5.748708 38 71.32485 94.60015
## AC probabilistic 77.3375 6.747633 38 63.67763 90.99737
## DA probabilistic 84.1000 5.427214 38 73.11318 95.08682
##
## Results are averaged over the levels of: plausibility
## Confidence level used: 0.95
pairs(m3)
## contrast estimate SE df t.ratio p.value
## MP,deductive - MT,deductive 26.8875 6.125636 38 4.389 0.0020
## MP,deductive - AC,deductive 35.8000 7.096515 38 5.045 0.0003
## MP,deductive - DA,deductive 15.4750 5.934724 38 2.608 0.1848
## MP,deductive - MP,probabilistic 19.5500 3.594529 38 5.439 0.0001
## MP,deductive - MT,probabilistic 14.3250 6.285536 38 2.279 0.3310
## MP,deductive - AC,probabilistic 19.9500 7.210470 38 2.767 0.1342
## MP,deductive - DA,probabilistic 13.1875 5.992910 38 2.201 0.3741
## MT,deductive - AC,deductive 8.9125 5.633601 38 1.582 0.7577
## MT,deductive - DA,deductive -11.4125 6.650469 38 -1.716 0.6772
## MT,deductive - MP,probabilistic -7.3375 6.285536 38 -1.167 0.9363
## MT,deductive - MT,probabilistic -12.5625 8.129901 38 -1.545 0.7783
## MT,deductive - AC,probabilistic -6.9375 8.864434 38 -0.783 0.9931
## MT,deductive - DA,probabilistic -13.7000 7.905839 38 -1.733 0.6666
## AC,deductive - DA,deductive -20.3250 7.493951 38 -2.712 0.1501
## AC,deductive - MP,probabilistic -16.2500 7.210470 38 -2.254 0.3446
## AC,deductive - MT,probabilistic -21.4750 8.864434 38 -2.423 0.2600
## AC,deductive - AC,probabilistic -15.8500 9.542594 38 -1.661 0.7111
## AC,deductive - DA,probabilistic -22.6125 8.659400 38 -2.611 0.1834
## DA,deductive - MP,probabilistic 4.0750 5.992910 38 0.680 0.9971
## DA,deductive - MT,probabilistic -1.1500 7.905839 38 -0.145 1.0000
## DA,deductive - AC,probabilistic 4.4750 8.659400 38 0.517 0.9995
## DA,deductive - DA,probabilistic -2.2875 7.675239 38 -0.298 1.0000
## MP,probabilistic - MT,probabilistic -5.2250 6.125636 38 -0.853 0.9885
## MP,probabilistic - AC,probabilistic 0.4000 7.096515 38 0.056 1.0000
## MP,probabilistic - DA,probabilistic -6.3625 5.934724 38 -1.072 0.9588
## MT,probabilistic - AC,probabilistic 5.6250 5.633601 38 0.998 0.9719
## MT,probabilistic - DA,probabilistic -1.1375 6.650469 38 -0.171 1.0000
## AC,probabilistic - DA,probabilistic -6.7625 7.493951 38 -0.902 0.9840
##
## Results are averaged over the levels of: plausibility
## P value adjustment: tukey method for comparing a family of 8 estimates
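The difference in the number of tests between the two approaches follows directly from the number of means that are compared within one family. A quick check in base R (variable names are illustrative):

```r
# Number of pairwise comparisons among the means of one family.
n_within_each <- choose(4, 2)        # four inferences within one instruction
n_by          <- 2 * n_within_each   # by = "instruction": two separate families
n_all         <- choose(8, 2)        # all eight cells in a single family
c(per_group = n_within_each, by_total = n_by, joint = n_all)
```

Because the Tukey adjustment is applied per family, the same contrast (e.g., MP versus MT under deductive instructions) receives a larger adjusted p-value in the joint family of eight estimates than in the family of four.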
Running Custom Contrasts
Objects returned from emmeans can also be used to test specific contrasts. For this, we simply create a list in which each element corresponds to one contrast. A contrast is defined as a vector of constants on the reference grid (i.e., the object returned from emmeans, here m3). For example, we might be interested in whether there is a difference between the valid and invalid inferences in each of the two conditions.
c1 <- list(
  v_i.ded = c(0.5, 0.5, -0.5, -0.5, 0, 0, 0, 0),
  v_i.prob = c(0, 0, 0, 0, 0.5, 0.5, -0.5, -0.5)
)
contrast(m3, c1, adjust = "holm")
## contrast estimate SE df t.ratio p.value
## v_i.ded 12.19375 4.11901 38 2.96 0.0105
## v_i.prob -0.36875 4.11901 38 -0.09 0.9291
##
## Results are averaged over the levels of: plausibility
## P value adjustment: holm method for 2 tests
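The Holm-adjusted p-values can be reproduced approximately in base R from the reported estimates, standard errors, and degrees of freedom. This is a sketch for checking the arithmetic, not the code emmeans runs; the sign of v_i.prob follows from the marginal means of m3, and small deviations from the printed values are due to rounding.

```r
# Estimates, standard errors and df as reported by contrast().
est <- c(v_i.ded = 12.19375, v_i.prob = -0.36875)
se  <- c(4.11901, 4.11901)
df  <- 38

t_ratio <- est / se
p_raw   <- 2 * pt(-abs(t_ratio), df)        # two-sided, unadjusted
p_holm  <- p.adjust(p_raw, method = "holm") # step-down Bonferroni
round(cbind(t_ratio, p_raw, p_holm), 4)
```

With only two tests, Holm doubles the smaller p-value and leaves the larger one unchanged (unless it would fall below the adjusted smaller one).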
summary(as.glht(contrast(m3, c1)), test = adjusted("free"))
##
## Simultaneous Tests for General Linear Hypotheses
##
## Linear Hypotheses:
## Estimate Std. Error t value Pr(>|t|)
## v_i.ded == 0 12.1937 4.1190 2.96 0.0105 *
## v_i.prob == 0 -0.3687 4.1190 -0.09 0.9291
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- free method)
The results are in line with expectations: responses are larger for valid than for invalid problems in the deductive condition, but not in the probabilistic condition.
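A custom contrast is simply a weighted sum of the estimated marginal means, taken in the order of the reference grid. The base R sketch below applies the weights for valid minus invalid inferences to the means printed for m3; the object names means, w_ded, and w_prob are illustrative.

```r
# Estimated marginal means from m3, in reference-grid order
# (inference varies fastest, then instruction).
means <- c(
  MP.ded  = 97.2875, MT.ded  = 70.4000, AC.ded  = 61.4875, DA.ded  = 81.8125,
  MP.prob = 77.7375, MT.prob = 82.9625, AC.prob = 77.3375, DA.prob = 84.1000
)

# Valid (MP, MT) minus invalid (AC, DA), separately per instruction.
w_ded  <- c(0.5, 0.5, -0.5, -0.5, 0, 0, 0, 0)
w_prob <- c(0, 0, 0, 0, 0.5, 0.5, -0.5, -0.5)

c(v_i.ded = sum(w_ded * means), v_i.prob = sum(w_prob * means))
```

Each weight vector sums to zero, which is what makes it a contrast; the positive and negative weights each sum to 1 in absolute value, so the estimate is a difference between two averages on the response scale.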
Plotting
Since version 0.22, afex comes with its own plotting function based on ggplot2, afex_plot, which works directly with afex_aov objects.
As said initially, we are interested in the three-way interaction of instruction with inference and plausibility. As we saw above, this interaction was significant. Consequently, we are interested in plotting this interaction.
Basic Plots
For afex_plot, we need to specify the x factor(s), which determine which factor levels or combinations of factor levels are plotted on the x-axis. We can also define trace factor(s), which determine which factor levels are connected by lines. Finally, we can also define panel factor(s), which determine whether the plot is split into subplots. afex_plot then plots the estimated marginal means obtained from emmeans, confidence intervals, and the raw data in the background. Note that the raw data in the background is by default drawn using an alpha blending of .5 (i.e., 50% semi-transparency). Thus, where several points lie directly on top of each other, the resulting point appears noticeably darker.
afex_plot(a1, x = "inference", trace = "instruction", panel = "plausibility")
## Warning: Panel(s) show a mixed within-between-design.
## Error bars do not allow comparisons across all means.
## Suppress error bars with: error = "none"
In the default settings, the error bars show 95% confidence intervals based on the standard error of the underlying model (i.e., the lm model in the present case). In the present case, in which each subplot (defined by the x and trace factors) shows a combination of a within-subjects factor (i.e., inference) and a between-subjects factor (i.e., instruction), this is not optimal. The error bars only allow us to assess differences regarding the between-subjects factor (i.e., across the lines), but not differences regarding the within-subjects factor (i.e., within one line). This is also indicated by the warning.
An alternative would be within-subjects confidence intervals:
afex_plot(a1, x = "inference", trace = "instruction", panel = "plausibility",
error = "within")
## Warning: Panel(s) show a mixed within-between-design.
## Error bars do not allow comparisons across all means.
## Suppress error bars with: error = "none"
However, those only allow inferences regarding the within-subjects factor and not regarding the between-subjects factor, so the same warning is emitted again.
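The general idea behind within-subjects intervals can be sketched in base R: between-participant differences in overall level are removed before the variability per condition is computed. The example below uses a normalization in the style of Cousineau and Morey on hypothetical toy data; whether afex_plot computes its intervals in exactly this way is an assumption here and should be checked in ?afex_plot.

```r
# Hypothetical toy data: 6 participants (rows) x 4 within-subjects
# conditions (columns); participants differ a lot in overall level.
set.seed(42)
n <- 6
J <- 4
subj_level <- rnorm(n, mean = 70, sd = 15)
y <- sapply(c(MP = 10, MT = 0, AC = -5, DA = 5),
            function(shift) subj_level + shift + rnorm(n, sd = 4))

# Step 1: remove each participant's overall level, add back the grand mean.
y_norm <- y - rowMeans(y) + mean(y)

# Step 2: standard errors from the normalized data, inflated by
# sqrt(J / (J - 1)) to compensate for the variance removed in step 1.
se_within  <- apply(y_norm, 2, sd) / sqrt(n) * sqrt(J / (J - 1))
se_between <- apply(y, 2, sd) / sqrt(n)

# Half-widths of 95% intervals under both approaches.
ci_half <- qt(0.975, df = n - 1) * cbind(between = se_between,
                                         within  = se_within)
round(ci_half, 2)
```

After step 1 every participant has the same mean, so the remaining variability reflects only condition differences within participants. This is why such intervals speak to within-subjects comparisons but say nothing about differences between groups.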
A further alternative is to suppress the error bars altogether. This is the approach used in our original paper and probably a good idea in general when figures show both between- and within-subjects factors within the same panel. The presence of the raw data in the background still provides a visual depiction of the variability of the data.
afex_plot(a1, x = "inference", trace = "instruction", panel = "plausibility",
error = "none")
Customizing Plots
afex_plot allows customizing the plot in a number of different ways. For example, we can easily change the aesthetic mapping associated with the trace factor. So instead of using linetype and shape of the symbols, we can use color. Furthermore, we can change the graphical element used for plotting the data points in the background. For example, instead of plotting the raw data, we can replace this with a boxplot. Finally, we can also make both the points showing the means and the lines connecting the means larger.
p1 <- afex_plot(a1, x = "inference", trace = "instruction",
                panel = "plausibility", error = "none",
                mapping = c("color", "fill"),
                data_geom = geom_boxplot, data_arg = list(width = 0.4),
                point_arg = list(size = 1.5), line_arg = list(size = 1))
p1
Note that afex_plot returns a ggplot2 plot object, which can be used for further customization. For example, one can easily change the theme to something that does not have a grey background. We can also set the theme globally for the remainder of the R session.
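Both ways of changing the theme can be sketched with ggplot2 alone. The example uses a small stand-in plot built from hypothetical data in place of p1, and theme_bw() is picked here only as one example of a theme without a grey background; any other complete ggplot2 theme works the same way.

```r
library("ggplot2")

# Stand-in plot (hypothetical data) used in place of an afex_plot() result.
toy <- data.frame(x = c("MP", "MT", "AC", "DA"), y = c(88, 77, 69, 83))
p <- ggplot(toy, aes(x = x, y = y)) + geom_point()

# Change the theme of a single plot ...
p_bw <- p + theme_bw()
p_bw

# ... or set it globally for all plots drawn later in the session.
old_theme <- theme_set(theme_bw())
p

# Restore the previous default if desired.
theme_set(old_theme)
```

Adding a theme with + affects only that one plot object, whereas theme_set() changes the default for every plot that does not specify its own theme.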
The full set of customizations provided by afex_plot is beyond the scope of this vignette. The examples on the help page at ?afex_plot provide a good overview.
Replicate Analysis from Singmann and Klauer (2011)
However, the plots shown so far are not particularly helpful with respect to the research question. Next, we fit a new ANOVA model in which we separate the data into affirmation and denial inferences, as was done in the original manuscript. We then plot the data a second time.
a2 <- aov_ez("id", "response", sk2011.1, between = "instruction",
             within = c("validity", "plausibility", "what"))
## Warning: More than one observation per cell, aggregating the data using mean (i.e,
## fun_aggregate = mean)!
## Contrasts set to contr.sum for the following variables: instruction
a2 # print the ANOVA table
## Anova Table (Type 3 tests)
##
## Response: response
## Effect df MSE F ges p.value
## 1 instruction 1, 38 2027.42 0.31 .003 .58
## 2 validity 1, 38 678.65 4.12 * .01 .05
## 3 instruction:validity 1, 38 678.65 4.65 * .01 .04
## 4 plausibility 1, 38 468.82 34.23 *** .07 <.0001
## 5 instruction:plausibility 1, 38 468.82 10.67 ** .02 .002
## 6 what 1, 38 660.52 0.22 .0007 .64
## 7 instruction:what 1, 38 660.52 2.60 .008 .11
## 8 validity:plausibility 1, 38 371.87 0.14 .0002 .71
## 9 instruction:validity:plausibility 1, 38 371.87 4.78 * .008 .04
## 10 validity:what 1, 38 1213.14 9.80 ** .05 .003
## 11 instruction:validity:what 1, 38 1213.14 8.60 ** .05 .006
## 12 plausibility:what 1, 38 204.54 9.97 ** .009 .003
## 13 instruction:plausibility:what 1, 38 204.54 5.23 * .005 .03
## 14 validity:plausibility:what 1, 38 154.62 0.03 <.0001 .85
## 15 instruction:validity:plausibility:what 1, 38 154.62 0.42 .0003 .52
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
Then we plot the data from this ANOVA. Because each panel would again show a mixed design, we suppress the error bars.
afex_plot(a2, x = c("plausibility", "validity"),
trace = "instruction", panel = "what",
error = "none")
We see the critical and predicted cross-over interaction in the left of those two graphs. For implausible but valid problems, deductive responses are larger than probabilistic responses. The opposite is true for plausible but invalid problems. We now test these differences at each of the four x-axis ticks in each plot using custom contrasts (diff_1 to diff_4). Furthermore, we test for a validity effect and a plausibility effect in both conditions.
(m4 <- emmeans(a2, ~instruction+plausibility+validity|what))
## what = affirmation:
## instruction plausibility validity emmean SE df lower.CL upper.CL
## deductive plausible valid 99.475 1.160727 38 97.12523 101.82477
## probabilistic plausible valid 95.300 1.160727 38 92.95023 97.64977
## deductive implausible valid 95.100 5.007815 38 84.96221 105.23779
## probabilistic implausible valid 60.175 5.007815 38 50.03721 70.31279
## deductive plausible invalid 66.950 6.950663 38 52.87912 81.02088
## probabilistic plausible invalid 90.550 6.950663 38 76.47912 104.62088
## deductive implausible invalid 56.025 7.972665 38 39.88518 72.16482
## probabilistic implausible invalid 64.125 7.972665 38 47.98518 80.26482
##
## what = denial:
## instruction plausibility validity emmean SE df lower.CL upper.CL
## deductive plausible valid 70.550 6.181540 38 58.03613 83.06387
## probabilistic plausible valid 92.975 6.181540 38 80.46113 105.48887
## deductive implausible valid 70.250 6.355033 38 57.38491 83.11509
## probabilistic implausible valid 72.950 6.355033 38 60.08491 85.81509
## deductive plausible invalid 86.525 5.318808 38 75.75764 97.29236
## probabilistic plausible invalid 87.450 5.318808 38 76.68264 98.21736
## deductive implausible invalid 77.100 6.617466 38 63.70364 90.49636
## probabilistic implausible invalid 80.750 6.617466 38 67.35364 94.14636
##
## Confidence level used: 0.95
# Weights follow the row order of m4 within each level of what:
# instruction varies fastest, then plausibility, then validity.
c2 <- list(
  diff_1 = c(1, -1, 0, 0, 0, 0, 0, 0),
  diff_2 = c(0, 0, 1, -1, 0, 0, 0, 0),
  diff_3 = c(0, 0, 0, 0, 1, -1, 0, 0),
  diff_4 = c(0, 0, 0, 0, 0, 0, 1, -1),
  val_ded = c(0.5, 0, 0.5, 0, -0.5, 0, -0.5, 0),
  val_prob = c(0, 0.5, 0, 0.5, 0, -0.5, 0, -0.5),
  plau_ded = c(0.5, 0, -0.5, 0, -0.5, 0, 0.5, 0),
  plau_prob = c(0, 0.5, 0, -0.5, 0, 0.5, 0, -0.5)
)
contrast(m4, c2, adjust = "holm")
## what = affirmation:
## contrast estimate SE df t.ratio p.value
## diff_1 4.1750 1.641515 38 2.543 0.0759
## diff_2 34.9250 7.082119 38 4.931 0.0001
## diff_3 -23.6000 9.829721 38 -2.401 0.0854
## diff_4 -8.1000 11.275051 38 -0.718 0.9538
## val_ded 35.8000 7.096515 38 5.045 0.0001
## val_prob 0.4000 7.096515 38 0.056 0.9553
## plau_ded -3.2750 3.065092 38 -1.068 0.8761
## plau_prob 30.7750 4.992400 38 6.164 <.0001
##
## what = denial:
## contrast estimate SE df t.ratio p.value
## diff_1 -22.4250 8.742017 38 -2.565 0.1007
## diff_2 -2.7000 8.987374 38 -0.300 1.0000
## diff_3 -0.9250 7.521931 38 -0.123 1.0000
## diff_4 -3.6500 9.358510 38 -0.390 1.0000
## val_ded -11.4125 6.650469 38 -1.716 0.5658
## val_prob -1.1375 6.650469 38 -0.171 1.0000
## plau_ded -4.5625 4.114603 38 -1.109 1.0000
## plau_prob 13.3625 2.957010 38 4.519 0.0005
##
## P value adjustment: holm method for 8 tests
We can also pass these tests to multcomp, which gives us more powerful Type 1 error corrections.
summary(as.glht(contrast(m4, c2)), test = adjusted("free"))
## Warning in tmp$pfunction("adjusted", ...): Completion with error > abseps
## Warning in tmp$pfunction("adjusted", ...): Completion with error > abseps
## Warning in tmp$pfunction("adjusted", ...): Completion with error > abseps
## Warning in tmp$pfunction("adjusted", ...): Completion with error > abseps
## Warning in tmp$pfunction("adjusted", ...): Completion with error > abseps
## Warning in tmp$pfunction("adjusted", ...): Completion with error > abseps
## Warning in tmp$pfunction("adjusted", ...): Completion with error > abseps
## Warning in tmp$pfunction("adjusted", ...): Completion with error > abseps
## $`what = affirmation`
##
## Simultaneous Tests for General Linear Hypotheses
##
## Linear Hypotheses:
## Estimate Std. Error t value Pr(>|t|)
## diff_1 == 0 4.175 1.641 2.543 0.064792 .
## diff_2 == 0 34.925 7.082 4.931 0.000109 ***
## diff_3 == 0 -23.600 9.830 -2.401 0.070758 .
## diff_4 == 0 -8.100 11.275 -0.718 0.688158
## val_ded == 0 35.800 7.096 5.045 7.09e-05 ***
## val_prob == 0 0.400 7.096 0.056 0.955346
## plau_ded == 0 -3.275 3.065 -1.068 0.603570
## plau_prob == 0 30.775 4.992 6.164 1.92e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- free method)
##
##
## $`what = denial`
##
## Simultaneous Tests for General Linear Hypotheses
##
## Linear Hypotheses:
## Estimate Std. Error t value Pr(>|t|)
## diff_1 == 0 -22.425 8.742 -2.565 0.081033 .
## diff_2 == 0 -2.700 8.987 -0.300 0.984913
## diff_3 == 0 -0.925 7.522 -0.123 0.984913
## diff_4 == 0 -3.650 9.358 -0.390 0.984913
## val_ded == 0 -11.412 6.651 -1.716 0.379122
## val_prob == 0 -1.137 6.651 -0.171 0.984913
## plau_ded == 0 -4.562 4.115 -1.109 0.725836
## plau_prob == 0 13.363 2.957 4.519 0.000386 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- free method)
Unfortunately, in the present case this function throws several warnings. Nevertheless, the p-values from both methods are very similar and agree on whether they fall below or above .05. Because of the warnings, it seems advisable to use the adjusted p-values provided by emmeans directly rather than the ones from multcomp.
The pattern for the affirmation problems is in line with expectations: we find the predicted differences between the instruction types for valid and implausible (diff_2) and invalid and plausible (diff_3) problems, and the predicted non-differences for the other two problems (diff_1 and diff_4). Furthermore, we find a validity effect in the deductive but not in the probabilistic condition. Likewise, we find a plausibility effect in the probabilistic but not in the deductive condition.