This vignette contains answers to questions received from users or posted on discussion boards like Cross Validated and Stack Overflow

- What are EMMs/lsmeans?
- I have three (or two or four) factors that interact
- I have covariate(s) that interact(s) with factor(s)
- I have covariate(s) and am fitting a polynomial model
- All my pairwise comparisons have the same
*P*value - emmeans() doesn’t work as expected
- All or some of the results are NA
- If I analyze subsets of the data separately, I get different results
- My lsmeans/EMMs are way off from what I expected
- Why do I get
`Inf`

for the degrees of freedom? - I get exactly the same comparisons for each “by” group
- My ANOVA
*F*is significant, but no pairwise comparisons are - I get annoying messages about namespaces

Estimated marginal means (EMMs), a.k.a. least-squares means, are predictions on a reference grid of predictor settings, or marginal averages thereof. See details in the “basics” vignette.

Perhaps your question has to do with interacting factors, and you want to do some kind of *post hoc* analysis comparing levels of one (or more) of the factors on the response. Some specific versions of this question…

- Perhaps you tried to do a simple comparison for one treatment and got a warning message you don’t understand
- You do pairwise comparisons of factor combinations and it’s just too much – want just some of them
- How do I even approach this?

My first answer is: plots almost always help. If you have factors A, B, and C, try something like `emmip(model, A ~ B | C)`

, which creates an interaction-style plot of the predictions against B, for each A, with separate panels for each C. This will help visualize what effects stand out in a practical way. This can guide you in what post-hoc tests would make sense. See the “interactions” vignette for more discussion and examples.

This is a situation where it may well be appropriate to compare the slopes of trend lines, rather than the EMMs. See the help page for `emtrends()`

and the discussion of this topic in the “interactions” vignette

You need to be careful to define the reference grid consistently. For example, if you use covariates `x`

and `xsq`

(equal to `x^2`

) to fit a quadratic curve, the default reference grid uses the mean of each covariate – and `mean(xsq)`

is usually not the same as `mean(x)^2`

. So you need to use `at`

to ensure that the covariates are set consistently with respect to the model. See this subsection of the “basics” vignette for an example.

This will happen if you fitted a model where the treatments you want to compare were put in as a numeric predictor; for example `dose`

, with values of 1, 2, and 3. If `dose`

is modeled as numeric, you will be fitting a linear trend in those dose values, rather than a model that allows those doses to differ in arbitrary ways. Go back and fit a different model using `factor(dose)`

instead; it will make all the difference. This is closely related to the next topic.

Equivalently, users ask how to get *post hoc* comparisons when we have covariates rather than factors. Yes, it does work, but you have to tell it the appropriate reference grid.

But before saying more, I have a question for you: *Are you sure your model is meaningful?*

- If your question concerns
*only*two-level predictors such as`sex`

(coded 1 for female, 2 for male), no problem. The model will produce the same predictions as you’d get if you’d used these as factors. - If
*any*of the predictors has 3 or more levels, you may have fitted a nonsense model, in which case you need to fit a different model that does make sense before doing any kind of*post hoc*analysis. For instance, the model contains a covariate`brand`

(coded 1 for Acme, 2 for Ajax, and 3 for Al’s), this model is implying that the difference between Acme and Ajax is exactly equal to the difference between Ajax and Al’s, owing to the fact that a linear trend in`brand`

has been fitted. If you had instead coded 1 for Ajax, 2 for Al’s, and 3 for Acme, the model would produce different fitted values. You need to fit another model using`factor(brand)`

in place of`brand`

.

Assuming that issue is settled, you can do something like `emmeans(model, "sex", at = list(sex = 1:2))`

, to get get separate EMMs for each sex rather than one EMM for the average numeric value of `sex`

.

An alternative to the `at`

list is to use `cov.reduce = FALSE`

, which specifies that the unique values of covariates are to be used rather than reducing them to their means. However, the specification applies to *all* covariates, so if you have another one, say `age`

, that has 43 different values in your data, you will have a mess on your hands.

See “altering the reference grid” in the “basics” vignette for more discussion.

The **emmeans** package uses tookls in the **estimability** package to determine whether its results are uniquely estimable. For example, in a two-way model with interactions included, if there are no observations in a particular cell (factor combination), then we cannot estimate the mean of that cell.

When *some* of the EMMs are estimable and others are not, that is information about missing information in the data. If it’s possible to remove some terms from the model (particularly interactions), that may make more things estimable if you re-fit with those terms excluded; but don’t delete terms that are really needed for the model to fit well.

When *all* of the estimates are non-estimable, it could be symptomatic of something else. Some possibilities include:

- An overly ambitious model; for example, in a Latin square design, interaction effects are confounded with main effects; so if any interactions are included in the model, you will render main effects inestimable.
- Possibly you have a nested structure that needs to be included in the model or specified via the
`nesting`

argument. Perhaps the levels that B can have depend on which level of A is in force. Then B is nested in A and the model should specify`A + A:B`

, with no main effect for`B`

. - Modeling factors as numeric predictors (see also the related section on covariates). To illustrate, suppose you have data on particular state legislatures, and the model includes the predictors
`state_name`

as well as`dem_gov`

which is coded 1 if the governor is a Democrat and 0 otherwise. If the model was fitted with`state_name`

as a factor or character variable, but`dem_gov`

as a numeric predictor, then, chances are,`emmeans()`

will return non-estimable results. If instead, you use`factor(dem_gov)`

in the model, then the fact that`state_name`

is nested in`dem_gov`

will be detected, causing EMMs to be computed separately for each party’s states, thus making things estimable. - Some other things may in fact be estimable. For illustration, it’s easy to construct an example where all the EMMs are non-estimable, but pairwise comparisons are estimable:

```
pg <- transform(pigs, x = rep(1:3, c(10, 10, 9)))
pg.lm <- lm(log(conc) ~ x + source + factor(percent), data = pg)
emmeans(pg.lm, consec ~ percent)
```

```
## $emmeans
## percent emmean SE df asymp.LCL asymp.UCL
## 9 nonEst NA NA NA NA
## 12 nonEst NA NA NA NA
## 15 nonEst NA NA NA NA
## 18 nonEst NA NA NA NA
##
## Results are averaged over the levels of: source
## Results are given on the log (not the response) scale.
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df t.ratio p.value
## 12 - 9 0.1796 0.0561 23 3.202 0.0109
## 15 - 12 0.0378 0.0582 23 0.650 0.8613
## 18 - 15 0.0825 0.0691 23 1.194 0.5201
##
## Results are averaged over the levels of: source
## Results are given on the log (not the response) scale.
## P value adjustment: mvt method for 3 tests
```

The “messy-data” vignette has more examples and discussion.

Estimated marginal means summarize the *model* that you fitted to the data – not the data themselves. Many of the most common models rely on several simplifying assumptions – that certain effects are linear, that the error variance is constant, etc. – and those assumptions are passed forward into the `emmeans()`

results. Doing separate analyses on subsets usually comprises departing from that overall model, so of course the results are different.

First step: Carefully read the annotations below the output. Do they say something like “results are on the log scale, not the response scale”? If so, that explains it. A Poisson or logistic model involves a link function, and by default, `emmeans()`

produces its results on that same scale. You can add `type = "response"`

to the `emmeans()`

call and it will put the results of the scale you expect. But that is not always the best approach. The “transformations” vignette has examples and discussion.

`Inf`

for the degrees of freedom?This is simply the way that **emmeans** labels asymptotic results (that is, estimates that are tested against the standard normal distribution – *z* tests – rather than the *t* distribution). Note that obtaining quantiles or probabilities from the *t* distribution with infinite degrees of freedom is the same as obtaining the corresponding values from the standard normal. For example:

`qt(c(.9, .95, .975), df = Inf)`

`## [1] 1.281552 1.644854 1.959964`

`qnorm(c(.9, .95, .975))`

`## [1] 1.281552 1.644854 1.959964`

so when you see infinite d.f., that just means its a *z* test or a *z* confidence interval.

As mentioned elsewhere, EMMs summarize a *model*, not the data. If your model does not include any interactions between the `by`

variables and the factors for which you want EMMs, then by definition, the effects for the latter will be exactly the same regardless of the `by`

variable settings. So of course the comparisons will all be the same. If you think they should be different, then you are saying that your model should include interactions between the factors of interest and the `by`

factors.

This is a common misunderstanding of ANOVA. If *F* is significant, this implies only that *some contrast* among the means (or effects) is statistically significant (compared to a Scheffé critical value). That contrast may be very much unlike a pairwise comparison, especially when there are several means being compared. Another factor is that by default, *P* values for pairwise comparisons are adjusted using the Tukey method, and the adjusted *P* values can be quite a bit larger than the unadjusted ones. (But I definitely do *not* advocate using no adjustment to “repair” this problem.)

As is shown in my answer in a *Cross Validated* discussion, the unadjusted *P* value can be more than .15 when *F* is significant (remarkably, regardless of the significance level used for *F*!).

This probably happens because somehow the **lsmeans** package or its namespace has been loaded – possibly because some old objects from that package are still in your workspace. Try this:

`emmeans:::convert_workspace()`

This non-exported function removes `Last.ref.grid`

if it exists, converts every `ref.grid`

or `lsmobj`

object to class `emmGrid`

, and unloads any vestige of the **lsmeans** package. [Note: You may get an error message if there are other packages loaded that depend on **lsmeans**. If so, `detach(package:`

*pkg*`, unload = TRUE)`

any that are in `search()`

and use `unloadNamespace("pkg")`

for others; then re-run `emmeans:::convert_workspace()`

.]