This vignette can be referred to by citing the package:

- Makowski, D., Ben-Shachar, M. S., & Lüdecke, D. (2019). *bayestestR: Describing Effects and their Uncertainty, Existence and Significance within the Bayesian Framework*. Journal of Open Source Software, 4(40), 1541. https://doi.org/10.21105/joss.01541
- Makowski, D., Ben-Shachar, M. S., Chen, S. H. A., & Lüdecke, D. (2019). *Indices of Effect Existence and Significance in the Bayesian Framework*. Retrieved from https://doi.org/10.31234/osf.io/2zexr

A Bayesian analysis returns a posterior distribution for each parameter (or *effect*). To minimally describe these distributions, we recommend reporting a point-estimate of centrality as well as information characterizing the estimation uncertainty (the dispersion). One can additionally report indices of effect existence and/or significance.

Based on the previous **comparison of point-estimates** and **indices of effect existence**, we can draw the following recommendations.

We suggest reporting the **median** as an index of centrality, as it is more robust than the mean or the MAP estimate. However, in the case of severely skewed posterior distributions, the MAP estimate could be a good alternative.
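To make the three centrality indices concrete, here is a minimal sketch in plain Python (standard library only) that computes them from a vector of posterior draws. The helper `map_estimate` is a hypothetical name, and its histogram-based mode is a crude stand-in chosen for brevity (a kernel density estimate is more usual); it is not the package's implementation. The gamma-distributed draws are simulated purely for illustration.

```python
import random
import statistics

def map_estimate(draws, bins=50):
    """Crude MAP: midpoint of the fullest bin of an equal-width histogram."""
    lo, hi = min(draws), max(draws)
    width = (hi - lo) / bins
    if width == 0:
        return lo  # all draws identical
    counts = [0] * bins
    for d in draws:
        # The maximum falls exactly on the upper edge, so clamp it into the last bin.
        idx = min(int((d - lo) / width), bins - 1)
        counts[idx] += 1
    fullest = counts.index(max(counts))
    return lo + (fullest + 0.5) * width

# Simulated right-skewed "posterior" (an assumption for the example).
random.seed(1)
draws = [random.gammavariate(2.0, 1.0) for _ in range(20000)]

print("median:", statistics.median(draws))
print("mean:  ", statistics.fmean(draws))
print("MAP:   ", map_estimate(draws))
```

On a right-skewed distribution such as this one, the three indices separate: the MAP sits closest to the peak, the mean is pulled toward the long tail, and the median falls in between.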

The **89% Credible Interval (CI)** appears to be a reasonable range to characterize the uncertainty related to the estimation, being more stable than higher thresholds (such as 90% and 95%). We also recommend computing the CI based on the HDI rather than on quantiles, which favours probable values over central ones.
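The difference between the two interval types can be sketched from posterior draws as follows. This is an illustrative plain-Python version, not the package's code: `eti` and `hdi` are hypothetical helper names, the HDI is approximated as the narrowest window containing the requested share of sorted draws (which assumes a unimodal posterior), and the draws are simulated.

```python
import math
import random

def eti(draws, ci=0.89):
    """Equal-tailed interval: leave (1 - ci) / 2 of the draws in each tail."""
    s = sorted(draws)
    n = len(s)
    lo = s[math.floor((1 - ci) / 2 * (n - 1))]
    hi = s[math.ceil((1 + ci) / 2 * (n - 1))]
    return lo, hi

def hdi(draws, ci=0.89):
    """Highest density interval, approximated by the narrowest window
    that contains a share `ci` of the sorted draws (unimodal case)."""
    s = sorted(draws)
    n = len(s)
    k = math.ceil(ci * n)  # number of draws inside the interval
    start = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[start], s[start + k - 1]

# Simulated right-skewed "posterior" (an assumption for the example).
random.seed(2)
draws = [random.gammavariate(2.0, 1.0) for _ in range(20000)]

print("89% ETI:", eti(draws))
print("89% HDI:", hdi(draws))
```

For a skewed posterior the HDI is shifted toward the region of highest density and is never wider than the equal-tailed interval; for a symmetric posterior the two nearly coincide.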

Note that a CI based on the quantile (equal-tailed interval) might be more appropriate in case of transformations (for instance when transforming log-odds to probabilities). Otherwise, intervals that originally do not cover the null might cover it after transformation (see here).
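The property behind this caveat is that quantiles commute with monotonic transformations, whereas the narrowest interval does not. The sketch below illustrates it with simulated log-odds draws mapped to probabilities; `eti`, `hdi` and `logistic` are hypothetical helpers written for this example (with the HDI approximated as the narrowest window of sorted draws), and the example shows only the lack of invariance, not a case where the null becomes covered.

```python
import math
import random

def logistic(x):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def eti(draws, ci=0.89):
    """Equal-tailed interval from sorted draws."""
    s = sorted(draws)
    n = len(s)
    return (s[math.floor((1 - ci) / 2 * (n - 1))],
            s[math.ceil((1 + ci) / 2 * (n - 1))])

def hdi(draws, ci=0.89):
    """Narrowest window containing a share `ci` of the sorted draws."""
    s = sorted(draws)
    n = len(s)
    k = math.ceil(ci * n)
    start = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[start], s[start + k - 1]

# Simulated posterior on the log-odds scale (an assumption for the example).
random.seed(3)
log_odds = [random.gauss(2.0, 1.5) for _ in range(5000)]
probs = [logistic(x) for x in log_odds]

# ETI: transforming the bounds gives the interval of the transformed draws.
print("ETI, bounds transformed:", tuple(logistic(b) for b in eti(log_odds)))
print("ETI, draws transformed: ", eti(probs))

# HDI: the two routes generally disagree.
print("HDI, bounds transformed:", tuple(logistic(b) for b in hdi(log_odds)))
print("HDI, draws transformed: ", hdi(probs))
```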

The Bayesian framework can neatly delineate and quantify different aspects of hypothesis testing, such as effect *existence* and *significance*. The most straightforward index to describe effect existence is the **Probability of Direction (pd)**, representing the certainty associated with the most probable direction (positive or negative) of the effect. This index is easy to understand, simple to interpret, straightforward to compute, robust to model characteristics and independent from the scale of the data.

Moreover, it is strongly correlated with the frequentist ***p*-value**, and can thus be used to draw parallels and give some reference to readers unfamiliar with Bayesian statistics. A two-sided *p*-value of respectively `.1`, `.05`, `.01` and `.001` would correspond approximately to a *pd* of `95%`, `97.5%`, `99.5%` and `99.95%`. The following reference values can serve as interpretation helpers:

- *pd* **<= 95%** ~ *p* > .1: uncertain
- *pd* **> 95%** ~ *p* < .1: possibly existing
- *pd* **> 97%**: likely existing
- *pd* **> 99%**: probably existing
- *pd* **> 99.9%**: certainly existing
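As a rough illustration, the *pd* can be computed from posterior draws as the share of draws on the more probable side of zero, and the approximate two-sided *p*-value then follows as 2 × (1 − *pd*). The function names below are hypothetical, the draws are simulated, and the labels simply encode the reference values listed above.

```python
import random

def p_direction(draws):
    """Share of draws with the most probable sign (between 0.5 and 1)."""
    n = len(draws)
    positive = sum(d > 0 for d in draws) / n
    negative = sum(d < 0 for d in draws) / n
    return max(positive, negative)

def pd_to_p(pd):
    """Approximate two-sided p-value corresponding to a pd."""
    return 2 * (1 - pd)

def interpret_pd(pd):
    """Label a pd (as a proportion) using the reference values."""
    if pd > 0.999:
        return "certainly existing"
    if pd > 0.99:
        return "probably existing"
    if pd > 0.97:
        return "likely existing"
    if pd > 0.95:
        return "possibly existing"
    return "uncertain"

# Simulated posterior draws of an effect (an assumption for the example).
random.seed(4)
draws = [random.gauss(0.3, 0.2) for _ in range(10000)]

pd = p_direction(draws)
print(f"pd = {pd:.2%}, approx. p = {pd_to_p(pd):.3f}, {interpret_pd(pd)}")
```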

The percentage in **ROPE** is an index of **significance** (in its primary meaning), informing us whether a parameter is related - or not - to a non-negligible change (in terms of magnitude) in the outcome. We suggest reporting the **percentage of the full posterior distribution** (the *full* ROPE) that lies in the ROPE, instead of a given proportion of the CI, as it appears to be more sensitive (especially to delineate highly significant effects). Rather than using it as a binary, all-or-nothing decision criterion, as suggested by the original equivalence test, we recommend using the percentage as a continuous index of significance. However, based on simulation data, we suggest the following reference values as interpretation helpers:

- **> 99%** in ROPE: negligible (we can accept the null hypothesis)
- **> 97.5%** in ROPE: probably negligible
- **<= 97.5%** & **>= 2.5%** in ROPE: undecided significance
- **< 2.5%** in ROPE: probably significant
- **< 1%** in ROPE: significant (we can reject the null hypothesis)

*Note that extra caution is required as its interpretation highly depends on other parameters such as sample size and ROPE range (see here)*.
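A minimal sketch of the full-ROPE percentage, assuming posterior draws are available as a list: it counts the share of all draws falling inside the ROPE and maps it to the reference labels. The function names and the ROPE bounds of ±0.1 are assumptions made for this example; the appropriate range depends on the scale of the outcome.

```python
import random

def rope_percentage(draws, low, high):
    """Percentage of the full posterior lying inside [low, high]."""
    inside = sum(low <= d <= high for d in draws)
    return 100 * inside / len(draws)

def interpret_rope(pct):
    """Label a percentage in ROPE using the reference values."""
    if pct > 99:
        return "negligible"
    if pct > 97.5:
        return "probably negligible"
    if pct < 1:
        return "significant"
    if pct < 2.5:
        return "probably significant"
    return "undecided significance"

# Simulated posterior draws and an assumed ROPE of [-0.1, 0.1].
random.seed(5)
draws = [random.gauss(0.3, 0.2) for _ in range(10000)]

pct = rope_percentage(draws, -0.1, 0.1)
print(f"{pct:.2f}% in ROPE: {interpret_rope(pct)}")
```

Keeping the percentage itself in the report, rather than only the label, preserves its use as a continuous index.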

Based on these suggestions, a template sentence for minimal reporting of a parameter based on its posterior distribution could be:

- “the effect of *X* has a probability of **pd** of being *negative* (Median = **median**, 89% CI [**HDI**_{low}, **HDI**_{high}]) and can be considered as *significant* (**ROPE**% in ROPE).”

Although it can also be used to assess effect existence and significance, the **Bayes factor (BF)** is a versatile index that can be used to directly compare different models (or data generation processes). The Bayes factor is a ratio, informing us by how much more (or less) likely the observed data are under one of two compared models than under the other - usually a model with an effect vs. a model *without* the effect. Depending on the specification of the null model (whether it is a point-estimate (e.g., **0**) or an interval), the Bayes factor can be used in the context of both effect existence and significance.

In general, a Bayes factor greater than 1 gives evidence in favour of one of the models, and a Bayes factor smaller than 1 gives evidence in favour of the other model. Several rules of thumb exist to help the interpretation (see here), with **> 3** being one common threshold to categorize non-anecdotal evidence.
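To show what such a ratio looks like in the simplest case, the sketch below computes a Bayes factor for binomial data, comparing a model with an effect (success probability uniformly distributed between 0 and 1) against a point null (success probability of exactly 0.5). This textbook setup and the function name `bf10_binomial` are assumptions chosen because both marginal likelihoods have closed forms; it is not how the package computes Bayes factors for fitted models.

```python
from math import comb

def bf10_binomial(k, n):
    """Bayes factor for H1: theta ~ Uniform(0, 1) against H0: theta = 0.5,
    given k successes in n trials."""
    # Under H1, integrating the binomial likelihood over theta gives 1 / (n + 1).
    marginal_h1 = 1 / (n + 1)
    # Under H0, the likelihood is evaluated at theta = 0.5.
    marginal_h0 = comb(n, k) * 0.5 ** n
    return marginal_h1 / marginal_h0

# Hypothetical data sets: 16 and 10 successes out of 20 trials.
for k in (16, 10):
    bf10 = bf10_binomial(k, 20)
    print(f"k = {k}: BF10 = {bf10:.2f}, BF01 = {1 / bf10:.2f}")
```

The same number can be read in both directions: a BF10 above 3 counts as non-anecdotal evidence for the effect, while a BF10 below 1/3 (a BF01 above 3) counts as non-anecdotal evidence for its absence.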

When reporting Bayes factors (BF), one can use the following sentence:

- “There is *moderate evidence* in favour of an *absence* of effect of *x* (BF = *BF*).”

*Note: If you have any advice, opinions or suggestions, we encourage you to let us know by opening a discussion thread or making a pull request.*