Given an input contingency table, `fun.chisq.test()` offers three quantities to evaluate the non-parametric functional dependency of the column variable \(Y\) on the row variable \(X\): the test statistic (functional chi-square \(\chi^2_f\)), the statistical significance (\(p\)-value), and the effect size (function index \(\xi_f\)).

We explain their differences by analogy to the statistics returned by `cor.test()`, the R function for the test of correlation, and by the \(t\)-test. We chose these two tests because they are widely used and well understood. Another choice could be Pearson's chi-square test together with Cramér's V, a statistic analogous to the correlation coefficient, but it is less widely used. The table below summarizes the differences among the quantities and their counterparts in the correlation test and the \(t\)-test.

| Quantity | Measures functional dependency? | Affected by sample size? | Affected by table size? | Measures statistical significance? | Counterpart in correlation test | Counterpart in two-sample \(t\)-test |
|---|---|---|---|---|---|---|
| \(\chi^2_f\) | Yes | Yes | Yes | No | \(t\)-statistic | \(t\)-statistic |
| \(p\)-value | Yes | Yes | Yes | Yes | \(p\)-value | \(p\)-value |
| \(\xi_f\) | Yes | No | No | No | correlation coefficient | mean difference |

The test statistic \(\chi^2_f\) measures the deviation of \(Y\) from a uniform distribution that is contributed by \(X\). It is maximized when there is a functional relationship from \(X\) to \(Y\). This statistic is also affected by the sample size and by the size of the contingency table, so it summarizes both the strength of the functional dependency and the support from the sample. A strong function supported by few samples may have the same \(\chi^2_f\) as a weak function supported by many samples. It is analogous to the test statistic (not to be confused with the correlation coefficient) in `cor.test()`, or to the \(t\) statistic from the \(t\)-test.
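To make "deviation of \(Y\) from uniform contributed by \(X\)" concrete, here is a minimal base-R sketch. It assumes the statistic is the deviation of \(Y\) from uniform within each row of the table, minus the deviation of the marginal distribution of \(Y\) from uniform. That form is not spelled out in this text, and `fchisq_stat()` is a hypothetical helper, not part of FunChisq. The table is the first example matrix used later in this document.

```r
# Hypothetical helper (not a FunChisq function): functional chi-square under
# the assumed definition, for a table with X in rows and Y in columns.
fchisq_stat <- function(tab) {
  n <- sum(tab)
  s <- ncol(tab)                         # number of levels of Y
  row_tot <- rowSums(tab)
  col_tot <- colSums(tab)
  # Expected count if Y were uniform within each level of X
  exp_row <- matrix(row_tot / s, nrow = nrow(tab), ncol = s)
  keep <- row_tot > 0                    # skip empty rows to avoid 0/0
  within_x <- sum(((tab - exp_row)^2 / exp_row)[keep, , drop = FALSE])
  # Deviation of the marginal distribution of Y from uniform
  marginal_y <- sum((col_tot - n / s)^2 / (n / s))
  within_x - marginal_y
}

x1 <- matrix(c(5, 1, 5, 1, 5, 1, 1, 0, 1), nrow = 3)
round(fchisq_stat(x1), 3)                # 10.043
```

Under this assumed form, multiplying every count by a constant multiplies the statistic by the same constant, which is why the raw statistic alone cannot separate a strong function from a large sample.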

The \(p\)-value of \(\chi^2_f\) removes the table-size factor and makes tables of different sizes or sample sizes comparable. However, its null distribution (chi-square or normalized) holds only asymptotically. It plays the same role as the \(p\)-value of `cor.test()`.
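As a rough illustration of how the table size enters the \(p\)-value, the sketch below takes a statistic of 10.043 on a 3 by 3 table (the values printed for the first example later in this document) and looks up the upper tail of a chi-square distribution. The degrees of freedom are assumed to be (rows - 1) × (columns - 1), which agrees with the `parameter = 4` reported in that output; the normalized variant is not covered here.

```r
stat <- 10.043                   # statistic reported for a 3 x 3 table
df <- (3 - 1) * (3 - 1)          # assumed: (rows - 1) * (columns - 1) = 4
p <- pchisq(stat, df = df, lower.tail = FALSE)
signif(p, 4)                     # 0.03971
```

The same statistic on a larger table would be referred to a distribution with more degrees of freedom and would therefore yield a larger \(p\)-value.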

The function index \(\xi_f\) measures *only* the strength of functional dependency, normalized by sample size and table size, without considering statistical significance. When the sample size is small, the index can be unreliable; when the sample size is large, it is a direct measure of functional dependency and is comparable across tables. It is analogous to the correlation coefficient in `cor.test()`, or to the fold change in a \(t\)-test for differential gene expression analysis.
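The normalization can be written out explicitly. The relation below is not stated in this text; it is inferred from the outputs of the examples that follow, where it reproduces each printed index to the displayed precision. Here \(n\) is the sample size and \(s\) is the number of columns (levels of \(Y\)):

\[
\xi_f = \sqrt{\frac{\chi^2_f}{n\,(s-1)}}
\]

For instance, a statistic of 10.043 on a three-column table with \(n = 20\) gives \(\sqrt{10.043 / 40} \approx 0.501\). Under this form, multiplying every count by a constant \(c\) multiplies both \(\chi^2_f\) and \(n\) by \(c\), so \(\xi_f\) is unchanged.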

We provide four examples to illustrate the differences among the statistics. `x1` and `x4` represent the same non-monotonic function pattern at different sample sizes; `x2` is the transpose of `x1` and is no longer functional; and `x3` is another non-functional pattern. Among the first three examples, `x3` is the most statistically significant, but `x1` has the highest function index \(\xi_f\). This is explained by the larger sample size but smaller effect in `x3` compared with `x1`. However, when `x1` is linearly scaled to `x4` so that it has exactly the same sample size as `x3`, both the \(p\)-value and the function index \(\xi_f\) favor `x4` over `x3` as representing a stronger function.

```
library(FunChisq)
x1=matrix(c(5,1,5,1,5,1,1,0,1), nrow=3)
x1
```

```
##      [,1] [,2] [,3]
## [1,]    5    1    1
## [2,]    1    5    0
## [3,]    5    1    1
```

`fun.chisq.test(x1)`

```
##
## Functional chi-square test
##
## data: x1
## statistic = 10.043, parameter = 4, p-value = 0.03971
## sample estimates:
## non-constant function index xi.f
## 0.5010703
```

```
x2=matrix(c(5,1,1,1,5,0,5,1,1), nrow=3)
x2
```

```
##      [,1] [,2] [,3]
## [1,]    5    1    5
## [2,]    1    5    1
## [3,]    1    0    1
```

`fun.chisq.test(x2)`

```
##
## Functional chi-square test
##
## data: x2
## statistic = 8.3805, parameter = 4, p-value = 0.07859
## sample estimates:
## non-constant function index xi.f
## 0.4577259
```

```
x3=matrix(c(5,1,1,1,5,0,9,1,1), nrow=3)
x3
```

```
##      [,1] [,2] [,3]
## [1,]    5    1    9
## [2,]    1    5    1
## [3,]    1    0    1
```

`fun.chisq.test(x3)`

```
##
## Functional chi-square test
##
## data: x3
## statistic = 10.221, parameter = 4, p-value = 0.03686
## sample estimates:
## non-constant function index xi.f
## 0.4614612
```

```
x4=x1*sum(x3)/sum(x1)
x4
```

```
##      [,1] [,2] [,3]
## [1,]  6.0  1.2  1.2
## [2,]  1.2  6.0  0.0
## [3,]  6.0  1.2  1.2
```

`fun.chisq.test(x4)`

```
##
## Functional chi-square test
##
## data: x4
## statistic = 12.051, parameter = 4, p-value = 0.01697
## sample estimates:
## non-constant function index xi.f
## 0.5010703
```