# A New Integrated Mean Variance Correlation and Its Use in High-Dimensional Data Analysis

The newIMVC package implements the methods proposed in Xiong et al. (2024): a new robust correlation measure between continuous variables, and its use in hypothesis testing, feature screening, and false discovery rate (FDR) control.

## Installation

To install newIMVC, run:

install.packages("newIMVC")

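If the package may already be present, the install step can be made conditional. The helper below is a hypothetical convenience (the name `ensure_installed` is not part of newIMVC); it relies only on base R's `requireNamespace()` and `install.packages()`.

```r
# Hypothetical helper: install a package only when it is not yet available.
ensure_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report whether the package can now be loaded
  requireNamespace(pkg, quietly = TRUE)
}

# Checked here on "stats", which ships with R, so nothing is downloaded
ensure_installed("stats")
```

Calling `ensure_installed("newIMVC")` would then trigger an install only on machines where the package is missing.
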
## Example

The examples below show how to use the main functions in newIMVC.

library("newIMVC")
library("mvtnorm")
###The new IMVC measure###
n=200
x=rnorm(n)
y=x^2+rt(n,2)
IMVC(y,x,K=10,type="nonlinear")
#> [1] 0.2225841
###IMVC based feature screening###
n=200
p=300
pho1=0.8
mean_x=rep(0,p)
sigma_x=matrix(NA,nrow = p,ncol = p)
for (i in 1:p) {
for (j in 1:p) {
sigma_x[i,j]=pho1^(abs(i-j))
}
}
x=rmvnorm(n, mean = mean_x, sigma = sigma_x,method = "chol")
x1=x[,1]
x2=x[,2]
x3=x[,12]
x4=x[,22]
y=2*x1+0.5*x2+3*x3*ifelse(x3<0,1,0)+2*x4+rnorm(n)
IMVCS(y,x,K=5,d=round(n/log(n)),type="nonlinear")
#>  [1]   1   2  22   3   4  23  12  21  13   5  14  11  15  24   6   7  25   8   9
#> [20]  20  10 104  19  16 212 224 156  39  17 168  18 175 103 226 119 128 227 218
###IMVC based hypothesis test###
n=100
x=rnorm(n)
y=2*x+rt(n,2)
IMVCT(x,y,K=5,type = "linear")
#> [1] 1.506868e-16
y=2*cos(x)+rt(n,2)
IMVCT(x,y,K=5,type = "nonlinear",num_per = 100)
#> [1] 0
###IMVC based FDR control###
n=200
p=100
pho1=0.5
mean_x=rep(0,p)
sigma_x=matrix(NA,nrow = p,ncol = p)
for (i in 1:p) {
for (j in 1:p) {
sigma_x[i,j]=pho1^(abs(i-j))
}
}
x=rmvnorm(n, mean = mean_x, sigma = sigma_x,method = "chol")
x1=x[,1]
x2=x[,2]
x3=x[,3]
x4=x[,4]
x5=x[,5]
y=x1+x2+x3+x4+x5+rnorm(n)
IMVCFDR(y,x,K=5,numboot=100,timeboot=50,true_signal=c(1,2,3,4,5),null_method="hist",alpha=0.2)
#> $selected
#> [1] 3 5 4 2 6
#>
#> $FDR
#> [1] 0.2
#>
#> $Power
#> [1] 0.8
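
The reported FDR and power can be checked by hand from the selected set and `true_signal`. The sketch below assumes the usual definitions (false selections divided by the number of selections, and true selections divided by the number of true signals); these are not taken from the package source, but they agree with the values printed in the example output.

```r
# Values copied from the example output above
selected    <- c(3, 5, 4, 2, 6)
true_signal <- c(1, 2, 3, 4, 5)

# Assumed definitions (not taken from the package source):
#   false discovery proportion = false selections / number of selections
#   power                      = true selections / number of true signals
n_true_selected <- length(intersect(selected, true_signal))
fdp   <- (length(selected) - n_true_selected) / max(length(selected), 1)
power <- n_true_selected / length(true_signal)

fdp    # 0.2
power  # 0.8
```

Here variable 6 is the single false selection and variable 1 is the single missed signal, which gives 1/5 and 4/5 respectively.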

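Both simulation setups above fill the covariance matrix `sigma_x[i,j] = pho1^|i-j|` with a double loop. As a sketch, the same matrix can be built in one vectorized step with base R's `outer()`. The smaller `p` is chosen here only to keep the comparison quick, and the names `sigma_loop` and `sigma_vec` are illustrative.

```r
p    <- 50
pho1 <- 0.8

# Loop construction, as in the examples
sigma_loop <- matrix(NA, nrow = p, ncol = p)
for (i in 1:p) {
  for (j in 1:p) {
    sigma_loop[i, j] <- pho1^(abs(i - j))
  }
}

# Vectorized construction: outer() builds the matrix of i - j differences
sigma_vec <- pho1^abs(outer(1:p, 1:p, "-"))

all.equal(sigma_loop, sigma_vec)  # TRUE

# Screening size passed as d in the IMVCS call for n = 200
n <- 200
d <- round(n / log(n))
d  # 38
```

The value `d = 38` matches the number of indices printed by the `IMVCS` example.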
## References

Wei Xiong, Han Pan, and Hengjian Cui (2024). “A Robust Integrated Mean Variance Correlation and Its Use in High Dimensional Data Analysis.”