Multilevel

Usage in mixOmics

library(mixOmics)

# Simulated data: data matrix, stimulation condition and sample identifier
data(data.simu)
X.simu <- data.simu$X
stimulation <- data.simu$stimu
repeat.simu <- data.simu$sample

# One-factor multilevel sPLS-DA: 3 components, 200 variables kept on each
result.1level <- multilevel(X.simu,
                            cond = stimulation,
                            sample = repeat.simu,
                            ncomp = 3,
                            keepX = c(200, 200, 200),
                            tab.prob.gene = NULL,
                            method = 'splsda')

# 3D sample plot, coloured by stimulation
plot3dIndiv(result.1level, col = as.numeric(stimulation), cex = 0.6)

pheatmap.multilevel(result.1level,
                    col_sample = as.numeric(repeat.simu),
                    col_stimulation = unique(as.numeric(stimulation)),
                    label_annotation = NULL,
                    border = FALSE,
                    clustering_method = "ward",
                    show_colnames = FALSE,
                    show_rownames = TRUE,
                    fontsize_row = 2)

Tuning

The function ‘tune.multilevel’ tunes the number of variables to select on each component:

  • either by leave-one-out cross-validation, for a one-factor sPLS-DA analysis,
  • or by maximising the correlation between the latent variables computed on the whole data set, for a two-factor sPLS-DA analysis or for sPLS (this applies when there are too many conditions and not enough samples).
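
The leave-one-out criterion in the first bullet can be illustrated generically. The sketch below uses base R only and does not reproduce the internals of ‘tune.multilevel’: a nearest-centroid classifier with Euclidean distance stands in for sPLS-DA, and the toy data, the function loo.error and all variable names are assumptions made for illustration.

```r
# Generic leave-one-out cross-validation sketch (base R only).
# A nearest-centroid classifier stands in for sPLS-DA; this is NOT the
# internal code of tune.multilevel.
set.seed(1)
n <- 20
X <- matrix(rnorm(n * 5), nrow = n)          # toy data: 20 samples, 5 variables
y <- factor(rep(c("A", "B"), each = n / 2))  # toy class labels
X[y == "B", 1] <- X[y == "B", 1] + 3         # shift class B on variable 1

loo.error <- function(X, y) {
  pred <- character(nrow(X))
  for (i in seq_len(nrow(X))) {
    # class centroids computed without the held-out sample i
    centroids <- sapply(levels(y), function(k)
      colMeans(X[-i, , drop = FALSE][y[-i] == k, , drop = FALSE]))
    # squared Euclidean distance from sample i to each centroid
    d <- colSums((centroids - X[i, ])^2)
    pred[i] <- names(which.min(d))
  }
  # proportion of held-out samples that were misclassified
  mean(pred != as.character(y))
}

err <- loo.error(X, y)
err
```

Each sample is left out once, the classifier is refitted on the remaining samples, and the error rate is the fraction of left-out samples predicted in the wrong class. In a tuning setting, this error would be computed once per candidate number of variables and the candidates compared.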

Below is an example of sequential tuning (one component at a time) for a one-factor sPLS-DA analysis, using leave-one-out cross-validation:

# tuning parameter: the number of variables to select
# ---- for splsda - one factor: with the simu data
result.tune <- tune.multilevel(X.simu,
                               cond = stimulation,
                               sample = repeat.simu,
                               ncomp = 2,
                               test.keepX = c(5, 10, 15),
                               already.tested.X = c(50),
                               method = 'splsda',
                               dist = 'mahalanobis.dist',
                               validation = 'loo')

For a one-factor analysis, the tuning criterion is based on leave-one-out cross
Number of variables selected on the first 1 component(s) was 50

result.tune

$error
     var5     var10     var15
0.1875000 0.2291667 0.1875000

In the example above, 50 variables had already been tuned and chosen for the first component. For the second component, one would choose the number of variables with the lowest estimated error rate (here 5 or 15 variables, which tie at 0.1875).
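
Reading the minimum off the output can also be done programmatically. The sketch below copies the error rates printed above into a plain named vector (so it runs without mixOmics) and keeps every candidate that attains the minimum; the variable names and the tie-break towards the smaller number of variables are assumptions, not part of ‘tune.multilevel’.

```r
# Error rates copied from the $error output shown above
error <- c(var5 = 0.1875000, var10 = 0.2291667, var15 = 0.1875000)

# Keep every candidate that reaches the minimum, so ties stay visible
best <- names(error)[error == min(error)]

# Convert names such as "var5" back to numbers of variables
keepX.candidates <- as.integer(sub("var", "", best))

# Assumed tie-break: prefer the smaller (more parsimonious) selection
keepX.comp2 <- min(keepX.candidates)
```

With the real object, error would be replaced by result.tune$error.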

Below is an example of sequential tuning (one component at a time) for a two-factor sPLS-DA analysis, maximising the correlation on the whole data set:

data(liver.toxicity)
X.rat <- as.matrix(liver.toxicity$gene)

# Individual identifier for each of the 64 rows (repeated measures)
repeat.indiv <- c(1, 2, 1, 2, 1, 2, 1, 2, 3, 3, 4, 3, 4, 3, 4, 4,
                  5, 6, 5, 5, 6, 5, 6, 7, 7, 8, 6, 7, 8, 7, 8, 8,
                  9, 10, 9, 10, 11, 9, 9, 10, 11, 12, 12, 10, 11, 12, 11, 12,
                  13, 14, 13, 14, 13, 14, 13, 14, 15, 16, 15, 16, 15, 16, 15, 16)

# Two factors: dose and time
dose <- liver.toxicity$treatment$Dose.Group
time <- liver.toxicity$treatment$Time.Group
dose.time <- cbind(dose, time)

result.tune <- tune.multilevel(X.rat,
                               cond = dose.time,
                               sample = repeat.indiv,
                               ncomp = 2,
                               test.keepX = c(5, 10, 15),
                               already.tested.X = c(50),
                               method = 'splsda')
result.tune
$cor.value
     var5     var10     var15
0.9997513 0.9998078 0.9997054

In the example above, 50 variables had already been tuned and chosen for the first component. For the second component, one would choose the number of variables with the highest estimated correlation (here 10 variables).
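
As a small sketch of this selection step, the correlations printed above are copied into a named vector and the maximum is located with which.max. The variable names are assumptions, and combining the result with the 50 variables of the first component into a keepX vector follows the pattern of the ‘multilevel’ call in the usage section rather than any documented workflow.

```r
# Correlations copied from the $cor.value output shown above
cor.value <- c(var5 = 0.9997513, var10 = 0.9998078, var15 = 0.9997054)

# Candidate with the highest correlation
best <- names(which.max(cor.value))

# Number of variables for component 2, then the full keepX vector:
# 50 variables (already chosen) on component 1, the tuned value on component 2
keepX.comp2 <- as.integer(sub("var", "", best))
keepX <- c(50, keepX.comp2)
keepX
```

The three correlations differ only from the fourth decimal place onwards, so the criterion separates the candidates only weakly on these data.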

See also Multilevel:Liver Toxicity case study.

 

References

Liquet, B., Lê Cao, K.-A., Hocini, H. and Thiebaut, R. A novel approach for biomarker selection and the integration of repeated measures experiments from two platforms. Submitted.