Overview
The Small Round Blue Cell Tumors (SRBCT) dataset from Khan et al. (2001) contains gene expression data for 63 samples and 2308 genes. The samples are distributed across four classes as follows: 8 Burkitt Lymphoma (BL), 23 Ewing Sarcoma (EWS), 12 neuroblastoma (NB), and 20 rhabdomyosarcoma (RMS).
Usage in mixOmics
The SRBCT dataset is available in mixOmics as the srbct object, which contains the following:
srbct$gene: data frame with 63 rows and 2308 columns. The expression measures of 2308 genes for the 63 subjects.
srbct$class: class vector containing the tumor class of each case (4 classes in total).
srbct$gene.name: data frame with 2308 rows and 2 columns containing further information on the genes.
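The structure described above can be checked with a minimal R sketch. It assumes the mixOmics package is installed; the variable names X and Y are chosen here for illustration only.

```r
# Load mixOmics and the bundled SRBCT dataset
library(mixOmics)
data(srbct)

# Expression data: one row per sample, one column per gene
X <- srbct$gene
# Tumor class of each sample
Y <- srbct$class

dim(X)                 # number of samples and genes
summary(Y)             # number of samples in each tumor class
dim(srbct$gene.name)   # dimensions of the gene annotation table
```

The dimensions and class counts printed here should agree with the figures given in the overview.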
We will now analyze srbct using sPLS-DA. The aim of this analysis is to select the genes that help predict the class of each sample.
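As a rough preview of what such an analysis looks like, the sketch below fits an sPLS-DA model and lists the genes selected on the first component. The values of ncomp and keepX are arbitrary placeholders chosen for illustration, not tuned settings, and the variable names are assumptions.

```r
library(mixOmics)
data(srbct)

X <- srbct$gene    # expression matrix (samples x genes)
Y <- srbct$class   # tumor class of each sample

# Fit sPLS-DA with 3 components, keeping 10 genes per component.
# Both settings are illustrative placeholders, not tuned values.
model <- splsda(X, Y, ncomp = 3, keepX = c(10, 10, 10))

# Names of the genes selected on the first component
selected.genes <- selectVar(model, comp = 1)$name
selected.genes
```

In practice the number of components and the number of genes to keep would normally be chosen by cross-validation rather than fixed by hand.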