Calculate Age-related Expression and Heterogeneity Changes

calc.het(exprmat, age, age_in = "days", age_to = "pw-0.25",
  batch_corr = "NC", modex = "linear", tr_log2 = T,
  sc_features = T, covariates = NA, het_change_met = "spearman",
  padj_met = "fdr")

Arguments

exprmat

a numeric matrix with the expression values, where columns are the samples and rows are probesets, transcripts, or genes.

age

a numeric vector, where the names correspond to samples (the same as colnames of the given matrix).

age_in

unit of the age vector. allowed values are 'days' or 'years'. defaults to 'days'.

age_to

final format of the age vector. allowed values are 'years', 'days', 'pw-N', and 'lg-N', where N is any number. 'pw' means power, e.g. 'pw-0.5' means sqrt(age), and 'lg' means log, e.g. 'lg-2' means log2(age). defaults to 'pw-0.25'.
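The 'pw-N' and 'lg-N' notation can be illustrated with a small sketch. The helper `transform_age` below is hypothetical and not part of the package; it assumes that ages are first converted to days using 365 days per year, which is consistent with the usedAge values shown in the Examples section (30 years gives about 10.229).

```r
# Hypothetical helper (not the package's own code): mimics the documented
# 'age_in' / 'age_to' semantics, assuming 365 days per year.
transform_age <- function(age, age_in = "days", age_to = "pw-0.25") {
  # Bring everything to days first
  age_days <- if (age_in == "years") age * 365 else age
  if (age_to == "days") return(age_days)
  if (age_to == "years") return(age_days / 365)
  # Split e.g. "pw-0.25" into the keyword "pw" and the number 0.25
  parts <- strsplit(age_to, "-", fixed = TRUE)[[1]]
  n <- as.numeric(parts[2])
  switch(parts[1],
         pw = age_days^n,               # power transformation
         lg = log(age_days, base = n),  # logarithm with base N
         stop("unsupported age_to: ", age_to))
}

ages <- c(sample1 = 30, sample2 = 27, sample3 = 20)
transform_age(ages, age_in = "years", age_to = "pw-0.25")
```

Names on the age vector are preserved by the arithmetic, so the transformed vector can still be matched to the columns of the expression matrix.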

batch_corr

batch correction strategy. available values are i) NC: No Correction, ii) QN: Quantile Normalization, iii) LR: Linear regression (requires covariates), iv) SVA: surrogate variable analysis, v) LR+QN: Linear regression followed by quantile normalization, and vi) SVA+QN: SVA followed by quantile normalization. Defaults to 'NC'.
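To give an intuition for the 'QN' option, the following is a minimal base-R sketch of quantile normalization. It is an illustration of the general technique only; the function name `quantile_normalize` and the tie-handling choice are assumptions, and the package may use a different implementation.

```r
# Minimal quantile normalization sketch (illustrative, not the package's code).
# Ties are broken by order of appearance, which is an assumption.
quantile_normalize <- function(mat) {
  ranks  <- apply(mat, 2, rank, ties.method = "first")  # rank within each sample
  sorted <- apply(mat, 2, sort)                         # sort each sample
  ref    <- rowMeans(sorted)                            # mean of each quantile
  out    <- apply(ranks, 2, function(r) ref[r])         # map ranks to reference
  dimnames(out) <- dimnames(mat)
  out
}

set.seed(1)
m <- matrix(rnorm(40, mean = rep(1:4, each = 10)), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("sample", 1:4)))
qn <- quantile_normalize(m)
apply(qn, 2, sort)  # every column now shares the same sorted values
```

After normalization all samples have an identical distribution of values, while the rank order of genes within each sample is unchanged.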

modex

expression change calculation method. 'linear' or 'loess', defaults to 'linear'

tr_log2

logical; whether to log2-transform the expression matrix. defaults to TRUE.

sc_features

logical to set whether to scale features. defaults to TRUE.

covariates

a list of covariates, where each element is a vector with sample IDs as names. defaults to NA.
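A covariates list in the described shape could be built as follows. The covariate names `sex` and `batch` and their values are made up for illustration, and whether the function expects character vectors or factors is not stated here, so treat this as a sketch.

```r
sampnames <- paste0("sample", 1:20)

# Hypothetical covariates; the names of each vector must match the
# column names of the expression matrix.
sex   <- setNames(sample(c("F", "M"), 20, replace = TRUE), sampnames)
batch <- setNames(rep(c("b1", "b2"), each = 10), sampnames)

covs <- list(sex = sex, batch = batch)
str(covs)

# It could then be passed together with a correction strategy that needs it,
# e.g.: calc.het(myexp, agevec, batch_corr = "LR", covariates = covs)
```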

het_change_met

heterogeneity change calculation method. 'LR' for linear regression, or any correlation method accepted by the cor.test() function. defaults to 'spearman'.
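For a single feature, the correlation-based option can be pictured roughly as below. This sketch assumes that heterogeneity is measured as the absolute residual from a linear fit of expression on age, which matches the "residual matrix" wording in the Examples but is not confirmed as the package's exact procedure. All data are simulated.

```r
set.seed(42)
age  <- setNames(sample(20:80, 20), paste0("sample", 1:20))
# Simulated expression whose spread grows with age (illustrative only)
expr <- setNames(rnorm(20, mean = 5 + 0.01 * age, sd = age / 40), names(age))

fit       <- lm(expr ~ age)        # expression level change with age
abs_resid <- abs(residuals(fit))   # assumed heterogeneity measure

# Correlate heterogeneity with age, as het_change_met = "spearman" would suggest
het <- cor.test(abs_resid, age, method = "spearman")
het$estimate  # positive values suggest heterogeneity increasing with age
het$p.value
```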

padj_met

method for multiple test correction. value is passed to 'p.adjust' function. defaults to 'fdr'.
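As a quick reminder of what the default does, 'fdr' in p.adjust() is an alias for the Benjamini-Hochberg method. The p-values below are invented for illustration.

```r
p <- c(0.001, 0.01, 0.02, 0.04, 0.5)

p_fdr <- p.adjust(p, method = "fdr")
p_bh  <- p.adjust(p, method = "BH")

all.equal(p_fdr, p_bh)  # 'fdr' and 'BH' give the same adjusted values
p_fdr
```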

Value

a list object with summary results including expression level and heterogeneity changes, input values, intermediate values, and heterogeneity matrix.

Examples

sampnames = paste('sample',1:20,sep='')
myexp <- sapply( 1:20, function(i){
  rnorm(n = 10000, mean = sample(1:3, 1), sd = sample(c(1, 3), 1))
})
rownames(myexp) = paste('gene', 1:10000, sep = '')
colnames(myexp) = sampnames
agevec <- sample(20:80,20)
names(agevec) = sampnames
het_result <- calc.het(myexp, agevec, age_in = 'years', tr_log2 = F)
head(het_result$sampleID)
#> [1] "sample1" "sample2" "sample3" "sample4" "sample5" "sample6"
het_result$input_expr[1:5,1:5] # expression values used as input
#>          sample1     sample2  sample3    sample4   sample5
#> gene1  1.2553171 -0.03420972 3.982986  5.9166836 5.5795243
#> gene2 -1.4372636  2.11799795 3.182873 -0.2971363 0.4746558
#> gene3  0.9944287  1.09024025 3.763126  4.1899031 0.5553416
#> gene4  1.6215527  3.72192036 3.442878  0.2738125 0.3384015
#> gene5  2.1484116  3.38209142 4.229840  1.9197492 2.2381836
head(het_result$input_age) # input ages
#> sample1 sample2 sample3 sample4 sample5 sample6
#>      30      27      20      39      51      36
head(het_result$usedAge) # ages used in calculations - transformation is applied using 'age_to' parameter
#>   sample1   sample2   sample3   sample4   sample5   sample6
#> 10.229479  9.963551  9.243378 10.922935 11.680616 10.706533
het_result$resid_mat[1:5,1:5] # heterogeneity values (residual matrix)
#> NULL
head(het_result$feature_result) # summary statistics
#> # A tibble: 6 x 7
#>   feature level_change level_change.p heterogeneity_c… heterogeneity_c…
#>   <chr>          <dbl>          <dbl>            <dbl>            <dbl>
#> 1 gene1         0.0610          0.786          -0.0466            0.846
#> 2 gene2        0.00514          0.982            0.517           0.0210
#> 3 gene3       -0.00507          0.982            0.180            0.445
#> 4 gene4          0.171          0.443            0.310            0.183
#> 5 gene5         0.0390          0.862            0.149            0.530
#> 6 gene6          0.140          0.531            0.125            0.599
#> # … with 2 more variables: level_change.p.adj <dbl>,
#> #   heterogeneity_change.p.adj <dbl>