Installing the Package

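The MetaGxPancreas package is a compendium of pancreatic cancer datasets. The package is publicly available and can be installed from Bioconductor into R version 3.6.0 or higher. Currently, the phenoData for the datasets consists of overall survival status and overall survival time. This survival information is available for 11 of the 15 datasets.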

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("MetaGxPancreas")

Loading Datasets

First, we load the MetaGxPancreas package into the workspace.

library(MetaGxPancreas)
## Loading required package: SummarizedExperiment
## Loading required package: MatrixGenerics
## Loading required package: matrixStats
## 
## Attaching package: 'MatrixGenerics'
## The following objects are masked from 'package:matrixStats':
## 
##     colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
##     colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
##     colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
##     colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
##     colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
##     colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
##     colWeightedMeans, colWeightedMedians, colWeightedSds,
##     colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
##     rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
##     rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
##     rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
##     rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
##     rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
##     rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
##     rowWeightedSds, rowWeightedVars
## Loading required package: GenomicRanges
## Loading required package: stats4
## Loading required package: BiocGenerics
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
##     as.data.frame, basename, cbind, colnames, dirname, do.call,
##     duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
##     lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
##     pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
##     tapply, union, unique, unsplit, which.max, which.min
## Loading required package: S4Vectors
## 
## Attaching package: 'S4Vectors'
## The following objects are masked from 'package:base':
## 
##     I, expand.grid, unname
## Loading required package: IRanges
## Loading required package: GenomeInfoDb
## Loading required package: Biobase
## Welcome to Bioconductor
## 
##     Vignettes contain introductory material; view with
##     'browseVignettes()'. To cite Bioconductor, see
##     'citation("Biobase")', and for packages 'citation("pkgname")'.
## 
## Attaching package: 'Biobase'
## The following object is masked from 'package:MatrixGenerics':
## 
##     rowMedians
## The following objects are masked from 'package:matrixStats':
## 
##     anyMissing, rowMedians
## Loading required package: ExperimentHub
## Loading required package: AnnotationHub
## Loading required package: BiocFileCache
## Loading required package: dbplyr
## 
## Attaching package: 'AnnotationHub'
## The following object is masked from 'package:Biobase':
## 
##     cache
pancreasData <- loadPancreasDatasets()
## snapshotDate(): 2022-10-24
## Filtering out duplicated samples: ICGC_0400, ICGC_0402, GSM388116, GSM388118, GSM388120, GSM388145, GSM299238, GSM299239, GSM299240
duplicates <- pancreasData$duplicates
SEs <- pancreasData$SEs

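This will load 15 expression datasets. Users can modify the function's parameters to exclude datasets that do not meet certain criteria at load time. The available parameters are documented in ?loadPancreasDatasets.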
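As an illustration of restricting which datasets are loaded, the sketch below passes a few filtering arguments to loadPancreasDatasets(). The argument names (removeDuplicates, minSampleSize, minNumberGenes) are assumed from the package help page and should be checked against ?loadPancreasDatasets for the installed version; the threshold values are arbitrary and chosen only for illustration.

```r
library(MetaGxPancreas)

# Argument names are assumed from the package help page
# (?loadPancreasDatasets); the thresholds are arbitrary illustrative values.
filteredData <- loadPancreasDatasets(
    removeDuplicates = TRUE,  # drop samples flagged as duplicates
    minSampleSize = 30,       # skip datasets with fewer than 30 samples
    minNumberGenes = 10000    # skip datasets with fewer than 10,000 genes
)

# The returned list has the same structure as in the default call
filteredSEs <- filteredData$SEs
names(filteredSEs)
```

Stricter thresholds return fewer datasets, so it may be worth comparing names(filteredSEs) with the names from the default call before continuing.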

Obtaining Sample Counts in Datasets

To obtain the number of samples per dataset, run the following:

numSamples <- vapply(SEs, function(SE) length(colnames(SE)), FUN.VALUE=numeric(1))
sampleNumberByDataset <- data.frame(numSamples=numSamples, row.names=names(SEs))
totalNumSamples <- sum(sampleNumberByDataset$numSamples)
sampleNumberByDataset <- rbind(sampleNumberByDataset, totalNumSamples)
rownames(sampleNumberByDataset)[nrow(sampleNumberByDataset)] <- 'Total'
knitr::kable(sampleNumberByDataset)
X0
Total 0
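The counting pattern can also be tried without downloading any data. The following is a minimal sketch that uses a small named list of plain matrices as a stand-in for SEs; the dataset names and dimensions are made up for illustration. It uses ncol(), which returns the number of columns (samples) for a matrix and, as far as I know, for a SummarizedExperiment as well.

```r
# Toy stand-in for SEs: a named list of matrices with genes in rows and
# samples in columns. The names and sizes are hypothetical.
toySEs <- list(
    datasetA = matrix(0, nrow = 5, ncol = 3),
    datasetB = matrix(0, nrow = 5, ncol = 4)
)

# One sample count per dataset; FUN.VALUE requires a single numeric value
# from every call, so a malformed element fails early
numSamples <- vapply(toySEs, function(se) ncol(se), FUN.VALUE = numeric(1))

# Tabulate the counts, then append a row holding the overall total
sampleNumberByDataset <- data.frame(numSamples = numSamples,
                                    row.names = names(toySEs))
sampleNumberByDataset <- rbind(sampleNumberByDataset, sum(numSamples))
rownames(sampleNumberByDataset)[nrow(sampleNumberByDataset)] <- "Total"

print(sampleNumberByDataset)
```

Using vapply() rather than sapply() keeps the result type predictable even when the input list is empty, in which case it returns a zero-length numeric vector.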

SessionInfo

sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.16-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.16-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
##  [3] LC_TIME=en_GB              LC_COLLATE=C
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods
## [8] base
## 
## other attached packages:
##  [1] MetaGxPancreas_1.18.0       ExperimentHub_2.6.0
##  [5] dbplyr_2.2.1                SummarizedExperiment_1.28.0
##  [7] Biobase_2.58.0              GenomicRanges_1.50.0
##  [9] GenomeInfoDb_1.34.0         IRanges_2.32.0
## [11] S4Vectors_0.36.0            BiocGenerics_0.44.0
## [13] MatrixGenerics_1.10.0       matrixStats_0.62.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.9                    lattice_0.1 -7
##  [3] Biostrings_2.66.0             png_0.1-7
##  [9] R6_2.5.1                      RSQLite_2.2.18
## [11] evaluate_0.17                 highr_0.9
## [13] httr_1.4.4                    pillar_1.8.1
## [15] zlibbioc_1.44.0               rlang_1.0.6
## [17] curl_4.3.3                    blob_1.2.3
## [19] Matrix_1.5-1                  string_1 .4.1
## [23] shiny_1.7.3                   DelayedArray_0.24.0
## [29] htmltools_0.5.3               tidyselect_1.2.0
## [31] KEGGREST_1.38.0               tibble_3.1.8
## [33] GenomeInfoDbData_1.2.9        interactiveDisplayBase_1.36.0
## [35] fansi_1.0.3                   withr_2.5.0
## [37] crayon_1.5.2                  dplyr_1.0.10
## [39] later_1.3.0                   bitops_1.0-7
## [41] rappdirs_0.3.3                grid_4.2.1
## [43] xtable_1.8-4                  lifecycle_1.0.3
## [45] DBI_1.1.3                     magrittr_2.0.3
## [47] impute_1.72.0                 cli_3.4.1
## [49] stringi_1.7.8                 cachem_1.0.6
## [51] XVector_0.38.0                promises_1.2.0.1
## [53] ellipsis_0.3.2                filelock_1.0.2
## [55] generics_0.1.3                vctrs_0.5.0
## [57] tools_4.2.1                   bit64_4.0.5
## [59] glue_1.6.2                    purrr_0.3.5
## [61] BiocVersion_3.16.0            fastmap_1.1.0
## [63] yaml_2.3.6                    AnnotationDbi_1.60.0
## [65] BiocManager_1.30.19           memoise_2.0.1
## [67] knitr_1.40