Given a vector of user-supplied gene/transcript IDs, finds the AnnotationDbi keytype (e.g. ens, refseq, symbol) that maps the highest fraction of inputs, and returns both the foreground and the full background sets as Entrez IDs. Optionally collapses transcript-style inputs to genes when requested or when reverse-mapping inflation exceeds a threshold.
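The idea of choosing the keytype with the highest mapping rate can be sketched in base R. This is a minimal illustration, not the function's actual implementation: `toy_db`, `mapping_rate()` and `pick_keytype()` are hypothetical names, and the ID-to-Entrez pairs are made-up placeholders standing in for an OrgDb.

```r
# Toy lookup tables standing in for an OrgDb: one named vector per keytype,
# mapping an input ID (name) to an Entrez ID (value). Placeholder data only.
toy_db <- list(
  ENSEMBL = c(ENSG_A = "101", ENSG_B = "102", ENSG_C = "103"),
  SYMBOL  = c(GENE1 = "101", GENE2 = "102", GENE3 = "103")
)

# Fraction of the input IDs that are found under one keytype.
mapping_rate <- function(ids, table) {
  mean(ids %in% names(table))
}

# Hypothetical helper: score every keytype, keep the best one, and report
# whether its mapping rate reaches the acceptance threshold.
pick_keytype <- function(ids, db, threshold = 0.9) {
  rates <- vapply(db, function(tbl) mapping_rate(ids, tbl), numeric(1))
  best <- names(rates)[which.max(rates)]
  list(
    keytype = best,
    rate = rates[[best]],
    meets_threshold = rates[[best]] >= threshold
  )
}

user_ids <- c("GENE1", "GENE2", "ENSG_A", "GENE3")
res <- pick_keytype(user_ids, toy_db)
str(res)
```

Here three of the four inputs are found under `SYMBOL` and one under `ENSEMBL`, so `SYMBOL` is chosen, but its rate falls below the default threshold of 0.9.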
Usage
mapIDs(
orgdb,
id_type,
foreground_ids,
background_ids = NULL,
threshold = 0.9,
transcript = FALSE,
stripVersions = TRUE,
inflateThresh = 1
)
Arguments
- orgdb
An OrgDb object (e.g. org.Hs.eg.db).
- id_type
Type of identifier supplied in foreground and background IDs.
- foreground_ids
Character vector of gene or transcript IDs (e.g. Ensembl, RefSeq, gene symbols) to analyze.
- background_ids
Character vector of gene or transcript IDs to use as background set.
- threshold
Fraction in range 0 to 1; minimum mapping rate to accept a keytype without falling back (default 0.9).
- transcript
Logical; if TRUE, analyze as transcript-level IDs (default FALSE).
- stripVersions
Logical; strip version suffixes (e.g. ".1") from Ensembl/RefSeq IDs.
- inflateThresh
Fraction in range 0 to 1; if reverse-mapping shows excessive inflation, automatically collapse transcripts to genes (default 1, i.e. 100%).
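As an illustration of what stripping version suffixes means for `stripVersions`, the following base R sketch removes a trailing ".<digits>" from each ID. `strip_version()` is a hypothetical helper, not part of the package, and the version numbers in the inputs are chosen only for illustration.

```r
# Hypothetical helper: drop a trailing ".<digits>" version suffix, if present.
strip_version <- function(ids) {
  sub("\\.[0-9]+$", "", ids)
}

# Versioned Ensembl/RefSeq-style IDs plus one ID without a suffix.
versioned <- c("ENSG00000139618.17", "ENST00000245479.3", "NM_000059.4", "BRCA2")
stripped <- strip_version(versioned)
print(stripped)
```

A pattern like this also trims any other ID that happens to end in a dot followed by digits, which is one reason such stripping is usually kept optional.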
Value
A named list combining the original mapping components with:
- fg_ids
data.frame(entrez, mappedID) for your foreground set
- bg_ids
data.frame(entrez, mappedID) for the full background
- userIDtype
the chosen keytype (e.g. "ensembl")
- transcript
logical, whether transcript-level mapping was used
Examples
library(org.Hs.eg.db)
orgdb <- org.Hs.eg.db
my_genes <- c("ENSG00000139618", "ENSG00000157764")
ids <- mapIDs(orgdb = orgdb,
foreground_ids = my_genes,
id_type = "ENSEMBL")
#> Mapping foreground ids to ENTREZIDs...
#> 'select()' returned 1:1 mapping between keys and columns
#> Successfully mapped 100% of the provided foreground ids.
#> Building background id set...
#> 'select()' returned 1:many mapping between keys and columns
#> Checking for inflation...
#> 'select()' returned 1:1 mapping between keys and columns
# Transcript IDs
my_transcripts <- c("ENST00000245479", "ENST00000633194")
ids <- mapIDs(orgdb = orgdb,
foreground_ids = my_transcripts,
id_type = "ENSEMBLTRANS",
transcript = TRUE)
#> Mapping foreground ids to ENTREZIDs...
#> 'select()' returned 1:1 mapping between keys and columns
#> Successfully mapped 100% of the provided foreground ids.
#> Building background id set...
#> 'select()' returned 1:many mapping between keys and columns