Retrieve Promoter Regions and Sample Background Elements
Source:R/getCoordinates.R
getCoordinates.RdGiven a mapped set of foreground IDs (with Entrez and mappedID), this function:
Fetches genomic coordinates for each feature (gene or transcript), according to the chosen
TSS.method.Extracts promoter windows around those coordinates.
Optionally reduces overlaps and selects one promoter per gene.
Samples background promoter elements matching the foreground
Usage
getCoordinates(
mapped,
txdb,
transcript = FALSE,
n_ratio = 1,
promoterWindow = c(upstream = 300, downstream = 50),
standardChroms = TRUE,
reduceOverlaps = TRUE,
overlapMinGap = 0,
onePromoterPerGene = FALSE
)Arguments
- mapped
A list as returned by
mapIDs(), containing at leastfg_idsandbg_ids.- txdb
A TxDb object. (e.g.
TxDb.Hsapiens.UCSC.hg38.knownGene).- transcript
Logical;
TRUEfor transcript-level coordinates,FALSEfor gene-level.- n_ratio
Numeric; ratio of ranges to retain in the background set. The number of background ranges will be equal to
n_ratiomultiplied by the number of foreground ranges.- promoterWindow
Numeric named vector of lengths:
c(upstream, downstream)(defaultc(300,50)).- standardChroms
Logical; restrict to standard chromosomes.
- reduceOverlaps
Logical; merge any overlapping promoter windows.
- overlapMinGap
Numeric; minimum gap when reducing overlaps.
- onePromoterPerGene
Logical; if
TRUE, choose one promoter per gene.
Value
A named list from the final
.defineBackgroundElements():
- backgroundElements
GRangesof sampled background promoters- foregroundElements
GRangesof foreground promoters- backgroundUniverse
GRangesof the pruned universe- matchObject
MatchedGRangesifbgMethod="matched", elseNULL
Examples
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(org.Hs.eg.db)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
orgdb <- org.Hs.eg.db
my_genes <- c("ENSG00000139618", "ENSG00000157764")
ids <- mapIDs(orgdb = orgdb,
foreground_ids = my_genes,
id_type = "ENSEMBL")
#> Mapping foreground ids to ENTREZIDs...
#> 'select()' returned 1:1 mapping between keys and columns
#> Successfully mapped 100% of the provided foreground ids.
#> Building background id set...
#> 'select()' returned 1:many mapping between keys and columns
#> Checking for inflation...
#> 'select()' returned 1:1 mapping between keys and columns
filtered <- poolFilter(ids, geneType="protein-coding")
coords <- getCoordinates(mapped = filtered, txdb = txdb)
#> Combining overlapping promoter ranges within genes. (This step may take 1-2 minutes)...
#> Defining background elements...