| makeTranscriptDbFromBiomart {GenomicFeatures} | R Documentation |
The makeTranscriptDbFromBiomart function allows the user
to make a TranscriptDb object from transcript annotations
available on a BioMart database.
getChromInfoFromBiomart(biomart="ensembl",
                        dataset="hsapiens_gene_ensembl",
                        id_prefix="ensembl_",
                        host="www.biomart.org",
                        port=80)
makeTranscriptDbFromBiomart(biomart="ensembl",
                            dataset="hsapiens_gene_ensembl",
                            transcript_ids=NULL,
                            circ_seqs=DEFAULT_CIRC_SEQS,
                            filters="",
                            id_prefix="ensembl_",
                            host="www.biomart.org",
                            port=80,
                            miRBaseBuild=NA)
biomart |
which BioMart database to use.
Get the list of all available BioMart databases with the
|
dataset |
which dataset from BioMart. For example:
|
transcript_ids |
optionally, only retrieve transcript annotation data for the specified set of transcript ids. If this is used, then the meta information displayed for the resulting TranscriptDb object will say 'Full dataset: no'. Otherwise it will say 'Full dataset: yes'. |
circ_seqs |
a character vector listing the chromosomes that should be marked as circular. |
filters |
Additional filters to use in the BioMart query. Must be
a named list. An example is |
host |
The host URL of the BioMart. Defaults to www.biomart.org. |
port |
The port to use in the HTTP communication with the host. |
id_prefix |
Specifies the prefix used in BioMart attributes. For
example, some BioMarts may have an attribute specified as
|
miRBaseBuild |
specify the string for the appropriate build
information from mirbase.db to use for microRNAs. This can be
learned by calling |
makeTranscriptDbFromBiomart is a convenience function that feeds
data from a BioMart database to the lower level makeTranscriptDb
function.
See ?makeTranscriptDbFromUCSC for a similar function
that feeds data from the UCSC source.
BioMart databases that are known to have compatible transcript annotations are:
the most recent ensembl: ENSEMBL GENES (SANGER UK)
the most recent bacterial_mart: ENSEMBL BACTERIA (EBI UK)
the most recent fungal_mart: ENSEMBL FUNGAL (EBI UK)
the most recent metazoa_mart: ENSEMBL METAZOA (EBI UK)
the most recent plant_mart: ENSEMBL PLANT (EBI UK)
the most recent protist_mart: ENSEMBL PROTISTS (EBI UK)
the most recent ensembl_expressionmart: EURATMART (EBI UK)
Not all annotations will have CDS information.
A TranscriptDb object.
M. Carlson and H. Pages
listMarts,
useMart,
listDatasets,
DEFAULT_CIRC_SEQS,
makeTranscriptDbFromUCSC,
makeTranscriptDbFromGFF,
makeTranscriptDb,
supportedMiRBaseBuildValues
## Discover which datasets are available in the "ensembl" BioMart
## database:
library("biomaRt")
head(listDatasets(useMart("ensembl")))
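## Chromosome information for the same dataset can be looked up before
## building the full TranscriptDb. This is only a sketch: it relies on
## the default arguments shown in the usage above, needs network
## access, and is wrapped in try() in case the host is unreachable:
chrominfo <- try(getChromInfoFromBiomart(biomart="ensembl",
                                         dataset="hsapiens_gene_ensembl"))
if (!inherits(chrominfo, "try-error"))
    head(chrominfo)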
## Retrieving an incomplete transcript dataset for Human from the
## "ensembl" BioMart database:
transcript_ids <- c(
    "ENST00000268655",
    "ENST00000313243",
    "ENST00000341724",
    "ENST00000400839",
    "ENST00000435657",
    "ENST00000478783"
)
txdb <- makeTranscriptDbFromBiomart(transcript_ids=transcript_ids)
txdb # note that these annotations match the GRCh37 genome assembly
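## A quick look at what was retrieved. transcripts() and exonsBy() are
## GenomicFeatures accessors that are not described on this page, so
## treat this as an illustrative sketch rather than part of the
## documented interface of makeTranscriptDbFromBiomart:
transcripts(txdb)
exonsBy(txdb, by="tx")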
## Now what if we want to use another mirror? We might make use of the
## new host argument. But wait! If we use biomaRt, we can see that
## this host has named the mart differently!
listMarts(host="uswest.ensembl.org")
## Therefore we must also change the name passed into the "biomart"
## argument thusly:
try(
    txdb <- makeTranscriptDbFromBiomart(biomart="ENSEMBL_MART_ENSEMBL",
                                        transcript_ids=transcript_ids,
                                        host="uswest.ensembl.org")
)
txdb
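## Restricting the query with the "filters" argument, which must be a
## named list. The "chromosome_name" filter and its value "21" are
## assumptions made for illustration; check which filters the dataset
## actually offers (e.g. with listFilters() from biomaRt) first:
chr21_filter <- list(chromosome_name="21")
try(
    txdb21 <- makeTranscriptDbFromBiomart(filters=chr21_filter)
)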