
Using tximeta to create LinkedTxome

To create a LinkedTxome, tximeta uses two files: a FASTA file and a GTF file.

For other organisms or databases, all you need is the GTF file with the transcript and gene annotation. The FASTA file must be the one containing the transcript sequences (not the genome).

Note: even if your files come from GENCODE or Ensembl, set the source to a custom name such as “localGENCODE”; otherwise tximeta may try to download the matching resources itself instead of using your local files.

The isoformic package provides an auxiliary function, download_reference(), that downloads the relevant reference files from the GENCODE project for both Homo sapiens and Mus musculus.

# In this example we are going to save the files to a temporary directory
base_dir <- fs::path_temp("isoformic_ref")
base_dir <- fs::path(base_dir, "gencode_v46")
fs::dir_create(base_dir)

NOTE: The following chunk downloads approximately 200 MB of data. Only run it if you are sure you need the files.

# For the GTF file
download_reference(
  version = "46",
  reference = "gencode",
  file_type = "gtf",
  organism = "human",
  output_path = base_dir
)

# For the transcriptome FASTA file
download_reference(
  version = "46",
  reference = "gencode",
  file_type = "fasta",
  organism = "human",
  output_path = base_dir
)

# Mouse transcriptome FASTA
download_reference(
  version = "M35",
  reference = "gencode",
  file_type = "fasta",
  organism = "mouse",
  output_path = fs::path_temp("isoformic_ref", "gencode_M35")
)

# List the files downloaded for the human reference
fs::dir_ls(base_dir)
gtf_file_path <- fs::path(base_dir, "gencode.v46.annotation.gtf.gz")
fasta_file_path <- fs::path(base_dir, "gencode.v46.transcripts.fa.gz")

fs::file_exists(gtf_file_path)
fs::file_exists(fasta_file_path)
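
Before linking the transcriptome, it can help to fail early when a reference file is missing rather than discovering it inside a later call. The sketch below assumes a hypothetical helper, check_reference_files(), which is not part of isoformic, tximeta, or fs. It uses only base R and is demonstrated on stand-in files in a temporary directory, not on the real downloads.

```r
# Hypothetical helper (not from isoformic or tximeta): stop with a clear
# message if any of the given paths does not exist.
check_reference_files <- function(paths) {
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("Missing reference file(s): ", paste(missing, collapse = ", "))
  }
  invisible(paths)
}

# Demonstration on stand-in files in a temporary directory
demo_dir <- file.path(tempdir(), "check_reference_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_gtf <- file.path(demo_dir, "demo.annotation.gtf.gz")
demo_fasta <- file.path(demo_dir, "demo.transcripts.fa.gz")
file.create(demo_gtf)

# Only the GTF stand-in exists, so the check reports the FASTA as missing
result <- tryCatch(
  check_reference_files(c(demo_gtf, demo_fasta)),
  error = function(e) conditionMessage(e)
)
print(result)
```

In the workflow above, the equivalent call would be check_reference_files(c(gtf_file_path, fasta_file_path)).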

# Confirm that the reference directory exists
fs::dir_exists(base_dir)

# Path of the JSON file that will describe the linked transcriptome
json_file_path <- paste0(base_dir, ".json")
fs::file_create(json_file_path)

tximeta::makeLinkedTxome(
  indexDir = base_dir, # tximeta expects the Salmon index directory here
  source = "localGENCODE",
  organism = "Homo sapiens",
  release = "46",
  genome = "GRCh38",
  fasta = fasta_file_path,
  gtf = gtf_file_path,
  write = TRUE,
  jsonFile = json_file_path
)


# In a later session, load the linked transcriptome back from its JSON file
tximeta::loadLinkedTxome(json_file_path)
library(tximeta)

library(macrophage)
dir <- system.file("extdata", package = "macrophage")
fs::dir_ls(dir)
tximeta::makeLinkedTxome(
  indexDir = file.path(dir, "gencode.v29_salmon_0.12.0"),
  source = "localGENCODE",
  organism = "Homo sapiens",
  release = "29", # matches the GENCODE v29 index, FASTA, and GTF used here
  genome = "GRCh38",
  fasta = "ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.transcripts.fa.gz",
  gtf = file.path(dir, "gencode.v29.annotation.gtf.gz"), # local version
  write = FALSE
)

Using custom GTF or GFF files

annot_list <- prepare_annotation("data-raw/gencode.v46.annotation.gtf.gz")

annot_list$gene
annot_list$transcript
annot_list$exon
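
To make the gene, transcript, and exon elements more concrete, the sketch below splits a tiny, made-up GTF into one table per feature type using base R only. It illustrates the GTF layout and is not the implementation of prepare_annotation(); the records, the column names, and the helper get_gtf_attribute() are all hypothetical.

```r
# Three made-up GTF records (tab-separated, 9 columns each)
gtf_lines <- c(
  "chr1\tdemo\tgene\t100\t900\t.\t+\t.\tgene_id \"G1\"; gene_name \"DEMO\";",
  "chr1\tdemo\ttranscript\t100\t900\t.\t+\t.\tgene_id \"G1\"; transcript_id \"T1\";",
  "chr1\tdemo\texon\t100\t300\t.\t+\t.\tgene_id \"G1\"; transcript_id \"T1\";"
)

# Read the records with the nine standard GTF columns
gtf <- read.delim(
  text = gtf_lines,
  header = FALSE,
  quote = "",
  stringsAsFactors = FALSE,
  col.names = c("seqname", "source", "feature", "start", "end",
                "score", "strand", "frame", "attribute")
)

# Hypothetical helper: pull one key's value out of the attribute column,
# returning NA where the key is absent
get_gtf_attribute <- function(attribute, key) {
  pattern <- paste0('.*', key, ' "([^"]*)".*')
  ifelse(grepl(pattern, attribute), sub(pattern, "\\1", attribute), NA_character_)
}

gtf$gene_id <- get_gtf_attribute(gtf$attribute, "gene_id")
gtf$transcript_id <- get_gtf_attribute(gtf$attribute, "transcript_id")

# One table per feature type
by_feature <- split(gtf, gtf$feature)
print(names(by_feature))
print(by_feature$transcript[, c("feature", "gene_id", "transcript_id")])
```

Real GENCODE files carry many more attributes per record, so a dedicated parser is usually a better choice than this regular-expression approach for anything beyond a quick look.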