Runs gene set enrichment analysis (GSEA) with the fgsea package on transcript-level differential expression results, separately for each transcript type. For every type in its internal list, the function filters the input to that type and runs GSEA on the remaining transcripts.

Usage

run_enrichment(det_df, genesets_list, pval_cutoff = 0.05, lfc_cutoff = 1)

Arguments

det_df

A data.frame or tibble containing transcript-level differential expression results, including transcript_type, log2FC, and gene_name columns.

genesets_list

A list of gene sets to be used in the enrichment analysis.

pval_cutoff

A numeric value specifying the p-value cutoff for the enrichment results. Default is 0.05.

lfc_cutoff

A numeric value specifying the log2 fold-change cutoff for filtering transcripts. Default is 1.

Value

A tibble with one row per enriched pathway and transcript type. Columns include the pathway name (pathway), p-value (pval), adjusted p-value (padj), and the transcript type the result belongs to (experiment).

Details

The function defines a list of transcript types and their corresponding labels. It then filters the input differential expression data for each transcript type, ranks the genes by log2 fold-change, and performs GSEA using the fgsea package.
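
The filter, rank, and run sequence described above can be sketched as follows. This is a minimal illustration for a single transcript type, not the package's actual code: the input values are made up, the log2 fold-change cutoff is left out because its exact rule is not stated here, and the fgsea call runs only if the package is installed.

```r
# Hypothetical toy input; values are made up for illustration.
det_df <- data.frame(
  gene_name = c("GeneA", "GeneB", "GeneC", "GeneD", "GeneE"),
  transcript_type = c(
    "protein_coding", "protein_coding", "retained_intron",
    "protein_coding", "retained_intron"
  ),
  log2FC = c(1.5, -2.0, 0.8, -1.2, 2.1)
)

# Step 1: keep a single transcript type (run_enrichment repeats this per type).
pc_df <- det_df[det_df$transcript_type == "protein_coding", ]

# Step 2: build a named numeric vector of log2FC values, sorted in
# decreasing order. This is the shape fgsea expects for `stats`.
ranks <- setNames(pc_df$log2FC, pc_df$gene_name)
ranks <- sort(ranks, decreasing = TRUE)

# Step 3: run GSEA on the ranked vector, if fgsea is available.
genesets_list <- list(
  Pathway1 = c("GeneA", "GeneD"),
  Pathway2 = c("GeneB")
)
if (requireNamespace("fgsea", quietly = TRUE)) {
  res <- fgsea::fgsea(pathways = genesets_list, stats = ranks)
  # Label the result with the transcript type it came from.
  res$experiment <- "protein_coding"
}
```

Transcript-level input can contain several rows with the same gene_name. fgsea warns about duplicated names in the stats vector, so a real workflow would likely need to collapse or deduplicate them first.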

Examples

# Sample differential expression data
det_df <- data.frame(
  gene_name = c("GeneA", "GeneB", "GeneC", "GeneD"),
  transcript_type = c(
    "protein_coding", "retained_intron",
    "processed_transcript", "nonsense_mediated_decay"
  ),
  log2FC = c(1.5, -2.0, 0.8, -1.2)
)

# Sample gene sets
genesets_list <- list(
  Pathway1 = c("GeneA", "GeneC"),
  Pathway2 = c("GeneB", "GeneD")
)

# Run enrichment analysis
fgsea_results_df <- run_enrichment(
  det_df = det_df,
  genesets_list = genesets_list,
  pval_cutoff = 0.05,
  lfc_cutoff = 1
)
#> Warning: All values in the stats vector are greater than zero and scoreType is "std", maybe you should switch to scoreType = "pos".
#> Warning: All values in the stats vector are greater than zero and scoreType is "std", maybe you should switch to scoreType = "pos".

# View the results
print(fgsea_results_df)
#> # A tibble: 0 × 9
#> # ℹ 9 variables: pathway <chr>, pval <dbl>, padj <dbl>, log2err <dbl>,
#> #   ES <dbl>, NES <dbl>, size <int>, leadingEdge <list>, experiment <chr>
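
The toy input above returns an empty tibble, so it does not show how a result is typically used. The sketch below works on a hand-built data frame with made-up values; only the column names (pathway, padj, NES, experiment) are taken from the output printed above, and the 0.05 threshold is an assumption chosen for illustration.

```r
# Hypothetical result with made-up values; column names follow the
# tibble header printed above.
res <- data.frame(
  pathway = c("Pathway1", "Pathway2", "Pathway1"),
  padj = c(0.01, 0.20, 0.03),
  NES = c(1.8, -0.9, -2.1),
  experiment = c("protein_coding", "protein_coding", "retained_intron")
)

# Keep pathways below the adjusted p-value threshold,
# then order by absolute normalized enrichment score.
sig <- res[res$padj < 0.05, ]
sig <- sig[order(-abs(sig$NES)), ]

# Split into one data frame per transcript type.
by_type <- split(sig, sig$experiment)
print(names(by_type))
```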