Run Gene Set Enrichment Analysis for Different Transcript Types
Source: R/functions.R
run_enrichment.Rd
Performs gene set enrichment analysis (GSEA) on differential expression results for various transcript types,
using the fgsea package. The function iterates over the specified transcript types, filters the data accordingly,
and runs GSEA for each type.
Arguments
- det_df
A data.frame or tibble containing transcript-level differential expression results, including transcript_type, log2FC, and gene_name columns.
- genesets_list
A list of gene sets to be used in the enrichment analysis.
- pval_cutoff
A numeric value specifying the p-value cutoff for the enrichment results. Default is 0.05.
- lfc_cutoff
A numeric value specifying the log2 fold-change cutoff for filtering transcripts. Default is 1.
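The example further down passes genesets_list as a named list of character vectors, one vector of gene names per pathway. If pathway memberships are stored as a long two-column table instead, a list of that shape can be built with base R's split(). This is a minimal sketch; the membership table and its column names are hypothetical, not something the package provides.

```r
# Hypothetical long-format table: one row per (pathway, gene) pair
membership <- data.frame(
  pathway   = c("Pathway1", "Pathway1", "Pathway2", "Pathway2"),
  gene_name = c("GeneA", "GeneC", "GeneB", "GeneD"),
  stringsAsFactors = FALSE
)

# split() groups the gene names by pathway and returns a named list
genesets_list <- split(membership$gene_name, membership$pathway)

str(genesets_list)
```

The gene identifiers in the list should match the values in the gene_name column of det_df, since that is how genes are looked up in each set.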
Value
A tibble containing the enrichment analysis results for each transcript type, including pathway names,
p-values, adjusted p-values, and the transcript type (experiment).
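A result of this shape can be post-processed with ordinary data-frame operations. The sketch below uses a mock table with a subset of the returned columns; every value in it is invented for illustration and is not real output. It keeps pathways below an adjusted p-value threshold and flattens the leadingEdge list column for export. The 0.05 threshold and the choice of padj as the filter column are assumptions, not documented behaviour of run_enrichment.

```r
# Mock results: column names follow the returned tibble, values are made up
mock_results <- data.frame(
  pathway    = c("Pathway1", "Pathway2", "Pathway1"),
  pval       = c(0.001, 0.20, 0.03),
  padj       = c(0.002, 0.20, 0.06),
  NES        = c(2.1, -0.8, 1.6),
  experiment = c("protein_coding", "protein_coding", "retained_intron"),
  stringsAsFactors = FALSE
)
# leadingEdge is a list column: one character vector of genes per row
mock_results$leadingEdge <- list(c("GeneA", "GeneC"), "GeneB", "GeneA")

# Keep pathways below the (assumed) adjusted p-value threshold, smallest first
sig <- mock_results[mock_results$padj < 0.05, ]
sig <- sig[order(sig$padj), ]

# Collapse each leading-edge gene vector into one comma-separated string
sig$leadingEdge <- vapply(sig$leadingEdge, paste, character(1), collapse = ", ")

# Number of significant pathways per transcript type
print(table(sig$experiment))
```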
Details
The function defines a list of transcript types and their corresponding labels.
It then filters the input differential expression data for each transcript type, ranks the genes by log2 fold-change,
and performs GSEA using the fgsea package.
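The filter, rank, and enrich steps described above can be sketched for a single transcript type as follows. This is not the package's implementation: rank_genes_for_type is a hypothetical helper, the absolute-value filter on log2FC is an assumption about how lfc_cutoff is applied, and the input data are synthetic. The fgsea call only runs if the fgsea package is installed.

```r
# Hypothetical helper, not exported by the package
rank_genes_for_type <- function(det_df, type, lfc_cutoff = 1) {
  # Assumed filter: rows of the requested type whose |log2FC| meets the cutoff
  keep <- det_df$transcript_type == type & abs(det_df$log2FC) >= lfc_cutoff
  sub <- det_df[keep, , drop = FALSE]
  # fgsea takes a named numeric vector: names are genes, values are the statistic
  ranks <- stats::setNames(sub$log2FC, sub$gene_name)
  sort(ranks, decreasing = TRUE)
}

# Synthetic input: 200 protein-coding transcripts with random fold changes
set.seed(1)
toy_df <- data.frame(
  gene_name       = paste0("Gene", seq_len(200)),
  transcript_type = "protein_coding",
  log2FC          = rnorm(200, sd = 2),
  stringsAsFactors = FALSE
)
toy_sets <- list(
  PathwayUp     = toy_df$gene_name[order(-toy_df$log2FC)][1:15],
  PathwayRandom = toy_df$gene_name[seq(1, 200, by = 10)]
)

ranks <- rank_genes_for_type(toy_df, "protein_coding", lfc_cutoff = 1)

# Run the enrichment only when fgsea is available
if (requireNamespace("fgsea", quietly = TRUE)) {
  res <- as.data.frame(
    fgsea::fgsea(pathways = toy_sets, stats = ranks, minSize = 5)
  )
  res$experiment <- "protein_coding"  # label rows by transcript type
  print(res[, c("pathway", "pval", "padj", "NES", "experiment")])
}
```

Real transcript-level data can contain several transcripts per gene, which would give duplicated names in the ranking vector. The sketch does not handle that case.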
Examples
# Sample differential expression data
det_df <- data.frame(
  gene_name = c("GeneA", "GeneB", "GeneC", "GeneD"),
  transcript_type = c(
    "protein_coding", "retained_intron",
    "processed_transcript", "nonsense_mediated_decay"
  ),
  log2FC = c(1.5, -2.0, 0.8, -1.2)
)
# Sample gene sets
genesets_list <- list(
  Pathway1 = c("GeneA", "GeneC"),
  Pathway2 = c("GeneB", "GeneD")
)
# Run enrichment analysis
fgsea_results_df <- run_enrichment(
  det_df = det_df,
  genesets_list = genesets_list,
  pval_cutoff = 0.05,
  lfc_cutoff = 1
)
#> Warning: All values in the stats vector are greater than zero and scoreType is "std", maybe you should switch to scoreType = "pos".
#> Warning: All values in the stats vector are greater than zero and scoreType is "std", maybe you should switch to scoreType = "pos".
# View the results
print(fgsea_results_df)
#> # A tibble: 0 × 9
#> # ℹ 9 variables: pathway <chr>, pval <dbl>, padj <dbl>, log2err <dbl>,
#> # ES <dbl>, NES <dbl>, size <int>, leadingEdge <list>, experiment <chr>