Skip to content

Adding EnhancedVolcano function to the package#87

Open
lobanovav wants to merge 3 commits intoDEVfrom
DEV_EnhancedVolcano
Open

Adding EnhancedVolcano function to the package#87
lobanovav wants to merge 3 commits intoDEVfrom
DEV_EnhancedVolcano

Conversation

@lobanovav
Copy link
Copy Markdown
Member

A quick description: Enhanced Volcano is now a function into the R folder and created a unit test with synthetic data

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new Volcano_Plot() function to generate static + interactive volcano plots (via EnhancedVolcano + plotly) and adds a testthat unit test suite using synthetic DEG data.

Changes:

  • Added R/Volcano_Plot.R implementing volcano plot generation for one or multiple comparisons, returning plots plus ranked output data.
  • Added tests/testthat/test-Volcano_Plot.R covering basic execution, return structure, multiple comparisons, and a few edge cases.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File Description
R/Volcano_Plot.R New exported plotting function using EnhancedVolcano with optional labeling and interactive plotly output.
tests/testthat/test-Volcano_Plot.R New tests validating Volcano_Plot() output structure and a few edge cases.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +124 to +129
if (nchar(change_lfc_name) == 0){lfc_name = lfc.col[i]}
if (nchar(change_sig_name) == 0){sig_name = sig.col[i]}
colnames(df) <- c(label.col, change_lfc_name, sig_name)
} else {
lfc_name = lfc.col[i]
sig_name = sig.col[i]
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the use_custom_lab branch, lfc_name/sig_name can be undefined when change_lfc_name/change_sig_name are non-empty (the typical case), which will error at colnames(df) <- ... and later lookups. Set lfc_name/sig_name deterministically (e.g., to the chosen label names) and ensure they always correspond to actual column names in df.

Suggested change
if (nchar(change_lfc_name) == 0){lfc_name = lfc.col[i]}
if (nchar(change_sig_name) == 0){sig_name = sig.col[i]}
colnames(df) <- c(label.col, change_lfc_name, sig_name)
} else {
lfc_name = lfc.col[i]
sig_name = sig.col[i]
lfc_name <- if (nchar(change_lfc_name) == 0) lfc.col[i] else change_lfc_name
sig_name <- if (nchar(change_sig_name) == 0) sig.col[i] else change_sig_name
colnames(df) <- c(label.col, lfc_name, sig_name)
} else {
lfc_name <- lfc.col[i]
sig_name <- sig.col[i]

Copilot uses AI. Check for mistakes.

# By default, nothing will be greater than maxy. User can set this value lower
keep <- -log10(df[[sig_name]]) <= maxy
df[[sig_name]][!keep] <- maxy
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The y-limit capping logic overwrites p-values with maxy (a -log10(p) value): df[[sig_name]][!keep] <- maxy. That produces invalid p-values (>1) and will move highly-significant points to the bottom of the plot. If you want to cap the plotted y, either cap using the corresponding p-value (10^-maxy) or (preferably) leave the data unchanged and apply a plotting limit (e.g., via ggplot scales/coord_cartesian).

Suggested change
df[[sig_name]][!keep] <- maxy
df[[sig_name]][!keep] <- 10^(-maxy)

Copilot uses AI. Check for mistakes.
df_sub <- df
}

genes_to_label <- as.character(df_sub[1:no_genes_to_label, label.col])
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

genes_to_label <- df_sub[1:no_genes_to_label, ...] will introduce NA labels (or error) when df_sub has fewer than no_genes_to_label rows, including when the is_red filter yields 0 rows. Consider using head()/seq_len(min(nrow(df_sub), no_genes_to_label)) and dropping NA from genes_to_label before passing to EnhancedVolcano.

Suggested change
genes_to_label <- as.character(df_sub[1:no_genes_to_label, label.col])
n_to_label <- min(nrow(df_sub), no_genes_to_label)
genes_to_label <- as.character(df_sub[seq_len(n_to_label), label.col])
genes_to_label <- genes_to_label[!is.na(genes_to_label)]

Copilot uses AI. Check for mistakes.
rank <- list()
plot_list <- list()

for(i in 1:length(lfc.col)){
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop for (i in 1:length(lfc.col)) behaves incorrectly when lfc.col is length 0 (1:0), and there’s no validation that sig.col and lfc.col have the same length before indexing both. Prefer seq_along(lfc.col) and add an early argument check that length(sig.col) == length(lfc.col) (and that length > 0).

Suggested change
for(i in 1:length(lfc.col)){
# Validate lfc.col and sig.col arguments
if (missing(lfc.col) || missing(sig.col)) {
stop("Both 'lfc.col' and 'sig.col' must be provided.")
}
if (length(lfc.col) == 0L || length(sig.col) == 0L) {
stop("'lfc.col' and 'sig.col' must have length greater than 0.")
}
if (length(lfc.col) != length(sig.col)) {
stop("'lfc.col' and 'sig.col' must have the same length.")
}
for (i in seq_along(lfc.col)) {

Copilot uses AI. Check for mistakes.
Comment on lines +74 to +82
#' @import ggplot2
#' @import dplyr
#' @import tidyr
#' @importFrom stringr str_split
#' @importFrom ggrepel geom_text_repel
#' @importFrom EnhancedVolcano EnhancedVolcano
#' @importFrom plotly ggplotly
#' @importFrom grid grid.newpage
#' @export
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function imports and calls EnhancedVolcano::EnhancedVolcano, but the package is not listed in DESCRIPTION (so R CMD check will fail once NAMESPACE is regenerated). Add EnhancedVolcano to Imports (or Suggests + guard usage), and ensure NAMESPACE/man/ are regenerated to include the new @export and imports.

Copilot uses AI. Check for mistakes.
Comment on lines +281 to +298
# Extract the data used for plotting
plot_data <- ggplot_build(p_empty)$data[[1]]

pxx <- p_empty +
xlab("Fold Change") +
ylab("Significance") +
theme_minimal() +
geom_point(aes(
text = paste("Gene:", df[[label.col]],
"<br>Log2FC:", df[[lfc_name]],
"<br>P-value:", df[[sig_name]]),
colour = as.character(plot_data$colour),
fill = as.character(plot_data$colour)
),
shape = 21,
size = 2,
stroke = 0.1) +
scale_fill_identity()
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interactive plot construction adds a new geom_point() layer on top of p_empty, so points are drawn twice (once from EnhancedVolcano, once from the added layer) and pointSize/other aesthetics may diverge between layers. Consider attaching the text aesthetic to the existing point layer (or modifying/replacing that layer) so the interactive plot doesn’t duplicate geometry and avoids the extra ggplot_build() work.

Suggested change
# Extract the data used for plotting
plot_data <- ggplot_build(p_empty)$data[[1]]
pxx <- p_empty +
xlab("Fold Change") +
ylab("Significance") +
theme_minimal() +
geom_point(aes(
text = paste("Gene:", df[[label.col]],
"<br>Log2FC:", df[[lfc_name]],
"<br>P-value:", df[[sig_name]]),
colour = as.character(plot_data$colour),
fill = as.character(plot_data$colour)
),
shape = 21,
size = 2,
stroke = 0.1) +
scale_fill_identity()
# Add hover text aesthetic directly to the existing plot
pxx <- p_empty +
xlab("Fold Change") +
ylab("Significance") +
theme_minimal() +
aes(
text = paste(
"Gene:", df[[label.col]],
"<br>Log2FC:", df[[lfc_name]],
"<br>P-value:", df[[sig_name]]
)
)

Copilot uses AI. Check for mistakes.
# Tests for Volcano_Plot function

library(testthat)
library(dplyr)
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

library(dplyr) is not used in this test file. Dropping unused test dependencies helps keep the test environment minimal and avoids masking functions during tests.

Suggested change
library(dplyr)

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants