vermeulen provides the Biomarker data set by Vermeulen et al. (2009) in tidy format.
The data set comes from a real-time quantitative PCR (qPCR) experiment that comprises:
- The raw fluorescence data of 24,576 amplification curves.
- 64 targets: 59 genes of interest and 5 reference genes.
- 366 neuroblastoma cDNA samples and 18 dilution series samples.
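The figures above are mutually consistent if every target is measured once in every sample. The following sketch only redoes that arithmetic in base R; the variable names are illustrative and not part of the package.

```r
# Counts taken from the list above
n_targets <- 59 + 5    # genes of interest + reference genes
n_samples <- 366 + 18  # neuroblastoma cDNA samples + dilution series samples

# One amplification curve per target/sample combination
# (an assumption that the stated totals support)
n_curves <- n_targets * n_samples
n_curves
#> [1] 24576
```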
Installation
Install vermeulen from CRAN:
# Install from CRAN
install.packages("vermeulen")
You can instead install the development version of vermeulen from GitHub:
# install.packages("remotes")
remotes::install_github("ramiromagno/vermeulen")
Usage
Because of CRAN package size limits, the data is not bundled with the package. After installation, retrieve it from this GitHub repository with the function get_biomarker_dataset().
library(vermeulen)
library(tibble)
library(dplyr)
# Takes a few seconds (downloading from GitHub...)
biomarker <- as_tibble(get_biomarker_dataset())
biomarker
#> # A tibble: 1,226,880 × 11
#> plate well dye target target_type sample sample_type copies dilution cycle
#> <fct> <fct> <fct> <fct> <fct> <chr> <fct> <int> <dbl> <int>
#> 1 AHCY A1 SYBR AHCY toi 1495 unk NA NA 1
#> 2 AHCY A1 SYBR AHCY toi 1495 unk NA NA 2
#> 3 AHCY A1 SYBR AHCY toi 1495 unk NA NA 3
#> 4 AHCY A1 SYBR AHCY toi 1495 unk NA NA 4
#> 5 AHCY A1 SYBR AHCY toi 1495 unk NA NA 5
#> 6 AHCY A1 SYBR AHCY toi 1495 unk NA NA 6
#> 7 AHCY A1 SYBR AHCY toi 1495 unk NA NA 7
#> 8 AHCY A1 SYBR AHCY toi 1495 unk NA NA 8
#> 9 AHCY A1 SYBR AHCY toi 1495 unk NA NA 9
#> 10 AHCY A1 SYBR AHCY toi 1495 unk NA NA 10
#> # ℹ 1,226,870 more rows
#> # ℹ 1 more variable: fluor <dbl>
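Each amplification curve appears to be identified by a plate/well pair, with one row per cycle. As a minimal sketch under that assumption, the helper below pulls out a single curve; `extract_curve()` is a hypothetical name introduced here for illustration and is not exported by vermeulen.

```r
library(dplyr)

# Hypothetical helper (not part of vermeulen): return the rows of one
# amplification curve, i.e. a single plate/well pair, ordered by cycle.
extract_curve <- function(data, plate_id, well_id) {
  curve <- filter(data, .data$plate == plate_id, .data$well == well_id)
  arrange(curve, .data$cycle)
}
```

With the `biomarker` tibble loaded as above, `curve <- extract_curve(biomarker, "AHCY", "A1")` followed by `plot(curve$cycle, curve$fluor, type = "b")` should draw the raw fluorescence of that well against cycle number.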
Types of samples:
count(
distinct(biomarker, plate, well, sample_type, copies, dilution),
sample_type,
copies,
dilution
)
#> # A tibble: 7 × 4
#> sample_type copies dilution n
#> <fct> <int> <dbl> <int>
#> 1 ntc 0 Inf 192
#> 2 std 15 10000 192
#> 3 std 150 1000 192
#> 4 std 1500 100 192
#> 5 std 15000 10 192
#> 6 std 150000 1 192
#> 7 unk NA NA 23424
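The summary can be read per target. The sketch below transcribes the printed table into a plain data frame and works out two things: the number of wells per target for each sample type, and the product of copies and dilution for the standards. It assumes that all 64 targets contribute equally to each row, and the reading of `dilution` as a fold-dilution relative to the most concentrated standard is an interpretation, not something stated by the package.

```r
# Summary table transcribed from the output above
sample_types <- data.frame(
  sample_type = c("ntc", "std", "std", "std", "std", "std", "unk"),
  copies      = c(0, 15, 150, 1500, 15000, 150000, NA),
  dilution    = c(Inf, 10000, 1000, 100, 10, 1, NA),
  n           = c(192, 192, 192, 192, 192, 192, 23424)
)

# Wells per target, assuming the 64 targets contribute equally (an assumption)
sample_types$n_per_target <- sample_types$n / 64

# For the standards only (0 * Inf is NaN for the no-template controls),
# copies times dilution is the same at every dilution step
std <- sample_types[sample_types$sample_type == "std", ]
undiluted_copies <- unique(std$copies * std$dilution)
undiluted_copies
#> [1] 150000
```

Under these assumptions each target has 3 no-template control wells, 3 replicate wells per standard dilution, and 366 wells with unknown samples.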
Code of Conduct
Please note that the vermeulen project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
References
Vermeulen et al. Predicting outcomes for children with neuroblastoma using a multigene-expression signature: a retrospective SIOPEN/COG/GPOH study. The Lancet Oncology 10, 663–671 (2009). doi: 10.1016/S1470-2045(09)70154-8.
Ruijter et al. Evaluation of qPCR curve analysis methods for reliable biomarker discovery: Bias, resolution, precision, and implications. Methods 59, 32–46 (2013). doi: 10.1016/j.ymeth.2012.08.011.