Retrieves associations via the NHGRI-EBI GWAS Catalog REST API. The REST API is queried multiple times with the criteria passed as arguments (see below). By default, all associations that match any of the supplied criteria are retrieved: this corresponds to the default value of set_operation, 'union'. If you would rather retrieve only the associations that match all the supplied criteria simultaneously, set set_operation to 'intersection'.

get_associations(study_id = NULL, association_id = NULL,
  variant_id = NULL, efo_id = NULL, pubmed_id = NULL,
  efo_trait = NULL, set_operation = "union", interactive = TRUE,
  verbose = FALSE, warnings = TRUE)

Arguments

study_id

A character vector of GWAS Catalog study accession identifiers.

association_id

A character vector of GWAS Catalog association identifiers.

variant_id

A character vector of GWAS Catalog variant identifiers.

efo_id

A character vector of EFO identifiers.

pubmed_id

An integer vector of PubMed identifiers.

efo_trait

A character vector of EFO trait descriptions, e.g., 'uric acid measurement'.

set_operation

Either 'union' or 'intersection'. This determines how associations retrieved by different criteria are combined: 'union' binds together all results, removing duplicates, whereas 'intersection' keeps only the associations found by every criterion.

interactive

A logical. If all associations are requested, whether to ask interactively for confirmation before proceeding.

verbose

A logical indicating whether the function should be verbose about the different queries or not.

warnings

A logical indicating whether to print warnings, if any.
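
As a rough illustration of how set_operation combines results, the sketch below applies base R set functions to the association identifiers shown in the Examples section for study_id = 'GCST001085' and variant_id = 'rs3798440'. It only mimics the combination step on plain character vectors; it is not the package's implementation, and the variable names are made up for the illustration.

```r
# Association identifiers returned by each criterion on its own,
# copied from the example output on this page (hypothetical variable names).
by_study <- c("24299694", "24299702", "24299710", "24299718", "24299726",
              "24299734", "24299743", "24299751", "24299759")
by_variant <- c("24299710")

# set_operation = "union": bind all results together, dropping duplicates.
union_ids <- union(by_study, by_variant)

# set_operation = "intersection": keep only what every criterion returned.
intersection_ids <- Reduce(intersect, list(by_study, by_variant))

length(union_ids)
intersection_ids
```

With these inputs the union holds the nine associations of the study, while the intersection holds only the one association that also involves the variant.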

Value

An associations object.

Details

All search criteria are vectorised, which allows batch searches: for example, you can search by multiple variant identifiers at once by passing a vector of identifiers to variant_id.
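
A batch search might look like the following sketch. It assumes that get_associations() is provided by the gwasrapidd package, that the package is installed, and that the GWAS Catalog REST API is reachable. The identifiers are taken from the example output on this page, and the results returned today may differ from those shown there.

```r
# Assumption: get_associations() comes from the gwasrapidd package.
library(gwasrapidd)

# Two variant identifiers taken from the example output on this page.
variants <- c("rs3798440", "rs9350602")

# One call, several identifiers: results for each identifier are combined
# with the default set_operation = "union".
batch <- get_associations(variant_id = variants)

# Two different criteria, keeping only associations that match both.
both <- get_associations(study_id = "GCST001085",
                         variant_id = "rs3798440",
                         set_operation = "intersection")

# The main table is stored in the "associations" slot of the returned object.
both@associations$association_id
```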

Examples

# Get an association by study identifier
get_associations(study_id = 'GCST001085')
#> An object of class "associations"
#> Slot "associations":
#> # A tibble: 9 x 17
#>   association_id   pvalue pvalue_descript… pvalue_mantissa pvalue_exponent
#>   <chr>             <dbl> <chr>                      <int>           <int>
#> 1 24299694       1.00e-10 <NA>                           1             -10
#> 2 24299702       5.00e-13 <NA>                           5             -13
#> 3 24299710       3.00e-10 <NA>                           3             -10
#> 4 24299718       3.00e-16 <NA>                           3             -16
#> 5 24299726       2.00e-44 <NA>                           2             -44
#> 6 24299734       1.00e- 7 <NA>                           1              -7
#> 7 24299743       7.00e- 9 <NA>                           7              -9
#> 8 24299751       2.00e- 7 <NA>                           2              -7
#> 9 24299759       2.00e- 9 <NA>                           2              -9
#> # … with 12 more variables: multiple_snp_haplotype <lgl>,
#> #   snp_interaction <lgl>, snp_type <chr>, standard_error <dbl>, range <chr>,
#> #   or_per_copy_number <dbl>, beta_number <dbl>, beta_unit <chr>,
#> #   beta_direction <chr>, beta_description <chr>, last_mapping_date <dttm>,
#> #   last_update_date <dttm>
#>
#> Slot "loci":
#> # A tibble: 18 x 4
#>    association_id locus_id haplotype_snp_count description
#>    <chr>             <int>               <int> <chr>
#>  1 24299694              1                  NA SNP x SNP interaction
#>  2 24299694              2                  NA SNP x SNP interaction
#>  3 24299702              1                  NA SNP x SNP interaction
#>  4 24299702              2                  NA SNP x SNP interaction
#>  5 24299710              1                  NA SNP x SNP interaction
#>  6 24299710              2                  NA SNP x SNP interaction
#>  7 24299718              1                  NA SNP x SNP interaction
#>  8 24299718              2                  NA SNP x SNP interaction
#>  9 24299726              1                  NA SNP x SNP interaction
#> 10 24299726              2                  NA SNP x SNP interaction
#> 11 24299734              1                  NA SNP x SNP interaction
#> 12 24299734              2                  NA SNP x SNP interaction
#> 13 24299743              1                  NA SNP x SNP interaction
#> 14 24299743              2                  NA SNP x SNP interaction
#> 15 24299751              1                  NA SNP x SNP interaction
#> 16 24299751              2                  NA SNP x SNP interaction
#> 17 24299759              1                  NA SNP x SNP interaction
#> 18 24299759              2                  NA SNP x SNP interaction
#>
#> Slot "risk_alleles":
#> # A tibble: 18 x 7
#>    association_id locus_id variant_id risk_allele risk_frequency genome_wide
#>    <chr>             <int> <chr>      <chr>                <dbl> <lgl>
#>  1 24299694              1 rs13420028 A                       NA TRUE
#>  2 24299694              2 rs10188442 A                       NA TRUE
#>  3 24299702              1 rs7735940  A                       NA TRUE
#>  4 24299702              2 rs12522034 C                       NA TRUE
#>  5 24299710              1 rs3798440  A                       NA TRUE
#>  6 24299710              2 rs9350602  C                       NA TRUE
#>  7 24299718              1 rs2469997  G                       NA TRUE
#>  8 24299718              2 rs6469823  C                       NA TRUE
#>  9 24299726              1 rs7827545  C                       NA TRUE
#> 10 24299726              2 rs1372662  G                       NA TRUE
#> 11 24299734              1 rs7960483  T                       NA TRUE
#> 12 24299734              2 rs10785581 C                       NA TRUE
#> 13 24299743              1 rs200752   T                       NA TRUE
#> 14 24299743              2 rs200759   A                       NA TRUE
#> 15 24299751              1 rs6452524  G                       NA TRUE
#> 16 24299751              2 rs6887846  A                       NA TRUE
#> 17 24299759              1 rs10496288 G                       NA TRUE
#> 18 24299759              2 rs10496289 C                       NA TRUE
#> # … with 1 more variable: limited_list <lgl>
#>
#> Slot "genes":
#> # A tibble: 18 x 3
#>    association_id locus_id gene_name
#>    <chr>             <int> <chr>
#>  1 24299694              1 GPR39
#>  2 24299694              2 GPR39
#>  3 24299702              1 RANBP3L
#>  4 24299702              2 RANBP3L
#>  5 24299710              1 MYO6
#>  6 24299710              2 MYO6
#>  7 24299718              1 NOV
#>  8 24299718              2 NOV
#>  9 24299726              1 ZFAT
#> 10 24299726              2 ZFAT
#> 11 24299734              1 AN06
#> 12 24299734              2 AN06
#> 13 24299743              1 MACROD2
#> 14 24299743              2 MACROD2
#> 15 24299751              1 XRCC4
#> 16 24299751              2 XRCC4
#> 17 24299759              1 intergenic
#> 18 24299759              2 intergenic
#>
#> Slot "ensembl_ids":
#> # A tibble: 18 x 4
#>    association_id locus_id gene_name  ensembl_id
#>    <chr>             <int> <chr>      <chr>
#>  1 24299694              1 GPR39      ENSG00000183840
#>  2 24299694              2 GPR39      ENSG00000183840
#>  3 24299702              1 RANBP3L    ENSG00000164188
#>  4 24299702              2 RANBP3L    ENSG00000164188
#>  5 24299710              1 MYO6       ENSG00000196586
#>  6 24299710              2 MYO6       ENSG00000196586
#>  7 24299718              1 NOV        <NA>
#>  8 24299718              2 NOV        <NA>
#>  9 24299726              1 ZFAT       ENSG00000066827
#> 10 24299726              2 ZFAT       ENSG00000066827
#> 11 24299734              1 AN06       <NA>
#> 12 24299734              2 AN06       <NA>
#> 13 24299743              1 MACROD2    ENSG00000172264
#> 14 24299743              2 MACROD2    ENSG00000172264
#> 15 24299751              1 XRCC4      ENSG00000152422
#> 16 24299751              2 XRCC4      ENSG00000152422
#> 17 24299759              1 intergenic <NA>
#> 18 24299759              2 intergenic <NA>
#>
#> Slot "entrez_ids":
#> # A tibble: 18 x 4
#>    association_id locus_id gene_name  entrez_id
#>    <chr>             <int> <chr>      <chr>
#>  1 24299694              1 GPR39      2863
#>  2 24299694              2 GPR39      2863
#>  3 24299702              1 RANBP3L    202151
#>  4 24299702              2 RANBP3L    202151
#>  5 24299710              1 MYO6       4646
#>  6 24299710              2 MYO6       4646
#>  7 24299718              1 NOV        4856
#>  8 24299718              2 NOV        4856
#>  9 24299726              1 ZFAT       57623
#> 10 24299726              2 ZFAT       57623
#> 11 24299734              1 AN06       <NA>
#> 12 24299734              2 AN06       <NA>
#> 13 24299743              1 MACROD2    140733
#> 14 24299743              2 MACROD2    140733
#> 15 24299751              1 XRCC4      7518
#> 16 24299751              2 XRCC4      7518
#> 17 24299759              1 intergenic <NA>
#> 18 24299759              2 intergenic <NA>
#>
# Get an association by association identifier
get_associations(association_id = '25389945')
#> An object of class "associations"
#> Slot "associations":
#> # A tibble: 1 x 17
#>   association_id  pvalue pvalue_descript… pvalue_mantissa pvalue_exponent
#>   <chr>            <dbl> <chr>                      <int>           <int>
#> 1 25389945       3.00e-7 <NA>                           3              -7
#> # … with 12 more variables: multiple_snp_haplotype <lgl>,
#> #   snp_interaction <lgl>, snp_type <chr>, standard_error <dbl>, range <chr>,
#> #   or_per_copy_number <dbl>, beta_number <dbl>, beta_unit <chr>,
#> #   beta_direction <chr>, beta_description <chr>, last_mapping_date <dttm>,
#> #   last_update_date <dttm>
#>
#> Slot "loci":
#> # A tibble: 1 x 4
#>   association_id locus_id haplotype_snp_count description
#>   <chr>             <int>               <int> <chr>
#> 1 25389945              1                  27 27-SNP haplotype
#>
#> Slot "risk_alleles":
#> # A tibble: 27 x 7
#>    association_id locus_id variant_id risk_allele risk_frequency genome_wide
#>    <chr>             <int> <chr>      <chr>                <dbl> <lgl>
#>  1 25389945              1 rs9486815  G                       NA FALSE
#>  2 25389945              1 rs4245535  G                       NA FALSE
#>  3 25389945              1 rs17069173 A                       NA FALSE
#>  4 25389945              1 rs9374021  G                       NA FALSE
#>  5 25389945              1 rs9374007  A                       NA FALSE
#>  6 25389945              1 rs9386694  A                       NA FALSE
#>  7 25389945              1 rs9374002  G                       NA FALSE
#>  8 25389945              1 rs218289   A                       NA FALSE
#>  9 25389945              1 rs1064346  A                       NA FALSE
#> 10 25389945              1 rs9374013  G                       NA FALSE
#> # … with 17 more rows, and 1 more variable: limited_list <lgl>
#>
#> Slot "genes":
#> # A tibble: 1 x 3
#>   association_id locus_id gene_name
#>   <chr>             <int> <chr>
#> 1 25389945              1 OSTM1
#>
#> Slot "ensembl_ids":
#> # A tibble: 1 x 4
#>   association_id locus_id gene_name ensembl_id
#>   <chr>             <int> <chr>     <chr>
#> 1 25389945              1 OSTM1     ENSG00000081087
#>
#> Slot "entrez_ids":
#> # A tibble: 1 x 4
#>   association_id locus_id gene_name entrez_id
#>   <chr>             <int> <chr>     <chr>
#> 1 25389945              1 OSTM1     28962
#>
# Get associations by variant identifier
get_associations(variant_id = 'rs3798440')
#> An object of class "associations"
#> Slot "associations":
#> # A tibble: 1 x 17
#>   association_id   pvalue pvalue_descript… pvalue_mantissa pvalue_exponent
#>   <chr>             <dbl> <chr>                      <int>           <int>
#> 1 24299710       3.00e-10 <NA>                           3             -10
#> # … with 12 more variables: multiple_snp_haplotype <lgl>,
#> #   snp_interaction <lgl>, snp_type <chr>, standard_error <dbl>, range <chr>,
#> #   or_per_copy_number <dbl>, beta_number <dbl>, beta_unit <chr>,
#> #   beta_direction <chr>, beta_description <chr>, last_mapping_date <dttm>,
#> #   last_update_date <dttm>
#>
#> Slot "loci":
#> # A tibble: 2 x 4
#>   association_id locus_id haplotype_snp_count description
#>   <chr>             <int>               <int> <chr>
#> 1 24299710              1                  NA SNP x SNP interaction
#> 2 24299710              2                  NA SNP x SNP interaction
#>
#> Slot "risk_alleles":
#> # A tibble: 2 x 7
#>   association_id locus_id variant_id risk_allele risk_frequency genome_wide
#>   <chr>             <int> <chr>      <chr>                <dbl> <lgl>
#> 1 24299710              1 rs3798440  A                       NA TRUE
#> 2 24299710              2 rs9350602  C                       NA TRUE
#> # … with 1 more variable: limited_list <lgl>
#>
#> Slot "genes":
#> # A tibble: 2 x 3
#>   association_id locus_id gene_name
#>   <chr>             <int> <chr>
#> 1 24299710              1 MYO6
#> 2 24299710              2 MYO6
#>
#> Slot "ensembl_ids":
#> # A tibble: 2 x 4
#>   association_id locus_id gene_name ensembl_id
#>   <chr>             <int> <chr>     <chr>
#> 1 24299710              1 MYO6      ENSG00000196586
#> 2 24299710              2 MYO6      ENSG00000196586
#>
#> Slot "entrez_ids":
#> # A tibble: 2 x 4
#>   association_id locus_id gene_name entrez_id
#>   <chr>             <int> <chr>     <chr>
#> 1 24299710              1 MYO6      4646
#> 2 24299710              2 MYO6      4646
#>
# Get associations by EFO trait identifier
get_associations(efo_id = 'EFO_0005537')
#> An object of class "associations"
#> Slot "associations":
#> # A tibble: 5 x 17
#>   association_id  pvalue pvalue_descript… pvalue_mantissa pvalue_exponent
#>   <chr>            <dbl> <chr>                      <int>           <int>
#> 1 38403          4.00e-6 <NA>                           4              -6
#> 2 38405          9.00e-6 <NA>                           9              -6
#> 3 38406          2.00e-8 <NA>                           2              -8
#> 4 38404          1.00e-7 <NA>                           1              -7
#> 5 38407          2.00e-8 <NA>                           2              -8
#> # … with 12 more variables: multiple_snp_haplotype <lgl>,
#> #   snp_interaction <lgl>, snp_type <chr>, standard_error <dbl>, range <chr>,
#> #   or_per_copy_number <dbl>, beta_number <dbl>, beta_unit <chr>,
#> #   beta_direction <chr>, beta_description <chr>, last_mapping_date <dttm>,
#> #   last_update_date <dttm>
#>
#> Slot "loci":
#> # A tibble: 5 x 4
#>   association_id locus_id haplotype_snp_count description
#>   <chr>             <int>               <int> <chr>
#> 1 38403                 1                  NA Single variant
#> 2 38405                 1                  NA Single variant
#> 3 38406                 1                  NA Single variant
#> 4 38404                 1                  NA Single variant
#> 5 38407                 1                  NA Single variant
#>
#> Slot "risk_alleles":
#> # A tibble: 5 x 7
#>   association_id locus_id variant_id risk_allele risk_frequency genome_wide
#>   <chr>             <int> <chr>      <chr>                <dbl> <lgl>
#> 1 38403                 1 rs4245739  C                       NA NA
#> 2 38405                 1 rs3757318  A                       NA NA
#> 3 38406                 1 rs2363956  C                       NA NA
#> 4 38404                 1 rs10069690 A                       NA NA
#> 5 38407                 1 rs10771399 G                       NA NA
#> # … with 1 more variable: limited_list <lgl>
#>
#> Slot "genes":
#> # A tibble: 5 x 3
#>   association_id locus_id gene_name
#>   <chr>             <int> <chr>
#> 1 38403                 1 MDM4
#> 2 38405                 1 ESR1
#> 3 38406                 1 intergenic
#> 4 38404                 1 TERT
#> 5 38407                 1 PTHLH
#>
#> Slot "ensembl_ids":
#> # A tibble: 5 x 4
#>   association_id locus_id gene_name  ensembl_id
#>   <chr>             <int> <chr>      <chr>
#> 1 38403                 1 MDM4       ENSG00000198625
#> 2 38405                 1 ESR1       ENSG00000091831
#> 3 38406                 1 intergenic <NA>
#> 4 38404                 1 TERT       ENSG00000164362
#> 5 38407                 1 PTHLH      ENSG00000087494
#>
#> Slot "entrez_ids":
#> # A tibble: 5 x 4
#>   association_id locus_id gene_name  entrez_id
#>   <chr>             <int> <chr>      <chr>
#> 1 38403                 1 MDM4       4194
#> 2 38405                 1 ESR1       2099
#> 3 38406                 1 intergenic <NA>
#> 4 38404                 1 TERT       7015
#> 5 38407                 1 PTHLH      5744
#>