The association object consists of six slots, each a table (tibble), that together form a relational database of a subset of GWAS Catalog associations. Each association is an observation (row) in the associations table, which is the main table. All tables share the column association_id: it is the primary key of the associations table and the key that links every other table back to it.

Slots

associations

A tibble listing associations. Columns:

association_id

GWAS Catalog association accession identifier, e.g., "20250".

pvalue

Reported p-value for strongest variant risk or effect allele.

pvalue_description

Information describing context of p-value.

pvalue_mantissa

Mantissa of p-value.

pvalue_exponent

Exponent of p-value.

multiple_snp_haplotype

Whether the association is for a multi-SNP haplotype.

snp_interaction

Whether the association is for a SNP-SNP interaction.

snp_type

Whether the SNP has previously been reported. Either 'known' or 'novel'.

risk_frequency

Reported risk/effect allele frequency associated with strongest SNP in controls.

standard_error

Standard error of the effect size.

range

Reported 95% confidence interval associated with strongest SNP risk allele, along with unit in the case of beta coefficients. If 95% CIs have not been reported, these are estimated using the standard error, when available.

or_per_copy_number

Reported odds ratio (OR) associated with strongest SNP risk allele. Note that all ORs included in the Catalog are >1.

beta_number

Beta coefficient associated with strongest SNP risk allele.

beta_unit

Beta coefficient unit.

beta_direction

Beta coefficient direction, either 'decrease' or 'increase'.

beta_description

Additional beta coefficient comment.

last_mapping_date

Last time this association was mapped to Ensembl.

last_update_date

Last time this association was updated.
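
The relationship between pvalue, pvalue_mantissa and pvalue_exponent can be illustrated with a minimal base R sketch. It assumes that the p-value equals the mantissa multiplied by ten raised to the exponent; the data frame below is a made-up stand-in for the associations table, not real GWAS Catalog data.

```r
# Toy stand-in for three columns of the associations table.
# All identifiers and values are made up for illustration.
associations <- data.frame(
  association_id  = c("1001", "1002", "1003"),
  pvalue_mantissa = c(3L, 5L, 1L),
  pvalue_exponent = c(-8L, -12L, -330L),
  stringsAsFactors = FALSE
)

# Assumption: p-value = mantissa * 10^exponent.
associations$pvalue_rebuilt <-
  associations$pvalue_mantissa * 10^associations$pvalue_exponent

# Very small p-values underflow to zero as doubles, so -log10(p) is
# computed from the two parts directly instead of from the product.
associations$neg_log10_p <-
  -(log10(associations$pvalue_mantissa) + associations$pvalue_exponent)

print(associations)
```

Keeping the two parts separate is what makes the last row usable: its rebuilt p-value is stored as zero, while its value on the log scale is still finite.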

loci

A tibble listing loci. Columns:

association_id

GWAS Catalog association accession identifier, e.g., "20250".

locus_id

A locus identifier referring to a single variant locus or to a multi-loci entity such as a multi-SNP haplotype.

haplotype_snp_count

Number of variants per locus. Most loci are single-SNP loci, i.e., there is a one-to-one relationship between a variant and a locus_id (haplotype_snp_count == NA). There are, however, associations involving multiple loci at once, such as SNP-SNP interactions and multi-SNP haplotypes. These are flagged with the value TRUE in the columns multiple_snp_haplotype and snp_interaction of the associations table.

description

Description of the locus identifier, e.g., 'Single variant', 'SNP x SNP interaction', or '3-SNP Haplotype'.
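
As a rough illustration of how haplotype_snp_count and locus_id might be used together, the base R sketch below works on a made-up loci table. The layout of the toy rows (one locus for a haplotype, two loci for an interaction) is an assumption made for the example, not a statement about how every association is stored.

```r
# Toy stand-in for the loci table; identifiers and rows are made up.
loci <- data.frame(
  association_id      = c("1001", "1002", "1003", "1003"),
  locus_id            = c(1L, 2L, 3L, 4L),
  haplotype_snp_count = c(NA, 3L, NA, NA),
  description         = c("Single variant", "3-SNP Haplotype",
                          "SNP x SNP interaction", "SNP x SNP interaction"),
  stringsAsFactors = FALSE
)

# haplotype_snp_count is NA for single-SNP loci, so read NA as one variant.
loci$n_variants <- ifelse(is.na(loci$haplotype_snp_count),
                          1L, loci$haplotype_snp_count)

# Count loci per association and keep the associations spanning several loci.
loci_per_association <- table(loci$association_id)
multi_locus_ids <- names(loci_per_association)[loci_per_association > 1]

print(loci)
print(multi_locus_ids)
```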

risk_alleles

A tibble listing risk alleles. Columns:

association_id

GWAS Catalog association accession identifier, e.g., "20250".

locus_id

A locus identifier referring to a single variant locus or to a multi-loci entity such as a multi-SNP haplotype.

variant_id

Variant identifier, e.g., 'rs1333048'.

risk_allele

Risk allele or effect allele.

risk_frequency

Reported risk/effect allele frequency associated with strongest SNP in controls (if not available among all controls, among the control group with the largest sample size). If the associated locus is a haplotype, the haplotype frequency is extracted instead.

genome_wide

Whether this variant allele has been part of a genome-wide study or not.

limited_list

Undocumented.

genes

A tibble listing author-reported genes. Columns:

association_id

GWAS Catalog association accession identifier, e.g., "20250".

locus_id

A locus identifier referring to a single variant locus or to a multi-loci entity such as a multi-SNP haplotype.

gene_name

Gene symbol according to the HUGO Gene Nomenclature Committee (HGNC).

ensembl_ids

A tibble listing Ensembl gene identifiers. Columns:

association_id

GWAS Catalog association accession identifier, e.g., "20250".

locus_id

A locus identifier referring to a single variant locus or to a multi-loci entity such as a multi-SNP haplotype.

gene_name

Gene symbol according to the HUGO Gene Nomenclature Committee (HGNC).

ensembl_id

The Ensembl identifier of an Ensembl gene, see Section Gene annotation in Ensembl for more information.

entrez_ids

A tibble listing Entrez gene identifiers. Columns:

association_id

GWAS Catalog association accession identifier, e.g., "20250".

locus_id

A locus identifier referring to a single variant locus or to a multi-loci entity such as a multi-SNP haplotype.

gene_name

Gene symbol according to the HUGO Gene Nomenclature Committee (HGNC).

entrez_id

The Entrez identifier of a gene, see ref. doi:10.1093/nar/gkq1237 for more information.
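
Because the tables share association_id (and, below the associations table, locus_id), they can be combined like tables in a relational database. The following base R sketch joins made-up stand-ins for the associations, risk_alleles and genes tables; the identifiers, variants and gene names are hypothetical, and plain data frames are used here in place of tibbles to keep the example free of package dependencies.

```r
# Toy stand-ins for three of the six tables; all values are made up.
associations <- data.frame(
  association_id = c("1001", "1002"),
  pvalue         = c(3e-8, 5e-12),
  stringsAsFactors = FALSE
)
risk_alleles <- data.frame(
  association_id = c("1001", "1002", "1002"),
  locus_id       = c(1L, 2L, 2L),
  variant_id     = c("rs0000001", "rs0000002", "rs0000003"),
  risk_allele    = c("A", "G", "T"),
  stringsAsFactors = FALSE
)
genes <- data.frame(
  association_id = c("1001", "1002"),
  locus_id       = c(1L, 2L),
  gene_name      = c("GENE1", "GENE2"),
  stringsAsFactors = FALSE
)

# Attach association-level columns to each variant via association_id.
variants_pvalues <- merge(associations, risk_alleles, by = "association_id")

# Attach locus-level gene names via the pair (association_id, locus_id).
variants_genes <- merge(variants_pvalues, genes,
                        by = c("association_id", "locus_id"))

print(variants_genes)
```

Joining on both association_id and locus_id in the second step keeps gene names attached to the locus they were reported for, rather than to every locus of the same association.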