Aging has a genetic component, which is estimated at over 25%. Individuals age at different paces even when they have a similar lifestyle. The reason is that genetic variants at specific loci alter the aging program or render an individual more resistant to damage and age-related diseases.
As the cost of genome sequencing falls, the genetic makeup underlying human longevity is being unraveled, and there is an urgent need to use the knowledge of genetic variants that influence the pace of aging.
The simplest variants are single-nucleotide polymorphisms (SNPs). A SNP is like a single letter in the 23-volume manual of human life, which contains approximately 3 billion letters. It is important to understand why certain SNPs affect longevity. Furthermore, SNPs sometimes change promoter activity or protein function, operate in networks, or work in other ways that we are only beginning to understand.
Associations with longevity can be identified either via high-throughput genome-wide association studies (GWAS) or via focused candidate gene approaches.
Information on the functional modifications that a SNP causes at the protein level makes such an association more relevant.
It is important to have an idea of why certain SNPs create a longevity effect, although this cannot yet be done for all SNPs. SNPs sometimes change promoter activity or protein function, which can be used in functional network algorithms. Sometimes they are merely phylogenetic traces, and the real culprit gene lies nearby, carrying other variants that are not detectable through the SNP itself.
Knowing how a SNP affects protein function and activity gives the association more relevance. Functional assays are crucial for determining what longevity SNPs do and why they matter.
There are on the order of thousands of genetic variants associated with longevity.
Polymorph: number of evolution ?
intergenic, exons, :Ensembl variants.
There are three types of studies: the candidate region approach, the candidate gene approach, and genome-wide association studies (GWAS).
Look at current genetic databases and borrow ideas. Region-centric search. Most SNPs are not associated. Annotate.
In this study, [X number] papers were curated or evaluated for inclusion in our database. In these papers, longevity-determining genes were identified via high-throughput genome-wide association studies (GWAS), focused candidate gene approaches, and other methods.
This database will allow researchers to better answer questions, solve problems, and make informed decisions about allocating future resources.
We designed a Boolean expression for paper identification. We use an inclusive policy, which means that positive as well as negative studies are included.
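As an illustration of what such a Boolean expression might look like, here is a minimal sketch that ANDs together groups of OR-ed search terms. The term groups below are hypothetical placeholders; the actual expression used for paper identification is not stated here.

```python
def build_query(term_groups):
    """AND together groups of OR-ed search terms.

    Multi-word terms are wrapped in double quotes so they are
    treated as phrases by a typical literature search engine.
    """
    clauses = []
    for group in term_groups:
        quoted = [f'"{term}"' if " " in term else term for term in group]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Hypothetical term groups, for illustration only.
EXAMPLE_GROUPS = [
    ["longevity", "lifespan", "centenarian"],
    ["polymorphism", "SNP", "genetic variant"],
    ["association"],
]

query = build_query(EXAMPLE_GROUPS)
print(query)
```

Note that the query does not filter on the outcome of a study, which is consistent with the inclusive policy: negative results are retrieved alongside positive ones.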
Articles were flagged as "curate" if they have valuable information for the database, "discard" if they are irrelevant, and "review" if they are relevant reviews. Reviews are not curated in any detail, but they are ranked on a subjective scale from 1 to 5, where 1 means highly relevant.
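The flagging scheme can be sketched as a small data type. The class and field names are assumptions made for illustration, as is the rule that every review must carry a rank and that no other article may.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Flag(Enum):
    CURATE = "curate"    # valuable information for the database
    DISCARD = "discard"  # irrelevant
    REVIEW = "review"    # relevant review, not curated in detail


@dataclass
class Article:
    title: str
    flag: Flag
    # Subjective relevance, 1 (highly relevant) to 5; reviews only.
    review_rank: Optional[int] = None

    def __post_init__(self):
        if self.flag is Flag.REVIEW:
            if self.review_rank is None or not 1 <= self.review_rank <= 5:
                raise ValueError("a review needs a rank from 1 to 5")
        elif self.review_rank is not None:
            raise ValueError("only reviews carry a rank")
```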
Each genetic variant investigated in a study is a data record, which has a defined polymorphism and a genomic location. If the polymorphism is within or next to a gene, it is linked to the respective genetic factor. A polymorphism record also has an informal description (notes) as well as association-specific information such as the ethnicity, the mean age of cases, and the shorter- and longer-lived alleles. Statistical data such as the number of cases/controls in the initial and replication studies, the odds ratio, the p-value, and whether the association is significant are also included, if available. The technology used (e.g. PCR) and the type of study (GWAS, candidate gene) are recorded as well. We strictly provide references to the primary information.
More precisely, each row represents an association. If a study looked into the same variant in different populations, each population gets its own entry.
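The record described above could be sketched as follows. All field names and types are assumptions for illustration, not the actual database schema, and the identifiers in the example are placeholders rather than real variants or references.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Association:
    # Required: what was tested, where, in whom, and the primary source.
    polymorphism: str             # variant identifier (placeholder format)
    location: str                 # genomic location
    population: str
    study_type: str               # e.g. "GWAS" or "candidate gene"
    reference: str                # primary publication
    # Optional: filled in only if available in the source.
    gene: Optional[str] = None    # set if variant is within or next to a gene
    notes: str = ""
    ethnicity: Optional[str] = None
    mean_age_of_cases: Optional[float] = None
    longer_lived_allele: Optional[str] = None
    shorter_lived_allele: Optional[str] = None
    initial_cases: Optional[int] = None
    initial_controls: Optional[int] = None
    replication_cases: Optional[int] = None
    replication_controls: Optional[int] = None
    odds_ratio: Optional[float] = None
    p_value: Optional[float] = None
    significant: Optional[bool] = None
    technology: Optional[str] = None  # e.g. "PCR"


# One study, same variant, two populations: two separate rows.
rows = [
    Association("variant_A", "chr?:?", "population_1", "candidate gene",
                "hypothetical reference", significant=True),
    Association("variant_A", "chr?:?", "population_2", "candidate gene",
                "hypothetical reference", significant=False),
]
```

Keeping one row per variant and population means a variant can be significant in one population and not in another without the two results overwriting each other.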
An association of a genetic variant with longevity in a specific population can be either significant or not. A significant association can be positive or negative (i.e. the variant is overrepresented or underrepresented in long-lived individuals, respectively).
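These three outcomes can be expressed as a small classification function. This is a sketch under stated assumptions: the significance threshold `alpha = 0.05` and the use of allele frequencies as inputs are illustrative choices, not something the curation protocol specifies.

```python
from enum import Enum


class Outcome(Enum):
    POSITIVE = "positive"                # overrepresented in long-lived
    NEGATIVE = "negative"                # underrepresented in long-lived
    NOT_SIGNIFICANT = "not significant"


def classify(freq_long_lived, freq_controls, p_value, alpha=0.05):
    """Classify an association from allele frequencies and a p-value.

    alpha is an assumed threshold; a study may define its own.
    """
    if p_value >= alpha:
        return Outcome.NOT_SIGNIFICANT
    if freq_long_lived > freq_controls:
        return Outcome.POSITIVE
    return Outcome.NEGATIVE
```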
Where information is known about the functional impact of a longevity variant (e.g. APOE, CETP), this is also represented in the database.
The initial and replication numbers are always given in the form long-lived (old) individuals vs. short-lived (young) individuals.
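Keeping the long-lived group first also fixes the direction in which an odds ratio is read. As a minimal sketch, assuming allele carrier counts are available for both groups (the function and parameter names are illustrative), an odds ratio above 1 then means the allele is overrepresented in long-lived individuals:

```python
def odds_ratio(carriers_long, noncarriers_long, carriers_short, noncarriers_short):
    """Odds of carrying the allele in long-lived vs. short-lived individuals.

    Arguments follow the long-lived-first convention.
    """
    if noncarriers_long == 0 or carriers_short == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (carriers_long * noncarriers_short) / (noncarriers_long * carriers_short)


# Illustrative counts only: 30 of 100 long-lived and 20 of 100
# short-lived individuals carry the allele.
example_or = odds_ratio(30, 70, 20, 80)
print(round(example_or, 2))
```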
For the UI we need:
* Filters
* Queries
* Additions/edits/deletions
* Mouseover for definitions
We will have an index view and a table view of studies implicated in longevity (showing whether they are curated or not), as well as table views for associations and variants. We need a detail view for both associations and variants. Possible other views are a population list and table, study type, and technology. Populations need to support a hierarchy.
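One way to support a population hierarchy is a simple tree in which each population knows its parent and children. The class below is a sketch, not the actual data model, and the population names in the example are illustrative only.

```python
class Population:
    """A node in a population hierarchy."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add_child(self, name):
        child = Population(name, parent=self)
        self.children.append(child)
        return child

    def descendants(self):
        """Return this population and everything below it, depth-first."""
        result = [self]
        for child in self.children:
            result.extend(child.descendants())
        return result

    def path(self):
        """Return the names from the root down to this population."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return " > ".join(reversed(names))


# Illustrative hierarchy.
european = Population("European")
italian = european.add_child("Italian")
sardinian = italian.add_child("Sardinian")
```

With this structure, a filter on a parent population can match every association recorded for any of its descendants.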
For several longevity SNPs, no functional assays have been conducted yet. We hope that our database will help to identify the most well-replicated of these mystery SNPs and let us prioritize them for functional follow-up.
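Such a prioritization could be sketched as below. It assumes that "well-replicated" can be approximated by the number of significant associations recorded for a variant, which is an illustrative criterion rather than an established one. The variant identifiers are placeholders.

```python
from collections import Counter


def prioritize(associations, functionally_assayed):
    """Rank variants lacking a functional assay by significant associations.

    associations: iterable of (variant_id, is_significant) pairs.
    functionally_assayed: set of variant ids that already have an assay.
    Returns (variant_id, count) pairs, most replicated first.
    """
    counts = Counter(variant for variant, significant in associations if significant)
    candidates = [
        (variant, n) for variant, n in counts.items()
        if variant not in functionally_assayed
    ]
    # Sort by descending count, then by id for a stable, readable order.
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


# Placeholder data for illustration.
example_associations = [
    ("variant_A", True), ("variant_A", True),
    ("variant_B", True),
    ("variant_C", True), ("variant_C", True), ("variant_C", False),
    ("variant_D", False),
]
ranking = prioritize(example_associations, functionally_assayed={"variant_C"})
```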
A "disease SNP" is not a "true longevity SNP". However, since age-related diseases clearly do shorten the lifespan, "disease SNPs" and "true longevity SNPs" are like two overlapping Olympic rings: a Venn diagram with an area where the two circles overlap [James P. Watson, personal communication].