-
Genetics
- Admixture (aka Ethnicity Mix)
- Allosomes (Sex chromosomes X & Y)
- Autosomes (Chromosomes 1-22)
- Base Pair
- CE Testing (1st Wave)
- centiMorgan (cM)
- Chromosomes
- Clade
- Cladogram
- dbSNP, rsID, NIH, etc
- Deoxyribonucleic Acid (DNA)
- Derived & Ancestral
- Endogamy or Pedigree Collapse
- Epigenetics
- Gene
- Genetic Marker
- Genome Build (aka Reference Model)
- Genotyping
- Haplogroup
- Haploid & Diploid
- Haplotype
- Imputation
- Low Coverage Sequencing
- Meiosis & Mitosis
- Microarray Testing (2nd Wave)
- Microarray File Formats (aka RAW)
- Mito Build (rCRS, Yoruba, RSRS)
- Mitochondria
- Modal
- Null Allele
- Pangenome
- Phylogenetic Tree
- Probes, Primers, Adaptors and Tags
- Recombination (aka Cross-Overs)
- Sampling Techniques
- Sequencing (3rd Wave)
- Sequencing File Formats
- Single Nucleotide Polymorphism (SNP)
- Short Tandem Repeat (STR)
-
Genealogy
- Ahnentafel number
- Ancestor and Descendant
- Birth, Marriage and Death (BMD)
- Branches
- Consanguinity
- Cousins
- Deep Ancestry
- Earliest Known Ancestor (EKA)
- Family (Nuclear, and Household)
- Genealogical Data Communication (GEDCOM)
- Genealogical Proof Standard (GPS)
- Genealogical Records
- Genealogical Time Frame (aka last 500 years)
- Genealogical Tool
- Genealogical Trees
- Generation Difference (GD)
- Individuals
- Most Recent Common Ancestor (MRCA)
- Née
- Not Parent Expected (NPE)
- One-Tree (aka World Tree)
- Patriline & Matriline
- Places
- Repositories
- Siblings
- Sources
- Surname, One-Name and Family Branch Studies
- Years Before Present (ybp)
- (Genetic Genealogy) Terms
- Genetics Industry
- (Genetic Genealogy and Ancient DNA) Industry
- dbSNP, rsID, NIH, etc
dbSNP is a central database maintained by the USA National Institutes of Health (NIH). In their own words:
"dbSNP contains human single nucleotide variations, microsatellites, and small-scale insertions and deletions along with publication, population frequency, molecular consequence, and genomic and RefSeq mapping information for both common variations and clinical mutations."
As the above definition shows, dbSNP covers more than SNPs in the strict sense: it also includes InDels, STRs and other smaller-scale variations of 50 or fewer base pairs.
In contrast, here is NIH's definition of their companion dbVar database:
"dbVar is NCBI's database of human genomic Structural Variation - large variants >50 bp including insertions, deletions, duplications, inversions, mobile elements, translocations, and complex variants"
Large-scale variations, often involved in translocations, are therefore covered in dbVar rather than dbSNP.
Most variations are defined in one or the other of these two databases.
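The size split between the two databases can be sketched in a few lines of Python. This is only a rough illustration: the function name `likely_database` is made up for this example, the 50 bp threshold is taken from the dbVar definition quoted above, and "size" is crudely approximated as the length of the longer allele. The real submission rules at NCBI have more nuance than this.

```python
def likely_database(ref: str, alt: str) -> str:
    """Guess, by size alone, which NCBI database would hold a variant.

    Assumption: variant size is approximated by the longer of the two
    alleles; anything over 50 bp is treated as a structural variant.
    """
    size = max(len(ref), len(alt))
    return "dbVar" if size > 50 else "dbSNP"


# A plain SNP and a long (60-base) insertion, both invented for illustration.
print(likely_database("A", "G"))
print(likely_database("A", "A" + "T" * 60))
```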
rsIDs are defined by the dbSNP database; the name is short for reference SNP ID. An rsID is usually described by a specific chromosome and a position within it (per a given reference model), along with an ancestral value and one or more derived values. When more than one base pair is involved, or the variant is an InDel, the location given in a file can depend on who wrote the file, but it is fixed in the reference database. Most commonly, the multi-base-pair sequence is left-aligned on the forward strand, meaning that the start (the lowest coordinate) of the sequence of base pairs is used as the position of the rsID.
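To show why the reported position of an InDel can depend on who wrote the file, here is a minimal Python sketch of left-alignment. It follows the commonly used trim-and-extend approach to variant normalization and is not necessarily the exact procedure dbSNP applies. The function name `left_align`, the toy reference sequence and the coordinates are all invented for this example; positions are 1-based on the forward strand.

```python
from typing import Tuple


def left_align(reference: str, pos: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Shift an InDel as far left as the reference sequence allows.

    reference: forward-strand sequence of the chromosome (toy data here)
    pos:       1-based position of the first base of `ref`
    ref, alt:  the two alleles, written with a shared anchor base
    """
    if reference[pos - 1:pos - 1 + len(ref)] != ref:
        raise ValueError("ref allele does not match the reference sequence")

    changed = True
    while changed:
        changed = False
        # Drop a trailing base that both alleles share.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        # (An InDel at the very first base is not handled in this sketch.)
        if (not ref or not alt) and pos > 1:
            pos -= 1
            base = reference[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True

    # Trim shared leading bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Toy reference: a CA repeat flanked by G's.
toy_reference = "GGGCACACACAGGG"
# The same deletion of one "CA" unit, reported in the middle of the repeat.
print(left_align(toy_reference, 7, "ACA", "A"))  # (3, 'GCA', 'G')
```

Within a repeat, deleting any one of the "CA" units gives the same resulting sequence, so different writers may report different positions. Left-alignment picks the lowest coordinate as the single agreed representation.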
The rsID is commonly used in microarray file formats and annotated VCFs, along with the chromosome name and the position within it. The rsID itself stays the same across reference models; usually only the position differs, depending on the reference model being used. Most basic VCF files delivered by WGS test companies are not annotated with the rsID, nor with the gene region the variant may reside in.
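As a rough illustration of how the rsID ties the two file types together, the Python sketch below reads microarray-style lines (tab-separated rsID, chromosome, position, genotype) into a lookup table and uses it to fill in the empty ID column of un-annotated VCF lines. The function names, the rsIDs and the positions are invented for this example. The sketch assumes both files use the same reference model, and it matches on chromosome and position only, ignoring alleles, so it should not be taken as a substitute for proper annotation against dbSNP.

```python
from typing import Dict, Iterable, List, Tuple

Lookup = Dict[Tuple[str, int], str]


def read_microarray(lines: Iterable[str]) -> Lookup:
    """Map (chromosome, position) to rsID from microarray-style lines."""
    lookup: Lookup = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank and comment/header lines
        rsid, chrom, pos, _genotype = line.split("\t")
        if rsid.startswith("rs"):  # ignore vendor-internal identifiers
            lookup[(chrom, int(pos))] = rsid
    return lookup


def annotate_vcf(lines: Iterable[str], lookup: Lookup) -> List[str]:
    """Fill the VCF ID column (third field) where it is '.' and a match exists."""
    out: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            out.append(line)  # header lines pass through unchanged
            continue
        fields = line.split("\t")
        # Assumption: "chr1" in the VCF corresponds to "1" in the microarray file.
        chrom = fields[0][3:] if fields[0].startswith("chr") else fields[0]
        key = (chrom, int(fields[1]))
        if fields[2] == "." and key in lookup:
            fields[2] = lookup[key]
        out.append("\t".join(fields))
    return out


# Invented sample data, for illustration only.
raw = ["# rsid\tchromosome\tposition\tgenotype", "rs111\t1\t1000\tAG", "i222\t1\t2000\tCC"]
vcf = ["#CHROM\tPOS\tID\tREF\tALT", "chr1\t1000\t.\tA\tG", "chr1\t2000\t.\tC\tT"]
for row in annotate_vcf(vcf, read_microarray(raw)):
    print(row)
```

Because positions change between reference models while the rsID does not, a lookup like this only works when both files are on the same build.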
Studied rsIDs are given descriptive SNP names and are usually located within a named gene. Names used to be defined only in the curated, refereed journal publications that first identified them. Now, especially in the yDNA haplogroup arena, naming has turned into a wild-west race to see who can grab the most territory of defined names for any variant found, whether or not it is curated, likely or valid. This leads to names being claimed for what are not really variants (or for areas of the genome that cannot be read reliably), and to variants being named before they are even submitted to and curated in dbSNP with an rsID.
Much of what is associated with genetics falls under the purview of the USA National Center for Biotechnology Information (NCBI), part of the National Library of Medicine (NLM), which is part of the National Institutes of Health (NIH), which in turn reports into the Department of Health and Human Services (HHS).
A near-parallel activity, which acts in concert with the NIH, is the European Variation Archive (EVA), based at the European Bioinformatics Institute (EBI) of the European Molecular Biology Laboratory (EMBL). Although currently EU funded, this organization is located on the Wellcome Genome Campus near Cambridge University, alongside the Sanger Institute. EVA is part of the ELIXIR hub of cooperating repositories of public data surrounding the biological sciences. Ensembl is another arm of EBI and handles the reference genome models.