Last year at this time (November 2018), we were excited and ready to test the waters of 30x WGS testing using full sequencing techniques. Only a few had used the technology from established players Full Genomes and ySeq, due to the expense and the lack of support in the community. If used at all, it was mostly for yDNA and maybe mtDNA haplogroup work. But with the introduction of a sub-$500 30x WGS test by Dante Labs last year, we saw what we hoped would be the new future. And then they went crazy with a $199 price during the week before Black Friday (simply called the Black Friday sale by most). Wow! How could one go wrong?

Well, it is now nearly a year later and we are still waiting for something, anything, to be delivered from that purchase. To give some credit, even traditional vendors like FTDNA have been having problems with their new BigY-700 test, which is also based on newer full sequencing processes. Still, this wait just seems a bit much. In the meantime, Marko released WGS Extract to fully utilize the test by bridging from WGS results to microarray test tool sets.

On 1 Nov 2019, Dante Labs again dropped the price below $500 (it had been hovering around $700) for a 30x average read depth, 150 base-pair segment length WGS test. And now, for 48 hours, they are offering that crazy $199 price again (with a further 10% discount if you use the code YFULL10, due to their new partnership with yFull). Has the time for using the 30x WGS test arrived? Let's look at the details.

How does an NGS test compare to the traditional Microarray and similar tests?

To be clear, the 30x WGS test returns results for the whole human genome. Not just for yDNA, or mtDNA, or even just the exome. Everything, including your mouth bacteria! The only traditional microarray test still available that covers everything with a single test, but in much less detail, is 23andMe. We have long recommended the 23andMe test because of this complete coverage, even though Ancestry has overtaken them as the leader in match database size. For the short time it was available in North America, the NGG 2.0 Nextgen test offered even more than 23andMe but never had a match database.

So let's look at the coverage numbers.

[1] Typical microarray tests return around 600k+ SNPs from the autosomes and xDNA, and sometimes include some yDNA and mtDNA. They actually return two values (one per allele) in most cases, so really over 1.2 million SNP values. The 30x WGS test returns base-pair reads, with an average read depth of 30x, on nearly all of the 3.2 billion base-pairs in your haploid DNA. Some positions may be read as few as 4 or so times and thus are not reliable; others as many as 100. The number of reads is significant for judging the reliability of a result. The key is that, after testing with the 30x WGS test, you can go back and look up whatever base-pair value you want, even for SNPs not yet discovered. Using existing databases of known SNPs, most people post-processing just the autosomes can pull out 3+ million SNP variant values. This is more than even the merged set from all the microarray tests offered by the various companies at various times. Remember that the microarray test returns 1.2 million values, of which maybe only 10% are variants, so even the super-kits created by merging microarray results from different companies do not approach those numbers. With WGS, there are unknown variants that can be found and reported on as well. Key here will be the reworking of existing match database technologies to incorporate these 30x WGS full diploid base-pair sequence reads, not just the 1-in-every-5,000 base-pair average sampling of the existing microarray test results.
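To make the round numbers in [1] concrete, here is a minimal back-of-the-envelope sketch in Python. The constants are simply the approximate figures quoted above, not measurements, and the minimum depth of 10 reads used as a reliability cut-off is purely our illustrative assumption (the text only says that 4 or so reads is not reliable).

```python
# Back-of-the-envelope arithmetic using the round figures quoted in the text.
GENOME_BP = 3_200_000_000      # haploid base-pairs ("3.2 billion")
MICROARRAY_SNPS = 600_000      # SNP positions on a typical microarray ("600k+")
VARIANT_PERCENT = 10           # "maybe only 10%" of microarray values are variants
ASSUMED_MIN_DEPTH = 10         # ASSUMPTION: illustrative cut-off, not from the text


def microarray_summary():
    """Return (values reported, average base-pair spacing, variant values)."""
    values = MICROARRAY_SNPS * 2                    # two allele values per SNP
    spacing = round(GENOME_BP / MICROARRAY_SNPS)    # roughly one SNP per this many bp
    variants = values * VARIANT_PERCENT // 100      # integer math avoids float noise
    return values, spacing, variants


def is_reliable(read_depth, min_depth=ASSUMED_MIN_DEPTH):
    """Treat a position as reliable only if enough reads cover it."""
    return read_depth >= min_depth


if __name__ == "__main__":
    values, spacing, variants = microarray_summary()
    print(f"microarray values reported: {values:,}")
    print(f"average spacing: one SNP every {spacing:,} base-pairs")
    print(f"variant values among them: {variants:,}")
    for depth in (4, 30, 100):
        print(f"depth {depth:>3}: reliable={is_reliable(depth)}")
```

The spacing works out to a little over 5,000 base-pairs per sampled SNP, which is where the "1 every 5,000" figure comes from.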

[2] Looking at yDNA comparisons at yDNA-Warehouse and yFull, we show more extensive coverage by 30x WGS tests than by the existing best test on the market: BigY-700 from FamilyTreeDNA. We have links and images showing this below. Now, as many know, there are areas of DNA where you do not really want to read values: near centromeres and in other areas of high variance or repetition. But the raw data is there in a 30x WGS result to make determinations beyond what research has possibly uncovered to date. BigY-700 brought a big leap in total SNPs reported and added to the phylogenetic tree in known, stable regions. But the 30x WGS test goes even further with its near 100% coverage (compared to less than 90% with BigY) and more SNPs called than ever (over the known, combo-bed region often used for comparison). So, hands down, ignoring the other factors of 30x WGS testing, this is the best solution for yDNA testing out there. While Dante Labs does not provide the yDNA matching services that FTDNA does, yFull does. And while yFull has nowhere near the number of testers needed to give them as large a tree as FTDNA, some key areas are better because they include test results from research studies like 1K Genomes. Even at $500, the Dante Labs price for their 30x WGS beats the best yDNA test price and delivers a better result.

[3] Looking at mtDNA, the 30x WGS test gives you the equivalent of the full-sequence result that others provide. So any SNP from any genome build can be determined. yFull is already jumping on this and creating a new, citizen-scientist expansion of the mtDNA phylogenetic tree, beyond what FTDNA has been providing. It should be noted that the full-sequence mtDNA test alone is priced at $200 or more from FTDNA.

So, in summary, you are getting (a) a 2-20x or greater result across the autosomes and xDNA (although the match databases have to catch up to utilizing this), (b) greater coverage with better quality reads across the yDNA than from BigY-700, and (c) a full mtDNA sequence (every value read reliably). All for sub-$500, where the closest similar complete coverage with à la carte testing would cost $700 to $1200 and up. It seems a "no brainer" decision, if you want all that testing. Often, if the cost is reasonable, the answer is yes.

What about Match Databases?

30x WGS test companies, in general, do not provide any sort of match database, which we consider a requirement for genetic genealogy. But many third parties have sprung up to take in results from various sources and map them. The 30x WGS test just becomes another potential source of data for them.

Tools like DNA Kit Studio and WGS Extract exist to create super-kits to load into GEDMatch. The success with such kits at other loadable databases (like MyHeritage, LivingDNA or FTDNA) is mixed, but this is more a result of trying to use the VCF file as the starting point rather than the BAM. At minimum, you can use the BAM to create a normal kit, like one coming from any other test company, and load it into any site that allows 3rd party test result submissions. So you can replicate an Ancestry or 23andMe result and load it into FTDNA, MyHeritage, LivingDNA and GEDMatch. This is instead of a "super-kit" obtained by merging test results from multiple companies into a single kit. Even when comparing a super-kit against a normal kit from the same tester, there is no longer a 100% match (which you get if you compare a kit to itself). So clearly the match tools need to be tuned for this new data source (whether super-kits created from merging multiple microarray test results or kits from 30x WGS tests).
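As a rough illustration of what "replicating" a microarray kit means, the sketch below writes a handful of genotype calls in the tab-separated layout used by 23andMe raw data downloads (rsid, chromosome, position, genotype, with "--" for a no-call). It is only a sketch under stated assumptions: the function name, the input tuples and the rsIDs are made up for illustration, the calls are assumed to have already been extracted from the BAM at the microarray's SNP positions, and this is not a description of how WGS Extract or DNA Kit Studio actually work internally.

```python
import io


def write_23andme_style_kit(calls, out):
    """Write genotype calls as a 23andMe-style tab-separated raw data file.

    `calls` is a hypothetical iterable of (rsid, chromosome, position, genotype)
    tuples, assumed to be already extracted from the BAM. An empty genotype
    is written as the no-call marker "--". Returns the number of rows written.
    """
    out.write("# rsid\tchromosome\tposition\tgenotype\n")
    written = 0
    for rsid, chrom, pos, genotype in calls:
        if not genotype:
            genotype = "--"          # no reliable call at this position
        out.write(f"{rsid}\t{chrom}\t{pos}\t{genotype}\n")
        written += 1
    return written


if __name__ == "__main__":
    # Made-up identifiers and positions, for illustration only.
    example_calls = [
        ("rs0000001", "1", 1000, "AG"),
        ("rs0000002", "X", 2000, ""),
    ]
    buffer = io.StringIO()
    count = write_23andme_style_kit(example_calls, buffer)
    print(buffer.getvalue(), end="")
    print(f"rows written: {count}")
```

A real kit would need the full list of positions the target company tests, in the genome build that the receiving site expects.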

It is very clear that FTDNA is the elephant in the room regarding yDNA match lists and phylogenetic trees. But yFull has made significant advances in many ways, even though it lacks most of the results present at FTDNA. Especially key has been yFull's continuing expansion of support for results from studies and research reports, as well as support for import from multiple labs. FGC has a phylogenetic tree for its customers and allows import, but it has just not taken off like yFull has. yFull is expanding to include a full STR match capability. Although 10-15 of the original y111 STRs cannot be reliably extracted from WGS short-read, paired-end sequencing results, the validity of the first 111 STRs as a way to track lineages in the genealogical time frame is being called into question anyway as deep SNP testing becomes prevalent.

Ancestry is top dog, with an atDNA match database almost as big as all the rest combined. FTDNA has a larger tree and vastly more samples than yFull for both yDNA and mtDNA. And neither Ancestry nor FTDNA, the largest match databases in their respective areas, allows a transfer in of results from other test labs. So this is the biggest drawback.

One big plus with yFull is that it allows you to compare with any other tester in the database, not just those within some artificial limit on what is defined as a close match, as set by FTDNA. You simply peruse the phylogenetic tree for other test kits of interest, and then go off and do the comparison. At FTDNA, you are archaically limited (in both STR match lists and the phylogenetic tree) to only those within some arbitrary criteria of match strength based on their algorithm. Sometimes there are kits in your own terminal haplogroup that are not in your match list, so you cannot analyze, compare or contact them.

mtDNA does not really have match lists. It has even more restrictive use in genetic genealogy than xDNA analysis. The best you can often do is use your couple dozen "VCF" style SNPs to determine an ancient haplogroup, and from that compare to someone else on your matriline to see if they are in the same haplogroup. That is something you do not need match lists for. While FTDNA has been the only one with many full-sequence mtDNA results to date, and has used them to build an extended mtDNA phylogenetic tree, yFull jumped into this space this year and has been making headway, doing major restructuring by pulling in research paper results and the many submitted WGS and FTDNA full-sequence results. It is no surprise that just today yFull and Dante Labs announced a partnership to help smooth the way for supporting genetic genealogists with yDNA and mtDNA analysis going forward. Our gut feel is that the progress of yFull will alleviate the need for mitoYDNA even before it gets started. Which is potentially a shame but is the way of the market.

Will Dante Labs deliver?

A huge, enormous question that takes many variables to answer. No one has any real clue. If we were talking about all the players being in a competitive space, then the reliance on one company would not be an issue. But everyone else is still over the $1,000 mark for a similar WGS test. It is a week shy of a year since a member here ordered a Dante Labs kit at their unheard-of $199 sale price, and they still do not have any results, only empty promises. Many are in the same boat (some for an even longer time). But just in the last month or two, with their in-house laboratory in Italy at full speed, some have finally started seeing movement and gotten results. UPDATE: including our member here. Is it just more hints of a Ponzi scheme, or has Dante Labs really turned a corner past its initial growing pains? Just today Dante Labs started admitting past problems and talked of putting in place a new business model that resolves the issues. Thousands have ordered Dante Labs tests over the past year or so. It is not clear, but it still appears that fewer than half have received any results (based on a poll in a Facebook group of 800+ customers, with over 100 responding). Prices paid have ranged from the low offer of $199 up to a more standard price hovering around $700. Neither the price paid nor the time ordered seems to determine when results are finally delivered. There is still a gamble. But some have jumped in feet first and expanded their kit orders to 10+ during this recent sale. And a majority of the discussion since 1 November 2019 on the non-company Dante Labs Customer Facebook group is about analyzing results received, not complaining of never receiving anything.

It just came out today (9 Nov 2019) that Dante Labs will have a one-day flash sale of the $199 WGS 30x read depth, 150bp length test again. They had already lowered it to $450 with the start of DNA sales on 1 November, something all companies are doing as well (big discounts). This new sale was supposed to start at 5PM EST Sunday and go for 24 hours. Additionally, due to a just-announced partnership with yFull, they are offering a further 10% discount using the checkout code YFULL10. Besides the announcement of this by yFull, the Dante Labs CEO provided a video today explaining their history, growing pains and going-forward setup. Maybe not surprisingly, the sale became available by 5PM EST Saturday. It is not clear yet how long it will really last past the scheduled Monday night cut-off, or whether results will be delivered in the declared commitment time of 8 weeks. Some who had paid full price ten months ago, and were able to cancel and get a refund (after bitter complaining), turned around and re-ordered at the same low price and got results in 6 weeks! It still appears to be a crap shoot (as to whether you will get anything for the money). And you will have to learn bioinformatics and do significant self-processing to really make use of the results in genetic genealogy. But the discount is considerable. And the data will be near full-sequencing straight out of the machine that can be analyzed in new, unknown ways for years to come. The new yFull partnership, similar maybe to ones with Sequencing.com and others in the past, may be a great start to removing some of this post-test burden faced by the genetic genealogy community.

So this may be a good time to at least test the waters and catch the wave of possible future genetic genealogy testing and analysis, like one member here did last year. Even though we are still waiting for last year's "toe in the water" results, we have bought a few more kits in hopes that we can further develop and support this approach in the surname study here and elsewhere. As well as ride this hoped-for future trend of the next generation in genetic genealogy.

UPDATE1: As of 24 Nov 2019, they are offering a $189 price for the week until Cyber Monday, similar to last year. A kit a member ordered during the sale on 9 Nov has yet to ship, so it is unclear whether the issues are mostly behind this company ... oh dear. And that kit from last year was finally delivered on 18 Nov 2019. A week shy of a full year. Results look solid so far.

UPDATE2: As of 24 Jan 2020, 4 more kit results have come in from the ones the author ordered in November. All but one are excellent results. The other is supposedly excellent, but they are having problems posting the complete files. Some are still experiencing these hit-and-miss issues, but a majority seem to be on a normal track. Delivery of results takes 8-10 weeks from the date of order from the USA; sooner from Europe. We have a delivery time tracking spreadsheet that we may turn into a form to create a summary chart so others can contribute.

UPDATE3: A best-ever price of $149 was offered during Black Friday week this year. Right on the heels of just beginning to restart sequencing of kits that had arrived since March. And exactly like one year before, those who ordered new kits got results in weeks, while others who have been waiting months are still waiting. Quality continues to be scattershot, with some results "normal" and others way below par. So it is still very much a crap shoot on all three factors: timeliness, price you pay, and quality.

Comparisons on yFull and yDNA-Warehouse

This section has been moved to its own Wiki page, where it will be expanded further. Most of the references belonged to this section and were moved there as well.

See Also