Title: Dog10K_Boxer_Tasha_1.0: A Long-Read Assembly of the Dog Reference Genome

Authors: Jagannathan, Vidya; Hitte, Christophe; Kidd, Jeffrey M.; Masterson, Patrick; Murphy, Terence D.; Emery, Sarah; Davis, Brian; Buckley, Reuben M.; Liu, Yan-Hu; Zhang, Xiang-Quan; Leeb, Tosso (ORCID 0000-0003-0553-4880); Zhang, Ya-Ping; Ostrander, Elaine A.; Wang, Guo-Dong

Type: article
Publication year: 2021
Record date: 2024-09-21
Language: en
Handle: https://boris-portal.unibe.ch/handle/20.500.12422/45620
Identifiers: 10.48350/156573; 34070911; 10.3390/genes12060847

Subjects:
500 - Science::590 - Animals (Zoology)
600 - Technology::630 - Agriculture
500 - Science::570 - Life sciences; biology
600 - Technology::610 - Medicine & health

Abstract:
The domestic dog has evolved to be an important biomedical model for studies regarding the genetic basis of disease, morphology and behavior. Genetic studies in the dog have relied on a draft reference genome of a purebred female boxer dog named "Tasha" initially published in 2005. Derived from a Sanger whole genome shotgun sequencing approach coupled with limited clone-based sequencing, the initial assembly and subsequent updates have served as the predominant resource for canine genetics for 15 years. While the initial assembly produced a good-quality draft, as with all assemblies produced at the time, it contained gaps, assembly errors and missing sequences, particularly in GC-rich regions, which are found at many promoters and in the first exons of protein-coding genes. Here, we present Dog10K_Boxer_Tasha_1.0, an improved chromosome-level highly contiguous genome assembly of Tasha created with long-read technologies that increases sequence contiguity >100-fold, closes >23,000 gaps of the CanFam3.1 reference assembly and improves gene annotation by identifying >1200 new protein-coding transcripts. The assembly and annotation are available at NCBI under the accession GCF_000002285.5.