User:Yamagu/sandbox


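A reference genome (also called a reference genome sequence) is a database that researchers build by assembling the large volume of nucleotide sequences decoded in genome sequencing projects and similar efforts. It serves as a representative example of the set of genes of an idealized individual of a species, together with curated supporting information. (It is a database in the broad sense; its implementation is not necessarily a relational database or the like.)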

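Because a reference genome is assembled from the DNA sequencing data of multiple samples, the assembled sequence does not accurately represent the set of genes of any single individual (although distinct DNA sequences from individual samples are sometimes provided as haploid sequences). For example, the most recent human reference genome (assembly GRCh38/hg38) is derived from clone libraries of the genomes of more than 60 people.^[1]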

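Reference genomes are currently available for many species of viruses, bacteria, fungi, plants, and animals. They are used as a guide when assembling new genomes and for many other purposes, including gene expression analyses such as RNA-Seq and statistical genetic analyses such as GWAS.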

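Early efforts such as the Human Genome Project incurred enormous costs, but with the arrival of next-generation and third-generation sequencers, reference genomes can now be built far more quickly and cheaply. Reference genomes can be accessed with a web browser on sites such as Ensembl and the UCSC Genome Browser,^[2] and can also be viewed with applications such as IGV. Web applications of this kind, and software such as IGV that can display a reference genome, are called genome browsers.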

Properties of reference genomes

Measures of length

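The length of a genome can be expressed in several ways. A simple way is to count the number of bases in the assembly;^[3] this is sometimes called the physical distance or physical position.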
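The base-counting measure described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the assembly is available as FASTA-formatted text; the function name `sequence_lengths` and the two-record example are hypothetical and are not taken from any particular genome browser or toolkit.

```python
from typing import Dict, Optional


def sequence_lengths(fasta_text: str) -> Dict[str, int]:
    """Return the number of bases for each record in FASTA-formatted text."""
    lengths: Dict[str, int] = {}
    name: Optional[str] = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence line: every character counts as one base.
            lengths[name] += len(line)
    return lengths


# Hypothetical two-record assembly, used only for illustration.
example_fasta = """>chrA demo record
ACGTACGT
ACGT
>chrB demo record
GGCC
"""

lengths = sequence_lengths(example_fasta)
total_length = sum(lengths.values())
print(lengths, total_length)
```

Real assemblies also contain runs of `N` for gaps; whether those should be counted depends on which measure of length is wanted, and this sketch simply counts every character.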

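The reference genome released by UCSC, known as the golden path (GoldenPath), uses a length that excludes redundant regions such as haplotype regions[4][5] and pseudoautosomal regions. It is usually constructed by layering haplotype sequencing information over a physical map and reconciling it with scaffold information. It is a "best estimate" of what the genome looks like and usually contains gaps, which makes it longer than the typical base-pair assembly.[6]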

Contigs and scaffolds

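Assembling a reference genome consists of overlapping reads to build contigs, then ordering the contigs appropriately and joining them together. A contig is a consensus sequence produced by aligning those reads.^[4] When there are gaps between contigs, they are filled in an assembly step called scaffolding. In practice, the sequence is amplified by PCR, Bacterial Artificial Chromosome (BAC) cloning, or similar methods and then read on a sequencer.^[5]^[4] Some gaps cannot be filled, in which case the reference ends up containing multiple scaffolds.^[6] Scaffolds fall into three types: 1) placed, where the chromosome and the position and orientation of the contigs within it have been determined; 2) unlocalised, where the chromosome containing the contigs is known but their orientation or position is not; and 3) unplaced, where it is not even known which chromosome the contigs belong to.^[7]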
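The idea of building a contig by overlapping reads can be illustrated with a toy Python sketch. It assumes error-free reads that are already in the correct order and orientation and that overlap exactly; real assemblers handle sequencing errors, repeats, and both strands, so this is not how production tools work. The function names and the three example reads are hypothetical.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first and shrink until one matches.
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two reads on their overlap, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


# Hypothetical reads, listed in the order in which they overlap.
reads = ["ACGTTGCA", "TGCAGGTC", "GGTCAAT"]

contig = reads[0]
for read in reads[1:]:
    merged = merge_reads(contig, read)
    if merged is None:
        # No overlap: in a real assembly this would start a new contig.
        break
    contig = merged

print(contig)
```

When two neighbouring reads share no overlap, the loop stops; that situation corresponds to a gap between contigs, which scaffolding then tries to bridge.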

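The quality of a reference assembly is evaluated with measures such as the number of contigs, the number of scaffolds, and their average lengths; the longer the continuous stretches of decoded bases, the higher the quality. In other words, the fewer scaffolds per chromosome the better, and ideally a single scaffold covers an entire chromosome.^[8]^[9]^[10]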

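Two other commonly used metrics are N50 and L50. N50 is the length of the contig found at the 50% point of the total genome length when the assembled contigs are sorted from longest to shortest. L50 is the number of contigs whose length is N50 or greater. A higher N50 goes with a lower L50, which indicates longer continuously decoded stretches and therefore a higher-quality assembly.^[11]^[12]^[13]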
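A worked example may make the two metrics concrete. The sketch below follows the common convention of walking the contigs from longest to shortest until the running total reaches half of the summed contig length; it assumes that this sum stands in for the genome length, and the contig lengths are made up for illustration.

```python
from typing import List, Tuple


def n50_l50(contig_lengths: List[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a list of contig lengths."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # `length` is the contig at the 50% point (N50);
            # `count` is how many contigs were needed to reach it (L50).
            return length, count
    raise AssertionError("unreachable for non-negative lengths")


# Hypothetical contig lengths in base pairs; the total is 290.
fragmented = [80, 70, 50, 40, 30, 20]
# A more contiguous hypothetical assembly; the total is 140.
contiguous = [100, 10, 10, 10, 10]

print(n50_l50(fragmented))
print(n50_l50(contiguous))
```

In the first list the running total passes 145 at the second contig, so N50 is 70 and L50 is 2; in the second list the first contig alone covers half of the total, giving N50 = 100 and L50 = 1.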

Mammalian genomes

The human and mouse reference genomes are maintained and improved by the Genome Reference Consortium (GRC). The GRC is a group of no more than 20 genome researchers from the European Bioinformatics Institute, the National Center for Biotechnology Information, the Sanger Institute, and the McDonnell Genome Institute at Washington University in St. Louis. The GRC continues to improve the reference genomes by filling gaps and correcting errors.

Human reference genome

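The early human reference genome was based on samples donated by 13 anonymous volunteers recruited in Buffalo, New York. Donors were recruited through The Buffalo News, a daily newspaper, on Sunday, 23 March 1997. The first ten male and ten female volunteers were invited to meet the project's genetic counselors; those who consented after the explanation donated blood, from which DNA was extracted. Because samples whose BAC clone libraries were of good quality were used preferentially, about 80% of the data ultimately came from eight people, and one male donor, designated RP11, accounts for 66% of the total.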

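When a single reference genome is built from the data of several people, features that differ between individuals need special handling. For the ABO blood group, for example, only the O allele is used in the reference genome, and the other types are recorded as annotations of the ABO blood group gene.^[14]^[15]^[16]^[17]^[18]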

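As the cost of DNA sequencing has fallen, new whole-genome sequencing technologies have appeared, and genome sequencing is being carried out more widely each year. Projects such as the assembly of James Watson's genome used massively parallel sequencers.^[19] Comparing the reference genome NCBI build36/hg18 with the Watson assembly revealed 3.3 million SNP differences, and 1.4% of the sequence matched nowhere in the reference genome.^[18]^[20] For regions with extensive polymorphism, such as the MHC region, alternate loci are provided alongside the corresponding reference locus.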

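The most recent reference genome released by the Genome Reference Consortium is GRCh38, published in December 2013.^[22] Numerous patches have since been issued to update it, and as of March 2022 the current version is GRCh38.p14, the suffix indicating the 14th patch release.^[23]^[24] This build contains only 349 gaps across the whole reference genome, a major improvement over the first version, which contained about 150,000 gaps.^[15] The remaining gaps lie in telomeres, centromeres, and long repetitive regions; the largest is a region of about 30 Mbp on the long arm of the Y chromosome (about 52% of the length of the Y chromosome).^[25] The number of clone libraries used for sequencing has grown steadily over the years to cover more than 60 people, yet data from the individual RP11 still makes up nearly 70% of the reference genome.^[26] Genomic analysis suggests that this anonymous man is of African-European ancestry.^[26]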
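The patch suffix in names such as GRCh38.p14 can be separated from the base assembly name programmatically. The sketch below assumes that GRC names follow the pattern seen in this article ("GRC", a lowercase species code, a build number, and an optional ".p" patch number); the regular expression and the `AssemblyName` type are illustrative assumptions, not an official specification.

```python
import re
from typing import NamedTuple, Optional


class AssemblyName(NamedTuple):
    base: str             # e.g. "GRCh38"
    patch: Optional[int]  # e.g. 14, or None for an unpatched release


# Assumed pattern: "GRC" + lowercase species code + build number + optional ".pN".
_GRC_NAME = re.compile(r"(GRC[a-z]+\d+)(?:\.p(\d+))?")


def parse_assembly_name(name: str) -> AssemblyName:
    """Split a GRC-style assembly name into its base name and patch number."""
    match = _GRC_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a GRC-style assembly name: {name!r}")
    base, patch = match.groups()
    return AssemblyName(base, int(patch) if patch is not None else None)


print(parse_assembly_name("GRCh38.p14"))
print(parse_assembly_name("GRCm39"))
```

UCSC-style names such as hg38 do not follow this pattern and are rejected with a `ValueError` rather than guessed at.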

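In 2022 the Telomere-to-Telomere (T2T) Consortium^[27] announced the first completely assembled reference genome, T2T-CHM13. It is so named because it contains no gaps at all, with every base determined from the telomere of the short arm to the telomere of the long arm.^[28]^[29] CHM13 is the name of a cultured cell line in which all chromosomes are homozygous, so that, unlike an ordinary diploid human cell, its sequence can be determined unambiguously (it does not, however, include a Y chromosome). Until this reference genome was completed, about 8% of the genome that was especially hard to sequence had remained undecoded; with it, the full length was finally read without breaks. The repeats and structural variants that had made sequencing difficult were resolved with a combination of technologies, including Illumina next-generation sequencers, Nanopore and PacBio long-read sequencers, Arima Genomics Hi-C, Bionano optical mapping, and Strand-Seq. Beyond determining the full length of every chromosome, the project is also the first to decode centromeres and their surrounding regions in detail, and it is expected to drive further research.^[30] After the T2T announcement, the GRC website posted a notice that GRCh39 had been postponed indefinitely.^[31] Going forward, the plan is to move to an approach that accounts for genomic diversity by adopting the methods of T2T and the Human Pangenome Reference Consortium.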

Recent genome assemblies are as follows:^[32]

Release name	Date of release	Equivalent UCSC version
GRCh39	Indefinitely postponed^[31]	-
T2T-CHM13	January 2022	-
GRCh38	Dec 2013	hg38
GRCh37	Feb 2009	hg19
NCBI Build 36.1	Mar 2006	hg18
NCBI Build 35	May 2004	hg17
NCBI Build 34	Jul 2003	hg16

Limitations

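When dealing with a single individual of an organism, a reference genome captures the features of the genome well and is easy to work with. In regions of high genetic diversity, however, such as the human MHC region or the mouse major urinary protein (MUP) region, the reference genome differs considerably from any given individual.^[33]^[34]^[35] Because a reference genome fixes a single definite sequence so that the position of every genomic feature can be described against it, there is an inherent limit to how well it can describe variation between individuals (polymorphism). A separate problem is that the samples used to build the reference genome were donated largely by individuals of European ancestry. Well-characterized samples available at the time were used, but as a result populations of non-European ancestry are poorly represented. In 2010, genomes from African and Japanese populations were sequenced by de novo assembly and mapped to the NCBI36 reference genome; about 5 Mbp of sequence could not be mapped anywhere in the reference.^[36]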

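Since the Human Genome Project, other projects have built on it while shifting toward more detailed studies of genetic diversity that the reference genome alone cannot reveal. The HapMap Project, active from 2002 to 2010, set out to build a haplotype map and accumulated data on high-frequency polymorphisms shared among human populations. In the end, 11 populations of different ancestry were studied, including Han Chinese, Gujarati Indians, the Yoruba of Nigeria, and Japanese.^[37]^[38]^[39]^[40] The 1000 Genomes Project ran from 2008 to 2015 with the aim of building a database covering more than 95% of the variants in human populations; its results were widely used as a foundation for genome-wide association studies in research on diabetes, cardiovascular disease, and autoimmune disease. It extended the scope of the HapMap Project to a final total of 26 ethnic groups, adding, among others, the Mende of Sierra Leone, Vietnamese, and Bengali people.^[41]^[42]^[43]^[44] The Human Pangenome Project began in 2019 as the first-phase project of the newly formed Human Pangenome Reference Consortium. Its goal is to integrate the results of the earlier projects and build a genome map that captures as much human genetic diversity as possible.^[45]^[46]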

Mouse reference genome

Recent mouse genome assemblies are as follows:^[32]

Release name	Date of release	Equivalent UCSC version
GRCm39	June 2020	mm39
GRCm38	Dec 2011	mm10
NCBI Build 37	Jul 2007	mm9
NCBI Build 36	Feb 2006	mm8
NCBI Build 35	Aug 2005	mm7
NCBI Build 34	Mar 2005	mm6

Other genomes

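The Human Genome Project, with its huge budget and many participating researchers, brought about numerous technical innovations, and genome projects for many other species were launched in its wake. Prominent examples are the model organisms zebrafish (Danio rerio), chicken (Gallus gallus), and Escherichia coli, which drew particular attention because they were already studied around the world. Endangered species have also been sequenced, including the Asian arowana (Scleropages formosus) and the American bison (Bison bison). As of August 2022, NCBI held fully or partially sequenced genomes for 71,886 species. Of these, 676 were mammals, 590 birds, 865 fishes, 1,896 insects, 3,747 fungi, 1,025 plants, 33,724 bacteria, 26,004 viruses, and 2,040 archaea.^[47] Many of these species have annotation data associated with their reference genomes that can be publicly accessed and visualized in genome browsers such as Ensembl and the UCSC Genome Browser.^[48]^[49]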

Some examples of these international projects are: the Chimpanzee Genome Project, carried out between 2005 and 2013 jointly by the Broad Institute and the McDonnell Genome Institute of Washington University in St. Louis, which generated the first reference genomes for four subspecies of Pan troglodytes;^[50]^[51] the 100K Pathogen Genome Project, which started in 2012 with the main goal of creating a database of reference genomes for 100,000 pathogenic microorganisms for use in public health, outbreak detection, agriculture, and environmental work;^[52] and the Earth BioGenome Project, which started in 2018 and aims to sequence and catalog the genomes of all eukaryotic organisms on Earth to promote biodiversity conservation projects. Within this big-science project there are up to 50 smaller-scale affiliated projects, such as the Africa BioGenome Project and the 1000 Fungal Genomes Project.^[53]^[54]^[55]

References

^ “How many individuals were sequenced for the human reference genome assembly?”. Genome Reference Consortium. Retrieved 7 April 2022.
^ “Ensembl 2008”. Nucleic Acids Research 36 (Database issue): D707–D714. (January 2008). doi:10.1093/nar/gkm988. PMC 2238821. PMID 18000006.
^ “Help - Glossary - Homo sapiens - Ensembl genome browser 87”. www.ensembl.org. Retrieved 12 May 2023.
^ ^a ^b Gibson, Greg; Muse, Spencer V. (2009). A Primer of Genome Science (3rd ed.). Sinauer Associates. p. 84. ISBN 978-0-878-93236-8
^ “Help - Glossary - Homo_sapiens - Ensembl genome browser 107”. www.ensembl.org. Retrieved 26 September 2022.
^ Luo, Junwei; Wei, Yawei; Lyu, Mengna; Wu, Zhengjiang; Liu, Xiaoyan; Luo, Huimin; Yan, Chaokun (2021-09-02). “A comprehensive review of scaffolding methods in genome assembly”. Briefings in Bioinformatics 22 (5): bbab033. doi:10.1093/bib/bbab033. ISSN 1477-4054. PMID 33634311.
^ “Chromosomes, scaffolds and contigs”. www.ensembl.org. Retrieved 26 September 2022.
^ Meader, Stephen; Hillier, LaDeana W.; Locke, Devin; Ponting, Chris P.; Lunter, Gerton (May 2010). “Genome assembly quality: Assessment and improvement using the neutral indel model”. Genome Research 20 (5): 675–684. doi:10.1101/gr.096966.109. ISSN 1088-9051. PMC 2860169. PMID 20305016.
^ Rice, Edward S.; Green, Richard E. (2019-02-15). “New Approaches for Genome Assembly and Scaffolding”. Annual Review of Animal Biosciences 7 (1): 17–40. doi:10.1146/annurev-animal-020518-115344. ISSN 2165-8102. PMID 30485757.
^ Cao, Minh Duc; Nguyen, Son Hoang; Ganesamoorthy, Devika; Elliott, Alysha G.; Cooper, Matthew A.; Coin, Lachlan J. M. (2017-02-20). “Scaffolding and completing genome assemblies in real-time with nanopore sequencing”. Nature Communications 8 (1): 14515. Bibcode: 2017NatCo...814515C. doi:10.1038/ncomms14515. ISSN 2041-1723. PMC 5321748. PMID 28218240.
^ Mende, Daniel R.; Waller, Alison S.; Sunagawa, Shinichi; Järvelin, Aino I.; Chan, Michelle M.; Arumugam, Manimozhiyan; Raes, Jeroen; Bork, Peer (2012-02-23). “Assessment of Metagenomic Assembly Using Simulated Next Generation Sequencing Data”. PLOS ONE 7 (2): e31386. Bibcode: 2012PLoSO...731386M. doi:10.1371/journal.pone.0031386. ISSN 1932-6203. PMC 3285633. PMID 22384016.
^ Alhakami, Hind; Mirebrahim, Hamid; Lonardi, Stefano (2017-05-18). “A comparative evaluation of genome assembly reconciliation tools”. Genome Biology 18 (1): 93. doi:10.1186/s13059-017-1213-3. ISSN 1474-7596. PMC 5436433. PMID 28521789.
^ Castro, Christina J.; Ng, Terry Fei Fan (2017-11-01). “U50: A New Metric for Measuring Assembly Output Based on Non-Overlapping, Target-Specific Contigs”. Journal of Computational Biology 24 (11): 1071–1080. doi:10.1089/cmb.2017.0013. PMC 5783553. PMID 28418726.
^ A short guide to the human genome. CSHL Press. (2008). p. 135. ISBN 978-0-87969-791-4
^ ^a ^b “E pluribus unum”. Nature Methods 7 (5): 331. (May 2010). doi:10.1038/nmeth0510-331. PMID 20440876.
^ “Is it time to change the reference genome?”. Genome Biology 20 (1): 159. (August 2019). doi:10.1186/s13059-019-1774-4. PMC 6688217. PMID 31399121.
^ “Limitations of the human reference genome for personalized genomics”. PLOS ONE 7 (7): e40294. (11 July 2012). Bibcode: 2012PLoSO...740294R. doi:10.1371/journal.pone.0040294. PMC 3394790. PMID 22811759.
^ ^a ^b “Genome of DNA Pioneer Is Deciphered”. New York Times. (May 31, 2007). Retrieved February 21, 2009.
^ An example that did not use massively parallel sequencers is the shotgun sequencing approach of Craig Venter (Celera).
^ “The complete genome of an individual by massively parallel DNA sequencing”. Nature 452 (7189): 872–876. (April 2008). Bibcode: 2008Natur.452..872W. doi:10.1038/nature06884. PMID 18421352.
^ “Genome Data Viewer - NCBI”. www.ncbi.nlm.nih.gov. Retrieved 18 August 2022.
^ “Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly”. Genome Research 27 (5): 849–864. (May 2017). doi:10.1101/gr.213611.116. PMC 5411779. PMID 28396521.
^ “GRCh38.p14 - hg38 - Genome - Assembly - NCBI”. www.ncbi.nlm.nih.gov. Retrieved 19 August 2022.
^ Genome Reference Consortium (9 May 2022). “GenomeRef: GRCh38.p14 is now released!”. GRC Blog (GenomeRef). Retrieved 19 August 2022.
^ “GRCh38.p14 - hg38 - Genome - Assembly - NCBI - Statistics Report”. www.ncbi.nlm.nih.gov. Retrieved 18 August 2022.
^ ^a ^b Citation error: invalid <ref> tag; no text was provided for the ref named “GRC_FAQ”.
^ “Telomere-to-Telomere”. NHGRI. Retrieved 16 August 2022.
^ “The complete sequence of a human genome”. Science 376 (6588): 44–53. (April 2022). Bibcode: 2022Sci...376...44N. doi:10.1126/science.abj6987. PMC 9186530. PMID 35357919.
^ “T2T-CHM13v2.0 - Genome - Assembly - NCBI”. www.ncbi.nlm.nih.gov. Retrieved 16 August 2022.
^ Altemose, Nicolas; Logsdon, Glennis A.; Bzikadze, Andrey V.; Sidhwani, Pragya; Langley, Sasha A.; Caldas, Gina V.; Hoyt, Savannah J.; Uralsky, Lev et al. (April 2022). “Complete genomic and epigenetic maps of human centromeres”. Science 376 (6588): eabl4178. doi:10.1126/science.abl4178. ISSN 0036-8075. PMC 9233505. PMID 35357911.
^ ^a ^b “Genome Reference Consortium”. www.ncbi.nlm.nih.gov. Retrieved 18 August 2022.
^ ^a ^b “UCSC Genome Bioinformatics: FAQ”. genome.ucsc.edu. Retrieved 18 August 2016.
^ MHC Sequencing Consortium (October 1999). “Complete sequence and gene map of a human major histocompatibility complex. The MHC sequencing consortium”. Nature 401 (6756): 921–923. Bibcode: 1999Natur.401..921T. doi:10.1038/44853. PMID 10553908.
^ “Species specificity in major urinary proteins by parallel evolution”. PLOS ONE 3 (9): e3280. (September 2008). Bibcode: 2008PLoSO...3.3280L. doi:10.1371/journal.pone.0003280. PMC 2533699. PMID 18815613.
^ Urinary Lipocalins in Rodenta:is there a Generic Model?. Chemical Signals in Vertebrates 11. Springer New York. (October 2007). ISBN 978-0-387-73944-1
^ “Building the sequence map of the human pan-genome”. Nature Biotechnology 28 (1): 57–63. (January 2010). doi:10.1038/nbt.1596. PMID 19997067.
^ The International HapMap Consortium (October 2005). “A haplotype map of the human genome”. Nature 437 (7063): 1299–1320. Bibcode: 2005Natur.437.1299T. doi:10.1038/nature04226. PMC 1880871. PMID 16255080.
^ “A second generation human haplotype map of over 3.1 million SNPs”. Nature 449 (7164): 851–861. (October 2007). Bibcode: 2007Natur.449..851F. doi:10.1038/nature06258. PMC 2689609. PMID 17943122.
^ “Integrating common and rare genetic variation in diverse human populations”. Nature 467 (7311): 52–58. (September 2010). Bibcode: 2010Natur.467...52T. doi:10.1038/nature09298. PMC 3173859. PMID 20811451.
^ “International HapMap Project”. Genome.gov. Retrieved 18 August 2022.
^ “A map of human genome variation from population-scale sequencing”. Nature 467 (7319): 1061–1073. (October 2010). Bibcode: 2010Natur.467.1061T. doi:10.1038/nature09534. PMC 3042601. PMID 20981092.
^ “An integrated map of genetic variation from 1,092 human genomes”. Nature 491 (7422): 56–65. (November 2012). Bibcode: 2012Natur.491...56T. doi:10.1038/nature11632. PMC 3498066. PMID 23128226.
^ “A global reference for human genetic variation”. Nature 526 (7571): 68–74. (October 2015). Bibcode: 2015Natur.526...68T. doi:10.1038/nature15393. PMC 4750478. PMID 26432245.
^ “An integrated map of structural variation in 2,504 human genomes”. Nature 526 (7571): 75–81. (October 2015). Bibcode: 2015Natur.526...75.. doi:10.1038/nature15394. PMC 4617611. PMID 26432246.
^ “The Need for a Human Pangenome Reference Sequence”. Annual Review of Genomics and Human Genetics 22 (1): 81–102. (August 2021). doi:10.1146/annurev-genom-120120-081921. PMC 8410644. PMID 33929893.
^ “The Human Pangenome Project: a global resource to map genomic diversity”. Nature 604 (7906): 437–446. (April 2022). Bibcode: 2022Natur.604..437W. doi:10.1038/s41586-022-04601-8. PMC 9402379. PMID 35444317.
^ “Genome List - Genome - NCBI”. www.ncbi.nlm.nih.gov. Retrieved 18 August 2022.
^ “Species List”. uswest.ensembl.org. Retrieved 18 August 2022.
^ “GenArk: UCSC Genome Archive”. hgdownload.soe.ucsc.edu. Retrieved 18 August 2022.
^ “Chimpanzee Genome Project”. BCM-HGSC. (4 March 2016). Retrieved 18 August 2022.
^ “Great ape genetic diversity and population history”. Nature 499 (7459): 471–475. (July 2013). Bibcode: 2013Natur.499..471P. doi:10.1038/nature12228. PMC 3822165. PMID 23823723.
^ “100K Pathogen Genome Project – Genomes for Public Health & Food Safety”. Retrieved 18 August 2022.
^ “Earth BioGenome Project: Sequencing life for the future of life”. Proceedings of the National Academy of Sciences of the United States of America 115 (17): 4325–4333. (April 2018). Bibcode: 2018PNAS..115.4325L. doi:10.1073/pnas.1720115115. PMC 5924910. PMID 29686065.
^ “African BioGenome Project – Genomics in the service of conservation and improvement of African biological diversity”. Retrieved 18 August 2022.
^ “1000 Fungal Genomes Project”. mycocosm.jgi.doe.gov. Retrieved 18 August 2022.

External links

Genome Reference Consortium
