Resource Summary
Reverse Genetics and Genome Projects
- Reverse Genetics
- Developed in 1980s
- Allowed detailed linkage mapping of human
genetic disorders using molecular markers
- Uses positional cloning to identify
disease-causing mutations
- Found links for neurofibromatosis, cystic
fibrosis and fragile X syndrome
- Laborious process - involves long chromosome walks
- Chromosome walking (of CF gene)/Positional Cloning
- Create 2 genomic libraries of the same DNA, cut
with different restriction enzymes (EcoRI and SalI)
- Screen one of the libraries (EcoRI) with a probe (MET)
linked to the gene in question and find the fragment that binds
- D7S8 is another RFLP marker
at the other end of the CF gene
- Digest the fragment with multiple restriction enzymes and
separate with electrophoresis to create a restriction map
- Southern blot and hybridize with the MET probe
- Construct a restriction map which shows where MET binds to the fragment
- The SalI site is the last restriction site, nearest to the CF gene
- Screen the SalI-digested genomic library with MET
- Find the fragment with a SalI site at either end
- Create a restriction map, as before
- Repeat the process until D7S8 is reached
- Combine all the restriction maps to create a contig map
- Identify which segment contains the CF gene
- Unknown gene linked to molecular markers by pedigree analysis
- Finding the phenotypic function of a DNA sequence
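The chromosome walk described above can be sketched as a greedy search over overlapping clones. This is only an illustration: the clone names and kb coordinates are hypothetical, and the coordinates stand in for overlaps that would really be discovered by hybridization screening and restriction mapping. The walk starts at the clone carrying the first marker (MET in the notes) and stops once the far marker (D7S8) is covered.

```python
from typing import List, Optional, Tuple

# Hypothetical clone: (name, start_kb, end_kb) along the chromosome.
Clone = Tuple[str, int, int]

def walk(clones: List[Clone], start_pos: int, target_pos: int) -> Optional[List[str]]:
    """Greedy walk from the clone holding start_pos towards target_pos.

    At each step, pick the clone that overlaps the current end of the walk
    and extends furthest. Returns the clone names in order, or None if
    there is no starting clone or the walk hits a gap.
    """
    starters = [c for c in clones if c[1] <= start_pos <= c[2]]
    if not starters:
        return None
    best = max(starters, key=lambda c: c[2])
    path = [best[0]]
    reach = best[2]
    while reach < target_pos:
        # Candidates must overlap the current reach and extend beyond it.
        candidates = [c for c in clones if c[1] <= reach < c[2]]
        if not candidates:
            return None  # gap: no clone bridges past the current end
        best = max(candidates, key=lambda c: c[2])
        path.append(best[0])
        reach = best[2]
    return path

# Made-up clones from two libraries ("E" for EcoRI, "S" for SalI).
clones = [("E1", 0, 40), ("S1", 30, 75), ("E2", 60, 110),
          ("S2", 100, 150), ("E3", 20, 50)]
path = walk(clones, start_pos=10, target_pos=140)
print(path)
```

Laying the clones in `path` end to end corresponds to combining the individual restriction maps into a contig map.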
- Genome Project
- Radically advanced reverse genetics
- 1986 - first results from C. elegans genome mapping
- Discussions about human genome project
- Work on C. elegans started in 1960s (Brenner)
- Small differentiated organism
- Lineage of 959 somatic cells
known, 302 neurons mapped
- Coulson et al, 1986 created a physical map of C. elegans
- Cosmid libraries (plasmid vectors carrying bacteriophage lambda cos sites)
- Double digest restriction enzyme fingerprinting
- Computer generated contigs
- Fingerprinting
- Cosmid insert digested with HindIII
- Ends of digests labelled
- Second digest with Sau3A
- Electrophoresis
- Autoradiography is used to find labelled
bands shared between different clones
- Information can be used to create a contig
- Produces many small fragments
- Allowed identification of 90-95% of the genome
- YACs (yeast artificial chromosomes) were used after cosmids (linear DNA
carrying the insert being studied, built from a plasmid vector and grown in yeast)
- Reduced the number of contigs - more overlaps were found
- Allows larger clones to be made
- Sulston et al, 1992 - started sequencing
- 95 of 100 Mb in contigs, with fewer than 40 gaps
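The band-sharing idea behind the fingerprinting steps above can be sketched as follows. Each clone is reduced to a list of labelled band sizes, and two clones are proposed as overlapping when they share enough bands within a measurement tolerance. The clone names, band sizes, tolerance and the `min_shared` cut-off are all hypothetical; this is a simplification, not the scoring used in the published mapping work.

```python
from itertools import combinations

def shared_bands(a, b, tol=1.0):
    """Count bands in a that match a band in b within tol.

    Each band in b can be matched at most once.
    """
    remaining = sorted(b)
    count = 0
    for size in sorted(a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tol:
                count += 1
                del remaining[i]
                break
    return count

def likely_overlaps(fingerprints, min_shared=3, tol=1.0):
    """Return (clone1, clone2, shared) for pairs sharing >= min_shared bands."""
    pairs = []
    for (n1, f1), (n2, f2) in combinations(sorted(fingerprints.items()), 2):
        shared = shared_bands(f1, f2, tol)
        if shared >= min_shared:
            pairs.append((n1, n2, shared))
    return pairs

# Made-up fingerprints (band sizes in arbitrary units).
fingerprints = {
    "cosA": [120, 185, 240, 310, 402],
    "cosB": [185, 241, 310, 455, 520],
    "cosC": [455, 520, 610, 700],
}
print(likely_overlaps(fingerprints))
```

Chaining the proposed pairs together is what yields the computer-generated contigs. Lowering `min_shared` finds more overlaps but also more chance matches.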
- Human Genome
- Genetic linkage map
- Improved linkage analysis
- Microsatellite DNA
- Pedigree collection
- Software for simultaneous analysis of multiple markers
- 1416 loci mapped
- 279 genes, 339 microsatellites
- Complete contigs (1992)
- Chromosome 21 (Chumakov et al)
- Linked to several genetic diseases
- Y-chromosome (Foote et al)
- Long arm - heterochromatin,
variable length (<10 kb complexity)
- Short arm - euchromatin, constant length -
Y-specific genes and X chromosome homologues
- YAC Fingerprints (Bellanné-Chantelot et al)
- 15-20% of genome in YAC contigs
- Methods Used in Human Genome Project
- YAC Fingerprinting
- YACs digested with restriction
enzymes and Southern blotted
- Hybridized with an L1 (LINE-1) repeat probe
- Overlaps found
- PCR Screening with HC21q STSs
- 3 YAC libraries from cell lines - hierarchical
- 180 HC21-specific YAC clones and telomere-containing YACs - individually
- PCR Screening with HCY STSs (sequence-tagged site)
- YAC library from XYYYY male cell line - hierarchical
- Individual YACs sized on PFGE
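Hierarchical PCR screening, as mentioned above, tests pools of clones first and only tests individual clones from pools that come up positive. A minimal two-level sketch follows. The clone names, STS names, pool size and the `pcr_positive` stand-in for a real PCR reaction are all hypothetical.

```python
def pcr_positive(pool, sts, contents):
    """Stand-in for one PCR reaction: True if any clone in the pool carries the STS."""
    return any(sts in contents[clone] for clone in pool)

def hierarchical_screen(clones, sts, contents, pool_size=4):
    """Two-level screen: test pools, then the clones of each positive pool.

    Returns (list of positive clones, number of reactions performed).
    """
    reactions = 0
    positives = []
    pools = [clones[i:i + pool_size] for i in range(0, len(clones), pool_size)]
    for pool in pools:
        reactions += 1
        if pcr_positive(pool, sts, contents):
            for clone in pool:
                reactions += 1
                if pcr_positive([clone], sts, contents):
                    positives.append(clone)
    return positives, reactions

# Made-up library of 16 YAC clones; only y5 carries the hypothetical "sts1".
clones = [f"y{i}" for i in range(16)]
contents = {name: set() for name in clones}
contents["y5"].add("sts1")

hits, n_reactions = hierarchical_screen(clones, "sts1", contents)
print(hits, n_reactions)
```

In this toy case the pooled screen needs 8 reactions where testing every clone individually would need 16. The saving grows with library size as long as positive clones are rare.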
- By 1995
- Integrated YAC contigs for chromosomes 3, 12, 16, 21, 22 and Y
- 83 Mb of cDNA sequence
- 75% of genome in 225 contigs
- 21.3 Mb of C. elegans sequence
- Expressed Sequence Tags (ESTs)
- Short cDNA sequence
- Used for gene sequence determination and gene discovery
- Individual clones from a cDNA library
- Represent portions of expressed genes
- ~500-800 nucleotides long
- 88 000 unique cDNA sequences from 37 human tissues
- 300 cDNA libraries
- <20% have no value
- >50% new genes
- Overlapped to form THCs (tentative
human consensus sequences) - contigs
- Errors
- Small sequence gaps
- Repetitive DNA errors
- Cloning artefacts
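How overlapping ESTs might be merged into a consensus (a THC) can be illustrated with exact suffix-prefix overlap. This is a simplified sketch: the sequences are made up, the `min_len` threshold is an assumption, and real EST assembly has to tolerate the sequencing errors and artefacts listed above rather than requiring exact matches.

```python
def overlap_length(left, right, min_len=5):
    """Length of the longest suffix of left that equals a prefix of right.

    Returns 0 if no such overlap of at least min_len bases exists.
    """
    max_len = min(len(left), len(right))
    for n in range(max_len, min_len - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0

def merge_pair(left, right, min_len=5):
    """Merge two ESTs on their exact overlap, or return None if none is found."""
    n = overlap_length(left, right, min_len)
    return left + right[n:] if n else None

# Made-up EST fragments.
est1 = "ATGGCGTACGTTAGC"
est2 = "CGTTAGCTTGACCA"
merged = merge_pair(est1, est2)
print(merged)
```

A short `min_len` risks joining unrelated ESTs that happen to share a few bases, which is one way repetitive DNA produces assembly errors.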
- Problems with Gene Prediction
- Introns
- Alternative splicing
- Low gene density
- Trans-splicing
- 25% of genes are in operons
- Repeat sequences
- e.g. can't tell whether reads A and B truly overlap each other or
each match copies of the same repeat sequence at different locations
- Merging reads from different repeat copies causes over-collapsing of the contig
- Unitig - contig formed from unique sequences
- Unitigger - distinguishes between true alignment
and alignment due to repeat sequences
- Prevents overcollapse
- Based on comparing a region's read depth with the coverage expected
for unique sequence, not just on how many reads overlap
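The coverage reasoning above can be sketched with a simple ratio test. If reads are sampled uniformly, a unique region of a given length should collect a predictable number of reads, and a region with far more reads is probably several repeat copies collapsed together. This is a deliberately simplified illustration and not the statistic the Celera unitigger actually uses. The function names, the `max_ratio` threshold and the numbers are assumptions.

```python
def expected_reads(unitig_len, total_reads, genome_len):
    """Reads expected to start in a unique region under uniform sampling."""
    return unitig_len * total_reads / genome_len

def looks_repetitive(unitig_len, reads_in_unitig, total_reads, genome_len,
                     max_ratio=1.5):
    """Flag a candidate unitig whose read count far exceeds the expectation.

    max_ratio is a hypothetical cut-off: a two-copy repeat collapsed into
    one sequence should show roughly twice the expected read count.
    """
    expected = expected_reads(unitig_len, total_reads, genome_len)
    return reads_in_unitig > max_ratio * expected

# Made-up example: 1 Mb genome, 20,000 reads, 5 kb candidate unitig.
GENOME_LEN = 1_000_000
TOTAL_READS = 20_000
print(expected_reads(5_000, TOTAL_READS, GENOME_LEN))
print(looks_repetitive(5_000, 110, TOTAL_READS, GENOME_LEN))
print(looks_repetitive(5_000, 210, TOTAL_READS, GENOME_LEN))
```

Candidates that pass the check can be treated as unique sequence and used as anchors, while flagged ones are left for later repeat resolution.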
- Celera Pipeline
- Fragments/repeat library
- Screener produces fragments and marks
- Overlapper produces fragment overlaps
- Unitigger produces unitigs and overlaps
- Scaffolder produces unitigs,
contigs, link bundles and scaffolds
- Consensus sequence
- Assembly and evidence
- Links and distances
- Fragment store
- Repeat resolver
- Products
- Fragment store