Introduction |
|
1 | (1) |
|
What This Book Does for You |
|
|
1 | (1) |
|
|
2 | (1) |
|
How This Book Is Organized |
|
|
2 | (2) |
|
Part I: Getting Started in Bioinformatics |
|
|
3 | (1) |
|
Part II: A Survival Guide to Bioinformatics |
|
|
3 | (1) |
|
Part III: Becoming a Pro in Sequence Analysis |
|
|
3 | (1) |
|
Part IV: Becoming a Specialist: Advanced Bioinformatics Techniques |
|
|
3 | (1) |
|
|
4 | (1) |
|
|
4 | (1) |
|
|
4 | (3) |
|
Part I: Getting Started in Bioinformatics |
|
|
7 | (60) |
|
Finding Out What Bioinformatics Can Do for You |
|
|
9 | (20) |
|
|
9 | (1) |
|
Analyzing Protein Sequences |
|
|
10 | (7) |
|
A brief history of sequence analysis |
|
|
12 | (1) |
|
Reading protein sequences from N to C |
|
|
13 | (1) |
|
Working with protein 3-D structures |
|
|
14 | (2) |
|
Protein bioinformatics covered in this book |
|
|
16 | (1) |
|
|
17 | (4) |
|
Reading DNA sequences the right way |
|
|
17 | (1) |
|
The two sides of a DNA sequence |
|
|
18 | (2) |
|
Palindromes in DNA sequences |
|
|
20 | (1) |
|
|
21 | (2) |
|
RNA structures: Playing with sticky strands |
|
|
22 | (1) |
|
More on nucleic acid nomenclature |
|
|
23 | (1) |
|
DNA Coding Regions: Pretending to Work with Protein Sequences |
|
|
23 | (3) |
|
Turning DNA into proteins: The genetic code |
|
|
24 | (1) |
|
More with coding DNA sequences |
|
|
25 | (1) |
|
DNA/RNA bioinformatics covered in this book |
|
|
26 | (1) |
|
Working with Entire Genomes |
|
|
26 | (3) |
|
Genomics: Getting all the genes at once |
|
|
27 | (1) |
|
Genome bioinformatics covered in this book |
|
|
28 | (1) |
|
How Most People Use Bioinformatics |
|
|
29 | (38) |
|
Becoming an Instant Expert with PubMed/Medline |
|
|
29 | (13) |
|
Finding out about a protein by its name |
|
|
30 | (2) |
|
Searching PubMed using authors' names |
|
|
32 | (3) |
|
Searching PubMed using fields |
|
|
35 | (3) |
|
Searching PubMed using limits |
|
|
38 | (3) |
|
A few more tips about PubMed |
|
|
41 | (1) |
|
Retrieving Protein Sequences |
|
|
42 | (9) |
|
ExPASy: A prime Internet site for protein information |
|
|
42 | (3) |
|
More advanced ways to retrieve protein sequences |
|
|
45 | (3) |
|
Retrieving a list of related protein sequences |
|
|
48 | (3) |
|
|
51 | (6) |
|
Not all DNA is coding for protein |
|
|
51 | (1) |
|
Going from protein sequences to DNA sequences |
|
|
52 | (1) |
|
Retrieving the DNA sequence relevant to my protein |
|
|
53 | (4) |
|
Using BLAST to Compare My Protein Sequence to Other Protein Sequences |
|
|
57 | (5) |
|
Making a Multiple Protein Sequence Alignment with ClustalW |
|
|
62 | (5) |
|
Part II: A Survival Guide to Bioinformatics |
|
|
67 | (130) |
|
Using Nucleotide Sequence Databases |
|
|
69 | (36) |
|
Reading into Genes and Genomes |
|
|
70 | (3) |
|
Prokaryotes: Small bugs, simple genes |
|
|
70 | (2) |
|
Eukaryotes: Bigger bugs, complex genes |
|
|
72 | (1) |
|
Making Use (and Sense) of GenBank |
|
|
73 | (13) |
|
Making sense of the GenBank entry of a prokaryotic gene |
|
|
73 | (5) |
|
Making sense of the GenBank entry of a eukaryotic mRNA |
|
|
78 | (1) |
|
Making sense of a GenBank eukaryotic genomic entry |
|
|
79 | (5) |
|
Working with related GenBank entries |
|
|
84 | (1) |
|
Retrieving GenBank entries without accession numbers |
|
|
85 | (1) |
|
Using a Gene-Centric Database |
|
|
86 | (2) |
|
Working with Whole-Genome Databases |
|
|
88 | (9) |
|
Working with complete viral genomes |
|
|
89 | (3) |
|
Working with complete bacterial genomes |
|
|
92 | (2) |
|
More bacterial genomics at TIGR |
|
|
94 | (2) |
|
Microbes from the environment at DoE |
|
|
96 | (1) |
|
Exploring the Human Genome |
|
|
97 | (8) |
|
Finding out about the Ensembl project |
|
|
98 | (7) |
|
Using Protein and Specialized Sequence Databases |
|
|
105 | (24) |
|
From Translated ORFs to Mature Proteins |
|
|
107 | (3) |
|
ORFs: What you see is NOT what you get |
|
|
107 | (2) |
|
A personal final destination for each protein |
|
|
109 | (1) |
|
A combinatorial diversity of folds and functions |
|
|
109 | (1) |
|
Reading a Swiss-Prot Entry |
|
|
110 | (13) |
|
Deciphering the EGFR Swiss-Prot entry |
|
|
110 | (1) |
|
General information about the entry |
|
|
111 | (1) |
|
Name and origin of the protein |
|
|
112 | (2) |
|
|
114 | (1) |
|
|
114 | (2) |
|
|
116 | (2) |
|
|
118 | (1) |
|
|
119 | (4) |
|
Finally, the sequence itself |
|
|
123 | (1) |
|
Finding Out More about Your Protein |
|
|
123 | (6) |
|
Finding out more about ``modified'' amino acids |
|
|
124 | (1) |
|
Some advanced biochemistry sites |
|
|
125 | (1) |
|
Finding out more about biochemical pathways |
|
|
125 | (1) |
|
Finding out more about protein structures |
|
|
126 | (1) |
|
Finding out more about major protein families |
|
|
127 | (2) |
|
Working with a Single DNA Sequence |
|
|
129 | (30) |
|
Catching Errors Before It's Too Late |
|
|
130 | (4) |
|
Removing vector sequences |
|
|
130 | (3) |
|
Cases when you shouldn't discard your sequence |
|
|
133 | (1) |
|
Computing/Verifying a Restriction Map |
|
|
134 | (1) |
|
|
135 | (3) |
|
Analyzing DNA Composition |
|
|
138 | (7) |
|
Establishing the G+C content of your sequence |
|
|
138 | (1) |
|
Counting words in DNA sequences |
|
|
139 | (1) |
|
Counting long words in DNA sequences |
|
|
140 | (2) |
|
Experimenting with other DNA composition analyses |
|
|
142 | (1) |
|
Finding internal repeats in your sequence |
|
|
142 | (3) |
|
Identifying genome-specific repeats in your sequence |
|
|
145 | (1) |
|
Finding Protein-Coding Regions |
|
|
145 | (8) |
|
|
146 | (2) |
|
Analyzing your DNA sequence with GeneMark |
|
|
148 | (1) |
|
Finding internal exons in vertebrate genomic sequences |
|
|
149 | (2) |
|
Complete gene parsing for eukaryotic genomes |
|
|
151 | (1) |
|
Analyzing your sequence with GenomeScan |
|
|
151 | (2) |
|
Assembling Sequence Fragments |
|
|
153 | (4) |
|
Managing large sequencing projects with public software |
|
|
154 | (1) |
|
Assembling your sequences with CAP3 |
|
|
155 | (2) |
|
|
157 | (2) |
|
Working with a Single Protein Sequence |
|
|
159 | (38) |
|
Doing Biochemistry on a Computer |
|
|
160 | (6) |
|
Predicting the main physico-chemical properties of a protein |
|
|
161 | (3) |
|
Interpreting ProtParam results |
|
|
164 | (2) |
|
Digesting a protein in a computer |
|
|
166 | (1) |
|
Doing Primary Structure Analysis |
|
|
166 | (8) |
|
Looking for transmembrane segments |
|
|
168 | (6) |
|
Looking for coiled-coil regions |
|
|
174 | (1) |
|
Predicting Post-Translational Modifications in Your Protein |
|
|
174 | (6) |
|
Looking for PROSITE patterns |
|
|
175 | (2) |
|
Interpreting ScanProsite results |
|
|
177 | (3) |
|
Finding Known Domains in Your Protein |
|
|
180 | (14) |
|
Choosing the right collection of domains |
|
|
182 | (1) |
|
Finding domains with InterProScan |
|
|
183 | (2) |
|
Interpreting InterProScan results |
|
|
185 | (2) |
|
Finding domains with the CD server |
|
|
187 | (2) |
|
Interpreting and understanding CD server results |
|
|
189 | (1) |
|
Finding domains with Motif Scan |
|
|
190 | (4) |
|
Discovering New Domains in Your Proteins |
|
|
194 | (1) |
|
More Protein Analysis for Free over the Internet |
|
|
194 | (3) |
|
Part III: Becoming a Pro in Sequence Analysis |
|
|
197 | (130) |
|
Similarity Searches on Sequence Databases |
|
|
199 | (36) |
|
Understanding the Importance of Similarity |
|
|
200 | (1) |
|
The Most Popular Data-Mining Tool Ever: Blast |
|
|
201 | (18) |
|
Blasting protein sequences |
|
|
201 | (8) |
|
Understanding your Blast output |
|
|
209 | (7) |
|
|
216 | (2) |
|
The Blast way of doing things |
|
|
218 | (1) |
|
Controlling Blast: Choosing the Right Parameters |
|
|
219 | (7) |
|
Controlling the sequence masking |
|
|
220 | (3) |
|
Changing the Blast alignment parameters |
|
|
223 | (1) |
|
Controlling the Blast output |
|
|
224 | (2) |
|
Making Blast Iterative with PSI-Blast |
|
|
226 | (5) |
|
PSI-Blasting protein sequences |
|
|
226 | (2) |
|
Avoiding mistakes when running PSI-Blast |
|
|
228 | (2) |
|
Discovering and using protein domains with Blast and PSI-Blast |
|
|
230 | (1) |
|
Similarity Searches for Free over the Internet |
|
|
231 | (4) |
|
|
235 | (30) |
|
Making Sure You Have the Right Sequences and the Right Methods |
|
|
236 | (3) |
|
Choosing the right sequences |
|
|
236 | (1) |
|
Choosing the right method |
|
|
237 | (2) |
|
|
239 | (15) |
|
Choosing the right dot-plot flavor |
|
|
240 | (1) |
|
Using Dotlet over the Internet |
|
|
241 | (8) |
|
Doing biological analysis with a dot plot |
|
|
249 | (5) |
|
Making Local Alignments over the Internet |
|
|
254 | (7) |
|
Choosing the right local-alignment flavor |
|
|
255 | (1) |
|
Using Lalign to find the ten best local alignments |
|
|
256 | (2) |
|
Interpreting the Lalign output |
|
|
258 | (3) |
|
Making Global Alignments over the Internet |
|
|
261 | (1) |
|
Using Lalign to Make a Global Alignment |
|
|
262 | (1) |
|
Aligning Proteins and DNA |
|
|
262 | (1) |
|
Free Pairwise Sequence Comparisons over the Internet |
|
|
262 | (3) |
|
Building a Multiple Sequence Alignment |
|
|
265 | (38) |
|
Finding Out if a Multiple Sequence Alignment Can Help You |
|
|
266 | (4) |
|
Identifying situations where multiple alignments do not help |
|
|
267 | (1) |
|
Helping your research with multiple sequence alignments |
|
|
267 | (3) |
|
Choosing the Right Sequences |
|
|
270 | (11) |
|
The kinds of sequences you're looking for |
|
|
271 | (4) |
|
Gathering your sequences with online Blast servers |
|
|
275 | (6) |
|
Choosing the Right Method of Multiple Sequence Alignment |
|
|
281 | (10) |
|
|
282 | (5) |
|
Aligning sequences and structures with Tcoffee |
|
|
287 | (4) |
|
Crunching large datasets with MUSCLE |
|
|
291 | (1) |
|
Interpreting Your Multiple Sequence Alignment |
|
|
291 | (6) |
|
Recognizing the good parts in a protein alignment |
|
|
292 | (2) |
|
Taking your multiple alignment further |
|
|
294 | (3) |
|
Comparing Sequences That You Can't Align |
|
|
297 | (2) |
|
Making multiple local alignments with the Gibbs sampler |
|
|
298 | (1) |
|
Searching conserved patterns |
|
|
299 | (1) |
|
Internet Resources for Doing Multiple Sequence Comparisons |
|
|
299 | (4) |
|
Making multiple alignments with ClustalW around the clock |
|
|
300 | (1) |
|
Finding your favorite alignment method |
|
|
300 | (1) |
|
Searching for motifs or patterns |
|
|
301 | (2) |
|
Editing and Publishing Alignments |
|
|
303 | (24) |
|
Getting Your Multiple Alignment in the Right Format |
|
|
305 | (8) |
|
Recognizing the main formats |
|
|
307 | (1) |
|
Working with the right format |
|
|
307 | (2) |
|
|
309 | (3) |
|
Watching out for lost data |
|
|
312 | (1) |
|
Using Jalview to Edit Your Multiple Alignment Online |
|
|
313 | (6) |
|
|
314 | (2) |
|
Editing a group of sequences |
|
|
316 | (2) |
|
Useful features of Jalview |
|
|
318 | (1) |
|
Saving your alignment in Jalview |
|
|
318 | (1) |
|
Preparing Your Multiple Alignment for Publication |
|
|
319 | (4) |
|
|
319 | (3) |
|
|
322 | (1) |
|
Editing and Analyzing Multiple Sequence Alignments for Free over the Internet |
|
|
323 | (4) |
|
Finding multiple-sequence-alignment editors |
|
|
323 | (1) |
|
Finding tools to interpret your multiple sequence alignment |
|
|
324 | (1) |
|
Finding tools for beautifying your multiple alignments |
|
|
325 | (2) |
|
Part IV: Becoming a Specialist: Advanced Bioinformatics Techniques |
|
|
327 | (76) |
|
Working with Protein 3-D Structures |
|
|
329 | (24) |
|
From Primary to Secondary Structures |
|
|
330 | (6) |
|
Predicting the secondary structure of a protein sequence |
|
|
330 | (4) |
|
Predicting additional structural features |
|
|
334 | (2) |
|
From the Primary Structure to the 3-D Structure |
|
|
336 | (14) |
|
Retrieving and displaying a 3-D structure from a PDB site |
|
|
337 | (3) |
|
Guessing the 3-D structure of your protein |
|
|
340 | (3) |
|
Looking at sequence features in 3-D |
|
|
343 | (7) |
|
|
350 | (3) |
|
Finding proteins with similar shapes |
|
|
350 | (1) |
|
Finding other PDB viewers |
|
|
350 | (1) |
|
Classifying your PDB structure |
|
|
351 | (1) |
|
|
351 | (1) |
|
Folding proteins in a computer |
|
|
351 | (1) |
|
Threading sequences onto PDB structures |
|
|
351 | (1) |
|
Looking at structures in movement |
|
|
352 | (1) |
|
|
352 | (1) |
|
|
353 | (18) |
|
Predicting, Modeling and Drawing RNA Secondary Structures |
|
|
354 | (1) |
|
|
355 | (7) |
|
Interpreting mfold results |
|
|
359 | (2) |
|
Forcing interaction in mfold |
|
|
361 | (1) |
|
Searching Databases and Genomes for RNA Sequences |
|
|
362 | (5) |
|
Finding tRNAs in a genome |
|
|
363 | (1) |
|
Using PatScan to look for RNA patterns |
|
|
363 | (4) |
|
Finding the ``New'' RNAs: miRNAs and siRNAs |
|
|
367 | (1) |
|
Doing RNA Analysis for Free over the Internet |
|
|
368 | (3) |
|
Studying evolution with ribosomal RNA |
|
|
369 | (1) |
|
Finding the small, non-coding RNA you need |
|
|
369 | (1) |
|
|
370 | (1) |
|
Building Phylogenetic Trees |
|
|
371 | (32) |
|
Finding Out What Phylogenetic Trees Can Do for You |
|
|
372 | (1) |
|
Preparing Your Phylogenetic Data |
|
|
373 | (10) |
|
Choosing the right sequences for the right tree |
|
|
374 | (6) |
|
Preparing your multiple sequence alignment |
|
|
380 | (3) |
|
Building the Kind of Tree You Need |
|
|
383 | (17) |
|
|
383 | (15) |
|
Knowing what's what in your tree |
|
|
398 | (1) |
|
Displaying your phylogenetic tree |
|
|
399 | (1) |
|
Doing Phylogeny for Free over the Internet |
|
|
400 | (3) |
|
|
400 | (1) |
|
Finding generic resources |
|
|
401 | (1) |
|
Collections of orthologous genes |
|
|
402 | (1) |
|
|
403 | (14) |
|
The Ten (Okay, Twelve) Commandments for Using Servers |
|
|
405 | (6) |
|
Keep in Mind: Your Data Is Never Secure on the Web |
|
|
406 | (1) |
|
Remember the Server, the Database, and the Program Version You Used |
|
|
406 | (1) |
|
Write Down the Sequence-Identification Numbers |
|
|
407 | (1) |
|
Write Down the Program Parameters |
|
|
407 | (1) |
|
Save Your Internet Results the Right Way |
|
|
407 | (1) |
|
|
408 | (1) |
|
Make Sure You Can Trust Your Alignments |
|
|
408 | (1) |
|
Use Different Programs to Check Borderline Results |
|
|
409 | (1) |
|
Stay Away from Unpublished Methods! |
|
|
409 | (1) |
|
Databases Are Not Like Good Wine |
|
|
409 | (1) |
|
Just Because It Looks Free Doesn't Mean It Is Free |
|
|
410 | (1) |
|
Biting the Bullet at the Right Time |
|
|
410 | (1) |
|
Some Useful Bioinformatics Resources |
|
|
411 | (6) |
|
|
411 | (1) |
|
Ten Major Bioinformatics Software Programs |
|
|
412 | (2) |
|
Ten Major Resource Locators |
|
|
414 | (1) |
|
Some Places to Find Out What's Really Going On |
|
|
415 | (2) |
Index |
|
417 | |