Bioinformatics, Licence L3: Lecture 4
P. Derreumaux

General-purpose data banks and databases

The first issue of each year of Nucleic Acids Research is devoted to articles on biological databases.

Why do we need electronic databases?
1. Data explosion
2. Data distribution

What is a database?
1. A computerized storehouse of data (records)
2. Allows user-defined queries
3. Allows extraction of specified records
4. Allows adding, changing, removing, and merging of records
5. Uses standardized formats

Structure of databases
1. ‘Structural’ aspects (ID, reference, organism, links to other banks, function, phylogeny)
2. ‘Expression’ aspects (coding sequence, exons, promoters, 3D structure, motifs, domains)
3. Reduction of redundancy, e.g., the NR (non-redundant) protein database
4. Quality criteria vary: automatic or human control

‘Classic’ structure of a protein-coding gene
[Figure: schematic of a gene. Regions: 5’ UTR, CDS, 3’ UTR. Segments: exon, intron, exon, intron, exon. Signals: CAAT, TATA, AUG, T, AAUAA.]
But a ‘classic’ gene does not exist...

GenBank database record
Callouts on the slide: Start, Accession code, Identifiers.

    LOCUS       AF350270_1   691 aa   linear   INV 09-APR-2001
    DEFINITION  fibroin 2 [Dolomedes tenebrosus].
    ACCESSION   AAK30599
    PID         g13561992
    VERSION     AAK30599.1  GI:13561992
    DBSOURCE    locus AF350270 accession AF350270.1
    KEYWORDS    .
    SOURCE      Dolomedes tenebrosus.
      ORGANISM  Dolomedes tenebrosus
                Eukaryota; Metazoa; Arthropoda; Chelicerata; Arachnida; Araneae;
                Araneomorphae; Entelegynae; Lycosoidea; ...
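
To make the "standardized format" point concrete, the GenBank record above can be read with a few lines of Python. This is only a minimal sketch, not NCBI's or any library's parser. It assumes that keyword lines are indented by fewer than 12 spaces and that continuation lines are indented by 12 spaces or more. The names `parse_header`, `CONTINUATION_INDENT` and `RECORD` are hypothetical and chosen for illustration; the lineage stays truncated ("...") exactly as on the slide.

```python
# Minimal sketch: split the header of a GenBank/GenPept flat-file record
# into keyword -> list of text lines. Assumption (not from the lecture):
# continuation lines are indented by at least 12 spaces, keyword lines by fewer.

CONTINUATION_INDENT = 12

# Record text copied from the slide; the lineage is truncated ("...") in the source.
RECORD = """\
LOCUS       AF350270_1   691 aa   linear   INV 09-APR-2001
DEFINITION  fibroin 2 [Dolomedes tenebrosus].
ACCESSION   AAK30599
PID         g13561992
VERSION     AAK30599.1  GI:13561992
DBSOURCE    locus AF350270 accession AF350270.1
KEYWORDS    .
SOURCE      Dolomedes tenebrosus.
  ORGANISM  Dolomedes tenebrosus
            Eukaryota; Metazoa; Arthropoda; Chelicerata; Arachnida; Araneae;
            Araneomorphae; Entelegynae; Lycosoidea; ...
"""


def parse_header(text):
    """Map each keyword (LOCUS, ACCESSION, ...) to the list of its text lines."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        indent = len(line) - len(line.lstrip())
        if indent < CONTINUATION_INDENT:
            # Keyword line: the first token is the keyword, the rest is its value.
            current, _, value = line.strip().partition(" ")
            fields[current] = [value.strip()]
        elif current is not None:
            # Continuation line: belongs to the most recent keyword.
            fields[current].append(line.strip())
    return fields


fields = parse_header(RECORD)

# VERSION holds "accession.version" followed by the GI identifier.
version_tokens = fields["VERSION"][0].split()
accession, _, version_number = version_tokens[0].partition(".")
gi = version_tokens[1].partition(":")[2]

# ORGANISM: the first line is the species name, the following lines the lineage.
organism = fields["ORGANISM"][0]
lineage_text = " ".join(fields["ORGANISM"][1:])
lineage = [t.strip() for t in lineage_text.split(";")
           if t.strip() and t.strip() != "..."]

print("accession:", accession, "version:", version_number, "GI:", gi)
print("organism:", organism)
print("lineage:", " > ".join(lineage))
```

Storing each field as a list of lines keeps the species name separate from the taxonomic lineage that follows it under ORGANISM. For real work, an established parser such as Biopython's `SeqIO` module would be the more robust choice.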