DataStorage » History » Version 4
Anonymous, 10/05/2007 10:16 PM
1 | 1 | Anonymous | = Data Storage = |
---|---|---|---|
2 | 1 | Anonymous | |
3 | 3 | Anonymous | This page describes the design of the database that is (or will be) used to store all necessary pieces of information obtained from the "stand-alone" ProCKSI ''core'' application (see [wiki:DataStandardisation]). |
4 | 1 | Anonymous | |
5 | 3 | Anonymous | == Database Design for the (static) Protein Multiverse == |
6 | 2 | Anonymous | [[Image(ProteinMultiverseDataBase.png)]] |
7 | 1 | Anonymous | |
8 | 1 | Anonymous | '''Explanation of the database design''': |
9 | 1 | Anonymous | * There are multiple similarity comparison ''Methods'': e.g. USM, MaxCMO, DaliLite, ... |
10 | 1 | Anonymous | * There are multiple similarity ''Measures'': e.g. Z-score, TM-score, Number of Alignments, ... |
11 | 1 | Anonymous | * Some ''Methods'' produce ''Measures'' with the same name, but not necessarily the same meaning: e.g. DaliLite/Z, TMalign/Z, ...[[br]] |
12 | 1 | Anonymous | Thus, a ''!MethodMeasures'' relation is necessary. |
13 | 1 | Anonymous | |
14 | 1 | Anonymous | * Each ''Method'' can have multiple (different) ''Parameters'': e.g. USM/Compressor, USM/Equation, ... |
15 | 1 | Anonymous | * Each ''Method'' can have multiple (different) ''!ParameterOptions'': e.g. USM/Compressor/bzip, USM/Compressor/gzip, ... |
16 | 1 | Anonymous | * A ''!ParameterSet'' is used to calculate the ''Similarity'' of ''!StructurePairs''. It is a collection of specific ''!ParameterSetOptions''. If a ''Method'' does not use any parameters, it is not included in the ''!ParameterSet'', but is still accessible via the ''!MethodMeasures'' relation. |
17 | 1 | Anonymous | |
18 | 1 | Anonymous | * The ''!StructurePairs'' relation holds all possible combinations of ''Structures'', and a link to a further ''Results'' file in XML format. This file may contain results for multiple ''!StructurePairs'', e.g. alignments, matrices, etc. |
19 | 1 | Anonymous | * Each ''Structure'' is uniquely determined by its PDB code, model and chain. (Domains are not taken into account yet.) The location of the PDB file is given, along with a link to a further ''Results'' file in XML format. This file may contain additional information for multiple ''Structures'', e.g. sequence, secondary structure, experimental resolution, ... |
20 | 1 | Anonymous | |
21 | 1 | Anonymous | * Each ''Structure'' is extended by further classification information from ''CATH'' and ''SCOP''. |
22 | 3 | Anonymous | |
23 | 3 | Anonymous | |
24 | 3 | Anonymous | == Extended Database Design for the (static) Protein Multiverse == |
25 | 3 | Anonymous | |
26 | 4 | Anonymous | [[Image(ProteinMultiverseDataBaseExt1.png)]] |
27 | 3 | Anonymous | |
28 | 3 | Anonymous | This proposal for an extended database design for the (static) Protein Multiverse aims to include not only ''Comparisons'' but also ''Transformations'' (following the latest development of the I/O specifications for the ProCKSI "stand-alone" ''core'' application): |
29 | 3 | Anonymous | * A ''Transformation'' is a process that derives ONE (main) ''Result'' from ONE single input file.[[br]] |
30 | 3 | Anonymous | __Example__: The transformation of ''Structure'', ''Tree'', ''!SimilarityMatrix'', etc., using a certain ''Method'' with a certain ''!ParameterSet'', produces a contact map, a tree, ... |
31 | 3 | Anonymous | * A ''Comparison'' is a process that derives ONE (main) ''Result'' from TWO input files. [[br]] |
32 | 3 | Anonymous | __Example__: The comparison of ''Structures'', ''Trees'', etc., using a ''Method'' with a certain ''!ParameterSet'', produces a similarity value and an alignment. |
33 | 3 | Anonymous | * A ''Composition'' is a process that derives ONE (main) ''Result'' from SEVERAL input files. [[br]] |
34 | 3 | Anonymous | __Example__: The composition of ''!SimilarityMatrices'', ''Trees'', using a ''Method'' with a certain ''!ParameterSet'', produces a consensus similarity matrix, a consensus tree, ... |
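
Two points from the basic design described earlier on this page can be illustrated with a small sketch: the ''!MethodMeasures'' relation that keeps equally named measures of different methods apart (DaliLite/Z vs. TMalign/Z), and the rule that a ''Structure'' is uniquely determined by its PDB code, model and chain. The sketch uses Python's built-in `sqlite3` module only for illustration. All table and column names, the helper functions `build_demo_db` and `method_measures`, and the choice of SQLite are assumptions; the page itself only names the relations, and the actual layout is the one shown in the diagrams.

```python
import sqlite3

# Hypothetical column names: the wiki page only names the relations.
SCHEMA = """
CREATE TABLE Methods  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE Measures (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);

-- One row per (Method, Measure) combination, so that DaliLite/Z and
-- TMalign/Z are distinct entries although the measure name is the same.
CREATE TABLE MethodMeasures (
    id         INTEGER PRIMARY KEY,
    method_id  INTEGER NOT NULL REFERENCES Methods(id),
    measure_id INTEGER NOT NULL REFERENCES Measures(id),
    UNIQUE (method_id, measure_id)
);

-- A Structure is uniquely determined by PDB code, model and chain.
CREATE TABLE Structures (
    id           INTEGER PRIMARY KEY,
    pdb_code     TEXT    NOT NULL,
    model        INTEGER NOT NULL,
    chain        TEXT    NOT NULL,
    pdb_file     TEXT,   -- location of the PDB file
    results_file TEXT,   -- link to the XML results file
    UNIQUE (pdb_code, model, chain)
);
"""


def build_demo_db():
    """Create an in-memory database and register DaliLite/Z and TMalign/Z."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO Methods (name) VALUES (?)",
        [("DaliLite",), ("TMalign",)],
    )
    conn.execute("INSERT INTO Measures (name) VALUES ('Z')")
    # Link every method to the single measure 'Z'.
    conn.execute(
        "INSERT INTO MethodMeasures (method_id, measure_id) "
        "SELECT m.id, s.id FROM Methods m, Measures s"
    )
    conn.commit()
    return conn


def method_measures(conn):
    """Return all Method/Measure combinations as 'Method/Measure' strings."""
    rows = conn.execute(
        "SELECT m.name || '/' || s.name "
        "FROM MethodMeasures mm "
        "JOIN Methods m ON m.id = mm.method_id "
        "JOIN Measures s ON s.id = mm.measure_id "
        "ORDER BY m.name"
    )
    return [row[0] for row in rows]
```

With this layout, a similarity value would reference a ''!MethodMeasures'' row rather than a bare measure name, and inserting the same (PDB code, model, chain) triple twice is rejected by the uniqueness constraint.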