Version 1 (Anonymous, 10/05/2007 09:34 PM) → Version 2/16 (Anonymous, 10/05/2007 09:38 PM)
= Data Storage =
This page describes the design of the database that is (or will be) used to store all necessary pieces of information obtained from the "stand-alone" ProCKSI application (see [wiki:DataStandardisation]).
== Database Design for the (static) Protein Multiverse ==
[[Image(ProteinMultiverseDataBase.png)]] [attachment:ProteinMultiverseDataBase.png]
'''Explanation of the database design''':
* There are multiple similarity comparison ''Methods'': e.g. USM, MaxCMO, DaliLite, ...
* There are multiple similarity ''Measures'': e.g. Z-score, TM-score, Number of Alignments, ...
* Some ''Methods'' produce ''Measures'' with the same name, but not necessarily the same meaning: e.g. DaliLite/Z, TMalign/Z, ...[[br]]
Thus, a ''!MethodMeasures'' relation is necessary.
* Each ''Method'' can have multiple (different) ''Parameters'': e.g. USM/Compressor, USM/Equation, ...
* Each ''Parameter'' of a ''Method'' can have multiple (different) ''!ParameterOptions'': e.g. USM/Compressor/bzip, USM/Compressor/gzip, ...
* A ''!ParameterSet'' is used to calculate the ''Similarity'' of ''!StructurePairs''. It is a collection of specific ''!ParameterSetOptions''. If a ''Method'' does not use any parameters, it is not included in the ''!ParameterSet'', but it is still accessible via the ''!MethodMeasure'' relation.
* The ''!StructurePairs'' relation holds all possible combinations of ''Structures'', each with a link to a further ''Results'' file in XML format. This file may contain results for multiple ''!StructurePairs'', e.g. alignments, matrices, etc.
* Each ''Structure'' is uniquely determined by its PDB code, model and chain. (Domains are not taken into account yet.) The location of the PDB file is stored, together with a link to a further ''Results'' file in XML format. This file may contain additional information for multiple ''Structures'', e.g. sequence, secondary structure, experimental resolution, ...
* Each ''Structure'' is extended by further classification information from ''CATH'' and ''SCOP''.
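
The relations described above can be sketched as a small schema. The following is only an illustration built with Python's standard `sqlite3` module, not the actual ProCKSI database: all column names, the uniqueness constraints, and the helper functions `build_db`, `add_method_measure` and `methods_for_measure` are assumptions made for this example. The ''!ParameterSet'', ''!StructurePairs'', ''CATH'' and ''SCOP'' relations are left out for brevity.

```python
import sqlite3

# Hypothetical column names; only the table names follow the design notes.
SCHEMA = """
CREATE TABLE Methods  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE Measures (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);

-- A measure name such as 'Z' only has a meaning together with its method.
CREATE TABLE MethodMeasures (
    id         INTEGER PRIMARY KEY,
    method_id  INTEGER NOT NULL REFERENCES Methods(id),
    measure_id INTEGER NOT NULL REFERENCES Measures(id),
    UNIQUE (method_id, measure_id)
);

CREATE TABLE Parameters (
    id        INTEGER PRIMARY KEY,
    method_id INTEGER NOT NULL REFERENCES Methods(id),
    name      TEXT NOT NULL,
    UNIQUE (method_id, name)
);

CREATE TABLE ParameterOptions (
    id           INTEGER PRIMARY KEY,
    parameter_id INTEGER NOT NULL REFERENCES Parameters(id),
    value        TEXT NOT NULL,
    UNIQUE (parameter_id, value)
);

-- A structure is identified by PDB code, model and chain (no domains yet).
CREATE TABLE Structures (
    id           INTEGER PRIMARY KEY,
    pdb_code     TEXT NOT NULL,
    model        INTEGER NOT NULL,
    chain        TEXT NOT NULL,
    pdb_file     TEXT,  -- location of the PDB file
    results_file TEXT,  -- link to the XML results file
    UNIQUE (pdb_code, model, chain)
);
"""


def build_db():
    """Create an in-memory database holding the sketched schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_method_measure(conn, method, measure):
    """Register a method, a measure, and the link between them (idempotent)."""
    conn.execute("INSERT OR IGNORE INTO Methods (name) VALUES (?)", (method,))
    conn.execute("INSERT OR IGNORE INTO Measures (name) VALUES (?)", (measure,))
    conn.execute(
        "INSERT OR IGNORE INTO MethodMeasures (method_id, measure_id) "
        "SELECT m.id, s.id FROM Methods m, Measures s "
        "WHERE m.name = ? AND s.name = ?",
        (method, measure),
    )


def methods_for_measure(conn, measure):
    """Return the names of all methods that produce a measure of this name."""
    rows = conn.execute(
        "SELECT m.name FROM MethodMeasures mm "
        "JOIN Methods m ON m.id = mm.method_id "
        "JOIN Measures s ON s.id = mm.measure_id "
        "WHERE s.name = ? ORDER BY m.name",
        (measure,),
    )
    return [row[0] for row in rows]
```

With this layout, registering DaliLite/Z and TMalign/Z stores the measure name `Z` once and creates two separate ''!MethodMeasures'' rows, so a result can always be traced back to the method that gives the score its meaning. The `UNIQUE (pdb_code, model, chain)` constraint mirrors the statement that these three values determine a ''Structure''.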