DataManagement » History » Version 1
Anonymous, 07/09/2007 10:45 AM
| 1 | 1 | Anonymous | = Data Management and Representation = |
|---|---|---|---|
| 2 | 1 | Anonymous | |
| 3 | 1 | Anonymous | == Definitions == |
| 4 | 1 | Anonymous | |
| 5 | 1 | Anonymous | * '''User''' (to be implemented in the future): |
| 6 | 1 | Anonymous | * Represented by a unique email address |
| 7 | 1 | Anonymous | * Authentication to gain access to user data (requests, personalised settings) |
| 8 | 1 | Anonymous | * Manage multiple requests |
| 9 | 1 | Anonymous | |
| 10 | 1 | Anonymous | * '''Request''': |
| 11 | 1 | Anonymous | * Unique handle for the combination of a dataset, tasks, and request parameters |
| 12 | 1 | Anonymous | * Request parameters: e.g. request description, settings for notification by email |
| 13 | 1 | Anonymous | |
| 14 | 1 | Anonymous | * '''Task''': |
| 15 | 1 | Anonymous | * Something to be performed with the given dataset[[br]] |
| 16 | 1 | Anonymous | e.g. calculation of PDB structure pictures (1D), or comparison of pairs of proteins with a given similarity method (2D) |
| 17 | 1 | Anonymous | * Task parameters: e.g. parameters for each comparison method, output parameters for picture generation, ... |
| 18 | 1 | Anonymous | |
| 19 | 1 | Anonymous | * '''Job''': |
| 20 | 1 | Anonymous | * Everything that lives in a queue[[br]] |
| 21 | 1 | Anonymous | e.g. local queue (ProCKSI cluster), remote queue (University cluster), external queue (web service, grid) |
| 22 | 1 | Anonymous | * Currently, a job is equal to a task:[[br]] |
| 23 | 1 | Anonymous | e.g. task = pairwise comparison of the proteins in the ''entire'' dataset with ''several'' given similarity methods[[br]] |
| 24 | 1 | Anonymous | jobs = ''separate'' jobs, each calculating all pairwise comparisons of the entire dataset with ''one'' similarity method |
| 25 | 1 | Anonymous | * Future plans:[[br]] |
| 26 | 1 | Anonymous | Divide 3D problem space into subsets of datasets and methods, each subset being an independent job[[br]] |
| 27 | 1 | Anonymous | See next section for further details on the ''3D Problem Space'' |
| 28 | 1 | Anonymous | |
| 29 | 1 | Anonymous | * '''Dataset''': |
| 30 | 1 | Anonymous | * Currently: Collection of PDB structures, previously calculated similarity matrices |
| 31 | 1 | Anonymous | * Future plans: Previously calculated similarity matrices should be uploaded in a post-processing step, not in a pre-processing step (ticket:28) |
| 32 | 1 | Anonymous | |
| 33 | 1 | Anonymous | * '''Results''': |
| 34 | 1 | Anonymous | * Currently, entire similarity matrices from different sources |
| 35 | 1 | Anonymous | * Future plans: Generate similarity matrices directly from single pairwise comparison results stored in the database |
| 36 | 1 | Anonymous | |
| 37 | 1 | Anonymous | |
| 38 | 1 | Anonymous | == The 3D Problem and Solution Spaces == |
| 39 | 1 | Anonymous | * '''Problem Space''':[[br]] |
| 40 | 1 | Anonymous | The problem space for an all-against-all comparison of a dataset of P protein structures using M different similarity comparison methods can be represented as a 3D cube: [[br]] |
| 41 | 1 | Anonymous | x: Dataset: list of proteins[[br]] |
| 42 | 1 | Anonymous | y: Dataset: list of proteins[[br]] |
| 43 | 1 | Anonymous | z: Tasks: list of similarity comparison methods |
| 44 | 1 | Anonymous | * '''Partitioning the Problem Space''':[[br]] |
| 45 | 1 | Anonymous | To calculate all cells in the 3D problem space as efficiently as possible, the space can be subdivided into sub-cubes, which are called jobs once they are placed into the queue of a queuing system. Examples:[[br]] |
| 46 | 1 | Anonymous | a. Comparison of ''one pair of proteins'' using ''one method'' in the task list => PxPxM jobs, each performing 1 comparison |
| 47 | 1 | Anonymous | b. All-against-all comparison of the ''entire dataset'' with ''one method'' => M jobs, each performing PxP comparisons |
| 48 | 1 | Anonymous | c. Comparison of ''one pair of proteins'' using ''all methods'' in the task list => PxP jobs, each performing M comparisons |
| 49 | 1 | Anonymous | d. Intelligent partitioning of the 3D problem space, comparing a subset of proteins with a subset of methods |
| 50 | 1 | Anonymous | * '''Solution Space''':[[br]] |
| 51 | 1 | Anonymous | Each similarity comparison ''method'' can provide several similarity ''measures''[[br]] |
| 52 | 1 | Anonymous | For one slice in the 3D problem space using one particular method, we might get several slices in the 3D solution space, one per measure that the method provides |
| 53 | 1 | Anonymous |
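
The partitioning examples a. to d. above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ProCKSI's actual implementation: the function names `partition` and `comparisons`, the chunk-size parameters, and the protein and method names are all hypothetical. The sketch follows the PxPxM counting used above, so self-comparisons and both orderings of a pair are counted.

```python
from itertools import product


def chunks(items, size):
    """Split a list into consecutive blocks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def partition(proteins, methods, protein_chunk, method_chunk):
    """Split the P x P x M problem space into sub-cubes (jobs).

    Hypothetical helper: each job covers a block of proteins on the
    x axis, a block of proteins on the y axis, and a block of methods
    on the z axis.
    """
    x_blocks = chunks(proteins, protein_chunk)
    y_blocks = chunks(proteins, protein_chunk)
    z_blocks = chunks(methods, method_chunk)
    return [
        {"x": xs, "y": ys, "methods": ms}
        for xs, ys, ms in product(x_blocks, y_blocks, z_blocks)
    ]


def comparisons(job):
    """Number of cells of the problem space covered by one job."""
    return len(job["x"]) * len(job["y"]) * len(job["methods"])


# Made-up example data: P = 4 proteins, M = 3 methods.
proteins = ["p1", "p2", "p3", "p4"]
methods = ["m1", "m2", "m3"]

jobs_a = partition(proteins, methods, 1, 1)  # a. P x P x M jobs, 1 comparison each
jobs_b = partition(proteins, methods, 4, 1)  # b. M jobs, P x P comparisons each
jobs_c = partition(proteins, methods, 1, 3)  # c. P x P jobs, M comparisons each
jobs_d = partition(proteins, methods, 2, 2)  # d. one possible intermediate partitioning
```

Whatever chunk sizes are chosen, the jobs together always cover all PxPxM cells exactly once. Only the number of jobs and the amount of work per job change, which is the trade-off that an "intelligent" partitioning (case d.) would have to balance.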