DataManagement » History » Version 1

Anonymous, 07/09/2007 10:45 AM

= Data Management and Representation =

== Definitions ==

 * '''User''' (to be implemented in the future):
   * Represented by a unique email address
   * Must authenticate to gain access to user data (requests, personalised settings)
   * Can manage multiple requests

 * '''Request''':
   * Unique handle for the combination of a dataset, tasks, and request parameters
   * Request parameters: e.g. request description, settings for notification by email

 * '''Task''':
   * Something to be performed with the given dataset[[br]]
     e.g. calculation of PDB structure pictures (1D), or comparison of pairs of proteins with a given similarity method (2D)
   * Task parameters: e.g. parameters for each comparison method, output parameters for picture generation, ...

 * '''Job''':
   * Everything that lives in a queue[[br]]
     e.g. local queue (ProCKSI cluster), remote queue (University cluster), external queue (web service, grid)
   * Currently, a job corresponds to a task restricted to a single similarity method:[[br]]
     e.g. task = pairwise comparison of the proteins in the ''entire'' dataset with ''several'' given similarity methods[[br]]
     jobs = ''separate'' jobs, each calculating all pairwise comparisons of the entire dataset with ''one'' similarity method
   * Future plans:[[br]]
     Divide the 3D problem space into subsets of the dataset and of the methods, each subset being an independent job[[br]]
     See the next section for further details on the ''3D Problem Space''

 * '''Dataset''':
   * Currently: a collection of PDB structures, plus previously calculated similarity matrices
   * Future plans: previously calculated similarity matrices should be uploaded in a post-processing step, not in a pre-processing step (ticket:28)

 * '''Results''':
   * Currently: entire similarity matrices from different sources
   * Future plans: generate similarity matrices directly from the single pairwise comparison results stored in the database

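The relationships between the definitions above can be sketched as a small data model. This is only an illustration, not the ProCKSI implementation: all class names, field names, the queue label, and the placeholder identifiers are assumptions made for this example. It models the current behaviour in which each task (one similarity method) becomes one job.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Collection of PDB structures (identifiers only in this sketch)."""
    pdb_ids: List[str]


@dataclass
class Task:
    """Something to be performed with the dataset, plus its parameters."""
    name: str  # hypothetical label, e.g. the name of a similarity method
    parameters: Dict[str, str] = field(default_factory=dict)


@dataclass
class Job:
    """A unit of work that lives in a queue."""
    task: Task
    queue: str = "local"  # hypothetical queue label


@dataclass
class Request:
    """Unique handle for a dataset, its tasks, and the request parameters."""
    handle: str
    dataset: Dataset
    tasks: List[Task]
    parameters: Dict[str, str] = field(default_factory=dict)

    def jobs(self) -> List[Job]:
        # Assumed mapping for the current state: one job per task.
        return [Job(task=task) for task in self.tasks]


# Placeholder identifiers, not real PDB entries or method names.
request = Request(
    handle="req-001",
    dataset=Dataset(pdb_ids=["prot1", "prot2", "prot3"]),
    tasks=[Task("method_a"), Task("method_b")],
    parameters={"description": "demo", "notify_by_email": "yes"},
)
job_names = [job.task.name for job in request.jobs()]
```

A finer-grained mapping from tasks to jobs, as described under the future plans, would only need a different `jobs()` method; the other classes would stay the same.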
== The 3D Problem and Solution Spaces ==
 * '''Problem Space''':[[br]]
   The problem space for an all-against-all comparison of a dataset of P protein structures using M different similarity comparison methods can be represented as a 3D cube:[[br]]
   x: Dataset: list of proteins[[br]]
   y: Dataset: list of proteins[[br]]
   z: Tasks: list of similarity comparison methods
 * '''Partitioning the Problem Space''':[[br]]
   To calculate all cells in the 3D problem space as efficiently as possible, the space can be subdivided into sub-cubes, which are called jobs once they are placed in the queue of a queueing system. Examples:[[br]]
   a. Comparison of ''one pair of proteins'' using ''one method'' in the task list => PxPxM jobs, each performing 1 comparison
   b. All-against-all comparison of the ''entire dataset'' with ''one method'' => M jobs, each performing PxP comparisons
   c. Comparison of ''one pair of proteins'' using ''all methods'' in the task list => PxP jobs, each performing M comparisons
   d. Intelligent partitioning of the 3D problem space, comparing a subset of proteins with a subset of methods
 * '''Solution Space''':[[br]]
   Each similarity comparison ''method'' can provide several similarity ''measures''[[br]]
   For one slice in the 3D problem space using one particular method, we might get several slices in the 3D solution space, one for each measure
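The job counts for partitioning schemes a, b and c can be checked with a short sketch that enumerates every cell of the cube and groups the cells into jobs. The function name, the strategy labels, the protein and method names, and the measures listed per method are all assumptions made for this example; scheme d is left out because the text does not specify how the subsets would be chosen.

```python
from itertools import product
from typing import Dict, List, Tuple

Cell = Tuple[str, str, str]  # (protein on x, protein on y, method on z)


def partition(proteins: List[str], methods: List[str], strategy: str) -> List[List[Cell]]:
    """Group all P x P x M cells into jobs according to scheme a, b or c."""
    jobs: Dict[tuple, List[Cell]] = {}
    for x, y, m in product(proteins, proteins, methods):
        if strategy == "a":    # one pair of proteins, one method
            key = (x, y, m)
        elif strategy == "b":  # entire dataset, one method
            key = (m,)
        elif strategy == "c":  # one pair of proteins, all methods
            key = (x, y)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        jobs.setdefault(key, []).append((x, y, m))
    return list(jobs.values())


proteins = ["p1", "p2", "p3"]  # P = 3, placeholder names
methods = ["m1", "m2"]         # M = 2, placeholder names

# For each scheme: (number of jobs, comparisons per job).
counts = {}
for strategy in ("a", "b", "c"):
    jobs = partition(proteins, methods, strategy)
    counts[strategy] = (len(jobs), len(jobs[0]))

# Solution space: each method may yield several measures (hypothetical names),
# so one problem-space slice per method becomes one solution slice per measure.
measures = {"m1": ["measure_1", "measure_2"], "m2": ["measure_3"]}
solution_slices = sum(len(measures[m]) for m in methods)
```

With P = 3 and M = 2, `counts` holds 18 jobs of 1 comparison for scheme a, 2 jobs of 9 comparisons for scheme b, and 9 jobs of 2 comparisons for scheme c, matching PxPxM, M and PxP. In this example `solution_slices` is 3, because the two problem-space slices expand to one solution slice per measure.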