DataStandardisation » History » Version 6

Paweł Widera, 10/05/2007 08:35 PM
XML output definition added.

= Standardising Results with XML =

ProCKSI utilises a variety of similarity comparison methods (e.g. USM, MaxCMO, TMalign, ...), each producing different similarity measures (e.g. Zscore, TMscore, RMSD, ...). Each method also writes its output in a different format, with additional content such as alignments or rotation matrices. Some produce a single output file, others a set of linked HTML files.

== Input ==

Optional tags: '''exclude''' (measure, result), '''log''' (no log is generated if not specified) [[BR]]
Optional attributes: '''description'''

{{{
<job id="ID" description="TEXT">
  <log filename="FILENAME" />

  <input type="structure|tree|contact map|similarity matrix">
    <item id="ID" label="TEXT" filename="FILENAME" />
    :::
    <item id="ID" label="TEXT" filename="FILENAME" />
  </input>

  <method id="ID" name="TEXT">
    <param name="TEXT">VALUE</param>
    :::
    <param name="TEXT">VALUE</param>

    <exclude>
      <measure>NAME</measure>
      :::
      <measure>NAME</measure>

      <result>NAME</result>
      :::
      <result>NAME</result>
    </exclude>
  </method>
  :::
  <method ...>
    ...
  </method>
</job>
}}}

The input data can be protein structures, similarity trees, contact maps or similarity matrices. Every specified method must be able to operate on the given data files. This dependency could be verified automatically using XML Schema.

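As a rough illustration of the input layout above, the following sketch reads a job description with Python's standard `xml.etree.ElementTree` and checks whether each method can operate on the given input type. It is not part of ProCKSI: the sample ids, labels, file names and parameter names are invented, and the `ACCEPTED_INPUT` table is a hypothetical stand-in for the dependency that an XML Schema could enforce.

```python
import xml.etree.ElementTree as ET

# Hypothetical job description following the input template above.
# All ids, labels, file names and parameter names are made up.
JOB_XML = """
<job id="job1" description="example job">
  <input type="structure">
    <item id="p1" label="protein 1" filename="p1.pdb" />
    <item id="p2" label="protein 2" filename="p2.pdb" />
  </input>
  <method id="m1" name="TMalign">
    <param name="cutoff">5.0</param>
  </method>
  <method id="m2" name="USM">
    <exclude>
      <measure>USM2</measure>
    </exclude>
  </method>
</job>
"""

# Assumed table of input types accepted by each method (illustration only).
ACCEPTED_INPUT = {
    "TMalign": {"structure"},
    "USM": {"contact map"},
}


def parse_job(xml_text):
    """Turn a job description into a plain dictionary."""
    root = ET.fromstring(xml_text)
    input_el = root.find("input")
    log_el = root.find("log")  # optional tag: no log if it is missing
    job = {
        "id": root.get("id"),
        "description": root.get("description"),  # optional attribute
        "log": log_el.get("filename") if log_el is not None else None,
        "input_type": input_el.get("type"),
        "items": [dict(item.attrib) for item in input_el.findall("item")],
        "methods": [],
    }
    for method in root.findall("method"):
        job["methods"].append({
            "id": method.get("id"),
            "name": method.get("name"),
            "params": {p.get("name"): p.text for p in method.findall("param")},
            "excluded_measures": [e.text for e in method.findall("exclude/measure")],
            "excluded_results": [e.text for e in method.findall("exclude/result")],
        })
    return job


def incompatible_methods(job, accepted):
    """Return names of methods that cannot operate on the job's input type."""
    return [
        m["name"]
        for m in job["methods"]
        if job["input_type"] not in accepted.get(m["name"], set())
    ]


job = parse_job(JOB_XML)
print(incompatible_methods(job, ACCEPTED_INPUT))
```

With the assumed table, the second method is reported because it is listed as accepting contact maps only, while the job supplies structures.
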
== Output ==

Optional tags: '''log''', '''message''', '''similarity''' (used only if output is a ''comparison'') [[BR]]
Optional attributes: '''description''', '''node''', '''start''', '''end''', '''ref_id''' (only if output type is ''composition''), '''ref_id2''' (only if output type is not ''comparison'')

{{{
<job id="ID" description="TEXT" node="TEXT" start="TIME" end="TIME">
  <log filename="FILENAME" />

  <message type="error|warning|info">TEXT</message>
  :::
  <message type="error|warning|info">TEXT</message>

  <input type="structure|tree|contact map|similarity matrix">
    <item id="ID" label="TEXT" filename="FILENAME" />
    :::
    <item id="ID" label="TEXT" filename="FILENAME" />
  </input>

  <parameters>
    <method id="ID" name="NAME">
      <parameter name="TEXT">VALUE</parameter>
      :::
      <parameter name="TEXT">VALUE</parameter>
    </method>
    :::
    <method ...>
      ...
    </method>
  </parameters>

  <output type="transformation|comparison|composition" ref_id="" ref_id2="">
    <method id="ID">
      <message type="error|warning|info">TEXT</message>
      :::
      <message type="error|warning|info">TEXT</message>

      <similarity measure="NAME">VALUE</similarity>
      :::
      <similarity measure="NAME">VALUE</similarity>

      <file type="TEXT" label="TEXT" name="FILENAME" />
      :::
      <file type="TEXT" label="TEXT" name="FILENAME" />
    </method>
  </output>
  :::
  <output ...>
    ...
  </output>
</job>
}}}

A message (an error, a warning or additional information) can be passed at the global level or at the method level. Input data and parameters defined in the input file can be repeated in the output if needed, which makes the output self-contained. The output can be a 1->1 transformation (e.g. structure -> contact map), a 2->1 comparison (e.g. 2*structure -> similarity measure) or an N->1 composition (e.g. N*tree -> total tree or N*similarity matrix -> consensus similarity matrix). Results other than similarity measures for a pair of proteins are stored in external files and are only referenced from the XML file.

The alignment data could be described in the XML file, as there is no single format used by all programs. This is yet to be decided.
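
The output layout described above could be consumed along the following lines. This is only a sketch under several assumptions: the sample document is invented, the timestamps use an ISO-like format that the definition does not prescribe, similarity values are taken to be numeric, and `ref_id` and `ref_id2` are read as the ids of the two compared items.

```python
import xml.etree.ElementTree as ET

# Hypothetical output document; every id, value and file name is made up.
OUTPUT_XML = """
<job id="job1" node="node01" start="2007-10-05T20:00:00" end="2007-10-05T20:05:00">
  <message type="info">job finished</message>
  <output type="comparison" ref_id="p1" ref_id2="p2">
    <method id="m1">
      <message type="warning">second chain ignored</message>
      <similarity measure="TMscore">0.62</similarity>
      <similarity measure="RMSD">2.4</similarity>
      <file type="alignment" label="alignment" name="p1_p2.aln" />
    </method>
  </output>
</job>
"""


def collect_messages(root):
    """Gather global and method-level messages as (scope, type, text)."""
    messages = [("job", m.get("type"), m.text) for m in root.findall("message")]
    for method in root.findall("output/method"):
        for m in method.findall("message"):
            messages.append((method.get("id"), m.get("type"), m.text))
    return messages


def collect_similarities(root):
    """Map (item pair, method id, measure name) to a numeric value."""
    table = {}
    for output in root.findall("output"):
        # Similarity tags are only used when the output is a comparison.
        if output.get("type") != "comparison":
            continue
        pair = (output.get("ref_id"), output.get("ref_id2"))
        for method in output.findall("method"):
            for sim in method.findall("similarity"):
                key = (pair, method.get("id"), sim.get("measure"))
                table[key] = float(sim.text)  # assumes VALUE is numeric
    return table


def referenced_files(root):
    """List external result files as (method id, type, file name)."""
    return [
        (method.get("id"), f.get("type"), f.get("name"))
        for method in root.findall("output/method")
        for f in method.findall("file")
    ]


root = ET.fromstring(OUTPUT_XML)
print(collect_messages(root))
print(collect_similarities(root))
print(referenced_files(root))
```

Keeping messages, similarity values and file references in separate collections mirrors the split in the definition: only pairwise similarity measures live in the XML itself, while other results stay in the referenced files.
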