Version 19 (Anonymous, 04/23/2008 09:39 AM) → Version 20/21 (Paweł Widera, 12/09/2013 02:45 PM)



h1. Datasets



h2. How to label datasets



Datasets are labeled in the following way:
<pre>

[[DatasetName]].tgz
[[DatasetName]]_Literature.tgz
[[DatasetName]]_NumberOfChains_Extraction.tgz
</pre>



where:

|| *DatasetName* || Authors or web site that proposed the dataset ||
|| *Literature* || Special name given in the literature ||
|| *NumberOfChains* || Number of (extracted) chains contained in the dataset ||
|| *Extraction* || Extraction of the models/chains: _none_, _first-first_, _first-all_, _all-all_ ||
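The labeling scheme can be sketched in a few lines of Python. This is only an illustration and not part of the repository tooling: the names @DatasetFile@, @build_filename@ and @parse_filename@ are hypothetical, and the @.tar.gz@ suffix is accepted because some archives listed on this page use it instead of @.tgz@.

```python
import re
from typing import NamedTuple, Optional

# Extraction modes named in the labeling scheme.
EXTRACTIONS = ("none", "first-first", "first-all", "all-all")


class DatasetFile(NamedTuple):
    """Hypothetical container for the parts of a dataset file name."""
    dataset_name: str
    literature: Optional[str] = None
    number_of_chains: Optional[int] = None
    extraction: Optional[str] = None


def build_filename(f: DatasetFile, ext: str = ".tgz") -> str:
    """Assemble a file name following the three documented patterns."""
    if f.extraction is not None:
        if f.extraction not in EXTRACTIONS:
            raise ValueError(f"unknown extraction: {f.extraction!r}")
        if f.number_of_chains is None:
            raise ValueError("an extraction requires a number of chains")
        return f"{f.dataset_name}_{f.number_of_chains}_{f.extraction}{ext}"
    if f.literature is not None:
        return f"{f.dataset_name}_{f.literature}{ext}"
    return f"{f.dataset_name}{ext}"


# DatasetName_NumberOfChains_Extraction, matched from the right-hand side
# so that underscores inside the dataset name are tolerated.
_EXTRACTED = re.compile(
    r"^(?P<name>.+)_(?P<chains>\d+)_(?P<extraction>none|first-first|first-all|all-all)$"
)


def parse_filename(filename: str) -> DatasetFile:
    """Split a file name into its parts where this is unambiguous."""
    for ext in (".tar.gz", ".tgz"):
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            break
    else:
        raise ValueError(f"unexpected archive suffix: {filename!r}")
    m = _EXTRACTED.match(stem)
    if m:
        return DatasetFile(m["name"], None, int(m["chains"]), m["extraction"])
    # "Name" and "Name_Literature" cannot be told apart reliably because a
    # dataset name may itself contain underscores, so the stem is kept whole.
    return DatasetFile(stem)
```

For example, @parse_filename("ChewKedem_34_first-first.tgz")@ yields the dataset name @ChewKedem@ with 34 chains and the @first-first@ extraction. The parser deliberately leaves names without an extraction suffix unsplit rather than guessing where a literature name begins.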



h2. Available Datasets



The following datasets are available from the repository (special privileges required!):

|| *DatasetName* || *Extraction* || *NumberOfChains* || *Size in MB* || *Link* ||
|| LelukKoniecznyRoterman || - || - || 3.5 || "Download":http://www.ico2s.org/data/instances/procksi/LelukKoniecznyRoterman.tgz ||
|| || first-first || 6 || 0.4 || "Download":http://www.ico2s.org/data/instances/procksi/LelukKoniecznyRoterman_6_first-first.tgz ||
|| || first-all || 15 || 0.9 || "Download":http://www.ico2s.org/data/instances/procksi/LelukKoniecznyRoterman_15_first-all.tgz ||

|| *DatasetName* || *Extraction* || *NumberOfChains* || *Size in MB* || *Link* ||
|| ChewKedem || - || - || 3.8 || "Download":http://www.ico2s.org/data/instances/procksi/ChewKedem.tgz ||
|| || first-first || 34 || 1.3 || "Download":http://www.ico2s.org/data/instances/procksi/ChewKedem_34_first-first.tgz ||
|| || first-all || 54 || 2.0 || "Download":http://www.ico2s.org/data/instances/procksi/ChewKedem_54_first-all.tgz ||
|| || all-all || 132 || 4.1 || "Download":http://www.ico2s.org/data/instances/procksi/ChewKedem_132_all-all.tgz ||

|| *DatasetName* || *Extraction* || *NumberOfChains* || *Size in MB* || *Link* ||
|| ProteinKinaseResource || - || - || 3.6 || "Download":http://www.ico2s.org/data/instances/procksi/ProteinKinaseResource.tgz ||
|| || first-first || 45 || 2.4 || "Download":http://www.ico2s.org/data/instances/procksi/ProteinKinaseResource_45_first-first.tgz ||
|| || first-all || 49 || 2.5 || "Download":http://www.ico2s.org/data/instances/procksi/ProteinKinaseResource_49_first-all.tgz ||
|| || all-all || 106 || 4.0 || "Download":http://www.ico2s.org/data/instances/procksi/ProteinKinaseResource_106_all-all.tgz ||

|| *DatasetName* || *Extraction* || *NumberOfChains* || *Size in MB* || *Link* ||
|| Skolnick || - || - || 5.1 || "Download":http://www.ico2s.org/data/instances/procksi/Skolnick.tgz ||
|| || first-first || 33 || 1.1 || "Download":http://www.ico2s.org/data/instances/procksi/Skolnick_33_first-first.tgz ||
|| || first-all || 65 || 2.1 || "Download":http://www.ico2s.org/data/instances/procksi/Skolnick_65_first-all.tgz ||
|| || all-all || 179 || 5.9 || "Download":http://www.ico2s.org/data/instances/procksi/Skolnick_179_all-all.tgz ||

|| *DatasetName* || *Extraction* || *NumberOfChains* || *Size in MB* || *Link* ||
|| RostSander || - || - || 7.4 || "Download":http://www.ico2s.org/data/instances/procksi/RostSander.tgz ||
|| || RS126 || 126 || 4.3 || "Download":http://www.ico2s.org/data/instances/procksi/RostSander_RS126.tgz ||
|| || first-first || 119 || 4.4 || "Download":http://www.ico2s.org/data/instances/procksi/RostSander_119_first-first.tgz ||
|| || first-all || 212 || 7.6 || "Download":http://www.ico2s.org/data/instances/procksi/RostSander_212_first-all.tgz ||

|| *DatasetName* || *Extraction* || *NumberOfChains* || *Size in MB* || *Link* ||
|| KinjoHorimotoNishikawa || - || - || 98 || "Download":http://www.ico2s.org/data/instances/procksi/KinjoHorimotoNishikawa.tgz ||
|| || first-first || 1012 || 46 || "Download":http://www.ico2s.org/data/instances/procksi/KinjoHorimotoNishikawa_1012_first-first.tgz ||
|| || first-all || 2013 || 88 || "Download":http://www.ico2s.org/data/instances/procksi/KinjoHorimotoNishikawa_2013_first-all.tgz ||

|| *DatasetName* || *Description* || *Extraction* || *NumberOfChains* || *Size in MB* || *Link* ||
|| Shah || Randomly selected 1000 proteins from PDB || - || - || 114 || "Download":http://www.ico2s.org/data/instances/procksi/Shah.tgz ||
|| || || first-first || 1000 || 41 || "Download":http://www.ico2s.org/data/instances/procksi/Shah_1000_first-first.tgz ||
|| || || first-all || 1943 || 80 || "Download":http://www.ico2s.org/data/instances/procksi/Shah_1943_first-all.tgz ||
|| || || all-all || 4007 || 124 || "Download":http://www.ico2s.org/data/instances/procksi/Shah_4007_all-all.tgz ||

|| *DatasetName* || *Description* || *Extraction* || *NumberOfChains* || *Size in GB* || *Link* ||
|| PDB_SELECT30_04-2008 || Downloaded from PDB web site on 10/04/2008 || - || - || *1.1*, ucmp: *4.8* || "Download":http://www.ico2s.org/data/instances/procksi/PDB_SELECT30_04-2008_7307.tar.gz ||
|| || with criteria "Remove similar sequences at 30% identity" || first-first || 7183 || *0.285*, ucmp: *1.2* || "Download":http://www.ico2s.org/data/instances/procksi/PDB_SELECT30_04-2008_7183_first-first.tar.gz ||
|| || || first-all || 14651 || *0.60*, ucmp: *2.7* || "Download":http://www.ico2s.org/data/instances/procksi/PDB_SELECT30_04-2008_14651_first-all.tar.gz ||
|| || || all-all || 43025 || *1.2*, ucmp: *5.5* || "Download":http://www.ico2s.org/data/instances/procksi/PDB_SELECT30_04-2008_43025_all-all.tar.gz ||

|| *DatasetName* || *Description* || *Extraction* || *NumberOfChains* || *Size in GB* || *Link* ||
|| PDB_SELECT25_10-2007 || PDB_SELECT25 as of October 2007, || - || - || *0.746*, ucmp: *3.4* || "Download":http://www.ico2s.org/data/instances/procksi/PDB_SELECT25_10-2007_3560.tar.gz ||
|| || a list, updated every six months, of || first-first || 3464 || *0.12*, ucmp: *0.54* || "Download":http://www.ico2s.org/data/instances/procksi/PDB_SELECT25_10-2007_3464_first-first.tar.gz ||
|| || non-redundant protein structures. || first-all || 8581 || *0.30*, ucmp: *1.4* || "Download":http://www.ico2s.org/data/instances/procksi/PDB_SELECT25_10-2007_8581_first-all.tar.gz ||
|| || Mostly used in protein structure prediction. || all-all || 31288 || *0.854*, ucmp: *4.1* || "Download":http://www.ico2s.org/data/instances/procksi/PDB_SELECT25_10-2007_31288_all-all.tar.gz ||

ucmp: uncompressed size