Datasets » History » Version 13
Anonymous, 04/11/2008 05:32 PM
| 1 | 1 | Anonymous | = Datasets = |
|---|---|---|---|
| 2 | 1 | Anonymous | |
| 3 | 1 | Anonymous | == How to label datasets == |
| 4 | 1 | Anonymous | Datasets are labeled in the following way: |
| 5 | 1 | Anonymous | {{{ |
| 6 | 1 | Anonymous | DatasetName.tgz |
| 7 | 1 | Anonymous | DatasetName_Literature.tgz |
| 8 | 1 | Anonymous | DatasetName_NumberOfChains_Extraction.tgz |
| 9 | 1 | Anonymous | }}} |
| 10 | 1 | Anonymous | |
| 11 | 1 | Anonymous | where: |
| 12 | 1 | Anonymous | || '''!DatasetName''' || Authors or web site that proposed the dataset |
| 13 | 1 | Anonymous | || '''Literature''' || Special name given in the literature |
| 14 | 1 | Anonymous | || '''!NumberOfChains''' || Number of (extracted) chains contained in the dataset |
| 15 | 1 | Anonymous | || '''Extraction''' || Extraction of the models/chains: ''none'', ''first-first'', ''first-all'', ''all-all'' |
| 16 | 1 | Anonymous | |
| 17 | 1 | Anonymous | == Available Datasets == |
| 18 | 1 | Anonymous | The following datasets are available from the repository (special privileges required!): |
| 19 | 1 | Anonymous | || '''!DatasetName''' || '''Extraction''' || '''!NumberOfChains''' || '''Size in MB''' || '''Link''' |
| 20 | 1 | Anonymous | || !LelukKoniecznyRoterman || - || - || 3.5 || [source:Datasets/LelukKoniecznyRoterman.tgz Download] |
| 21 | 1 | Anonymous | || || first-first || 6 || 0.4 || [source:Datasets/LelukKoniecznyRoterman_6_first-first.tgz Download] |
| 22 | 1 | Anonymous | || || first-all || 15 || 0.9 || [source:Datasets/LelukKoniecznyRoterman_15_first-all.tgz Download] |
| 23 | 1 | Anonymous | |
| 24 | 1 | Anonymous | || '''!DatasetName''' || '''Extraction''' || '''!NumberOfChains''' || '''Size in MB''' || '''Link''' |
| 25 | 1 | Anonymous | || !ChewKedem || - || - || 3.8 || [source:Datasets/ChewKedem.tgz Download] |
| 26 | 1 | Anonymous | || || first-first || 34 || 1.3 || [source:Datasets/ChewKedem_34_first-first.tgz Download] |
| 27 | 1 | Anonymous | || || first-all || 54 || 2.0 || [source:Datasets/ChewKedem_54_first-all.tgz Download] |
| 28 | 1 | Anonymous | || || all-all || 132 || 4.1 || [source:Datasets/ChewKedem_132_all-all.tgz Download] |
| 29 | 1 | Anonymous | |
| 30 | 1 | Anonymous | || '''!DatasetName''' || '''Extraction''' || '''!NumberOfChains''' || '''Size in MB''' || '''Link''' |
| 31 | 1 | Anonymous | || !ProteinKinaseResource || - || - || 3.6 || [source:Datasets/ProteinKinaseResource.tgz Download] |
| 32 | 1 | Anonymous | || || first-first || 45 || 2.4 || [source:Datasets/ProteinKinaseResource_45_first-first.tgz Download] |
| 33 | 1 | Anonymous | || || first-all || 49 || 2.5 || [source:Datasets/ProteinKinaseResource_49_first-all.tgz Download] |
| 34 | 1 | Anonymous | || || all-all || 106 || 4.0 || [source:Datasets/ProteinKinaseResource_106_all-all.tgz Download] |
| 35 | 1 | Anonymous | |
| 36 | 1 | Anonymous | || '''!DatasetName''' || '''Extraction''' || '''!NumberOfChains''' || '''Size in MB''' || '''Link''' |
| 37 | 1 | Anonymous | || Skolnick || - || - || 5.1 || [source:Datasets/Skolnick.tgz Download] |
| 38 | 1 | Anonymous | || || first-first || 33 || 1.1 || [source:Datasets/Skolnick_33_first-first.tgz Download] |
| 39 | 1 | Anonymous | || || first-all || 65 || 2.1 || [source:Datasets/Skolnick_65_first-all.tgz Download] |
| 40 | 1 | Anonymous | || || all-all || 179 || 5.9 || [source:Datasets/Skolnick_179_all-all.tgz Download] |
| 41 | 1 | Anonymous | |
| 42 | 1 | Anonymous | || '''!DatasetName''' || '''Extraction''' || '''!NumberOfChains''' || '''Size in MB''' || '''Link''' |
| 43 | 1 | Anonymous | || !RostSander || - || - || 7.4 || [source:Datasets/RostSander.tgz Download] |
| 44 | 1 | Anonymous | || || RS126 || 126 || 4.3 || [source:Datasets/RostSander_RS126.tgz Download] |
| 45 | 1 | Anonymous | || || first-first || 119 || 4.4 || [source:Datasets/RostSander_119_first-first.tgz Download] |
| 46 | 1 | Anonymous | || || first-all || 212 || 7.6 || [source:Datasets/RostSander_212_first-all.tgz Download] |
| 47 | 1 | Anonymous | |
| 48 | 1 | Anonymous | || '''!DatasetName''' || '''Extraction''' || '''!NumberOfChains''' || '''Size in MB''' || '''Link''' |
| 49 | 1 | Anonymous | || !KinjoHorimotoNishikawa || - || - || 98 || [source:Datasets/KinjoHorimotoNishikawa.tgz Download] |
| 50 | 1 | Anonymous | || || first-first || 1012 || 46 || [source:Datasets/KinjoHorimotoNishikawa_1012_first-first.tgz Download] |
| 51 | 1 | Anonymous | || || first-all || 2013 || 88 || [source:Datasets/KinjoHorimotoNishikawa_2013_first-all.tgz Download] |
| 52 | 2 | Anonymous | |
| 53 | 12 | Anonymous | || '''!DatasetName''' || '''Description''' || '''Extraction''' || '''!NumberOfChains''' || '''Size in MB''' || '''Link''' |
| 54 | 5 | Anonymous | || Shah1 || 1000 proteins randomly selected from the PDB || - || - || 114 || [source:Datasets/Shah.tgz Download] |
| 55 | 4 | Anonymous | || || || first-first || 1000 || 41 || [source:Datasets/Shah_1000_first-first.tgz Download] |
| 56 | 4 | Anonymous | || || || first-all || 1943 || 80 || [source:Datasets/Shah_1943_first-all.tgz Download] |
| 57 | 1 | Anonymous | || || || all-all || 4007 || 124 || [source:Datasets/Shah_4007_all-all.tgz Download] |
| 58 | 5 | Anonymous | |
| 59 | 5 | Anonymous | |
| 60 | 12 | Anonymous | || '''!DatasetName''' || '''Description''' || '''Extraction''' || '''!NumberOfChains''' || '''Size in GB''' || '''Link''' |
| 61 | 13 | Anonymous | || Shah2 || Downloaded from PDB web site on 10/04/2007 || - || - || '''1.1''', ucmp*: '''4.8''' || |
| 62 | 13 | Anonymous | || || with the criterion "Remove similar sequences at 30% identity" || first-first || 7183 || '''0.285''', ucmp*: '''1.2''' || |
| 63 | 13 | Anonymous | || || || first-all || 14651 || '''0.60''', ucmp*: '''2.7''' || |
| 64 | 7 | Anonymous | || || || all-all || || || |
| 65 | 6 | Anonymous | |
| 66 | 12 | Anonymous | || '''!DatasetName''' || '''Description''' || '''Extraction''' || '''!NumberOfChains''' || '''Size in GB''' || '''Link''' |
| 67 | 13 | Anonymous | || Shah3 || PDB_SELECT25 as of October 2007 || - || - || '''0.746''', ucmp*: '''3.4''' || |
| 68 | 13 | Anonymous | || || a list, updated every six months, of || first-first || 3464 || '''0.12''', ucmp*: '''0.54''' || |
| 69 | 13 | Anonymous | || || non-redundant protein structures; || first-all || 8581 || '''0.30''', ucmp*: '''1.4''' || |
| 70 | 13 | Anonymous | || || mostly used in protein structure prediction || all-all || 31288 || '''0.854''', ucmp*: '''4.1''' || |
| 71 | 9 | Anonymous | |
| 72 | 9 | Anonymous | *ucmp: uncompressed |
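
The labeling scheme described under "How to label datasets" can be checked mechanically. The following is a minimal Python sketch, not part of the repository: the names `DatasetFile`, `EXTRACTIONS`, and `parse_dataset_filename` are hypothetical, and the sketch assumes that neither the dataset name nor the literature name contains an underscore.

```python
from typing import NamedTuple, Optional

# Extraction modes listed in the labeling table above.
EXTRACTIONS = {"none", "first-first", "first-all", "all-all"}


class DatasetFile(NamedTuple):
    """Components of a dataset archive name (hypothetical helper type)."""
    dataset_name: str
    literature: Optional[str] = None
    number_of_chains: Optional[int] = None
    extraction: Optional[str] = None


def parse_dataset_filename(filename: str) -> DatasetFile:
    """Split an archive name into the parts defined by the labeling scheme.

    Accepted forms:
        DatasetName.tgz
        DatasetName_Literature.tgz
        DatasetName_NumberOfChains_Extraction.tgz
    """
    suffix = ".tgz"
    if not filename.endswith(suffix):
        raise ValueError(f"not a .tgz archive: {filename!r}")

    parts = filename[: -len(suffix)].split("_")
    if any(part == "" for part in parts):
        raise ValueError(f"empty name component in {filename!r}")

    if len(parts) == 1:
        # Plain dataset, no extraction applied.
        return DatasetFile(parts[0])
    if len(parts) == 2:
        # Dataset known under a special name in the literature.
        return DatasetFile(parts[0], literature=parts[1])
    if len(parts) == 3 and parts[1].isdecimal() and parts[2] in EXTRACTIONS:
        # Extracted chains: chain count followed by the extraction mode.
        return DatasetFile(
            parts[0], number_of_chains=int(parts[1]), extraction=parts[2]
        )
    raise ValueError(f"name does not follow the labeling scheme: {filename!r}")
```

For example, `parse_dataset_filename("Skolnick_179_all-all.tgz")` yields the dataset name `Skolnick`, 179 chains, and the extraction mode `all-all`, while `RostSander_RS126.tgz` is read as the `RostSander` dataset under its literature name `RS126`.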