3. SVM_predict.py
3.1. Description
Build an SVM model from “train_file”, then predict the labels of the cases in “data_file”.
3.2. Options
--version
    show program’s version number and exit
-h, --help
    show this help message and exit
-t TRAIN_FILE, --train_file=TRAIN_FILE
    Tab- or space-separated file used for training (to build the SVM model). The first column contains sample IDs; the second column contains integer sample labels (must be 0 or 1); the third column contains sample label names (strings, must be consistent with column 2). The remaining columns contain the features used to build the SVM model.
-d DATA_FILE, --data_file=DATA_FILE
    Tab- or space-separated file of new data whose labels are to be predicted. The first column contains sample IDs; the second column contains integer sample labels (must be 0 or 1); the third column contains sample label names (strings, must be consistent with column 2). The remaining columns contain the same features that were used to build the SVM model.
-C C_VALUE, --cvalue=C_VALUE
    C value (the SVM regularization parameter). default=1.0
-k S_KERNEL, --kernel=S_KERNEL
    Specifies the kernel type to be used in the algorithm. It must be one of ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable. If none is given, ‘linear’ will be used. default=linear
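The option list above has the shape of help text generated by Python's standard optparse module. The following is only a sketch of a parser that would accept the same options; the function name `build_parser`, the `dest` names, and the version string are assumptions for illustration, not taken from the script itself.

```python
from optparse import OptionParser


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = OptionParser(
        usage="%prog [options]",
        version="%prog (version placeholder)",  # real version string is not documented
    )
    # optparse derives the metavar (e.g. TRAIN_FILE) from the upper-cased dest name.
    parser.add_option("-t", "--train_file", dest="train_file", type="string",
                      help="Tab- or space-separated training file.")
    parser.add_option("-d", "--data_file", dest="data_file", type="string",
                      help="Tab- or space-separated file with data to predict.")
    parser.add_option("-C", "--cvalue", dest="c_value", type="float", default=1.0,
                      help="C value. default=%default")
    parser.add_option("-k", "--kernel", dest="s_kernel", type="string",
                      default="linear",
                      help="Kernel type. default=%default")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is reproducible.
options, leftover = build_parser().parse_args(
    ["-t", "lung_CES_5features.tsv", "-d", "lung_CES_data_to_predict.tsv", "-C", "10"]
)
print(options.train_file, options.data_file, options.c_value, options.s_kernel)
```

Because -k is not passed here, the kernel falls back to the documented default, while -C is converted to a float.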
3.3. Input files format
TRAIN_FILE and DATA_FILE use the same format, shown below. The 2nd and 3rd columns in DATA_FILE can be considered the Original Label and Original Name.
ID | Label | Label_name | feature_1 | feature_2 | feature_3 | … | feature_n |
sample_1 | 1 | WT | 1560 | 795 | 0.9716 | … | feature_n |
sample_2 | 1 | WT | 784 | 219 | 0.4087 | … | feature_n |
sample_3 | 1 | WT | 2661 | 2268 | 1.1691 | … | feature_n |
sample_4 | 0 | Mut | 643 | 198 | 0.5458 | … | feature_n |
sample_5 | 0 | Mut | 534 | 87 | 1.0545 | … | feature_n |
sample_6 | 0 | Mut | 332 | 75 | 0.5115 | … | feature_n |
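As an illustration of the layout above, here is a minimal sketch of a reader for this format using only the standard library. It assumes the file starts with a header row (as in the table) and that fields are separated by tabs or spaces; the function name `read_samples` and the `require_binary_label` switch are hypothetical and not part of SVM_predict.py.

```python
import io


def read_samples(handle, require_binary_label=True):
    """Return (ids, labels, names, features) parsed from a whitespace-separated table."""
    ids, labels, names, features = [], [], [], []
    seen_header = False
    for line in handle:
        fields = line.split()  # splits on tabs and on runs of spaces
        if not fields:
            continue  # skip blank lines
        if not seen_header:
            seen_header = True  # assumption: the first non-blank line is a header
            continue
        label = fields[1]
        if require_binary_label:
            label = int(label)
            if label not in (0, 1):
                raise ValueError("label must be 0 or 1: %r" % fields[1])
        ids.append(fields[0])
        labels.append(label)
        names.append(fields[2])
        features.append([float(value) for value in fields[3:]])
    # Each integer label should map to exactly one label name (column 3 vs column 2).
    if len(set(zip(labels, names))) != len(set(labels)):
        raise ValueError("label names are not consistent with labels")
    return ids, labels, names, features


# Example rows copied from the table above, limited to the three features shown.
example = io.StringIO(
    "ID\tLabel\tLabel_name\tfeature_1\tfeature_2\tfeature_3\n"
    "sample_1\t1\tWT\t1560\t795\t0.9716\n"
    "sample_2\t1\tWT\t784\t219\t0.4087\n"
    "sample_3\t1\tWT\t2661\t2268\t1.1691\n"
    "sample_4\t0\tMut\t643\t198\t0.5458\n"
    "sample_5\t0\tMut\t534\t87\t1.0545\n"
    "sample_6\t0\tMut\t332\t75\t0.5115\n"
)
ids, labels, names, features = read_samples(example)
print(ids[0], labels[0], names[0], features[0])
```

Passing `require_binary_label=False` keeps column 2 as a string, which may be convenient for a DATA_FILE whose original labels are not known.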
3.4. Command
$ python3 SVM_predict.py -t lung_CES_5features.tsv -d lung_CES_data_to_predict.tsv -C 10
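Conceptually, a command like this trains a classifier on the training table and applies it to the new samples. The sketch below shows that train-then-predict flow under the assumption that scikit-learn's `SVC` is the underlying model (the kernel option's wording resembles its documentation, but the script's internals are not shown here). The training rows are the three features from the example table in section 3.3; the two new samples and their IDs are made up for illustration.

```python
from sklearn.svm import SVC

# Training data: the three visible features of the example table (1 = WT, 0 = Mut).
train_features = [
    [1560, 795, 0.9716],
    [784, 219, 0.4087],
    [2661, 2268, 1.1691],
    [643, 198, 0.5458],
    [534, 87, 1.0545],
    [332, 75, 0.5115],
]
train_labels = [1, 1, 1, 0, 0, 0]
label_names = {1: "WT", 0: "Mut"}

# C and kernel correspond to the -C and -k options; -C 10 with the default linear kernel.
model = SVC(C=10.0, kernel="linear")
model.fit(train_features, train_labels)

# Hypothetical new samples, deliberately placed far from the class boundary.
new_ids = ["new_1", "new_2"]
new_features = [[2000, 1000, 1.0], [300, 50, 0.5]]
predicted = [int(label) for label in model.predict(new_features)]

for sample_id, label in zip(new_ids, predicted):
    print(sample_id, label, label_names[label], sep="\t")
```

Features on very different scales, as here, can dominate a kernel's distance computation, so standardizing them before fitting is a common precaution; whether SVM_predict.py does so is not stated.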
3.5. Output to screen
TCGA_ID Ori_Label Ori_name Predict_Label Predict_Name
TCGA-05-4244 unknown TP53_WT 1 Truncating
TCGA-05-4249 unknown TP53_WT 1 Truncating
TCGA-05-4250 unknown TP53_WT 1 Truncating
TCGA-05-4389 unknown TP53_WT 1 Truncating
TCGA-05-4390 unknown TP53_WT 1 Truncating
TCGA-05-4403 unknown TP53_WT 1 Truncating
TCGA-38-7271 unknown TP53_WT 1 Truncating
TCGA-38-A44F unknown TP53_WT 0 Normal
TCGA-39-5030 unknown TP53_WT 1 Truncating
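Since the result is printed to the screen as whitespace-separated columns, it can be redirected to a file or piped into a small post-processing step. The following sketch tallies the predicted classes from the rows shown above; the helper name `tally_predictions` is hypothetical, and the column is located through the `Predict_Name` header.

```python
from collections import Counter

# The screen output shown above, copied verbatim.
output_text = """\
TCGA_ID Ori_Label Ori_name Predict_Label Predict_Name
TCGA-05-4244 unknown TP53_WT 1 Truncating
TCGA-05-4249 unknown TP53_WT 1 Truncating
TCGA-05-4250 unknown TP53_WT 1 Truncating
TCGA-05-4389 unknown TP53_WT 1 Truncating
TCGA-05-4390 unknown TP53_WT 1 Truncating
TCGA-05-4403 unknown TP53_WT 1 Truncating
TCGA-38-7271 unknown TP53_WT 1 Truncating
TCGA-38-A44F unknown TP53_WT 0 Normal
TCGA-39-5030 unknown TP53_WT 1 Truncating
"""


def tally_predictions(text):
    """Count how many samples were assigned to each predicted class name."""
    rows = [line.split() for line in text.strip().splitlines()]
    header, records = rows[0], rows[1:]
    name_column = header.index("Predict_Name")
    return Counter(record[name_column] for record in records)


counts = tally_predictions(output_text)
for name, count in counts.most_common():
    print(name, count)
```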