
Gen3D Walkthrough

Lingfei Jackson Nowotny Sharif Ahmed
Attachments
22.png (76636 bytes)
22p_pdb2.PNG (9074 bytes)
22p_score.png (107146 bytes)


This is a guided walkthrough to help you get started with Gen3D. This walkthrough assumes you have downloaded and installed Gen3D along with the necessary sample data files.


Functions

Gen3D generates 3D structures of a chromosome and reports evaluation scores for those structures. The generated pdb files contain the 3D structure information of the genome. The generated score.log file contains the scores for the generated models.

Command Format:

Binary_program_file input_contact_data

input_contact_data: genome contact data containing only intra-chromosomal contact information.

output_pdb_file: the generated models in pdb format, together with the evaluation scores of all the generated structures.

Result: the generated pdb files and a score.log file containing the scores of those pdb files.

For example:

./gen3d.out 22p

This command generates pdb files containing the predicted 3D structures of chromosome 22 from the malignant B cells of an acute lymphoblastic leukemia patient, along with the scores of these structure files.

The picture above shows one of the generated pdb files of chromosome 22. This file contains the 3D coordinates of each unit in the structure and the contact information between each pair of contacting units. Each pdb file is named using the pattern filename_percentage of contact_percentage of non-contact_maxdd_percentage of satisfied interaction frequency.
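The naming pattern can be unpacked programmatically. The following is a minimal Python sketch, assuming the fields are separated by underscores (consistent with the awk -F_ command used later in this walkthrough) and that the file may carry a .pdb extension. The name parse_model_name and the example file name are hypothetical and are not part of Gen3D.

```python
from typing import NamedTuple


class ModelName(NamedTuple):
    """Fields encoded in a generated pdb file name (assumed layout)."""
    prefix: str
    contact_pct: float
    noncontact_pct: float
    maxdd: float
    if_satisfied_pct: float


def parse_model_name(filename: str) -> ModelName:
    """Split an underscore-separated model file name into its score fields.

    Assumption: the name looks like prefix_contact_noncontact_maxdd_if[.pdb].
    """
    # Drop a trailing .pdb extension so the last field parses as a number.
    stem = filename[:-4] if filename.lower().endswith(".pdb") else filename
    parts = stem.split("_")
    if len(parts) < 5:
        raise ValueError(f"expected at least 5 underscore-separated fields: {filename!r}")
    prefix, contact, noncontact, maxdd, if_satisfied = parts[:5]
    return ModelName(prefix, float(contact), float(noncontact),
                     float(maxdd), float(if_satisfied))


# Hypothetical file name, used only to illustrate the layout.
example = parse_model_name("22pF_90.5_88.1_75.0_82.3.pdb")
print(example.contact_pct, example.if_satisfied_pct)
```

If the actual file names use a different separator or field order, adjust the split accordingly.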

Shown above is the score.log file containing the scores of the generated models of chromosome 22, with one set of scores for each generated genome structure. It includes the following evaluation scores for chromosome models:
Maxd: maximum distance between any two regions
Maxdd: percentage of the maximum distance between any two regions within the default threshold
Maxc: maximum distance between any two consecutive regions
Avg: average distance between regions
Actual: number of contact regions
Real: percentage of contact regions
Noncon: number of non-contact regions
Noncount: percentage of non-contact regions
Un con avg: average distance of the unsatisfied contact regions
Un non-con avg: average distance of the unsatisfied non-contact regions
% of IF satisfied: percentage of satisfied interaction frequencies
Total IF: total number of interaction frequencies
Pif: number of satisfied interaction frequencies
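To make the distance-based scores concrete, here is a small Python sketch that computes Maxd, Maxc, and Avg from a list of 3D coordinates. It is an illustration under stated assumptions, not Gen3D's own implementation: distances are taken to be Euclidean, and Avg is assumed to be the mean over all pairs of regions. The function name distance_summary is hypothetical.

```python
import math
from itertools import combinations


def distance_summary(coords):
    """Compute Maxd, Maxc, and Avg for an ordered list of (x, y, z) points.

    Assumptions: Euclidean distance; Avg is averaged over all pairs.
    """
    if len(coords) < 2:
        raise ValueError("need at least two regions")
    # Distances between every pair of regions.
    pairwise = [math.dist(a, b) for a, b in combinations(coords, 2)]
    # Distances between neighbouring regions along the chromosome.
    consecutive = [math.dist(a, b) for a, b in zip(coords, coords[1:])]
    return {
        "maxd": max(pairwise),
        "maxc": max(consecutive),
        "avg": sum(pairwise) / len(pairwise),
    }


# Three made-up points forming a 3-4-5 right triangle.
print(distance_summary([(0, 0, 0), (3, 0, 0), (3, 4, 0)]))
```

For the three example points, the pairwise distances are 3, 4, and 5, so Maxd is 5, Maxc is 4, and Avg is 4.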

To pick the best model, run the following command, which prints the highest-scoring model from the ensemble of models.
ls -art 21pF* | awk -F_ 'BEGIN{my2=0;my3=0;my4=0;my5=0} {if(2*$2 + 2*$3 + 3*$5 + $4 > 2*my2 + 2*my3 + 3*my5 + my4){my2=$2;my3=$3;my4=$4;my5=$5;my=$0} } END{print my}'
Replace the chromosome prefix in the file pattern (21p in the command above) with the one for your data: use the chromosome number for normal chromosomes, and the chromosome number followed by "p" for cancerous ones, like "19p".
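The same selection can be sketched in Python, which may be easier to adapt than the awk one-liner above. This sketch assumes the underscore-separated file name layout described earlier and mirrors the awk weighting: twice the second field, twice the third field, three times the fifth field, plus the fourth field. The function names and the example file names are hypothetical.

```python
def model_score(filename: str) -> float:
    """Weighted score read from a model file name (assumed layout)."""
    # Drop a trailing .pdb extension so the last field parses as a number.
    stem = filename[:-4] if filename.lower().endswith(".pdb") else filename
    fields = stem.split("_")
    # fields[1..4] correspond to awk's $2, $3, $4, and $5.
    f2, f3, f4, f5 = (float(value) for value in fields[1:5])
    return 2 * f2 + 2 * f3 + 3 * f5 + f4


def best_model(filenames):
    """Return the file name with the highest score, or None if none beats 0."""
    best_name, best_score = None, 0.0
    for name in filenames:
        score = model_score(name)
        # Strict comparison, like the awk command: the first maximum wins.
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Made-up file names for illustration only.
candidates = [
    "21pF_80_70_60_50.pdb",
    "21pF_90_85_70_88.pdb",
    "21pF_95_60_80_70.pdb",
]
print(best_model(candidates))
```

To run it on real output, pass a list of file names, for example the result of glob.glob("21pF*") from the Python standard library.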

NOTE: In the source code, two parameters, CON and NON (lines 26 and 27), control the minimum contact and non-contact scores of the generated models. With larger values, the program generates fewer models of higher quality. With smaller values, it generates more models of lower quality.

Viewing Genome Models

To visualize a generated genome structure, you can use any genome visualization tool that can read pdb files. For example, here we use the UCSF Chimera tool for visualization. Shown below is one of our generated structures.
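If you would rather inspect the coordinates in a script than in a viewer, the following Python sketch reads them from a pdb file. It assumes the generated files follow the standard fixed-column PDB layout, where x, y, and z occupy columns 31-38, 39-46, and 47-54 of ATOM and HETATM records. Whether Gen3D's output matches this layout exactly is an assumption, and read_pdb_coordinates is a hypothetical helper name.

```python
def read_pdb_coordinates(lines):
    """Return a list of (x, y, z) tuples from ATOM/HETATM records.

    Assumption: standard fixed-column PDB format for the coordinate fields.
    """
    coords = []
    for line in lines:
        # Other record types (for example CONECT) carry no coordinates.
        if line.startswith(("ATOM", "HETATM")):
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            coords.append((x, y, z))
    return coords
```

The function accepts any iterable of lines, so an open file handle can be passed directly, for example the handle returned by open() on one of the generated pdb files.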




Related

Wiki: Home
Wiki: Installing Gen3D
