tutorial

Overview of CavityPlus

CavityPlus is a web server for precise and robust protein cavity detection and functional analysis. It takes protein 3D structural information as input.

Cavity is used to detect the potential binding sites on the surface of a given protein structure and rank them with ligandability and druggability scores.

Based on the detected protein cavity information, further functions of a protein cavity are then analyzed by using three submodules, namely CavPharmer, CorrSite and CovCys.

CavPharmer uses the receptor-based pharmacophore modeling program Pocket to automatically extract pharmacophore features within cavities.

CorrSite identifies potential allosteric ligand binding sites based on motion correlation analysis between allosteric and orthosteric cavities.

CovCys automatically detects druggable cysteine residues for covalent ligand design, which is especially useful for identifying novel binding sites for covalent allosteric ligand design.

The overall flowchart of how to use CavityPlus is as follows:

Cavity

Step 1

Load a protein of interest. Two options are provided:
1) Load from RCSB PDB using a valid PDB ID;
2) Upload your own protein structure file.

If the protein is loaded successfully, a pop-up window will appear and the structure will be shown in the JSmol window.

If you have a finished job and submit a new structure on the same webpage, the results of the previous job will be deleted. Make sure you have downloaded the results before submitting a new job.

Figure 1. Visualization of the Cavity operation interface. Red rectangles show the input protein and its chains. The "sample" link marked by the red ellipse can be clicked to download the sample.
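
The first loading option can also be reproduced outside the browser, for example to prepare a file for upload. The following is a minimal sketch, not the code CavityPlus itself runs. It assumes the public RCSB download URL pattern `https://files.rcsb.org/download/<ID>.pdb`; the function names `is_valid_pdb_id`, `rcsb_pdb_url` and `fetch_pdb` are hypothetical, and the ID check only tests the classic four-character format, not whether the entry exists.

```python
import re
import urllib.request

# Assumption: public RCSB download URL pattern (not taken from CavityPlus).
RCSB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def is_valid_pdb_id(pdb_id: str) -> bool:
    """Format check only: a digit followed by three alphanumeric characters."""
    return re.fullmatch(r"[0-9][A-Za-z0-9]{3}", pdb_id.strip()) is not None


def rcsb_pdb_url(pdb_id: str) -> str:
    """Build the download URL for a PDB ID, rejecting malformed IDs."""
    if not is_valid_pdb_id(pdb_id):
        raise ValueError(f"not a valid 4-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD_URL.format(pdb_id=pdb_id.strip().upper())


def fetch_pdb(pdb_id: str, timeout: float = 30.0) -> str:
    """Download the PDB file as text (requires network access)."""
    with urllib.request.urlopen(rcsb_pdb_url(pdb_id), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Validating the ID before any request is made gives an early, readable error for typos instead of an HTTP failure.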

Step 2

Select the chain(s) of interest from the whole protein; the selection will be shown in JSmol immediately.
Note: CAVITY will automatically remove water molecules, ions and ligand atoms from the structure.

Figure 2. Visualization of Cavity submission. Red rectangles show the currently selected chain. Submitting Cavity to detect pockets shows a roughly estimated computational time.
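
To illustrate what chain selection and the automatic clean-up amount to, the sketch below filters the lines of a PDB file. It is a simplification and not CAVITY's actual procedure: it assumes the standard fixed-column PDB layout (record name in columns 1-6, chain ID in column 22), keeps only `ATOM` records of the chosen chains, and drops every `HETATM` record, which also discards modified residues stored as `HETATM`. The function name `select_chains` is hypothetical.

```python
from typing import Iterable, List


def select_chains(pdb_lines: Iterable[str], chains: Iterable[str]) -> List[str]:
    """Keep ATOM records of the selected chains.

    HETATM records (water, ions, ligands) are dropped, as are all
    non-coordinate records such as HEADER or REMARK.
    """
    wanted = set(chains)
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()          # columns 1-6: record name
        if record != "ATOM":
            continue
        if len(line) > 21 and line[21] in wanted:  # column 22: chain ID
            kept.append(line)
    return kept
```

A typical call would be `select_chains(open("protein.pdb"), ["A"])`, where `protein.pdb` is a placeholder file name.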

Step 3

Select the mode. By default, CAVITY scans the whole protein to find potential binding sites. Alternatively, select "With Ligand" mode and upload a ligand file in Mol2 format; CAVITY will then detect cavities around the given ligand, which tells the program where the real binding site is located. In most cases, CAVITY can locate the binding site without ligand coordinates, so try "With Ligand" mode if you are dissatisfied with the results from whole-protein mode.

Note that the ligand given in "With Ligand" mode must be located in the binding site; otherwise, the CAVITY module will fail to detect pockets. An error message will pop up if the ligand is not correct. The web server can also automatically extract ligands from the protein structure (shown in Figure 3). Please do not upload your own ligand file when you have already selected an extracted ligand from the list, because the automatically extracted ligand has higher priority in the computation.

Figure 3. Visualization of "ligand mode" and "Advanced parameters" with red rectangles.

Step 4

Advanced parameters (shown in Figure 3). CAVITY runs with a set of default parameters, and you may adjust the parameters of interest. The parameters are explained below.

SEPARATE_MIN_DEPTH (SMD), MAX_ABSTRACT_LIMIT_V (MALV), SEPARATE_MAX_LIMIT_V (SMLV) and MIN_ABSTRACT_DEPTH (MAD) are used in the cavity detection process and determine the depth and volume of the detected cavities. Larger values correspond to larger cavities. The web server provides four parameter sets for different inputs.

The first parameter set (8/1500/6000/2 for SMD/MALV/SMLV/MAD) is used for common cavities and is also the default parameter set of CAVITY.

The second parameter set (4/1500/6000/4 for SMD/MALV/SMLV/MAD) is used for shallow cavities, such as peptide binding sites and protein-protein interfaces.

The third parameter set (8/2500/8000/2 for SMD/MALV/SMLV/MAD) is used for complex cavities, such as multi-function cavities, channels and nucleic acid binding sites.

The fourth parameter set (8/5000/16000/2 for SMD/MALV/SMLV/MAD) is used for super-sized cavities. Each parameter set is recommended for its corresponding type of input.
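
For scripting or record keeping, the four parameter sets can be written down as a small lookup table. This is only a sketch of the values listed above; the preset keys (`common`, `shallow`, `complex`, `super_sized`), the class `CavityParams` and the helper `as_named_values` are hypothetical names and do not correspond to any CAVITY input format.

```python
from typing import Dict, NamedTuple


class CavityParams(NamedTuple):
    separate_min_depth: int      # SMD
    max_abstract_limit_v: int    # MALV
    separate_max_limit_v: int    # SMLV
    min_abstract_depth: int      # MAD


# Values copied from the four parameter sets described in the text.
PRESETS: Dict[str, CavityParams] = {
    "common": CavityParams(8, 1500, 6000, 2),       # default
    "shallow": CavityParams(4, 1500, 6000, 4),      # peptide sites, interfaces
    "complex": CavityParams(8, 2500, 8000, 2),      # channels, nucleic acid sites
    "super_sized": CavityParams(8, 5000, 16000, 2),
}


def as_named_values(preset: str) -> Dict[str, int]:
    """Return one preset keyed by the upper-case parameter names."""
    params = PRESETS[preset]
    return {name.upper(): value for name, value in params._asdict().items()}
```

Calling `as_named_values("shallow")` returns a dictionary keyed by `SEPARATE_MIN_DEPTH`, `MAX_ABSTRACT_LIMIT_V`, `SEPARATE_MAX_LIMIT_V` and `MIN_ABSTRACT_DEPTH`.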

RULER_1 and OUTPUT_RANK are used as output filters. RULER_1 sets the minimum cavity volume and OUTPUT_RANK sets the minimum CavityScore: CAVITY only outputs detected binding sites whose CavityScore is greater than OUTPUT_RANK. Users may increase this value to prevent CAVITY from outputting useless results.
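
The effect of the two output filters can be sketched as a simple list filter. This illustrates the rule described above and is not CAVITY's implementation: the `Cavity` record and `filter_cavities` are hypothetical, the score comparison is strict because the text says "greater than", and treating RULER_1 as an inclusive lower bound on volume is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cavity:
    name: str
    volume: float        # same units as RULER_1
    cavity_score: float


def filter_cavities(cavities: List[Cavity], ruler_1: float,
                    output_rank: float) -> List[Cavity]:
    """Keep cavities that pass both output filters.

    Assumption: volume >= RULER_1 (inclusive); CavityScore must be
    strictly greater than OUTPUT_RANK, as stated in the text.
    """
    return [
        c for c in cavities
        if c.volume >= ruler_1 and c.cavity_score > output_rank
    ]
```

Raising `output_rank` shrinks the result list, which matches the advice to increase the value when too many uninformative cavities are reported.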

Step 5

Run CAVITY by clicking the "Submit" button.

Before running CAVITY, the web server checks the Chain checkboxes to ensure that the selected protein and the submitted protein are the same.

Detecting the potential cavities of a protein takes some time. The processing time of CAVITY increases significantly with the number of residues and the complexity of the protein. We tested CAVITY on a dataset of 100 proteins selected from the PDBBind database, with sequence lengths between 87 and 1760 residues. For most proteins with fewer than 400 residues, CAVITY finished within 3 minutes. As the number of residues increases, the processing time increases significantly: the largest protein in the test dataset (1760 residues) took 41 minutes. Based on this test, we provide a knowledge-based table (Table 1) and a fitted curve (Figure 4) to show the processing time.

Table 1. Statistical analysis of Cavity computing time.

Number of residues in protein	Predicted running time (minutes)
<400	Usually less than 3
400-500	2-5
500-600	4-7
600-800	6-9
800-1000	9-15
>1000	Usually longer than 15; the larger the protein, the longer the running time

Figure 4. Scatter plot and fitted curve of Cavity processing time against the number of residues in the protein.
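
Table 1 can be turned into a small lookup for planning batch submissions. The sketch below only encodes the table rows; it does not reproduce the fitted curve in Figure 4. The function name `estimated_minutes` is hypothetical, and because the table ranges share their end points, assigning a shared boundary such as 500 to the lower range is an assumption.

```python
def estimated_minutes(n_residues: int) -> str:
    """Return the Table 1 running-time estimate (in minutes) as text."""
    if n_residues < 0:
        raise ValueError("number of residues must be non-negative")
    if n_residues < 400:
        return "usually less than 3"
    # (inclusive upper bound, estimate); shared boundaries go to the lower range
    ranges = [(500, "2-5"), (600, "4-7"), (800, "6-9"), (1000, "9-15")]
    for upper, estimate in ranges:
        if n_residues <= upper:
            return estimate
    return "usually longer than 15"
```

For the largest protein in the test dataset (1760 residues), the lookup returns the open-ended last row, consistent with the reported 41 minutes.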

After the job finishes, the results are shown in the "Cavity Results" section below the JSmol window (shown in Figure 5). The result table lists the DrugScore and Druggability of each cavity. Users can click the checkbox in the Surface column to view each cavity in the JSmol window. The last column lists the residues that constitute the cavity; users can click "More" to view and copy the residue information directly, without downloading the results.

Figure 5. Visualization of Cavity Results. Once a checkbox in the red rectangle is selected, the JSmol window shows the corresponding cavity.

CAVITY outputs the following files for visualizing the detection results.

name_surface.pdb: The output file storing the surface shape of the binding site and the CavityScore. It is in PDB format, so users can view it in molecular modeling software to gain insight into the geometrical shape of the binding site. Users can also open this file in a plain text editor to check the predicted maximal pKd of the binding site. This value indicates the ligandability of the binding site. A value below 6.0 (Kd of 1 μM) suggests that the binding site may not be a suitable drug design target.
name_vacant.pdb: The output file storing the volume shape of the binding site. It is in PDB format, so users can view it in molecular modeling software to gain insight into the geometrical shape of the binding site.
name_cavity.pdb: The output file storing the atoms that form the binding site. It is in PDB format, so users can view it in molecular modeling software to gain insight into the residues of the binding site. It is the visual version of "name_pocket.txt".

Note: Some molecular modeling software may not display these files correctly. If you cannot view a result file, please try different software. (PyMOL is recommended for these output files.)
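
The pKd threshold mentioned for name_surface.pdb follows from the definition pKd = -log10(Kd), with Kd in mol/L, so a pKd of 6.0 corresponds to a Kd of 1 μM. A short sketch of this conversion is given below; the function names are hypothetical, and the pKd value is assumed to have been read from the file by hand, since the exact layout of that file is not described here.

```python
def kd_molar(pkd: float) -> float:
    """Convert pKd to Kd in mol/L using Kd = 10 ** (-pKd)."""
    return 10.0 ** (-pkd)


def kd_micromolar(pkd: float) -> float:
    """Convert pKd to Kd in micromolar."""
    return kd_molar(pkd) * 1e6


def may_be_suitable_target(pkd: float, threshold: float = 6.0) -> bool:
    """Apply the rule of thumb from the text: a predicted maximal pKd
    below 6.0 suggests the site may not be a suitable drug design target."""
    return pkd >= threshold
```

A higher predicted maximal pKd means a lower achievable Kd, that is, tighter possible binding.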
