Inter-chain beta-sheet contacts play a structural role in protein-protein interactions that are central to healthy biological function and to diseases ranging from AIDS and cancer to Alzheimer's and Huntington's diseases.
Beta-sheets that extend over more than one protein chain can be spotted through the analysis of the atom coordinates of known protein structures, or through the analysis of the corresponding secondary structure, as defined by programs such as DSSP.
The ICBS database is intended as a tool to:
Note that there might be occasional delays between the release of a new / modified PDB structure and the release of the corresponding new / modified PQS structures. Some PQS structures might thus appear or be updated with some delay in the ICBS database with respect to the release of the corresponding PDB entries.
So far, however, the likely quaternary structures corresponding to PDB structures, which are generated by the EBI from PDB structures, are only available as PDB files. Moreover, mmCIF files do not yet represent the official release of the PDB structures.
For these reasons, the current version of the ICBS database relies on PDB files. This prompted a number of methodological choices linked to limitations of the PDB file format or to variations or errors in the PDB files.
When mmCIF files are officially released, or when an API is provided to directly query the database for PDB and PQS entries, a new, improved version of the ICBS database will be generated.
For further details, please refer to:
Obtention of structure coordinates
PDB files are downloaded from the PDB ftp site.
PQS files are downloaded from the PQS web server.
We only consider the few PQS-entry types that are relevant to the ICBS database, namely:
See a scanned version of the DSSP paper (page 19 of 21) for a definition of secondary structure elements such as hydrogen bonds, bridges, ladders, sheets, etc.
Note that we take into account single beta-bridges, i.e., ladders of length 1. Single beta-bridges can strengthen the effect of regular ladders (of length > 1), and they are therefore taken into account for the computation of the ICBS index. The ICBS database thus contains some proteins for which the inter-chain interface is reduced to a single beta-bridge, and whose ICBS index is consequently very low. The query interface lets users exclude ICBS entries whose ICBS interface consists solely of single beta-bridges.
We first scan coordinate files and count all heavy atom contacts between chains that pair through inter-chain ß-sheets. Two heavy atoms belonging to different chains are here considered to be in contact when they are less than 4.4 Angstroms apart.
The ICBS index is then obtained for each pair of chains by forming the ratio between the number of hydrogen bonds in the ß-sheets between the two chains and the number of heavy atom contacts. For readability purposes, the result is multiplied by 1000.
The highest value over all pairs is retained for a global characterization of the ICBS interface at the level of the whole structure.
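The computation described above can be sketched as follows. This is a minimal illustration, not the code used to build the database: the function names, the input layout (a list of (x, y, z) coordinates per chain and a precomputed hydrogen-bond count per pair), and the choice to return zero when there are no contacts are all assumptions made for the example.

```python
import math
from itertools import product

CONTACT_CUTOFF = 4.4  # Angstroms, the threshold given in the text


def count_contacts(atoms_a, atoms_b, cutoff=CONTACT_CUTOFF):
    """Count heavy-atom pairs (one atom per chain) closer than `cutoff`."""
    return sum(1 for a, b in product(atoms_a, atoms_b) if math.dist(a, b) < cutoff)


def icbs_index(n_hbonds, n_contacts):
    """Hydrogen bonds divided by heavy-atom contacts, multiplied by 1000."""
    if n_contacts == 0:
        return 0.0  # assumption: an interface without contacts scores zero
    return 1000.0 * n_hbonds / n_contacts


def overall_index(index_by_pair):
    """Highest per-pair index, retained for the structure as a whole."""
    return max(index_by_pair.values())


# Toy data: two atoms per chain, coordinates in Angstroms.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 3.0), (0.0, 0.0, 10.0)]
contacts = count_contacts(chain_a, chain_b)
```

The all-against-all distance loop is quadratic in the number of atoms; it is used here only because it mirrors the definition directly.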
Homogeneity
Two chains are considered identical when they have the same number of residues.
When identical chains pair through inter-chain ß-sheets, their interface is considered homogeneous. It is considered heterogeneous when the 2 chains are different.
The overall homogeneity of the interface at the level of the whole structure is derived from the homogeneity of all pairs of chains. See the explanation of the Homogeneity column of the results table.
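The identity criterion is simple enough to express in a few lines. The sketch below is illustrative only: the label strings and the rule used to aggregate pairs into an overall value (in particular the "mixed" label) are assumptions, since the actual codes are defined in the explanation of the Homogeneity column.

```python
def pair_homogeneity(n_residues_a, n_residues_b):
    """Two chains count as identical when they have the same number of residues."""
    return "homogeneous" if n_residues_a == n_residues_b else "heterogeneous"


def overall_homogeneity(pair_labels):
    """Aggregate per-pair labels; 'mixed' is a hypothetical label for this sketch."""
    kinds = set(pair_labels)
    if not kinds:
        raise ValueError("no pairing chains")
    return kinds.pop() if len(kinds) == 1 else "mixed"
```

Note that comparing residue counts does not compare sequences: two different chains of equal length would be treated as identical under this criterion.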
Orientation
Depending on the parallel or anti-parallel nature of the single or multiple ladders found in the inter-chain ß-sheets, the interface is considered parallel, anti-parallel, or mixed.
See the explanation of the Orientation column of the results table.
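As an illustration, the classification can be sketched as below. The sketch assumes the DSSP output convention in which lowercase bridge labels mark parallel bridges and uppercase labels mark antiparallel ones; the function names and the returned strings are hypothetical and do not reproduce the codes used in the results table.

```python
def ladder_orientation(bridge_label):
    """Orientation of a ladder, read from the case of its DSSP bridge label."""
    if not bridge_label.isalpha():
        raise ValueError(f"not a bridge label: {bridge_label!r}")
    return "parallel" if bridge_label.islower() else "anti-parallel"


def interface_orientation(bridge_labels):
    """Parallel, anti-parallel, or mixed, over all ladders of an interface."""
    kinds = {ladder_orientation(label) for label in bridge_labels}
    if not kinds:
        raise ValueError("no ladders in this interface")
    return kinds.pop() if len(kinds) == 1 else "mixed"
```

A single beta-bridge is handled like any other ladder, so an interface reduced to one bridge still receives an orientation.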
See a scanned version of the DSSP paper (page 19 of 21) for a definition of parallel and antiparallel ladders and sheets.
PQS entries can contain the same ICBS interfaces as the PDB structure they are derived from, or they can contain new, unique ICBS interfaces. The query form lets users choose whether to display unique and/or redundant entries.
A simple but effective redundancy criterion is used. An ICBS entry corresponding to a PQS structure is considered redundant if one of the following conditions is met:
Note on some methodological choices that were prompted by limitations in the format of the source data, or by errors and variations in the data.
Automatically spotting and analyzing inter-chain beta-bridges, ladders and beta-sheets in PDB files representing PDB and PQS entries is a challenging task for a number of reasons:
Instead of rebuilding unique ICBS chain identifiers and mapping them to non-unique PDB and DSSP labels, we currently ignore chains whose label has previously been used for another chain in the same entry. This only affects the results of some very large PQS macro-molecules. As these entries consist of numerous repetitions of the same PDB asymmetric unit, and of the same interfaces between them, this solution is not likely to cause any unique ICBS interaction to be missed.
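The rule of ignoring chains whose label was already used can be sketched as follows. The input layout (a list of `(label, residues)` tuples in file order) and the function name are assumptions made for the example.

```python
def drop_reused_chain_labels(chains):
    """Keep the first chain seen for each label; ignore later chains reusing it."""
    seen = set()
    kept = []
    for label, residues in chains:
        if label in seen:
            continue  # label already used by an earlier chain in this entry
        seen.add(label)
        kept.append((label, residues))
    return kept
```

For a large assembly made of repeated copies of the same asymmetric unit, this keeps one copy of each labelled chain and drops the repeats.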
Instead of adopting this costly and error-prone solution, we created an ICBS ladder label that is very unlikely to be non-unique, by combining the DSSP ladder label, the label of the ß-sheet it belongs to, and the labels of the two pairing chains. We are thus able to count ladders and hydrogen-bonds, and to determine the interface orientation for each pair of chains.
This method might cause the number of ladders to be overestimated in the probably rare cases where the same ladder joins more than 2 chains. This would however be of no consequence as far as the data we display in the database interface are concerned.
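A composite ladder label of this kind can be sketched as a tuple key. The names below and the input layout (one `(ladder_label, sheet_label, chain_a, chain_b)` tuple per inter-chain bridge) are assumptions for illustration, not the database's actual data model.

```python
from collections import defaultdict


def ladder_key(ladder_label, sheet_label, chain_a, chain_b):
    """Combine ladder label, sheet label, and the (ordered) pair of chain labels."""
    first, second = sorted((chain_a, chain_b))
    return (ladder_label, sheet_label, first, second)


def count_ladders_per_pair(bridges):
    """Count distinct ladders for each pair of chains."""
    ladders = defaultdict(set)
    for ladder_label, sheet_label, chain_a, chain_b in bridges:
        key = ladder_key(ladder_label, sheet_label, chain_a, chain_b)
        ladders[key[2:]].add(key)  # key[2:] is the ordered chain pair
    return {pair: len(keys) for pair, keys in ladders.items()}
```

Sorting the two chain labels makes the key independent of which chain is listed first, so a bridge reported from either side maps to the same ladder.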
However, because of the numerous variations in the way HETATM rows were used and placed in PDB files, taking into account HETATM rows in the analysis of inter-chain ß-sheets has a number of undesirable effects.
For the time being, we therefore only consider beta-bridges and atom contacts between residues that belong to "standard" chains, i.e. residues whose coordinates are provided in PDB ATOM rows. As a consequence, we might under- or over-estimate the ICBS index for a few ICBS entries, and we might miss a few proteins in which ICBS interactions would occur only between residues specified in HETATM rows. Please refer to the PDB documentation for details on ATOM and HETATM PDB records.
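Restricting the analysis to ATOM rows can be sketched as below, using the fixed-column layout of PDB coordinate records (record name in columns 1-6, atom name in 13-16, chain identifier in 22, x/y/z in 31-54, element symbol in 77-78). The function names are hypothetical, and the fallback on the atom name when the element column is empty is an assumption of this sketch.

```python
from collections import defaultdict


def is_hydrogen(line):
    """True for hydrogen or deuterium atoms."""
    element = line[76:78].strip().upper()  # element symbol, columns 77-78
    if element:
        return element in ("H", "D")
    # Assumption: when the element column is empty, use the atom name instead.
    name = line[12:16].strip().lstrip("0123456789")
    return name[:1] in ("H", "D")


def heavy_atoms_by_chain(pdb_lines):
    """Collect heavy-atom coordinates per chain, reading ATOM records only."""
    chains = defaultdict(list)
    for line in pdb_lines:
        if line[0:6] != "ATOM  ":
            continue  # skips HETATM rows and every other record type
        if is_hydrogen(line):
            continue
        chain_id = line[21]  # column 22
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        chains[chain_id].append(xyz)
    return dict(chains)
```

The sketch does not handle alternate locations or multi-model files, which a complete parser would need to address.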
When such cases are encountered, we simply ignore the corresponding DSSP rows. As a consequence, we might occasionally miss some inter-chain ß-sheet interactions.
Given our current method to count 'partnerships' in ladders and inter-chain interfaces (i.e., counting every partnership twice, once for R1, once for R2, and then dividing by 2), this DSSP problem might cause an underestimation of the number of hydrogen bonds and of the ICBS index. To limit the impact of this rare problem, we round any fractional number of partnerships up to the next integer.
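The rounding rule amounts to a ceiling division. In the minimal sketch below, `n_side_counts` is a hypothetical name for the raw tally in which each partnership is expected to be counted once from each of its two residues (assumed here to be what R1 and R2 denote).

```python
import math


def partnerships(n_side_counts):
    """Halve the two-sided tally, rounding any fractional result up."""
    return math.ceil(n_side_counts / 2)
```

When one side of a partnership is missing from the DSSP output, the tally is odd; for instance, a tally of 5 yields 3 partnerships rather than 2.5.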
Please note that this selection criterion does not exclude entries based on protein sequence redundancy; it is for ICBS-interface redundancy.
Please note that the interface between two chains can comprise several single beta-bridges. Please refer to the Methods section for details on secondary structure determination.
Please see the note on the information extracted from PDB files for an explanation of why there are 'missing' values and/or some discrepancies with what the web interface to the PDB database shows.
To display only recent additions to the ICBS database, one can set a lower limit on the PDB revision date. All ICBS entries (PDB and PQS structures) whose PDB revision date is later than the limit will be displayed. Please note that the release of PQS structures corresponding to new or modified PDB entries might occasionally be delayed. In such cases, some PQS structures will appear or be updated with subsequent updates of the ICBS database.
ICBS index is in a given range.
homogeneity codes from the list. Entries whose ICBS interface matches this code will be selected.
orientation codes from the list. Entries whose ICBS interface matches this code will be selected.
Tabulation of results
The results of a query are presented in a table.
Some columns reproduce pieces of information extracted from PDB files, such as the deposition date of the PDB structure. Note that several ICBS entries can correspond to the same PDB code, and therefore share the same PDB information.
Other columns present global information that characterizes the inter-chain ß-sheet interfaces found in the protein.
Lastly, some columns present detailed information on each pair of chains that interact through ß-sheets.
Navigation from page to page
The display of query results is broken up into a number of pages.
The number of results per page can be adjusted in the query form.
Results pages can be accessed using the controls available in the navigation section of the page.
Context specific help
Sorting by column
Controls (buttons) are available in most title cells to sort results by the current column, in ascending or descending order (alphabetical or numerical, depending on the column).
Running a new query
To run a new query from a results page, click on the 'Query' link at the top of the page.
For some entries, you might therefore notice some discrepancies between what the ICBS database shows and what the PDB query interface returns. For instance, many original PDB files are missing the primary citation record, or have this information placed under a wrong section of the file. Such missing citation problems have been fixed in the improved data set, but not in the PDB files themselves. Typically, the ICBS Journal column will show a 'missing' primary citation for such entries.
While we tried to fix some common errors found in PDB files (e.g., handling the many variations or errors in Journal abbreviations as explained below), fixing problems such as missing primary citation records and duplicating the results will only be achieved when we use improved primary sources.
The pieces of data extracted from PDB files are still considered helpful to filter and sort the query results. Each ICBS entry has a link to the corresponding (improved) PDB entry. Users can thus verify the corresponding PDB information, e.g., check whether a primary citation is actually missing.
The primary citation record, when present in PDB files, often contains errors. Instead of displaying the journal name they contain 'as is', we use the journal abbreviation used by the Journal Citation Reports.
Clicking on the header name brings up a RASMOL view of the ICBS entry (a PDB structure or a quaternary structure derived from a PDB structure). For help on how to install RASMOL or modify display options, click here.
ICBS ID (ICBS)
This column displays the unique identifier of the entry in the ICBS database. The identifier is:
PDB ID (PDB)
This column displays the PDB code corresponding to an ICBS entry.
Clicking on the PDB code brings up the corresponding PDB entry in a new window.
PQS PID (PQS)
This column displays the PQS code of ICBS entries.
Clicking on the PQS code brings up the corresponding PQS entry in a new window.
Please see the note on the information extracted from PDB files for an explanation of why there are 'missing' values and/or some discrepancies with what the web-interface to the PDB database shows.
Deposition date (Dep. Date)
This column displays the deposition date of the PDB structure corresponding to the ICBS entry, in yyyy-mm-dd format.
ICBS Index (Index)
This column displays the index value that characterizes the overall 'strength' of the inter-chain ß-sheet interactions in the protein structure. Index values are computed for every pair of chains where inter-chain ß-sheets occur. The maximum value over all pairs is retained to characterize the overall strength.
The higher the value, the higher the importance of the inter-chain ß-sheets in the interface between chains.
For details on the computation of the ICBS index, see Characterization of the strength of inter-chain ß-sheets: ICBS index.
The interface is considered:
See Homogeneity and sense of the ICBS interface for an explanation of the simple identity criterion retained here.
ß-sheet orientation (Sense)
This column shows the overall orientation of the ICBS interface. The following codes are used:
Total number of protein chains (Chains)
This column displays the total number of protein chains in the structure.
Total number of residues (Res.)
This column displays the total number of residues that are found in standard protein chains. Residues belonging to "non-standard" groups (i.e., corresponding to HETATM rows in the PDB file) are not counted. Consequently, the number displayed in this column does not always correspond to the number of residues displayed in the PDB query interface and/or the corresponding PDB files and/or the corresponding DSSP files.
Please see the note on the information extracted from PDB files for an explanation of other possible discrepancies between ICBS columns and what the web-interface to the PDB database displays.
Pairing chains (Pairs)
This column shows both the names of the protein chains that pair through inter-chain ß-sheets, and the 'homogeneity' of the pair, i.e., whether the chains are identical or not.
Chains are represented by their PDB or PQS one-letter code, as specified in the coordinate file. Note that when a quaternary structure corresponds to the assembly of several copies of a PDB structure, the chains are renamed in the coordinate file so that each chain is uniquely identified.
The homogeneity of a pair is displayed as a 2-character code inserted between the chain identifiers.
The homogeneity codes are as follows:
See Homogeneity and sense of the ICBS interface for an explanation of the simple identity criterion retained here.
Use the sorting buttons below the column title to sort the entries according to their number of pairing chains.
Per pair ICBS index (Index)
This column displays the value of the ICBS index for each pair of chains that interact through ß-sheets.
This column displays the number of hydrogen bonds in the ICBS interface, for each pair of chains.
Use the sorting buttons below the column title to sort the entries according to their maximum number of hydrogen bonds over all chain pairs.
This column displays the number of heavy-atom contacts in the ICBS interface, for each pair of chains.
Use the sorting buttons below the column title to sort the entries according to their maximum number of heavy-atom contacts over all chain pairs.
Per pair ß-sheet orientation (Sense)
This column shows the orientation of the ICBS interface for each pair of chains. The following codes are used: