
- Train RF2-ppi with MSA, inter-residue, and 3D structure features on data from monomers, positive DDIs, negative DDIs, positive PPIs, and negative PPIs (in a 2:1:1:1:1 ratio)
- Build the training set from domain-domain interactions (DDIs) from AFDB and a PPI dataset from PDB. Use negative control pairs to help the model distinguish true interactions from non-interacting pairs. Remove the structure template input to improve identification of negative pairs
- Develop omicMSA, which mines raw genomic and transcriptomic data from NCBI Genome and SRA, yielding 7-fold deeper MSAs
- Use four loss functions: masked token prediction, distogram prediction (relative distances between residues), orientogram prediction (relative orientations between residues), and disorder prediction
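The 2:1:1:1:1 mixing ratio from the first bullet can be illustrated with a small sampling sketch. This is not the authors' data loader: the category names, the `sample_categories` helper, and the reading of the ratio as per-example sampling weights are all assumptions made for illustration.

```python
import random
from collections import Counter

# Weights taken from the 2:1:1:1:1 ratio in the notes above.
# The dictionary keys are hypothetical labels, not identifiers from the paper.
CATEGORY_WEIGHTS = {
    "monomer": 2,
    "positive_ddi": 1,
    "negative_ddi": 1,
    "positive_ppi": 1,
    "negative_ppi": 1,
}


def sampling_probabilities(weights):
    """Normalize integer ratio weights into probabilities that sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def sample_categories(weights, n, seed=0):
    """Draw n category labels with probability proportional to the weights."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    names = list(weights)
    return rng.choices(names, weights=[weights[k] for k in names], k=n)


probs = sampling_probabilities(CATEGORY_WEIGHTS)
draws = sample_categories(CATEGORY_WEIGHTS, 6000)
counts = Counter(draws)

for name in CATEGORY_WEIGHTS:
    print(f"{name:13s} p={probs[name]:.3f} drawn={counts[name]}")
```

Under this reading, monomers make up one third of the sampled examples and each of the four interaction categories makes up one sixth. Whether the ratio is applied per batch, per epoch, or per example is not stated in the notes.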