Latent Diffusion Models for Controllable RNA Sequence Generation

Develop a latent diffusion model on the latent space of RNA-FM using the Ensembl ncRNA dataset
Train a Q-Former (modified from ESM2) to map variable-length RNA sequences into fixed-length embeddings, and a decoder (modified from ProGen2) to reconstruct sequences from these embeddings
Use a reward model to guide RNA sequence generation toward higher Mean Ribosome Loading (MRL) and Translation Efficiency (TE)
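
The summary does not say how the reward model steers sampling, so the following is only a minimal sketch of one common option: classifier-guidance-style sampling, where the gradient of a reward is added to each reverse diffusion step. Everything in it is a stand-in chosen for illustration. The analytic `toy_eps_model` replaces the trained denoiser (it is exact only when clean latents follow N(data_mean, I)), the linear `reward` replaces the MRL/TE predictor, and the names, schedule values, and latent dimension are hypothetical rather than taken from the project.

```python
import numpy as np

def make_schedule(num_steps=100, beta_min=1e-4, beta_max=0.05):
    # Linear DDPM noise schedule (values are illustrative assumptions).
    betas = np.linspace(beta_min, beta_max, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def toy_eps_model(x_t, alpha_bar_t, data_mean):
    # Stand-in for the trained denoiser: the exact noise prediction
    # when clean latents are distributed as N(data_mean, I).
    return np.sqrt(1.0 - alpha_bar_t) * (x_t - np.sqrt(alpha_bar_t) * data_mean)

def reward(x, w):
    # Stand-in for a learned reward model (e.g. a predictor of MRL or TE):
    # a linear score over the latent vector.
    return x @ w

def reward_grad(x, w):
    # Gradient of the linear reward with respect to x, one row per sample.
    return np.broadcast_to(w, x.shape)

def sample(guidance_scale, data_mean, w, num_samples=512, seed=0):
    # Ancestral DDPM sampling with reward-gradient guidance.
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule()
    x = rng.standard_normal((num_samples, data_mean.shape[0]))
    for t in range(len(betas) - 1, -1, -1):
        eps = toy_eps_model(x, alpha_bars[t], data_mean)
        # Shift the predicted noise against the reward gradient, which
        # moves the reverse-step mean toward higher-reward latents.
        eps = eps - guidance_scale * np.sqrt(1.0 - alpha_bars[t]) * reward_grad(x, w)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added on the final step.
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

if __name__ == "__main__":
    mu = np.zeros(8)
    w = np.zeros(8)
    w[0] = 1.0
    unguided = reward(sample(0.0, mu, w), w).mean()
    guided = reward(sample(1.0, mu, w), w).mean()
    print("mean reward, unguided:", unguided)
    print("mean reward, guided:  ", guided)
```

With `guidance_scale=0.0` the loop reduces to ordinary unguided sampling, so comparing the two runs isolates the effect of the reward gradient. In a full pipeline the sampled latents would then be passed to the decoder to obtain RNA sequences.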