Hey folks! We’re a team of plant biologists, computational genomicists, and bioinformaticians building an AI-native plant design lab. But we’re grappling with a deep challenge:
How do you unify and learn from fragmented, multimodal biological data?
DNA sequences, RNA expression profiles, protein structures, pathway ontologies, images, tables from papers, and text from research PDFs - all under one learning framework.
- Should we construct a knowledge graph and apply GNNs?
- Or use multimodal transformers to reason jointly across formats? Or try a different approach entirely?
- How do we clean, embed, and align these diverse inputs meaningfully? (One rough alignment idea sketched below.)
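
To make the "embed and align" question a bit more concrete, here's a minimal PyTorch sketch of one possible direction: CLIP-style contrastive alignment between a toy DNA-sequence encoder and a toy text encoder, so sequences and literature snippets land in one shared embedding space. The encoder architectures, vocab sizes, and dimensions are placeholders for illustration, not our actual pipeline.

```python
# Rough sketch (placeholders, not our real stack): contrastive alignment of
# paired (DNA sequence, paper text) embeddings into a shared space.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeqEncoder(nn.Module):
    """Tiny stand-in for a DNA language model: token ids -> one vector per sequence."""
    def __init__(self, vocab_size=4, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.proj = nn.Linear(dim, dim)

    def forward(self, tokens):                 # tokens: (batch, seq_len) ints in [0, 4)
        h = self.encoder(self.embed(tokens))   # (batch, seq_len, dim)
        return self.proj(h.mean(dim=1))        # mean-pool to one vector per sequence

class TextEncoder(nn.Module):
    """Tiny stand-in for a text model over abstracts: token ids -> one vector per snippet."""
    def __init__(self, vocab_size=30_000, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, tokens):                 # tokens: (batch, text_len)
        return self.proj(self.embed(tokens).mean(dim=1))

def contrastive_loss(seq_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE: matched (sequence, text) pairs attract, mismatched pairs repel."""
    seq_emb = F.normalize(seq_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = seq_emb @ txt_emb.t() / temperature     # (batch, batch) similarity matrix
    labels = torch.arange(len(seq_emb))              # i-th sequence pairs with i-th text
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2

# Smoke test on random "data": 8 paired (gene sequence, abstract snippet) examples.
seq_enc, txt_enc = SeqEncoder(), TextEncoder()
dna = torch.randint(0, 4, (8, 200))        # fake DNA token ids (A/C/G/T)
text = torch.randint(0, 30_000, (8, 64))   # fake abstract token ids
loss = contrastive_loss(seq_enc(dna), txt_enc(text))
loss.backward()
print(f"contrastive loss: {loss.item():.3f}")
```

The same pairing trick could in principle extend to other modalities (expression profiles, structures, images) by swapping in the right encoder per modality; whether that beats a knowledge-graph-plus-GNN route is exactly the kind of question we want to hash out.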
If you’re excited about building foundation models for biology - the GPT moment for plant genomics - let’s jam!