8 Uncertainties:

  • What proteins interact with the environment?
  • What is being emitted by the cell?
  • A large, structured dataset containing most or all bulk and/or single-cell data. It should record which sequencing technology was used, include all necessary metadata, and be normalized.
  • Adoption of gene name conventions OR options to translate between them
  • Visualization of a transcriptome
    • Biological clustering of genes (protein cluster, pathway, …)
    • Too many genes to analyze in full; how to visualize them?
  • What to do with unknown genes?
  • Variance introduced by technology (within/between technologies, within/between data modalities)
  • High biological variance for healthy samples (within/between individuals)
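
One way to make the two variance items above concrete is to split the total sum of squares of a gene's expression into a between-group and a within-group part, once grouped by technology and once by individual. The sketch below is only an illustration under stated assumptions: the function name `variance_components`, the sample labels, and the expression values are all hypothetical and not taken from any real dataset.

```python
from collections import defaultdict
from statistics import fmean


def variance_components(values, groups):
    """Split the total sum of squares into (between-group, within-group) parts."""
    if len(values) != len(groups):
        raise ValueError("values and groups must have the same length")

    # Collect the values belonging to each group label.
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)

    grand_mean = fmean(values)

    # Between: how far each group mean lies from the grand mean, weighted by group size.
    between = sum(
        len(vs) * (fmean(vs) - grand_mean) ** 2 for vs in by_group.values()
    )
    # Within: how far each value lies from its own group mean.
    within = sum(
        (v - fmean(vs)) ** 2 for vs in by_group.values() for v in vs
    )
    return between, within


# Hypothetical normalized expression of a single gene in four healthy samples.
expression = [5.0, 7.0, 9.0, 11.0]
technology = ["tech_A", "tech_A", "tech_B", "tech_B"]      # assumed labels
individual = ["donor_1", "donor_2", "donor_1", "donor_2"]  # assumed labels

tech_between, tech_within = variance_components(expression, technology)
donor_between, donor_within = variance_components(expression, individual)

print(f"by technology: between={tech_between}, within={tech_within}")
print(f"by individual: between={donor_between}, within={donor_within}")
```

In this toy example most of the spread sits between technologies rather than between individuals. With real data the same comparison would be run per gene, and a proper mixed model would likely be needed once technology and individual are not balanced against each other.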