Next-generation nucleotide sequencing of biological samples generates massive numbers of short sequence reads. These are then computationally assembled into longer sequences: contigs (which are substantial parts of a genome) or isotigs (which are the differentially spliced forms of messenger RNA). A major challenge with these assembled contigs and isotigs is measuring the quality of the assembly and how completely they represent all the protein-coding genes that are present.
The Ramaciotti Centre has pioneered a method by which proteomic analysis (i.e. the identification of which proteins, and isoforms thereof, are present in the sample that was used for the nucleotide sequence analysis) can be used to validate contigs and isotigs that are generated through next-gen sequencing analyses. However, the process is currently carried out as a series of unconnected steps, and there is no good means of visualising the outcome.
This project will integrate these separate steps and their data into a single platform and provide an analysis/visualisation layer.
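As a rough illustration of the validation idea described above, the sketch below translates an assembled contig in all six reading frames and reports which identified peptides are found in which frame. It is a minimal sketch under stated assumptions, not the Ramaciotti Centre's actual pipeline: the function names (`six_frame_translation`, `validate_contig`), the exact-substring matching and the toy sequences are all hypothetical, and real workflows would typically rely on a dedicated proteomics search engine.

```python
# Hypothetical sketch: check which peptides are supported by a contig.
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(contig: str) -> dict:
    """Map frame labels (+1..+3, -1..-3) to translated protein strings."""
    contig = contig.upper()
    reverse = reverse_complement(contig)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(contig[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def validate_contig(contig: str, peptides: list) -> dict:
    """For each peptide, list the frames whose translation contains it."""
    frames = six_frame_translation(contig)
    return {
        peptide: sorted(label for label, protein in frames.items()
                        if peptide in protein)
        for peptide in peptides
    }


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    support = validate_contig("ATGGCCAAGTTTGGA", ["AKF", "KLG", "WWW"])
    for peptide, found_in in support.items():
        print(peptide, found_in or "no supporting frame")
```

A peptide that maps to at least one frame lends support to that region of the assembly, while a contig with no supporting peptides is not necessarily wrong; it may simply not be expressed in the sample.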
The following are some of the key issues this project aims to address:
- Resources are wasted because the validation is done manually.
- The current procedure is error-prone and of limited use because it is labour-intensive.
- There is no easy way to visualise proteomic results, even though visualisation gives researchers important clues during the analysis of results.
- In general, it is hard to establish relationships between the genome, the transcriptome and the protein profile at a given time.