Difference between revisions of "Team:Bordeaux/Software"

Line 36: Line 36:
  
 
<p>
 
<p>
   RNA-Seq as said previously allows to quantify RNA into a cell at a particular time. With NGS development, a huge amount of data became available to scientists. They actually needed peoples to compute these data and this is when bioinformaticians came up. Computers are actually thought to treat a lot of data faster than humans. Thus, a lot of tools were developed to process NGS outputs. For the competition we used some of these tools to study splicing in C. elegans organism. Lets see how we proceeded !
+
   As described previously, RNA-Seq quantifies the RNA present in a cell at a particular time. With the development of NGS, a huge amount of data became available to scientists. Processing these data required people with computational skills, and this is where bioinformaticians came in. Computers can process large amounts of data much faster than humans, so many tools were developed to handle NGS outputs. For the competition, we used some of these tools to study splicing in <i>C. elegans</i>. Let's see how we proceeded!
 
</p>
 
</p>
  
Line 44: Line 44:
  
 
<p>
 
<p>
   In bioinformatics, sequence alignment is a way of arranging RNA sequences in relation to each other, to determine their structure or function similarities. Sequences are stored in a matrix where rows from each sequence are compared. Gaps can be added into sequences so that identical or similar characters are aligned in successive columns. The organism studied here is <i> C.elegans</i>. The purpose here was to align RNAseq reads to its reference genome by using the Hisat algorithm.
+
   In bioinformatics, sequence alignment is a way of arranging sequences (here, RNA sequences) relative to each other in order to identify similarities in structure or function. The sequences are written as the rows of a matrix, and gaps can be inserted so that identical or similar characters line up in the same columns. The organism studied here is <i>C. elegans</i>. Our aim was to align RNA-Seq reads to its reference genome using HISAT2.
  RNA is transcribed from DNA sequences that are composed of alternating coding exons and non-coding introns. A pre-RNA is produced that contains the transcribed Exons and Introns.
+
RNA is transcribed from DNA sequences that are composed of alternating coding exons and non-coding introns. Transcription produces a pre-RNA that contains both the exons and the introns.
 
</p>
 
</p>
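<p>
The gapped alignment described above can be illustrated with a minimal sketch. It uses the classic Needleman-Wunsch global alignment, which is not what HISAT2 does internally (HISAT2 relies on an index of the genome); the function name <code>global_align</code> and the scoring values are assumptions chosen for illustration only.
</p>

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Toy Needleman-Wunsch global alignment of two sequences.

    Returns (aligned_a, aligned_b, score), where '-' marks a gap.
    Scoring values are illustrative assumptions, not tuned parameters.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to rebuild the alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    top, bottom, total = global_align("ACGT", "AGT")
    print(top)     # ACGT
    print(bottom)  # A-GT
    print(total)   # 2
```

<p>
In this example, a single gap is inserted in the second sequence so that the three shared characters fall in the same columns.
</p>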
  
 
<p>
 
<p>
  Out of this pre-RNA, only coding Exons must be kept and the introns removed. This process of removing introns is called splicing. Different combinations of exons can be brought together to produce different variants of the protein to be, in a process called alternative splicing.
+
From this pre-RNA, only the coding exons are kept and the introns are removed. This process of removing introns is called splicing. Different combinations of exons can be joined to produce different variants of the resulting protein, a process called alternative splicing.
  It is those spliced RNA sequences that are then sequenced. To do, so they are retro-transcribed into their complementary DNA, the cDNA. This DNA is sequenced using NGS.
+
It is these spliced RNA sequences that are sequenced. To do so, they are first reverse-transcribed into their complementary DNA (cDNA), which is then sequenced using NGS.
 
</p>
 
</p>
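<p>
Splicing, alternative splicing and reverse transcription as described above can be sketched on a toy sequence. The sequence, the exon coordinates and the function names <code>splice</code> and <code>reverse_transcribe</code> are hypothetical and only serve as an illustration; real splicing depends on splice-site signals that are not modelled here.
</p>

```python
# Toy pre-RNA: exon, intron, exon, intron, exon (hypothetical sequence)
PRE_RNA = "AUGGC" + "GUAAG" + "CCAUU" + "GUCAG" + "GAUAA"
# Exon coordinates as half-open (start, end) intervals; assumed known
EXONS = [(0, 5), (10, 15), (20, 25)]

# Complement table from RNA bases to DNA bases
_RNA_TO_DNA = str.maketrans("ACGU", "TGCA")


def splice(pre_rna, exons, keep=None):
    """Join the selected exons, in order, and drop everything else.

    keep is a list of exon indices; None keeps every exon.
    """
    if keep is None:
        keep = range(len(exons))
    return "".join(pre_rna[exons[k][0]:exons[k][1]] for k in keep)


def reverse_transcribe(rna):
    """Return the cDNA strand (reverse complement), written 5' to 3'."""
    return rna.translate(_RNA_TO_DNA)[::-1]


if __name__ == "__main__":
    full = splice(PRE_RNA, EXONS)                   # all three exons
    skipped = splice(PRE_RNA, EXONS, keep=[0, 2])   # middle exon skipped
    print(full)                         # AUGGCCCAUUGAUAA
    print(skipped)                      # AUGGCGAUAA
    print(reverse_transcribe(skipped))  # TTATCGCCAT
```

<p>
The two calls to <code>splice</code> give two different transcripts from the same pre-RNA, which is the idea behind alternative splicing.
</p>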
  
 
<p>
 
<p>
   Current sequencing technologies methods split the large DNA molecules to be sequenced into small chunks called reads. These reads sequences are mapped to the genome reference using algorithms like bowtie. Because reads are small, some sequences can be redundant, present at different locations in the genome, making them hard to map. To circumvent this, a technique of mapping called paired-end is used. It consists in sequencing a cDNA fragment at its extremities in both directions, 3’ to 5’ and 5’ to 3’ (reverse strand). Because these reads originate from the same fragment the distance between them is know and it is easier to map them. Indeed, if two reads can map at a same location only one will have its pair mapping further at the correct distance.
+
   Current sequencing technologies split the large DNA molecules to be sequenced into small fragments, and the short sequences obtained from them are called reads. These reads are mapped to the reference genome using algorithms like Bowtie. Because reads are short, the same sequence can occur at several locations in the genome, which makes such reads hard to map. To circumvent this, a technique called paired-end sequencing is used. It consists of sequencing a cDNA fragment from both of its ends, in opposite directions (forward and reverse strand). Because the two reads originate from the same fragment, the distance between them is known and they are easier to map. Indeed, if a read can map to two locations, only one of them will have its mate mapping at the expected distance.
 
</p>
 
</p>
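<p>
The way a mate resolves an ambiguous read can be sketched as follows. This is a simplified illustration, not how Bowtie or HISAT2 work: it uses exact string matching on a made-up genome, ignores strand orientation, and the names <code>find_all</code> and <code>consistent_pairs</code> as well as the distance and tolerance values are assumptions.
</p>

```python
def find_all(genome, read):
    """Return every start position where read occurs exactly in genome."""
    hits = []
    start = genome.find(read)
    while start != -1:
        hits.append(start)
        start = genome.find(read, start + 1)
    return hits


def consistent_pairs(read1_hits, read2_hits, expected_distance, tolerance):
    """Keep only (pos1, pos2) pairs whose spacing fits the expected distance."""
    return [
        (p1, p2)
        for p1 in read1_hits
        for p2 in read2_hits
        if abs((p2 - p1) - expected_distance) <= tolerance
    ]


# Made-up genome in which the sequence GATTACA occurs twice
GENOME = "CCCC" + "GATTACA" + "T" * 10 + "GGCATGC" + "A" * 20 + "GATTACA" + "CCCC"

if __name__ == "__main__":
    read1_hits = find_all(GENOME, "GATTACA")  # ambiguous: two locations
    read2_hits = find_all(GENOME, "GGCATGC")  # mate: a single location
    print(read1_hits)  # [4, 48]
    print(read2_hits)  # [21]
    # Assume the mates start about 17 bases apart, give or take 3
    print(consistent_pairs(read1_hits, read2_hits, 17, 3))  # [(4, 21)]
```

<p>
Taken alone, the first read could be placed at position 4 or 48. Only position 4 has the mate at the expected distance, so the pair is mapped there.
</p>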
  

Revision as of 16:46, 1 November 2017
