The Illumina HiSeq 2000 sequencing system generates up to 600 Gb per run with the industry's lowest cost per Gb.
This platform is suitable for whole genome re-sequencing, de novo sequencing, targeted re-sequencing, SNP discovery and structural variation analysis, ChIP-Seq and RNA-Seq.
Clusters are generated from the sequencing library by bridge PCR, and sequencing is performed by a technology referred to as sequencing by synthesis (SBS).
Briefly, for bridge PCR, primers and DNA template are immobilized on the surface of a flow cell. The primers target adaptors ligated to the DNA fragments, and with each cycle of PCR the fragments form "bridges"; subsequent denaturation leaves single-stranded templates anchored to the flow cell. The copies remain close to the original template and form dense clusters.
For sequencing, a universal primer is annealed to the clusters and all four nucleotides are added simultaneously to the flow cell with DNA polymerase. Each base carries a unique fluorescent label, and its 3'-OH group is chemically blocked so that only one base can be incorporated at a time. Each incorporation event is followed by an imaging step, after which the 3' blocking group is removed to prepare each strand for the next base. This process is repeated a user-defined number of times to generate read lengths of 35-100 bases.
The PMGC recommends following the ENCODE Consortium's guidelines for experiments, which address many important considerations for deep sequencing experiments, including the depth of coverage required.
Pricing: Please inquire
All samples should be assayed by PicoGreen, RiboGreen or Qubit (Invitrogen) for accurate quantitation. NanoDrop readings are not accurate to the level required for this system. Qubit quantitation is provided by our facility for all incoming samples for sequencing projects.
Other Applications: Please inquire.
Analysis Time Frame
Deep sequencing is a lengthy process. Library construction takes 2-4 days, cluster generation and sequencing can take up to 2 weeks, and pipeline processing and QC checks add at least another 2-3 days. Please plan your experiments with the expectation of at least one month between sample submission and receipt of results. Delays beyond this will be reported at the onset of your project, or as soon as they arise.
Deep sequencing requires that every base in a sample is sequenced several times, for two reasons. First, multiple observations per base are needed to generate a reliable base call. Second, because sequence generation is random, reads are not distributed evenly over the genome: some bases will be read many more times than the average, and some far fewer. 'Depth of sequencing' refers to the average number of times each base in the genome has been sequenced.
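The uneven distribution of reads described above is often approximated with a Poisson model, in which each base is covered a random number of times around the mean depth. The sketch below is a minimal illustration under that assumption (real libraries are usually more dispersed, for example because of GC bias and repeats); the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean_depth: float) -> float:
    """Probability that a base is covered exactly k times under a Poisson model."""
    return math.exp(-mean_depth) * mean_depth ** k / math.factorial(k)


def fraction_below(min_depth: int, mean_depth: float) -> float:
    """Expected fraction of bases covered fewer than min_depth times."""
    return sum(poisson_pmf(k, mean_depth) for k in range(min_depth))


if __name__ == "__main__":
    # Compare how many bases fall below 10x at two different mean depths.
    for depth in (10.0, 30.0):
        print(depth, fraction_below(10, depth))
```

For example, `fraction_below(10, 30.0)` estimates the fraction of bases expected to be read fewer than 10 times at a mean depth of 30, which is one way to see why the average depth must be well above the minimum depth needed per base.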
There is no definitive answer to how much coverage an experiment requires. Most users choose the coverage level for a particular experiment by considering the following factors: the type of study, gene expression level, size of the reference genome and published literature.
For example, studies of human genome mutations, SNPs or rearrangements typically require 10-30 times depth of coverage, depending on the application and statistical model, whereas ChIP-seq studies, where only a subset of the genome is being investigated, generally require around 100 times coverage.
For RNA-seq, determining the level of coverage required is further complicated by the varied expression levels of different transcripts. Highly expressed transcripts will be read many more times than transcripts with low expression. For RNA-seq experiments, the requirement is therefore usually expressed as a number of millions of reads. The number of reads required will depend on how sensitive the experiment needs to be for genes expressed at low levels.
Illumina provides a Coverage Calculator that might aid you in determining the level of coverage your experiment requires.
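As a rough guide to the kind of arithmetic involved, mean coverage is commonly estimated as (number of reads x read length) / genome size. The sketch below applies that relation. It is not Illumina's Coverage Calculator, and the function names, the approximate human genome size of 3.1 Gb and the choice of 100-base single reads are assumptions made for illustration.

```python
def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Mean depth of coverage: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


def reads_required(target_coverage: int, read_length: int, genome_size: int) -> int:
    """Smallest number of reads whose total bases reach the target coverage."""
    total_bases = target_coverage * genome_size
    return -(-total_bases // read_length)  # integer ceiling division


# Assumption: approximate human genome size, for illustration only.
HUMAN_GENOME_SIZE = 3_100_000_000

if __name__ == "__main__":
    # 30x coverage of the assumed genome size with 100-base reads.
    print(reads_required(30, 100, HUMAN_GENOME_SIZE))  # 930000000
```

This estimate ignores duplicate reads, unmapped reads and uneven coverage, so a real experiment would normally need more reads than the figure it returns.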
The ENCODE Consortium also provides some useful data standards, including guidelines for experiments, which address many important considerations for deep sequencing experiments.
With RNA-seq it is possible to detect long non-coding RNAs and to analyze transcription start/stop sites and splice variants. This is difficult to do with microarrays, although Agilent arrays do interrogate a series of long non-coding RNAs.
On the other hand, microarrays have the advantage of being cheaper and less time consuming, and can provide good quality gene-level results from much smaller quantities of total RNA. For gene-level differential expression analyses of mRNA, 20 to 200 million reads per sample may be adequate; however, at this level multiplexed microarrays (such as the Illumina HT-12) still save approximately $150 to $1500 per sample. According to the ENCODE RNA-Seq standards, if the purpose of your experiment is discovery of novel transcribed elements or strong quantification of known transcript isoforms, then a depth of 100-200 million reads is currently recommended.
ChIP-seq generally requires fewer reads than RNA-seq, which makes a ChIP-seq experiment as attainable as a ChIP-chip experiment.
According to the ENCODE ChIP-Seq standards, the current minimum number of reads for a ChIP-seq experiment involving transcription factors or chromatin modifications is 10 million uniquely mapped reads per replicate for mammalian cells. In practice, this means that up to 10 samples can be run per lane, reducing the cost dramatically.
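The multiplexing arithmetic behind that statement can be sketched as follows. The per-lane yield and the unique mapping rate used here are hypothetical placeholders, not facility or Illumina specifications; only the 10 million read minimum comes from the ENCODE figure quoted above.

```python
def samples_per_lane(reads_per_lane: int,
                     min_mapped_reads: int,
                     unique_mapping_rate: float) -> int:
    """Number of samples that can share a lane while each still reaches
    min_mapped_reads uniquely mapped reads."""
    if not 0 < unique_mapping_rate <= 1:
        raise ValueError("unique_mapping_rate must be in (0, 1]")
    if min_mapped_reads <= 0:
        raise ValueError("min_mapped_reads must be positive")
    # Reads expected to map uniquely, rounded down to a whole number.
    usable_reads = int(reads_per_lane * unique_mapping_rate)
    return usable_reads // min_mapped_reads


if __name__ == "__main__":
    # Hypothetical lane yield (150 million reads) and mapping rate (70%).
    print(samples_per_lane(150_000_000, 10_000_000, 0.7))
```

The sketch assumes reads are split evenly between barcoded samples; in practice pooling is never perfectly balanced, so some margin is usually left.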
As with any technique, replicates are necessary for statistical assessment of significant differences between groups. However, due to the quantitative nature of sequencing, useful information can also be obtained from single experiments, for example case studies in which conclusions are applied only to the samples tested and not to any wider group. Consultation will be required to determine the number of replicates needed for a particular set of experiments.
The PMGC uses a series of processing steps, or algorithms, that are based on current publications in the field, for processing all sequencing data.
This series of steps is referred to as the "pipeline".
The pipeline will vary depending on the type of analysis required (RNA-seq vs. Exome-seq vs. ChIP-seq etc.).
For more details please inquire.
Customers will receive their processed sequence data as .fastq files along with FastQC reports. If alignment or further analysis is performed by the PMGC, then the form the results take will depend on the type of analysis required. Please inquire.
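For orientation, each record in a .fastq file spans four lines: a header beginning with `@`, the base sequence, a `+` separator, and a quality string with one character per base. The sketch below reads such records and averages the quality scores. It assumes unwrapped four-line records and Phred+33 quality encoding (older Illumina pipelines used Phred+64), and the function names are hypothetical.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from unwrapped four-line FASTQ records."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # skip blank lines between or after records
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().strip()
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError("malformed FASTQ record: " + header.strip())
        yield header[1:].strip(), sequence, quality


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (offset 33 is an assumption)."""
    return sum(ord(char) - offset for char in quality) / len(quality)


if __name__ == "__main__":
    example = io.StringIO("@read1\nACGT\n+\nIIII\n")
    for read_id, sequence, quality in read_fastq(example):
        print(read_id, sequence, mean_phred(quality))
```

To read a delivered file, the same generator can be passed an open file handle, for example `open("sample.fastq")`, where the file name is a placeholder.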