Affymetrix: Overview & FAQ

As an Authorized Affymetrix Service Provider, the PMGC provides access to the entire suite of Affymetrix cartridge-based GeneChips™. With more than 50 different arrays covering 30 different species, chances are good that you will find an array suited to your research project. The PMGC offers services that can work with small quantities of RNA (down to the sub-nanogram level), or even with degraded RNA of the kind typically obtained from FFPE extractions. Regardless of your needs, it is highly likely we have a solution. We will also consult with you before an experiment begins, to help you design it for maximum impact.


Pricing: Please inquire
Pricing is highly dependent on experimental design. Please allow us to help you find the most cost-effective solution by taking advantage of our complimentary experimental design consultation.


Affymetrix Service Coordinator: Gurbaksh Basi

Quick Questions:

Where can I find more information about Affy GeneChip® content?

The PMGC has developed a powerful tool (ArrayTrans) that lets researchers easily search for genes or probes represented on human and mouse arrays from Affymetrix and other platforms. For additional information about the content of Affymetrix GeneChips®, including species other than human and mouse, please visit the NetAffx Analysis Center on the Affymetrix website.

How much total RNA (or DNA) is required for an Affymetrix GeneChip®?

The amount of material required depends on the application. For example, the standard protocol for Affymetrix 3’ IVT arrays requires 100 ng of total RNA, but by using one of the NuGen amplification kits we can reduce the input requirement to below 1 ng. Please refer to the specific pages for Gene Expression, miRNA, or SNP profiling for more accurate guidelines on sample requirements.

What is the difference between the standard Affymetrix and NuGen amplification methods?

Affymetrix’s standard protocol uses an in vitro transcription (IVT) reaction to perform a linear amplification, starting with 100 ng of total RNA. For some applications, obtaining even 100 ng of total RNA can be a challenge. In these cases we turn to the NuGen product line, such as Ovation® Amp V2 or Ovation® WT Pico, which allow us to work with as little as 500 pg of total RNA. This method is more costly but offers a robust solution for low input RNA.

What are my options if I have degraded RNA such as that obtained from FFPE extractions?

We have had good success with degraded RNA using the Ovation® FFPE WTA System from NuGen. For this method, 50 ng of total RNA is required. For human samples, you may also want to check out the Illumina DASL arrays.

Is the performance of the arrays impacted by the RNA isolation technique/kit used?

We have not seen a major impact on the performance of the arrays using well-established RNA extraction methods, so long as the quality of the RNA is comparable. One important metric is the 260/230 ratio, which should be over 1.8. Some extraction methods that use organic reagents can result in low 260/230 ratios due to carry-over of the reagents, and this can impact labelling. It is important, however, that you use the same extraction method for all samples within one project.

What are the 20X Eukaryotic controls?

The 20X Eukaryotic Hybridization Controls are spiked into the hybridization cocktail, independent of RNA sample preparation, and are used to evaluate sample hybridization efficiency on eukaryotic gene expression arrays. The sequences for these controls are derived from prokaryotic species and have been chosen to not cross-hybridize with the labelled samples generated from eukaryotic RNA. These controls allow for various quality metrics. One of the controls, BioB, is spiked in at the level of assay sensitivity (1:100,000 complexity ratio). In a successful hybridization, this control should be called "Present" at least 70% of the time, and, therefore, is used in overall process control. The other controls, BioC, BioD, and cre should always be called "Present", with increasing signal values reflecting their relative concentrations, and thus serve as excellent quality control metrics for each array processed.
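As an illustration only, the checks described above could be sketched in Python as follows. The function names, the "P" label for a Present call, and the input layout are assumptions made for this example and are not part of any Affymetrix software. The control order (BioB lowest, cre highest) follows the order in which the controls are listed above. Reading the 70% figure as a rate across the arrays of a project is also an assumption.

```python
from typing import Dict, List

# Control names from the FAQ, assumed to run from lowest to highest concentration.
HYB_CONTROLS = ["BioB", "BioC", "BioD", "cre"]


def check_hyb_controls(calls: Dict[str, str], signals: Dict[str, float]) -> List[str]:
    """Return QC warnings for one array (an empty list means nothing was flagged)."""
    warnings = []
    # BioC, BioD and cre should always be called Present ("P" is a hypothetical label).
    for name in HYB_CONTROLS[1:]:
        if calls[name] != "P":
            warnings.append(f"{name} not called Present")
    # Signal should rise with concentration: BioB < BioC < BioD < cre.
    for low, high in zip(HYB_CONTROLS, HYB_CONTROLS[1:]):
        if signals[low] >= signals[high]:
            warnings.append(f"signal for {low} is not below {high}")
    return warnings


def biob_present_rate(biob_calls: List[str]) -> float:
    """Fraction of arrays in a project on which BioB was called Present."""
    return sum(call == "P" for call in biob_calls) / len(biob_calls)
```

A BioB rate of 0.70 or higher across a project would be in line with the guideline quoted above.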

What are the Poly A controls?

Poly A controls are spiked into the sample to allow for quality control of the IVT reaction when using the standard Affymetrix protocol. These control sequences are designed against genes from Bacillus sp. and, therefore, will not cross-react with the cRNA generated from the sample. These controls are independent of the quality of the RNA in the sample and, therefore, evaluate the success of the amplification and labelling. The poly A controls contain both T3 and T7 promoter sequences. In spiked reactions, all poly A controls should be called present, except LYS, which (at a 1:100,000 dilution) is called present only 70% of the time. The poly A controls and their dilutions are as follows:

  • DAP dilution is 1:7,500 (most concentrated)
  • THR dilution is 1:25,000
  • PHE dilution is 1:50,000
  • LYS dilution is 1:100,000
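
To make the relative levels concrete, the short Python sketch below converts the dilutions listed above into abundance relative to LYS, the most dilute control. The function name and dictionary layout are hypothetical; only the dilution values come from the list.

```python
# Dilutions quoted in the list above (1:N), keyed by control name.
POLY_A_DILUTIONS = {"DAP": 7_500, "THR": 25_000, "PHE": 50_000, "LYS": 100_000}


def relative_abundance(dilutions, reference="LYS"):
    """Abundance of each control relative to the reference control."""
    ref = dilutions[reference]
    # A smaller dilution number means a more concentrated control.
    return {name: ref / n for name, n in dilutions.items()}


for name, fold in relative_abundance(POLY_A_DILUTIONS).items():
    print(f"{name}: {fold:.1f}x the LYS level")
# DAP: 13.3x the LYS level
# THR: 4.0x the LYS level
# PHE: 2.0x the LYS level
# LYS: 1.0x the LYS level
```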

Do the Affymetrix GeneChips® contain any housekeeping genes on the arrays?

The term “housekeeping gene” can mean different things to different people. The majority of Affymetrix arrays are “whole genome”, meaning that just about every gene should be represented. On every GeneChip® expression array, we process quality metrics based on GAPDH and β-actin. We use these genes to help assess RNA and assay quality. For both of these genes, we look at the 3’:5’ signal ratio, which should be close to 1; however, we can accept anything with a ratio of up to 3. If this ratio is as high as 10, it indicates that either the initial RNA sample was degraded or that the amplification and labelling reactions were not effective.
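
The thresholds above can be expressed as a small Python sketch. The function name and the returned labels are hypothetical. The intermediate "review" band for ratios between 3 and 10 is an assumption, since the text does not say how such values are treated.

```python
def classify_3_to_5_ratio(signal_3prime: float, signal_5prime: float) -> str:
    """Classify a GAPDH or beta-actin 3':5' signal ratio using the FAQ thresholds."""
    ratio = signal_3prime / signal_5prime
    if ratio <= 3:
        # Close to 1 is ideal; up to 3 is accepted.
        return "acceptable"
    if ratio < 10:
        # Assumption: the FAQ does not define this band, so flag it for a closer look.
        return "review"
    return "likely degraded RNA or ineffective amplification/labelling"


print(classify_3_to_5_ratio(1200.0, 1000.0))  # acceptable
print(classify_3_to_5_ratio(5000.0, 1000.0))  # review
```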

What does the noise (RAWQ) represent, and what is an acceptable value?

The detection of fluorescence on a GeneChip® is carried out via the use of a photomultiplier tube (PMT). The PMT is an extremely sensitive device and is, therefore, affected to some degree by electrical noise associated with the operation of the scanner. This noise can be variable from one scanner to another, which is why the PMGC always scans all arrays from a single project on the same instrument. The RAWQ values, which are a measurement of the noise (which contributes to background on the arrays), should be relatively consistent throughout a project. Typically the values range from 1 to 4, but Affymetrix does not provide a specific guideline.

What is the Affymetrix GeneChip® Command Console® (AGCC)?

AGCC is the latest generation of instrument control software for GeneChip® systems. AGCC summarizes probe cell intensity data (".cel" file generation) and enables sample and array registration, data management, instrument control, as well as automatic and manual image gridding.

AGCC also supports a full range of Affymetrix assays by enabling seamless integration with downstream primary analysis applications, such as Affymetrix Expression Console™ and Genotyping Console™. The PMGC ensures that we are always using the most up-to-date and validated software to control our instruments.

What are the ".dat", ".cab", ".cel", ".chp", and ".rpt" files that are generated from an Affymetrix experiment?

Several different files are generated from an Affymetrix experiment.

  • The “.dat” file represents the most “raw” data from the scan. The file contains intensity information for every pixel of the image.
  • The “.cel” file is a more processed data file whereby the average pixel intensity for an individual probe feature (spot) on the array is calculated. This file is the one typically used in most downstream analysis. The individual pixel information is lost; however, the size of the file is much smaller and easier to manipulate.
  • The “.chp” file again is a more processed data file. If a ".cel" file is analyzed using Affymetrix’s analysis software, it generates a ".chp" file, which contains processed and normalized values. For analysis with a third-party tool, most people prefer to use the ".cel" file.
  • The “.cab” file is a compressed archive of the ".dat", ".cel", and ".chp" files that is prepared by the data transfer tool (DTT).
  • The “.rpt” file is a report that includes the QC metrics.

Which files are required to analyze Affymetrix data in GeneSpring?

Most people find that the best way to analyze Affymetrix data in GeneSpring is to import the ".cel" files. The other files are not required. Alternatively, you can import the ".chp" files but you will not be able to use any of GeneSpring’s more advanced normalization algorithms such as RMA or gcRMA.

What does TGT stand for and are there any guidelines for choosing the TGT value?

The TGT value is an arbitrary “Target Intensity” chosen by the experimenter or technician running the assay. Whatever value is chosen should be maintained across all of the arrays for a particular project. The TGT value may be chosen empirically, although many people simply opt for the default value of 500. The idea is that the average intensity of an array, multiplied by that array's scaling factor, gives the TGT. The scaling factor helps to compensate for the variability between chips. In general, TGT values are between 100 and 500. People often find it better to start with a somewhat lower value and work from there.

What is the scaling factor?

The scaling factor helps to compensate for the variability between chips. The experimenter first chooses a target intensity (TGT), which is the value you would like the average intensity of every chip to have. Each array's scaling factor is then the TGT divided by that array's actual average intensity, so it varies from array to array. Lower TGT values lead to lower scaling factors, and higher TGT values lead to higher scaling factors. In general, the scaling factor should be between 1 and 4, with the best data obtained when values are below 3.
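
The relationship between TGT, average intensity, and scaling factor can be sketched in Python as below. This is a simplified illustration: the function names are hypothetical, and the plain arithmetic mean is an assumption, since the text says only "average intensity" and the vendor software may compute the average differently.

```python
from statistics import mean


def scaling_factor(intensities, tgt=500.0):
    """Scaling factor that brings an array's average intensity to the TGT."""
    # Assumption: a plain arithmetic mean stands in for "average intensity".
    return tgt / mean(intensities)


def scale_array(intensities, tgt=500.0):
    """Multiply every intensity on the array by that array's scaling factor."""
    sf = scaling_factor(intensities, tgt)
    return [value * sf for value in intensities]


# Two toy arrays with different overall brightness.
chip_a = [100.0, 200.0, 300.0]  # average 200
chip_b = [200.0, 400.0, 600.0]  # average 400

print(scaling_factor(chip_a))     # 2.5
print(scaling_factor(chip_b))     # 1.25
print(mean(scale_array(chip_a)))  # 500.0
print(mean(scale_array(chip_b)))  # 500.0
```

After scaling, both toy arrays share the same average, which is what makes them comparable. The dimmer array needs the larger factor, and choosing a lower TGT lowers every factor proportionally.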

What steps does the PMGC take to minimize variability?

Microarray experiments are relatively complex, with numerous steps, reagents, and instruments involved. Despite this, it has been shown in many studies that the leading cause of variance is the facility performing the analysis, followed by the technician who completes the experiment. All of our technicians are highly experienced and rigorously trained. To minimize variability, each project is assigned a specific technician, who completes the entire project. Furthermore, wherever possible, we use arrays and reagents from the same lot, and we use the same equipment (hybridization oven, scanner, etc.) throughout the project, all in an effort to ensure the tightest possible data.

Often a project is large enough that it will require multiple days' worth of amplification, labelling, and hybridization to complete. In these cases we work with the customer to identify which samples are replicates, and we split these replicates across the various days so that any day-to-day variation can be controlled for.
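
One simple way to picture this kind of split is a round-robin assignment, sketched in Python below. This is only an illustration of the idea, not a description of the PMGC's actual scheduling procedure. The function name, sample IDs, and group labels are hypothetical, and a real design would likely also randomize the order within each day.

```python
from collections import defaultdict


def assign_days(samples, n_days):
    """Spread the replicates of each group across processing days.

    samples: list of (sample_id, group) tuples.
    Returns a dict mapping day number (starting at 1) to a list of sample IDs.
    """
    by_group = defaultdict(list)
    for sample_id, group in samples:
        by_group[group].append(sample_id)

    schedule = {day: [] for day in range(1, n_days + 1)}
    slot = 0
    for group in sorted(by_group):
        # Consecutive replicates of one group land on consecutive days.
        for sample_id in by_group[group]:
            schedule[slot % n_days + 1].append(sample_id)
            slot += 1
    return schedule


samples = [
    ("c1", "control"), ("c2", "control"), ("c3", "control"),
    ("t1", "treated"), ("t2", "treated"), ("t3", "treated"),
]
print(assign_days(samples, n_days=3))
# {1: ['c1', 't1'], 2: ['c2', 't2'], 3: ['c3', 't3']}
```

Because every day contains one sample from each group, a day effect does not coincide with the treatment effect.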

What is the difference between a technical replicate and a biological replicate? Which type is most useful and how many replicates do I need?

A biological replicate involves independent samples (multiple patients, multiple biopsies from an individual patient, etc.). RNA or DNA would be extracted from unique biopsies, blood from unique patients, or independent cell cultures (i.e. individual culture dishes). The purpose of a biological replicate is to assess and control for biological diversity.

A technical replicate involves splitting a sample at some point and continuing on with the two aliquots as separate samples through the rest of the protocol. So for example, a technical replicate might involve taking one RNA sample and performing two independent amplifications and labellings from that initial sample. Similarly, if a labelled sample was split onto two arrays, that would be another type of technical replicate. Technical replicates provide an indication of measurement (or technical) error and are useful for diagnosing problems with the protocol, but they offer little in the way of statistical power for a biological experiment.

The exact number of replicates required for an experiment is difficult to determine a priori without a proper power analysis. Such a power analysis is not always possible as it requires that you have an estimation of the overall variance, which you often do not have before you perform the experiment. We generally recommend doing as many biological replicates as your budget can accommodate. In general, it is good to have at least three biological replicates per condition. For a more detailed determination of the number of replicates required, please contact us as we will be happy to help you design your experiment.
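
For readers who do have a variance estimate, the Python sketch below shows what a basic power calculation looks like. It uses the textbook normal-approximation formula for comparing two group means for a single gene and ignores multiple-testing correction across the array, so it tends to give a lower bound rather than a recommendation. The function name and example numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def replicates_per_group(effect, sd, alpha=0.05, power=0.8):
    """Approximate replicates per condition for a two-group comparison.

    effect: difference in means you want to detect (e.g. in log2 units).
    sd: estimated standard deviation within a condition (same units).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / effect) ** 2
    return ceil(n)


# Detecting a two-fold change (1.0 on the log2 scale):
print(replicates_per_group(effect=1.0, sd=0.5))  # 4
print(replicates_per_group(effect=1.0, sd=1.0))  # 16
```

The two calls show how strongly the answer depends on the variance: doubling the standard deviation quadruples the number of replicates needed.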

What kind of data will I receive with the optional data analysis?

All customers running an Affymetrix experiment through the PMGC will receive the ".dat", ".cel", ".chp", and ".rpt" files.

Customers who opt for our additional data analysis service will also receive files from a quality control assessment, completed in R, as well as the appropriate statistics, heat-map, and fold-change analysis for their data set. For a more detailed answer, specific to the particulars of your experiment, please inquire with Carl Virtanen, our Bioinformatics Manager.