| Frequently Asked Questions |
| |
| Table of Contents |
| General definitions: 1-5 |
| Sample related: 6-10 |
| Platform related: 11-30 |
| Study related: 31-33 |
| |
| |
| General definitions: 1-5 |
| |
1 |
What is LLOQ? This is the lower limit of quantitation. It is the point at which the response curve begins to flatten at the lower left. |
2 |
What is the LLOD? This is the lower limit of detection. It is the lowest point on the response curve. Peptides at this point can be "called" as being present or absent, but cannot be used for fold-change information. |
3 |
What is ULOQ? This is the upper limit of quantitation. It is the point at which the response curve begins to flatten at the upper right. |
4 |
What is the ULOD? This is the upper limit of detection. It is the highest point on the response curve. Peptides at this point can be "called" as being present or absent, but cannot be used for fold-change information. |
5 |
What is MRM? Multiple reaction monitoring (Clinical Chemistry 46, No. 2, 2000, p. 279). |
|
|
| Sample related: 6-10 |
| |
|
6 |
What type of biological protein samples can be used as the basis for LC/MS based protein expression profiling, e.g. plasma, urine, saliva, bronchoalveolar lavages, lysates from leukocytes, detergent lysates from tissues? All of the above. |
7 |
If sample analysis is done from frozen stock, how does the peptide/intensity profile compare to various storage time points, e.g. day 0, day 7, day 28, day 365? The same protein would have to degrade at a different rate in separate samples for false positives to result. Since all samples are treated in exactly the same way once they arrive, and because our platform only measures relative intensity between samples, protein degradation does not create false positive results. However, protein degradation may generate false negative results when a given protein degrades below our lower limit of detection in two or more samples simultaneously. |
8 |
How do you track samples and ensure quality data (chain of custody)? We use a barcode-based specimen/data tracking system through Laboratory Information Management Systems (LIMS) which includes over 20 critical SOPs including QA SOPs (QA Responsibility, Discrepancy Management, Change Control & Training Procedure and Documentation), Sample SOPs (receiving, sample re-identification, processing, & destruction), and Raw data lock-down (third party). We also have a dedicated and full time QA/QC compliance officer and we are open to full auditing for compliance assurance. |
9 |
From the time of delivery of samples, when do you start the analyses? The samples must be logged into the LIMS, by SOP, within 3 hours of receipt on-site. Laboratory activities typically begin within 72 hours, depending on workload in the laboratories, available instrument time, and other variables. We accommodate most customer needs with sufficient notice and planning. |
10 |
What is the robustness of peptide detection from biological samples? For example, how will EDTA, citrate, heparin, or LMWH affect the results? While the HUPO PPP is actively researching the effects of these variables using a variety of cataloging methods, they are years away from understanding the effect of each of these variables on every protein in plasma at every point in time. For this reason, we do not recommend making comparisons across these conditions, e.g. disease patients in EDTA tubes versus normals in serum tubes. We have standardized the collection protocol using the BD P100 Vacutainers (P/N: #366455), which were developed by Becton Dickinson specifically for plasma profiling through a collaboration with Bristol-Myers Squibb and Tecan Munich (Dr. Christoph Eckerskorn). |
|
|
| Platform related: 11-30 |
| |
|
11 |
Are SOPs available for sample generation, shipment, storage and protein determination? SOPs are made available on site for audit/inspection by our partners as part of their due diligence on our chain of custody. Protocols adapted from the SOPs are available that describe the generation and storage of plasma and cell isolates, e.g. tissues, culture, PBMCs, etc. For protein determination, we use the BCA method and follow the manufacturer's instructions. |
12 |
Which pre-fractionation technology is used to deplete highly abundant plasma proteins? The Agilent MARS® is used according to the manufacturer's instructions with only slight modifications. It is the industry standard and has been extensively researched, tested, and used across many organizations. http://www.chem.agilent.com/Scripts/Generic.ASP?lPage=10130&indcol=Y&prodcol=N The depleted proteins are albumin, haptoglobin, alpha-1-antitrypsin, IgK, IgG, and transferrin. Total depletion is ~87%, so a 50 mg/mL plasma sample yields approximately 7 mg/mL. The other aspects of sample preparation have been described in the peer-reviewed literature: J. Hulmes, D. Bethea, K. Ho, S. Huang, D. Ricci, S. A. Hefta, and G. J. Opiteck, "An Investigation of Plasma Collection, Stabilization and Storage Procedures for Proteomic Analysis of Clinical Samples", Clinical Proteomics, 1 (1), 17-32 (2004). |
13 |
Can you see every protein down to the LLOD or LLOQ? No. We cannot predict de novo if a given protein is stable through freezing, doesn't deplete, does digest, desalts well, chromatographs well, ionizes well, or elutes in an area of excess peak capacity. Plasma profiling is a "wide angle lens" designed to survey the plasma and identify proteins that change in abundance. If you know the protein for which you search, ELISA or MRM should be used since they will deliver better results. |
14 |
What is the variability of peptide detection, e.g. repetition of data across multiple injections of the same biological sample? 94% of peptides are found in greater than 90% of the injections with a median peptide matching rate of 98% and a median coefficient of variation of peptide intensity lower than 12%. This is a clear example of the robustness of the analytical platform. |
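The two reproducibility figures quoted above (fraction of peptides found in more than 90% of injections, and median coefficient of variation of intensity) can be computed along the following lines. This is a minimal sketch, not our production code: the function names, the table layout (peptide id mapped to one intensity per injection, `None` where the peptide was not matched), and the example values are all hypothetical.

```python
from statistics import mean, median, stdev

def peptide_cv(intensities):
    """Coefficient of variation (sample SD / mean) of one peptide across injections."""
    return stdev(intensities) / mean(intensities)

def summarize_replicates(table, min_detect_fraction=0.9):
    """Summarize replicate injections of the same sample.

    table: dict mapping a peptide id to a list of intensities, one per
           injection, with None where the peptide was not matched
           (hypothetical layout chosen for this sketch).
    """
    n_reproducible = 0
    cvs = []
    for values in table.values():
        detected = [v for v in values if v is not None]
        # Count peptides found in more than the required fraction of injections.
        if len(detected) / len(values) > min_detect_fraction:
            n_reproducible += 1
        # A CV needs at least two measurements.
        if len(detected) >= 2:
            cvs.append(peptide_cv(detected))
    return {
        "fraction_reproducible": n_reproducible / len(table),
        "median_cv": median(cvs),
    }

# Hypothetical intensities for three peptides across four injections.
replicates = {
    "pep_A": [100.0, 110.0, 90.0, 100.0],
    "pep_B": [50.0, 55.0, None, 45.0],
    "pep_C": [200.0, 220.0, 180.0, 200.0],
}
summary = summarize_replicates(replicates)
print(summary)
```

The sample standard deviation is used here because the injections are treated as a sample of the platform's behaviour; a population SD would give slightly smaller CVs.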
15 |
How are the peptides generated from the biological sample? The samples are digested in 96-well plates under denaturing conditions with LysC and subsequently under non-denaturing conditions with trypsin, a modified version of the method popularized by Professor Yates (Nature Biotechnology, 2003, 21 (5), 508-510). |
16 |
Since the peptides are tracked and presented in a two-dimensional plot (HPLC retention time vs. mass to charge), how do the retention times compare using the same column at different times or across different columns coupled to different mass spectrometers? Typically there is less than an 8-second drift (mean +/- SD) in retention time over the course of a 30-day run because we use 500 µm diameter commercial HPLC columns, run the HPLC systems near the top of their flow rate range, and correct for minor variations. All columns are checked to ensure their performance before the beginning of any runs. We also have proprietary software (Constellation Mapping™) that aligns the two-dimensional plots on a global basis, using hundreds of "landmarks". |
17 |
For complex biological samples, is it possible to separate the samples by 2D-chromatography (e.g. reversed phase and ion exchange) before mass spectrometric analysis, thereby creating a virtual 3D map? Yes, this is our standard process. We use eight strong cation exchange fractions, each of which is separated by a 70-minute HPLC separation. However, more or fewer fractions can be used, as needed. |
18 |
What is the sensitivity of peptide detection in a plasma sample (concentration range)? The QTOF Ultimas that we use have standard sources, standard columns, and traditional formic acid mobile phases. Since electrospray ionization is a concentration sensitive technique, we have a lower limit of detection (LLOD) the same as in any analytical lab. To this end, our absolute LLOD is 8 fmol when using an intact protein molecular weight of 10 kDa. Our 2D LC method analyzes the equivalent of 25 uL of raw plasma, so using the LLOD as 8 fmol absolute, we have an in vivo concentration LLOD of 8 fmol/25 uL, = 0.32 fmol/uL = 32 pg/uL = 32 ng/mL. |
19 |
What is the dynamic range for peptides detected by this method? Using a 10 kDa protein as a surrogate biomarker protein, our dynamic range for detection spans 6 orders of magnitude. Our upper limit of detection (ULOD) is 50 ug/uL and our lower limit of detection (LLOD) is 32 pg/uL. Our linear dynamic range for quantitation spans 4 orders of magnitude, in that our upper limit of quantitation (ULOQ) is 2 ug/uL and our lower limit of quantitation (LLOQ) is 200 pg/uL. |
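The dynamic range figures above follow from taking the base-10 logarithm of the ratio of the upper to the lower limit, once both are expressed in the same unit. A short sketch of that arithmetic, using the limits quoted in the answer (the function and constant names are ours, introduced only for illustration):

```python
import math

PG_PER_UG = 1e6  # picograms per microgram

def orders_of_magnitude(upper, lower):
    """Dynamic range as log10(upper / lower); both limits must share a unit."""
    return math.log10(upper / lower)

# Detection: ULOD 50 ug/uL versus LLOD 32 pg/uL, both converted to pg/uL.
detection = orders_of_magnitude(50 * PG_PER_UG, 32)

# Quantitation: ULOQ 2 ug/uL versus LLOQ 200 pg/uL, both converted to pg/uL.
quantitation = orders_of_magnitude(2 * PG_PER_UG, 200)

print(f"detection range:    {detection:.2f} orders of magnitude")
print(f"quantitation range: {quantitation:.2f} orders of magnitude")
```

The detection range comes out slightly above 6 and the quantitation range at 4, matching the rounded figures in the answer.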
20 |
How is the mass spectrometer's peptide signal, which is given in relative intensity units, normalized? Is an internal reference used? Two forms of normalization are conducted. First, an individual injection is normalized by standardizing its peptide intensity distribution back to the norm via transformations. The second normalization determines if the ratio of two patient injections is excessively skewed. This is detected by statistical tests, and the distribution of ratios is then repositioned through a multiplicative constant. This is rarely necessary, since the first normalization deals with most situations. Normalization of mass spec data does not appear to be as significant a process step as it is in genomics, probably because the analytical platform (HPLC/MS) is much older and better controlled than the still-evolving cDNA arrays. |
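To make the two-step idea concrete, here is a rough sketch under stated assumptions. The answer above does not specify the transformations or the statistical tests, so this example swaps in simple stand-ins: median rescaling for the first step, and a fixed tolerance on the median log2 ratio in place of a formal test for the second. All function names and values are hypothetical.

```python
import math
from statistics import median

def normalize_injection(intensities, target_median):
    """Step 1 (assumed form): rescale one injection so its median intensity
    matches a reference median."""
    factor = target_median / median(intensities)
    return [x * factor for x in intensities]

def recenter_ratios(sample_a, sample_b, tolerance=0.1):
    """Step 2 (assumed form): compute per-peptide ratios between two injections.
    If the median log2 ratio is off-center by more than `tolerance` (a stand-in
    for a statistical test), multiply every ratio by one constant so that the
    median ratio becomes 1. Returns (ratios, constant_applied)."""
    ratios = [a / b for a, b in zip(sample_a, sample_b)]
    median_log2 = median(math.log2(r) for r in ratios)
    if abs(median_log2) <= tolerance:
        return ratios, 1.0
    constant = 2.0 ** (-median_log2)
    return [r * constant for r in ratios], constant

# Hypothetical matched peptide intensities from two injections.
injection_a = [200.0, 400.0, 800.0]
injection_b = [100.0, 200.0, 400.0]

rescaled = normalize_injection([10.0, 20.0, 30.0], target_median=40.0)
ratios, constant = recenter_ratios(injection_a, injection_b)
print(rescaled, ratios, constant)
```

A single multiplicative constant shifts the whole ratio distribution without changing its shape, which is why it preserves the relative differences between peptides.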
21 |
How does the quantification method used by you compare to commercially available techniques, like ICAT or iTRAQ (Applied Biosystems)? ICAT, iTRAQ, etc. are two or four channel methodologies that limit the user to two or four parallel comparisons at a time, which only works in the simplest of cases (typically cell lines), and doesn't work in clinical trials where there are multiple comparisons to be made (doses, time points, patients, cohorts, etc.). On the other hand, single channel methods, such as RNA expression profiling, have been popularized by many groups (e.g. Affymetrix, Pat Brown), and single channel methodologies in general have emerged as being preferable to 2/4 channel techniques. This is why we also use an unlabelled / single channel methodology (LC/MS). It is also important to remember that quantitative LC/MS is used in many areas beyond protein expression profiling, e.g. pharmacokinetics, environmental testing, semi-conductors, etc. |
22 |
I know that LC/MS shows a linear response between signal and abundance in all other areas where it is used, e.g. pharmacokinetics, environmental testing, semi-conductors, etc., but does that linear response also hold for digested proteins (peptides)? Yes. Many experiments have demonstrated that peptide intensity is proportional to abundance. Moreover, experiments have demonstrated that differential peptide abundance is proportional to the differential abundance of the protein precursor. |
23 |
Can you provide us with examples of raw data (Mascot results, MS/MS spectra, LC-MS data, threshold identification, blast search)? Yes, provided a CDA is in place. |
24 |
Do you have data that shows protein expression profiling by mass spectrometry agrees with protein expression data generated by an orthogonal assay? We have used plasma levels of hemoglobin to show abundance/intensity correlations between mass spec and an assay. Hemoglobin was chosen because the assay was readily available and could be used out of the box. Moreover, there were no IP restrictions preventing us from using the assay to demonstrate our analytical platform. In these data, hemoglobin was readily detected in all samples under all conditions by the assay; in addition, multiple hemoglobin peptides were tracked by mass spec across all samples under all conditions, making it an ideal demonstration. |
25 |
Do you take into account the information provided by the BLAST search when calculating your % protein coverage? Yes, but we go one step further and use the "mass-spec-able" sequence in the denominator. In other words, we weight each of the peptides that are predicted to be present based on an in silico digest of the full-length protein, e.g. negatively for Cys and Met, positively for His, Lys, and Arg. This provides a more accurate view of the situation, since a mass spectrometer cannot "see" all peptides equally well. |
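As an illustration of a weighted coverage calculation of this kind, the sketch below digests a sequence in silico, weights each predicted peptide by its residue content, and divides the weighted length of the observed peptides by the weighted length of all predicted peptides. Only the direction of each adjustment comes from the answer above; the weight values, the cleavage rule, the function names, and the example sequence are assumptions made for this example.

```python
import re

# Hypothetical weighting constants; only the sign of each adjustment is
# taken from the text, the magnitudes are invented for illustration.
PENALTY = 0.25  # per Cys or Met residue
BONUS = 0.25    # per His, Lys, or Arg residue

def tryptic_digest(sequence):
    """In silico digest: cleave after K or R unless the next residue is P
    (a common simplification). Splitting on an empty match needs Python 3.7+."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]

def peptide_weight(peptide):
    """Weight a predicted peptide up for His/Lys/Arg and down for Cys/Met."""
    bonus = sum(peptide.count(aa) for aa in "HKR") * BONUS
    penalty = sum(peptide.count(aa) for aa in "CM") * PENALTY
    return max(0.0, 1.0 + bonus - penalty)

def weighted_coverage(sequence, observed_peptides):
    """Weighted residues observed divided by weighted residues predicted."""
    predicted = tryptic_digest(sequence)
    total = sum(len(p) * peptide_weight(p) for p in predicted)
    seen = sum(len(p) * peptide_weight(p)
               for p in predicted if p in observed_peptides)
    return seen / total if total else 0.0

# Hypothetical eight-residue sequence with one observed peptide.
peptides = tryptic_digest("MKHCRAAK")
coverage = weighted_coverage("MKHCRAAK", {"HCR"})
print(peptides, round(coverage, 3))
```

With these invented weights the observed peptide counts for more than its raw share of residues (3 of 8), because the unobserved Met-containing peptide is discounted in the denominator.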
26 |
Are there examples of proteins present with lower abundance in the biological sample (e.g. ng/mL range) and detected by your analytical methodology? In a recent analysis of human clinical plasma samples, we identified a cytokine and an interleukin. The confidence scores were such that there was less than a 1 in 100 chance, given the database size and the accuracy of mass and retention time, that these were false positives. The presence of these proteins was confirmed as being biologically consistent in the context of the study in which they were discovered. We surmise that these were observed because the peptides fell in an uncluttered region of RT vs. m/z space and ionized well because of their high pI. We routinely see tissue leakage proteins (ferritin, RANTES, tPA) in the ng/mL range. It is difficult to model or predict the behaviour of a particular protein, so we cannot guarantee that every protein above a certain level will always be measurable. |
27 |
Is there a score which takes protein sequence coverage into consideration? Yes, we use the Mascot protein score. We typically see 3 peptides per protein by MS/MS and we use a Mascot peptide threshold of 25. We extend coverage of differentially expressed proteins via peptide mass fingerprinting, which provides even more confidence in the MS/MS "sequencing" data. |
28 |
How many proteins can be visualized/identified in a given proteome? It's important that you understand that we do not catalogue the proteins in a sample, precisely because it is impossible to separate the true positives from the false positives in those types of experiments. We focus our protein identification on the differentially expressed proteins only, and of those, only those that are statistically significant and thereby of value. In our protein expression profiling experiments, we track ~30,000 peptides across a given plasma sample when we use eight SCX fractions. In theory, this could represent as many as 10,000 proteins, but is probably more like 1,000-3,000. |
29 |
What is the general complexity of a routine analysis? In our protein expression profiling experiments, we track ~35,000 peptides in each plasma sample. Every experiment is different, but in general, between 1-5% of these (300-1500) peptides may be differentially expressed in a statistically significant manner (p < 0.05), neglecting species, age, phlebotomy, medications, concurrent disease, etc. We sequence ~70% of the targeted peptides, leading to the identification of 400 non-redundant proteins, which is usually still too many proteins for most collaborators to follow up on. Using more patients and better controlled trials usually rectifies this by reducing the likelihood of false positives and better focusing the sequencing resources onto those peptides of the highest value. |
30 |
If you were to try to catalogue a human plasma sample via iterative data dependent acquisition (DDA), how many proteins would be identified? In a given 2D LC run of eight SCX fractions, there are 35,000 peaks that would need to be "sequenced". However, since the quadrupoles used across the industry for this activity (DDA) are not capable of isolating every peptide across this type of sample, MS/MS is not capable of identifying all these peaks. On average, the highest abundance peptides will overwhelm the low abundance peptides, leading one to conclude that only high abundance peptides are present. |
|
|
| Study related: 31-33 |
| |
|
31 |
What is your experience in study size and design, e.g. patient numbers, stratification strategy, number of recommended sample replicates? We consider biomarker discovery to be very much akin to drug discovery. We usually start with small 50-person "Phase I" studies to get a power analysis based on platform variation, estimated biological variation and the desired level of protein differential abundance a collaborator wants to measure. We then move on to "Phase II" studies of ~200 patients to refine the candidates and variables (analytical and biological) and possibly validate the Phase I data using specific assays. Ultimately, a well-designed "Phase III" study of >500 patients is usually necessary to truly determine the utility of the markers in a clinical population. These large studies are usually a combination of learning and testing, with MRM and ELISA assays being used to assess previously discovered markers, and CellCarta® being used alongside to find new biomarkers, e.g. idiosyncratic effects. In any phase of clinical trial, in any one cohort, one must always rely on the "Rule of Large Numbers", meaning >30 samples per cohort after dropouts, to enable parametric analysis. |
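A power analysis of the kind mentioned above combines the three inputs named in the answer: platform variation, biological variation, and the fold change to be detected. The sketch below uses the textbook two-sample normal approximation on log-transformed intensities; it is one common way to do this, not necessarily the procedure we apply in a given study. The function name, the default alpha and power, and the example CVs are assumptions.

```python
import math
from statistics import NormalDist

def samples_per_cohort(fold_change, cv_platform, cv_biological,
                       alpha=0.05, power=0.80):
    """Approximate samples needed per cohort for a two-sided, two-sample
    comparison of log intensities (normal approximation).

    Assumes intensities are roughly log-normal, so each CV maps to a
    log-scale variance of ln(1 + CV^2), and that platform and biological
    variation are independent, so their variances add.
    """
    variance = math.log(1 + cv_platform ** 2) + math.log(1 + cv_biological ** 2)
    delta = math.log(fold_change)  # effect size on the natural-log scale
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * variance * (z_alpha + z_beta) ** 2 / delta ** 2
    return math.ceil(n)

# Hypothetical inputs: 1.5-fold change, 12% platform CV, 40% biological CV.
n_needed = samples_per_cohort(1.5, cv_platform=0.12, cv_biological=0.40)
print(n_needed)
```

This estimate ignores dropouts and multiple-comparison corrections across thousands of peptides, both of which push the required cohort size upward.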
32 |
How long does a typical study require? Phase I = 16 weeks, Phase II = 24 weeks, Phase III = 32 weeks. This does not include sample accrual time. |
33 |
Can you catalog samples? In other words, can you identify "every" protein in a sample? Yes, but this is an extremely hazardous activity because there is no control over false positives and false negatives. These experiments require large amounts of protein and a good deal more time. This is not usually a clinical activity, but lies in the realm of small, pre-clinical discovery projects. |
|
|