Diagnostic Signature Challenge

The ultimate goal of this Challenge was to verify if transcriptomics data contains enough information for the diagnosis and/or prognosis of four human diseases (psoriasis, multiple sclerosis, COPD, and lung cancer).
In addition, it was set up to:

  • Identify the best methods for particular data types
  • Determine how performance depends on the chosen method
  • Study whether the wisdom of crowds applies to diagnostic signatures
  • Study the overlap of genes in the signatures (when applicable)

BACKGROUND

The need for this challenge to infer clinical phenotype from genomics data in 2011

The situation back in 2011 when the challenge was launched was the following:

  • A few success stories of gene expression based biomarkers in clinical use:
    • MammaPrint (breast cancer recurrence assay, 70-gene profile, requires fresh tissue)
    • Oncotype Dx (breast cancer recurrence assay, 21-gene profile, works on both fresh and fixed tissue)
  • Counter-balanced by a few failure stories of gene expression based biomarkers in clinical use:
    • Potti et al, Nat Med (2006) claimed to identify genomic signatures for drug response. Three clinical trials began in 2007 and 2008 for lung and breast cancer. The research was later deemed statistically flawed, at least 10 high-profile publications were retracted, and the clinical trials were stopped.
    • Amgen scientists tried to confirm 53 landmark papers in pre-clinical oncology research: Only 6 (11%) were confirmed.
    • Bayer HealthCare reported that only about 25% of published preclinical studies could be validated.

Psoriasis

Psoriasis is the most prevalent autoimmune disease in the U.S.; according to current studies, as many as 7.5 million Americans (approximately 2.2 percent of the population) have psoriasis. It is a chronic inflammatory and hyperproliferative skin disease which, in addition to its cutaneous manifestations, is accompanied by inflammatory arthritis in up to 40% of cases.

The disease is diagnosed following physical examination of the skin lesions. Microscopic analysis of a psoriatic skin biopsy shows thick, red, flaky cells with no sign of inflammation, and blood tests can differentiate psoriasis from rheumatoid arthritis.

Psoriasis is typically treated with topical steroid and non-steroid treatments and with phototherapy (UVB, or UVA with light-sensitizing medication). There are also systemic medications and new drugs that target the autoimmune response and specific parts of the immune system (T cells, TNF, interleukin).


Clinical manifestation of psoriasis. (A) The red boxes show the most prevalent sites where psoriasis affects the skin. (B) Schematic view of the skin structure of a healthy and a psoriatic patient. Psoriatic skin shows signs of inflammation and scales (dead skin).

Multiple sclerosis

Multiple sclerosis (MS) is an autoimmune disease that affects the central nervous system. The trigger of the autoimmune process in MS is unknown. MS is believed to result from some combination of genetic, environmental and infectious factors, and possibly other factors such as vascular problems. Studies of identical twins have demonstrated a 30% concordance for developing MS, suggesting that genetic background has a relatively limited but significant role in triggering MS.

The symptoms of the disease result from inflammation, swelling, and lesions on the myelin.

There are a number of MS progression subtypes (see Figure 1): relapsing-remitting MS (RRMS), primary progressive MS (PPMS), and secondary progressive MS (SPMS). In 85% of patients, the disease has a relapsing-remitting (RR) course, which is characterized by the onset or deterioration of neurological symptoms (relapses), followed by partial or complete recovery (remissions).

As with most other autoimmune diseases, MS is significantly more common (at least 2-3 times) in women than in men. It is most commonly diagnosed between the ages of 20 and 50. The risk of developing MS in the general population is 1 in 750, and over 2.5 million people are living with the disease worldwide. The disease can be managed and its symptoms controlled, with varying degrees of success, through an individualized, multifaceted approach that includes medications and other therapies. However, there is no cure for multiple sclerosis.

Diagnosis by a neurologist usually involves ruling out other nervous system disorders with invasive and expensive tests such as lumbar puncture, Magnetic Resonance Imaging (MRI) brain scan and nerve function study.


Progression of the disease for clinically isolated syndromes and multiple sclerosis types. A clinically isolated syndrome (CIS) is an individual's first neurological episode, caused by inflammation or demyelination of nerve tissue. The diagnosis of multiple sclerosis is only possible after an MRI confirms lesions in the brain, which typically happens after multiple sites are affected (usually in the course of multiple events). The main forms of MS are distinguished by their different courses over time. RRMS is the most common form of MS. It describes patients who have relapses followed by periods of remission. For RRMS, a diagnosis of multiple sclerosis is made after a minimum of 2 relapses. PPMS patients have constant symptoms without remission. SPMS starts the same way as RRMS, but at some point there is no more remission.

COPD

COPD encompasses chronic obstructive bronchiolitis with obstruction of small airways and emphysema with enlargement of airspaces and destruction of lung parenchyma, loss of lung elasticity, and closure of small airways. Although the disease is manifested in the small airways, the challenge is to produce a COPD signature that is valid in large airways where sample collection is easier to perform.

COPD causes a progressive airflow limitation that is not fully reversible and is associated with abnormal inflammatory responses to noxious particles or gases [1]. COPD is a major cause of chronic morbidity and mortality throughout the world with its prevalence being variable across different countries and groups. In developed countries smoking is a contributing factor to the disease.

As of 2011, COPD treatment was still in the active research and development phase. Pharmacotherapy decreased symptoms and complications and included the use of long-acting bronchodilators and inhaled glucocorticosteroids. However, none of the existing medications offered a cure for, or prevention of, the long-term decline in lung function.


COPD is a disease that is manifested in the small airways. The challenge is to produce a COPD signature that is valid in large airways where sample collection is easier to perform.


COPD stages and symptoms

The Global Initiative for Chronic Obstructive Lung Disease (GOLD) classifies COPD patients into GOLD Stages 1-4 depending on the severity of disease (with GOLD Stage 4 being the most severe). Diagnosis is based on spirometry (a test that measures expiratory air flow) with or without a bronchodilator (to differentiate COPD from asthma) and on questionnaires related to respiratory symptoms.

Historically, a GOLD Stage 0 characterized a higher risk population who did not present the clearer symptoms used to describe stage 1. Since not all of these patients will eventually develop COPD, we did not include them in this challenge. In addition, subjects that suffer from alpha1-antitrypsin deficiency represent a unique group of COPD patients and were also excluded.

In summary, the COPD phenotype refers to GOLD stages 1-4, while Controls are subjects who have no consistent symptoms.

Lung cancer

In 2006, medical expenses from cancer care in the United States were an estimated $104.1 billion. As the population ages, costs are expected to continue to increase as cancer prevalence rises and expensive, targeted treatment strategies become the standard of care. According to the World Health Organization (WHO), between 2004 and 2030 global cancer deaths will increase from 7.4 million to 11.8 million, and cancer will be the leading cause of death, followed by heart disease and stroke.

Non-small cell lung cancer (NSCLC) accounts for approximately 85% of all lung cancers. NSCLC is divided into adenocarcinoma (AC), squamous cell carcinoma (SCC), and large cell carcinoma (LCC) histologies.

NSCLC stage is generally defined by the TNM system. The T category describes the original (primary) tumor – tumor size and whether it has spread to surrounding tissue. The N category signifies any lymph node involvement (in and around the lungs), and the M category indicates whether the cancer has spread to other parts of the body, i.e. metastasized.

According to the overall TNM staging, stage 1 lung cancer is small and localized to only one area of the lung. Stage 2 and 3 cancers are larger and may have grown into the surrounding tissues and there may be cancer cells in the lymph nodes. Stage 4 cancer has spread to another body part.


Lung cancer subtypes. (A) Distribution of lung cancer subtypes in a study, with smoking status at the time of diagnosis. The pie chart shows the distribution of the non-small cell lung cancer (NSCLC) subtypes, SCC (squamous cell lung cancer), AC (adenocarcinoma) and LCC (large cell lung cancer), and of small cell lung cancer (SCLC). The distribution of current (red) and former (green) smokers is shown as a histogram for each subtype. (B) Schematic of the tissues involved in squamous cell carcinoma and adenocarcinoma.

The challenge

Aim

The ultimate goal of this Challenge was to verify if transcriptomics data contains enough information for the diagnosis and/or prognosis of four human diseases (psoriasis, multiple sclerosis, COPD, and lung cancer).
In addition, it was set up to:

  • Identify the best methods for particular data types
  • Determine how performance depends on the chosen method
  • Study whether the wisdom of crowds applies to diagnostic signatures
  • Study the overlap of genes in the signatures (when applicable)

Challenge overview


Psoriasis Sub-challenge

The aim of this sub-challenge was to verify that a robust diagnostic signature for Psoriasis can be extracted from gene expression data.

Participants were asked to develop and then submit a classifier that can stratify skin samples into one of two phenotype groups - Psoriasis or Control. The classifier was built by using any publicly available gene expression data with their related clinical, demographic and batch information, and was tested on an independent dataset.
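
The train-on-public-data, test-on-independent-data setup described above can be illustrated with a minimal sketch. It assumes a nearest-centroid classifier, which is only one of many possible methods and is not claimed to be what any participant used; the expression values, the three "genes", and the function names `train_centroids` and `predict` are all hypothetical.

```python
from statistics import mean

def train_centroids(samples, labels):
    """Compute one mean expression profile (centroid) per phenotype."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        # Average each gene (column) over the samples of this class.
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, sample):
    """Assign the sample to the phenotype with the closest centroid."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], sample))

# Toy training data (made up): rows are samples, columns are hypothetical genes.
train_x = [[8.1, 2.0, 5.5], [7.9, 2.2, 5.1], [3.0, 6.1, 5.3], [3.2, 5.8, 5.0]]
train_y = ["Psoriasis", "Psoriasis", "Control", "Control"]

centroids = train_centroids(train_x, train_y)

# Independent test samples, never seen during training.
test_x = [[7.5, 2.5, 5.2], [2.8, 6.0, 5.4]]
predictions = [predict(centroids, s) for s in test_x]
print(predictions)
```

In a real setting the training and test sets would come from different studies, so normalization and batch effects would need to be handled before any such classifier is applied.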


Multiple sclerosis stage Sub-challenge

The aim of this sub-challenge was to verify that a robust diagnostic signature for different stages of relapsing-remitting multiple sclerosis (RRMS) patients can be extracted from gene expression data.

Participants were asked to develop and submit a classifier that can stratify MS patients into one of two phenotype groups, Relapsing RRMS or Remitting RRMS, based on the Peripheral Blood Mononuclear Cells (PBMC) transcriptome. The classifier was built by using publicly available gene expression data with clinical, demographic, and batch information, and was tested on an independent dataset.


Multiple sclerosis diagnostic Sub-challenge

The aim of this sub-challenge was to verify that a robust diagnostic signature for different types of multiple sclerosis (MS) patients can be extracted from gene expression data.

Participants were asked to develop and submit a classifier that can stratify subjects into one of two phenotype groups, relapsing-remitting multiple sclerosis (RRMS) or Control, based on the Peripheral Blood Mononuclear Cells (PBMC) transcriptome. The classifier was built by using publicly available gene expression data with clinical, demographic, and batch information, and was tested on an independent dataset.


COPD Sub-challenge

The aim of this sub-challenge was to identify a classifier that can distinguish between COPD and Control subjects in large airway tissue gene expression data. At the time, publicly available training data were derived from large airways and small airways, whereas test data consisted of large airway data only. While gene signatures are the typical components of classifiers from gene expression, we believe that there is room for exploration of other biologically interpretable signatures that go beyond over- or under-expressed genes.


Schematic diagram of the COPD challenge. The training data (blue outline) consist of data from large airways (green symbols) and small airways (orange symbols), whereas the test data (yellow outline) consist of large airway data only.


Lung cancer Sub-challenge

The aim of this sub-challenge was to classify Adenocarcinoma (AC) and Squamous Cell Carcinoma (SCC) and their respective stages (I & II) based on the transcriptome of tumor samples. While gene signatures are the typical components of classifiers from gene expression, we believe that there is room for exploration of other biologically interpretable signatures that go beyond over- or under-expressed genes.

Data


Psoriasis Sub-challenge

Each participant could find any suitable training data from publicly available repositories. For convenience, we included a list of third party publicly available datasets that participants may be able to use for training purposes:

  • GSE13355 - Normal skin: 58 samples, Lesional skin: 64 samples
  • GSE14905 - Normal skin: 21 samples, Lesional skin: 28 samples

Multiple sclerosis stage Sub-challenge

Each participant could find any suitable training data from publicly available repositories. For convenience, we included a list of third party publicly available datasets that participants may be able to use for training purposes:

  • GSE15245 - Relapsing RRMS: n/a, Remitting RRMS: 62 samples
  • GSE19224 - Relapsing RRMS: 14 samples, Remitting RRMS: 14 samples
  • E-MTAB-69 - Relapsing RRMS: 12 samples, Remitting RRMS: 14 samples

Multiple sclerosis diagnostic Sub-challenge

Each participant could find any suitable training data from publicly available repositories. For convenience, we included a list of third party publicly available datasets that participants may be able to use for training purposes:

  • GSE14895 - RRMS: n/a, Relapsing RRMS: n/a, Remitting RRMS: n/a, Control: 11 samples
  • GSE15245 - RRMS: n/a, Relapsing RRMS: n/a, Remitting RRMS: 62 samples, Control: n/a
  • GSE19224 - RRMS: n/a, Relapsing RRMS: 14 samples, Remitting RRMS: 14 samples, Control: n/a
  • GSE23832 - RRMS: 4 samples, Relapsing RRMS: n/a, Remitting RRMS: n/a, Control: 4 samples
  • GSE24427 - RRMS: 50 samples, Relapsing RRMS: n/a, Remitting RRMS: n/a, Control: n/a
  • GSE21942 - RRMS: n/a, Relapsing RRMS: n/a, Remitting RRMS: n/a, Control: 15 samples
  • GSE26104 - RRMS: 8 samples, Relapsing RRMS: n/a, Remitting RRMS: n/a, Control: n/a
  • E-MTAB-69 - RRMS: n/a, Relapsing RRMS: 12 samples, Remitting RRMS: 14 samples, Control: 18 samples

COPD Sub-challenge

Training data could be obtained from any publicly available source.


Lung cancer Sub-challenge

Training data could be obtained from any publicly available source.

Rules and awards

Rules of the challenge can be viewed here.

Challenge outcome

  • There is no one-size-fits-all method for classifying disease:
    • No single normalization method conferred a performance advantage
    • No single classification method conferred a performance advantage
    • The specifics of the methodology used to classify disease seem to be decisive in extracting signal from the data
  • If the signal is strong, most methods will get the classification right, as was the case with Psoriasis.
  • We can determine that the signal is too weak or nonexistent by finding that no prediction attained statistical significance, as was the case with MS Stages.
  • When the signal is faint, the method used can be decisive. Crowd-sourcing is particularly relevant in these cases (COPD).
  • The advantage of having many participants can be offset by the multiple testing problem that ensues
  • The wisdom of crowds enhances performance, at least from the perspective of one of the performance metrics
  • It is important to withhold the test set data from the participants to better represent the situation at the clinic
  • Many of these lessons learned are consistent with the conclusions reached in the MAQC-II study (2010) to be discussed in a forthcoming session
  • An open source software package was provided to the community that allows researchers worldwide to develop prediction models starting with raw microarray data.
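
The wisdom-of-crowds point in the list above can be illustrated with a toy sketch. It assumes the crowd prediction is the per-sample mean of the teams' confidence values; this is one simple aggregation scheme and is not claimed to be the one used in the challenge. The team names, confidences, and labels are made up, and were chosen so that the teams make their mistakes on different samples.

```python
from statistics import mean

def accuracy(confidences, truth, threshold=0.5):
    """Fraction of samples whose thresholded confidence matches the true label."""
    calls = [1 if c >= threshold else 0 for c in confidences]
    return sum(c == t for c, t in zip(calls, truth)) / len(truth)

# True labels for six test samples (1 = disease, 0 = control); toy values.
truth = [1, 1, 1, 0, 0, 0]

# Confidence that each sample is diseased, from three hypothetical teams.
team_confidences = {
    "team_a": [0.9, 0.4, 0.7, 0.2, 0.3, 0.6],
    "team_b": [0.8, 0.7, 0.3, 0.1, 0.6, 0.2],
    "team_c": [0.4, 0.8, 0.9, 0.6, 0.2, 0.3],
}

# Crowd prediction: per-sample mean of the team confidences.
crowd = [mean(col) for col in zip(*team_confidences.values())]

individual = {name: accuracy(c, truth) for name, c in team_confidences.items()}
crowd_accuracy = accuracy(crowd, truth)

print(individual)
print(crowd_accuracy)
```

The aggregate only helps here because the individual errors are uncorrelated; if every team misclassified the same samples, averaging would not recover them.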

Challenge participants

Over 54 teams worldwide participated in the challenge. Participants per country are shown on the map below.


Teams could choose to participate in one or more subchallenges. The distribution of teams submitting their predictions per subchallenge is given below.

[Figure: distribution of teams submitting predictions per subchallenge]

Scoring and ranking

Scoring

Scoring Review Panel

  • Richard A. Bonneau, New York University
  • Alberto de la Fuente, CRS4 Bioinformatica
  • Igor Jurisica, University of Toronto
  • Daniel Marbach, MIT, Computational Biology Group
  • Tamir Tuller, Tel Aviv University

IBM Scoring Team:

  • Raquel Norel
  • Erhan Bilal
  • Gustavo Stolovitzky

Rationale behind the chosen scoring methodology

  • Basic premise: no single metric can capture all the subtleties of a prediction.
  • We used non-redundant metrics that highlight different qualities of a prediction
    • Threshold-based versus non-threshold-based
    • Order-based versus confidence-based
    • Different ways of rewarding correct versus incorrect predictions
  • All metrics must be generalizable to multi-class problems to accommodate the lung cancer sub-challenge.
  • A metric should avoid rewarding pathological cases (e.g., predicting all subjects to be controls).
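
The distinctions in the list above can be made concrete with a small sketch. The three metrics below (balanced accuracy, area under the ROC curve, and the Brier score) are illustrative stand-ins chosen here as assumptions; they are not claimed to be the metrics used in the challenge. The labels and confidence values are made up.

```python
def balanced_accuracy(conf, truth, threshold=0.5):
    """Threshold-based: mean of per-class recall, so majority-class guessing scores 0.5."""
    calls = [1 if c >= threshold else 0 for c in conf]
    recalls = []
    for cls in (0, 1):
        idx = [i for i, t in enumerate(truth) if t == cls]
        recalls.append(sum(calls[i] == cls for i in idx) / len(idx))
    return sum(recalls) / 2

def auroc(conf, truth):
    """Order-based: probability that a random case is ranked above a random control."""
    pos = [c for c, t in zip(conf, truth) if t == 1]
    neg = [c for c, t in zip(conf, truth) if t == 0]
    # Ties between a case and a control count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier(conf, truth):
    """Confidence-based: mean squared error of the stated confidences (lower is better)."""
    return sum((c - t) ** 2 for c, t in zip(conf, truth)) / len(truth)

truth = [1, 0, 0, 0, 0]                  # toy, imbalanced: 1 case, 4 controls
all_control = [0.0] * 5                  # pathological: every subject called control
informative = [0.8, 0.3, 0.1, 0.4, 0.2]  # ranks the single case first

# Plain accuracy rewards the pathological predictor on imbalanced data.
plain_accuracy = sum((c >= 0.5) == bool(t) for c, t in zip(all_control, truth)) / len(truth)

for name, conf in [("all_control", all_control), ("informative", informative)]:
    print(name, balanced_accuracy(conf, truth), auroc(conf, truth), brier(conf, truth))
```

On this toy data the all-control predictor reaches a plain accuracy of 0.8 but only chance level (0.5) on balanced accuracy and AUROC, which is the kind of pathological case the last bullet refers to.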

Ranking

The Scoring Review Panel reviewed and approved the scoring methodology and procedures before the challenge closed, as well as the scoring results and final ranking below:


Overall best performing teams

  1. Team 221: PRB
    Team Members: Adi L. Tarca & Roberto Romero
    Institution: Wayne State University, Detroit, USA
  2. Team 227: COSBI
    Team Member: Mario Lauria
    Institution: Computational Systems Biology, Rovereto, Italy
  3. Team 161: BISON
    Team Members: M. Unger, P. Nandy, K.K. Dey, C. Zechner & H. Koeppl
    Institution: ETH, Zurich, Switzerland

Best performers announcement as published in Nature, 24 Jan. 2013, page 565

Final full ranking:

[Figure: final full ranking]

Challenge symposium

The Diagnostic Signature Challenge: Smarter Algorithms for better Disease Detection Symposium was successfully conducted at the Omni Parker House Hotel in Boston, MA, USA, on 2 – 3 October 2012. The event included lectures, presentations by the best performers in the challenge, and social events and networking.

The objectives of the symposium were:

  • to discuss and share experiences on sbv IMPROVER and the Diagnostic Signature Challenge
  • to engage with experts in the fields of systems biology, crowd-sourcing and related topics
  • to announce the best performing teams
  • for the award winners to share their approaches with the scientific community.

Further details regarding the sbv IMPROVER Symposium 2012 can be found in the following links:

Media library

Scientific publications

The challenge in the news

  • Asia Biotech, Dec 2013: Challenged TO Improve
  • Drug Discovery News, Dec 2012: Room for IMPROVER
  • nanowerk, Nov 2012: An international competition reaffirms the potential of bioinformatics in the diagnosis of disease
  • BioITWorld, Oct 2012: IMPROVER-ing Data Verification for Systems Biology
  • GenomeWeb, Oct 2012: Philip Morris International, IBM Launch Industry-Focused Systems Biology Verification Challenge
  • IlDenaro, Oct 2012: Olympics of bioinformatics triumphs a researcher Naples
  • GenomeWeb, Mar 2012: Philip Morris International, IBM Launch Industry-Focused Systems Biology Verification Challenge

Tutorials and webinars

Flyers and posters

Testimonials

What they say about the challenge