Participation Guidelines

iDPP CLEF 2024

Tasks

iDPP CLEF 2024 offers two evaluation tasks focused on predicting the progression of Amyotrophic Lateral Sclerosis (ALS) from prospective data, and one task on the impact of exposure to pollutants on predicting the progression of Multiple Sclerosis (MS) from retrospective data.

For Tasks 1 and 2 on ALS, participants are given prospective patient data collected over an average of nine months via a dedicated app developed by the BRAINTEASER project, together with sensor data collected from a fitness smartwatch in the context of clinical trials in Turin, Pavia, Lisbon, and Madrid. All data are fully anonymized.

For Task 3 on MS, participants are given a retrospective dataset containing roughly 1.5 years of visits and environmental data. This dataset comes from two clinical institutions, one in Pavia, Italy, and the other in Turin, Italy, and it contains data about real patients, fully anonymized.

All the datasets are highly curated and are produced according to the BRAINTEASER Ontology (BTO), developed by the BRAINTEASER project, which ensures the consistency of the represented data. Moreover, several checks have been performed to ensure that all the instances are clean, contain values in the expected ranges, and are free of contradictions.

Task 1 - Predicting ALSFRS-R Score from Sensor Data (ALS)

This task focuses on predicting the twelve sub-scores of the ALSFRS-R (ALS Functional Rating Scale - Revised), assigned by medical doctors roughly every three months, from the sensor data collected via the app. The ALSFRS-R is a somewhat “subjective” evaluation usually performed by a medical doctor, so this task will help answer a currently open question in the research community: whether the score can be derived from objective factors.

Participants will be given the ALSFRS-R questionnaire at the first visit with the scores for each question together with the time (number of days from diagnosis) at which the questionnaire was taken.

Participants will have to predict the values of the ALSFRS-R sub-scores at the second visit.

Participants will be given the time of the second visit (number of days from diagnosis) together with all the sensor data up to the time of the second visit.

The training data are available upon registration for the challenge and the test data will be available one week before the run submission deadline. Please see the Datasets and Important Dates sections for more information.

Task 2 - Predicting Patient Self-assessment Score from Sensor Data

This task focuses on predicting the self-assessment scores assigned by patients from the sensor data collected via the app. Self-assessment scores correspond to the ALSFRS-R scores but, while the latter are assigned by medical doctors during visits, the former are assigned by the patients themselves via the provided app.

If patient self-assessments, which are performed more frequently than the roughly three-monthly assessments by medical doctors, can be reliably predicted from sensor and app data, one can imagine a proactive application that monitors the sensor data and alerts the patient when an assessment is needed.

Participants will be given the first set of self-assessed scores together with the time (number of days from diagnosis) at which the questionnaire was taken.

Participants will have to predict the values of the self-assessed scores at the second auto-evaluation, happening one or two months after the first one.

Participants will be given the time of the second auto-evaluation (number of days from diagnosis) together with all the sensor data up to the time of the second auto-evaluation.

The training data are available upon registration for the challenge and the test data will be available one week before the run submission deadline. Please see the Datasets and Important Dates sections for more information.

Task 3 - Predicting Relapses from EDSS Sub-scores and Environmental Data (MS)

This task focuses on predicting a relapse using environmental data and EDSS (Expanded Disability Status Scale) sub-scores. It allows us to assess whether exposure to different pollutants is a useful variable for predicting a relapse.

Given the status of the patient at the baseline, i.e. the first visit available in the considered time span, participants will be asked to predict the week of the first relapse after the baseline, using environmental data at weekly granularity. For each patient, the date of the baseline is week 0 and all other weeks are relative to it.

Participants will be given all the observations and environmental data about a patient, including observations that occur after the relapse to be predicted. All patients are guaranteed to experience at least one relapse after the baseline.
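As an illustration of the week-0 convention described above, the mapping from calendar dates to baseline-relative weeks can be sketched as follows. The function and the date values are hypothetical, since the released data already express times in weeks:

```python
from datetime import date

def weeks_from_baseline(baseline_date: date, event_date: date) -> int:
    """Week index of event_date relative to the baseline visit (week 0)."""
    return (event_date - baseline_date).days // 7

# Hypothetical example: an event 75 days after the baseline falls in week 10.
print(weeks_from_baseline(date(2015, 1, 1), date(2015, 3, 17)))  # 10
```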

The training data are available upon registration for the challenge and the test data will be available one week before the run submission deadline. Please see the Datasets and Important Dates sections for more information.

Participating

To participate in iDPP CLEF 2024, groups need to register at the following link:

Register

Important Dates

Datasets

Tasks 1 & 2 Datasets (ALS)

Tasks 1 and 2 share the same training dataset, comprising the following data:

All times are expressed in days between the diagnosis and the collection. Note that sensor data is not available for every day and may contain gaps due to lack of adherence from the patient and/or technical problems in the data collection. However, finding ways to work around these limitations is part of the challenge.

For a reference on the ALSFRS-R scores, please refer to LINK.

The test data will contain the same static and sensor data for other patients not in the training set. Only the first ALSFRS-R evaluation will be available, together with the time of the target ALSFRS-R evaluation whose scores are to be predicted.

Training Dataset

52 patients, 189 clinical ALSFRS-R questionnaires, 301 self-assessed ALSFRS-R questionnaires, and 13,946 days of sensor data.

Training datasets can be downloaded here:

Check out the legend for a description of the data.

Test Dataset

Details on the Datasets

Each of the training datasets consists of the following files:

Each of the test datasets consists of the following files:

Task 3 Dataset (MS)

Task 3 encompasses the following data:

All time intervals are expressed in weeks from the baseline. EDSS visits occur between the baseline and the occurrence of the first relapse, while environmental measurements span from January 1st, 2013, to 2023. Note that environmental measurements may not be available for every week and may contain missing data.

For more information on the EDSS score, please refer to this link.

Training and test datasets are split in a 70%-30% proportion. The test data will include the same static, EDSS, and environmental data for patients not included in the training set.

Specifically:

Details on the Dataset

The training dataset consists of the following files:

The test dataset consists of the following files:

The test dataset does not contain the test_outcome.csv file as it is the target that needs to be predicted by the challenge participants.

Check out the legend for a description of the data

More in detail, the train_outcome.csv file adopts the following format:


67396654612589370083623092407810766693,190
267505643482073532971907590162817517438,96
292211325266845962706770530341466085902,30
272084949450943266236977822325902693890,216
291361686090369065397020169907588173057,13
...

where:
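The column list is not reproduced here, but a reasonable reading, consistent with the task description, is that each row pairs an anonymized patient ID with the week of the first relapse after the baseline. Under that assumption, a minimal parsing sketch:

```python
import csv
import io

# Two rows in the train_outcome.csv format shown above (assumed columns:
# anonymized patient ID, week of the first relapse after the baseline).
sample = """\
67396654612589370083623092407810766693,190
267505643482073532971907590162817517438,96
"""

outcomes = {patient_id: int(week)
            for patient_id, week in csv.reader(io.StringIO(sample))}
print(outcomes["67396654612589370083623092407810766693"])  # 190
```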

Participant Repository

Participants are provided with a single repository for all the tasks they take part in. The repository contains the runs, resources, and possibly the code produced by each participant in order to promote reproducibility and open science.

The repository is organised as follows:

The submission and score folders are organized into sub-folders for each task as follows:

Participants who do not take part in a given task can simply delete the corresponding sub-folders.

Here, you can find a sample participant repository to get a better idea of its layout.

The goal of iDPP CLEF 2024 is to speed up the creation of systems and resources for MS and ALS progression prediction, and to share these systems and resources as openly as possible. Therefore, participants are strongly encouraged to share their code and any additional resources they have used or created.

All the contents of these repositories are released under the Creative Commons Attribution-ShareAlike 4.0 International License.

Run Submission Guidelines

Participating teams should satisfy the following guidelines:

Task 1 Run Format

Runs should be submitted as a text file (.txt) with the following format:


100619256189067386770484450960632124211 1 2 3 4 1 2 3 4 1 2 3 4 upd_T1_myDesc
101600333961427115125266345521826407539 1 2 3 4 1 2 3 4 1 2 3 4 upd_T1_myDesc
102874795308599532461878597137083911508 1 2 3 4 1 2 3 4 1 2 3 4 upd_T1_myDesc
123988288044597922158182615705447150224 1 2 3 4 1 2 3 4 1 2 3 4 upd_T1_myDesc
100381996772220382021070974955176218231 1 2 3 4 1 2 3 4 1 2 3 4 upd_T1_myDesc
...

where:

It is important to include all the columns and to use a whitespace delimiter between them. No specific ordering is expected among patients (rows) in the submission file.
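For instance, a run line in this format could be assembled as follows; format_run_line is a hypothetical helper and the sub-score values are placeholders, not real predictions:

```python
def format_run_line(patient_id: str, subscores: list, run_id: str) -> str:
    """Join the patient ID, the 12 predicted ALSFRS-R sub-scores, and the
    run identifier into one whitespace-delimited Task 1 run line."""
    if len(subscores) != 12:
        raise ValueError("exactly 12 ALSFRS-R sub-scores are expected")
    return " ".join([patient_id, *map(str, subscores), run_id])

line = format_run_line(
    "100619256189067386770484450960632124211",
    [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    "upd_T1_myDesc",
)
print(line)
```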

Task 2 Run Format

Runs should be submitted as a text file (.txt) with the following format:


100619256189067386770484450960632124211 1 2 3 4 1 2 3 4 1 2 3 4 upd_T2_myDesc
101600333961427115125266345521826407539 1 2 3 4 1 2 3 4 1 2 3 4 upd_T2_myDesc
102874795308599532461878597137083911508 1 2 3 4 1 2 3 4 1 2 3 4 upd_T2_myDesc
123988288044597922158182615705447150224 1 2 3 4 1 2 3 4 1 2 3 4 upd_T2_myDesc
100381996772220382021070974955176218231 1 2 3 4 1 2 3 4 1 2 3 4 upd_T2_myDesc
...

where:

It is important to include all the columns and to use a whitespace delimiter between them. No specific ordering is expected among patients (rows) in the submission file.

Task 3 Run Format

Runs should be submitted as a text file (.txt) with the following format:


100619256189067386770484450960632124211 10 upd_T3_myDesc
101600333961427115125266345521826407539 47 upd_T3_myDesc
102874795308599532461878597137083911508 13 upd_T3_myDesc
123988288044597922158182615705447150224 1 upd_T3_myDesc
100381996772220382021070974955176218231 9 upd_T3_myDesc
...

where:

It is important to include all the columns and to use a whitespace delimiter between them. No specific ordering is expected among patients (rows) in the submission file.
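As a sanity check before uploading, each line can be validated against the expected "patient ID, predicted week, run identifier" shape. The regular expression below is an illustrative assumption, not an official validator:

```python
import re

# Assumed shape of one Task 3 line: numeric patient ID, integer week of the
# predicted relapse, and the run identifier, whitespace-delimited.
LINE_RE = re.compile(r"^(\d+)\s+(\d+)\s+(\S+)$")

def parse_task3_line(line: str):
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"malformed run line: {line!r}")
    patient_id, week, run_id = match.groups()
    return patient_id, int(week), run_id

print(parse_task3_line("100619256189067386770484450960632124211 10 upd_T3_myDesc"))
```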

Submission Upload

Runs should be uploaded in the repository provided by the organizers. Following the repository structure discussed above, for example, a run submitted for the first task should be included in submission/task1.

Runs should be uploaded using the following name convention for their identifiers: <teamname>_T<1|2|3>_<freefield>

where:

For example, a complete run identifier may look like upd_T1_myDesc

where

The name of the text file containing the run must be the identifier of the run followed by the .txt extension; in the above example, upd_T1_myDesc.txt.
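The naming convention can also be checked mechanically. The pattern below assumes word characters for the team name and free field, which is an assumption for illustration since the guidelines do not specify the allowed alphabet:

```python
import re

# <teamname>_T<1|2|3>_<freefield>; word characters are assumed for the
# team name and free field (the allowed character set is not specified).
RUN_ID_RE = re.compile(r"^\w+_T[123]_\w+$")

print(bool(RUN_ID_RE.match("upd_T1_myDesc")))  # True
print(bool(RUN_ID_RE.match("upd_T4_myDesc")))  # False
```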

Run Scores

Performance scores for the submitted runs will be returned by the organizers in the score folder, which follows the same structure as the submission folder. For each submitted run, participants will find a file named <teamname>_T<1|2|3>_<freefield>.score.txt, where <teamname>_T<1|2|3>_<freefield> matches the corresponding run. The file will contain performance scores for each of the evaluation measures described below. In the above example, the score file would be upd_T1_myDesc.score.txt.

Evaluation Measures

Task 1 - Predicting ALSFRS-R Score from Sensor Data (ALS)

The effectiveness of the submitted runs will be evaluated using MAE (Mean Absolute Error) and RMSE (Root Mean Square Error) of the predicted sub-scores with respect to the actual sub-scores.

Task 2 - Predicting Patient Self-assessment Score from Sensor Data (ALS)

The effectiveness of the submitted runs will be evaluated using MAE (Mean Absolute Error) and RMSE (Root Mean Square Error) of the predicted self-assessed sub-scores with respect to the actual sub-scores.

Task 3 - Predicting Relapses from EDSS Sub-scores and Environmental Data (MS)

The effectiveness of the submitted runs will be evaluated using MAE (Mean Absolute Error) and RMSE (Root Mean Square Error) of the predicted week of relapse with respect to the actual one.
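For all three tasks, MAE and RMSE reduce to the same two formulas over paired predictions and ground truth. A plain-Python sketch, not the official evaluation script:

```python
import math

def mae(predicted, actual):
    """Mean Absolute Error: average absolute deviation from the ground truth."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root Mean Square Error: square root of the average squared deviation."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Toy sub-score predictions versus actual values.
predicted = [4, 3, 2, 4]
actual = [4, 4, 1, 2]
print(mae(predicted, actual))   # 1.0
print(rmse(predicted, actual))  # ~1.2247
```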

Participant Papers

Participants are expected to write a report describing their participation in iDPP CLEF 2024: the proposed solution, the features used for prediction, the analysis of the experimental results, and the insights derived from them.

Participant reports will be peer-reviewed and accepted reports will be published in the CLEF 2024 Working Notes at CEUR-WS, indexed in DBLP and Scopus.

A template for the report is contained in the report folder of each participant repository. The template not only defines the layout of the report but also provides a suggested structure and illustrates the kind of content expected.
Participants are strongly encouraged to use LaTeX; the LaTeX template provides all the details about the report. However, it is also possible to use Word, for which a basic template is provided as well.

Participant papers are expected to be 10-20 pages in length, excluding references.

The schedule for submission, revision, and camera ready of the participant report is detailed above in the Important Dates section.

Participant papers can be submitted via Easychair at the following link https://easychair.org/conferences/?conf=clef2024 by choosing the Intelligent Disease Progression Prediction (iDPP) track.

Get in Contact

If you need any additional information, please get in contact with us by writing to Nicola Ferro, ferro@dei.unipd.it.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. GA101017598. The European Commission’s support for this project and the production of this website does not constitute an endorsement of the contents, which reflect the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained therein.