Data Competition

BHI 2024 Data Challenge Competition

Title: AI in Medicine Scientific Challenge: A Deep Dive into Privacy, Explainability, and Trustworthiness

The importance of explainability, data privacy, and trustworthiness in AI is becoming increasingly evident across society. These aspects are key to building public trust and confidence in AI-driven technologies, especially in sectors like healthcare. To push the boundaries in these areas, BHI 2024 is excited to announce an upcoming AI Scientific Challenge.

This challenge will enable a deep exploration of privacy-, explainability-, and trustworthiness-enhancing technologies. Participants will have the opportunity to interact with infrastructures, synthetic data, and models inspired by the BEAMER (GA Nº 101034369) project and the GATEKEEPER (GA Nº 857223) Large Scale Pilot. The challenge offers a unique chance to engage with a transparent, cutting-edge AI framework that is shaping the future of healthcare.

The challenge consists of two distinct tracks:

Track 1: Exploring Synthetic Data Generation

In this track, participants will explore the emerging field of synthetic data generation, a key area in AI research. Contestants will be provided with a codebook and a set of high-quality synthetic data from 100 patients, each patient record containing a variable number of instances of physical-activity data collected through wearable devices. The challenge is to generate new synthetic data based on the provided dataset, contributing to the development of anonymized datasets that further behavioral research in projects like BEAMER.
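One common starting point for this kind of tabular generation task is a Gaussian-copula approach: preserve each feature's marginal distribution while reproducing the cross-feature correlations. The sketch below is purely illustrative — the `real` array is a made-up stand-in, not the challenge dataset, and the organizers do not prescribe any particular method.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
# Stand-in for the real dataset: two skewed activity features for 300 records.
real = rng.gamma(2.0, 3.0, size=(300, 2))

# 1. Map each column to approximate uniform ranks, then to Gaussian space.
ranks = (np.argsort(np.argsort(real, axis=0), axis=0) + 0.5) / len(real)
gauss = norm.ppf(ranks)

# 2. Sample new points that reproduce the cross-feature correlations.
cov = np.cov(gauss.T)
samples = rng.multivariate_normal(np.zeros(real.shape[1]), cov, size=len(real))

# 3. Map back through each column's empirical marginal distribution.
u = norm.cdf(samples)
synth = np.column_stack(
    [np.quantile(real[:, j], u[:, j]) for j in range(real.shape[1])]
)
```

Because step 3 draws from the empirical quantiles, each synthetic column stays within the observed range of its real counterpart while no real record is copied verbatim.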

Evaluation

The evaluation will be conducted on two fronts:

  • 30% Qualitative Evaluation: Following the principles of the GATEKEEPER AI framework, experts will assess the submissions based on criteria such as design, complexity, innovation, transparency, and robustness.

  • 70% Quantitative Evaluation of Resemblance: The resemblance of the synthetic data to the original data will be assessed using the following metrics:
    • Wasserstein distance
    • Kolmogorov-Smirnov (KS) test
    • Jensen-Shannon distance
    • Pairwise correlation distance
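For orientation, the four metrics above can all be computed with standard SciPy/NumPy tooling. The snippet below is a minimal sketch on placeholder arrays; the organizers' exact evaluation implementation (binning, aggregation across features, etc.) is not specified here and may differ.

```python
import numpy as np
from scipy.stats import wasserstein_distance, ks_2samp
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(0)
real = rng.normal(100.0, 15.0, size=(500, 3))   # placeholder "real" features
synth = rng.normal(102.0, 16.0, size=(500, 3))  # placeholder synthetic features

# Per-feature Wasserstein distance and Kolmogorov-Smirnov statistic.
wd = [wasserstein_distance(real[:, j], synth[:, j]) for j in range(3)]
ks = [ks_2samp(real[:, j], synth[:, j]).statistic for j in range(3)]

# Jensen-Shannon distance between binned marginal distributions.
def js_distance(a, b, bins=30):
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    p, _ = np.histogram(a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(b, bins=bins, range=(lo, hi))
    return jensenshannon(p, q)  # SciPy normalises p and q internally

js = [js_distance(real[:, j], synth[:, j]) for j in range(3)]

# Pairwise-correlation distance: how far apart the correlation matrices are.
corr_dist = np.linalg.norm(np.corrcoef(real.T) - np.corrcoef(synth.T))
```

Lower values on all four metrics indicate closer resemblance between the synthetic and original distributions.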

Submission of results

Participants are expected to submit:

  • Code used to generate the synthetic data.
  • A sample of the synthetic data they generated.
  • A paper reporting their approach, methodology, and results.

Further details on how the submission will be made will be provided in the future.

Track 2: Enhancing Explainability and Trustworthiness of AI Healthcare Models

The validation and trustworthiness of AI models in healthcare are essential, and explainability is a fundamental component of that process. In this track, participants will engage with a state-of-the-art multimodal AI model designed to predict glucose levels, applying diverse explainability techniques to explore and evaluate its decision-making mechanisms.

Participants will have access to a public dataset that serves as the foundation for constructing their own databases, which they will use to interact with the AI model via an API designed according to BEAMER's deployment pipeline for accessibility and flexibility. Through this API, they can make predictions based on their custom-built datasets. Detailed instructions on how the model operates will be provided in a comprehensive user manual.

The core objective of this track is for participants to investigate the model's behavior using advanced interpretability and explainability tools. As part of their submission, participants must produce a paper that analyzes the model's predictions, highlighting key insights and demonstrating that the model's decision-making processes can be trusted in real-world healthcare settings.
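As a flavour of the kind of model-agnostic analysis this track invites, the sketch below computes permutation importance against a black-box predictor. The `predict` function here is a hypothetical stand-in — in the challenge it would be a call to the provided API, whose actual interface is documented in the user manual.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for the glucose-prediction model (NOT the real API).
def predict(X):
    return 90.0 + 0.8 * X[:, 0] - 0.3 * X[:, 1] + 0.01 * X[:, 2]

X = rng.normal(size=(200, 3))  # a custom-built feature matrix
y_ref = predict(X)             # reference predictions on unperturbed inputs

# Permutation importance: shuffle one feature at a time and measure how much
# the predictions drift from the reference — larger drift, more influence.
def permutation_importance(model, X, y_ref, n_repeats=10):
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores[j] += np.mean((model(Xp) - y_ref) ** 2)
    return scores / n_repeats

importance = permutation_importance(predict, X, y_ref)
```

Because the method only needs model inputs and outputs, the same loop works unchanged when `model` wraps an HTTP call to a remote prediction endpoint.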

Evaluation

The evaluation conducted by the organizers will be:

  • 100% Qualitative Evaluation: Conducted by AI experts in line with the AI Gatekeeper framework, focusing on the explainability and interpretability of the model, as well as the techniques used to demonstrate and present these aspects.

Submission of results

Participants are expected to submit:

  • Code used to explore the explainability and interpretability of the model.
  • A paper reporting their approach, methodology, and results.

Further details on how the submission will be made will be provided in the future.

How to Sign Up:

Participants interested in joining the challenge will need to complete this Google Form, providing the following details:

  • The track(s) they wish to participate in.
  • The name of their team.
  • The names and emails of all team members.

Upon submitting the form, participants will receive a confirmation email verifying their registration. This email will also include all the necessary materials for the selected track(s) and any additional information required to get started.

Timeline:

  • October 10th: Registration deadline for interested teams to sign up via the Google Form
  • October 14th: Preliminary submission deadline (4-page (max.) summary report on the strategy/pipeline being used, tentative results, and next steps; problems/limitations encountered may also be included for advice from the expert panel)
  • October 21st: Feedback on the preliminary submission by the panel of experts
  • October 28th: Deadline for final submission (8-page paper)
  • November 4th: Top finalist teams per track announced
  • November 5th: BHI Data Challenge team member registration deadline (each team must register at the conference day rate (~50) to qualify for the certificate and award)
  • November 10th: Final report due from all registered teams
  • November 13th: Finalist presentations & awards ceremony at the BHI conference (hybrid session)

Organizing Committee:

Rules and Conditions of the AI in Medicine Scientific Challenge

  1. Eligibility:
    • The challenge is open to individuals or teams from academic institutions, research organizations, and the AI/healthcare industry.
    • Teams must consist of a minimum of 2 members and a maximum of 5 members.
    • Participants must be 18 years or older to register.
  2. Registration:
    • All participants must register via the provided Google Form before the challenge registration deadline.
    • Only registered participants will receive the challenge materials and submission guidelines.
    • Participants are responsible for ensuring all submitted information is accurate.
  3. Code of Conduct:
    • All participants must uphold a professional and respectful demeanor throughout the challenge.
    • Plagiarism or unauthorized use of third-party code or data is strictly prohibited.
    • Any form of misconduct may result in disqualification from the challenge.
  4. Submission Guidelines:
    • Submissions must include all required deliverables: code, synthetic data samples (if applicable), and a paper detailing the methodology and results.
    • Submissions must adhere to the deadlines specified by the organizers. Late submissions will not be considered.
    • Code must be well-documented, reproducible, and provided in an easily executable format (e.g., Jupyter notebooks, Python scripts).
  5. Evaluation Criteria:
    • Submissions will be evaluated based on both qualitative and quantitative metrics, as outlined in the challenge description.
    • The decision of the judging panel will be final and binding.
  6. Use of Data and Materials:
    • Participants are provided with synthetic data and any other materials strictly for the purposes of the challenge.
    • Participants are prohibited from using the data or challenge materials for commercial purposes without prior written consent from the organizers.
  7. Intellectual Property:
    • All code, data, and written materials submitted to the challenge will remain the intellectual property of the participants.
    • By participating in the challenge, participants grant the organizers a non-exclusive, royalty-free license to use, publish, and display the submissions for promotional and educational purposes.
  8. Privacy and Confidentiality:
    • Participants agree to respect the privacy of their team members and any information shared during the challenge.
    • Personal information provided during registration will only be used for communication regarding the challenge and will not be shared with third parties without consent.