2025 International Conference on Biomedical and Health Informatics (BHI 2025)
High School – Prompt Engineering Challenge (HS-PEC)
The IEEE International Conference on Biomedical and Health Informatics (BHI) is an annual flagship conference in biomedical informatics, AI, and big data for biomedicine and healthcare, and it has organized data competitions for college students for many years. In 2025, the IEEE BHI organizing committee has designed a simplified Prompt Engineering Competition for High School students (HS-PEC) to inspire their interest in the era of AI. High school students can participate individually or as a self-assembled team of no more than 4 members. Each high school student/team needs to craft prompts for a real-world healthcare task to explore how rigorous prompting in biomedical AI can impact healthcare. The code of conduct for minors will be strictly followed.
Competition Task: To Develop Prompts for Identifying Thoracic Abnormalities in Chest X-Ray Images
Goal: Each team should construct prompts (i.e., prompt engineering) for the large language model Gemini 2.5 Flash to identify common thoracic abnormalities from posterior–anterior (PA) chest X-ray images. The structured list of prompts that leads to the final results will be evaluated by judges.
Team: 1–4 high school students can register for the prompt engineering competition, and should create a Google account on Google AI Studio by end of day on Oct. 29th, 2025, Anywhere on Earth.
Dataset: The US National Institutes of Health Clinical Center has provided the public ChestX-ray14 dataset (https://www.kaggle.com/datasets/nih-chest-xrays/data; see Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM. “ChestX-ray8: Hospital-Scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases.” IEEE CVPR, 2017). Each image is labeled for the presence of any of 14 common pathologies (e.g., Atelectasis, Nodule, Pneumothorax, Hernia, etc.). A subset of the original dataset has been extracted by the IEEE BHI2025 OC for HS-PEC: Dataset Subset Link
Data Download: Each H.S. student/team can either download the dataset from the BHI2025 link (Dataset Subset Link) or download individual image samples from the Kaggle Data Explorer tree view:

Note: To construct a prompt for the goal, each H.S. student/team needs to download a single chest X-ray image, and assume no prior patient context beyond what is visible in the image.
Prompt Engineering Example: Please watch the following video recording.
Prompt Engineering Challenge Result Format:
- The PEC result must be a simple list of bullet points, or a JSON-structured list, of the 14 NIH ChestX-ray14 findings, each categorized as either “Present” or “Absent”. The model may express uncertainty where appropriate; however, the final output for each finding must be a binary classification (Present/Absent).
- Prompt Tone: The PEC prompts must use an objective and cautious tone.
- Rules: Speculation about features not present in the X-ray image is strictly prohibited.
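As an illustration of the required result format, the sketch below builds a hypothetical JSON output over the 14 ChestX-ray14 finding labels (label names follow the NIH dataset) and checks that every finding receives a binary Present/Absent label. This is an editorial sketch, not an official format checker:

```python
import json

# The 14 NIH ChestX-ray14 finding labels (Wang et al., CVPR 2017).
FINDINGS = [
    "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration", "Mass",
    "Nodule", "Pneumonia", "Pneumothorax", "Consolidation", "Edema",
    "Emphysema", "Fibrosis", "Pleural_Thickening", "Hernia",
]

# A hypothetical example of the JSON-structured result a prompt should elicit.
example_output = json.dumps({finding: "Absent" for finding in FINDINGS})

def validate(result_json: str) -> bool:
    """Check that a response covers all 14 findings with binary labels."""
    result = json.loads(result_json)
    return (set(result) == set(FINDINGS)
            and all(v in ("Present", "Absent") for v in result.values()))

print(validate(example_output))  # True
```

A bullet-point result is equally acceptable; the key requirement is that all 14 findings appear with an unambiguous binary label.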
HS-PEC Reporting Timeline:
- Rolling Registration: Extended to Oct 29th 11:59pm Anywhere on Earth (AoE)
- Fill out the following Google Form for your team or individual registration (Max 4 Team Members): https://forms.gle/YxR1j2nRy29WXtdd6
- Final prompt text file (.txt) and video clip submission: October 29th AoE
- Submission Link: https://forms.gle/SzdUKK1bZWz39JNW6
- Participants and their parents/guardians must sign a media release form that contains an email address for communication.
- The IEEE BHI2025 will send an official certificate to each participant. In addition, any winner of the HS-PEC will be notified.
- Participants must put together a 1–2-minute video showcasing the following:
- Team Name and Member(s)
- Explanation for why they designed their submitted prompt that way
- Main takeaways they learned in designing strong prompts to utilize general LLMs for medical tasks
- Each H.S. student/team must submit a .zip folder containing:
- a .txt file of the prompt, as specified in the rules below
- the video clip
- The zipped folder must be named: YourTeamName_PromptEngineeringSubmission.zip
- Award announcement: November 1st, 2025.
Evaluation by BHI2025 HS-PEC Judges:
Each HS-PEC submission will be evaluated against ground-truth labels using:
- An external chest X-ray testing dataset with ground-truth labels
- Judges will run your prompts on Gemini 2.5 Flash (multimodal) in Google AI Studio, linked below.
- Macro-averaged F1-score: The F1-score (the harmonic mean of precision and recall) is calculated for each of the 14 findings and then averaged. This provides a balanced measure of how well your prompt identifies all conditions.
- Precision and Recall will be used as secondary metrics
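To make the scoring concrete, here is a minimal pure-Python sketch of macro-averaged F1, assuming each finding is scored as a binary vector over the test images (1 = “Present”, 0 = “Absent”); the toy data below is illustrative only:

```python
def f1_score(y_true, y_pred):
    """F1 for one finding: harmonic mean of precision and recall."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:  # no true positives -> F1 defined as 0
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(per_finding_true, per_finding_pred):
    """Average the per-finding F1 scores, weighting each finding equally."""
    scores = [f1_score(t, p) for t, p in zip(per_finding_true, per_finding_pred)]
    return sum(scores) / len(scores)

# Toy example: 2 findings across 4 images (the real score uses all 14).
truth = [[1, 0, 1, 0], [0, 0, 1, 1]]
preds = [[1, 0, 0, 0], [0, 1, 1, 1]]
print(round(macro_f1(truth, preds), 3))  # 0.733
```

Because every finding is weighted equally, rare findings (e.g., Hernia) affect the score as much as common ones, so prompts should not ignore them.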
Guidelines and Rules for Preparing Prompt:
- Structure over slogans: Use explicit section headers, bullet-like reasoning scaffolds, and “verify before final” rubrics within the prompt.
- Grounding and caution: Force image-based claims only; forbid demographic speculation; encourage uncertainty phrases when appropriate.
- Output controls: Specify exact sections, sentence limits, and stylistic constraints.
- “Zero-Shot” Only (Strict) – Meaning:
- No fine-tuning, adapters, reward modeling, or gradient updates of any kind.
- No retrieval from external web sources or private corpora at inference time. Your prompt may only reference the provided input (image or context).
- No chain-of-examples (“few-shot”) in the prompt. You may instruct the model to “think step-by-step,” structure outputs, or include rubrics/checklists, but must not include worked examples or any ground-truth excerpts from the evaluation datasets.
- The evaluation set is hidden. Data leakage (e.g., copying known reports/answers, using memorized content) is disqualifying.
- Data User Agreement: Respect dataset DUAs and privacy requirements (all datasets are de-identified but still covered by data use access rules).
- Team Size: A team may have at most 4 members. Individual participants are also encouraged.
- Disqualification: During evaluation of the prompt engineering results (.txt and video) for final winners, the BHI2025 OC reserves the right to disqualify entries that violate the data use agreement, use external retrieval, fine-tuning, or example-based prompting, or show evidence of data leakage.
BHI 2025 Data Challenge Competition
Track 1: Understanding depression risk through demographics, clinical factors & mindfulness interventions
Objective
The goal of this track is to analyze how demographic factors, clinical characteristics, and engagement with mindfulness-based therapies influence patients’ risk of depression. Participants will investigate whether mindfulness participation reduces depression risk in the short term (12 weeks, end of intervention) and whether these effects are sustained in the long term (24 weeks, follow-up). Analyses should identify which factors (demographic, clinical, or therapy-related) are most impactful, both at a global level and across different disease groups.
Dataset
The dataset integrates multiple dimensions:
- Demographics: age, sex
- Clinical Factors: condition (specific disease), condition type (disease group), baseline BDI-II depression score, identifier of the hospital.
- Mindfulness Therapy: number of sessions started, number of sessions completed
- Health Outcomes: BDI-II depression score at 12 weeks, BDI-II depression score at 24 weeks
Challenge Tasks
Participants will be required to:
- Predictive Analysis: Build and evaluate models to estimate depression risk scores at 12 and 24 weeks.
- Factor Importance & Visualization: Identify and visualize the most influential variables (e.g., SHAP, feature importance, comparative plots).
- Temporal Impact Analysis: Compare the short-term vs. long-term impact of demographics, clinical conditions, and mindfulness participation.
- Disease-Specific Comparison: Explore differences across condition types (e.g., cancer, cardiovascular, etc.) to uncover disease-specific effects.
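For the factor-importance task, permutation importance is one generic, model-agnostic option alongside SHAP or built-in feature importances. The sketch below is a minimal pure-Python illustration; the predictor and data are hypothetical placeholders, not part of the challenge dataset:

```python
import random

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_features, seed=0):
    """Importance of feature j = error increase when column j is shuffled."""
    rng = random.Random(seed)
    base = mse(y, [predict(row) for row in X])
    scores = []
    for j in range(n_features):
        col = [row[j] for row in X]
        rng.shuffle(col)  # break the link between feature j and the outcome
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        scores.append(mse(y, [predict(row) for row in X_perm]) - base)
    return scores

# Toy fitted predictor that only uses feature 0, so feature 1 scores 0.0.
predict = lambda row: 2 * row[0]
X = [[1, 5], [2, 1], [3, 4], [4, 2]]
y = [2, 4, 6, 8]
scores = permutation_importance(predict, X, y, n_features=2)
```

Running the same procedure separately within each condition type is one way to surface the disease-specific effects this task asks for.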
Deliverables
- Mid-term submission: A paper-style report (max. 4 pages) describing the planned methodology, preliminary analyses, and intended approach.
- Final submission: A paper-style report (max. 8 pages) detailing the full methodology, results, analyses (global and disease-specific), visualizations, and key conclusions.
Evaluation Criteria
Submissions will be evaluated qualitatively, focusing on:
- Depth of analysis at both global and disease-specific levels.
- Clarity and rigor in explaining short-term vs. long-term effects.
- Strength of evidence-based insights, particularly regarding the effectiveness of mindfulness.
Track 2: Predicting macronutrient information
Objective
The objective of this track in the data competition is to predict the composition of a meal by analyzing its postprandial glucose response (PPGR), as measured by a continuous glucose monitor (CGM) in the three hours after consuming the meal.
Dataset
The track is based on the CGMacros dataset, publicly available on PhysioNet (https://physionet.org/content/cgmacros). The dataset contains multimodal information from two CGMs, food macronutrients, food photographs, and physical activity, in addition to anonymized participant demographics, anthropometric measurements, and health parameters from blood analyses and gut microbiome profiles. CGMacros contains data from 45 study participants (15 healthy adults, 16 with pre-diabetes, and 14 with Type 2 diabetes) who consumed meals with varying and known macronutrient compositions in a free-living setting for ten consecutive days.
The task
Develop a machine-learning model that predicts the Carbohydrate Caloric Ratio (CCR) from the PPGR response. The CCR is computed as the ratio of the caloric content of the net carbohydrates (Cnet) with respect to the total caloric content of the meal, as
CCR = Cnet / (Cnet + P + F + B),
where P, F, and B are the caloric content of the meal from protein, fat, and fiber, respectively. You may use a machine-learning model of your choice to map the 3-hour PPGR (a 13-dimensional vector) to the CCR (a scalar). PPGRs are known to depend not only on meal composition but also on metabolic and other health parameters of the subject, physical activity, and the gut microbiome, all of which are also available in the CGMacros dataset.
Deliverables
- Mid-term submission: A paper-style report (max. 4 pages) describing the planned methodology, preliminary analyses, and intended approach.
- Final submission: A paper-style report (max. 8 pages) detailing the full methodology, results, analyses (global and disease-specific), visualizations, and key conclusions.
Evaluation
Model predictions will be evaluated based on the Normalized Root Mean Squared Error (NRMSE) and the correlation between the predicted and ground-truth CCR, as described in the reference below:
Kerr, D.; Glantz, N.; Bevier, W.; Santiago, R.; Gutierrez-Osuna, R.; Mortazavi, B. J., “A pilot scientific dataset for personalized nutrition and diet monitoring,” Scientific Data, 2025, in press.
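As a rough sketch of these two metrics, the code below assumes NRMSE normalizes the RMSE by the range of the ground-truth CCR values (one common convention; the exact normalization is defined in the reference above), and uses the Pearson correlation coefficient; the data vectors are illustrative:

```python
import math

def nrmse(y_true, y_pred):
    """RMSE normalized by the range of the ground-truth values (assumed convention)."""
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
    return rmse / (max(y_true) - min(y_true))

def pearson_r(y_true, y_pred):
    """Pearson correlation between predicted and ground-truth values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

# Toy predicted vs. ground-truth CCR values (illustrative only).
y_true = [0.2, 0.4, 0.5, 0.7]
y_pred = [0.25, 0.35, 0.55, 0.65]
```

Lower NRMSE and higher correlation are better; reporting both guards against a model that tracks the trend but is systematically biased, or vice versa.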
How to Sign Up:
Participants interested in joining the challenge need to complete this Google Form, in which you will be required to provide the following details:
- The track(s) you wish to participate in.
- The name of your team.
- The names and emails of all team members.
Upon submitting the form, participants will receive a confirmation email verifying their registration. This email will also include all the necessary materials for the selected track(s) and any additional information required to get started.
Rules and Conditions of the AI in Medicine Scientific Challenge
- Eligibility:
- The challenge is open to individuals or teams from academic institutions, research organizations, and the AI/healthcare industry.
- Teams must consist of a minimum of 2 members and a maximum of 5 members.
- Participants must be 18 years or older to register.
- Registration:
- All participants must register via the provided Google Form before the challenge registration deadline.
- Only registered participants will receive the challenge materials and submission guidelines.
- Participants are responsible for ensuring all submitted information is accurate.
- Code of Conduct:
- All participants must uphold a professional and respectful demeanor throughout the challenge.
- Plagiarism or unauthorized use of third-party code or data is strictly prohibited.
- Any form of misconduct may result in disqualification from the challenge.
- Submission Guidelines:
- Submissions must include all required deliverables: code, synthetic data samples (if applicable), and a paper detailing the methodology and results.
- Submissions must adhere to the deadlines specified by the organizers. Late submissions will not be considered.
- Code must be well-documented, reproducible, and provided in an easily executable format (e.g., Jupyter notebooks, Python scripts).
- Evaluation Criteria:
- Submissions will be evaluated based on both qualitative and quantitative metrics, as outlined in the challenge description.
- The decision of the judging panel will be final and binding.
- Use of Data and Materials:
- Participants are provided with synthetic data and any other materials strictly for the purposes of the challenge.
- Participants are prohibited from using the data or challenge materials for commercial purposes without prior written consent from the organizers.
- Intellectual Property:
- All code, data, and written materials submitted to the challenge will remain the intellectual property of the participants.
- By participating in the challenge, participants grant the organizers a non-exclusive, royalty-free license to use, publish, and display the submissions for promotional and educational purposes.
- Privacy and Confidentiality:
- Participants agree to respect the privacy of their team members and any information shared during the challenge.
- Personal information provided during registration will only be used for communication regarding the challenge and will not be shared with third parties without consent.
Timeline
- October 1st: Registration deadline for interested teams to sign up via Google Form
- October 3rd–7th: Preliminary submission deadline (max. 4-page summary report on strategy/pipeline, tentative results, next steps, and any encountered problems/limitations for expert advice)
- October 8th–12th: Feedback on preliminary submissions from the expert panel
- October 15th: Final submission deadline (8-page paper)
- October 19th: Announcement of top finalist teams per track
- October 22nd: BHI Data Challenge team member registration deadline (each team must register at the conference day rate ~100 to qualify for certificate and award)
- October 26th: Finalist presentations at the BHI conference
- October 29th: Awards ceremony at the BHI conference
Frequently Asked Questions
Is there a participation fee this year?
- As in previous editions, there is no dedicated participation fee. However, at least one member per team must be registered to the conference (Workshop/Tutorials/Special Sessions day pass is sufficient).
If I already have a full ticket as a Graduate Student (to present my paper), will my teammates also need to register?
- No. One registration per team is enough.
Can one person be part of more than one team? Can a team register in both tracks?
- Yes, one person may participate in more than one team.
- Yes, a team can register in both tracks.
Will all teams be invited to the closing ceremony, or only the winners?
- Only finalist teams will present their work on October 26th.
- Winners will be announced and awarded on October 29th.
- As a reference, last year we had 10 finalists and 3 winners.
Will all teams present their solutions during the Data Competition session?
- Only finalist teams will present during the Data Competition session.
Will the paper-style reports be published in the proceedings or elsewhere?
- Not at the moment.
Note on rules:
The rules established last year will also apply this year. The team responsible for handling the registration form will coordinate with the respective Task Leaders. Task Leaders remain responsible for evaluating their assigned tasks.
