Software Engineer | Data Scientist | Digital Healthcare Solutions | Health AI
I'm a software engineer with 4 years of experience developing enterprise-level applications, mainly focused on digital healthcare solutions. My recent work includes implementing Ambulatory Glucose Profile (AGP) reports, which visualize complex glucose data from CGM devices, and building a Diabetes Intervention System, a healthcare data server management dashboard, and an HR system for healthcare institutions.
As a data scientist with more than 2 years of part-time experience, I have processed medical images (X-ray, CT, MRI) and radiology reports, visualized glucose pattern insights, and created baseline models. Additionally, I evaluated machine translation systems on patient discharge prescriptions and provided feedback to improve system performance.
This exposure has deepened my interest in Digital Healthcare Transformation. I aim to improve healthcare delivery and support physician decision-making by analyzing healthcare data with advanced machine learning techniques and digitalization solutions.
Pinned Publications
ICML 2025
Paper Writing
The Chest X-ray Imaging Dataset for Multiple Cardio-respiratory Diseases in Ethiopia (Afro Chest X-ray for short) is a project funded by the Lacuna Fund, whose aim is to reduce health disparities by fostering interdisciplinary collaborations that create, expand, or aggregate labeled training and evaluation datasets.
Cardio-respiratory diseases (cardiovascular and respiratory diseases) are recognized as serious, worldwide public health concerns and remain among the leading causes of death globally. Few publicly available datasets come from Africa, which makes it difficult to determine whether tools and techniques developed in other geographies are as effective in our context. In this project, we propose to create a labeled chest X-ray dataset for multiple cardio-respiratory diseases in Ethiopia. We will publish the dataset as open source. We believe this dataset will stimulate researchers and practitioners in Africa and beyond to push the limits of current methods, adapt them to the African context, and build assistive technologies that could empower the scarce radiologists.
NAACL 2025
In Review
With the rapid development of evaluation datasets to assess LLMs' understanding across a wide range of subjects and domains, identifying a suitable language understanding benchmark has become increasingly challenging. In this work, we explore LLM evaluation challenges for low-resource language understanding and introduce ProverbEval, an LLM evaluation benchmark for low-resource languages based on proverbs, which focuses on low-resource language understanding in culture-specific scenarios. We benchmark various LLMs and explore factors that create variability in the benchmarking process. We observed performance variances of up to 50%, depending on the order in which answer choices were presented in multiple-choice tasks. Native language proverb descriptions significantly improve tasks such as proverb generation, contributing to improved outcomes. Additionally, monolingual evaluations consistently outperformed their cross-lingual counterparts. We argue that special attention must be given to the order of choices, choice of prompt language, task variability, and generation tasks when creating LLM evaluation benchmarks.
Pinned Projects
Significant progress has been made in publicly available chest X-ray datasets for machine learning applications. However, most existing datasets are collected from limited regions, often excluding African representation.
To address this gap, we curated a dataset of 55,409 chest X-ray images from 48,962 patients, including 18,324 males, 30,387 females, and 260 individuals with undefined gender, retrospectively collected from 10 healthcare institutions in Ethiopia from studies performed between 2015 and 2024. The dataset includes 31,939 images paired with corresponding radiology reports and 11,806 images manually annotated by 11 radiology experts using a blinded review process. The annotations focus on localized findings, which are particularly relevant for regional disease patterns. This dataset, presented in both JPG and DICOM formats along with patient demographics and machine-readable radiology reports, provides a novel resource for developing machine-learning models tailored to underrepresented populations. This study aims to enhance global diagnostic accuracy and foster equitable chest diagnosis advancements by addressing gaps in chest X-ray data diversity and geographical representation.
In this project my role includes:
• Leading the data collection team.
• Preparing data collection guidelines based on the healthcare institutions' data management challenges.
• Preprocessing and standardizing the data into final forms.
• Developing annotation tools, creating annotation guidelines, and training/assisting radiologists with the annotation process.
• Analyzing the annotated data and creating a baseline model.
The dataset will be released very soon. We are currently writing the dataset paper. Stay tuned!
C#
SQL
ReactJS
Health
ERP
Chat bot
Upwork Testimonials