Negasi Haile

Software Engineer | Data Scientist | Digital Healthcare Solutions | Health AI

I'm a software engineer with 4 years of experience developing enterprise-level applications, mainly focused on digital healthcare solutions. My recent work includes implementing Ambulatory Glucose Profile (AGP) reports, which visualize complex glucose data from CGM devices, and building a Diabetes Intervention System, a healthcare data server management dashboard, and an HR system for healthcare institutions.

As a data scientist with more than 2 years of part-time experience, I have processed medical images (X-ray, CT, MRI) and radiology reports, visualized glucose pattern insights, and created baseline models. I have also evaluated machine translation systems on patient discharge prescriptions and provided feedback to improve system performance.

This exposure has deepened my interest in digital healthcare transformation. I aim to improve healthcare delivery and support physician decision-making by analyzing healthcare data with advanced machine learning techniques and digitalization solutions.

Pinned Publications

Afro Chest X-ray: Chest X-ray Imaging Dataset for Multiple Cardio-respiratory Diseases in Ethiopia

ICML 2025

Paper Writing

The Chest X-ray Imaging Dataset for Multiple Cardio-respiratory Diseases in Ethiopia (Afro Chest X-ray for short) is a project funded by the LacunaFund whose aim is to close the gap in health disparities by fostering interdisciplinary collaborations that create, expand, or aggregate labeled training and evaluation datasets.

Cardio-respiratory diseases (cardiovascular and respiratory diseases) are recognized as serious, worldwide public health concerns that have remained among the leading causes of death globally. There are not many publicly available datasets from Africa, which makes it difficult to determine whether tools and techniques developed in other geographies are as effective in our context. In this project, we propose to create a labeled chest X-ray dataset for multiple cardio-respiratory diseases in Ethiopia. We will publish the dataset as open source. We believe this dataset will stimulate researchers and practitioners in Africa and beyond to push the limits of current methods, adapt them to the African context, and build assistive technologies that could empower the scarce radiologists.

ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding

NAACL 2025

Published

With the rapid development of evaluation datasets to assess LLMs' understanding across a wide range of subjects and domains, identifying a suitable language understanding benchmark has become increasingly challenging. In this work, we explore LLM evaluation challenges for low-resource language understanding and introduce ProverbEval, an LLM evaluation benchmark for low-resource languages based on proverbs, which focuses on low-resource language understanding in culture-specific scenarios. We benchmark various LLMs and explore factors that create variability in the benchmarking process. We observed performance variances of up to 50%, depending on the order in which answer choices were presented in multiple-choice tasks. Native language proverb descriptions significantly improve tasks such as proverb generation, contributing to improved outcomes. Additionally, monolingual evaluations consistently outperformed their cross-lingual counterparts. We argue that special attention must be given to the order of choices, choice of prompt language, task variability, and generation tasks when creating LLM evaluation benchmarks.
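
The choice-order effect mentioned above can be probed by re-asking each multiple-choice item under every ordering of its answer choices and comparing accuracy across orderings. The following is a minimal Python sketch for illustration only, not the ProverbEval evaluation code. The names `accuracy_range_over_orders` and `first_choice_model` are hypothetical, and the toy "model" simply picks whatever choice is listed first, which is an assumed stand-in for a position-biased LLM.

```python
from itertools import permutations


def accuracy_range_over_orders(items, answer_fn):
    """Return (min, max) accuracy over all orderings of the answer choices.

    items: list of (question, choices, correct_choice) tuples; every item
           is assumed to have the same number of choices.
    answer_fn: callable (question, ordered_choices) -> chosen choice text.
    """
    n_choices = len(items[0][1])
    accuracies = []
    # Apply the same positional permutation to every item, then score.
    for order in permutations(range(n_choices)):
        correct = 0
        for question, choices, answer in items:
            reordered = [choices[i] for i in order]
            if answer_fn(question, reordered) == answer:
                correct += 1
        accuracies.append(correct / len(items))
    return min(accuracies), max(accuracies)


def first_choice_model(question, choices):
    """Toy stand-in for a position-biased model: always picks the first choice."""
    return choices[0]


# Hypothetical toy items; the correct answer sits at a different index in each.
toy_items = [
    ("q1", ["a", "b", "c"], "a"),
    ("q2", ["x", "y", "z"], "y"),
]
lowest, highest = accuracy_range_over_orders(toy_items, first_choice_model)
print(f"accuracy ranges from {lowest:.0%} to {highest:.0%} across orderings")
```

A wide gap between the lowest and highest accuracy suggests that the score depends on choice order rather than on understanding. Enumerating every permutation grows factorially with the number of choices, so for items with many choices a random sample of orderings would be a more practical design.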

Pinned Projects

AfroCXR: Afro Chest X-ray

Health

The Chest X-ray Imaging Dataset for Multiple Cardio-respiratory Diseases in Ethiopia (Afro Chest X-ray for short) is a project funded by the LacunaFund whose aim is to close the gap in health disparities by fostering interdisciplinary collaborations that create, expand, or aggregate labeled training and evaluation datasets.

Significant progress has been made in publicly available chest X-ray datasets for machine learning applications. However, most existing datasets are collected from limited regions, often excluding African representation.

To address this gap, we curated a dataset of 55,409 chest X-ray images from 48,962 patients, including 18,324 males, 30,387 females, and 260 individuals with undefined gender, retrospectively collected from 10 healthcare institutions in Ethiopia and covering studies from 2015 to 2024. The dataset includes 31,939 images paired with corresponding radiology reports and 11,806 images manually annotated by 11 radiology experts using a blinded review process. The annotations focus on localized findings, which are particularly relevant for regional disease patterns. This dataset, presented in both JPG and DICOM formats along with patient demographics and machine-readable radiology reports, provides a novel resource for developing machine-learning models tailored to underrepresented populations. This study aims to enhance global diagnostic accuracy and foster equitable advancements in chest diagnosis by addressing gaps in chest X-ray data diversity and geographical representation.

In this project my role includes:

• Leading the data collection team.

• Preparing data collection guidelines based on the healthcare institutions' data management challenges.

• Preprocessing and standardizing the data into final forms.

• Developing annotation tools, creating annotation guidelines, and training/assisting radiologists with the annotation process.

• Analyzing the annotated data and creating a baseline model.

The dataset will be released very soon. We are currently writing the dataset paper. Stay tuned!

Python

Pydicom

dicom2jpg

Ambulatory Glucose Profile (AGP) report

Health

This web app visualizes Continuous Glucose Monitoring (CGM) data as an Ambulatory Glucose Profile (AGP) report to support diabetes management. It collects real-time glucose data from LibreView and presents it in charts that give patients and clinicians actionable insights. Key features include Time In Range (TIR) analysis, which shows the percentage of time a patient’s glucose levels stay within the target range, and Glucose Metrics, which offer statistical insights such as average glucose and variability. The project also includes the Ambulatory Glucose Profile (AGP), which visualizes glucose trends and patterns over 14 days, and the Daily Glucose Profile, which shows day-to-day glucose fluctuations. Together, these tools aim to enhance decision-making, optimize treatment, and improve overall diabetes care.
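
The Time In Range and Glucose Metrics figures described above reduce to simple arithmetic over the CGM readings. The following is a minimal Python sketch for illustration only, not the project's actual code. The function name `glucose_summary`, the 70 to 180 mg/dL default target range, and the use of the coefficient of variation as the variability measure are assumptions. It also assumes evenly spaced readings, so that the share of readings in a band equals the share of time spent in it.

```python
from statistics import mean, pstdev


def glucose_summary(readings_mg_dl, low=70, high=180):
    """Summarize CGM readings (mg/dL) into time-in-range and basic metrics.

    low/high: assumed target range bounds, inclusive.
    """
    if not readings_mg_dl:
        raise ValueError("at least one reading is required")

    n = len(readings_mg_dl)
    in_range = sum(1 for g in readings_mg_dl if low <= g <= high)
    below = sum(1 for g in readings_mg_dl if g < low)
    above = n - in_range - below

    avg = mean(readings_mg_dl)
    sd = pstdev(readings_mg_dl)  # population standard deviation

    return {
        "time_in_range_pct": 100 * in_range / n,
        "time_below_range_pct": 100 * below / n,
        "time_above_range_pct": 100 * above / n,
        "mean_glucose": avg,
        "cv_pct": 100 * sd / avg,  # variability relative to the mean
    }


# Hypothetical readings, for illustration only.
summary = glucose_summary([60, 100, 150, 200])
print(summary["time_in_range_pct"], summary["mean_glucose"])
```

In a 14-day AGP report the same function would be applied to all readings in the window, while a daily profile would apply it to each day's readings separately.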

JavaScript

ReactJS

ChartJS

DEMER HR

Health

Healthcare Human Resource Management System is a comprehensive solution designed to streamline the operations of healthcare institutions. It centralizes key processes such as patient management, inventory tracking, billing, and staff scheduling, enabling seamless coordination across departments. The system enhances operational efficiency by integrating modules for electronic medical records (EMR), appointment scheduling, and financial management. It provides real-time data insights, aiding in informed decision-making and compliance with healthcare regulations. With features tailored for the healthcare industry, this ERP system ensures improved patient care, optimized resource utilization, and robust administrative control, making it an essential tool for modern healthcare institutions.

NestJS

ReactJS

PostgreSQL

FMCMS

ERP

The Fiscal Machine Controlling and Management System (FMCMS) tracks the process of importing and selling POS machines in Ethiopia. The project implements 9 user types, each with its own privileges. It is built on the MERN stack (MongoDB, Express JS, React JS, and Node JS).

MERN

Bootstrap

react-pdf

EthioAI Hub

AI Hub

EthioAI HUB is a comprehensive web platform dedicated to archiving and preserving AI-related works from Ethiopia. The platform serves as a centralized repository for Machine Learning models, datasets, research papers, and other resources related to Artificial Intelligence within the Ethiopian context. The goal of the HUB is to document and showcase the growing body of AI research and innovation emerging from Ethiopia, while also making these resources easily accessible to researchers, students, and AI practitioners both locally and internationally. Whether you're looking for datasets in Amharic, Tigrinya, and Oromo languages, or seeking cutting-edge AI models trained for Ethiopian-specific challenges, EthioAI HUB is your go-to resource for exploring the diverse and impactful AI developments coming out of Ethiopia.

NextJS

Tailwind CSS

HornChat

Chat bot

HornChat is a chatbot that combines Large Language Models (LLMs) with translation systems to overcome the linguistic barriers faced by low-resource languages, specifically Horn of Africa languages. Our mission is to provide seamless assistance so that users can communicate with Large Language Models like ChatGPT in their own language.

OpenAI

Lesan.ai

Chatbot UI

Upwork Testimonials