BioBERT-XGBoost for Adverse Drug Reaction Prediction: An Interpretable Hybrid Model for Risk-Aware Pharmacovigilance

  • Alexandra Ramirez
  • Raul Pingo
  • Sandra Wong-Durand
  • Pedro Castañeda
  • Alejandra Oñate-Andino

  • Universidad Peruana de Ciencias Aplicadas
  • Universidad Nacional Toribio Rodríguez de Mendoza de Amazonas
  • Escuela Superior Politécnica de Chimborazo

Research output: Contribution to journal › Article › peer-review

Abstract

Adverse drug reactions (ADRs) are a critical challenge for patient safety, with over 21,000 alerts reported in Peru in 2024. Current artificial intelligence (AI) models in pharmacovigilance present limitations in external validation, clinical scalability, and algorithmic transparency. This work proposes BioBERT-XGBoost, an interpretable hybrid model that combines biomedical natural language processing with supervised machine learning to predict ADRs. The architecture integrates BioBERT for semantic extraction of pharmacological entities with XGBoost as a calibrated classifier, trained on public datasets (DrugBank, openFDA–FAERS) and anonymized clinical records. The pipeline includes standardized preprocessing through normalized vocabularies, feature engineering with semantic embeddings, class imbalance handling, and probability calibration. Evaluation uses discrimination metrics (AUROC, AUPRC), calibration (Brier score), and explainability (SHAP). The system is deployed on Microsoft Azure through a mobile application that generates risk-stratified clinical alerts, representing a step toward trustworthy clinical decision-support systems for proactive ADR detection.
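The evaluation metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. The labels and probabilities below are toy values invented for illustration, not results from the paper, and the function names are hypothetical rather than the authors' code. The sketch computes AUROC as the fraction of positive/negative pairs ranked correctly, AUPRC as step-wise average precision (assuming no tied scores), and the Brier score as the mean squared error of the predicted probabilities.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs where the positive scores higher; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Step-wise AUPRC estimate: mean precision at the rank of each positive (assumes no tied scores)."""
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy example (hypothetical data): 1 = ADR observed, 0 = no ADR.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

print(f"AUROC={auroc(y_true, y_prob):.3f}")
print(f"AUPRC={average_precision(y_true, y_prob):.3f}")
print(f"Brier={brier_score(y_true, y_prob):.4f}")
```

Discrimination metrics (AUROC, AUPRC) depend only on how cases are ranked, while the Brier score also penalizes probabilities that are poorly calibrated, which is why a calibrated classifier is usually assessed with both kinds of measure.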

Original language: English
Pages (from-to): 107-122
Number of pages: 16
Journal: International Journal of Online and Biomedical Engineering
Volume: 22
Issue number: 4
State: Published - 10 Apr 2026
Externally published: Yes

Keywords

  • adverse drug reactions (ADRs)
  • biomedical natural language processing (NLP)
  • clinical prediction models
  • hybrid machine learning
  • patient safety systems
  • pharmacovigilance
  • risk stratification
  • semantic embeddings
