Abstract
Adverse drug reactions (ADRs) are a critical challenge for patient safety, with over 21,000 alerts reported in Peru in 2024. Current artificial intelligence (AI) models in pharmacovigilance remain limited in external validation, clinical scalability, and algorithmic transparency. This work proposes BioBERT-XGBoost, an interpretable hybrid model that combines biomedical natural language processing with supervised machine learning to predict ADRs. The architecture integrates BioBERT for semantic extraction of pharmacological entities with XGBoost as a calibrated classifier, trained on public datasets (DrugBank, openFDA–FAERS) and anonymized clinical records. The pipeline includes standardized preprocessing through normalized vocabularies, feature engineering with semantic embeddings, class imbalance handling, and probability calibration. Evaluation covers discrimination (AUROC, AUPRC), calibration (Brier score), and explainability (SHAP). The system is deployed on Microsoft Azure through a mobile application that generates risk-stratified clinical alerts, representing a step toward trustworthy clinical decision-support systems for proactive ADR detection.
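As a rough illustration of the discrimination and calibration metrics named in the abstract, the sketch below computes AUROC, a step-wise AUPRC estimate (average precision), and the Brier score from binary labels and predicted probabilities. It uses only the Python standard library; the function names and the four-sample toy data are assumptions made for illustration and are not taken from the paper, whose actual evaluation code is not shown here.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Step-wise AUPRC estimate: mean precision at the rank of each positive.

    Tied scores keep their input order (a simplification in this sketch).
    """
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("average precision needs at least one positive label")
    return total / hits


# Hypothetical toy data: 1 = ADR observed, 0 = no ADR.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]

print(f"AUROC: {auroc(labels, probs):.4f}")
print(f"AUPRC: {average_precision(labels, probs):.4f}")
print(f"Brier: {brier_score(labels, probs):.6f}")
```

AUROC and AUPRC depend only on how cases are ranked, while the Brier score also penalizes probabilities that are badly scaled, which is why a calibrated classifier is typically assessed on both kinds of metric.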
| Original language | English |
|---|---|
| Pages (from-to) | 107-122 |
| Number of pages | 16 |
| Journal | International Journal of Online and Biomedical Engineering |
| Volume | 22 |
| Issue number | 4 |
| State | Published - 10 Apr 2026 |
| Externally published | Yes |
Keywords
- adverse drug reactions (ADRs)
- biomedical natural language processing (NLP)
- clinical prediction models
- hybrid machine learning
- patient safety systems
- pharmacovigilance
- risk stratification
- semantic embeddings
Fingerprint
Dive into the research topics of 'BioBERT-XGBoost for Adverse Drug Reaction Prediction: An Interpretable Hybrid Model for Risk-Aware Pharmacovigilance'. Together they form a unique fingerprint.