Researchers find flaws in algorithm used to identify atypical medication orders

Can algorithms identify unusual medication orders or profiles more accurately than humans? Not necessarily. A study coauthored by researchers at the Université Laval and CHU Sainte-Justine in Montreal found that one model deployed to flag atypical medication orders performed poorly on individual orders. It’s a cautionary tale about the use of AI and machine learning in medicine, where unvetted technology has the potential to negatively impact outcomes.

Pharmacists review lists of active medications — i.e., pharmacological profiles — for inpatients under their care. This process aims to identify medications that could be abused, but most medication orders don’t show drug-related problems. Publications from over a decade ago illustrate the potential of technology to help pharmacists streamline workflows like order review, but while more recent research has investigated AI’s potential in pharmacology, few studies have demonstrated efficacy.

The coauthors of this latest work looked at a model deployed in a tertiary-care mother-and-child academic hospital between April 2020 and August 2020. The model was trained on a dataset of 2,846,502 medication orders from 2005 to 2018 extracted from a pharmacy database and preprocessed into 1,063,173 profiles. Before data collection, the model was retrained every month on the most recent ten years of data from the database to minimize drift, which occurs when a model loses its predictive power over time.

Pharmacists at the academic hospital rated medication orders in the database as “typical” or “atypical” before observing the predictions; patients were evaluated only once to minimize the risk of including profiles that the pharmacists had previously evaluated. Atypical prescriptions were defined as those that didn’t correspond to the usual prescribing patterns, according to the pharmacist’s expertise, while profiles were considered atypical if at least one medication order within them was labeled as atypical.
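The profile-labeling rule described above (a profile counts as atypical if any one of its orders does) can be sketched in a few lines of Python. This is only an illustration: the function name, the profile IDs, and the sample labels are hypothetical and are not taken from the study's code or data.

```python
from collections import defaultdict


def label_profiles(order_labels):
    """Return {profile_id: True/False}, where True means the profile is atypical.

    order_labels is an iterable of (profile_id, is_atypical) pairs, one per
    medication order. A profile is atypical if at least one order in it is.
    """
    profiles = defaultdict(bool)  # profiles start out as typical (False)
    for profile_id, is_atypical in order_labels:
        profiles[profile_id] = profiles[profile_id] or is_atypical
    return dict(profiles)


# Hypothetical pharmacist ratings: profile "A" has one atypical order, "B" has none.
orders = [("A", False), ("A", True), ("B", False), ("B", False)]
profile_labels = label_profiles(orders)
print(profile_labels)
```

One consequence of this rule is that a single atypical order is enough to flag an entire profile, so a profile-level label says that something is unusual without saying which order it is.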

The model’s profile predictions were then shown to the pharmacists, who indicated whether they agreed or disagreed with each prediction. In all, 12,471 medication orders and 1,356 profiles were shown to 25 pharmacists from seven of the academic hospital’s departments, mostly from obstetrics-gynecology.

The researchers report that the model exhibited poor performance with respect to medication orders, attaining an F1-score of 0.30 (the score ranges from 0 to 1, and lower is worse). On the other hand, the model’s profile predictions achieved “satisfactory” performance with an F1-score of 0.59.
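For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision (the share of flagged items that really are atypical) and recall (the share of atypical items that get flagged). The sketch below computes it from raw counts; the function name and the example counts are made up for illustration and are not the study's figures.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Compute the F1-score (harmonic mean of precision and recall) from counts."""
    flagged = true_positives + false_positives           # everything the model flagged
    actually_atypical = true_positives + false_negatives  # everything it should have flagged
    if flagged == 0 or actually_atypical == 0:
        return 0.0
    precision = true_positives / flagged
    recall = true_positives / actually_atypical
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 6 correct flags, 2 false alarms, 6 missed atypical orders.
# Precision is 6/8 = 0.75 and recall is 6/12 = 0.5.
example_f1 = f1_score(true_positives=6, false_positives=2, false_negatives=6)
print(round(example_f1, 2))
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1-score by flagging almost nothing or by flagging almost everything.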

One reason might be a lack of representative data; research has shown that biased diagnostic algorithms may perpetuate inequalities. A team of scientists recently found that almost all eye disease datasets come from patients in North America, Europe, and China, meaning eye disease-diagnosing algorithms are less certain to work well for racial groups from underrepresented countries. In another study, Stanford University researchers claimed that most of the U.S. data for studies involving medical uses of AI come from California, New York, and Massachusetts.

Cognizant of this, the coauthors of this study say they don’t believe the model could be used as a standalone decision support tool. However, they believe it could be combined with rules-based approaches to identify medication order issues independent of common practice. “Conceptually, presenting pharmacists with a prediction for each order should be better because it identifies clearly which prescription is atypical, unlike profile predictions which only inform the pharmacist that something is atypical within the profile,” they wrote. “Although [our] focus groups indicated a lack of trust in order predictions by pharmacists, they were satisfied to use them as a safeguard to ensure that they did not miss unusual orders. This leads us to believe that even moderately improving the quality of these predictions in future work could be beneficial.”
