Pioneering Data Mining Techniques to Predict Health Hazards

Dr. Farrokh Alemi and Dr. Sanja Avramovic, professors in CHHS's Department of Health Administration and Policy, in collaboration with colleagues from the Systems Engineering and Operations Research department and the Veterans Affairs Medical Center, have created a new model to predict the development of anemia. Physicians are looking for new ways to diagnose anemia, a serious disease that affects more than 3 million Americans, a number that is growing. Anemia is characterized by reduced oxygen in the blood supply and can be caused by a number of factors, including iron, folate, and B12 deficiencies. Detecting anemia in its early stages is important because of its significant health impacts, including reduced cognitive function, low birthweight, and death.

Alemi, Avramovic, and colleagues developed and tested a model that takes into account time-dependent, interacting, and repeating risk factors to predict disease outcomes. According to Alemi, "Our model provides a tool to assist clinicians in anticipating the development of anemia. This could help them determine the need for further monitoring or intervention and potentially lead to an earlier diagnosis than with traditional methods."

Previous studies have used small sample sizes and a narrow set of specific risk factors, which has limited how well their findings generalize. The new study differs in two important ways. First, the researchers used a nationally representative sample drawn from the Department of Veterans Affairs Informatics and Computing Infrastructure (VINCI), the most complete repository of administrative and medical records in the United States. Second, they included all risk factors for anemia in the model, even those that were rare but could be predictive.

This research holds promise for transforming health care delivery and reducing costs. According to Avramovic, "The model we created is important because it demonstrates the possibility of predicting the development of a disease, like anemia, based on a patient's medical history. However, even when we use the model we should be careful not to jump to premature conclusions, because no model can explain every situation."

What comes next? The researchers recommend further studies to verify the accuracy of the model in real-time clinical settings. To read more, view the full article in Big Data.