TY - JOUR
AU - Mowery, Jared
PY - 2016/12/28
Y2 - 2024/03/28
TI - Twitter Influenza Surveillance: Quantifying Seasonal Misdiagnosis Patterns
JF - Online Journal of Public Health Informatics
JA - OJPHI
VL - 8
IS - 3
SE - Original Articles
DO - 10.5210/ojphi.v8i3.7011
UR - https://ojphi.org/ojs/index.php/ojphi/article/view/7011
SP -
AB - <p><strong>Background:</strong> Influenza (flu) surveillance using Twitter data can potentially save lives and increase efficiency by providing governments and healthcare organizations with greater situational awareness. However, research is needed to determine the impact of Twitter users’ misdiagnoses on surveillance accuracy.</p><p><strong>Objective:</strong> This study establishes the importance of Twitter users’ misdiagnoses by showing that Twitter flu surveillance in the United States failed during the 2011-2012 flu season, estimates the extent of misdiagnoses, and tests several methods for reducing the adverse effects of misdiagnoses.</p><p><strong>Methods:</strong> Metrics representing flu prevalence, seasonal misdiagnosis patterns, diagnosis uncertainty, flu symptoms, and noise were produced using Twitter data in conjunction with OpenSextant for geo-inferencing, and a maximum entropy classifier for identifying tweets related to illness. These metrics were tested for correlations with World Health Organization (WHO) positive specimen counts of flu from 2011 to 2014.</p><p><strong>Results:</strong> Twitter flu surveillance erroneously indicated a typical flu season during 2011-2012, even though the flu season peaked three months late, and erroneously indicated plateaus of flu tweets before the 2012-2013 and 2013-2014 flu seasons. Enhancements based on estimates of misdiagnoses removed the erroneous plateaus and increased the Pearson correlation coefficients by .04 and .23, but failed to correct the 2011-2012 flu season estimate. A rough estimate indicates that approximately 40% of flu tweets reflected misdiagnoses.</p><p><strong>Conclusions:</strong> Further research into factors affecting Twitter users’ misdiagnoses, in conjunction with data from additional atypical flu seasons, is needed to enable Twitter flu surveillance systems to produce reliable estimates during atypical flu seasons.</p>
ER -