TY  - JOUR
AU  - Sharpe, Danielle
AU  - Hopkins, Richard
AU  - Cook, Robert L.
AU  - Striley, Catherine W.
PY  - 2017/05/02
Y2  - 2024/03/28
TI  - Using a Bayesian Method to Assess Google, Twitter, and Wikipedia for ILI Surveillance
JF  - Online Journal of Public Health Informatics
JA  - OJPHI
VL  - 9
IS  - 1
SE  - Novel algorithms, statistical or mathematical methods
DO  - 10.5210/ojphi.v9i1.7604
UR  - https://ojphi.org/ojs/index.php/ojphi/article/view/7604
SP  - 
AB  - Objective: To comparatively analyze Google, Twitter, and Wikipedia by evaluating how well change points detected in each web-based source correspond to change points detected in CDC ILI data. Introduction: Traditional influenza surveillance relies on reports of influenza-like illness (ILI) by healthcare providers, capturing individuals who seek medical care and missing those who may search, post, and tweet about their illnesses instead. Existing research has shown some promise in using data from Google, Twitter, and Wikipedia for influenza surveillance, but findings conflict, and studies have evaluated these web-based sources only individually or in pairs, without comparing all three [1-5]. A comparative analysis of all three web-based sources is needed to determine which performs best and could be considered as a complement to traditional methods. Methods: We collected publicly available, de-identified data from the CDC ILINet system, Google Flu Trends, HealthTweets.org, and Wikipedia for the 2012-2015 influenza seasons. Bayesian change point analysis was used to detect change points, or seasonal changes, in each of the web-data sources for comparison with change points in CDC ILI data. All analyses were conducted using the R package ‘bcp’ v4.0.0 in RStudio v0.99.484. Sensitivity and positive predictive values (PPV) were then calculated. Results: During the 2012-2015 influenza seasons, Google had a high sensitivity of 92% and a PPV of 85%. Twitter had a low sensitivity of 50% and a low PPV of 43%. Wikipedia had the lowest sensitivity, 33%, and the lowest PPV, 40%. Conclusions: Google had the best combination of sensitivity and PPV in detecting change points that corresponded with change points found in CDC data. Overall, change points in Google, Twitter, and Wikipedia data occasionally aligned well with change points captured in CDC ILI data, yet these sources did not detect all changes in CDC data, which could indicate limitations of the web-based data or signify that the Bayesian method is not adequately sensitive. These three web-based sources need to be further studied and compared using other statistical methods before being incorporated as surveillance data to complement traditional systems. Figure 1. Detection of change points, 2012-2013 influenza season. Figure 2. Detection of change points, 2013-2014 influenza season. Figure 3. Detection of change points, 2014-2015 influenza season.
ER  - 