Automated Processing of Electronic Data for Disease Surveillance

Emily Roberts, Rachelle Boulton, Josh Ridderhoff, Theron Jeppson



The objective of this abstract is to illustrate how the Utah Department of Health processes a high volume of electronic data in an automated way, using a series of rules engines that do not require human intervention.


National initiatives, such as Meaningful Use, are automating the detection and reporting of reportable disease events to public health, which has led to more complete, timely, and accurate public health surveillance data. However, electronic reporting has also led to significant increases in the number of cases reported to public health. For this data to be useful to public health, it must be processed and made available to epidemiologists and investigators in a timely fashion for intervention and monitoring. To meet this challenge, the Disease Control and Prevention Informatics Program (DCPIP) of the Utah Department of Health (UDOH) has developed the Electronic Message Staging Area (EMSA). EMSA is a system capable of automatically filtering, processing, and evaluating incoming electronic laboratory reporting (ELR) messages for relevance to public health, and entering those laboratory results into Utah’s integrated disease surveillance system (UT-NEDSS) without impacting the overall efficiency of UT-NEDSS or increasing the workload of epidemiologists.


After parsing and translating messages, EMSA runs them through a series of rules to determine whether a test result should update an existing UT-NEDSS event, create a new UT-NEDSS event, be archived for possible use in future cases (e.g., to help identify seroconversion), or be discarded. All of these rules can be configured specifically for each reportable condition. First, EMSA runs age-based rules: if the incoming message is too old for the indicated condition, EMSA stops processing and discards the message. EMSA then attempts person matching to determine whether the person reported in the ELR message matches a known person in UT-NEDSS. If the person matches, EMSA evaluates whether the laboratory result should be appended to an event associated with that person or should create a new event under that person; otherwise, the result may create a new person and event. This process occurs through two different rule sets: whitelist rules and test-specific rules. Whitelist rules are condition-specific and, when available, based on CDC's case definition guidelines; they determine when a new laboratory test result should be considered part of an existing case and when it should trigger a new event. Whitelist rules run against all existing events found for the matched person, and once a single event is matched, the more specific test result-based rules come into play. Within an event matched by the whitelist rules, another set of rules based on the test result, collection date, accession number, and test status determines whether to add the laboratory report to the event, update an existing laboratory report, or discard the laboratory report as a duplicate. The message also runs through rules based on test and test result, and sometimes on organism, that determine whether that result can be used to update the case at all.
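The routing order described above (age rule, person match, whitelist match, then duplicate detection within the matched event) can be sketched roughly as follows. This is an illustrative sketch only, not EMSA's actual code: the class names, the `ConditionRules` fields, the use of collection date as the measure of message age, and the single symmetric whitelist window around onset date are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Action(Enum):
    DISCARD = "discard"
    GRAYLIST = "hold in graylist"
    CREATE_EVENT = "create new event"
    ADD_LAB = "add lab to event"
    UPDATE_LAB = "update existing lab"


@dataclass
class Lab:
    accession: str
    test: str
    result: str
    collected: date
    status: str  # e.g. "preliminary" or "final"


@dataclass
class Event:
    condition: str
    onset: date
    labs: list = field(default_factory=list)


@dataclass
class ConditionRules:
    # Hypothetical per-condition settings; real values are not given in the abstract.
    max_message_age_days: int
    whitelist_window_days: int
    can_create: bool


def route(lab, condition, person_events, rules, today):
    """Decide what to do with one result.

    person_events is None when person matching found nobody,
    otherwise the list of that person's existing events.
    """
    # 1. Age-based rule: results too old for the condition are discarded.
    #    (Assumption: age is measured from the collection date.)
    if (today - lab.collected).days > rules.max_message_age_days:
        return Action.DISCARD

    # 2. Whitelist rule: look for one existing event of the same condition
    #    whose onset date is close enough to the collection date.
    matched = None
    for event in person_events or []:
        close = abs((lab.collected - event.onset).days) <= rules.whitelist_window_days
        if event.condition == condition and close:
            matched = event
            break

    if matched is None:
        # No event to update: create one if allowed, otherwise hold the result.
        return Action.CREATE_EVENT if rules.can_create else Action.GRAYLIST

    # 3. Within the matched event, compare against labs already attached.
    for existing in matched.labs:
        if existing.accession == lab.accession and existing.test == lab.test:
            same = (existing.result, existing.collected, existing.status) == (
                lab.result, lab.collected, lab.status)
            return Action.DISCARD if same else Action.UPDATE_LAB
    return Action.ADD_LAB
```

In this sketch an identical accession number, test, result, collection date, and status marks a duplicate, while the same accession number and test with a changed result or status is treated as an update (for example, a preliminary result becoming final).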
Whitelist rules also determine whether too much time has passed since the matching event occurred for the incoming laboratory result to be appended to it. Whitelist rules exist for both morbidity and contact events, and are based on timeframes such as onset date and treatment dates. If an incoming laboratory test result matches a known person in UT-NEDSS, and the whitelist rules determine that the laboratory result matches that person’s disease condition and can “update an existing event”, the laboratory result is run through another set of rules, called “test-specific rules”. Test-specific rules match incoming laboratory test results to a UT-NEDSS disease condition, and determine whether each unique test type and test result combination can “create a new event” and/or “update an existing event”. All tests that do not meet the criteria for inclusion into UT-NEDSS, either by updating an event or creating a new event, are held in EMSA for a period of 18 months in what is termed the “graylist”. When EMSA creates a new event, it queries the graylist to determine whether a previously reported laboratory result should be pulled and added to the new event. Graylist rules determine how far back EMSA is allowed to search for previous test results.
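As a rough illustration of the test-specific rules and the graylist, the sketch below models a lookup table of create/update permissions and a holding area with an 18-month retention period and a per-query lookback window. Everything here is hypothetical: the `TEST_RULES` entries, the `person_id` key, the 548-day approximation of 18 months, and the choice to measure retention from the date received are assumptions for the example, not details taken from EMSA.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical test-specific rules:
# (condition, test, result) -> (can_create_event, can_update_event)
TEST_RULES = {
    ("condition_x", "culture", "positive"): (True, True),
    ("condition_x", "antibody", "positive"): (False, True),
}

# Roughly 18 months; the exact day count is an assumption.
RETENTION = timedelta(days=548)


def allowed(condition, test, result):
    """Look up what a test/result combination may do; unknown combinations may do nothing."""
    return TEST_RULES.get((condition, test, result), (False, False))


@dataclass(frozen=True)
class GraylistedResult:
    person_id: str
    condition: str
    test: str
    result: str
    collected: date
    received: date


class Graylist:
    def __init__(self):
        self._items = []

    def __len__(self):
        return len(self._items)

    def hold(self, item):
        """Keep a result that could neither create nor update an event."""
        self._items.append(item)

    def purge(self, today):
        """Drop results held longer than the retention period."""
        self._items = [i for i in self._items if today - i.received <= RETENTION]

    def pull_for_new_event(self, person_id, condition, event_date, lookback_days):
        """Remove and return earlier results for this person and condition
        collected within the lookback window before the new event."""
        earliest = event_date - timedelta(days=lookback_days)
        pulled = [
            i for i in self._items
            if i.person_id == person_id
            and i.condition == condition
            and earliest <= i.collected <= event_date
        ]
        self._items = [i for i in self._items if i not in pulled]
        return pulled
```

The `lookback_days` argument plays the role of a graylist rule: a condition with a long window can pull older held results into a newly created event, which is how an earlier negative result might help identify seroconversion.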


From 10/10/2016 to 9/30/2017, the Utah Department of Health received a total of 995,486 electronic messages that required processing. Of those 995,486 messages, 23,787 (2.4%) were deleted, 17,839 (1.8%) were identified as duplicates and subsequently deleted, 853,853 (85.8%) were sent to graylist, and 99,657 (10%) were added to UT-NEDSS. Of the 99,657 messages added to UT-NEDSS, 85,705 (86%) were processed from raw electronic messages to assignment into UT-NEDSS without any human intervention.
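The percentages above follow directly from the reported counts. The short calculation below reproduces them; the variable names are only labels for this example, and the counts are copied from the text.

```python
# Counts as reported in the text.
total_messages = 995_486
counts = {
    "deleted": 23_787,
    "duplicates": 17_839,
    "graylist": 853_853,
    "added_to_ut_nedss": 99_657,
}
fully_automated = 85_705  # added to UT-NEDSS with no human intervention

# Share of all received messages in each category, to one decimal place.
shares = {name: round(100 * n / total_messages, 1) for name, n in counts.items()}

# Share of the messages added to UT-NEDSS that needed no human intervention.
automated_share = round(100 * fully_automated / counts["added_to_ut_nedss"])

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"fully automated: {automated_share}%")
```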


ELR improves the timeliness, completeness, and accuracy of laboratory reporting to public health, but often results in a significant increase in laboratory reporting to public health agencies. This increase in volume can overwhelm epidemiologists and investigators if every incoming ELR message must be reviewed manually before laboratory results are processed and entered into surveillance systems. To fully leverage the benefits of ELR for public health surveillance, we needed a highly automated process for receiving, parsing, translating, and entering data into UT-NEDSS that would mitigate the challenges associated with the increased volume. We developed EMSA and its series of rule sets to meet this challenge.



Online Journal of Public Health Informatics * ISSN 1947-2579 *