Comparing and Contrasting Two ESSENCE Syndrome Definition Query Methods

Zachary M. Stein, Sophia Crossen

Abstract


Objective

To compare and contrast two ESSENCE syndrome definition query methods and establish best practices for syndrome definition creation.

Introduction

The Kansas Syndromic Surveillance Program (KSSP) utilizes the ESSENCE v.1.20 program provided by the National Syndromic Surveillance Program to view and analyze Kansas Emergency Department (ED) data.
Methods that allow an ESSENCE user to query both the Discharge Diagnosis (DD) and Chief Complaint (CC) fields simultaneously allow for more specific and accurate syndromic surveillance definitions. As ESSENCE use has increased, two common methods have been developed for querying the data in this way.
The first is a query of the field named “CC and DD.” This field contains a concatenation of the parsed patient chief complaint and the discharge diagnosis. The discharge diagnosis is the last non-null value for that patient visit ID, and the parsed chief complaint is the first non-null chief complaint value for that patient visit ID, as parsed by the ESSENCE platform. For this comparison, this method is called the CCDD method.
The second method involves a query of the fields named “Chief Complaint History” and “Discharge Diagnosis History.” While the first method requires only one field to be queried, this method queries the CC History and DD History fields, combines the resulting data, and de-duplicates the final data set by the C_BioSense_ID. Chief Complaint History is a list of all chief complaint values related to a single ED visit, and Discharge Diagnosis History is the equivalent list of all discharge diagnosis values. For this comparison, this method is called the CCDDHX method.
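The difference between the two methods can be illustrated with a minimal sketch on toy data. Everything here is an assumption made for illustration: the record layout, the field names other than C_BioSense_ID, the separators inside the fields, and the use of a plain case-insensitive substring match in place of an actual ESSENCE free-text query.

```python
# Toy visit records. Field names, separators, and values are illustrative
# assumptions, not the actual ESSENCE schema (only C_BioSense_ID is named
# in the text).
visits = [
    # First CC mentions the term, so both methods should find this visit.
    {"C_BioSense_ID": "A1", "CCDD": "FIREWORK BURN HAND | W39.XXXA",
     "CC_History": "firework burn hand", "DD_History": "W39.XXXA"},
    # Term appears only in a later CC update, so it is absent from CCDD.
    {"C_BioSense_ID": "A2", "CCDD": "HAND PAIN | S60.9",
     "CC_History": "hand pain; firework injury", "DD_History": "S60.9"},
    # Unrelated visit.
    {"C_BioSense_ID": "A3", "CCDD": "COUGH | J20.9",
     "CC_History": "cough", "DD_History": "J20.9"},
]


def matches(text, term):
    """Case-insensitive substring match (stand-in for an ESSENCE query)."""
    return term.lower() in (text or "").lower()


def ccdd_method(records, term):
    """Query the single concatenated CC and DD field."""
    return {r["C_BioSense_ID"] for r in records if matches(r["CCDD"], term)}


def ccddhx_method(records, term):
    """Query each history field separately, combine the hits, and
    de-duplicate by visit ID."""
    cc_hits = [r["C_BioSense_ID"] for r in records
               if matches(r["CC_History"], term)]
    dd_hits = [r["C_BioSense_ID"] for r in records
               if matches(r["DD_History"], term)]
    return set(cc_hits + dd_hits)  # set() removes duplicate visit IDs


ccdd_ids = ccdd_method(visits, "firework")
ccddhx_ids = ccddhx_method(visits, "firework")
```

In this toy example, visit A2 is returned only by the history-based query, because the matching text appears in a later chief complaint update rather than in the first non-null value.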
Although both methods are based on the same query concept, they can yield different results.

Methods

A program was created in R Studio to analyze a user-provided query.
Twenty simple queries were randomly generated and run through the R Studio program, and disparities between the resulting data sets were recorded. These queries were run on all KSSP production facility ED visits during August 2017.
Second, three queries actively used in KSSP practice were run through the program: Firework-Related Injuries, Frostbite and Cold Exposure, and Rabies Exposure. These queries were run on all KSSP production facility ED visits over time periods that coincided with the relevant exposures.
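The comparison step described above can be sketched as a pair of set differences over visit IDs. This is not the authors' R Studio program; it is a hypothetical Python illustration, and the function name `compare_methods` and the returned keys are assumptions. Each percentage uses that method's own case count as the denominator, matching how the disparities are described in the Results.

```python
def compare_methods(ccdd_ids, ccddhx_ids):
    """Summarize the disparity between two sets of visit IDs.

    Hypothetical helper: each argument is the collection of visit IDs
    returned by one query method for the same query and time period.
    """
    ccdd_ids, ccddhx_ids = set(ccdd_ids), set(ccddhx_ids)
    only_ccdd = ccdd_ids - ccddhx_ids      # captured by CCDD alone
    only_ccddhx = ccddhx_ids - ccdd_ids    # captured by CCDDHX alone

    def pct(part, whole):
        # Share of a method's own cases that the other method missed.
        return 100.0 * len(part) / len(whole) if whole else 0.0

    return {
        "shared": len(ccdd_ids & ccddhx_ids),
        "only_ccdd": sorted(only_ccdd),
        "only_ccddhx": sorted(only_ccddhx),
        "pct_unique_ccdd": pct(only_ccdd, ccdd_ids),
        "pct_unique_ccddhx": pct(only_ccddhx, ccddhx_ids),
    }


# Usage with made-up visit IDs.
summary = compare_methods({"A", "B", "C", "D"}, {"B", "C", "D", "E", "F"})
```

The IDs listed under `only_ccdd` and `only_ccddhx` are the cases that would then need manual review to decide whether they are true positives.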

Results

In the random query trials, an average of 5.4% of the cases captured using the CCDD field method were unique and not captured by the same query in the CCDDHX method. Using the CCDDHX method, an average of 6.1% of the cases captured were unique and not captured by the CCDD method.

When the program was used to compare syndromes actively used in KSSP practice, the disparity between the two methods was much lower.

Firework-Related Injuries
During the time period queried, the CCDD method returned 171 cases and the CCDDHX method returned 169 cases. All CCDDHX method cases were captured by the CCDD method. The CCDD method returned 2 cases not captured by the CCDDHX method. These two cases were confirmed as true positive firework-related injury cases.

Frostbite and Cold Exposure
During the time period queried, the CCDD method returned 328 cases and the CCDDHX method returned 344 cases. The CCDDHX method captured 16 cases that the CCDD method did not. The CCDD method did not capture any cases that the CCDDHX method missed. After review, 10 (62.5%) of the 16 cases not captured by the CCDD method were true positive cases.

Rabies Exposure
During the time period queried, the CCDD method returned 474 cases and the CCDDHX method returned 473 cases. The CCDDHX method captured 7 cases that the CCDD method did not. The CCDD method returned 8 cases not captured by the CCDDHX method. After review, 3 (42.9%) of the 7 cases unique to the CCDDHX method were true positives, and 3 (37.5%) of the 8 cases unique to the CCDD method were true positives.
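The counts reported for the three syndromes can be cross-checked with simple arithmetic: subtracting each method's unique cases from its total should give the same number of shared cases from either side. The short sketch below uses only the figures reported above; the dictionary layout and key names are assumptions made for illustration.

```python
# Reported counts per syndrome: total cases per method and the cases
# captured by one method but not the other.
reported = {
    "Firework-Related Injuries":
        {"ccdd": 171, "ccddhx": 169, "only_ccdd": 2, "only_ccddhx": 0},
    "Frostbite and Cold Exposure":
        {"ccdd": 328, "ccddhx": 344, "only_ccdd": 0, "only_ccddhx": 16},
    "Rabies Exposure":
        {"ccdd": 474, "ccddhx": 473, "only_ccdd": 8, "only_ccddhx": 7},
}

shared = {}
for name, r in reported.items():
    from_ccdd = r["ccdd"] - r["only_ccdd"]        # cases both methods found
    from_ccddhx = r["ccddhx"] - r["only_ccddhx"]  # same quantity, other side
    # The two derivations must agree if the reported counts are consistent.
    assert from_ccdd == from_ccddhx, name
    shared[name] = from_ccdd
```

Under this check, the two methods share 169, 328, and 466 cases for the three syndromes respectively.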

Conclusions

The twenty random queries showed a disparity between methods. When the same program was used to analyze three actively used KSSP definitions, the two methods yielded similar results with a much smaller disparity. The CCDDHX method inherently requires more steps and more queries to be run through ESSENCE, making it less timely and more difficult to share. Despite these downsides, the CCDDHX method will capture cases whose matching text appears at any point in the history of field updates.

Further variance between methods is likely due to the CCDD method using the ESSENCE-processed CC while the CCDDHX method uses the CC verbatim as produced by the ED facility. This allows the CCDD method to tap into the spelling correction and abbreviation-parsing steps that ESSENCE employs, but incorrect machine corrections and replacements, while rare, can negatively affect syndrome definition performance.

The greater disparity between methods for the random queries may be due to the short (3-letter) text portion of the queries. Short segments are more likely than the text of actual queries to be found within multiple unrelated words. Using longer randomly generated text segments may resolve this and is a planned next step for this research.
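The effect of segment length can be illustrated with a small sketch. The vocabulary, the helper names, and the lowercase-letter alphabet used for random segments are all assumptions for illustration; the original queries and their generation procedure are not reproduced here.

```python
import random
import string


def random_segment(length, rng):
    """Generate a random lowercase text segment (hypothetical generator)."""
    return "".join(rng.choices(string.ascii_lowercase, k=length))


def words_containing(segment, vocabulary):
    """Return the words in which the segment appears as a substring."""
    return [w for w in vocabulary if segment in w]


# Made-up vocabulary of words that might appear in chief complaints.
vocabulary = ["ear", "earache", "heart", "fear", "tear",
              "frostbite", "firework"]

# A short segment matches inside several unrelated words...
short_hits = words_containing("ear", vocabulary)
# ...while a longer, query-like term matches only what was intended.
long_hits = words_containing("frostbite", vocabulary)

# Example of drawing a 3-letter segment with a seeded generator.
segment = random_segment(3, random.Random(0))
```

Here the 3-letter segment matches five of the seven words, while the longer term matches one, which is the kind of over-matching that could inflate the disparity seen in the random trials.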

Our next step is to share the R Studio program to allow further replication. The Kansas Syndromic Surveillance Program is also continuing similar research to ensure that best practices are being met.

 




DOI: http://dx.doi.org/10.5210/ojphi.v10i1.8355



Online Journal of Public Health Informatics * ISSN 1947-2579 * http://ojphi.org