Abstract
We present the results of a content analysis of asthma-related Tweets that were manually annotated for several content categories, including Experiencer (Self vs. Other, plus finer-grained distinctions), Medication, Symptoms, Non-English, Information, and Triggers. We used this annotated corpus to train machine learning classifiers on unigram and bigram models of the Tweet text in order to categorize Tweets automatically according to the annotation scheme. We find that the unigram model best predicts Tweets' categorization. We suggest that Twitter combined with NLP may provide a valuable tool for monitoring chronic conditions such as asthma.
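To make the unigram/bigram distinction concrete, the following is a minimal sketch of how a Tweet classifier over word n-grams might be set up. It is not the pipeline used in this study: the abstract does not name the classifier, so a multinomial Naive Bayes with add-one smoothing is assumed here, and the example Tweets, the two labels ("self", "other"), and the names `ngrams` and `NaiveBayes` are all hypothetical illustrations rather than items from the annotated corpus.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n):
    """Return the word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed classifier)."""

    def __init__(self, n):
        self.n = n                                  # 1 = unigrams, 2 = bigrams
        self.class_counts = Counter()               # Tweets seen per label
        self.feature_counts = defaultdict(Counter)  # n-gram counts per label
        self.vocab = set()                          # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for gram in ngrams(text, self.n):
                self.feature_counts[label][gram] += 1
                self.vocab.add(gram)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each n-gram.
            score = math.log(count / total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for gram in ngrams(text, self.n):
                score += math.log((self.feature_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy Tweets for illustration only; not drawn from the study's data.
train_texts = [
    "my asthma is acting up again tonight",
    "i need my inhaler right now",
    "my brother has asthma and uses an inhaler",
    "her son was diagnosed with asthma",
]
train_labels = ["self", "self", "other", "other"]

unigram_model = NaiveBayes(n=1).fit(train_texts, train_labels)
bigram_model = NaiveBayes(n=2).fit(train_texts, train_labels)

print(unigram_model.predict("i need my inhaler"))
print(bigram_model.predict("i need my inhaler"))
```

The only difference between the two models is the value of `n` passed to the feature extractor, which is one way to compare unigram and bigram representations on the same annotated data. Bigram features are sparser than unigram features on short texts such as Tweets, which may be one reason a unigram model can perform better, although this sketch does not test that.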
Authors own copyright of their articles appearing in the Online Journal of Public Health Informatics. Readers may copy articles without permission of the copyright owner(s), as long as the author and OJPHI are acknowledged in the copy and the copy is used for educational, not-for-profit purposes. Share-alike: when posting copies or adaptations of the work, release the work under the same license as the original. For any other use of articles, please contact the copyright owner. The journal/publisher is not responsible for subsequent uses of the work, including uses infringing the above license. It is the author's responsibility to bring an infringement action if so desired by the author.