Automated customer service at the National Library of Medicine
First Monday

Automated customer service at the National Library of Medicine by Terry T. Ahmed, Carolyn Willard, and Marcia Zorn

The National Library of Medicine (NLM) launched a virtual customer service representative (vRep) named “Cosmo” in February 2003. Cosmo is a navigation tool that guides users to information on NLM Web pages. These Web pages contain information on programs, products and services provided by NLM. Cosmo directs users to information on consumer health, drug information, medical database instruction, grant information, online catalog access, exhibit details and onsite library information. Medical librarians manage the Cosmo knowledge base to link to this information. To ensure that we meet customer needs, staff review the user “conversation” log daily and modify answers as needed. This paper describes the lessons learned during Cosmo’s development and may help others who create and maintain a virtual representative.


Why a Virtual Representative?
Knowledge Base and Content
How It Works
Updating the Knowledge Base
How Cosmo Performs
Lessons Learned




The National Library of Medicine (NLM), on the campus of the National Institutes of Health in Bethesda, Maryland, is the world’s largest medical library. The Library is probably best known for producing the MEDLINE database, which indexes articles from nearly 5,000 biomedical journals. NLM also produces numerous other databases and catalogs for researchers and historians, including PubMed, IndexCat, and TOXNET. NLM databases provide information on monographs, audiovisual materials, and such specialized subjects as toxicology, environmental health, and molecular biology. Since 1998, NLM has developed resources for the general public, including MedlinePlus, a comprehensive consumer health portal in both English and Spanish, and other consumer health portals such as NIH SeniorHealth and Tox Town. Some other resources produced by NLM are the Unified Medical Language System (UMLS) and the Visible Human Project.

NLM has a centralized customer service staff that answers questions about NLM resources such as MedlinePlus and PubMed. Staff also answer questions about all of the services of the Library, e.g., interlibrary loans, Library grants, exhibits, Library tours, and hours of operation. From October 2004 to September 2005, NLM Customer Service responded to 73,493 inquiries from off–site locations (phone and e–mail), compared to 59,634 inquiries from October 2000 to September 2001.

Virtual representatives (vReps) are new but are becoming popular as librarians try to help users who want information anytime and anywhere [1]. Digital/virtual reference service allows access to information using asynchronous and synchronous methods such as e–mail and customized programming [2]. NLM needed second–generation digital reference to cater to “virtual patrons” who look for sources of quick information [3]. We consider Cosmo a second–generation product at NLM.

vReps are appearing more commonly on the Internet as organizations use them to respond to users’ most frequently asked questions. These products usually have human or character likenesses with names such as Nicole, Phyllis, Kate or Bill. Coca Cola® uses a cartoon personality known as Hank. Ask Jeeves® uses a butler named Jeeves [4]. They are used in addition to, or as replacements for, call center staff and e–mail replies to customers [2]. NLM chose a wise owl, named Cosmo, as our character icon.

Figure 1: Cosmo Icon

Cosmo emulates person–to–person customer service interactions 24 hours a day, seven days a week. Powered by NeuroServer software (by Verity, Inc., formerly NativeMinds, Inc.), Cosmo uses conversational dialog and rules–based responses [5]. However, we have found that most users ask two to three questions and leave the page. They are not “conversing” in an extended way with Cosmo.



Why a Virtual Representative?

The NLM makes a variety of content accessible to health professionals and the public through the Internet [6]. The content of the NLM Web site is vast and diffuse, and can be difficult for users to navigate. To locate content, users navigate from NLM Web pages or use keywords in the search engine.

Cosmo offers a third choice by allowing users to ask natural language questions.

Cosmo provides:

  • a simple navigation tool;
  • an alternative for people who prefer using natural language to search for information;
  • 24/7 access;
  • a way to gather usage statistics (see Appendix);
  • a means for more efficient response to Frequently Asked Questions; and,
  • a more anonymous customer service method than e–mail or phone.

Cosmo’s rules–based scripting and pattern matching link users to relevant content (predetermined by NLM staff) on the NLM Web site. When Cosmo cannot find a suitable link, it provides a form to send the question to the NLM Customer Service staff.

For example, to find the Library’s hours by navigating from the main Web site, users would start at the NLM homepage, then click the “About the National Library of Medicine” link, and click “Visiting the NLM”.

Alternatively, users could enter terms like “library,” “hours,” or “library” and “open” in the “Search NLM Web Site” box. Cosmo, by contrast, answers questions like “Are you open the day after Thanksgiving?” with a link to the answer.

Users can ask a question and receive an immediate answer from Cosmo.

  • How do I get to the library?
  • How much does it cost to park there?
  • Where can I find older medical journal articles?
  • How can I participate in a clinical trial?
  • Tell me about your history.

Cosmo answers health questions by linking to the consumer health information in MedlinePlus. As information on NLM Web sites is updated, Cosmo links to that updated source of information.



Knowledge Base and Content

The software developers and the NLM Office of Computer and Communication Systems (OCCS) provided initial support by creating thousands of synonyms from the MedlinePlus health topics and drug information databases for text matching. The software developers gave a two–day course in which several NLM staff members learned the scripting language and how to create responses and manage the content. These staff members then trained new members of the Cosmo project team and installed the vRep software on their computers.

During planning, the project team used these NLM tools to build a knowledge base for Cosmo:

  • Frequently Asked Questions (FAQs);
  • Fact Sheets;
  • MedlinePlus health topics;
  • MedlinePlus drug information;
  • PubMed searching instructions; and,
  • General NLM information Web pages.



How It Works

Librarians create answers as backward literature searches; that is, they deduce how a user might ask for information and write a script that “fires” when that question is asked.

One of the first topics created for Cosmo tells the NLM hours. The user sees:

Figure 2: Cosmo Greeting

If the user types “hours,” the response is

“For NLM’s hours, see the Web pages listed below. For holiday changes and Emergency Closings: Call 1–888–FINDNLM (1–888–346–3656) or 301–594–5983.”

Cosmo then offers the following links:

Another frequently asked question is “Where can I find lists of medical meetings and conferences?” Because of the scripting by librarians, Cosmo anticipates different variations of this question.

Any of the following variations will link to a page of information about medical conferences:

“I am interested in medical conferences.”
“Can you tell me about upcoming medical conferences?”
“Medical meetings”
“Medical conferences”
“Are there any resources for conferences here?”
“Can I find medical conference resources at the NLM?”
“Does the NLM have schedules of medical conferences around the country?”
“Do you have info on medical meetings?”
“Where can I find information on medical conferences?”
“Where can I find lists of medical meetings and conferences?”

In these examples the key phrases “medical conference” and “medical meetings” are mapped so that no matter how the user enters the question, Cosmo directs them to medical conference information. Cosmo does this with pattern matching. A complete sentence creates the best match because it provides context as well as content. Ideally, Cosmo recognizes a complete sentence as a standard question or statement category.
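The mapping of key phrases to one topic can be illustrated with a minimal Python sketch. It is not Cosmo’s scripting language, which this article does not show; the phrase list and the function names are assumptions chosen for illustration, and plain substring matching stands in for the vendor’s pattern matching.

```python
import re

# Hypothetical key phrases for the conference topic. The first two come from
# the examples above; "conferences" is an assumed extra entry.
CONFERENCE_PHRASES = ("medical conference", "medical meeting", "conferences")


def normalize(text: str) -> str:
    """Lowercase the text and collapse punctuation into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def matches_conference_topic(question: str) -> bool:
    """Return True when any key phrase occurs in the normalized question."""
    normalized = normalize(question)
    return any(phrase in normalized for phrase in CONFERENCE_PHRASES)
```

In this sketch, each of the ten variations listed above matches the topic, while an unrelated question such as “What are your hours?” does not.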

Cosmo is not a search engine and requires special scripting for retrieval. Users who treat it as a search engine, by using Boolean operators or by entering keywords or incomplete sentences that are not scripted to match, may find the results unpredictable. At best, Cosmo will recognize a keyword that is scripted to match a document or Web site.



Updating the Knowledge Base

Each time a user interacts with Cosmo, the system stores the conversation in a log. Each day, the Cosmo reviewers read the previous day’s log (see Appendix) in order to:

  • alter incorrect responses;
  • study user conversation habits;
  • suggest new entry terms for medical conditions (i.e., build the knowledge base); and,
  • collect data for statistics.

The reviewer assigns each question a user asks to one of nine categories:

  • Correct;
  • Incorrect;
  • Out of Scope for Cosmo;
  • Out of Scope for NLM;
  • Out of Scope (answer misleading);
  • Add New Term;
  • I Can’t Understand You;
  • Insults/Inanities; and,
  • Questions about Cosmo.

The reviewer records the statistical information and adjusts the scripts for incorrect responses. The project manager then updates the “live” vRep file on the NLM site.
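The record–keeping step could be sketched in Python as follows. The category names come from the list above. The function names are hypothetical, and the definition of “in scope” used here (correct plus incorrect answers) is an assumption, since the article does not spell out how the percentage is computed.

```python
from collections import Counter

# The nine review categories, as listed above.
CATEGORIES = (
    "Correct",
    "Incorrect",
    "Out of Scope for Cosmo",
    "Out of Scope for NLM",
    "Out of Scope (answer misleading)",
    "Add New Term",
    "I Can't Understand You",
    "Insults/Inanities",
    "Questions about Cosmo",
)


def tally(labels):
    """Count reviewed questions per category, rejecting unknown labels."""
    labels = list(labels)
    unknown = set(labels) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    counts = Counter(labels)
    return {category: counts.get(category, 0) for category in CATEGORIES}


def in_scope_accuracy(counts):
    """Share of correct answers among Correct + Incorrect (assumed definition)."""
    in_scope = counts["Correct"] + counts["Incorrect"]
    return counts["Correct"] / in_scope if in_scope else None
```

A reviewer’s labels for one day’s log would be passed to `tally`, and the resulting counts to `in_scope_accuracy`.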



How Cosmo Performs

In the first year, Cosmo answered 86 percent of “In Scope” questions (i.e., those questions that Cosmo should answer) correctly.

Cosmo receives about 25 questions a day including medical questions, questions about NLM products and services, clinical trials questions and collection questions (see Appendix).

Figure 3: Total In Scope Questions



Lessons Learned

Some of the challenges we discovered during our first year with our vRep:

  • Communicating clearly with multiple audiences;
  • Scripting for similar but different questions;
  • Explaining Cosmo’s limitations; and,
  • Encouraging customer use while the knowledge base is in development.

Clear Communication with Multiple Audiences

NLM Web site information content must communicate with a variety of audiences (librarians, health professionals, students, researchers, and the public). However, information mainly of interest to librarians (NLM outreach programs, grant information, on–site services, interlibrary loan, database searching assistance, citation error information, Fact Sheets, and Frequently Asked Questions) may also answer questions from the public. We are sensitive to library “jargon” and are able to restate information in “plain language” understood by the public.

Scripting for Similar yet Different Questions

Scripting Cosmo to answer Frequently Asked Questions isn’t difficult. Scripting Cosmo to recognize and answer similar, yet different, questions is more complicated. For example, for a question about herpes (e.g., What is herpes?) Cosmo suggests a MedlinePlus page for herpes. For a question about whether NLM has books on herpes, Cosmo recognizes both herpes and books, and links to the NLM catalog to find books and journals in general, not just about herpes. If the user wants to research the medical literature on herpes, Cosmo directs them to search PubMed/MEDLINE. So we have one subject (herpes), but three different interpretations of that subject: general information on herpes, books on herpes, and research articles on herpes. This means that three different scripts are needed to direct users to a suitable answer.
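The herpes example can be sketched in Python as one subject combined with three intents. The cue words, destination labels, and function name below are assumptions for illustration only; they are not taken from Cosmo’s actual scripts.

```python
# Hypothetical cue words that signal an intent beyond general information.
INTENT_CUES = {
    "books": ("book", "books", "catalog"),
    "research": ("research", "articles", "literature", "studies"),
}

# Destinations named in the paragraph above.
DESTINATIONS = {
    "general": "MedlinePlus health topic page",
    "books": "NLM catalog",
    "research": "PubMed/MEDLINE",
}


def route(question: str, subject: str = "herpes"):
    """Pick a destination for a question that mentions the subject."""
    words = question.lower().replace("?", " ").replace(".", " ").split()
    if subject not in words:
        return None  # another script would have to handle this question
    for intent, cues in INTENT_CUES.items():
        if any(cue in words for cue in cues):
            return DESTINATIONS[intent]
    return DESTINATIONS["general"]
```

The point of the sketch is that the subject alone does not determine the answer; each intent needs its own rule, which mirrors the need for three separate scripts.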

When asked a question, Cosmo produces an answer or a hyperlink to a Web page that contains the answer. Sometimes the answer may not answer the question correctly because Cosmo interprets the question by how the user asks it.

An unavoidable issue is how to deal with the word order of user questions, which can make a difference in a correct response. This is similar to the need to script for similar but different questions. For example, a publisher who wants to submit a journal for inclusion in MEDLINE asks “How to get a journal listed in Medline?” Cosmo interprets the question as “Where can I get a list of the Journals in Medline” because it matches the words “journal”, “listed” and “MEDLINE”. Proactive scripting for word order is therefore difficult; Cosmo gets “smarter” as it is used more and reviewers correct such mismatches. Homographs are another challenge. A word like “delivery” could mean childbirth or could mean the delivery of pharmaceuticals.
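A short Python sketch shows why matching on words alone confuses these two questions. The stopword list and the crude stemming rule are assumptions made for the illustration; Cosmo’s real matching rules are not described in this article.

```python
import re

# Hypothetical stopwords ignored during matching.
STOPWORDS = {"how", "to", "get", "a", "in", "where", "can", "i", "of", "the"}


def stem(word: str) -> str:
    """Very crude stemming for illustration: strip a trailing 'ed' or 's'."""
    for suffix in ("ed", "s"):
        if word.endswith(suffix) and len(word) > 4:
            return word[: -len(suffix)]
    return word


def key_terms(question: str) -> set:
    """Reduce a question to an unordered set of stemmed content words."""
    words = re.findall(r"[a-z]+", question.lower())
    return {stem(word) for word in words if word not in STOPWORDS}


publisher = key_terms("How to get a journal listed in Medline?")
list_seeker = key_terms("Where can I get a list of the Journals in Medline")
same_terms = publisher == list_seeker  # both reduce to journal, list, medline
```

Because the set of terms discards word order, both questions look identical to the matcher, and a separate rule that looks at order or at extra cue words would be needed to tell them apart.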

Explaining Cosmo’s Limits

Cosmo does not work like Google or PubMed. It either matches a response in the knowledge base, or it does not match and tells you that it doesn’t understand.

This does not leave room for error. Misspelled words are a challenge. For example, when a user asks “What is the latest treatment for assthma?”, the user gets a default “I can’t understand you” message. If asthma were spelled correctly, Cosmo would link to the correct MedlinePlus page. People who use Web search engines, such as Google, may expect Cosmo to recognize the misspelling and ask “Do you mean asthma?” Cosmo does not have that capability, but we can, and often do, script for likely misspellings. Since we cannot anticipate all misspellings, we hope to implement a spell–checker in the future.
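As a rough illustration of what a spelling suggestion might look like, the Python sketch below uses the standard library’s difflib module to find a known term close to a misspelled word. This is not a feature of Cosmo or of the NeuroServer software; the term list, the similarity cutoff, and the function name are all assumptions.

```python
import difflib
import re

# Hypothetical list of terms the knowledge base already recognizes.
KNOWN_TERMS = ["asthma", "herpes", "hives", "hours", "pubmed", "umls"]


def suggest_spelling(question: str, cutoff: float = 0.8):
    """Return a known term that closely resembles an unrecognized word, or None."""
    for word in re.findall(r"[a-z]+", question.lower()):
        if word in KNOWN_TERMS:
            continue  # already spelled in a way the knowledge base knows
        close = difflib.get_close_matches(word, KNOWN_TERMS, n=1, cutoff=cutoff)
        if close:
            return close[0]
    return None


suggestion = suggest_spelling("What is the latest treatment for assthma?")
```

A suggestion such as this could feed a “Do you mean asthma?” prompt. A fairly high cutoff is used so that ordinary words like “your” are not mistaken for “hours”.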

Reviewers decide on a case–by–case basis whether to adjust the scripting to account for misspelled words and new phrases. For example, a user typed only “2005 funding.” Cosmo interpreted the question to be asking about funding for treatment. In a case like this, we must decide whether to add these phrases to the knowledge base.

Initially, Cosmo was designed to respond only to complete sentences, for example, “What are your hours?” or “Are my children allowed in the library?” or “How do I access PubMed?” or “I need information on asthma,” instead of “hours” or “children” or “PubMed” or “asthma.” We tried to educate users on how to ask Cosmo questions. A default script activated anytime someone typed a one– or two–keyword search (unless it was obvious what they were looking for). This script told users to try asking a question rather than typing keywords. This attempt to educate produced mixed results. Many users continued to enter search terms and their questions were not answered. We later altered the scripting so that Cosmo can be used more like a search engine, as well as responding to natural language.
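The original default script for short keyword input might be sketched in Python as follows. The prompt text is the default answer quoted in the Appendix; the keyword table, its destination labels, and the function name are hypothetical.

```python
ELABORATE = (
    "Can you elaborate for me? "
    "Please enter your terms in the form of a sentence."
)

# Hypothetical keywords considered "obvious" enough to answer directly.
SCRIPTED_KEYWORDS = {"hours": "NLM hours page", "pubmed": "PubMed help page"}


def respond_to_short_input(text: str):
    """Handle one- or two-word input; return None for longer input."""
    words = text.lower().strip(" ?.!").split()
    if 0 < len(words) <= 2:
        for word in words:
            if word in SCRIPTED_KEYWORDS:
                return SCRIPTED_KEYWORDS[word]
        return ELABORATE
    return None  # longer input goes on to the normal sentence scripts
```

Under this sketch, “hours” is answered directly, while an input such as “2005 funding” receives the request to rephrase as a sentence.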

Encouraging Use While the Knowledge Base is Still in Development

Cosmo gets better when we discover wrong answers to reasonable questions. For example, the question “How can I access online journals from off–site?” helped us craft an appropriate answer that matches the question. But when the question was first asked, Cosmo interpreted it to be “How many journals or books do you own?”

To encourage users to ask Cosmo a question before writing or calling Customer Services, we put the link on the Contact Us page, which is one click from the NLM homepage. We will also ask the user, “Have you checked Cosmo for an answer?” on our customer service request form in the future.




Virtual representatives have a place in reference services and introduce new opportunities as well as challenges for librarians, users, and vendors. While some users need human assistance to formulate a question and to find an appropriate answer, Cosmo is a nice solution for answering basic and repetitive questions. With continuous improvement by librarians who know the customers, Cosmo can perform at a sophisticated level.


About the authors

Terry T. Ahmed, Carolyn Willard, and Marcia Zorn are librarians in the Reference and Web Services Section of the National Library of Medicine in Bethesda, Maryland.



1. R. Balleste, 2004. “Intelligent systems: The world of AI for libraries [abstract],” In: C. Nixon and J. Burmood (compilers). 19th Annual Computers in Libraries 2004 (10–12 March), Washington, D.C. Medford N.J.: Information Today, pp. 43–44.

2. A. Zanin–Yost, 2004. “Digital Reference: What the Past Has Taught Us and What the Future Will Hold,” Library Philosophy and Practice, volume 7, number 1 (Fall), accessed 6 November 2006.

3. P. Jacsó, 2003. “Virtual Reference Service and Disservice,” Computers in Libraries, volume 23, number 4 (April), accessed 6 November 2006.

4. S. Coffman and S. McGlamery, 2000. “The librarian and Mr. Jeeves,” American Libraries, volume 31, number 5 (May), pp. 66–69.

5. M. Moore, T. Ahmed, and A. Glazer, 2004. “Using an automated knowledge agent for reference and customer service,” Journal of the Medical Library Association, volume 92, number 2 (April), pp. 271–273.

6. D.A.B. Lindberg, 2000. “Internet access to the National Library of Medicine,” Effective Clinical Practice, volume 3, number 5 (September/October), pp. 256–260, accessed 6 November 2006; E.M. Lacroix and R. Mehnert, 2002. “The US National Library of Medicine in the 21st century: expanding collections, nontraditional formats, new audiences,” Health Information and Libraries Journal, volume 19, number 3 (September), pp. 126–132.




Appendix

The following is a breakdown of the nine statistical categories developed for Cosmo, with examples of each category.

Category 1: Correctly answered questions

  • Example: What are your hours?
  • Answer: The Library offers different hours on different days ...
  • Example: Can you correct my misspelled name in PubMed?
  • Answer: See the information on the Web pages below for the policy about name misspellings.
  • Example: I need information on hives.
  • Answer: You mentioned "hives," click on the link below to see a related Web page.

Action: No action needed. The answer is correct and this information is logged into a spreadsheet.

Category 2: Incorrectly answered questions.

  • Example: What are your hours?
  • Answer: PubMed is a World Wide Web search system, produced by NLM’s National Center for Biotechnology Information ...
  • Example: How can I get a grant?
  • Answer: To determine your eligibility for DOCLINE, please refer to the DOCLINE Eligibility Guidelines at ...
  • Example: Hives? (If Cosmo cannot answer this, it would be wrong since hives is a Topic in MedlinePlus)
  • Answer: I didn’t recognize what you just typed. I can only answer a question about a National Library of Medicine product or service. If you’re looking for a disease or drug, check your spelling; or click on the button below to send an e–mail to the Library’s Customer Service. (default answer)

Action: A librarian examines topics and determines why the questions were not answered correctly. Then scripting is modified to give the correct answer the next time the question is asked.

Category 3: Add term to Health Topic pattern list.

  • Example: What is Melkersson–Rosenthal Syndrome?
  • Answer: I was created to respond to questions about the National Library of Medicine. You may be asking me a question that is beyond my knowledge. (default answer)
  • Example: I’d like to find out about canker sores.
  • Answer: I was created to respond to questions about the National Library of Medicine. You may be asking me a question that is beyond my knowledge. (default answer)

Action: The MedlinePlus health topic pages are checked for the terms “Melkersson–Rosenthal Syndrome” and “canker sore.” No MedlinePlus topics exist for those illnesses, but they exist under the broader term “mouth disorders”; so they are added to the “mouth disorders” pattern list. The next time users ask about canker sores, they will be directed to the mouth disorders information page in MedlinePlus.
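The “Add New Term” action can be pictured with a small Python sketch in which each topic has a pattern list of entry terms. The data structure and function names are assumptions for illustration, and substring matching stands in for Cosmo’s own pattern matching.

```python
# Hypothetical pattern lists: topic name -> phrases that should trigger it.
pattern_lists = {"mouth disorders": {"mouth disorder"}}


def normalize(text: str) -> str:
    """Lowercase and map en dashes to hyphens so both spellings match."""
    return text.lower().replace("\u2013", "-")


def add_terms(topic: str, terms) -> None:
    """Add new entry terms to a topic's pattern list."""
    pattern_lists.setdefault(topic, set()).update(normalize(t) for t in terms)


def find_topic(question: str):
    """Return the first topic whose pattern list has a phrase in the question."""
    text = normalize(question)
    for topic, phrases in pattern_lists.items():
        if any(phrase in text for phrase in phrases):
            return topic
    return None
```

Before the reviewer calls `add_terms("mouth disorders", ["canker sore", "Melkersson-Rosenthal syndrome"])`, a question about canker sores finds no topic; afterwards, the same question resolves to the mouth disorders topic.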

Category 4: Out of scope for Cosmo (but NLM can answer this question).

  • Example: At the beginning of the Civil War, the U.S. medical staff consisted of how many people?
  • Answer: I didn’t recognize what you just typed. I can only answer a question about a National Library of Medicine product or service. If you’re looking for a disease or drug, check your spelling; or click on the button below to send an e–mail to the Library’s Customer Service. (default answer)
  • Example: Correspondence session of January issue of Gastroenterology is missing from PubMed. Where is it?
  • Answer: I didn’t recognize what you just typed. I can only answer a question about a National Library of Medicine product or service. If you’re looking for a disease or drug, check your spelling; or click on the button below to send an e–mail to the Library’s Customer Service. (default answer)

Action: No action needed. These are questions that cannot be answered by Cosmo because of their complexity; however, the NLM staff can answer these questions. When Cosmo produces a default answer to a question it cannot answer, the user is given an option to send an e–mail to NLM Customer Service staff.

Category 5: Out of scope for NLM

  • Example: Who is God?
  • Answer: I was created to respond to questions about the National Library of Medicine. You may be asking me a question that is beyond my knowledge.
  • Example: How many people work in the White House?
  • Answer: I was created to respond to questions about the National Library of Medicine. You may be asking me a question that is beyond my knowledge.

Action: No action needed. These questions are out of scope for NLM.

Category 6: Out of scope with misleading answer.

  • Example: Who is the leader of Russia?
  • Answer: Donald A.B. Lindberg, M.D. is the Director of the National Library of Medicine. More information is available from the Web pages listed below.
  • Example: Where can I find an online currency converter?
  • Answer: There is no fee to use the National Library of Medicine and no charge to register for a library card…

Action: We attempt to correct examples like those above. Although the questions are out of scope for NLM, they cause Cosmo to produce an answer from its knowledge base. The term “leader” was in a pattern list with the synonym “director,” which combined with “Lindberg” and triggered the above answer. We take steps to prevent this from happening. For example, we add a “not” condition to the “Lindberg” topic using a special operator, a word, or another pattern list, such as CAT__NATIONS_N, which lists all the countries in the world.
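The effect of such a “not” condition can be sketched in Python. The synonym set and the short nation list below are stand–ins invented for the example; only the name CAT__NATIONS_N comes from the text, and its real contents and syntax are not shown in this article.

```python
import re

# Hypothetical synonyms that trigger the topic about the NLM Director.
LEADER_SYNONYMS = {"leader", "director", "head"}

# Small stand-in for a pattern list such as CAT__NATIONS_N.
NATIONS = {"russia", "france", "canada"}


def director_topic_fires(question: str, exclude_nations: bool = True) -> bool:
    """Decide whether the Director topic should answer this question."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    if not words & LEADER_SYNONYMS:
        return False
    if exclude_nations and words & NATIONS:
        return False  # the "not" condition blocks the misleading answer
    return True
```

Without the exclusion, “Who is the leader of Russia?” would trigger the Director answer; with it, the question falls through to a default response instead.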

Category 7: Question not understood.

  • Example: What are your hurs?
  • Answer: I was created to respond to questions about the National Library of Medicine. You may be asking me a question that is beyond my knowledge.
  • Example: Databases.
  • Answer: Can you elaborate for me? Please enter your terms in the form of a sentence.
  • Example: Where can I find information on the UUMLS?
  • Answer: I am unable to answer your question. What product or service of NLM are you interested in?

Action: None needed. This category captures user input that perplexes Cosmo, which fails to understand it because of misspelling or poor syntax. As in the examples above, the user is given the option either to reenter the question or to send an e–mail.

Category 8: Inanities/Insults


  • What good are you?
  • Hello.
  • How are you?
  • You make me sick.
  • I think you’re useless.
  • Why don’t you understand me?
  • You are no help.
  • Thanks.
  • That’s not answering my question.

Action: No action needed. A default response addresses some of these questions, and we decided not to devote too much time to scripting answers for this area. Cosmo is not designed to be as conversational as some vReps.

Category 9: About Cosmo


  • Who made you?
  • Who programmed you?
  • How much did you cost?
  • What are you?
  • What’s a vRep?
  • Are you chat?

Action: As users ask questions about Cosmo, we provide them with answers about how it works and why it’s available on the Web site.



Editorial history

Paper received 21 April 2006; accepted 15 October 2006.

Copyright ©2006, First Monday.

Copyright ©2006, Terry T. Ahmed, Carolyn Willard, Marcia Zorn.

Automated customer service at the National Library of Medicine by Terry T. Ahmed, Carolyn Willard, and Marcia Zorn
First Monday, volume 11, number 11 (November 2006),

A Great Cities Initiative of the University of Illinois at Chicago University Library.

© First Monday, 1995-2020. ISSN 1396-0466.