Amplity Insights Database to Determine the True Prevalence of Disease

 Exploring Real-World Evidence Data Sources

This is the second article in a series exploring Amplity Insights, the industry’s only database containing unstructured full-text transcripts of dictated physician notes from physician–patient encounters. While structured data sets from EMR systems serve an important business need and provide analytical value, they leave large gaps related to patient characteristics and treatment decisions. (To learn about the limitations of structured data sets from EMR systems, see Closing the Insight Gaps Left by Structured EMR Datasets.) Fortunately, dictated medical transcription records—unstructured data—when mined effectively using leading digital processes, provide a much richer on-the-ground view of patient care at a level of detail unavailable from any other source.


Despite the vast amount of information available from electronic medical record (EMR) and claims data sets, pharmaceutical companies still struggle to ascertain the true prevalence of underreported symptoms in many conditions. For example, while it is widely known that individuals with diabetes face various health challenges associated with hyperglycemia, it is less well known that many of them also suffer from recurrent hypoglycemia. Unfortunately, hypoglycemia in diabetes is often underreported due to patient unawareness or deliberate failure to report. This restricts the ability of healthcare professionals to manage treatment effectively.


Accurately tracking and treating hypoglycemia in people with diabetes is essential, since it can reduce quality of life by impairing sleep quality, causing anxiety and depression, and reducing productivity at work. Importantly, severe hypoglycemia can cause recurrent morbidity and predict mortality in patients with diabetes.


Case Study in Hypoglycemia

In a study testing the limits of typical claims-based and EMR data, the Amplity Insights database was recently used to determine the degree to which the prevalence of hypoglycemia was being underreported in these other medical data sources. The resulting report, Assessing Prevalence of Hypoglycemia in a Medical Transcription Database, was published in Diabetes, Metabolic Syndrome and Obesity: Targets and Therapy, an international peer-reviewed medical journal. The study used the Amplity Insights database to determine the prevalence of hypoglycemia among patients with type 1 or type 2 diabetes mellitus (T1DM, T2DM) and compared those findings with results from traditional administrative claims databases.


In the paper’s introduction, the authors note 2 important limitations of traditional claims databases. First, in any single medical encounter, the number of health concerns discussed may well exceed the maximum of 4 ICD-9/ICD-10 codes reportable on a standard reimbursement claim. As a result, certain voiced concerns may not be recorded in the claims database. Second, many EMR platforms employ restrictive pull-down menus and limited options for entering free text, further limiting a provider’s ability to capture a patient’s complete health status. With these limitations, it stands to reason that the prevalence of hypoglycemia as reported by administrative claims databases is very low, with estimates of 4% for patients with T1DM and 1% to 3% for patients with T2DM.


The investigators hypothesized that “in a qualitative and rich unstructured database where providers are not restricted to entering information directly tied to a billing claim nor forced to use designated drop-down boxes for diagnoses or codes, instances of hypoglycemia events (based upon spoken word and/or symptom intersection) captured would be higher than what is reported in more structured databases.”


The study identified patients within the Amplity Insights database who had at least 1 medical transcript mentioning diabetes during the study, ultimately including records of 41,688 patients with T1DM and 317,399 patients with T2DM. Amplity Insights analyzed these transcripts using its proprietary natural language processing (NLP) algorithms, through which meaningful data end points were extracted and evaluated to identify patient encounters that mentioned keywords or concepts related to hypoglycemia.
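To make the idea of flagging encounters by keyword concrete, here is a minimal Python sketch. It is not Amplity's proprietary NLP, whose details are not described in this article. The keyword pattern, the function names, and the sample transcripts are all hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical keyword pattern (an assumption for illustration only);
# the study's actual NLP concepts are proprietary and not listed in the article.
HYPO_PATTERN = re.compile(
    r"\b(hypoglyca?emi[ac]|low blood (sugar|glucose))\b", re.IGNORECASE
)


def flag_patients(transcripts):
    """Map each patient ID to True if any of that patient's transcripts matches."""
    flagged = defaultdict(bool)
    for patient_id, text in transcripts:
        hit = bool(HYPO_PATTERN.search(text))
        flagged[patient_id] = flagged[patient_id] or hit
    return dict(flagged)


def prevalence(flagged):
    """Share of patients with at least one flagged transcript."""
    return sum(flagged.values()) / len(flagged) if flagged else 0.0


# Invented sample data: (patient ID, transcript text) pairs.
SAMPLE = [
    ("p1", "Patient reports two episodes of low blood sugar overnight."),
    ("p1", "Follow-up for T2DM; A1c stable."),
    ("p2", "T1DM, no complaints today."),
    ("p3", "Discussed hypoglycemic symptoms after exercise."),
    ("p4", "Hyperglycemia noted; adjusting insulin."),
]

if __name__ == "__main__":
    flags = flag_patients(SAMPLE)
    print(flags)
    print(f"Prevalence: {prevalence(flags):.0%}")
```

A plain keyword match like this would also flag negated mentions such as "denies hypoglycemia", which is one reason production systems rely on fuller NLP rather than pattern matching alone.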


In the final analysis, Amplity’s NLP algorithms estimated that the prevalence of hypoglycemia was 18% among patients with T1DM and 8% among patients with T2DM. These assessments show the prevalence of hypoglycemia to be 2- to 9-fold higher than the 1% to 4% prevalence estimates suggested by claims database analyses.
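As a back-of-the-envelope check on the fold difference, the rounded percentages quoted in this article can be divided directly. The sketch below is illustrative only: the function and variable names are hypothetical, and the published range may rest on unrounded figures that are not reproduced here.

```python
# Rounded prevalence figures (in percent) as quoted in this article.
NLP_PREVALENCE = {"T1DM": 18, "T2DM": 8}
# Claims-based estimates as (low, high) percent; T1DM is a single 4% estimate.
CLAIMS_PREVALENCE = {"T1DM": (4, 4), "T2DM": (1, 3)}


def fold_range(nlp, claims_low, claims_high):
    """Return (smallest, largest) fold difference between NLP and claims estimates."""
    return nlp / claims_high, nlp / claims_low


if __name__ == "__main__":
    for cohort, nlp in NLP_PREVALENCE.items():
        low, high = CLAIMS_PREVALENCE[cohort]
        smallest, largest = fold_range(nlp, low, high)
        print(f"{cohort}: {smallest:.1f}- to {largest:.1f}-fold higher")
```

With these rounded inputs the ratios fall between roughly 2.7 and 8, which sits inside the 2- to 9-fold range reported above.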


The study authors conclude that the use of the Amplity Insights database and NLP improved the capture of hypoglycemic events that were likely missed or undocumented in data sources such as administrative claims databases or EMRs. The results suggest that “hypoglycemia events are most likely occurring more often and are being discussed more often than analyses of other databases indicate.”



While structured data sets derived from EMR systems provide value at scale and for certain analyses, they capture patient care incompletely. Dictated medical transcription records (unstructured data), when mined effectively using digital processes such as NLP, enrich structured EMR-based data sets with an on-the-ground view of patient care.


In a study specifically designed to use the Amplity Insights database to determine the prevalence of hypoglycemia among patients with T1DM or T2DM, the frequency was found to be 2- to 9-fold higher than the 1% to 4% estimates suggested by claims database analyses.


Amplity Insights provides a more complete view of the true prevalence of disease, patient–provider interactions, and providers’ treatment rationale. To learn more about how Amplity Insights can help in your situation, visit Amplity Insights, or choose Contact Us to begin a conversation.


How We Can Help
Amplity Insights is a provider of medical data sets mined from unstructured text transcriptions of physician–patient interactions. Ours is the only database containing full-text transcripts of dictated physician notes from physician–patient encounters, providing data, analytics, and insights not available from any other single source. Amplity Insights offers a direct view into the treatment of patients and the rationale of providers while remaining fully HIPAA compliant. Our data set covers the United States and includes more than 150,000 individual multi-specialty healthcare providers and 50 million patient records. What’s more, we’re adding 2 million new records every month. To learn more, visit