Medical claims data are collected for payment purposes. However, these data are often used for other purposes, such as studying quality of care, assessing provider performance, and measuring health. These data are a rich resource for health services research, but when they do not include key pieces of information we can find ourselves bending over backward to develop (or use) algorithms to fill in the missing pieces. Algorithms used in one study can take on a life of their own and soon become the de facto standard. In our paper, recently published in Medical Care, we revisit commonly used algorithms for assigning patients to a primary care physician (PCP) and assess how well these algorithms work in a sample of Medicare beneficiaries with diabetes.
Continuity of care from a single PCP is important for patients with multiple chronic conditions, but it can be difficult to identify a patient's PCP in the medical claims datasets frequently used for research. We compared commonly used algorithms for selecting a PCP against the PCP listed in the electronic health record and found that claims-based algorithms can misclassify a patient's PCP in as many as 75% of cases, particularly for vulnerable patients.
Researchers and payers have developed a variety of algorithms to infer a patient's primary physician or responsible healthcare provider from claims data. These algorithms typically assign patients to the healthcare provider who delivers the majority or plurality of services. When an algorithm does not identify a single physician, researchers sometimes apply tie-breakers, such as the amount billed or the time from first to last visit, to maximize attribution.
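To make the majority/plurality distinction concrete, here is a minimal sketch of this style of attribution logic in Python. The data layout (a list of `(provider_id, amount_billed)` pairs per patient) and the function name are hypothetical illustrations, not the algorithms evaluated in the paper; the tie-breaker shown uses total amount billed, one of the tie-breakers mentioned above.

```python
from collections import Counter

def attribute_pcp(visits, rule="plurality"):
    """Assign a patient to one provider from their visit history.

    visits: list of (provider_id, amount_billed) pairs, one per visit.
    rule: "majority" requires a provider to account for >50% of visits;
    "plurality" takes the most-visited provider, breaking ties by
    total amount billed. (Hypothetical sketch, not the paper's code.)
    """
    counts = Counter(provider for provider, _ in visits)
    total = sum(counts.values())

    if rule == "majority":
        provider, n = counts.most_common(1)[0]
        # Patient is left unattributed when no provider clears 50%.
        return provider if n > total / 2 else None

    # Plurality: find all providers tied at the top visit count.
    top_count = max(counts.values())
    tied = [p for p, n in counts.items() if n == top_count]
    if len(tied) == 1:
        return tied[0]
    # Tie-breaker: highest total amount billed among the tied providers.
    billed = {p: sum(amt for q, amt in visits if q == p) for p in tied}
    return max(tied, key=billed.get)
```

For example, a patient with two visits to provider A and one to provider B is assigned to A under either rule, while a patient split evenly between two providers is unattributed under the majority rule and assigned by the billing tie-breaker under the plurality rule.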
We compared how well claims-based algorithms identify the listed PCP of record in an academic health system's electronic health record, focusing on patients with diabetes who were continuously enrolled in the Medicare program. Reflecting our interest in identifying which algorithms performed better, we report the percent agreement between the PCP of record and the physician identified by each algorithm. While there are a number of potential algorithms to choose from, we focused on two popular ones that assign patients based on the majority or plurality of visits, and then examined permutations of these algorithms that considered all physician visits versus only primary care visits, with and without tie-breakers.
Overall, we found that the plurality algorithms had significantly higher concordance rates than the majority algorithm, and that analyses limited to primary care physician visits generally had higher concordance than those including all visits. However, performance varied substantially: claims-based assignment algorithms misclassified a patient's PCP of record anywhere from 14% to 75% of the time. Subgroup analyses found that misclassification rates were higher for vulnerable subgroups, including non-Whites and individuals dually enrolled in Medicare and Medicaid. Misclassification rates were also higher among individuals most likely to experience fragmented care during the year: those using multiple health systems, those with 11 or more physician visits, and those whose PCP changed.
We recommend that researchers use claims-based algorithms with care and, when possible, apply a tie-breaker to maximize attribution. Importantly, we encourage researchers to investigate whether patients from vulnerable subgroups are more likely to have their primary care physician misclassified.