Unsupervised Machine Learning for Explainable Health Care Fraud Detection / Shubhranshu Shekhar, Jetson Leder-Luis, Leman Akoglu.

By: Shekhar, Shubhranshu

Contributor(s): Leder-Luis, Jetson; Akoglu, Leman

Material type: Text

Series: Working Paper Series (National Bureau of Economic Research) ; no. w30946.

Publication details: Cambridge, Mass. National Bureau of Economic Research 2023.

Description: 1 online resource: illustrations (black and white)

Available additional physical forms: Hardcopy version available to institutional subscribers

Abstract: The US spends more than 4 trillion dollars per year on health care, largely conducted by private providers and reimbursed by insurers. A major concern in this system is overbilling, waste, and fraud by providers, who face incentives to misreport on their claims in order to receive higher payments. In this work, we develop novel machine learning tools to identify providers that overbill insurers. Using large-scale claims data from Medicare, the US federal health insurance program for elderly adults and the disabled, we identify patterns consistent with fraud or overbilling among inpatient hospitalizations. Our proposed approach for fraud detection is fully unsupervised, not relying on any labeled training data, and is explainable to end users, providing reasoning and interpretable insights into the potentially suspicious behavior of the flagged providers. Data from the Department of Justice on providers facing anti-fraud lawsuits and case studies of suspicious providers validate our approach and findings. We also perform a post-analysis to understand which hospital characteristics, though not used for detection, are associated with a high suspiciousness score. Our method provides an 8-fold lift over random targeting, and can be used to guide investigations and auditing of suspicious providers for both public and private health insurance systems.

Holdings
Item type	Home library	Collection	Call number	Status	Date due	Barcode	Item holds
Working Paper	Biblioteca Digital	Colección NBER	nber w30946	Not For Loan

February 2023.
