Detecting dirty data using SQL: Rigorous House Insurance case

Daniel Street
James G. Lawson, Bucknell University


Proficiency with data analytics is an increasingly important skill within the accounting profession. However, successful data analysis requires clean source data (i.e., source data without errors) to draw reliable conclusions. Although users often assume that source data are clean, this assumption is frequently incorrect. Therefore, identifying and remediating “dirty data” is a prerequisite to effective data analysis. You, an accountant working at a firm that specializes in data analytics, have been hired by Rigorous House Insurance to analyze the company’s insurance claim data. In addition to investigating specific issues mentioned by the company’s controller, you are tasked with identifying any other data integrity issues that you encounter and recommending preventative information system internal controls to the client to mitigate these issues in the future.