The work conducted by AXIA STUDIO focused on analysing self-reported emissions data from operators in Lombardy, monitored by ARPA through the AIDA software (Integrated Self-Control Application). The main challenge was the lack of historical inspection data, making it difficult to assess the reliability of the regular measurements provided autonomously by the businesses.
To address this challenge, the data science team adopted an approach based on bimodal analysis and time series analysis. These techniques helped identify errors and anomalies in the historical emissions data, which could result from either unintentional mistakes or deliberately incorrect behavior.
The bimodal analysis identified suspicious parameters, characterized by measurements that showed evident anomalies, such as oscillations between two values or the repetition of a fixed value. The time series analysis, using tests for randomness and autocorrelation, revealed that some measurements did not follow a random pattern, suggesting that data from certain businesses might not be entirely reliable.
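The kinds of checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say which randomness or autocorrelation tests were applied, so the Wald-Wolfowitz runs test and the lag-1 autocorrelation coefficient are stand-ins. The function names (`top_two_share`, `runs_test_z`, `lag1_autocorr`) and the sample readings are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, median


def top_two_share(values, ndigits=2):
    """Share of readings covered by the two most frequent (rounded) values.

    A share close to 1.0 points to a fixed value or to an oscillation
    between two values, the patterns the text calls suspicious.
    """
    counts = Counter(round(v, ndigits) for v in values)
    return sum(c for _, c in counts.most_common(2)) / len(values)


def runs_test_z(values):
    """Wald-Wolfowitz runs test around the median (assumed choice of test).

    Returns a z-score; |z| > 1.96 suggests non-random ordering at the 5% level.
    Returns None when the test is undefined (e.g. a constant series).
    """
    med = median(values)
    signs = [v > med for v in values if v != med]
    n1 = sum(signs)
    n2 = len(signs) - n1
    if n1 == 0 or n2 == 0:
        return None
    n = n1 + n2
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    if variance <= 0:
        return None
    return (runs - expected) / math.sqrt(variance)


def lag1_autocorr(values):
    """Lag-1 autocorrelation coefficient; None for a constant series."""
    m = mean(values)
    den = sum((v - m) ** 2 for v in values)
    if den == 0:
        return None
    num = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    return num / den


# Hypothetical self-reported readings that flip between two values.
oscillating = [10.0, 20.0] * 12
report = {
    "top_two_share": top_two_share(oscillating),
    "runs_z": runs_test_z(oscillating),
    "lag1_autocorr": lag1_autocorr(oscillating),
}
```

For the oscillating series, all readings fall on two values, the runs test gives a large positive z-score (far more runs than chance would produce) and the lag-1 autocorrelation is strongly negative, so each check would flag the parameter for a closer look.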
Subsequently, a predictive model based on unsupervised machine learning was developed to classify the data by level of randomness, both before and after the removal of seasonal effects. This model allowed inspections to be targeted at the businesses whose data showed the most problematic discrepancies, improving the efficiency of monitoring and ARPA’s ability to identify high-risk situations.
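A minimal sketch of this idea follows, with several assumptions made for illustration only: the source does not name the algorithm or the features, so a tiny two-cluster k-means stands in for the unsupervised model, the absolute lag-1 autocorrelation stands in for the "level of randomness", and seasonal effects are removed by subtracting the mean of each seasonal position. All function names, the 12-step period and the two sample series are hypothetical.

```python
from statistics import mean


def lag1_autocorr(values):
    """Lag-1 autocorrelation; 0.0 for a constant series."""
    m = mean(values)
    den = sum((v - m) ** 2 for v in values)
    if den == 0:
        return 0.0
    return sum((a - m) * (b - m) for a, b in zip(values, values[1:])) / den


def deseasonalize(values, period=12):
    """Subtract the mean of each seasonal position (e.g. calendar month)."""
    season_means = [mean(values[i::period]) for i in range(period)]
    return [v - season_means[i % period] for i, v in enumerate(values)]


def randomness_features(values, period=12):
    """Non-randomness score before and after removing seasonal effects."""
    before = abs(lag1_autocorr(values))
    after = abs(lag1_autocorr(deseasonalize(values, period)))
    return (before, after)


def nearest(point, centers):
    """Index of the center closest to the point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]
    return dists.index(min(dists))


def two_means(points, iters=20):
    """Tiny 2-cluster k-means; label 1 is the cluster seeded at the highest scores."""
    ordered = sorted(points, key=sum)
    centers = [ordered[0], ordered[-1]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [nearest(pt, centers) for pt in points]
        for k in (0, 1):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = tuple(mean(col) for col in zip(*members))
    return labels


# Hypothetical series: one purely seasonal, one with a steady drift.
seasonal_only = [0, 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1] * 3
ramp = list(range(36))
series = {"operator_a": seasonal_only, "operator_b": ramp}

points = [randomness_features(s) for s in series.values()]
labels = two_means(points)
flagged = [name for name, lab in zip(series, labels) if lab == 1]
```

Comparing the score before and after deseasonalizing is the point of the design: the purely seasonal series looks structured at first but its score drops to zero once the seasonal pattern is removed, while the drifting series stays strongly autocorrelated and ends up in the cluster that would be prioritized for inspection.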
Project: Reforming Regulatory Inspections in Italy at Regional and National Level (2021-2024)
Beneficiary: ARPA (Environmental Protection Agency) in the Lombardy region (Italy)
Project developed by: OECD – Organisation for Economic Co-operation and Development