MIT GOV/LAB is committed to supporting MIT graduate students conducting original field research and data collection on MIT GOV/LAB topics and themes of interest. One way we support students is through our seed grant program that enables students to conduct their research. 

MIT GOV/LAB: Your use of the MIT GOV/LAB seed grant involves developing a new dataset with audit reports from the Philippines. Can you tell us what’s special about these reports?

Jerik: The Philippines’ Commission on Audit (COA) possibly has the longest and most comprehensive series of annual audit reports covering local governments and national government agencies in the developing world. Because of provisions in the country’s constitution, all governmental bodies, including government-owned and controlled corporations, are subject to annual audits, which have been undertaken every year since 1998. At present, the audits cover the entire universe of local governments, national government agencies, and state corporations that exist in the country.

For political scientists, political economists, and governance, policy, and development researchers, these audit reports are a treasure trove, allowing us to probe government fiscal, budgetary, financial, and accountability-related behavior over a 25-year period at a resolution that is rarely possible in the Global South. For instance, data in these annual audit reports track (a) amounts of unliquidated cash advances (allowing an observational measure of corruption); and (b) reforms and improvements in public financial management and accountability practices, as tracked against past COA recommendations (allowing a behavioral measure of improvements in accountability and bureaucratic capacity).

These data are of obvious intrinsic public interest to journalists, civil society watchdogs, and government reformers, but they also promise to shed new light on canonical political science themes such as state-building and state capacity, accountability and corruption, as well as the politics of budgeting, taxation, public service provision, public employment, and development more generally.

MIT GOV/LAB: What do you plan to do with this data? And who are you working with?

Jerik: COA has already provided us with digital copies of the entire corpus of audit reports in their archives from 1998 to the present. The next step is to use these reports to build a comprehensive governance database that can eventually be showcased on a public website. For this, I am in the process of converting a share of the reports into machine-readable text via optical character recognition, as well as training and deploying machine learning and natural language processing algorithms that can flexibly extract data of interest from the reports.

I’m especially fortunate to be working with partners who are deeply committed to realizing the project. The database will be housed at the Ateneo School of Government (ASOG), one of the leading schools of public policy and public administration in the Philippines. Apart from being open to the public, it is planned for regular use by ASOG students in their classes. These students include both local and national government officials, as well as civil society researchers. In fact, one of ASOG’s faculty, a former COA Commissioner and past head of the United Nations Office of Internal Oversight Services, has also taken a substantial personal stake in the project and has been invaluable in guiding us in unearthing the “hidden” stories that these budgets and audit reports tell. With her help, we’re exploring options to scale up the project further, both in the Philippines and internationally.

MIT GOV/LAB: So what are some of these stories that you’ve begun to unearth from these audit reports?

Jerik: There’s an incredible wealth of detail that I can’t do justice to here, but my initial foray has simply been to describe the prevalence of anomalous transactions across local governments. Existing measures of corruption, most famously Transparency International’s Corruption Perceptions Index, are based on the perceptions of key informants and experts, and for that reason struggle to drill down to a more fine-grained agency or subnational level. By using these audit reports, we can overcome these constraints by identifying “red flag” transactions and COA-identified irregularities that are likely to be associated with corruption.

Figure 1. Topic modeling results across local government audits, 2010-2020. The figure shows the prevalence of topics in the audit reports for local governments over the past decade. Unliquidated cash advances are the most prominent type of anomalous transaction across the reports.

Figure 2. Example of how unliquidated cash advances are flagged in COA reports. The municipality of Babatngon in Leyte province claims to have used nearly a million pesos for a festival in 2015, but according to that year’s audit report, the money was instead transferred to other local government officials.

As a very rough initial cut, Figure 1 presents the results of Latent Dirichlet Allocation topic modeling of the “Findings and Recommendations” sections of the executive summaries of these reports (the sections where irregularities are spelled out) over the 2010-2020 period. With 25 topics, we find that three particular kinds of irregular transactions appear with relatively high frequency: (a) unliquidated cash advances; (b) flawed procurement proceedings; and (c) anomalies in the reporting of fuel and local government inventories. Figure 2 presents an example of how these anomalies (here, unliquidated cash advances) might appear in a particular report, in this case for the municipal government of Babatngon in Leyte province for the 2015 fiscal year.
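For readers unfamiliar with the technique, the following is a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling, followed by a simple "topic prevalence" summary of the kind a figure like Figure 1 reports. It is not the project's actual pipeline: the interview does not say which software, preprocessing, or hyperparameters were used. The toy documents are invented stand-ins for audit findings, and the function names (`fit_lda`, `topic_prevalence`, `top_words`), the priors `alpha` and `beta`, and the choice of 3 topics (rather than 25) are all assumptions made for illustration.

```python
import random

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling. docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]     # tokens per (topic, word)
    topic_total = [0] * n_topics                              # tokens per topic
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)
    # Resample each token's topic conditional on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1
    return vocab, doc_topic, topic_word

def topic_prevalence(doc_topic):
    """Share of all tokens in the corpus assigned to each topic."""
    total = sum(sum(row) for row in doc_topic)
    n_topics = len(doc_topic[0])
    return [sum(row[k] for row in doc_topic) / total for k in range(n_topics)]

def top_words(vocab, word_counts, n=3):
    """The n words with the highest counts in one topic."""
    ranked = sorted(range(len(vocab)), key=lambda i: word_counts[i], reverse=True)
    return [vocab[i] for i in ranked[:n]]

# Invented toy "findings" (NOT real COA text), already tokenized.
docs = [
    "cash advances unliquidated officials cash advances".split(),
    "unliquidated cash advances festival officials".split(),
    "procurement bidding flawed supplier procurement".split(),
    "flawed bidding procurement supplier contract".split(),
    "fuel inventory unrecorded fuel consumption".split(),
    "inventory fuel consumption unrecorded supplies".split(),
]

vocab, doc_topic, topic_word = fit_lda(docs, n_topics=3)
prevalence = topic_prevalence(doc_topic)
for k, share in enumerate(prevalence):
    print(f"topic {k}: share={share:.2f}, top words={top_words(vocab, topic_word[k])}")
```

On a real corpus, the topics would still need to be read and labeled by hand (for example, as "unliquidated cash advances"), since the model only returns word distributions, not named categories.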

This example only scratches the surface: there are many possible ways to describe and revisit governance concepts with these reports and machine learning techniques, and even more possible analyses that can be run using them. But the first step for all of them is to build the dataset itself and ensure that we’re interpreting the information in the reports accurately.

MIT GOV/LAB: How does this initiative relate further to the work that you’re doing and the work that you’re planning to do?

Jerik: Developing this dataset isn’t my dissertation research, but its importance for academic and policy research, as well as its innate public interest value, has pushed me to take it on as a major side project. That said, budgetary data in these reports should also allow me to map the bureaucratic structure of national and local government at a much more fine-grained level of detail than previously possible, which should in turn allow me to explore how these structures influence patterns of state-society embeddedness across a continuum of government functions and instrumentalities (e.g., health governance, local economic development, disaster risk and climate adaptation).

More broadly, doing work with equal levels of intellectual and policy relevance on matters of accountability, state capacity, and governance is one of the reasons I started my PhD and joined MIT GOV/LAB in the first place. In the Philippines, I started my working life as a civil society researcher and campaigner on issues such as public health, land rights, environmental justice, and transparency reform. With initiatives such as these, I try to use what I’ve gained at MIT to stay true to my origins, even while pushing the frontiers of governance research.

Photo: Sunrise with a smoggy Manila in the distance (Credit: Jerik Cruz).