During 2021, MIT GOV/LAB gave me the exceptional privilege of being its Practitioner in Residence. I learned an enormous amount that year, and I was able to use the time to write down some reflections that have, I hope, been helpful for others. The “Don’t Build It” guide has been used by teams in many parts of the world and has been translated into Spanish and disseminated in Latin America, thanks to the wonderful team at ILDA.

Looking back, I started the year wanting to explore how machine learning and AI could help promote democracy. I characterized it as wanting to complement the “defence of democracy” work that many others are doing by finding ways to “extend democracy”. I wanted to very practically and concretely explore ways that ML/AI could be used in existing democratic institutions or in creating new ones.

What I’ve learned is that working out uses for AI in civic and governance tech is both the same as and different from working out whether and how to use technology in general in those spheres.

Don’t build it, unless the problem is suited to a technology solution

It’s the same, because there must be a problem and the technology and product must be fitted to it. Unfortunately, right now we’re still quite far from that with most AI/ML. For example, summarizing laws is a good problem, but the technology isn’t quite there in practice. The great team at OpenNyai in India is making a lot of progress creating datasets and benchmarks from Indian laws, but the models, even ones advertised to do amazingly well on summarization, just aren’t up to it when the examples aren’t cherry-picked. But the direction of travel is solid, and I’m optimistic that in a few years the technology will be ready, for a fairly clear problem, and with people ready to implement on the ground.

As a different example, with graph-based methods the technology is ready but the demand isn’t. Graph-based data science, all the way up to knowledge graphs and graph neural networks, is now enormously powerful at extracting contextual predictions from networks, such as networks of companies. But the demand just isn’t there, whether from movements wanting to understand their internal structure or from governments conducting industrial policy. The reason is almost always that the problems aren’t ones of information or knowledge, but of capacity, motivation, and incentive. So deploying such powerful tools would fall into exactly the kinds of traps that “Don’t Build It” describes.
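To make the idea of “contextual predictions from networks” concrete, here is a minimal sketch, far simpler than a real knowledge graph or graph neural network: inferring an unknown node’s label from its neighbours’ labels, in the spirit of label propagation. The company network, the labels, and all the names here are invented for illustration.

```python
# Toy neighbour-majority label propagation (a crude cousin of the
# graph methods discussed above). All data is hypothetical.
from collections import Counter

# Adjacency list for a small invented company network.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

# Labels we already know for some companies.
seeds = {"A": "exporter", "B": "exporter", "E": "domestic"}

def propagate(graph, seeds, rounds=3):
    """Repeatedly assign each unlabelled node the most common
    label among its already-labelled neighbours."""
    labels = dict(seeds)
    for _ in range(rounds):
        for node in graph:
            if node in seeds:
                continue  # seed labels stay fixed
            votes = Counter(labels[n] for n in graph[node] if n in labels)
            if votes:
                labels[node] = votes.most_common(1)[0][0]
    return labels

result = propagate(graph, seeds)
# "C" sits between two known exporters, so it is labelled "exporter".
```

The point of even a toy like this is that the prediction for any one node depends on its position in the whole network, which is exactly the kind of contextual signal that richer graph methods exploit at scale.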

Beware of the hype bubble around AI/ML, both from AI builders and civil society

But AI/ML is different because there’s a lot more hype around it. In fact, there are two types of hype. The first is excessively positive, and is about model capabilities. Cherry-picked examples can be very misleading here: many giant text models churn out a lot of nonsense, but produce enough good examples to look great in a demo. A related problem is benchmarks that are too easy to score well on by getting every trivial example right, while being far off on the hard examples that are much closer to the real world. Fixing that will require benchmarks that place much, much heavier penalties on bad output. Scoring 80% on question answering is still very far from human performance if the other 20% is gibberish.
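One way to see the point about penalties is to compare plain accuracy with a score that subtracts heavily for gibberish. The scoring functions and the penalty factor below are my own invention, just to illustrate the arithmetic, not any established benchmark.

```python
# Hypothetical scoring sketch: plain accuracy vs. a penalized score
# that docks several points of credit for each gibberish answer.

def plain_accuracy(results):
    """Fraction of answers graded 'correct'."""
    return sum(1 for r in results if r == "correct") / len(results)

def penalized_score(results, penalty=4.0):
    """Each gibberish answer cancels `penalty` correct answers;
    the score is floored at zero."""
    credit = sum(1 for r in results if r == "correct")
    bad = sum(1 for r in results if r == "gibberish")
    return max(0.0, (credit - penalty * bad) / len(results))

# A model that answers 80 questions correctly and emits
# gibberish on the other 20:
results = ["correct"] * 80 + ["gibberish"] * 20
print(plain_accuracy(results))   # 0.8
print(penalized_score(results))  # (80 - 4*20)/100 = 0.0
```

Under plain accuracy the model looks strong at 80%; once gibberish is penalized, the same transcript scores zero, which is closer to how a human user would experience it.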

The second type of hype shows up in the amount of funding and energy going into work on regulating AI. Such work is necessary, but it is being funded and written about almost to the exclusion of all else. Even suggesting that the balance might need recalibrating is likely to invite attacks, or simply silence. As a result, there is a real risk that a filter bubble is forming, in which people constantly reinforce each other, while drones go on fighting wars and bad HR systems make people’s lives hell. Unfortunately, the interests driving this phenomenon are a structural fact of the grant-industrial complex, so they are unlikely to change soon.

Keep exploring high-quality data-sets, combined methods, and right-fit applications

Despite the hype and the biases, I have come away still convinced that the potential here is extraordinary. I was most surprised by the results of using the intermediate outputs of text models to predict later outcomes, whether municipal bond defaults or development aid project outcomes. Whatever the over-hyping in the short term, in the long term the potential of this technology is very large. Just in the last few weeks, a team at Stanford has shown how this technology can help ordinary people understand the income gains from earning new certificates, perhaps helping to right the balance in the often fraudulent edtech industry.
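The pattern behind “intermediate outputs as predictors” is simple: turn each document into a vector, then train an ordinary classifier on those vectors to predict a later outcome. The sketch below is mine, with a crude stand-in `embed()` function and invented project descriptions; in practice the vectors would be embeddings from a pretrained text model, not hand-made features.

```python
# Hedged sketch: text -> vector -> nearest-centroid prediction of a
# later outcome. embed() is a toy stand-in for a real model's
# intermediate representation; all example texts are invented.
import math

def embed(text):
    """Stand-in embedding: (word count, vocabulary richness)."""
    words = text.lower().split()
    return [len(words), len(set(words)) / max(len(words), 1)]

def centroid(vectors):
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def predict(text, centroids):
    """Assign the label whose centroid is nearest in embedding space."""
    v = embed(text)
    return min(centroids, key=lambda label: math.dist(v, centroids[label]))

# Tiny invented training set of project descriptions and outcomes:
train = {
    "success": ["clear plan measurable milestones local staff engaged",
                "detailed budget strong monitoring community buy in"],
    "failure": ["vague vague goals goals no no plan plan",
                "delayed delayed funds funds unclear unclear scope scope"],
}
centroids = {label: centroid([embed(t) for t in texts])
             for label, texts in train.items()}

print(predict("strong local plan with clear milestones", centroids))
```

Swapping the toy `embed()` for real model embeddings and the centroid rule for, say, logistic regression gives the basic shape of the bond-default and aid-project prediction work described above.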

The real gains, I’ve realized, will come from the work of many people using combinations of methods on high-quality existing datasets to make important but often simple predictions easy and accessible. That means open-source projects, it means more skills, and it means pushing as many people in governance as possible to pick up a notebook and start coding. (Here and here are free intro-to-coding courses to get started.)

Personally, I’ll be writing up the results of the development project prediction using World Bank data into a full paper, as well as deploying the model as a public good. I’ll remain engaged with the work of GOV/LAB and its future Practitioners in Residence. And I’ll continue exploring the use of AI, trying to carry practical benefits from research into applications, focusing more on its use in finance and infrastructure. It’ll be fun, I hope, and useful.

Photo by Yuhan Du on Unsplash.