(Header: A camera crew films the ACLU talk for a feature film produced by the Ford Foundation).

“You have to remember that every row of data is a person”

Nasser Eledroos, an ACLU Massachusetts Technology Fellow, reminds us why we can’t forget what the numbers represent. The numbers are overwhelming. There are more than 30,000 defendants claiming wrongful convictions because of falsified drug testing. In one Boston lab, one chemist routinely made up test results and added weight to drug samples, effectively changing a minor distribution offense into a major narco-trafficking conviction. In some cases, two extra grams of cocaine is the difference between a mandatory minimum of 3.5 years and one of 8 years. The chemist handled one out of every six cases in the state.

Elsewhere, in Amherst, Massachusetts, a chemist, herself an addict, was manufacturing drugs in a state police crime lab, ingesting state samples, working while high, and destroying lives in the process. It’s been more than five years since these chemists were caught, but the wrongfully convicted were not immediately notified.

Carl Williams and Nasser Eledroos from the American Civil Liberties Union presenting at the MIT GOV/LAB series.

The process to correct this injustice has been painfully slow. One obstacle was the District Attorney’s delay in disclosing the time period in which the lab chemists were falsifying tests, along with the delay in notifying those who were wrongfully convicted. Another obstacle is that the state criminal justice system has at least four databases involving drug cases, which don’t communicate well with each other, making it difficult to track drug samples across court cases and defendants.

Enter the ACLU, using a combination of data science and litigation to address these wrongdoings. Identifying cases is a critical step because, as ACLU attorney Carl Williams explained, it was originally up to the wrongfully convicted to challenge their sentences. This was a cumbersome process until the ACLU sued on their behalf, prompting the courts to review and throw out the convictions en masse. ACLU Massachusetts has ongoing litigation to address these injustices.

The ACLU spoke as part of GOV/LAB’s “Data Science to Solve Social Problems” seminar series. We asked this year’s seminar organizers what they found most interesting or unexpected from the talk.

Soubhik Barari, MIT GOV/LAB Data Science Research Specialist:

“One surprising thing I learned from the ACLU’s talk is the high stakes of using data science to help public defenders tackle court cases. In investigating Massachusetts’s two gargantuan drug lab scandals, which each involve tens of thousands of wrongful convictions, the ACLU is tasked with finding a solution for each and every one of them.

Data science can offer scalable and reliable methods for predicting yes/no answers to crucial questions on social datasets (e.g. does this individual’s record look tampered?), but rarely are the stakes so high for getting them right. A single false negative (saying ‘no’ when the true answer is ‘yes’) isn’t a social media user not seeing an ad, but rather someone wrongfully getting a misdemeanor charge and potentially losing their livelihoods or access to public housing.

More than that, there could be consequential second-order effects. For example, a single laboratory sample is connected to a buyer, a seller, a distributor, a lookout, and possibly other related cases in the past – all could be affected based on a machine’s answer to that ‘yes/no’ question. Therefore, if the justice system is to adopt new algorithmic methods for things like predicting evidence tampering or back-auditing existing evidence records for fraud, we need to think carefully about what it means to make a mistake. How accurate should an auditing algorithm be in order for us to trust it? What kinds of mistakes are riskier than others? Such are important concerns to be debated in the public arena.”
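To make the two kinds of mistakes concrete, here is a minimal sketch of how false negative and false positive rates could be computed for a yes/no tampering check. The labels below are invented for illustration and do not come from the ACLU's data; the function name `error_rates` is likewise hypothetical.

```python
def error_rates(actual, predicted):
    """Return (false_negative_rate, false_positive_rate) for boolean labels.

    actual[i] is True when record i really was tampered with;
    predicted[i] is True when the (hypothetical) audit flagged it.
    """
    pairs = list(zip(actual, predicted))
    false_negatives = sum(1 for a, p in pairs if a and not p)
    false_positives = sum(1 for a, p in pairs if not a and p)
    positives = sum(1 for a in actual if a)
    negatives = len(actual) - positives
    # Guard against division by zero when one class is absent.
    fnr = false_negatives / positives if positives else 0.0
    fpr = false_positives / negatives if negatives else 0.0
    return fnr, fpr


# Invented example: 4 tampered records, 6 clean ones.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]

fnr, fpr = error_rates(actual, predicted)
print(f"false negative rate: {fnr:.2f}, false positive rate: {fpr:.2f}")
```

In this setting a false negative is a tampered record that goes unflagged, so an audit would likely be tuned to keep that rate low even at the price of more false positives to review by hand.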

Dante Delaney, MIT GOV/LAB Research Associate:

“One interesting thing about the talk was the reappearance of ‘data silos’ – a theme that came up in other Data Science to Solve Social Problems talks. For example, Alma Castro shared how the Long Beach Justice Lab was able to break down information silos to better serve their residents. In Carl and Nasser’s talk, we saw how the lack of a cohesive data management process within the Massachusetts justice system harms citizens and reduces accountability. Processes that developed organically over decades to serve specific organizations’ internal needs ended up hindering their ability to work together, especially now that misconduct has taken place.

A key driver of this data siloing is the fact that the police departments, courts, and drug testing facilities all have their own ontologies. An ontology can be thought of as the information architecture – a system that explains all of the terms and elements of a dataset and how they relate to one another. One of the problems that Nasser and Carl mentioned was that the police, courts, and testing facilities all capture different data about the arrest, the persons involved, and the (alleged) drugs confiscated. And even when they do capture the same information, they don’t always record it in the same way, making it difficult to merge data sets and accurately match samples to their court cases and to all of the affected persons. According to Nasser, these data management systems are so entrenched that the only way to modernize and streamline them would be through top-down, comprehensive action taken by the Massachusetts State Legislature.”
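The matching problem described above can be sketched in a few lines. This is only an illustration under stated assumptions: the field names (`sample_id`, `sample_no`, `docket`), the identifier formats, and the records themselves are all hypothetical, and real systems would differ in far messier ways than punctuation and letter case.

```python
import re


def normalize_sample_id(raw):
    """Uppercase the ID and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


# Hypothetical lab and court records that write the same ID differently.
lab_records = [
    {"sample_id": "b12-04567", "chemist": "Chemist A"},
    {"sample_id": "B13 00321", "chemist": "Chemist B"},
]
court_records = [
    {"sample_no": "B12/04567", "docket": "DOCKET-1"},
    {"sample_no": "b13-00321", "docket": "DOCKET-2"},
    {"sample_no": "B14-99999", "docket": "DOCKET-3"},
]


def match_records(lab, court):
    """Link court dockets to lab chemists via the normalized sample ID."""
    lab_by_id = {normalize_sample_id(r["sample_id"]): r for r in lab}
    matched, unmatched = [], []
    for record in court:
        key = normalize_sample_id(record["sample_no"])
        if key in lab_by_id:
            matched.append((record["docket"], lab_by_id[key]["chemist"]))
        else:
            # Keep unmatched dockets visible so no case is silently dropped.
            unmatched.append(record["docket"])
    return matched, unmatched


matched, unmatched = match_records(lab_records, court_records)
print(matched)
print(unmatched)
```

Returning the unmatched dockets separately is a deliberate choice here: when every row is a person, a record that fails to link needs manual follow-up rather than quiet exclusion.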

While many Massachusetts residents likely consider the state progressive in terms of criminal justice reform, these cases of drug testing corruption reveal another reality. The state passed a major criminal justice reform bill this year, and how these changes will affect the lives of those who come into contact with the system, disproportionately black and brown residents, remains to be seen.