Christina Warren ’21 graduated from MIT this spring with degrees in Computer Science and Writing. This past semester she was a research intern with MIT GOV/LAB, supporting Luke Jordan on exploratory work looking at AI/ML and democracy.

The Data Development Lab’s database of judicial records from India’s legal system is the largest of its kind in the world. Earlier this year, we wrote about our intent to further study this database as part of our AI for democracy initiative.

Over the past several months, we used a graph of the database (a map of the relationships between different elements of the database) to explore several aspects of the legal system. We analyzed how gender affected interaction with various parts of the legal system, including differences in centrality within the system, engagement with different acts, and variance in case outcomes. We also looked at anomalies in the graph and the reasons for their emergence. These deeper analyses of the graph give us concrete insight into biases in the legal system and into what anomalies might reveal about where power and influence sit.

Gender in the legal system

Of all court cases in which the defendant had a classified gender, 57% had female defendants, while the other 43% had male defendants. In contrast, 29% of judges were classified as female, while the rest were labeled nonfemale. Though the two methods of classification were slightly different, each treated gender as a binary variable and used the person’s name as a proxy for their gender.

We then looked at various cases’ centrality, a measure of how connected the case is to the rest of the legal system. Of the 25 most central cases, nine had female defendants, while three had male defendants (the remaining defendants were unclassified). Three of the top four cases had female defendants (the first case’s defendant was unclassified), and all four were tried in the state of Assam and utilized the Code of Criminal Procedure, the central law for the handling of criminal cases. While the Code of Criminal Procedure is one of the most central and most utilized acts, explaining its presence in these top cases, we might look further at why the most central cases were tried in Assam.
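To make the idea of centrality concrete, here is a minimal sketch on a toy graph. It assumes degree centrality (the share of other nodes a node is directly linked to), which is only one of several possible measures and not necessarily the one used in this analysis. The node names and edges are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy edges: each case links to its judge, act, and state.
edges = [
    ("case_1", "judge_A"), ("case_1", "act_CrPC"), ("case_1", "state_Assam"),
    ("case_2", "judge_A"), ("case_2", "act_CrPC"),
    ("case_3", "judge_B"), ("case_3", "act_DV"),
]


def degree_centrality(edges):
    """Return each node's neighbor count divided by (number of nodes - 1)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


centrality = degree_centrality(edges)

# Rank only the case nodes, most central first.
case_ranking = sorted(
    (node for node in centrality if node.startswith("case_")),
    key=lambda node: centrality[node],
    reverse=True,
)
```

In this toy graph, `case_1` ranks first because it touches three other nodes while the other cases touch two.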

While male and female defendants, on average, had comparable levels of centrality, female judges were significantly less central than nonfemale judges.

Finally, we created a subgraph containing only female judges and cases with female defendants. With this graph, we compared the centrality of various acts and sections of law to their centrality in the overall graph, which showed us which acts were particularly important for women.

There were several notable differences. The Protection of Women from Domestic Violence Act, somewhat predictably, rose from the 17th-most central act in the overall graph to 14th in the subgraph. Additionally, the Gujarat (Bombay) Prohibition Act and the Prohibition Act (Maharashtra), which are used only in Gujarat and Maharashtra, rose from 10th to sixth and from 16th to 12th, respectively. These differences indicate that these particular acts are more relevant to women interacting with the legal system, and can help us understand how to dissect and approach these gender differences.
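The rank comparison described above can be sketched as follows. This is an illustration only: it assumes a centrality score has already been computed for each act in both the overall graph and the subgraph. The act names and scores are made up, and `rank_positions` and `rank_shifts` are hypothetical helper names.

```python
def rank_positions(scores):
    """Map each item to its 1-based rank, highest score first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {item: position for position, item in enumerate(ordered, start=1)}


def rank_shifts(overall_scores, subgraph_scores):
    """Positive values mean the act ranks higher in the subgraph."""
    overall = rank_positions(overall_scores)
    sub = rank_positions(subgraph_scores)
    return {act: overall[act] - sub[act] for act in sub if act in overall}


# Hypothetical centrality scores, invented for illustration.
overall_scores = {"act_A": 0.9, "act_B": 0.7, "act_C": 0.5, "act_D": 0.3}
subgraph_scores = {"act_A": 0.8, "act_B": 0.4, "act_C": 0.2, "act_D": 0.6}

shifts = rank_shifts(overall_scores, subgraph_scores)
```

Here `act_D` moves from fourth overall to second in the subgraph, a shift of +2. This is the kind of movement the analysis treats as a sign that an act matters more for women.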

Looking for influential judges

We also looked for unusual or unexpected patterns that might reveal something about the underlying data. In particular, we looked into judges, focusing on those with abnormal centralities. We compared each judge’s centrality to the number of cases the judge had presided over, looking for judges whose centrality was much higher or much lower than their caseload would suggest. Most judges had a fairly typical centrality given their case numbers; however, some judges had a much higher centrality than expected, suggesting that some unknown factor boosted their centrality.
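One simple way to make "higher than expected" concrete is to fit a straight line of centrality against case count and look at each judge's residual. The sketch below assumes an ordinary least-squares fit, which may differ from the comparison actually used. The judges and numbers are invented.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def centrality_residuals(judges):
    """judges maps name -> (case_count, centrality); returns actual minus expected."""
    names = list(judges)
    xs = [judges[name][0] for name in names]
    ys = [judges[name][1] for name in names]
    slope, intercept = fit_line(xs, ys)
    return {
        name: y - (slope * x + intercept)
        for name, x, y in zip(names, xs, ys)
    }


# Hypothetical judges: (cases presided over, centrality).
judges = {
    "judge_A": (10, 0.10),
    "judge_B": (20, 0.20),
    "judge_C": (30, 0.30),
    "judge_D": (40, 0.40),
    "judge_E": (25, 0.60),  # far more central than the caseload suggests
}

residuals = centrality_residuals(judges)
most_unusual = max(residuals, key=lambda name: abs(residuals[name]))
```

A large positive residual flags a judge who is more central than the caseload alone would predict. A large negative residual flags the opposite.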

For a more fine-grained analysis of these cases and what might be causing them, we looked specifically at each “community,” a grouping of nodes (cases, judges, acts, etc.) that were highly connected. We found the judges with the most unusually high or low centralities relative to other judges in that community. By comparing judges only to others in their community, we were able to find judges who might not look abnormal compared to all judges, but were abnormal relative to judges they were connected to.
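The within-community comparison can be sketched as a z-score computed against community peers only. This is a hedged illustration: it assumes community labels have already been assigned by a separate community-detection step, and it uses a plain z-score, which may not match the actual method. The judge names, community labels, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def community_z_scores(values, community_of):
    """Z-score of each judge's value relative to judges in the same community."""
    groups = defaultdict(list)
    for judge, value in values.items():
        groups[community_of[judge]].append(value)
    z_scores = {}
    for judge, value in values.items():
        members = groups[community_of[judge]]
        spread = pstdev(members)
        z_scores[judge] = 0.0 if spread == 0 else (value - mean(members)) / spread
    return z_scores


# Hypothetical centralities and community labels.
values = {
    "judge_1": 0.50, "judge_2": 0.52, "judge_3": 0.48,
    "judge_4": 0.05, "judge_5": 0.06, "judge_6": 0.04, "judge_7": 0.20,
}
community_of = {
    "judge_1": "north", "judge_2": "north", "judge_3": "north",
    "judge_4": "south", "judge_5": "south", "judge_6": "south",
    "judge_7": "south",
}

local_z = community_z_scores(values, community_of)
# Treat every judge as one community to get the global comparison.
global_z = community_z_scores(values, {judge: "all" for judge in values})
```

In this toy data, `judge_7` sits near the middle of the global distribution but stands well above the other judges in the same community, so only the community-level comparison flags it.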

Of these anomalous judges with higher-than-expected centralities, most held the position of Chief Judicial Magistrate, a high-ranking position within the judicial system. Interestingly, one instead held a position at a City Civil and Sessions Court and shared a community with the state of Gujarat, a departure from the position we’ve otherwise seen among these judges.

Because this judge has an abnormal centrality, we will further study the judge and the reasons behind that centrality. For instance, the types of cases the judge tends to preside over, or the specific defendants involved, are important factors that would help us understand what makes such judges important. These findings become useful when we can generalize them and integrate them into the next steps of our project as we begin to work more closely with India’s legal infrastructure.

Screenshot of one piece of the graph exploring India’s legal database (Christina Warren).