As the pinnacle of the judicial branch, the U.S. Supreme Court is necessarily involved in some of the highest-profile, most controversial and most political cases across the country. And it is one of the most widely respected government institutions in the country. Part of that reputation may stem from the fact that the justices are not seen as mere “politicians in robes.”

Research also tells us people respect the Supreme Court in part because it shares traditions and pageantry with the larger judicial system – such as judges in robes wielding gavels. As members of a team of legal scholars and information scientists who use computational methods to study the judicial system, we wondered whether another potential source of the Supreme Court’s public esteem is its use of language.

Like other courts, the Supreme Court doesn’t announce its rulings with one-line tweets, for example, the way many politicians declare their intentions to vote for or against legislative bills. Rather, it issues lengthy documents setting out facts and legal precedents and connecting them to each other in ways that both declare an outcome and explain (or object to) how the court reached that decision. The more these written opinions suggest the court is set apart from the political fray, the more they can help its reputation.

But how can we know if the Supreme Court is writing like a judicial body rather than a more political institution? One way is to compare its decisions to those issued by the next-highest level of federal courts, the U.S. Courts of Appeals, which are widely perceived to be less politically partisan and more focused on addressing run-of-the-mill legal issues. Our comparison found that from 1951 to 2007, Supreme Court opinions did indeed become increasingly different in their content from opinions issued by lower federal courts, indicating that over time, the court appears to be drifting away from its judicial roots.

Machine reading the law

In other work, our group has studied the evolution of the Supreme Court’s writing style, the timescales of an opinion’s influence, and ideological expression in judicial opinions. In each project, we applied various kinds of big data text mining tools to collections of tens of thousands of opinions. For our current research, we chose to view judicial opinions as a genre of lawmaking text, akin to legislatures’ statutes, the president’s executive orders (or, these days, tweets) and agencies’ regulations. We analyzed a random sample of 25,000 opinions drawn from the entire corpus of approximately 300,000 issued by the Supreme Court and federal appeals courts between 1951 and 2007. Our analysis covered all opinion types, including dissents.

We were interested not in whether there were small stylistic differences – such as increased use of footnotes – but in whether the actual words of Supreme Court opinions were distinct from those of the appeals courts, and whether that distinctiveness was changing. Our analysis found that over five decades, the language of the Supreme Court’s opinions became increasingly different from that of the appellate courts.

This trend may undermine the court’s popular legitimacy over time, particularly when viewed in concert with other developments indicating the Supreme Court may be becoming increasingly politicized, such as the process of nominating and confirming new justices.

Who wrote that?

The first step in our analysis used a specific type of machine learning, called a “topic model,” which detects groups of words that generally appear near each other with predictable frequency in a given body of texts. For example, it can tell whether a particular opinion focuses more on equal protection rights under the 14th Amendment than on environmental law, because in the former, the words “discrimination” and “race” are more likely to appear together and frequently, while in the latter this is true of the words “pollution” and “water.”

For the next step, we used the results of the topic analysis to teach a machine learning program to classify thousands of opinions as written by either the Supreme Court or a federal appeals court. Based on the topic information, the machine was able to pick up on content differences between the two groups of opinions. For example, the Supreme Court’s opinions tended to have more words associated with interpreting laws and constitutional rights, such as those used when drawing on the history of Reconstruction to interpret civil rights statutes. The appeals courts’ opinions tended to have more words referring to times, dates, testimony and evidence.
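For readers who want to see the shape of such a two-step pipeline, here is a minimal sketch in Python. It is not our actual code. The word groups in `TOPICS` are hand-picked stand-ins for what a real topic model would learn from the data, the nearest-centroid rule is a simple substitute for whatever classifier one might prefer, and the training sentences and all function names are invented for illustration.

```python
from collections import Counter

# Hypothetical word groups standing in for topics a real topic model would learn.
TOPICS = {
    "interpretation": {"constitution", "amendment", "statute", "congress", "rights"},
    "trial_record": {"testimony", "evidence", "witness", "date", "trial"},
}


def topic_shares(text):
    """Return, for each word group, the share of the text's words that fall in it."""
    words = [w.strip(".,;:()\"'") for w in text.lower().split()]
    counts = Counter()
    for word in words:
        for name, vocab in TOPICS.items():
            if word in vocab:
                counts[name] += 1
    total = max(len(words), 1)  # avoid dividing by zero on empty text
    return [counts[name] / total for name in TOPICS]


def centroid(vectors):
    """Average a list of equal-length vectors, component by component."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def train(labeled_texts):
    """Map each court label to the average topic-share vector of its opinions."""
    by_label = {}
    for text, label in labeled_texts:
        by_label.setdefault(label, []).append(topic_shares(text))
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def classify(model, text):
    """Guess the court whose average topic mix is closest to this text's mix."""
    shares = topic_shares(text)

    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(shares, center))

    return min(model, key=lambda label: squared_distance(model[label]))


# Invented training sentences, not drawn from any real opinion.
TRAINING = [
    ("The amendment protects rights that Congress enforced by statute.", "supreme"),
    ("We read the statute in light of the constitution.", "supreme"),
    ("The witness gave testimony at trial on that date.", "appeals"),
    ("The evidence at trial included testimony from one witness.", "appeals"),
]

model = train(TRAINING)
print(classify(model, "Congress wrote the statute to secure rights under the amendment."))
```

With these toy inputs, the unseen sentence is dominated by interpretation words, so the sketch assigns it to the Supreme Court group. A real analysis would learn the word groups from tens of thousands of opinions rather than fixing them by hand.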

Based on this training, we tested how well the machine was able to guess whether new opinions were written by the Supreme Court. To humanize it a bit, imagine a legal scholar who had read the first set of opinions walking down the street one day and coming across a few pages of a judicial opinion with all identifying information torn away. How good would she be at identifying which court produced it – and does her accuracy vary depending on when the pages were written?

Even in the 1950s, the first decade in our sample, the Supreme Court’s opinions were already quite different from appeals court decisions. When presented with opinions written in this period, the machine was able to judge with roughly 80 percent accuracy which opinions were written by the Supreme Court. So its decisions were already fairly easy to distinguish from appeals court opinions. They became even easier to distinguish as the years went by: When presented with opinions written in the 2000s, the algorithm achieved an almost perfect score.
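Accuracy by decade is simply the share of correct guesses among the opinions written in each decade. The sketch below shows that arithmetic in Python. The function name and the records are hypothetical, and the numbers are made up for illustration; they are not our study's data.

```python
from collections import defaultdict


def accuracy_by_decade(records):
    """records: iterable of (year, true_court, guessed_court) tuples.

    Returns {decade: share of opinions from that decade guessed correctly}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, truth, guess in records:
        decade = (year // 10) * 10  # e.g. 1957 -> 1950
        totals[decade] += 1
        hits[decade] += int(truth == guess)
    return {decade: hits[decade] / totals[decade] for decade in sorted(totals)}


# Made-up records for illustration only, not results from the study.
RECORDS = [
    (1953, "supreme", "supreme"),
    (1955, "supreme", "appeals"),  # a wrong guess
    (1957, "appeals", "appeals"),
    (1958, "appeals", "appeals"),
    (2001, "supreme", "supreme"),
    (2003, "appeals", "appeals"),
]

print(accuracy_by_decade(RECORDS))
```

Comparing the values across decades is what lets one say whether the two groups of opinions are growing easier or harder to tell apart.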

An exception that illustrates how this works is the algorithm’s tendency to misclassify the Supreme Court’s 2003 Yarborough v. Gentry ruling. That opinion provides guidance for the lower courts on how to deal with habeas corpus cases, which are a mainstay of their work. It deals with a common issue in the lower courts that does not come up to the Supreme Court as often. As a result, it is not surprising that it might be mistaken for a lower court opinion.

The idiosyncratic court

Over time, by increasingly focusing on an idiosyncratic set of topics and by constructing their arguments in an increasingly unique way, Supreme Court opinions have become more distinctive. That hypothetical random opinion found on the street is easier to identify because the court is expressing itself in a new subgenre of legal writing that is more identifiable.

This isn’t just because of differences in the mix of topics the courts rule on. For example, the Supreme Court takes up constitutional issues more commonly than any other type of case. The appeals courts, by contrast, decide the occasional high-profile constitutional cases alongside a large number of unexceptional contract law, administrative law and criminal law matters. Our analysis shows that while the details of these differences shift over the years, the degree of difference didn’t change from 1951 to 2007.

What we find, instead, is that the Supreme Court is analyzing and writing about cases in an increasingly idiosyncratic fashion, distinct from the style of the appeals courts. This may contribute to an overall impression among the public that the court is just another political body. If that is true, the Supreme Court’s unique place in American society may be compromised, as the reservoir of prestige and respect that it currently enjoys eventually runs dry.