Do you trust this post?

Dec 07, 2020

Interdisciplinary work can be great. In particular, HPE scholars should take note of advances in the digital humanities, which are using tools like machine learning to analyze large corpora of text-as-data and images-as-data. The ability to draw on new types of historical sources, across fields, is exciting.

But often the best work in interdisciplinary studies is done by teams of scholars from different disciplines. Each scholar brings their field’s knowledge and expertise, to create something new (and to ensure that each field is properly represented). Otherwise, one might fall into the trap of being a jack-of-all-trades, master-of-one-field — where new methodological advances are applied to a subject area, but don’t sufficiently engage that subject area.

This is the case with an article I came across recently in Nature Communications, entitled “Tracking historical changes in trustworthiness using machine learning analysis of facial cues in paintings.”

Data is here, and for a lively and useful Twitter thread, see here.

The motivation behind this article is that the concept of social trust is difficult to document in historical periods. This is true. Today, we can collect time series panel data on trust and other pro-social behaviors and be relatively confident in our measurement, or we can run experiments testing mechanisms relating to trust. In pre-modern times, we are limited to historical records, memoirs, and other surviving documents that discuss society and culture.

As a result, the authors decide to explore the idea of trust using a potentially novel set of sources, namely “images-as-data” in the form of European portraits in the National Portrait Gallery. The idea of art as a source of qualitative information is a fascinating prospect (though subject to as many biases as the more traditional written record, if not more; more on that later).

The article is quite technical, but the authors build an algorithm that estimates trustworthiness (of the portrait sitter) based on facial features (think smiles and wide eyes).

But one historical red flag is that the idea that certain facial features are associated with trust comes from experimental work, with populations from 2007 and onwards; this is assumed to be a human characteristic (that is constant over time). Similarly, all the training data and robustness checks involved modern data (the authors were worried that historical cues would bias coders).

(Talk about comparing apples to oranges. The article also does a robustness check comparing Instagram selfies in six major cities, and looks at which cities score higher on interpersonal trust in the European and World Values Surveys. But similarly, would anyone say Instagram represents an accurate snapshot of society?)

Here, we’re missing much-needed depth from historians, art historians, and sociologists. Facial features, expressions, and mannerisms are culture- and context-specific, and I can’t help but think of the fact that even smiling in portraiture has changed over time (it used to be quite radical!). This is effectively the opposite of “reading history forward,” in that we are applying current human biases to past societies. [Update: as Tracy mentions, this speaks to a similar debate in history over work by Philippe Ariès that studies childhood using portraiture; a fascinating critique of this work can be found here.]

A few other issues...

My Kingdom for a Representative Sample

The article could have benefited from more thought about what this sample is, and how it was obtained. While it’s clearly difficult to get data from historical periods, this means we must be crystal clear about what our sample entails and what we can learn from it (both Emily and Adam have written posts to this effect recently).

The sample is the online database of historical portraits found in the National Portrait Gallery in England, so N=1,962 English portraits. They also do a robustness check using the Web Gallery of Art, which has N=4,106 portraits from 19 Western European countries from 1360-1918.

The article notes that much of the data we have that might reflect society’s mindset or attitudes is in the form of books, songs, paintings, and other artwork. I think it’s definitely the case that paintings reflect society-wide conceptions of appearance, and could convey other types of cultural information.

But what type of paintings survive, or are donated to a museum? Are paintings representative of European society in general, and are the ones that end up in national museums representative of the entire population of art? This might introduce selection bias that drives the results. Historically, we know the National Portrait Gallery’s contents were a result of choices made by successive museum directors and the generosity of private donors. Hypothetically, if museums choose to buy, collect, or preserve portraits that are aesthetically pleasing (and they learn that doing so increases revenue, attendance, or reputation over time), you might see similar trends in trustworthy paintings. Similarly, as Vicky notes, the diversity of portraits naturally increases over time, as more classes beyond just the nobility can afford them. Imagine more merchants enter the painting pool, and merchants care about appearing trustworthy: this would result in similar trends, and is another hypothetical example of the challenges of sample selection.
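To make the merchant example concrete, here is a minimal sketch in Python. Every number in it is invented for illustration (the group scores, the merchant shares, and the century labels are assumptions of mine, not figures from the article or its data). It holds each group's average "trustworthiness display" fixed and only changes who ends up in the portrait pool.

```python
# Hypothetical illustration: all numbers below are invented, not from the paper.
# Each group's average "trustworthiness display" score is constant over time.
GROUP_MEAN_SCORE = {"nobility": 0.40, "merchant": 0.60}

# Assumed share of merchants among surviving portraits, by century.
MERCHANT_SHARE = {1500: 0.10, 1600: 0.25, 1700: 0.40, 1800: 0.55, 1900: 0.70}


def pooled_mean(merchant_share: float) -> float:
    """Average score of the whole portrait pool for a given merchant share."""
    noble_share = 1.0 - merchant_share
    return (noble_share * GROUP_MEAN_SCORE["nobility"]
            + merchant_share * GROUP_MEAN_SCORE["merchant"])


def pooled_trend() -> dict:
    """Pooled mean score per century, driven only by sample composition."""
    return {century: round(pooled_mean(share), 3)
            for century, share in MERCHANT_SHARE.items()}


if __name__ == "__main__":
    for century, score in pooled_trend().items():
        print(century, score)
```

In this toy setup the pooled average rises every century even though no individual group changes at all, which is exactly the pattern a composition shift could produce in a museum sample.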

It’s also worth considering that these sitters were presumably mostly elites, who could commission a portrait for a number of strategic purposes (to signal wealth, power, or certain levels of attractiveness for marriage prospects; not as an accurate representation of real life). These paintings were also created by an artist with another set of motivations, which most likely involved pleasing the client so the artist could get paid and/or obtain a patron. None of this is really discussed, and ideally you’d control for trends in portraiture.

What is Trust?

While it’s great to think of creative ways to measure slippery concepts, we equally need to think about how far we can extrapolate from such data. In particular, the authors here find a “significant increase of trustworthiness displays with time” and claim this means the “value of interpersonal trust increased from the 16th to 20th Century.” The paper also tries to show an association with resources, using GDP per capita, with the idea that poorer individuals should have lower levels of social trust. Both of these are large leaps to make.

First, it’s not clear what trustworthiness is, as defined in this paper.

Evaluations of “facial action units” are not the same as trust. And while humans might use facial cues as a heuristic to guide their interactions, when we think about trust in society we think of a huge range of interpersonal interactions (ingroup/outgroup, elite/non-elite, ethnicity, identity, networks, etc.), guided by both formal and informal institutions (think authors like Landa, Greif, Draude, Holck, Stolle, Putnam). Trust is behavioral, and the very idea of trust has changed over time.

There are many theoretical reasons why changes in portraiture wouldn’t accurately reflect widespread changes in societal relations. Elites make up a large portion of portraits, but a small proportion of society. Both Tracy and Jared have written posts on how exchanges in society are difficult to realize from the historical record, particularly for everyday citizens and local customs.

And there are many empirical reasons, based on their analysis, why we would doubt this claim. A simple regression controlling for country-level factors (i.e., GDP per capita, Polity2) and portrait characteristics (i.e., gender), using 100-year time units, is a tough basis for such a claim. There are too many unobserved variables to count. And this research design simply cannot test any individual- or society-level changes in actual trust (however defined). So, this is not “suggestive of a shift in social trust.” While the authors sometimes stick closely to the measurement of their dependent variable, at other times they simply claim to be measuring trust. Given the academic and popular significance of Nature, precision is essential.

In sum, it was an interesting article to read, but I suspect many art historians, sociologists, political scientists, and others might take issue with it. And for high-profile venues such as Nature, we need to think about the tradeoff between interdisciplinary mash-ups and true interdisciplinary learning, and resist the impulse to overclaim.

The National Portrait Gallery, decked out for Christmas (source: Author)

Broadstreet
