For anyone looking for something to do in London before the end of August 2018, I thoroughly recommend a trip to the British Library (nearest rail/tube station: King’s Cross/St. Pancras) to see this new exhibition about the voyages of Captain James Cook.
Born in North Yorkshire in 1728, Cook joined the Royal Navy in 1755, served in the Seven Years’ War, during which he saw action in Canadian waters, and made a name for himself by charting the coast of Newfoundland. The Admiralty subsequently engaged Cook to lead an expedition to Tahiti in order to observe the 1769 Transit of Venus. Commencing in 1768, this was the first of Cook’s three voyages to the Pacific, and the exhibition is organised around these three voyages.
Cook’s first voyage didn’t stop at Tahiti. The expedition also spent six months circumnavigating New Zealand (given that name by the Dutch), where some of the encounters with the native population were violent. From there they went on to Australia, where they charted most of the eastern coast (much of the rest having already been charted by the Dutch). From Australia they sailed to Batavia, the centre of the Dutch empire in the East Indies. On his return to Britain, Cook was promoted to Commander.
The second voyage, from 1772 to 1775, was in search of the Great Southern Continent. This turned out not to exist, but during the voyage Cook and his men became the first explorers to cross the Antarctic Circle, which they eventually did three times. The journey also took in Easter Island, Dusky Sound (in New Zealand), a sighting of South Georgia, and the New Hebrides (now Vanuatu).
The third voyage, from 1776-1780, was to the North Pacific, taking in Alaska and the Hawai’ian islands.
The various naturalists and artists who accompanied Cook on his expeditions amassed a valuable collection of plants and artistic renderings of people, animals and landscapes. Some wonderful examples of these are on display in the exhibition galleries. Especially noteworthy are the works of William Hodges, described by Sir David Attenborough as the first academically-trained artist to go on such an expedition (“and it shows”). Hodges accompanied Cook on the second voyage, and one of his pictures shows the expedition’s ships dwarfed by the vast icebergs of the Antarctic, something the general public would never have seen before.
There is no doubt that the voyages must have been exceedingly arduous and fatalities were numerous. Deaths through illness on the first voyage included Sydney Parkinson, famous for his drawings of Maori people and the first European to draw a kangaroo. Another artist, Alexander Buchan, died on Tahiti from an epileptic seizure. The surgeon William Monkhouse died during the stopover at Batavia, as did Tupaia – the High Priest of Tahiti – who had helped Cook chart the Tahitian islands (one current native of Tahiti describes Tupaia as a ‘traitor’, though others speak of him more admiringly).
Men were also lost in violent encounters. In 1773, ten men were killed in a dispute at Queen Charlotte Sound. Cook himself was killed by an angry crowd at Kealakekua Bay, on the island of Hawai’i.
What seems clear from the exhibition is that the scientific work carried out by the expeditions was secondary to the unstated goal of colonisation. Whilst the first voyage had the publicly-stated goal of observing the Transit of Venus, Cook in fact had secret orders to search for “convenient” land. The exhibition includes testimony from the native peoples of the territories visited by Cook, one of whom notes that the expeditions described Australia as “terra nullius” (“nobody’s land”), indicating that the non-white natives weren’t counted as people. In 1934, Cook’s house was transported from North Yorkshire to Melbourne, yet increasingly the indigenous people and their supporters are questioning the traditional view of Cook, and Australia Day has now become a day of protest for many.
It has been suggested that Cook was relatively egalitarian for a man of his time, yet the exhibition makes clear that he frequently took native chiefs hostage whenever some piece of naval property went missing. Such an incident led to his own killing on Hawai’i, though the exact events of that day are unclear due to contradictory accounts. It was in fact a wealthy naturalist, Joseph Banks, who had paid for his own place on the first voyage, that first proposed that Australia – specifically, Botany Bay – be used as the location for a penal colony. It is hard to visit this exhibition and not feel a great sadness at what befell the native peoples of the places Cook visited (much of the worst, of course, came in the wake of Cook’s voyages).
Eventually, some voices of the Enlightenment began to question the wisdom of such ventures and Adam Smith, in his 1776 book The Wealth of Nations, argued in favour of free trade rather than territorial expansion.
In a video display Sir David Attenborough describes Cook as the greatest naval explorer that has ever lived, which seems like a fair assessment in terms of distance travelled, lands explored, and hardships endured. However, his legacy is increasingly under the spotlight. This is an exhibition well worth visiting.
I think the first time I became aware of metrics in the workplace was between 1990 and 1993, when I was studying for a PhD at the University of Wales, College of Cardiff (now simply ‘Cardiff University’). One day, A4 sheets of paper appeared on walls and doors in the Psychology Department proclaiming “We are a five star department!” A friend explained to me that this related to our performance in the ‘Research Assessment Exercise’ (RAE), about which I knew nothing. He scoffed at the proclamation, clearly thinking that this kind of rating exercise had little to do with what really mattered in science. I didn’t realise then how right he was. The RAE was, however, used to determine how much research income institutions could expect from government (via the funding councils).
A few years later, in my first full-time lecturing post, at London Guildhall University, I was put in charge of organising our entry to the next RAE. Part of this pre-exercise exercise was to determine which members of staff would be included and which excluded. Immediately this raised the question in my mind: “If the RAE is supposed to assess a department’s strengths in research, then shouldn’t all staff members be included?” Such was my introduction to the “gaming” of metrics. Every institution was, of course, gaming the system in this and various other ways. Those that could afford it would buy in star performers just before the RAE (often to depart not long afterwards), leading to new rules to prevent such behaviour.
At some point, universities also got landed with the National Student Survey (NSS), which consisted of numerous questions relating to the “student experience”, but with most of the impact falling on lecturing staff who, either explicitly or implicitly, were informed that they needed to improve. With the introduction of – and subsequent increase in – tuition fees, students were now seen as consumers, for whom league tables in research and the NSS were sources of information that could be used to distinguish between institutions when applying. The NSS has also led to gaming, sometimes not so subtly – as when lecturers or managers have warned students that low ratings will cost the institution income, and so worsen the students’ own educational experience.
These changes within universities have been accompanied by another change: an expansion in the number of administrative staff employed and a shift in power away from academics. And academic staff themselves now spend considerably more time on paperwork than was the case in the past.
A new book by Jerry Z. Muller, The Tyranny of Metrics, shows that the experience of higher education is typical of many areas of working life. He traces the history of workplace metrics, the controversies surrounding them and the evidence of their effectiveness (or lack thereof). As far back as 1862, the Liberal MP Robert Lowe proposed that the funding of schools should be determined on a payment-by-results basis, a view challenged by Matthew Arnold (himself a schools inspector) for the narrow and mechanical conception of education that it promoted.
In the early twentieth century, Frederick Winslow Taylor promoted the idea of “scientific management”, based on his time-and-motion studies of pig iron production in factories. He advocated that people should be paid according to output in a system that required enforced standardisation of methods, enforced adoption of the best implements and working conditions, and enforced cooperation. Note that the use of metrics and pay-for-performance are distinct things, but often go together in practice.
Later in the century, the doctrine of managerialism became more prominent. This is the idea that the differences among organisations are less important than their similarities. Thus, traditional domain-specific expertise is downplayed and senior managers can move from one organisation to another where the same kinds of management techniques are deployed. In the US, Defence Secretary Robert McNamara took metrics to the army, where “body counts” were championed as an index of American progress in Vietnam. Officers increasingly took on a managerial outlook.
The use of metrics found supporters on both the political left and the right. Particularly in the 1960s, the left were suspicious of established elites and demanded greater accountability, whilst the right were suspicious that public sector institutions were run for the benefit of their employees rather than the public. For both sides, numbers seemed to give the appearance of transparency and objectivity.
Other developments included the rising ideology of consumer choice (especially in healthcare), whereby empowerment of the consumer in a competitive market environment would supposedly help to bring down costs. ‘Principal-Agent Theory’ highlighted that there was a gap between the purposes of institutions and the interests of the people who run them and are employed by them. Shareholders’ interests are not necessarily the same as the interests of corporate executives, and the interests of executives are not necessarily the same as those of their subordinates (and so on). Principals (those with an interest) were needed to monitor agents (those charged with carrying out their interests), which meant motivating them with pecuniary rewards and punishments.
In the 1980s, the ‘New Public Management’ developed. This advocated that not-for-profit organisations needed to function more like businesses, such that students, patients, or clients all became “customers”. Three strategies helped determine value for money:
1. The development of performance indicators (to replace price).
2. The use of performance-related rewards and punishments.
3. The development of competition among providers, together with transparency of performance indicators.
Critics of this approach have noted that not-for-profit organisations often have multiple purposes that are difficult to isolate and measure, and that their employees tend to be more motivated by the mission rather than the money. Of course, money does matter, but that recognition should come through the basic salary rather than performance-related rewards.
Indeed, evidence indicates that extrinsic (i.e. external to the person) rewards are most effective in commercial organisations. Where a job attracts people for whom intrinsic rewards (e.g. personal satisfaction, verbal praise) are more important, the application of pay-for-performance can undermine intrinsic motivation. Moreover, the people doing the monitoring tend to adopt measures for those things that are most visible or most easily measured, neglecting many other things that are important but which are less visible or not easily measured. This can lead to a distortion of organisational goals.
Many conservative and classical liberal thinkers have criticised such ideas, including Hayek, who drew a comparison with the failed attempts of socialist governments (notably the Soviet Union) at large-scale economic planning. Nonetheless, from Thatcher to Blair, from Clinton to Bush and Obama, politicians of different hues have continued to expand metrics further into the public domain.
Muller is not entirely a naysayer on metrics, noting that they can sometimes genuinely highlight areas of poor performance. In particular, he notes that in the US there have been some success stories associated with the application of metrics in healthcare. However, closer examination shows that these successes owe more to their being embedded within particular organisational cultures than to measurement per se. Indeed, they seem to be the exceptions rather than the rule, with other research showing no lasting effect on outcomes and no change in consumer behaviour. Research by the RAND Corporation found that stronger methodological design in studies was associated with a lower likelihood of identifying significant improvements from pay-for-performance.
What is clear – and Muller looks at universities, schools, medicine, policing, the military, business, charities and foreign aid – is that metrics have a range of unintended consequences. These include various ways in which managers and employees try to game the system: teaching to the test (education), treating to the test (medicine), risk aversion (e.g. in medicine, not operating on the most severely ill patients), and short-termism (e.g. police arresting the easy targets rather than chasing down the crime bosses). There is also outright cheating (e.g. teachers changing the test results of their pupils).
Incidentally, another recent book, The Seven Deadly Sins of Psychology (by Chris Chambers), documents how institutional pressures and the publishing system have incentivised a range of behaviours that have led to ‘bad science’. For instance, ‘Journal Impact Factors’ (JIFs) supposedly provide information about the overall quality of the research that appears in different journals. Researchers can cite this information when applying for tenure or promotion, or for their inclusion in the UK’s Research Excellence Framework (formerly the RAE). However, only a small number of publications in any given journal account for most of the citations that feed into the JIF. Another issue with JIFs concerns statistical power – the likelihood that a study will identify a genuine effect (statistical power depends on sample size and several other factors). It turns out that there is no relationship between the JIF and the average level of statistical power within a journal’s publications. Worse, high impact journals have a higher rate of retractions due to errors or outright fraud.
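For readers unfamiliar with the idea, statistical power can be made concrete with a small calculation. The sketch below is my own rough illustration (it is not from Chambers’s book): it approximates the power of a two-sided, two-sample z-test using only Python’s standard library, where the effect size is the standardised difference between group means.

```python
from statistics import NormalDist

def power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size: standardised mean difference (Cohen's d).
    n_per_group: subjects in each of the two groups.
    """
    # Critical value for a two-sided test at significance level alpha
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic if the effect is real
    noncentrality = effect_size * (n_per_group / 2) ** 0.5
    # Probability the statistic exceeds the critical value
    # (the tiny lower-tail rejection probability is ignored)
    return 1 - NormalDist().cdf(z_crit - noncentrality)
```

With a medium effect size (d = 0.5) and 64 subjects per group, power comes out at roughly 0.8, the conventional target; halve the sample and power falls to little better than a coin flip, which is exactly the situation in many underpowered published studies.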
But one of the impacts of metrics is the expansion of resources (people, time, money, equipment) in order to do the necessary monitoring. Even the people being monitored must give up time and effort in order to produce the necessary documentation to satisfy the system. And as new rules are introduced to crack down on attempts to game the system, so the administrative resources are expanded even further. This diversion of resources obviously works against the productivity gains that are supposed to be produced by the application of metrics.
I was less convinced by the penultimate chapter in Muller’s book, in which he addresses transparency in politics and diplomacy. He speaks scornfully of the actions of Chelsea Manning and Edward Snowden in disclosing secret documents, which he says have had detrimental effects on American intelligence. Undoubtedly, transparency can sometimes be a hazard – compromise between different parties is made harder under the full glare of transparency – and there is a balance to be struck, but I would argue that the scale of wrongdoing revealed by these individuals justifies the actions they took and for which they have both paid a price. In the UK, as I write, there is an ongoing scandal over the related issues of illegal blacklisting of trade union activists in the construction industry and spying on political and campaigning groups (including undercover police officers having sexual relationships with campaigners). A current TV programme (A Very English Scandal) concerns the leader of a British political party who – in living memory – arranged the attempted murder of his former lover, and was exonerated following an outrageously biased summing up in court by the judge. And of course the Chilcot report into the Iraq war found that Prime Minister Blair deliberately exaggerated the threat posed by the Iraq regime, and was damning about the way the final decision was made (of which no formal record was kept).
However, as far as the ordinary workplace is concerned, especially in not-for-profit organisations, the message is clear – beware of metrics!
One evening in 1974, at a home in New Haven, the family of the late Jim McDonough gathered around their television to watch The Phil Donahue Show. To their horror, a piece of 1960s black and white footage was being shown in which Jim was having electrodes attached to his body. Jim was apparently the learner in an experiment whereby he would receive increasingly strong electric shocks whenever he failed to deliver a correct response to a question.
Bearing in mind that Jim had died of a heart attack in the mid-60s, his widow Kathryn must have been concerned that there might be a connection with this extraordinary piece of research. She wrote to the show’s producer, asking to be put in touch with the man who’d run the experiment, Dr Stanley Milgram. Shortly afterwards, she received a phone call from Milgram, who provided reassurance that her late husband had not in reality received any electric shocks at all. He also sent her an inscribed copy of the book that had caused the media interest: Obedience to Authority.
The Milgram shock experiments are the subject of an enthralling book by psychologist Gina Perry, published in 2012: Behind the Shock Machine: The Untold Story of the Notorious Milgram Psychology Experiments. By sifting through Milgram’s archive material, as well as interviewing some of his experimental subjects and assistants (or their surviving relatives), Perry shows that the popular account of the shock experiments, as promoted by Milgram himself, is but a pale and dubious version of what really happened and what the research means.
The popular account goes as follows. Milgram wanted to know whether the behaviour of the Nazis during the Holocaust was due to something specific about German culture, or whether it reflected a deeper aspect of humanity. In other words, could the same thing happen anywhere? In order to investigate this question, Milgram created an experimental scenario in which people would be pressured to commit a potentially lethal act. His subjects were recruited through newspaper advertisements in which they were promised payment for taking part in a study of learning and memory. As they arrived at Milgram’s laboratory at Yale University, a second subject (actually a paid staff member) would also appear. The experimenter (also a paid confederate of Milgram’s) explained that they were to take part in a study of the effects of punishment on learning. One of them would be the teacher and the other the learner. The two men each drew a piece of paper to determine which would be which, but the draw was of course rigged: the subject was always the teacher. The teacher was told that any shocks received by the learner would be painful but not dangerous. He would then receive a small shock himself as an illustration of what he would potentially be delivering to the learner. During the experiment, the teacher and learner would be in separate rooms, unseen to each other but connected by audio.
At the beginning of the experiment, the teacher would read out a list of word pairs to the learner. After this, he would read out each target word followed by four words, only one of which was paired with the target. The learner would supposedly press a button corresponding to the word he thought was correct. If the learner picked the wrong word, then the teacher had to flick a switch on a machine in order to deliver an electric shock to the learner. The level of shock increased with each word, varying from 15 volts to 450 volts. The two highest settings on the shock machine were labelled ‘XXX – dangerous, severe shock’. The experimenter was always present to oversee the teacher and, if the teacher began to show concern or balk at giving further shocks, would deliver an increasingly stern series of commands (according to a script) requiring the teacher to carry on.
In the first version of the experiment the teacher did not hear from the learner, but in other experiments the learner would begin to call out in increasing levels of distress once the 150V level was reached. There were additional variations, too, such as having the learner and teacher in the same room, having the teacher place the learner’s hand on the shock plate, changing the actors, changing the location to a downtown building, having the learner mention heart trouble, and using female subjects. The experiments began in August 1961 and concluded in May 1962. During the last three days of the experiments, Milgram shot the documentary footage that would form the basis of his film Obedience.
Obedient subjects were defined as those who delivered the highest possible supposed shock of 450V. In most scenarios about 65% of subjects were classed as obedient, though some of the variations (such as teacher and learner in the same room) did lead to lower levels of obedience. By the time Milgram came to write up his research, the Nazi Adolf Eichmann had been tried and hanged in Israel and Hannah Arendt had coined the phrase “the banality of evil”. The observation that dull administrative processes could lie behind the most atrocious war crimes was an ideal peg on which Milgram could hang his research. In an era when the Korean war had given rise to concerns about brainwashing, the concept of ‘American Eichmanns’ took hold.
Milgram’s first account of his work was published in October 1963 in the Journal of Abnormal and Social Psychology, but his famous book – still in print – did not appear until 1974. The original publication of Milgram’s work, and the later publication of his book, met with a mixed response from academics. Critics raised ethical concerns about the treatment of his subjects, pointed to the lack of any underlying theory, and wondered whether it all really meant anything. Wasn’t Milgram just showing what we all knew already, that people can be pushed to commit extreme acts? In response, Milgram pointed to a survey of psychiatrists in which most of them believed that his subjects would not be willing to cause extreme harm to the learners. He also cited follow-up interviews with subjects by a psychiatrist, Dr Paul Errera, which concluded that they had not been harmed and that most had endorsed Milgram’s research.
In his 1974 book, Milgram provided the theory to explain the behaviour of his obedient subjects. This was the notion of the ‘agentic shift’, according to which the presence of an authority figure leads people to view themselves as the agents of another person and therefore not responsible for their own actions. I can recall reading Obedience to Authority as a student in the late ’80s and being confused. To me, the agentic shift theory didn’t seem to explain anything. It simply raised the further question of why people might give up their sense of responsibility in the presence of an authority figure. Gina Perry points out that the theory also fails to explain the substantial proportion of people who didn’t obey, not to mention the discomfort, questions and objections of those people who nonetheless ended up delivering the maximum supposed shock (these objections figured in Milgram’s earlier publications but less so in his book). In suggesting that ordinary Americans could behave like Nazis, Milgram was also ignoring the entire counterculture movement and especially the widespread protest and civil disobedience in relation to America’s involvement in the Vietnam war.
But Perry goes deeper than merely questioning Milgram’s theory, which many other academics have also done. Her research into the archives resulted in the realisation that, over time, Milgram’s paid actors began to depart from their script. The experimenter was provided with a series of four increasingly strict commands that he was expected to give when faced with a subject who was reluctant to continue. If the subject still refused to continue, then the experimenter was expected to call a halt. But John Williams, Milgram’s usual paid experimenter, began to extemporise some of his commands and to cycle back through the list of four. In other words, some subjects were classed as obedient when in fact they should have been classed as disobedient.
It also turns out that many or most of Milgram’s subjects were not told straight away that the study they had taken part in was a hoax. In a relatively small community, he didn’t want the word to get about that this was the case. Despite this, in the published reports Milgram referred to “dehoaxing” the subjects at the end of the study. Subjects were sent a report about the study, including that the procedure had been a hoax, a little while after the entire series of studies had been completed. However, for whatever reason, some of the people that Gina Perry tracked down said they had never received such a report. They had gone most of their lives not knowing the truth.
Worse than this, contrary to what Milgram claimed, it is clear that some subjects were not happy about the nature of his research, either at the time (the usual experimenter, John Williams, appears to have been assaulted on more than one occasion) or later on. Some appear to have been adversely affected by their participation. In some cases, Milgram did manage to mollify people by taking them into his confidence. He then cited them as evidence that subjects were happy to endorse his studies. Some of Milgram’s subjects were Jewish, an ironic fact given Milgram’s linkage of his research to the Holocaust (Milgram himself was Jewish, but this was not something he disclosed in his earlier writings).
It also turns out that the clean bill of health given to Milgram’s research by the Yale psychiatrist Paul Errera was not quite what it seems. In fact, Errera’s interviews with some of Milgram’s subjects had taken place at the insistence of Yale University after complaints had been made. Only a small proportion of subjects were contacted and an even smaller number agreed to be interviewed, but in his book Milgram referred to these – against Errera’s wishes – as the “worst cases”, who had nonetheless endorsed his work. Milgram actually watched the interviews from behind a one-way mirror and, in some instances, revealed himself to the subjects and engaged in interaction with them. Perry suggests that Errera’s endorsement of Milgram’s work may have been influenced by his reluctance to derail the career of a young psychologist who clearly had so much riding on his controversial research. In any case, the presence of Milgram at the interviews was hardly ideal.
Milgram moved to Harvard University in July 1963. Perhaps mindful of the controversy surrounding his work, his research there avoided personal contact with subjects. In 1967, having been denied tenure at Harvard, he left for a job at the City University of New York. Perry notes that with both staff and students Milgram could alternate between graciousness and rudeness. She wonders if his mood swings might have been influenced by his drug use. This doesn’t feature prominently in the book, but Milgram had been using drugs since his student days, including marijuana, cocaine and methamphetamine. When writing Obedience to Authority he used drugs to help overcome his writer’s block and occasionally kept notes on the influence of his intake on the creative process.
Did his research ultimately tell us much at all? It seems unlikely that it really sheds light on the Holocaust, an event involving the actions of people working in groups and in the grip of a specific ideology. By contrast, Milgram’s subjects were acting as individuals in a highly ambiguous context. On the one hand they believed they were being instructed by a scientist, a highly trusted figure whom they would have been reluctant to let down. On the other hand, the setup didn’t make sense. Why was it necessary for a member of the public to play the role of the teacher in the experiment? Why didn’t the experimenter do this for himself? Also, some of Milgram’s own subjects were aware that punishment is not an effective method for making people learn, something that was well-established by the time that he ran his studies. One of Milgram’s research assistants, Taketo Murata, conducted an analysis that showed that the subjects who delivered the maximum shock were more often the ones who expressed disbelief in the veracity of the setup. Whilst Milgram argued that their responses after the study couldn’t be trusted, he was nonetheless happy to use these when it suited him.
Gina Perry shows that in private Milgram often shared many of the doubts that critics voiced about his work, including their ethical concerns. Publicly, though, he strongly defended his work, and more so with the passage of time. He wanted to be seen among the greats of social psychology, including his own mentor Solomon Asch, whose work on conformity was an obvious precursor to Milgram’s work. It seems, though, that Asch eventually stopped responding to Milgram’s letters, presumably increasingly uncomfortable with the ethical issues surrounding the shock experiments. Another famous psychologist, Lawrence Kohlberg, had watched some of the experimental trials with Milgram behind the one-way mirror. Yet he subsequently regretted his own passivity in the face of unethical research. In a letter to the New York Times he described Milgram as “another victim, another banal perpetrator of evil”.
What about Milgram’s paid actors, Williams and McDonough? Were they also culpable in perpetrating evil? Perry is sympathetic to these men. Like the subjects, they had been duped. They needed the money and had responded to an advertisement for assistants in a study of learning and memory. Possibly as the trials proceeded, they themselves became desensitized to what was happening. In any case, they received two pay rises from Milgram in recognition of the efforts they were making on his behalf. Another actor, Bob Tracy, took part in some trials but quit after an army buddy arrived at the lab and he couldn’t go through with the deception. But what kind of pressure were Williams and McDonough under? We know that Williams was assaulted more than once in the lab. And both men were dead of heart attacks within five years of the research ending. This is ironic, as many of the experiments featured the learner stating at the outset that he had a heart problem. There is also evidence that McDonough did experience a heart ‘flutter’ during one of the trials. Did Milgram know about his heart problem and deliberately incorporate this into the experimental scenario?
In conclusion, it is undeniably true that human beings, under certain circumstances, can do terrible things. But Gina Perry has done us a great service by showing that the behaviour of authority figures does not automatically turn us into unthinking automata who will commit atrocities. Through an exemplary piece of detective work she has shown that the people who served as Milgram’s subjects were, by turns, concerned, questioning, rebellious and even disbelieving. Some, though, were affected by the experiments for years afterwards. After all, if you had been pressured into delivering very painful shocks, and possibly a lethal shock, in the name of science, only to be told that you were the person being studied, and possibly never being told that no real shocks were delivered, how would you feel about yourself later on?
Note: Gina Perry is also the author of a new book ‘The Lost Boys’, which I hope to write about in due course.