A composite photograph of surgical gloves and masks discarded in Cardiff, UK, in December 2020. Credit: Matthew Horwood/Getty

If only somebody had counted the orphans.

That was one wish I had while trawling archives on the 1918 influenza pandemic to research my book Pale Rider. Another yearning? If only someone had saved biological samples of the unidentified respiratory disease that ravaged China in late 1917.

Historians a century hence will, I think, have a lot more to go on.

On 30 January this year, the World Health Organization (WHO) sounded a global alarm when it designated an outbreak of respiratory illness a ‘public health emergency of international concern’. On the same day, the US National Library of Medicine (NLM) launched a web archive for the incipient pandemic. “The disease didn’t even have a name yet,” says Susan Speaker, a historian at the NLM in Bethesda, Maryland. “We collected the tweet in which the WHO named it.”

Since then, the NLM has archived thousands of websites and social-media posts from governments and non-governmental organizations, journalists, health-care workers and scientists around the world. That’s in addition to all the COVID-related publications in its literature database, PubMed.

Efforts to document the pandemic for posterity have been under way everywhere since early in the year. Government agencies such as the US Centers for Disease Control and Prevention (CDC) in Atlanta, Georgia, and scientific institutions including the Pasteur Institute in Paris weren’t far behind the NLM. Their archives are being complemented by those of museums, libraries, historical societies and community groups. The global frenzy of collecting has even prompted talk of curatorial burnout.

Museum curators are on the lookout for discarded ventilators and failed prototype COVID-19 tests — but they must choose the moment they ask with care. “We can’t just say to busy people, ‘Would you stop developing the vaccine and talk to me about collecting stuff?’” says Natasha McEnroe, keeper of medicine at the Science Museum in London. “We have to tread very, very carefully.”

Others are storing souvenirs of people’s lived experience — video diaries, mask fashion, recordings of the quiet of locked-down streets. Or they’re salting away objects that the pandemic has rendered iconic: the signage around the lectern from which UK Prime Minister Boris Johnson spoke to the press; a wooden spoon that a little girl broke while banging her family’s cooking pots in support of medical personnel. For the first time, a pandemic has triggered institutional plans for rapid-response collecting — an initiative pioneered by London’s Victoria and Albert Museum (its virtual COVID-19 collection even includes a toilet roll).

Collectors are motivated by an awareness that something historic is unfolding, and that past pandemics were relatively poorly documented. Some archivists warn that future historians will have more data than they can make sense of. Others worry about blind spots. Either way, COVID-19 could buck a trend, says Astrid Erll, a memory scholar at the Goethe University Frankfurt in Germany. “It is the first worldwide digitally witnessed pandemic,” she wrote in September in the journal Memory Studies (ref. 1): “a test case for the making of global memory in the new media ecology.”

What to save?

In the decades after the 1918 influenza pandemic, the only people who really paid attention to it were virologists, medical historians and actuaries employed by insurance companies. Later in the twentieth century, its study became multidisciplinary, with economists, sociologists and psychologists showing an interest — along with ‘mainstream’ historians. But by then, much was lost, if it had ever been preserved.

The absences are eloquent in themselves: the lack of a reliable lab test for flu led to massive diagnostic confusion, for example, and women’s accounts are relatively rare.

So twenty-first-century archivists are trying to think ahead, to save material that might otherwise pass into oblivion (while trying to distinguish the real from the fake news). At the Pasteur Institute, for example, all internal documentation is being collected, including meeting reports; the same is true at the beleaguered CDC. “We are also beginning to ramp up efforts to collect more personal materials from CDC staff, including video diaries, journals and photographs,” says Judy Gantt, director of the David J. Sencer CDC Museum in Atlanta.

The 1918 flu outbreak, like all epidemics that have been measured, highlighted inequality. Today’s public-health organizations are — to a greater or lesser extent — documenting that dimension of the current pandemic. For example, Gantt’s team is collecting data on how CDC guidelines are being implemented in communities, as well as on health disparities, social justice and activism.

An iron lung, a breathing aid for people paralysed by polio, in the Science Museum, London. Credit: SSPL/Getty

International archiving projects are trying to capture the interconnected and globally variable nature of what we’re living through. The NLM, for example, contributes to an effort launched by the International Internet Preservation Consortium (IIPC) in February. With members in more than 45 countries, including national, regional and university libraries and archives, the IIPC aims as far as possible to sample the multiplicity of the pandemic across the world. “We want to get a lot of nitty-gritty detail about individual countries, selected by experts in those countries,” says Alex Thurman, a librarian at Columbia University in New York City and one of the collection’s lead curators.

At the other end of the scale, there are local and specialized initiatives. The National Library of Israel in Jerusalem is documenting the impact of COVID-19 on Jewish communities globally through what it calls “ephemera” — e-mails about synagogue services that have moved online, for example, or announcements of innovative Jewish law rulings. The American Institute of the History of Pharmacy in Madison, Wisconsin, is gathering pharmacists’ testimonies in any and all media. The Museum of the Home in London is asking Britons to submit photos of their domestic lives, in a project called Stay Home.

Data deluge

Archivists are aware that it’s not for them to decide what future historians will consider relevant. In 1918, many people doubted that the first wave of the pandemic, which resembled seasonal flu, was caused by the same pathogen as the much more lethal second wave (it was). As evidence comes in, new connections are made while others fade. The tendency today has therefore been to collect everything, within each organization’s broad remit. Hence the ocean of data — and the pandemic rages still.

“We had to slow our colleagues down early on because we were using our data budget so quickly,” says Thurman. The IIPC’s 2020 allocation, fixed before the outbreak at 3 terabytes, was full by June. The Internet Archive, a non-profit organization that is sponsoring the IIPC’s COVID-19 collection, gifted it another 2 terabytes, and it has used nearly 4 terabytes so far, storing archived copies of close to 11,000 selected web resources in 66 languages.

Data mining and machine learning tools are going to be needed to explore such large data sets, and Speaker worries that those tools are not yet as well developed as the ones for collecting. It’s a concern others share. “The interesting thing about this pandemic is that we may end up with too much information and very little sense of how to sort through it,” says historian Erica Charters at the University of Oxford, UK.

Holes in history

Pandemics pose logistical problems, one reason they have tended to leave such light archival footprints. “Collecting infectious disease is a real challenge,” says McEnroe, speaking from her son’s bedroom. Many museums have closed, and archivists have been working from home. Physical collecting has health and safety risks, and raises ethical concerns. Samples of infected lung tissue taken from patients in 1918 were used in 2005, controversially, to bring the virus back to life (ref. 2), something that nobody in 1918 would have dreamed possible.

So there are holes in the wealth of collected material. Senior scientists and politicians might have their deliberations captured by normal reporting mechanisms such as Hansard, the official transcriptions of UK Parliamentary debates, but they tend not to have time to keep video diaries or screenshot their Slack channels. Nor do patients or front-line workers — especially those from vulnerable groups. “Those who will be suffering the most … will have the least time, energy, and ability to create a full documentary record of what’s going on as it unfolds,” wrote Eira Tansey, a digital archivist at the University of Cincinnati in Ohio, in a blogpost in June (see go.nature.com/2kci5vp).

Geographical coverage is patchy, too. The Russian Museum of Medicine in Moscow plans to collect information about the medical response across Russia’s ‘red zones’ — where infection rates are high — but the project has been delayed by the pandemic. At Krea University in Sri City, India, historian John Mathew didn’t have his proposal to collect the oral histories of migrant workers approved until October. The IIPC has only one member in Africa and few in Asia. Thurman and his co-lead curator, Nicola Bingham at the British Library in Boston Spa, are conscious of the organization’s membership bias towards wealthier nations.

During the 1918 pandemic, the continents that experienced the highest mortality rates — Asia and Africa — were the least well documented. Something similar could happen again. “There is a privilege in being able to collect, in having the resources to do so,” says Charters.

Whose record?

Debates around the recording of epidemics — or any historical event — are not new. The choice of recording tools changes according to culture and period, too. In some places, the personal COVID-19 diary has flourished, be it on paper, Instagram or TikTok. But the diary as an everyday tool for introspection was developed in Christian cultures, Charters says, and might not be privileged in the same way in non-Christian ones.

And when should you have started your diary, or your archive? People living in countries where epidemics are rare perceive a clear before and after for COVID-19. That is not true everywhere. For instance, many Kenyans see the pandemic as a continuation of an old and unremitting drama. As anthropologists Wenzel Geissler and Ruth Prince at the University of Oslo observed in July (ref. 3): “People in this region have experienced a long century of epidemics and anti-epidemic measures of varying duration and intensity”.

We’re no more likely to agree on the end of the pandemic. If COVID-19 becomes a regularly circulating disease, as many predict, and if its social and economic consequences linger for years or decades, when do we say it’s over? Archivists need to fix an end date, if only to manage their dwindling resources. They know that date will be arbitrary, yet it will shape the histories that will be written. Convention in the global north is that the 1918 pandemic lasted for 18–24 months. It raged in the Pacific islands well into 1921.

The only comfort, says Charters, is that histories can be rewritten in light of accumulated knowledge, and in ways that it’s difficult to predict. Think of the less well known influenza pandemic of the 1890s. There is now debate about whether it was actually caused by a coronavirus, thanks to some results (ref. 4) that have resurfaced this year. “A hundred years from now, they really will write a completely new version of COVID-19 that it’s very unlikely we would recognize,” says Charters.