Shanghai Expat Association
Archiving 'The Courier' digitally – 7000 pages in 700 days
Updated: May 7, 2022
By Jennifer Stubbs
Early in 2020, Evan Stubbs decided he needed a project. There just wasn't enough on his plate as SEA President. (That’s a joke; I think I’m a great comedienne.)
Luckily, someone on the Movin’ sales app was selling a very nice scanner at a price not-to-be-missed.
How hard could it be?
As it so happened, he already had three IKEA crates containing the archive of SEA’s publication, The Courier.
Evan watched YouTube and installed software in preparation.
What file type? What naming conventions? What folder structure? What image density? Color or black and white? All the volumes in a year in the same file? Or a different file for each page?
What rat's nest had he opened?!
After laying a solid foundation, Evan commenced scanning with gusto. Also, he got to wear white gloves! His cousin works in a university archive, so he knew to respect the page and to guard against the damage skin oil can cause.
The days flew by. He found his groove. On a good day, he could capture a year.
Rainy days were hard. The indoor lights caused more glare than the hardware or software could remove in post-production.
Progress slowed during typhoons and in winter, when many days were too grey. Scanning the glossy pages under interior lighting was possible, but it was not ideal and took much longer. On those days, Evan had to adjust the light meter for each page, adjust the angle and position of four lights (some pages are curvier than others), and open images in post-production to counter glare that might have obscured words or photos.
But Evan persisted. It was intriguing to follow the growth of SEA through the last 30 years. From ASCII art and early Women's Lib jokes to full color. From weekly (!) issues to quarterly. From complete member directory listings to born-digital issues (and hence no need to scan anymore!). From obscure names to familiar ones!
He learned to use the software to ensure every page was correctly numbered inside every issue’s file. He saved each membership year in its own folder. The filenames reflect year (volume) and issue, even through 2010, when the association decided to switch from calendar years to volumes reflecting membership years, resulting in one volume that spans two years!
Evan built up a routine: if the light was good, then he was up early to start scanning! After all the pages of one issue were scanned, the computer needed 10 minutes to run OCR (optical character recognition), which makes the files searchable by word and readable by screen readers. Once that process began, he had time for his morning routine (in a few batches): taking care of the cats and apartment and maybe breakfast. Sometimes he forgot the last one.
It had become a race: would the weather hold out on a day when Evan wasn't at an event? Could someone foster the kitten, Kia? She loved to push things off tables and to sit on the laptop keyboard.
At 12:40 pm on December 29, 2021, with less than a day left before repatriating in earnest, Evan scanned the final issue!
The files live on several thumb drives. There is a list of which volumes were born-digital, which volumes are scanned and OCR’d (searchable), and which are somewhere in between. Going forward, we are still looking for volunteers to team-edit The Courier. Hopefully, Evan’s project will make retrospective articles easier to compile, and perennial article topics handier to locate and update.
A note from the SEA Editors: Between March 13, 2020 and December 21, 2021, Evan scanned 10,240 pages over 22 months. The digital archive of SEA Couriers is available on request. Below are a few snapshots.
About the author:
Jennifer AW Stubbs, from the USA, worked at the library of New York University Shanghai, a joint Sino-foreign university in Pudong. She lived in Shanghai from 2017 to 2021.