Last month I took a research trip to the Huntington Library in San Marino, CA. It was my first “big” research trip. I was there for two weeks to look at a collection that comprised over 200 boxes. I knew I couldn’t get through it all, so I prioritized ahead of time which boxes I HAD to get through, which boxes I would LIKE to get through, and which boxes could wait until the next trip. I researched the Huntington’s rules and regulations and brought all the appropriate tools with me on my trip and on my first day. I thought I was all set to go, and then I went through orientation.
During orientation I learned what every savvy researcher probably already knows: (a) every archive/library/facility has its own quirks about how it lets you do research, which it doesn’t always advertise on its website, and (b) the research plan you have going in on your first day will invariably not account for said quirks, and you will have to come up with a brand new plan of action.
This second point can be incredibly frustrating and stressful if you are in a time crunch and can’t really afford to lose a day rethinking your research strategy. So, for any of you out there planning on doing research at the Huntington, or, tangentially, planning on pulling a lot of editorial cartoons from ProQuest’s Historical Newspapers database, here are a few things I learned.
The collection that I went to the Huntington to research was the personal papers of the cartoonist Paul Conrad, who drew for The Denver Post from 1950-1964 and The Los Angeles Times from 1964-1993. In the collection were approximately 100 boxes containing his original cartoons, and I needed to find out which ones were nuclear-themed. My plan was to go through each box, take a picture of the nuclear-related cartoons, and transcribe the appropriate metadata into a doc. This turned out to be an infeasible plan for two reasons: (1) the Huntington only gives you one box at a time to look through, but more importantly (2) you are only allowed to take pictures at a special reserved table that only fits 4 people, so you have to sign up for a 30-minute time slot, and you can only sign up for one slot in the morning and one in the afternoon. You are also only allowed one box at a time to take pictures from. To put numbers on it (which I can do now, having done the research): each box held 40-50 cartoons, of which at most 10, and on average more like 5-7, were ones I needed pictures of, and I averaged about 20 boxes a day. With only two photo slots a day, my original strategy would have been time-consuming and annoying not only for me but also for the amazing, wonderful, and helpful Reading Room Supervisors, whose job it was to facilitate the dispersal of boxes.
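For anyone who likes to see the arithmetic, here is a rough back-of-the-envelope sketch in Python. The box count, reading pace, and number of photo slots come from the paragraph above; the idea that one photo slot covers exactly one box is my own simplifying assumption, not a stated Huntington rule.

```python
import math

# Figures taken from the paragraph above.
cartoon_boxes = 100        # approximate number of boxes of original cartoons
boxes_read_per_day = 20    # my average pace when only reading and noting dates
photo_slots_per_day = 2    # one morning slot and one afternoon slot

# Assumption (not a stated rule): each 30-minute photo slot covers one box.
boxes_per_photo_slot = 1

# Days needed just to read through the boxes.
days_to_read = math.ceil(cartoon_boxes / boxes_read_per_day)

# Days needed if every box also had to pass through a photo slot.
boxes_photographed_per_day = photo_slots_per_day * boxes_per_photo_slot
days_to_photograph = math.ceil(cartoon_boxes / boxes_photographed_per_day)

print(f"Reading only: about {days_to_read} days")
print(f"Photographing every box: about {days_to_photograph} days")
```

Under those assumptions the photo table, not my reading speed, is the bottleneck: roughly 5 days of reading against roughly 50 days of photo slots, on a two-week trip.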
So I quickly came up with a Plan B. Luckily for me, most of the cartoons in the Huntington’s collection were from Conrad’s years at The Los Angeles Times, which is digitized in ProQuest. So I decided that I would go through the boxes, marking the dates of the cartoons in the collection in a quickly constructed “cartoon spreadsheet” in my Moleskine notebook, and use different symbols to track the cartoons. An (almost) finished page looks like this:
What does this all mean, you may ask? Well, I can tell you. An ‘X’ in a box means I’ve seen the cartoon published on that day and it is not nuclear-related at all. A ‘?’ means it’s possibly nuclear-related and I might have to come back and reexamine it. An ‘O’ means that it is a nuclear-related cartoon. A box around any symbol means that there were two cartoons in the archives for that date. The symbols inside the ‘O’s were added later, and this is where the crux of this post comes in.
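If you would rather keep a digital version of this key, it could be modeled along these lines in Python. This is only a sketch of the notebook scheme described above; the class and field names are invented for illustration, and the dates in the example are placeholders rather than real entries from the collection.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Mark(Enum):
    """The three base symbols from the notebook key."""
    NOT_NUCLEAR = "X"        # seen, not nuclear-related at all
    POSSIBLY_NUCLEAR = "?"   # may need a second look
    NUCLEAR = "O"            # nuclear-related, metadata recorded


@dataclass
class Cell:
    """One square of the cartoon spreadsheet (hypothetical structure)."""
    day: date
    mark: Mark
    boxed: bool = False      # True when the archive held two cartoons for this date


def needs_followup(cell: Cell) -> bool:
    """Anything other than an 'X' still needs attention later."""
    return cell.mark is not Mark.NOT_NUCLEAR


# Placeholder dates, for illustration only.
page = [
    Cell(date(1970, 1, 5), Mark.NOT_NUCLEAR),
    Cell(date(1970, 1, 6), Mark.NUCLEAR, boxed=True),
    Cell(date(1970, 1, 7), Mark.POSSIBLY_NUCLEAR),
]
todo = [cell.day.isoformat() for cell in page if needs_followup(cell)]
print(todo)
```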
Thankfully, I was recording the metadata for any cartoon that needed an ‘O’ in a doc. (Incidentally, this is also partly why the cartoon spreadsheet is analog instead of digital: it was easier to record metadata on the iPad I brought with me to the archive and mark my notebook simultaneously than to switch between apps on the iPad, one of which would have been Excel or Google Spreadsheets, which the iPad doesn’t really like.) My plan was to find the cartoons in ProQuest over the weekend, using the metadata to ensure I was grabbing the same cartoon I had seen in the archive. I merrily worked my way through the cartoon boxes, and by the end of Week One at the Huntington I had gotten through the first big batch of 76 boxes of cartoons.
That Sunday I sat on the couch at my friend’s house with my laptop and started hunting for the cartoons. I tried to target them as best I could, so I searched for “editorial cartoon” on the particular date that I was looking for. Sometimes I would find the cartoon; sometimes none of the search results would be a Conrad cartoon but rather a cartoon by another syndicated cartoonist like Mauldin or Herblock; and sometimes my search request would return “no results matching that criteria.” When I found a cartoon, I would download the record, save the file under the date of the cartoon in a folder labelled for that year, and put a check mark inside the circle on my spreadsheet to denote that I had a copy of the cartoon, which looked like this:
When I would get results but none of them were a Conrad cartoon, I wasn’t too concerned at first, since I had already discovered that the dates on the cartoons in the archives did not necessarily match the publication date in The Los Angeles Times. In addition, none of the editorial cartoons in ProQuest are attributed to the cartoonist, so it was also entirely possible that Conrad hadn’t had a cartoon published that day, either because it was his day off or because he was on vacation. Yet I was starting to get nervous as more and more search requests returned no Conrad cartoon or “no results” at all. If there were this many dates where the date on the cartoon and the publication date didn’t match (at this point I had over 20 cartoons that I couldn’t locate in ProQuest), I was going to have a serious methodological conundrum on my hands.
Finally, after running the same search query for every day of a week and getting “no results” each time, I became suspicious. Even if Conrad was on vacation, there are anywhere between 3 and 10 single-panel cartoons every day, not including the funny pages. There was no way that The Los Angeles Times didn’t run a single editorial cartoon for an entire week. Obviously there was something missing in my searches. So I got down into the weeds of ProQuest to figure this out. Trying to locate the editorial section of the paper in a digitized database is like hunting for an invisible needle in an invisible haystack. On microfilm, you can check the paper’s ‘Index’ on the front page and then scroll/scan till you get to that section; this is impossible to do on ProQuest, which only gives you the number of pages in each edition. So the editorial section is not always going to be on page 65 just because that’s where the Metro section starts; each day the length of each section shifts according to how much content there is. Eventually I discovered that the editorial cartoons were not always being titled as “Editorial Cartoon __ — No Title.” Sometimes they were titled “Comic.” Other times they carried the title of the editorial they were placed next to, because the digitization process didn’t recognize the cartoon as its own entity, like in this case:
The manual searching of the database also led me to discover that if Conrad had a cartoon published on a certain day, it was always on the page after the letters to the editor section, which, coincidentally, was always titled the same thing: The Letters to the Times. So if I searched for “The Letters to the Times” on a particular date, it didn’t matter if the editorial cartoon was titled “Editorial Cartoon,” “Comic,” or something else entirely unrelated, because I was finding it based on its proximity to a dependable search element. It was slightly more time-consuming, since it required about 2-3 clicks into the search results to get to the cartoon, but it was infinitely more reliable. I also decided that instead of waiting to download all the cartoons from ProQuest at one time (which was difficult, since after about 20 minutes of intense searching the database refused to display the PDFs of my results and blocked me), it would be easier to get them while I was going through the boxes.
After I finished with a box I would go to ProQuest and try to find the cartoons I needed. Sometimes, even with my new search methods, I couldn’t find a cartoon, but this was usually because its publication and archive dates didn’t match. Even then, the cartoon was usually published within a few days after the archive date, so it generally wasn’t hard to locate.
On the rare occasion when I had looked a week in either direction of the archive date and still couldn’t find the cartoon, I would hold the box, take a picture of the cartoon, and mark the center of the ‘O’ in my spreadsheet with a * to denote that I had a picture of it on my camera. I also color-coded my metadata doc to show whether a cartoon was found on the correct date (blue), found on a different date, in which case both dates were recorded (red), or not found at all, meaning that I have a picture of it instead (yellow).
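The date hunt and the color coding can be sketched in a few lines of Python. To be clear, I did all of this by hand; the function names below are hypothetical, the example date is a placeholder, and the search order (archive date first, then the following week, then the preceding week) is just one reasonable reading of “usually a few days after.”

```python
from datetime import date, timedelta
from typing import Iterator, Optional


def candidate_dates(archive_date: date, window_days: int = 7) -> Iterator[date]:
    """Yield publication dates to try, most likely first.

    Assumed order: the archive date itself, then each day of the following
    week, then each day of the preceding week.
    """
    yield archive_date
    for offset in range(1, window_days + 1):
        yield archive_date + timedelta(days=offset)
    for offset in range(1, window_days + 1):
        yield archive_date - timedelta(days=offset)


def status_color(archive_date: date, found_on: Optional[date]) -> str:
    """Map a search outcome onto the colors used in my metadata doc."""
    if found_on is None:
        return "yellow"   # never found in the database; photographed instead
    if found_on == archive_date:
        return "blue"     # published on the date written on the cartoon
    return "red"          # published on a different date; record both dates


# Placeholder date, for illustration only.
archive_date = date(1970, 1, 10)
to_try = list(candidate_dates(archive_date))
print(len(to_try), to_try[1], to_try[-1])
print(status_color(archive_date, date(1970, 1, 12)))
```

Each candidate date would then be plugged into a “The Letters to the Times” search by hand, stopping at the first date where the cartoon turns up.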
In the end, this research method actually turned out for the best, since it allowed me to become slightly more conversant with the quirks of ProQuest search results, which will make going through and filling in the blank squares of my spreadsheet a little less painful. While I may call myself a digital historian in training, I’ve always known that this doesn’t mean the digital will completely replace the analog. Both in finding what I needed in a digitized resource and in figuring out the best way to track my research progress, I learned that analog is sometimes better and that you can’t always trust the search results.