By Lindsey E. Desmet
It’s a simple fact that newspapers hold a wealth of information. Local, national and international news are covered; opinion articles reveal attitudes on the hot-button issues of the time; entertainment releases are listed; births, deaths and marriages are announced. To a researcher, even the advertisements and page layouts of a newspaper reveal information. As Bernard Reilly and James Simon state in The Serials Librarian, newspaper archives “constitute a body of historical and cultural evidence […] which is not, and could not be, replicated elsewhere.” (Reilly & Simon, 2010).
But newspapers are ephemeral. They’re printed on broad sheets of thin paper for immediate consumption on a daily or weekly basis. After they’re read, they become repurposed, recycled or discarded by the reader. Even if they are kept, they age. The ink fades, and the newspaper becomes more delicate. But with so much information contained within them, it’s important that these newspapers be preserved for exploration by generations to come. This is why the digital preservation of newspapers is essential.
Research Methodology
There is no better way to understand the current state of newspaper preservation than to see it in action, so I decided to explore the archival offerings of a variety of urban, suburban and rural newspaper archives.
I looked at 20 newspapers in total, and all twenty had at least partial archives available digitally on their websites. This selection of newspapers included ten urban and ten suburban/rural, spread throughout all regions of the United States. (A list of newspapers consulted can be found in REFERENCES.) To locate a varied sampling of newspapers, I visited the United States Newspaper List (USNPL), an index which provides links to all of the country’s newspapers, television news stations and radio stations.
Digitization Methods for Newspapers
Gone are the days of microfilm, which was the most popular method for newspaper preservation from the 1980s through the early 21st century (Hasenay & Krtalic, 2010). Today, two types of digitization are prominent for newspapers and newspaper articles: PDF creation and full text.
PDF creation entails uploading a PDF version of a newspaper designed in software such as Adobe InDesign. This allows readers to view the edition of the paper exactly as it looked when printed, but on a screen rather than on paper. For older newspapers that are being made available in archives, creating PDFs requires the printed or microfilm version of the edition to be digitized, converted into PDF format and then uploaded to the newspaper’s website. Since no original computer file exists for editions that predate design software, only print or microfilm copies that have been preserved well enough to be read can be scanned and uploaded.
Full text is a popular format for recent articles, allowing them to become searchable without the use of additional keywords and metadata, as would be required for a PDF. Since the majority of newspapers do upload their articles to a website already, it is easy for them to archive the website posts of each article in full text, allowing readers and researchers to access that information long after the article leaves the website’s front page.
The full text method seems to be less popular with historic newspaper editions, based on my research. This makes complete sense, for offering full text of pre-computer editions would require the transcription of each article by hand. For newspapers with founding issues dating back to the 1800s, that’s a lot of typing to be done! To compensate for the changes in technology that have impacted the newspaper business as the centuries have passed, newspapers with extensive archives, such as the New York Times, the Los Angeles Times, the Miami Herald and the Detroit Free Press have utilized a combination of PDF and full text formats when creating their archives.
This chart shows the proportion of the 20 newspapers that I studied which make use of the PDF format, in comparison with the proportion which make use of full text in their archives. The majority make use of full text, while PDFs are popular for historical newspaper archives.
Digitization in Practice
Newspapers commonly maintain their own archives and as such, it is their own responsibility to make these archives available to the public. Some newspapers have done a better job of this than others.
The most extensive archive of those I studied was that of the New York Times, which spans the paper’s entire history, from 1851 to the present. The only other archive among these twenty newspapers that stretched into the nineteenth century was that of the Los Angeles Times, whose archive begins in 1881. The majority of the archives that I studied featured all articles printed since the newspaper’s website was founded, meaning anywhere from the mid-1990s (Tucson, Arizona’s Tucson Weekly) to as recently as 2013 (Wetumpka, Alabama’s Wetumpka Herald).
Not all newspapers have made their own archives available: sometimes the work is done for them. There has been a growth of newspaper and periodical databases, including 19th Century American Newspapers, a very valuable resource which is available through Gale. These services all require subscriptions, however, and can sometimes be tedious to search.
Google has done the public a great service by offering free, subscription-less digital archives of a variety of historic newspapers, and they are offered through a somewhat more user-friendly search interface than most of the subscription-based databases. Google’s archive offers 2,440 newspapers in at least four languages (I counted them from the list at the Google newspaper database site, http://news.google.com/newspapers). The scope of the digitization varies by newspaper. Some publications are only available in a few issues, while others have years' worth of editions available for browsing. Regardless, this service is definitely a step in the right direction, especially since these archives are offered to anyone with an internet connection, free of charge.
Issues of Access to Newspaper Archives
According to Hasenay and Krtalic in the Journal of Librarianship and Information Science, “the basic challenges and problems [of newspaper preservation] are usually divided into technical and organizational ones” (Hasenay & Krtalic, 2010). In other words, digital file formats are constantly changing and easily damaged. In addition, copyright concerns, archival management and financial resources can be problematic in preserving this type of information. This likely contributes to the use of the “full text” digitization model by so many newspapers. While PDF files have the advantage of preserving the information in the layout in which it was originally published, full text is easier to manipulate as technological changes come about. Physical preservation may seem more stable, but the paper used to print newspapers – appropriately named “newsprint” – is “not a permanent quality paper,” as Somnath Das describes, “because of the wood impurities that remain in the paper after processing” (Das, 2009). These impurities make the paper sensitive to environmental conditions such as the air quality and the amount of light, which can lead the newsprint to become weak or decompose (Das, 2009). Digital files may take more effort to manage, especially with file formats ever-changing, but in addition to not physically disintegrating over time, they also have the advantage of being more easily-accessible by a wide range of users.
But even with easier access, another issue occurs on the user’s end of newspaper archives. In the field of library and information science, we believe in the concept of open access for all. As such, there are some improvements that can be made to newspaper digitization in order to ensure free and open access to newspaper archives.
The majority of the archives I studied require the user to do a subject search in order to find articles, rather than allowing the user to browse the full archives. While subject searches are incredibly useful to researchers, one improvement for open access would be to make all issues available as they were when printed, allowing the user to view a full edition of the newspaper as a PDF or view all of a specific edition’s articles in full text. A researcher studying a general time period or seeking news items from a specific date would have trouble finding that information through the subject search. Different options should be available to facilitate the needs of a wide variety of users.
While a fair portion of the newspapers that I studied do make their archived editions available free of charge, the majority of newspapers opt for pay-per-article or subscription-only models of archive availability. Since many newspapers also have measures in place to prevent subscription-sharing, this can be a problem for libraries that wish to offer newspaper archive access to their users.
Journalism is a business, and the aim of many publications in making their archives available is to profit from the sale of archived articles on specific topics. This, unfortunately, gets in the way of the principle of open access. Some newspapers restrict their archives to subscribers, while others offer a pay-per-article service. Of the 20 newspapers I studied, only nine made their archives available free of charge. These revenue strategies may also be combined: the Owyhee Avalanche of Homedale, Idaho, for instance, offers most of its previous editions free of charge, but the three most recent years of the archives are available only to subscribers. Similarly, the Miami Herald requires a subscription to view the historical archives, but offers a pay-per-article system for articles published between 1982 and the present. Pay-per-article services would allow any library user to access articles by paying with a credit card, but this would shut out lower-income patrons who may not be able to afford the access fees. A library could purchase a subscription to a newspaper that allows archive access only to subscribers, but if the newspaper limits the number of articles that can be accessed per month to prevent subscription-sharing, those limits could be a problem.
In conclusion…
Though ideally all archives would be available free of charge, this is not always feasible in the financially struggling world of journalism. Newspapers are also, of course, not bound by the ideals or ethics of librarianship. Open access may not seem to be of great importance to them. However, journalism and librarianship are both fields that value intellectual freedom. A collaboration between these two fields to digitize, organize and make available the wealth of information that newspapers hold should be a priority, both for the good of today’s researchers and curious minds and for those of the future.
Resources
Das, S. (2009). Preservation of Newspapers. DESIDOC Journal of Library & Information Technology, 29(1), 72–75.
Hasenay, D., & Krtalic, M. (2010). Preservation of newspapers: Theoretical approaches and practical achievements. Journal of Librarianship and Information Science, 42(4), 245–255.
Reilly, B. F., & Simon, J. (2010). Shared Digital Access and Preservation Strategies for Serials at the Center for Research Libraries. The Serials Librarian, 59(3/4), 271–280.



Hi class,
I'd like to comment on the difficulties of digitizing historic newspaper collections in a public library setting, and spotlight one solution we are pursuing at my library right now.
The first obstacle is always funding, and the second is access to equipment and trained staff (and time) to handle the project. In the case of my own library, we house the original Lansing State Journal from 1869-1966 - it was then called the Lansing Republican. Since this is our state capital and one of the seats of the auto industry, the history provided in these newspapers is vital and should be preserved and freely accessible to all. However, CADL does not have the funding or staff to digitize this rare collection.
One solution is to pursue grants. Right now we are competing for a digitizing grant with other Michigan libraries housing historic newspaper collections, and the unique feature of this grant is that the winner will be chosen by voters! For more information: https://digmichnews.wufoo.com/forms/vote-for-your-library-to-win/
My question for the class is: what do you think of this approach to grant funding for newspaper (or other) digitization? Does it promote collections or just create an atmosphere of competition? Pros and cons or other thoughts?
-Heather
I think that grants are a great approach to securing additional funding for the digitization of newspapers. They do create an atmosphere of competition, especially with programs like the vote-driven one that you mentioned, but I think that the payoff of being able to preserve the wealth of information that newspapers hold is high enough to outweigh that. Competition isn't necessarily a bad thing -- no matter who "wins," the outcome will be that information is preserved, which is (of course) a positive. While some may argue that competitive grants put smaller libraries at a disadvantage, I don't think this is true. In a vote-driven competition a library with a higher number of users may excel, for instance, but not all grants use that selection process. Thanks for the response, Heather!
-Lindsey
Heather, I agree with Lindsey. In this instance, competition is not a bad thing if the end result is preservation of material. A grant specifically targeting small libraries may allow a library with a rare or specialized local collection to digitize newspapers that may have ceased publication long ago. This may also be the only funding source available to these libraries whereas a larger library may be able to obtain funding from another source.
Hi Rhonda and Lindsey,
Interestingly, in the case of the grant competition I mentioned, some users were rigging the system, so to speak, by casting multiple votes. Since these were from the same IP address, facilitators were able to spot this issue. But it brings up a good point: online competitions for grants are a neat idea, but in some cases the people setting up these services haven't effectively planned out how the technology works. I've seen this time and again in my own library! There's another (unrelated) grant competition out there on one of my listservs too where another librarian made sure to mention to everyone that "you can vote as many times as you like!" I thought to myself, ethics anyone?
The technology is there to engage patrons and staff in healthy competition for grants for important projects like digitization, for sure. What is missing is the foresight to set up these systems properly, because there will always be someone who thinks a loophole is an opportunity for their library to gain an advantage. You'd think this profession would be more cooperative and less competitive, but when funds are tight, I'm thinking the opposite is true!
Just some food for thought!
-Heather
Along the line of other thoughts, it seems like an easy way to get around duplicate votes would be with complete information: name, address, contact information available for spot checks? It would also allow those running the contest to create a database for interested parties for future grant competitions.
~M Lenox
Heather, thank you for bringing up the issue of flawed voting systems. These grant competitions definitely need to be designed with checks in place to prevent dishonest vote-casting. In addition to measures like identification verification as Melissa mentioned, I think there should also be clearly stated rules in place to restrict multiple votes, and to disqualify libraries engaging in dishonest voting practices.
Just to quickly address Melissa's response (which is a good one, but with a couple of impediments) - there are two problems with adding a layer of personal information to an online voting form for a grant competition, as I see it. One is that people do not want to vote if they have to add extra information. This is a "click one button or I'm out" society. I wouldn't vote if I had to add all that info in either - but mostly because I'm all about the anonymity! The other issue is that some libraries have a kiosk in the building for patrons to vote, so again it needs to adhere to privacy policies and also be really, really fast and simple.
So to add to this and Lindsey's thoughts, these online voting forms for winning grants are a great way to engage with patrons. But as we are learning in our 6080 class, not all non-profit staff are using the right tools or have the technical know-how to get a usable, secure system in place.
From my experience, the testing stage of new idea development often involves mostly just the developers (nope) and isn't long enough to find the bugs.
The grant I mentioned has had a flurry of 'new rules' emails since, to fix the problem in hindsight.
I guess they will learn from this experience! And we will too, hopefully, from following along on the listservs.
Thanks for your great posts!
-Heather
I agree that it would be very nice to be able to access this information for free, but as stated, newspapers are a for-profit business. Sadly, actual newspapers seem to be falling away, and in order to get the same information one must also pay to see it online. I am afraid that this may end up being the end of newspapers altogether, as people can get similar information for free on other sites on the internet.
I did not realize that Google had some available-- I will have to check that out!
I am a former journalism major and currently work in online journalism, which influenced my choice to write about this topic. I can see both sides of the "open access" issue because I'm involved in both fields. Newspapers are definitely struggling, and like libraries, they are in a time of transition. There is a struggle to find a balance between print and multimedia content, and to find financial solutions to compensate for decreasing print ad revenues. Paywalls, electronic subscriptions and web advertising are effective moneymakers, but as you said, they can also deter readers who will find the information at free-to-use websites. I think that a solution will be found if the old guard of traditional, print-loyal journalists (who still, in my experience, make up the bulk of the field) will accept the need for change and allow more creativity and innovation. Here is an interesting account from an intern regarding the resistance to multimedia by traditional editors: http://www.nieman.harvard.edu/reports/article/102124/Are-Newspapers-Dying-The-View-of-an-Aspiring-Journalist.aspx
This article mentions a hyper-local news focus as one potential solution (because national and world news are widely disseminated by wire services), as well as a greater focus on providing video accompaniments to stories (as an example of how multimedia can be used).
Thanks for the response!
I would like to point out that the Northville District Library has wonderfully digitized their newspapers from the early 1900s. It's awesome to see and search through, and I had the privilege of seeing them in use when someone asked a reference question that warranted the librarian researching them. They have digitized them in the Full Text format and so they are easy to search with keywords. You click into the digital format, enter in your search terms, and then all the articles that have ever been published with those terms pop up. The search provides the year, the issue number, the page number, the title, and the paragraph within that article that contains the term. It's pretty awesome.
The Local History Librarian told me that the Friends of the Library group set funding aside for the project like 20 years ago and it just was never touched. Then about 4 years ago, the project started. She had to write for an additional grant, but she got it and the project was finished not that long ago.
It's truly an invaluable source and something that I think all libraries should look to do if they can.
Thanks for mentioning that, Melissa! I wasn't aware that Northville's library had that available. I'll have to check it out. Do you have to be a card-holding member of the library to access the digital records, or can members of nearby communities make use of them as well?
This post is incredibly informative. Great job, Lindsey. I am interested in how you selected your newspapers, distributing them so evenly in terms of urban and suburban/rural. I have taken advantage of databases that have digitized historical newspapers (oftentimes international newspapers, what good fortune) for several assignments, and I loved every minute of doing that. When learning about a specific individual or time period, it can be super helpful to just skim newspapers to gain priceless insights. I have found this to be true because newspaper articles appeal to the gut-level, though they can be intellectually-stimulating too. Among all of the database subscriptions that could be canceled, I hope that institutions of higher-learning think twice before doing so with those that have digitized historical newspapers.
USNPL (which is linked in the post above if you're interested in checking it out!) was very useful in selecting a wide distribution of newspapers. Most of the urban newspapers were selected because I already knew of them (like the Freep, the LA Times, etc. - I was a student of journalism as an undergrad so I'm familiar with most of the major urban papers), but USNPL is great for finding smaller papers. It allows you to browse newspapers by state, so I just chose different states and looked for newspapers located in small communities.
I agree that a newspaper's insights are priceless! As you mentioned, while they can be intellectually-stimulating, they are written with facts as the first priority (with the exception of opinion pieces) and for a general audience. Any researcher can understand them and make use of the information provided by them.
Very interesting post! I just spent quite a bit of time exploring the Google Newspaper Archives website; so very interesting! It is great that these have been preserved, posted publicly, and FREE! I have bookmarked the website and think I have found a new favorite history website. Thanks for sharing!
Very thorough research, Lindsey, as always. I, too, am new to the Google Newspaper Archives website world and will take a look there and bookmark it for future reference. Several of the newspapers I read daily have a free online presence, or at least a portion of it is free. I think newspapers, like libraries, are in transition and have had to reinvent themselves.