Small industry has emerged to scan books into computerized format

“Someday all this stuff is going to be online. Right?” — An attorney, circa 1988.

Digital libraries have been in existence since the 1960s. They caught on immediately and have been popular ever since because they save time for people. What does their proliferation mean for job seekers?

Before we answer the question, let’s be clear on what we are talking about.

In a certain sense the World Wide Web is the biggest library of any kind (digital or not) that has ever existed. But the information on the Web is of varying quality.

What will be discussed in this article is any intellectual work that is created in accordance with accepted standards of research, writing, editing and publication. No need to add the qualifier “digital” as virtually all intellectual work nowadays is created with electronic tools.

This definition encompasses books and journal articles, but also video, audio, games and presentations. They can be nonfiction, fiction or purely recreational. A digital library is any resource that organizes these works so that people can find them with relative ease.

130 Million Books

The library of Alexandria, Egypt, was the greatest repository of knowledge in the ancient world, with holdings estimated at more than 500,000 documents. The destruction of this collection (through war, neglect and religious strife) is still a source of anguish for scholars.

We face a similar situation today in that many of our most treasured works are printed on paper that is disintegrating. Happily, there are efforts afoot to digitize them and preserve them for future generations before they are lost for good. The best known of these are Google Books and Project Gutenberg.

But many universities and research libraries also are digitizing their collections as a matter of preservation. A small industry has grown up around scanning books into digital formats. Scanning firms employ advanced technology and many staff members.

The process of digitizing books begins with bibliographers, archivists and preservationists who assess and prioritize the works to be scanned and prescribe the best scanning procedure. The actual scanning is done by scanning technicians.

The fastest way to scan a book is to shave off the spine and feed the pages through scanners that process a thousand pages an hour. Obviously, you only make this sacrifice if there are plenty of copies of a work in existence. If a volume is extremely rare and fragile, a scanning technician wearing white cotton gloves will place the book on a scanner and carefully turn and scan the pages one at a time. Either way, once the scanning is complete, a quality check ensures that the work has been accurately and completely preserved.

Finally, bibliographic information in library catalogs is updated to reflect the existence of a permanent digital copy. This can be done by the scanning tech or by a digital media librarian.

Google has estimated that there are 130 million unique books in existence, and it hoped to have them all scanned into Google Books by 2010. The fact that it is still scanning indicates the enormity of the task and the positive outlook for the scanning industry.

Digital Librarians

Digital librarianship is a relatively new specialty. These professionals do everything traditional librarians do, only with digital works: collection development, cataloging and classification, and reference service. But they also face a more daunting challenge: digitizing sometimes massive paper collections for preservation. For institutions not associated with Google Books, these conversions can take years and involve armies of scanners and support technologists.

Digital Rights Managers

The movie “The Hurt Locker” was critically acclaimed and won Academy Awards for Best Picture, Best Director and Best Original Screenplay. But it lost money. The reason? By the time the official DVD was released, the market was saturated with bootleg copies.

In an era when anybody with an iPhone can download and disseminate content, striking a balance between promoting the use of intellectual work and preventing promiscuous copying and distribution is a serious challenge. Digital rights managers either consult in this area or develop software that attempts to prevent unauthorized copying. The evolving law of intellectual property will be the key that unlocks the power of sharing copyrighted digital works.

Indexers

The challenge in information searching is for each user to find and get what they need. Web search engines have conditioned us to accept “good enough” information, the kind that tells you something about what you want to know as long as you are willing to look at a few advertisements. But the best search results come from information that is organized by skilled information professionals. From an employment perspective, here is how this is playing out and will continue to do so.

Search algorithms, such as those developed by Google, Yahoo or Bing, will flood you with plenty of “good enough” information. But you get the best results when you search a database in which human indexers have taken steps to tell you what each item is about.

Indexing involves reading key parts of an item then assigning subject headings that convey what the contents are about in a predictable way. The subject headings come from a list of terms that describe concepts. These terms are agreed upon in advance and reflect the way people in the field speak. They reside in specialized vocabularies developed by taxonomists.

Computer searching relies on matching words (character strings, to a computer). So if you know what words to use, such as the subject heading, you are more likely to get a “hit” and it is more likely that your hits will be highly relevant (as opposed to “good enough”). These types of services deal mostly with high-value information: law, science and medicine. The people who do this work usually have degrees in information science and in the subject area.
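The difference between matching words and matching subject headings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three records, the tiny vocabulary and the function names (index_record, keyword_search, subject_search) are all hypothetical, and a real indexing service would use a far larger vocabulary maintained by taxonomists.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    title: str
    text: str
    subject_headings: set = field(default_factory=set)


# Hypothetical controlled vocabulary: terms agreed upon in advance.
VOCABULARY = {"Myocardial Infarction", "Copyright", "Digital Preservation"}


def index_record(record, headings):
    """Assign subject headings, rejecting any term outside the vocabulary."""
    unknown = set(headings) - VOCABULARY
    if unknown:
        raise ValueError(f"not in vocabulary: {sorted(unknown)}")
    record.subject_headings.update(headings)


def keyword_search(records, phrase):
    """Plain string matching against title and text, ignoring case."""
    needle = phrase.lower()
    return [r for r in records
            if needle in r.title.lower() or needle in r.text.lower()]


def subject_search(records, heading):
    """Match only the headings a human indexer assigned."""
    return [r for r in records if heading in r.subject_headings]


# Illustrative data, invented for this sketch.
RECORDS = [
    Record("Treating Heart Attacks",
           "New guidance for emergency care after a heart attack."),
    Record("Cardiac Events in Athletes",
           "A study of myocardial infarction among runners."),
    Record("A Broken Heart",
           "A novel about love, loss and a heart attack of the emotional kind."),
]

# The indexer reads each item and tags the two medical works.
index_record(RECORDS[0], ["Myocardial Infarction"])
index_record(RECORDS[1], ["Myocardial Infarction"])

keyword_hits = [r.title for r in keyword_search(RECORDS, "heart attack")]
subject_hits = [r.title for r in subject_search(RECORDS, "Myocardial Infarction")]
print(keyword_hits)
print(subject_hits)
```

In this toy data set, the word search misses the study that never uses the phrase “heart attack” and picks up the novel that does, while the subject-heading search returns the two medical works and nothing else.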

Support Occupations

The digital information sector employs a host of support specialties, some mundane and some using technology of unimaginable sophistication.

• Database staff: All of this “stuff” is stored in digital databases. To make this happen you need database administrators, data entry staff, and database designers and programmers to specify, program and update the code.

• Web developers: Except in rare instances, people access these databases via the World Wide Web. Web developers work with database programmers to design interfaces that allow users to do fairly sophisticated searches with a few simple techniques.

• Network and storage specialists: In case you were wondering, our digital information is stored in hangar-like buildings that house rank upon rank of server racks. These facilities have to be kept cool (servers generate a lot of heat) and backed up with redundant systems. They are staffed 24/7 by people who monitor performance, anticipate problems and take action before a disaster occurs.

Networking specialists ensure that these servers remain “live” on the Internet available to anyone who punches a query into their favorite search engine.

State of Digital Libraries

This represents the state of digital libraries based on the technology we have today. We would like to imagine that somewhere, some artificial intelligence scientist is working on a system that will make all the world’s knowledge available instantly, to anyone who needs it, with due consideration for the creators. Until that day comes, we’ll be relying on people in all the occupations we’ve mentioned above to keep the digital library growing.
