  Content Processing  
 
Data Capture

HTC provides document scanning and data capture services, including high-volume scanning of books, documents, drawings, microfilms and photographs. Our Kodak and Sunrise paper and microfilm scanners offer a total capacity of more than 500,000 images per day, making us one of the largest-capacity data capture service providers in Asia.

Our expert technicians analyze the source material and scan it at the optimum settings for storage, viewing and OCR. In addition to OCR output, we can also capture word coordinates for highlighting search keywords on images.
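
As an illustration of how captured word coordinates can support keyword highlighting, the following is a minimal sketch, not a description of HTC's actual tooling. The `OcrWord` record, its pixel-based bounding box fields, and the `find_highlight_boxes` helper are all hypothetical names chosen for this example, and the sample words are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrWord:
    """One recognized word and its bounding box in image pixels (hypothetical layout)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def find_highlight_boxes(words, keyword):
    """Return (left, top, right, bottom) boxes for words matching the keyword.

    Matching ignores case and leading or trailing punctuation.
    """
    target = keyword.strip().lower()
    boxes = []
    for word in words:
        normalized = word.text.strip(".,;:!?\"'()").lower()
        if normalized == target:
            boxes.append((word.left, word.top, word.right, word.bottom))
    return boxes


# Made-up sample data standing in for the OCR output of one page image.
page_words = [
    OcrWord("The", 100, 50, 140, 70),
    OcrWord("Herald,", 150, 50, 230, 70),
    OcrWord("daily", 240, 50, 290, 70),
    OcrWord("herald", 100, 80, 170, 100),
]

print(find_highlight_boxes(page_words, "Herald"))
```

A viewer could then draw a translucent rectangle over each returned box, so the reader sees the search term highlighted on the scanned image rather than only in the OCR text.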

Depending on the content and product requirements, we can zone out article clippings, identify and capture data from front and back matter of publications, article headings, illustrations, etc.

A few specific projects we have executed include:
  • Scanning and conversion of eighteenth and nineteenth century literature
  • Scanning and conversion of nineteenth century periodicals
  • Scanning and conversion of legal briefs
Data Entry

We also provide high-volume automated data capture, manual keying from image, and OCR output correction to create or update metadata, index data or full text. We maintain an accuracy rate of 99.9% or higher, as required by customers.
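
The page does not say how the accuracy rate is measured. One common way to check captured text against a verified reference is character-level accuracy based on edit distance, sketched below. This metric, the function names, and the sample strings are assumptions for illustration only.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference, captured):
    """Share of reference characters captured correctly, floored at 0."""
    if not reference:
        return 1.0 if not captured else 0.0
    errors = edit_distance(reference, captured)
    return max(0.0, 1.0 - errors / len(reference))


# Made-up sample: one substituted character in a 19-character string.
reference = "Rocky Mountain News"
captured = "Rocky Mountain Ncws"
accuracy = character_accuracy(reference, captured)
print(f"accuracy = {accuracy:.4f}, meets 99.9% target: {accuracy >= 0.999}")
```

On a sample this short a single error fails the threshold, which is why a 99.9% target is normally evaluated over a large batch of keyed or corrected text rather than a single field.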
 
Data Conversion

HTC will help you meet your document conversion and information capture needs. HTC can create digital content from any source, including paper, microfiche, microfilm, aperture cards and electronic files, and convert data to any industry or company standard or format, for example TIFF, PDF, ASCII, EBCDIC, comma-separated values, fixed format, SGML, XML or other tagged formats. If required, we create custom programs to convert data into XML automatically. We work with our customers to design Document Type Definitions (DTDs) for easy integration, manipulation and management.
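
To give a sense of what an automated conversion into XML can look like, here is a minimal sketch that turns comma-separated values into XML using only the Python standard library. It is not HTC's conversion program. The element names `records` and `record`, the `csv_to_xml` function, and the sample data are assumptions for this example, and a real project would follow the element structure defined in the agreed DTD.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="records", row_tag="record"):
    """Convert CSV text with a header row into an XML string.

    Each row becomes one row_tag element and each column becomes a child
    element named after its header. Headers must be valid XML names.
    """
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, row_tag)
        for field, value in row.items():
            # ElementTree escapes characters such as & and < on output.
            ET.SubElement(record, field).text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample data with a header row.
sample = "title,year\nExample Gazette,1850\nSample Journal,1875\n"
print(csv_to_xml(sample))
```

Building the tree with a library rather than concatenating strings keeps the output well-formed even when the source data contains reserved characters.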
 
Content Repurposing

Publishing companies reuse digital assets across delivery mediums and audiences. HTC enables repurposing of a customer’s content by:
  • Analyzing disparate content formats
  • Designing a standard format to preserve the content’s important features that may be required to fit a new delivery or use scenario
  • Converting the content into the standard format
  • Loading the data into a single repository such as MarkLogic that allows easy search and retrieval
  • Developing an interface for customized content retrieval, and repurposing the content to match the end-user’s delivery preferences and environment
The overall content repurposing solution and architecture are designed to meet the requirements for content management, rights protection, and access control.

Newspaper / Periodicals Digitization

HTC has digitized approximately 4 million pages of 19th century newspapers, periodicals and journals from microfilm, with full-text content and images from a range of urban and rural regions throughout the US and UK. The content covers the American Civil War, African-American culture, history, Western migration and other subjects, and includes a significant quantity of graphical illustrations. A sample of the newspaper titles that we have digitized includes:
  • New York Herald (NY)
  • Rocky Mountain News (CO)
  • Milwaukee Sentinel (WI)
  • Mountaineer (SC)
  • British Women’s Temperance Journal (UK)
  • The Northern Star (UK)
  • The Fishing Gazette (UK)
HTC delivers images in standard formats such as Times Digital DTD, NDNP (National Digital Newspaper Program), OVID and proprietary formats to enable display of clipped articles and full pages, hit-term highlighting, and search by issue, article, and article category such as Editorials, News, Arts and many others. HTC’s newspaper digitization solution, a combination of automatic and manual processes, enables our customers to provide value in the digital Internet age to end-users such as researchers, scholars, and students. The process uses our publishing and conversion frameworks, OCR technologies, and heuristic algorithms developed by our subject matter experts, who have significant experience in content analysis, workflow and digitization processes.
 
e-Book Conversion
 
HTC’s digitization workflow quickly converts publications such as journals, magazines and reference titles in literature, science and technology into searchable online resources. We have digitized thousands of pages for some of the world’s leading publishers, and our process enables:
  • Automated conversion from hard copy, microfilm images, PDF, etc.
  • High volume conversion
  • Consistent quality output from OCR, indexing and XML conversion process
  • Validation tools and routines at the end of every activity to meet the product specification
Our vision “Reaching out… through IT®” has been the guiding principle in defining our processes and solutions by leveraging technology. Our conversion process is automated to achieve reliable and consistent output, and is complemented by manual processes through which we constantly upgrade our tools, technology and processes. The process is stable and scalable, and covers scanning, image enhancement, zoning, tagging, optical character recognition (OCR), editing, and advanced search and retrieval. Drawing on our domain knowledge in the publishing vertical and our subject matter experts (SMEs), our solution identifies articles, images, diagrams, equations and graphs so that today’s World Wide Web users can search and research them.
© 2010 HTC Global Services