Digitization of Newspapers

The newspaper industry is one of the key verticals to which we have provided software and IT services for the past 25 years. We digitize all kinds of newspapers and deliver the output in PDF, JPEG, XML, or METS/ALTO XML format. We also classify and segment metadata at both the article and page level before delivering the final output as searchable PDF or METS/ALTO XML.
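To give a feel for the page-level output, here is a minimal sketch of an ALTO-style XML fragment built with Python's standard library. The element and attribute names (Layout, Page, PrintSpace, TextBlock, TextLine, String, CONTENT, HPOS, VPOS) follow the ALTO vocabulary, but the namespace and required header sections are omitted, so the result is a simplified illustration and not schema-valid ALTO. The function name build_alto_page and the sample words and coordinates are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_alto_page(page_id, width, height, lines):
    """Build a simplified ALTO-style tree for one page.

    `lines` is a list of text lines; each line is a list of
    (word, hpos, vpos, width, height) tuples, as an OCR engine might report.
    """
    alto = ET.Element("alto")
    layout = ET.SubElement(alto, "Layout")
    page = ET.SubElement(
        layout, "Page", ID=page_id, WIDTH=str(width), HEIGHT=str(height)
    )
    space = ET.SubElement(page, "PrintSpace")
    # One text block per page keeps the sketch short; real article
    # segmentation would produce one block per detected region.
    block = ET.SubElement(space, "TextBlock", ID=f"{page_id}_TB1")
    for i, words in enumerate(lines, start=1):
        line = ET.SubElement(block, "TextLine", ID=f"{page_id}_TL{i}")
        for word, hpos, vpos, w, h in words:
            ET.SubElement(
                line,
                "String",
                CONTENT=word,
                HPOS=str(hpos),
                VPOS=str(vpos),
                WIDTH=str(w),
                HEIGHT=str(h),
            )
    return alto


# Hypothetical OCR result: two lines of two words each.
sample_lines = [
    [("Daily", 100, 50, 80, 20), ("News", 190, 50, 70, 20)],
    [("Front", 100, 80, 75, 20), ("page", 185, 80, 60, 20)],
]
alto_root = build_alto_page("P1", 2000, 3000, sample_lines)
alto_xml = ET.tostring(alto_root, encoding="unicode")
print(alto_xml)
```

Because every word carries its position on the page, a viewer can highlight search hits directly on the scanned image.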

We also offer e-paper solutions for the newspaper market that cover the conversion of both current and archival content.

Newspaper Digitization Process

Conversion of Contemporary Newspapers

Contemporary newspapers are modern-day newspapers. Their conversion involves the following steps:

  • Downloading of born-digital files such as PDFs
  • Layout analysis and extraction of the data
  • Metadata tagging, formatting, and proofreading
  • Validation
  • Quality checks
  • Uploading of the output
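As an illustration of the validation step in the list above, the following minimal sketch checks one extracted article record before upload. The record fields (title, issue_date, page, text) and the rules applied to them are assumptions made for this example, not a description of the actual production checks.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArticleRecord:
    """Hypothetical article-level metadata produced by the tagging step."""
    title: str
    issue_date: str  # expected in ISO form, e.g. "2024-01-31"
    page: int
    text: str


def validate_article(record):
    """Return a list of problems found; an empty list means the record passes."""
    errors = []
    if not record.title.strip():
        errors.append("title is empty")
    try:
        date.fromisoformat(record.issue_date)
    except ValueError:
        errors.append("issue_date is not a valid ISO date")
    if record.page < 1:
        errors.append("page must be 1 or greater")
    if not record.text.strip():
        errors.append("text is empty")
    return errors


good = ArticleRecord("Budget announced", "2024-01-31", 1, "Full article text.")
bad = ArticleRecord(" ", "31/01/2024", 0, "")

print(validate_article(good))
print(validate_article(bad))
```

Returning a list of problems instead of raising on the first one lets a reviewer fix every issue in a record in a single pass.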

Digitization of Archive Newspapers

Digitizing archive newspapers involves the following steps:

  • Scanning of microfilm or paper-based newspapers.
  • Image processing: despeckling, deskewing, and cropping of images.
  • Assigning metadata to each issue, page, and article to increase the searchability of the newspaper.
  • OCR to create searchable full text.
  • Importing the OCR text, images, and metadata into a digital library software program.
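The despeckling step mentioned above can be sketched with a 3x3 median filter, a common way to remove isolated specks from scanned pages. This is a minimal pure-Python illustration and not necessarily the filter used in production; it assumes a binarized image stored as a list of rows in which 1 means ink and 0 means background, and it leaves border pixels unchanged.

```python
def despeckle(image):
    """Apply a 3x3 median filter to a binary image (list of rows of 0/1).

    Border pixels are copied unchanged; the input image is not modified.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Collect the pixel and its eight neighbours from the original.
            window = [
                image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            window.sort()
            result[y][x] = window[4]  # median of nine values
    return result


# A single isolated ink pixel (a speck) on an otherwise blank scan.
noisy = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
cleaned = despeckle(noisy)
for row in cleaned:
    print(row)
```

An isolated pixel is outvoted by its neighbours and disappears, while the interior of solid strokes survives because most of the surrounding pixels are also ink.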