Document Conversion
We have developed an in-house application that
raises the output of each operator. These
projects require OCR of TIFF images to 99.995%
text accuracy and tagging to SGML/HTML/XML
standards.
Conversion to Tagged Text Output
SGML: Standard Generalized Markup Language.
A markup language used to define the structure
of, and to manage, documents in electronic form.
SGML is widely used to manage highly interrelated
documents, large documents, and high-value
documents that are subject to frequent revisions.
XML: Extensible Markup Language, a general-purpose
markup language for the web. It allows designers
to create customized tags that enable structured
data (such as spreadsheets and address books) to
be defined, transmitted, validated and interpreted
between web applications and between organizations.
XML provides key features needed for a new
generation of web applications. It is highly
extensible: users can define new tags as needed.
Its structure is hierarchical, so data can be
modeled to any level of complexity. It can be
easily validated for structural correctness. It is
also media independent: the same content can be
published in multiple media.
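As a rough illustration of customized tags and a structural check, the sketch below builds a tiny address book and then confirms that the serialized text parses. It uses Python's standard xml.etree.ElementTree module. The tag names (addressbook, contact, name, city), the helper names and the sample entry are invented for this example and are not part of any production workflow. The check covers well-formedness only, not validation against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def build_address_book(entries):
    """Build an XML tree with custom (hypothetical) tags from (name, city) pairs."""
    root = ET.Element("addressbook")
    for name, city in entries:
        contact = ET.SubElement(root, "contact")
        ET.SubElement(contact, "name").text = name
        ET.SubElement(contact, "city").text = city
    return root


def is_well_formed(xml_text):
    """Return True if the text parses as XML (well-formedness only)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Placeholder sample data, serialized to a plain string.
xml_text = ET.tostring(
    build_address_book([("Example Name", "Example City")]),
    encoding="unicode",
)
print(xml_text)
print(is_well_formed(xml_text))                      # properly nested tags
print(is_well_formed("<contact><name>x</contact>"))  # unclosed <name> tag
```

The same tree could be written out for several target media, since the tags describe the content rather than its appearance.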
HTML: Hypertext Markup Language. HTML is the
markup language of the World Wide Web. The rules
of HTML define the manner in which text, images,
links and so on are represented in a web browser.
For documents with few or no graphics, and where
precise document layout is not a strict
requirement, HTML is a good choice. Unlike plain
text, HTML enables web pages and documents to be
linked to one another, giving the user access to
a wider range of information on the net.
Content capture services include:
Quark™, PageMaker™ and MS Word™ to PDF and XML.
Complete tagging for XML, SGML and HTML.
Up to 99.995% text accuracy from scanned documents.
Most database formats supported.
Complete validation, data processing and fully
customizable services.
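To put the 99.995% figure in concrete terms, the sketch below converts an accuracy target into an error budget. It assumes accuracy is measured per character, which the text above does not state, and the function names are hypothetical. Under that assumption, 99.995% allows at most 5 wrong characters in every 100,000. Decimal arithmetic is used so that the small percentage difference is not distorted by binary floating point.

```python
from decimal import Decimal


def allowed_errors(char_count, accuracy_percent="99.995"):
    """Maximum character errors permitted at the given accuracy target.

    Assumes per-character accuracy; fractional results round down.
    """
    error_rate = (Decimal(100) - Decimal(accuracy_percent)) / Decimal(100)
    return int(char_count * error_rate)


def meets_target(char_count, error_count, accuracy_percent="99.995"):
    """Return True if the observed error count is within the budget."""
    return error_count <= allowed_errors(char_count, accuracy_percent)


print(allowed_errors(100000))   # budget for 100,000 characters
print(meets_target(100000, 5))  # exactly on the budget
print(meets_target(100000, 6))  # one error over the budget
```

A page of about 2,000 characters has a budget of zero errors under this measure, so the target is only meaningful when checked across a large batch of text.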
Acrobat PDF Conversion
PDF conversion is one of our core areas of expertise.
The most common PDF outputs are given below:
PDF Normal (fully text searchable, with images
and graphics)
PDF Image + Text (OCR'd text embedded over the
original image). The output is fully text
searchable, though accuracy depends on the
quality of the input image. Some clients also
require parsed, 99.995% accurate text embedded
on the image.
Full color PDF - An exact replica of the
original printed book. Ebook publishers prefer
these outputs.
PDF Forms - Blank PDF forms, widely used on
the web.
In addition to Adobe tools, we use other
applications that support color to create
high-quality color PDF output files.