BPO Services

       
       
       
       

Document Conversion

We have developed an in-house application that helps each operator produce higher output. These conversion projects require OCR of TIFF images to 99.995% text accuracy and tagging to SGML, HTML, or XML standards.
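To give a sense of what a 99.995% accuracy target means in practice, here is a minimal sketch. It assumes accuracy is measured per character (correct characters divided by total characters); the text does not say how the figure is measured, and the function names are illustrative only.

```python
from fractions import Fraction

# Target quoted in the text, in percent. Fraction avoids float rounding.
TARGET = Fraction("99.995")


def accuracy_percent(total_chars: int, error_chars: int) -> Fraction:
    """Character-level accuracy in percent (assumed definition)."""
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return Fraction(total_chars - error_chars, total_chars) * 100


def meets_target(total_chars: int, error_chars: int, target: Fraction = TARGET) -> bool:
    """True if the measured accuracy reaches the target."""
    return accuracy_percent(total_chars, error_chars) >= target


def max_errors(total_chars: int, target: Fraction = TARGET) -> int:
    """Largest error count that still meets the target (rounded down)."""
    return int(total_chars * (100 - target) / 100)
```

Under this assumption, `max_errors(100_000)` works out to 5: a text of 100,000 characters may contain at most five wrong characters and still meet the target.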

Conversion to Tagged Text Output

SGML: Standard Generalized Markup Language. A markup language used to define the structure of documents in electronic form and to manage them. SGML is widely used to manage highly interrelated documents, large documents, and high-value documents that are subject to frequent revision.

XML: Extensible Markup Language. A general-purpose markup language for structured data on the web. It allows designers to create customized tags so that structured data (such as spreadsheets and address books) can be defined, transmitted, validated, and interpreted between web applications and between organizations. XML provides key features needed for a new generation of web applications. It is highly extensible: users can define new tags as needed. Its structure is hierarchical, so data can be modeled to any level of complexity. It can be easily validated for structural correctness. It is also media independent: the same content can be published in multiple media.
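As an illustration of custom tags, hierarchical structure, and a structural check, here is a minimal sketch using Python's standard library. The tag names (`addressBook`, `contact`, `name`, `city`) are made up for this example, and the check only tests well-formedness; validating against a DTD or schema would need an additional tool.

```python
import xml.etree.ElementTree as ET


def build_address_book(entries):
    """Build a small address book with custom (hypothetical) tags."""
    root = ET.Element("addressBook")
    for name, city in entries:
        # Each contact nests its own child elements: a simple hierarchy.
        contact = ET.SubElement(root, "contact")
        ET.SubElement(contact, "name").text = name
        ET.SubElement(contact, "city").text = city
    return ET.tostring(root, encoding="unicode")


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    xml_text = build_address_book([("A. Rao", "Pune")])
    print(xml_text)
    print(is_well_formed(xml_text))
    print(is_well_formed("<a><b></a>"))  # mismatched tags
```

The same serialized string can be passed to another application or organization, which can parse it back into the same tree.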

HTML: Hypertext Markup Language. HTML is the markup language of the World Wide Web. The rules of HTML define how text, images, links, and so on are presented in a web browser. For documents with few or no graphics, where precise layout is not a strict requirement, HTML is a good choice. Unlike plain text, HTML allows web pages and documents to be linked to one another, giving the user access to a wider range of information on the net.

Content capture services include:

Quark™, PageMaker™ and MS Word™ to PDF and XML.

Complete tagging for XML, SGML & HTML.

Up to 99.995% text accuracy from scanned documents.

Most database formats supported.

Complete validation, data processing and fully customizable services.

Acrobat PDF Conversion

PDF conversion is one of our core areas of expertise. The most common PDF outputs are listed below:

PDF Normal (Fully text searchable with image & graphics)

PDF Image + Text (OCR'd text embedded over the original image). The output is fully text searchable, though accuracy depends on the quality of the input image. Some clients instead prefer corrected text of 99.995% accuracy embedded over the image.

Full color PDF (an exact replica of the original printed book). Ebook publishers prefer this output.

PDF Forms (blank PDF forms, widely used on the web).

In addition to Adobe tools, we also use other applications with color support to create high-quality color PDF output files.

 
 © Copyright Samdisha Software Research & Development  Center Pvt. Ltd.-2004