Optical character recognition (OCR) | VDRPro

Applies to: All managers and publishers

Question

What is OCR and what file types are supported?

Answer

Optical character recognition (OCR) is the electronic or mechanical conversion of images of printed, typewritten, or handwritten text into machine-encoded text. In Intralinks, OCR makes the text in scanned images searchable.

When documents are scanned for OCR in Intralinks, the system adds the recognized text as metadata to the Intralinks search engine (rather than to the original files). As a result, you can search for document keywords through the Intralinks system. However, if you search for keywords inside the files themselves, either in the Intralinks online viewer or in a downloaded copy, the search returns no results.

When OCR is enabled on an exchange, supported files are scanned as they are uploaded, and their content is generally searchable within 30 minutes of upload. If the OCR setting is enabled after documents have been uploaded, the existing documents are also sent for OCR, but processing may take 24 hours or more to complete.

Intralinks OCR fully supports UTF-8 and, as such, works with all languages.

File types supported for OCR:

PDF
JPEG
GIF
TIFF

Microsoft Office files are not supported for OCR.
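
If you want to check a batch of files before uploading, the supported-type list above can be expressed as a short script. This is only a minimal sketch and is not part of Intralinks: the function names are hypothetical, and the mapping from the listed formats to file extensions (for example, .jpg and .jpeg for JPEG, .tif and .tiff for TIFF) is an assumption, because the article names formats rather than extensions.

```python
from pathlib import Path

# Assumed extensions for the four formats listed as OCR-supported
# (PDF, JPEG, GIF, TIFF). The article lists formats, not extensions,
# so adjust this set if your files use different ones.
OCR_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".gif", ".tif", ".tiff"}


def is_ocr_candidate(filename: str) -> bool:
    """Return True if the file extension matches a listed OCR-supported format."""
    return Path(filename).suffix.lower() in OCR_EXTENSIONS


def split_by_ocr_support(filenames):
    """Split filenames into (supported, unsupported) lists, preserving order."""
    supported, unsupported = [], []
    for name in filenames:
        (supported if is_ocr_candidate(name) else unsupported).append(name)
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = split_by_ocr_support(["scan.PDF", "budget.xlsx", "photo.jpeg"])
    print("Expected to be scanned for OCR:", ok)
    print("Not supported for OCR:", skipped)
```

The check looks only at the file extension, not at the file contents, so it is a quick pre-upload filter rather than a guarantee that OCR will succeed. Microsoft Office files such as .docx or .xlsx fall into the unsupported list, which matches the note above.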

Additional information

Document search functionality
