Document Content
The Document Content section in the Laserfiche Capture Engine dialog box allows you to define the content you want retrieved from your documents.
For each document:
- Keep each entry as a separate document: Select if you want to keep each entry as its own separate document. When selected, imaged documents will enter Quick Fields organized into documents as they were in the Laserfiche repository. When cleared, all the image pages will enter Quick Fields as a single set, and will only be sorted into documents according to the settings in Quick Fields. This is useful if you have pages that have not already been defined into documents, or if you want to change the document structure.
Note: This selection is one factor in how pages will be grouped into documents by Quick Fields. The others are the First Page Identification conditions, the Last Page Identification conditions, and the Document Length settings. For more information, see How Documents are Identified.
Note: If Keep each entry as a separate document is not selected and you do not have Generate images for each page or Extract the text from each page selected, you will not be able to select Keep PDF after using it to generate Laserfiche pages. If the Laserfiche pages in the repository are going to be separated and made into new documents of different lengths, the PDF will not be kept because it cannot be attached to new arbitrary Laserfiche documents. Retrieved files of different types (DOC, PDF, etc.) can be combined to create new Laserfiche documents, but the electronic files cannot be kept because of the mixed file types.
- Retrieve electronic file: If the entry you are retrieving is an electronic file (e.g., PDF), selecting this option will retrieve that electronic file, giving you the ability to generate pages from PDFs or extract text from Word documents, etc.
Note: If you do not have Retrieve electronic file selected, you will not be able to select any options under If the document is a PDF. There is no need to select any PDF options if you are not retrieving electronic files.
- Retrieve fields: Select this option to retrieve the document fields.
- Retrieve tags: Select this option to retrieve the document tags.
For each page in the document, retrieve:
To specify the page data to retrieve, select any or all of the following:
- Image: Retrieves each page's image.
- Use the image's original orientation: When selected, the image is retrieved in the orientation in which it was originally scanned. When cleared, the orientation you have set in the image display is used.
- Text: Retrieves each page's text.
- Annotations: Retrieves each page's annotations.
If the document is a PDF:
- Generate images for each page: Generate Laserfiche images for each page of the retrieved PDF.
- Convert images to black & white: Convert the generated image pages to black and white.
- Scale images to use DPI: Customize the generated image's dots per inch (DPI).
- Extract the text from each page: Extract text from the retrieved PDF documents.
- Include PDF form field values in the text: If you are extracting text from a PDF form, the values in the form's fields will be included in the extracted text. These values appear at the bottom of the Text Pane.
- Keep PDF after using it to generate Laserfiche pages: The PDF can be used to generate Laserfiche images, extract text, or both. Selecting this option will keep the PDF file after it has been used to create Laserfiche pages.
- If Keep each entry as a separate document is selected above but you do not have Generate images for each page or Extract the text from each page selected, Keep PDF after using it to generate Laserfiche pages will be required and automatically selected. Since the new document being created will not have Laserfiche pages, it will require the PDF be attached as the electronic file component.
- If Keep each entry as a separate document is not selected above, you will not be able to select Keep PDF after using it to generate Laserfiche pages. If the Laserfiche pages in the repository are going to be separated and made into new documents of different lengths, the PDF cannot be kept because it cannot be attached to new arbitrary Laserfiche documents.
Note: Keep PDF after using it to generate Laserfiche pages needs to be selected in order to use Retrieve PDF Form Content.
- Convert the PDF annotations to Laserfiche annotations: Any PDF annotations on the retrieved PDFs will be converted into Laserfiche annotations.
- If the entry already has Laserfiche pages, overwrite them: PDFs retrieved may already have Laserfiche pages associated with them. Select this option if you want the newly processed Laserfiche pages (generated from the PDFs) to overwrite the existing Laserfiche pages in your repository.
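The dependencies between these options can be easier to follow when written out as a small decision function. The following Python sketch is illustrative only: it is not part of Quick Fields or any Laserfiche API, and the names ContentOptions, KeepPdfState, and keep_pdf_state are hypothetical. It assumes the rules as described on this page: the PDF options are unavailable unless Retrieve electronic file is selected, Keep PDF cannot be selected when entries are not kept as separate documents, and Keep PDF is required when entries are kept separate but neither images nor text are generated.

```python
from dataclasses import dataclass
from enum import Enum


class KeepPdfState(Enum):
    """Possible states of the Keep PDF check box (hypothetical model)."""
    UNAVAILABLE = "unavailable"  # cannot be selected
    REQUIRED = "required"        # selected automatically
    OPTIONAL = "optional"        # left to the user


@dataclass(frozen=True)
class ContentOptions:
    """Hypothetical names mirroring the check boxes in the dialog box."""
    keep_each_entry_separate: bool
    retrieve_electronic_file: bool
    generate_images: bool
    extract_text: bool


def keep_pdf_state(opts: ContentOptions) -> KeepPdfState:
    """Return the Keep PDF state implied by the other selections."""
    # No PDF options apply if electronic files are not retrieved.
    if not opts.retrieve_electronic_file:
        return KeepPdfState.UNAVAILABLE
    # Pages regrouped into new documents cannot carry the original PDF.
    if not opts.keep_each_entry_separate:
        return KeepPdfState.UNAVAILABLE
    # Without generated images or text, the new document has no
    # Laserfiche pages, so the PDF must be attached.
    if not (opts.generate_images or opts.extract_text):
        return KeepPdfState.REQUIRED
    return KeepPdfState.OPTIONAL
```

For example, keep_pdf_state(ContentOptions(True, True, False, False)) evaluates to KeepPdfState.REQUIRED, which matches the case where Keep PDF after using it to generate Laserfiche pages is selected automatically.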