Radix DM Crawler

The Radix DM Crawler is a program that can be run from any machine at any time, but is generally set to run as a Windows scheduled task on the machine on which Radix DM was originally installed.  This program verifies that data stored about the documents in Radix DM matches the actual file details.

 

For the Radix DM Crawler to function, the machine on which it runs must have a MAPI component installed.  For most users, Microsoft Outlook provides this functionality.  For servers or other machines which do not have Microsoft Outlook installed, you will need to install Microsoft Exchange Server MAPI Client and Collaboration Data Objects 1.2.1 or equivalent.

 

The Radix DM Crawler can be run from the command line, as a Windows application, or by Radix DM Administrators from the Radix DM Search Administration tab.

 

To start the Radix DM Crawler as a Windows application, simply select Start > Programs > Radix DM > Radix DM Crawler from the Start Menu.

 

The initial configuration screen will resemble the following:


Create Commandline Shortcut: This command creates a new shortcut to the program (with parameters set as defined by the current window) which can be saved.

Validate Library File Paths:  This process works in reverse to a normal Radix DM Crawler operation by scanning the details in Radix DM to identify any files that may be missing.  These results appear in the section Progress.  This function cannot be called from a command line.

Commandline Parameters Information: Displays a list of command line parameters, which are detailed in the section Running as a Scheduled Task.
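To illustrate the idea behind Validate Library File Paths, the following minimal Python sketch checks a list of recorded file paths against the file system and reports the ones that are missing. It is not the Radix DM Crawler's actual implementation: the function name find_missing_files and the idea of passing the recorded paths in as a plain list are assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_files(recorded_paths: Iterable[str]) -> List[str]:
    """Return the recorded paths that do not exist as files on disk.

    Hypothetical helper: the real crawler reads its paths from Radix DM,
    here they are simply passed in by the caller.
    """
    missing = []
    for recorded in recorded_paths:
        if not Path(recorded).is_file():
            missing.append(recorded)
    return missing
```

A caller would pass in the paths held for each document and review the returned list, much as the results are reviewed in the Progress section.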

Using the Program

Crawler Configuration

If Radix DM is configured to run across a WAN then the Radix DM Crawler should be run on a local server for each physical location.  

Process all Locations and Library Groups:  This toggle button switches the process range of Locations and Library Groups from the list of selected values in accompanying drop downs to all possible values.

Process Selected Locations/Library Groups:  This toggle button switches the process range of Locations and Library Groups from all possible values to the list of selected values in the accompanying drop downs.  If Radix DM is configured to run across a WAN then the Radix DM Crawler should be run on a local server for each physical location.  In each of these locations, only the library groups that are local to that particular server should be selected with this option.

Move files to quarantine: If this option is checked, then files that the Radix DM Crawler detects in the document file locations that should not be there are moved to the Quarantine folder.  The default path for this folder is \\SharedNetworkLocation\Programs\RadixDM\Quarantine, as determined by the Radix DM installation. No files are deleted as part of this process.

Locations: This dropdown allows the user to select the Locations which the Radix DM Crawler process will operate on.

Library Groups: This dropdown allows the user to select the Library Groups which the Radix DM Crawler process will operate on.

Process: Click this button to verify that the files and folders in the physical directory store match the data stored in the database, based on the settings selected.  In addition, tasks associated with the three tabs will be performed.  Results will appear in the text box Progress.
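The Move files to quarantine behaviour described above can be pictured with a short Python sketch: files found in a document folder that are not in the set of known file names are moved, never deleted, to a quarantine folder. This is only an illustration under stated assumptions. The function quarantine_unknown_files, its parameters, and the use of file names as the matching key are hypothetical and are not taken from the Radix DM Crawler itself.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def quarantine_unknown_files(store_dir: str, known_names: Iterable[str],
                             quarantine_dir: str) -> List[str]:
    """Move files in store_dir that are not in known_names to quarantine_dir.

    Files are moved, never deleted. Returns the names that were moved.
    """
    # Compare case-insensitively, as Windows file names are case-insensitive.
    known = {name.lower() for name in known_names}
    target = Path(quarantine_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(Path(store_dir).iterdir()):
        if entry.is_file() and entry.name.lower() not in known:
            # Note: a file of the same name already in quarantine is overwritten.
            shutil.move(str(entry), str(target / entry.name))
            moved.append(entry.name)
    return moved
```

Moving rather than deleting keeps the operation reversible: a file quarantined by mistake can be copied back by hand.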

 

Document Titles Tab

If the Set Document Titles check box is checked, the Radix DM Crawler also sets the Title document property of each designated file it processes to the value of the corresponding Radix DM system field Title.  The only extensions supported for this operation are: doc, docx, xls, xlsx, ppt, pptx.  To exclude any of these document types, remove them from the Set Document Extensions text box if you are running the program from Windows, or from the value of the key SetDocumentTitleExtensions in the RadixDMCrawler.exe.config file, which is located in the same folder as the program executable.  Please note that this operation will fail for documents that are password protected or designated as Read Only.
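For administrators who prefer to script the change to RadixDMCrawler.exe.config, the Python sketch below removes one extension from the value stored under a key. It rests on two assumptions that should be checked against the actual file before use: that the key is stored in the standard .NET layout of an add element with key and value attributes, and that the extensions are separated by commas. The function name remove_extension is hypothetical.

```python
import xml.etree.ElementTree as ET


def remove_extension(config_xml: str, key: str, extension: str) -> str:
    """Return config_xml with `extension` removed from the value stored under `key`.

    Assumes <add key="..." value="ext1,ext2,..." /> entries (standard .NET
    appSettings layout) and a comma-separated value.
    """
    root = ET.fromstring(config_xml)
    for node in root.iter("add"):
        if node.get("key") == key:
            kept = [
                ext.strip()
                for ext in node.get("value", "").split(",")
                if ext.strip() and ext.strip().lower() != extension.lower()
            ]
            node.set("value", ",".join(kept))
    return ET.tostring(root, encoding="unicode")
```

The function works on the file's text rather than the file itself, so the result can be inspected before anything is written back. Keep a backup copy of the original config file, since this approach does not preserve XML comments.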

 

Page Count Tab

When the Radix DM Crawler processes the designated files, it also sets the built-in Radix DM system field PageCount to the value of the document's own PageCount property, for each document whose extension is listed in the section Page Count Extensions.  The only extensions supported for this operation are: doc, docx, xls, xlsx, ppt, pptx, pdf.  To exclude any of these document types, remove them from the Page Count Extensions text box if you are running the program from Windows, or from the value of the key PageCountExtensions in the RadixDMCrawler.exe.config file, which is located in the same folder as the program executable.  Please note that this operation will fail for documents that are password protected.

 

OCR Tab (Beta)

If the OCR PDF Files check box is checked, the Radix DM Crawler will attempt to OCR every PDF file it operates on that does not contain fonts.  If the OCR completes successfully, the original file is moved to the Quarantine\OCR folder (\\SharedNetworkLocation\Programs\RadixDM\Quarantine\OCR) and the processed file replaces it.  This allows the original file to be recovered if there are any issues associated with the process.
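The keep-the-original-then-replace step described above can be sketched in Python as follows. This is only an illustration of the pattern, not the crawler's code: the function replace_with_backup and its parameters are hypothetical, and the OCR step itself is assumed to have already produced the processed file.

```python
import shutil
from pathlib import Path


def replace_with_backup(original: str, processed: str, backup_dir: str) -> str:
    """Move `original` into backup_dir, then put `processed` in its place.

    Returns the path of the backed-up original.
    """
    original_path = Path(original)
    backup_root = Path(backup_dir)
    backup_root.mkdir(parents=True, exist_ok=True)
    backup_path = backup_root / original_path.name
    # Keep the original recoverable before anything replaces it.
    shutil.move(str(original_path), str(backup_path))
    # The processed file now takes the original's name and location.
    shutil.move(str(processed), str(original_path))
    return str(backup_path)
```

Because the original is moved before the replacement is put in place, a failed or poor OCR result can be undone by copying the backed-up file back over the processed one.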

The DPI used for the OCR can be changed.  A lower DPI results in smaller files but less accurate OCR.  A higher DPI results in larger files but more accurate OCR.  300 DPI offers a good balance between size and accuracy.

 

This function is still in Beta and some minor issues may still be experienced.

Running as a Scheduled Task

The Radix DM Crawler is designed to be run as a scheduled task.  Schedule this task to occur at regular intervals to ensure that Radix DM search results are accurate, generally at least once per day (in the evenings).
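As one possible way to set up such a daily evening run, the Python sketch below assembles a command for the Windows schtasks utility. The task name and start time used in the example are placeholders chosen for illustration, and the executable path and the /A switch are the ones given later in this section. The function build_schtasks_command is hypothetical and only builds the command; it does not run it.

```python
from typing import List

# Default installation path as given in this article; adjust if installed elsewhere.
CRAWLER_EXE = r"C:\Program Files (x86)\Radix Software\Radix DM Client\RadixDMCrawler.exe"


def build_schtasks_command(task_name: str, start_time: str,
                           crawler_args: str = "/A") -> List[str]:
    """Build a schtasks command that runs the crawler once per day.

    start_time uses the 24-hour HH:mm format expected by schtasks /ST.
    """
    # The executable path contains spaces, so it is quoted inside /TR.
    task_run = f'"{CRAWLER_EXE}" {crawler_args}'
    return [
        "schtasks", "/Create",
        "/TN", task_name,
        "/TR", task_run,
        "/SC", "DAILY",
        "/ST", start_time,
    ]
```

On a Windows machine, the returned list could be passed to subprocess.run, for example subprocess.run(build_schtasks_command("Radix DM Crawler", "20:00"), check=True), from an account that is permitted to create scheduled tasks.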

The scheduled task can be created with the following details, which are also specified on the tab Command Line Parameters:

 

C:\Program Files (x86)\Radix Software\Radix DM Client\RadixDMCrawler.exe
 
 
Command Line Parameters

/A
If this argument is used, the program runs automatically and closes when it is complete.

/LOC="xxx"
A comma-separated list of Location IDs that the Radix DM Crawler will use.  All documents in library groups that belong to these locations will be operated on.

/LIB="xxx"
A comma-separated list of Library Group IDs that the Radix DM Crawler will use.  All documents in these library groups will be operated on.  If this parameter is omitted, all library groups will be used.

/SETTITLE
If this argument is used, then the title property will be set for documents in the selected library groups.

/OCR
If this argument is used, then the text will be OCRed for documents in the selected library groups.  This argument is not required if /OCRDPI is used.

/OCRDPI="xxx"
The DPI of the OCR scanning that will be applied to documents in the selected library groups.

/MQ
If this argument is used, then documents that appear in the Radix DM document folders that do not belong in Radix DM are moved to Quarantine.
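
To show how the parameters above combine, the following Python sketch assembles an argument string from them. The function build_crawler_args and its parameter names are hypothetical, and the IDs in the example are made-up values; only the switches themselves come from the list above.

```python
from typing import Optional, Sequence


def build_crawler_args(locations: Sequence[object] = (),
                       library_groups: Sequence[object] = (),
                       set_title: bool = False,
                       ocr: bool = False,
                       ocr_dpi: Optional[int] = None,
                       move_to_quarantine: bool = False) -> str:
    """Assemble a Radix DM Crawler argument string from the documented switches."""
    args = ["/A"]  # run automatically and close when complete
    if locations:
        args.append('/LOC="{}"'.format(",".join(str(i) for i in locations)))
    if library_groups:
        args.append('/LIB="{}"'.format(",".join(str(i) for i in library_groups)))
    if set_title:
        args.append("/SETTITLE")
    if ocr_dpi is not None:
        # /OCR is not required when /OCRDPI is supplied.
        args.append('/OCRDPI="{}"'.format(ocr_dpi))
    elif ocr:
        args.append("/OCR")
    if move_to_quarantine:
        args.append("/MQ")
    return " ".join(args)


# Example with made-up Location IDs 1 and 2:
# build_crawler_args(locations=[1, 2], set_title=True, ocr_dpi=300, move_to_quarantine=True)
# returns '/A /LOC="1,2" /SETTITLE /OCRDPI="300" /MQ'
```

The sketch always includes /A because a scheduled task has nobody present to close the window when the run finishes.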