Operation

The Catalog Ingest Tool (CITool) has three functions:

  • Compare a Catalog Submission

    Comparisons of catalog submissions, either file to file or directory to directory, can be performed with a report detailing the differences.

  • Validate a Catalog Submission

    Validation specific to catalog files can be performed with a report detailing the results. In addition, referential integrity checks are performed within the set of catalog files. Details on the types of checks performed can be found in the Referential Integrity Checking section. If viewing this document in PDF form, see the appendix for details.

  • Ingest a Catalog Submission

    Ingestion into the central catalog database can be performed assuming the user has appropriate permissions.

The following topics can be found in this section:

  • Tool Setup
  • Tool Execution
  • Using a Configuration File
  • Report Formats

Note: The command-line examples in this section have been broken into multiple lines for readability. The commands should be reassembled into a single line prior to execution.

Tool Setup

Before CITool can be executed, the user's environment must be configured appropriately. This section describes how to set up the user environment on UNIX-based and Windows machines.

UNIX-Based Setup

This section details the environment setup for UNIX-based machines. The preferred method is to invoke the shell script, CITool, on the command line. Adding the location of the script to the PATH environment variable enables the shell script to be executed from any location on the user's machine.

The following command (csh/tcsh syntax) demonstrates how to set the PATH environment variable by appending to its current setting:

% setenv PATH ${PATH}:$HOME/citool-1.1.0/bin
        

The tool can now be executed via the shell script as demonstrated in the following example:

% CITool <command-line arguments>
        

Additional methods for setting up a UNIX-based environment can be found in the UNIX Setup Options section. If viewing this document in PDF form, see the appendix for details.

Windows Setup

This section details the environment setup for Windows machines. The preferred method is to invoke the batch file, CITool.bat, on the command line. Adding the location of the file to the PATH environment variable enables the batch file to be executed from any location on the user's machine.

The following command demonstrates how to set the PATH environment variable, by appending to its current setting:

C:\> set PATH=%PATH%;C:\citool-1.1.0\bin
        

The tool can now be executed via the batch file as demonstrated in the following example:

C:\> CITool <command-line arguments>
        

Additional methods for setting up a Windows environment can be found in the Windows Setup Options section. If viewing this document in PDF form, see the appendix for details.

Tool Execution

CITool can be executed in various ways. This section describes how to run the tool, as well as its behaviors and caveats.

Command-Line Options

CITool can be run in three modes: compare, validate and ingest. The following tables describe the command-line options available when each mode is enabled.

Compare Mode

Setting the -m option to compare runs the tool in compare mode. In compare mode, the following options are valid:

Command-Line Option                 Description
-m, --mode compare                  Specifying compare runs the tool in compare mode.
-t, --target <catalogs,URLs,dirs>   Explicitly specify two targets (catalog files, directories, or URLs) to compare. Targets can also be specified implicitly (example: CITool OLDDATASET.CAT, NEWDATASET.CAT).
-r, --report-file <file>            Specify the report file name. Default is standard out.
-L, --local                         Do not perform directory recursion on a target directory.
-v, --verbose <1|2|3>               Specify the severity level and above to include in the report: (1=Info, 2=Warning, 3=Error). Default is warning and above (level 2).
-c, --config <file>                 Specify a configuration file to set the default values.
-h, --help                          Display CITool usage.
-V, --version                       Display CITool version.

Validate Mode

Setting the -m option to validate runs the tool in validate mode. In validate mode, the following options are valid:

Command-Line Option                 Description
-m, --mode validate                 Specifying validate runs the tool in validate mode.
-d, --dict <.full file(s)>          Specify the Planetary Science Data Dictionary full file name and any local dictionaries.
-I, --include <path(s)>             Specify paths to search for files referenced by pointer statements in a catalog file. Separate each path with a comma. Default is to always look in the directory of the catalog file, then search the specified directory paths.
-a, --alias                         Enable aliasing. Allows the tool to properly handle object and element names defined as aliases in the Planetary Science Data Dictionary.
-A, --allrefs <allrefs file>        Specify the allrefs dictionary support file or URL.
-r, --report-file <file>            Specify the report file name. Default is standard out.
-L, --local                         Do not perform directory recursion on a target directory.
-t, --target <catalogs,URLs,dirs>   Explicitly specify the target (catalog file, directory, or URL). The target can also be specified implicitly (example: CITool DATASET.CAT).
-v, --verbose <1|2|3>               Specify the severity level and above to include in the report: (1=Info, 2=Warning, 3=Error). Default is warning and above (level 2).
-c, --config <file>                 Specify a configuration file to set the default values.
-h, --help                          Display CITool usage.
-V, --version                       Display CITool version.

Ingest Mode

Setting the -m option to ingest runs the tool in ingest mode. In ingest mode, the following options are valid:

Command-Line Option                 Description
-m, --mode ingest                   Specifying ingest runs the tool in ingest mode.
-t, --target <catalogs,URLs,dirs>   Explicitly specify the targets (catalog files, directories, or URLs) to ingest. Targets can also be specified implicitly (example: CITool -m ingest TEST.CAT).
-u, --dbuser <username>             Username for the database.
-p, --dbpass <password>             Password for the database.
-s, --dbserver <servername>         Name of the database server.
-n, --dbname <db name>              Name of the database.
-r, --report-file <file>            Specify the report file name. Default is standard out.
-v, --verbose <1|2|3>               Specify the severity level and above to include in the report: (1=Info, 2=Warning, 3=Error). Default is warning and above (level 2).
-L, --local                         Do not perform directory recursion on a target directory.
-c, --config <file>                 Specify a configuration file to set the default values.
-h, --help                          Display CITool usage.
-V, --version                       Display CITool version.
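
The -v option is described the same way in all three modes: the report includes messages at the given severity level and above. The threshold rule can be pictured with a minimal Java sketch. The class, enum, and method names are hypothetical and are not taken from the CITool source; only the level numbers and the default of 2 come from the option tables.

```java
// Illustrative sketch only; names are hypothetical, not CITool internals.
class VerbosityDemo {

    // Severity levels as numbered in the option tables.
    enum Severity {
        INFO(1), WARNING(2), ERROR(3);

        final int level;

        Severity(int level) {
            this.level = level;
        }
    }

    // Documented default: warnings and above.
    static final int DEFAULT_VERBOSE = 2;

    // A message is reported when its level is at or above the -v setting.
    static boolean isReported(Severity severity, int verbose) {
        return severity.level >= verbose;
    }

    public static void main(String[] args) {
        for (Severity s : Severity.values()) {
            System.out.println(s + " reported at default level: "
                    + isReported(s, DEFAULT_VERBOSE));
        }
    }
}
```

Under this reading, -v 1 reports everything, while -v 3 limits the report to errors.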

Execute CITool

This section demonstrates execution of the tool using the command-line options. The examples below execute the tool via the batch/shell script. Alternate methods for executing the tool can be found in the Tool Setup section.

Compare Mode

In compare mode, the tool can be executed as follows:

  • Comparing Two Catalog Files

    The following command demonstrates how to compare a source catalog file $HOME/DIR1/DATASET.CAT with a target catalog file $HOME/DIR2/DATASET.CAT:

    % CITool $HOME/DIR1/DATASET.CAT, $HOME/DIR2/DATASET.CAT -m compare
                
  • Comparing Two Catalog Directories

    The following command demonstrates how to compare a source directory, $HOME/DIR1, containing a set of catalog files with a target directory, $HOME/DIR2, containing another set of catalog files:

    % CITool $HOME/DIR1, $HOME/DIR2 -m compare
                

    In this example, the tool pairs files by name between the source and target directories before comparing them. For instance, a DATASET.CAT file found in $HOME/DIR1 is compared against a DATASET.CAT file in $HOME/DIR2, so a file with the same name must be present there.

  • Writing a Compare Report to File

    In the first two examples above, the output report is written to standard out. The following command demonstrates how to write the report to a file named report.txt:

    % CITool $HOME/DIR1/DATASET.CAT, $HOME/DIR2/DATASET.CAT -m compare -r report.txt
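
The name matching described for directory comparison can be sketched in Java as follows. This is an illustration of the idea only, not CITool code: the class and method names are hypothetical, and the sketch looks only at the top level of each directory (roughly the behavior with -L), since this section does not say how names are matched during recursion.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Illustrative sketch only; names are hypothetical, not CITool internals.
class DirectoryPairingDemo {

    // Maps each file name found in sourceDir to the same-named file in
    // targetDir, or to null when the target directory has no such file.
    static Map<String, Path> pairByName(Path sourceDir, Path targetDir) {
        Map<String, Path> pairs = new TreeMap<>();
        try (Stream<Path> files = Files.list(sourceDir)) {
            files.filter(Files::isRegularFile).forEach(source -> {
                String name = source.getFileName().toString();
                Path candidate = targetDir.resolve(name);
                pairs.put(name, Files.isRegularFile(candidate) ? candidate : null);
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Usage: java DirectoryPairingDemo.java <sourceDir> <targetDir>
        pairByName(Path.of(args[0]), Path.of(args[1])).forEach((name, target) ->
                System.out.println(name + " -> " + (target == null ? "no match" : target)));
    }
}
```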
                

Validate Mode

In validate mode, the tool can be executed as follows:

  • Validate a Single Catalog File

    The following command demonstrates how to validate a single catalog file DATASET.CAT against a data dictionary pdsdd.full:

    % CITool DATASET.CAT -m validate -d pdsdd.full
                
  • Validate a Directory of Catalog Files

    The following command demonstrates how to validate a directory $HOME/DIR containing a set of catalog files against a data dictionary pdsdd.full:

    % CITool $HOME/DIR -m validate -d pdsdd.full
                
  • Checking for Referenced Files in Different Locations

    If a catalog file contains a pointer statement that references a file, the tool first looks for that file in the same location as the catalog file. If it cannot be found there, the tool looks for the referenced file in the paths specified by the include (-I) option.

    The following command demonstrates the validation of a catalog file that contains pointer statements to files located in a directory called CATALOG:

    % CITool VOLDESC.CAT -m validate -d pdsdd.full -I $HOME/CATALOG
                
  • Perform Referential Integrity Check with the Allrefs File

    The allrefs dictionary support file ensures that the reference citations in the REFS.CAT file are consistent with what is currently in the PDS database.

    The following command demonstrates how to specify an allrefs dictionary support file, allrefs.out, to validate a set of catalog files in a directory $HOME/DIR and perform a complete referential integrity check on their references:

    % CITool $HOME/DIR -m validate -d pdsdd.full -A allrefs.out        
                
  • Writing the Validation Report to File

    The following command demonstrates how to write a validation report to a file named report.txt:

    % CITool $HOME/DIR -m validate -d pdsdd.full -A allrefs.out -r report.txt           
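
The search order described above for files referenced by pointer statements (the catalog file's own directory first, then each include path) can be sketched in Java as follows. The class and method names are hypothetical, and the sketch assumes the include paths are tried in the order given, which this section does not state explicitly.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

// Illustrative sketch only; names are hypothetical, not CITool internals.
class PointerLookupDemo {

    // Looks for referencedName next to the catalog file first, then in each
    // include directory. Returns empty when the file is found nowhere.
    static Optional<Path> locate(Path catalogFile, String referencedName,
                                 List<Path> includeDirs) {
        Path catalogDir = catalogFile.toAbsolutePath().getParent();
        Path local = catalogDir.resolve(referencedName);
        if (Files.isRegularFile(local)) {
            return Optional.of(local);
        }
        for (Path dir : includeDirs) {
            Path candidate = dir.resolve(referencedName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

A copy of the referenced file that sits beside the catalog file therefore wins over a copy in an include directory.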
                

Ingest Mode

In ingest mode, the tool can be executed as follows:

  • Ingest a Single Catalog File

    The following command demonstrates how to ingest a single catalog file DATASET.CAT, as user tempuser with password temppass, into the database tempdb on the database server starsyb:

    % CITool DATASET.CAT -m ingest -u tempuser -p temppass -s starsyb -n tempdb
    		        
  • Ingest a Directory of Catalog Files

    The following command demonstrates how to ingest a directory $HOME/DIR containing a set of catalog files, as user tempuser with password temppass, into the database tempdb on the database server starsyb:

    % CITool $HOME/DIR -m ingest -u tempuser -p temppass -s starsyb -n tempdb
                

Changing Tool Behaviors Using a Configuration File

A configuration file can be passed to the tool to change its default behaviors. This makes it possible to run the tool with a single option. For details on how to set up the configuration file, see the Using a Configuration File section.

The following command demonstrates how to run the tool using a configuration file:

% CITool -c config.cfg
        

Using a Configuration File

A configuration file is used to set the default behaviors of the tool. It consists of a text file made up of keyword/value pairs. The configuration file follows the syntax of the stream parsed by the Java Properties.load(java.io.InputStream) method. The following rules apply to the content of configuration files:

  • Blank lines and lines which begin with the hash character "#" are ignored.
  • A value may be continued on the next line by placing a backslash at the end of the line to be continued.
  • Escape sequences for special characters such as a line feed, a tab, or a Unicode character are allowed in values and are written in the same notation as in Java strings (e.g. \n, \t, \r).
  • Since backslashes (\) have a special meaning in a configuration file, keyword values that contain this character will not be interpreted properly by CITool, even if they are surrounded by quotes. A common example is a Windows path name (e.g. c:\VTT_EN_1-1\target). Use the forward slash character instead (c:/VTT_EN_1-1/target) or escape each backslash character (c:\\VTT_EN_1-1\\target).
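
The rules above can be observed directly with java.util.Properties, the class whose load method defines the syntax. The following Java sketch is illustrative only and is not CITool code: the class name and the keys raw.path, slash.path, and escaped.path are made up for the demonstration and are not CITool keywords.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Illustrative sketch only; the keys used here are not CITool keywords.
class ConfigSyntaxDemo {

    // Parses configuration text with Properties.load(InputStream).
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(
                    text.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Each "\\" in these Java literals is ONE backslash in the file text.
        String text = String.join("\n",
                "# comment lines and blank lines are ignored",
                "",
                "raw.path     = c:\\VTT_EN_1-1\\target",
                "slash.path   = c:/VTT_EN_1-1/target",
                "escaped.path = c:\\\\VTT_EN_1-1\\\\target");
        Properties props = parse(text);
        System.out.println("raw.path still has a backslash: "
                + props.getProperty("raw.path").contains("\\"));
        System.out.println("slash.path   = " + props.getProperty("slash.path"));
        System.out.println("escaped.path = " + props.getProperty("escaped.path"));
    }
}
```

With the unescaped Windows path, the loader drops the backslash before V and turns \t into a tab character, so the value no longer names the intended directory. The forward-slash and doubled-backslash forms both survive parsing intact.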

Note: Any option specified on the command-line takes precedence over any equivalent settings placed in the configuration file.

The following table contains valid keywords that can be specified in the configuration file when running in compare mode:

Property Keyword      Associated Option   Valid Value(s)
citool.mode           -m                  Specify compare to run the tool in compare mode.
citool.targets        -t                  Specify two targets (catalog files, directories, and/or URLs) to compare.
citool.report         -r                  Specify the report file name. Do not specify this property key if writing to standard out.
citool.recurse        -L                  Set to 'false' to disable directory recursion on a target directory. Set to 'true' otherwise or do not specify this property key.
citool.verbose        -v                  Specify the severity level and above to include in the report (1=info, 2=warning, 3=error). Default is warnings and above (level 2).

The following table contains valid keywords that can be specified in the configuration file when running in validate mode:

Property Keyword      Associated Option   Valid Value(s)
citool.mode           -m                  Specify validate to run the tool in validate mode.
citool.targets        -t                  Specify a target (catalog file, directory, or URL) to validate.
citool.dictionaries   -d                  Specify the Planetary Science Data Dictionary full file name and any local dictionaries.
citool.includepaths   -I                  Specify paths to search for files referenced by pointer statements in a catalog file.
citool.recurse        -L                  Set to 'false' to disable directory recursion on a target directory. Set to 'true' otherwise or do not specify this property key.
citool.alias          -a                  Set to 'true' to enable aliasing.
citool.allrefs        -A                  Specify the 'allrefs' dictionary file or URL.
citool.report         -r                  Specify the report file name. Do not specify this property key if writing to standard out.
citool.verbose        -v                  Specify the severity level and above to include in the report (1=info, 2=warning, 3=error). Default is warnings and above (level 2).

The following table contains valid keywords that can be specified in the configuration file when running in ingest mode:

Property Keyword      Associated Option   Valid Value(s)
citool.mode           -m                  Specify ingest to run the tool in ingest mode.
citool.targets        -t                  Specify targets (catalog files, directories, and/or URLs) to ingest.
citool.dbuser         -u                  Specify the username for the database.
citool.dbpass         -p                  Specify the password for the database.
citool.dbserver       -s                  Specify the name of the database server.
citool.dbname         -n                  Specify the name of the database.
citool.report         -r                  Specify the report file name. Do not specify this property key if writing to standard out.
citool.recurse        -L                  Set to 'false' to disable directory recursion on a target directory. Set to 'true' otherwise or do not specify this property key.
citool.verbose        -v                  Specify the severity level and above to include in the report (1=info, 2=warning, 3=error). Default is warnings and above (level 2).

The following example demonstrates how to set up a configuration file:

# This is a CITool configuration file          

citool.mode         = validate
citool.targets      = ./TEST_DIR
citool.report       = report.txt
citool.dictionaries = pdsdd.full
        

This is equivalent to running the tool with the following options:

-t ./TEST_DIR -m validate -r report.txt -d pdsdd.full
        

The following example demonstrates how to set up a configuration file with multiple values for a property key:

# This is a CITool configuration file with multiple values

citool.mode         = validate
citool.targets      = DIR
citool.dictionaries = pdsdd.full, localdd.full


This is equivalent to running the tool with the following options:

-t DIR -m validate -d pdsdd.full, localdd.full
        

The following example demonstrates how to set a configuration file with multiple values that span across multiple lines:

# This is a CITool configuration file with multiple values

citool.mode         = validate
citool.targets      = DIR
citool.dictionaries = pdsdd.full, \
                      localdd.full
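
A short Java sketch can confirm that the continued form above yields the same value as the single-line form. Properties.load joins a line ending in a backslash with the next line and discards the leading white space of the continuation. The splitting on commas is an assumption about how a multi-valued key might be broken apart; the class and method names are hypothetical and not taken from CITool.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Illustrative sketch only; names are hypothetical, not CITool internals.
class MultiValueDemo {

    // Loads the configuration text and splits the value of the given key
    // on commas, trimming the white space around each entry.
    static List<String> values(String configText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(configText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> result = new ArrayList<>();
        for (String part : props.getProperty(key, "").split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "\\" in a Java literal is one backslash in the file text.
        String continued = "citool.dictionaries = pdsdd.full, \\\n"
                + "                      localdd.full\n";
        String singleLine = "citool.dictionaries = pdsdd.full, localdd.full\n";
        System.out.println(values(continued, "citool.dictionaries")
                .equals(values(singleLine, "citool.dictionaries")));
    }
}
```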
        

The following example demonstrates how to override a setting in the configuration file.

Suppose the configuration file config.cfg is defined as follows:

# This is a CITool configuration file          

citool.mode         = validate
citool.targets      = ./TEST_DIR
citool.dictionaries = pdsdd.full
        

To use a different dictionary, such as mypdsdd.full, specify it on the command line along with the configuration file:

% CITool -c config.cfg -d mypdsdd.full
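
The precedence rule behind this override can be pictured with java.util.Properties, which supports a second Properties object as a set of defaults. The sketch below is an assumption about how such layering could be done and does not describe how CITool is implemented; the class and method names are hypothetical. The mapping of -d to citool.dictionaries comes from the keyword tables.

```java
import java.util.Properties;

// Illustrative sketch only; names are hypothetical, not CITool internals.
class PrecedenceDemo {

    // Returns settings in which command-line values win and configuration
    // file values are used only where the command line is silent.
    static Properties merge(Properties fromFile, Properties fromCommandLine) {
        Properties effective = new Properties(fromFile); // file values as defaults
        for (String name : fromCommandLine.stringPropertyNames()) {
            effective.setProperty(name, fromCommandLine.getProperty(name));
        }
        return effective;
    }

    public static void main(String[] args) {
        Properties file = new Properties();
        file.setProperty("citool.mode", "validate");
        file.setProperty("citool.targets", "./TEST_DIR");
        file.setProperty("citool.dictionaries", "pdsdd.full");

        Properties commandLine = new Properties();
        commandLine.setProperty("citool.dictionaries", "mypdsdd.full"); // -d mypdsdd.full

        Properties effective = merge(file, commandLine);
        for (String name : effective.stringPropertyNames()) {
            System.out.println(name + " = " + effective.getProperty(name));
        }
    }
}
```

In this picture the mode and target still come from config.cfg, and only the dictionary is replaced.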
        

Report Formats

This section describes the contents of the CITool report formats. The links below detail the results. If viewing this document in PDF form, see the appendix for the actual examples.

The tool has a different report format depending on the CITool mode.

Compare Report

In a compare report, the location, severity, and textual representation of the differences between a source and target catalog file are shown. The differences being reported are about the target catalog file with respect to the source. A 'SAME' or 'DIFFERENT' keyword is displayed next to each target to indicate when a target file is identical or different, respectively, against its source.

Validate Report

In a validate report, the location, severity, and textual description of each detected anomaly are reported. A 'PASS', 'FAIL', or 'SKIP' keyword is displayed next to each file to indicate whether the file passed, failed, or was skipped during PDS validation, respectively. In addition, anomalies against the referential integrity of the catalog files are reported.

Ingest Report

In an ingest report, the location, severity, and textual description of each detected anomaly are reported. Completion or failure of each catalog file ingestion is reported. Detailed information is displayed when a catalog ingestion has failed.