Planetary Data System
Key Word Definition Tool (DDICT) User’s Guide (UG)
Draft Release Version 1
Prepared by:
Michael Cayanan/Dale Schultz
Document Custodian: Valerie Henderson
Approved by:
________________________________________________
Dan Crichton
PDS Project Manager
Jet Propulsion Laboratory
CHANGE LOG
Revision | Date | Description | Author
Start Draft | April, 2004 | Initial draft | D. Schultz
Release 1 | | Draft Release 1 – Edited all | V. Henderson
| August 2005 | Update Document | M. Cayanan
Table of Contents
2.5.1 Extracting Keyword Definitions From One File
2.5.2 Using DDICT to Verify a Group of Files
4.2 -d Enable Data Dictionary index file
4.3 -df Enable Data Dictionary full file
4.5 -ivf <file name> Specifies name of file containing a list of file extensions to skip
4.7 -r Specify Report File Name
4.8 -di Do not search subdirectories for label files to be processed.
4.9 -ivd <file name> Specifies a file containing directories to skip.
4.10 -lef <file name> Create a Log File of all Skipped Files
4.11 -sf <file name> Specify Keywords to Skip
4.12 -nol3d Do Not Check For Level 3 Labels in Input Files
Project name: Planetary Data System (PDS)
Program set name: PDS Key Word Definition Tool
Abbreviation: ddict
Version number: 4.5
Platforms supported: Windows, Solaris, Linux
This document provides information on the use of the PDS DDICT software set. This software extracts the data dictionary definition for every keyword used in a specified PDS label file, a specified list of PDS label files, all of the labels in a directory, or all of the files on an entire volume. DDICT also lists those keywords that are not in the data dictionary.
This document provides:
This document is intended for those who are responsible for ensuring that labels submitted to the PDS conform to PDS standards. This document assumes familiarity with the PDS data preparation process and label design requirements. The data preparation process is described in the PDS Data Preparation Workbook (JPL Document D-7669, Part 1); label file requirements are described in full in Chapter 5 of the PDS Standards Reference (http://pds.jpl.nasa.gov/documents/sr/Chapter05.pdf). See Section 1.6.
PDS labels are written in the Object Description Language (ODL). ODL consists of a series of lines of the form "keyword = value", with certain keywords (for example, OBJECT) being used to delimit named groups of keywords within the label. For a description of the PDS implementation of ODL, see Chapter 12 of the PDS Standards Reference (http://pds.jpl.nasa.gov/documents/sr/Chapter12.pdf).
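As a rough illustration of the "keyword = value" form, the following Python sketch collects keyword/value pairs from label text. It is a simplification made for this guide, not the parser used by DDICT: it assumes one statement per line and skips comment lines, END_OBJECT, and END. The function name read_keywords is hypothetical.

```python
import re

# Simplified pattern: an optional "^" (pointer), a keyword, "=", then a value.
STATEMENT = re.compile(r"^\s*(\^?[A-Za-z][A-Za-z0-9_:]*)\s*=\s*(.+?)\s*$")


def read_keywords(label_text):
    """Return (keyword, value) pairs found in simple ODL label text."""
    pairs = []
    for line in label_text.splitlines():
        stripped = line.strip()
        # Skip blank lines, /* ... */ comment lines, and the END statement.
        if not stripped or stripped.startswith("/*") or stripped == "END":
            continue
        match = STATEMENT.match(line)
        if match:  # lines without "=" (e.g. END_OBJECT) are ignored
            pairs.append((match.group(1), match.group(2)))
    return pairs


sample = """\
/* File Format and Length */
RECORD_TYPE = FIXED_LENGTH
RECORD_BYTES = 1000
OBJECT = IMAGE
LINES = 800
END_OBJECT
END
"""
pairs = read_keywords(sample)
for keyword, value in pairs:
    print(keyword, "->", value)
```

Multi-line values and nested objects are deliberately out of scope here; Chapter 12 of the PDS Standards Reference describes the full syntax.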
The keywords and objects are themselves defined in the Planetary Science Data Dictionary (PSDD). The Dictionary exists as a Word file for human readers and as a text file for use by the software, and is available via an online lookup tool at http://pds.jpl.nasa.gov/tools/data_dictionary_lookup.cfm.
· Planetary Science Data Dictionary Document,
· Planetary Data System Data Preparation Workbook, JPL D-7669, Part 1. (http://pds.jpl.nasa.gov/documents/dpw/index.html)
The PDS DDICT software set consists of the following files:
Windows
ddict.exe - executable
pdsdd.full - Data Dictionary full file
make_index.exe - utility that creates pdsdd.idx
Solaris and Linux
ddict - executable
pdsdd.full - Data Dictionary full file
make_index - utility that creates pdsdd.idx
PDSDD.FULL and MAKE_INDEX are included in the ddict distribution package. PDSDD.FULL changes often and should be downloaded frequently from http://pds.jpl.nasa.gov/tools/software_download.cfm, where all of the central node validation software tools are also available. Each time PDSDD.FULL is downloaded, run MAKE_INDEX again to create a new index.
To download a tool, go to that URL and select “for data producers”. When the new page opens, scroll down the Software Id/Version ID window until the desired software tool is highlighted. Click once on the selected tool and select the submit button. When the new information is displayed, click once on the appropriate download.
To obtain PDSDD.FULL, scroll through the Software Id/Version ID window to pdsdd(CAT1R40):PDS Toolbox Data Dictionary Version ***** (the version number changes from time to time and is not shown here), then select MULTIPLE from the download list.
To obtain MAKE_INDEX, scroll through the Software Id/Version ID window to make_index, select it, select the submit button, and then click once on the appropriate download.
Download the zip or tar file for ddict, make_index, and the data dictionary file to the hard drive. DDICT is a command line program and must be executed from a DOS window or from a Unix command line prompt.
Use make_index to make pdsdd.idx by typing:
make_index pdsdd.full
The PDS DDICT program is a command line tool and must be run from a command prompt in Windows or from a terminal/console in Unix.
To start DDICT from the same directory in which ddict and default Data Dictionary files are located, enter the command:
ddict -d pdsdd.idx -f <input-file> -r <output-file>
where <input-file> is the name of the label file to process, <output-file> is the name of the report file, and pdsdd.idx is the data dictionary index that was created by running make_index. If DDICT cannot locate the Data Dictionary files it needs, it will generate a message.
DDICT may also be started from a directory other than the one in which DDICT and the Data Dictionary files are located. For example, on a Unix system, if DDICT and the dictionary files are located in directory /usr/local/pds, enter the command:
/usr/local/pds/ddict -f <input-file> -r <report-file> -d /usr/local/pds/pdsdd.idx
The -d option tells DDICT the path (if different than the current directory) and name of the PDS Data Dictionary index file.
If using the Data Dictionary full file, use the -df option instead of -d:
/usr/local/pds/ddict -f <input-file> -r <report-file> -df /usr/local/pds/pdsdd.full
The -df option tells DDICT the path and name of the PDS Data Dictionary full file. The make_index program (version 1.7 or higher) must also be located in the same directory as DDICT, because DDICT uses it to create the index file internally when given the full file as input.
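When DDICT is run from a script, the command line can be assembled programmatically. The Python sketch below only builds the argument list from the options described above (-d, -df, -f, -r). The helper name ddict_command and the rule of choosing -df for a file name ending in ".full" are assumptions made for this illustration, not DDICT behavior.

```python
def ddict_command(ddict_path, dictionary, label, report):
    """Return the argument list for one DDICT run.

    Assumption (ours, not DDICT's): a dictionary file whose name ends in
    ".full" is passed with -df, and any other file is passed with -d.
    """
    flag = "-df" if dictionary.lower().endswith(".full") else "-d"
    return [ddict_path, flag, dictionary, "-f", label, "-r", report]


args = ddict_command(
    "/usr/local/pds/ddict",
    "/usr/local/pds/pdsdd.idx",
    "example1.lbl",
    "myreport.rpt",
)
print(" ".join(args))
```

On a system where DDICT is installed, a list like this could be handed to subprocess.run(args, check=True); the sketch stops short of that so it can be run anywhere.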
To get a brief description of the command line parameters, at the command prompt type:
ddict
The parameters are also explained in detail in the COMMAND REFERENCE section of this document.
For further help or information, contact the PDS Operator at pds_operator@jpl.nasa.gov
The following examples will help in using DDICT immediately.
While reviewing the following examples, refer to the “Command Reference” chapter, which explains the command line options available for DDICT, and the “DDICT Messages” chapter, which explains the error and warning messages generated by DDICT.
The following examples assume that both DDICT and the Data Dictionary reside in the current directory.
Note: In all cases below, you can substitute the full file for the index file by using the -df argument instead of -d:
ddict -df pdsdd.full ...
Remember to have make_index version 1.7 or higher residing in the same directory as DDICT.
Assume that the following label file, EXAMPLE1, is to be processed:
/* File Format and Length */
RECORD_TYPE = FIXED_LENGTH
RECORD_BYTES = 1000
FILE_RECORDS = 806
/* Pointers to Data Objects */
^IMAGE_HEADER = ("EXAMPLE.IMG",1)
/* Description/Catalog Keywords */
SPACECRAFT_NAME = "GALILEO ORBITER"
INSTRUMENT_NAME = SOLID_STATE_IMAGING
/* Image Object */
OBJECT = IMAGE
LINES = 800
LINE_SAMPLES = 800
SAMPLE_BITS = 8
SAMPLE_TYPE = UNSIGNED_INTEGER
INVALID = "N/A"
END_OBJECT
END
Enter the command:
ddict -d pdsdd.idx -f example1.lbl -r myreport.rpt
This tells ddict: “Extract all keywords from ‘example1.lbl’, look up the keyword definitions from pdsdd.idx, and write the report to file ‘myreport.rpt’.” After this command completes, the “myreport.rpt” file will look as follows:
_____________________________________________________________
************************************************************************
* *
* Planetary Data System Data Dictionary *
* Extractor Version 4.2 *
* *
* Sun May 02 *
* *
* Data Dictionary Name: pdsdd.idx *
* Data Dictionary Version: OPS *
* Data Dictionary Generated: *
* *
* Data Dictionary Validation is ON *
* *
************************************************************************
------------------------------------------------------------------------
***** The following keywords are not present *****
***** in the Data Dictionary *****
------------------------------------------------------------------------
WARNING: Not in Data Dictionary -- INVALID
------------------------------------------------------------------------
***** The following set of keywords and definitions *****
***** were extracted from the Data Dictionary *****
------------------------------------------------------------------------
FILE_RECORDS
The file_records element indicates the number of physical
file records, including both label records and data
records.
Note: In the PDS the use of file_records along
with other file-related data elements is fully described in
the Standards Reference.
INSTRUMENT_NAME
The instrument_name element provides the full name of an
instrument.
Note: that the associated instrument_id element
provides an abbreviated name or acronym for the instrument.
Example values: FLUXGATE MAGNETOMETER, NEAR_INFRARED
MAPPING SPECTROMETER.
LINES
The lines element indicates the total number of data
instances along the vertical axis of an image.
Note: In PDS label convention, the number of lines is
stored in a 32-bit integer field. The minimum value of 0
indicates no data received.
LINE_SAMPLES
The line_samples element indicates the total number of data
instances along the horizontal axis of an image.
RECORD_BYTES
The record_bytes element indicates the number of bytes in a physical
file record, including record terminators and separators. When
RECORD_BYTES describes a file with RECORD_TYPE = STREAM
(e.g. a SPREADSHEET), its value is set to the length of the longest
record in the file.
Note: In the PDS, the use of record_bytes, along with other
file-related data elements is fully described in the Standards Reference.
RECORD_TYPE
The record_type element indicates the record format of a
file.
Note: In the PDS, when record_type is used in a
detached label file it always describes its corresponding
detached data file, not the label file itself.
The use of record_type along with other file-related data
elements is fully described in the PDS Standards Reference.
SAMPLE_BITS
The sample_bits element indicates the stored number of
bits, or units of binary information, contained in a
line_sample value.
SAMPLE_TYPE
The sample_type element indicates the data storage
representation of sample value.
SPACECRAFT_NAME
The spacecraft_name element provides the full, unabbreviated
name of a spacecraft. See also: spacecraft_id,
instrument_host_id.
************************************************************************
* *
* This is the end of the Data Dictionary Extractor Report *
* *
* Sun May 02 *
* *
************************************************************************
Report File “myreport.rpt”
_____________________________________________________________
The report file is divided into two sections:
· Section 1 lists all keywords that were not found in the data dictionary.
· Section 2 lists all keywords that were found in the data dictionary, each followed by its definition from the data dictionary.
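Because each keyword in Section 1 is reported on a line beginning with "WARNING: Not in Data Dictionary --", the undefined keywords can be pulled out of a report with a few lines of script. The following Python sketch assumes the report layout shown in the sample above; the function name undefined_keywords is hypothetical.

```python
# Prefix used in Section 1 of the sample report shown above.
PREFIX = "WARNING: Not in Data Dictionary -- "


def undefined_keywords(report_text):
    """Return the keywords that the report lists as missing from the dictionary."""
    names = []
    for line in report_text.splitlines():
        line = line.strip()
        if line.startswith(PREFIX):
            names.append(line[len(PREFIX):].strip())
    return names


sample_report = """\
***** The following keywords are not present *****
***** in the Data Dictionary *****
WARNING: Not in Data Dictionary -- INVALID
***** The following set of keywords and definitions *****
"""
missing = undefined_keywords(sample_report)
print(missing)
```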
If there are several label files in the directory (e.g. “example1.lbl”, “example2.lbl”, and “example3.lbl”), enter the command:
ddict -f *.lbl -r myreport.rpt
This tells DDICT: “Process all the files in the directory that have extension ‘.lbl’ and write all the results to file ‘myreport.rpt’”. The report file has the same layout as in the first example. All keywords appear in alphabetical order in whichever section is appropriate. The keywords in the report are not separated by label, only by whether or not they were found.
DDICT parses the ODL label into a list containing each object and all of the keywords in each object. It then goes through the list and looks up each keyword in the data dictionary. Keywords that are not found in the data dictionary are listed in the first section of the report. Keywords that are found are listed in the second section of the report, along with the definition found in the data dictionary.
This tool performs no validation of the ODL semantics.
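The lookup step described above can be sketched in Python as follows. This is a simplified model, not DDICT's implementation: the dictionary is a plain in-memory mapping rather than pdsdd.idx, and the function name split_keywords is hypothetical. The two definitions are taken from the sample report in this guide.

```python
def split_keywords(keywords, dictionary):
    """Split keywords into (not_found, found), each in alphabetical order.

    `found` holds (keyword, definition) pairs; duplicates are reported once.
    """
    unique = sorted(set(keywords))
    not_found = [k for k in unique if k not in dictionary]
    found = [(k, dictionary[k]) for k in unique if k in dictionary]
    return not_found, found


# Toy stand-in for the data dictionary.
dictionary = {
    "LINES": "The lines element indicates the total number of data "
             "instances along the vertical axis of an image.",
    "LINE_SAMPLES": "The line_samples element indicates the total number "
                    "of data instances along the horizontal axis of an image.",
}

not_found, found = split_keywords(
    ["LINES", "LINE_SAMPLES", "INVALID", "LINES"], dictionary
)
print("Not in dictionary:", not_found)
for keyword, definition in found:
    print(keyword)
    print("  " + definition)
```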
Default: -na (no aliasing)
The -a command line option enables the aliasing feature of DDICT. This option is useful when processing a label which was developed according to another Data Dictionary, or to identify out-of-date keywords and objects in the label.
When this option is turned on, DDICT will compare the object classes in the label with the list of object class aliases in the Data Dictionary, and replace them with their official names. Object classes that were changed may be identified by examining the label hierarchy in the report file. For example:
OBJECT = TABLE_STRUCTURE
will be replaced by
OBJECT = TABLE
if TABLE_STRUCTURE is defined to be an alias for object class TABLE in the Data Dictionary.
DDICT will then compare the keywords in the label with the list of keyword aliases in the Data Dictionary and replace them with their official names.
After aliasing is finished, DDICT will continue checking for keyword definitions, using the official names instead of the aliases when it searches the Data Dictionary.
Note that only full class names will be replaced. For example, EDR_TABLE_STRUCTURE will not become EDR_TABLE just because TABLE_STRUCTURE is defined as an alias for TABLE. The same is true for the keyword replacement. Remember that DDICT does not change the actual label file at all. The changes to the label remain in effect only for the duration of DDICT’s run.
Enabling the aliasing option significantly slows processing, and so the default is -na.
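The rule that only full names are replaced can be illustrated with a short Python sketch. The alias table here is a toy mapping built from the TABLE_STRUCTURE example above, and apply_aliases is a hypothetical name; the sketch models the behavior described in this section rather than DDICT's own code.

```python
def apply_aliases(names, aliases):
    """Replace each name that exactly matches an alias with its official name.

    Names that merely contain an alias as a substring are left unchanged.
    """
    return [aliases.get(name, name) for name in names]


# Toy alias table based on the example in this section.
object_aliases = {"TABLE_STRUCTURE": "TABLE"}

result = apply_aliases(["TABLE_STRUCTURE", "EDR_TABLE_STRUCTURE"], object_aliases)
print(result)
```

The input list is not modified, which mirrors the fact that DDICT never changes the label file itself.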
Default: -d pdsdd.idx
The argument following -d is the name of the data dictionary index file to use. Typically this is pdsdd.idx, the index for pdsdd.full. For example:
ddict -d pdsdd.idx -f mylabel.lbl -r myreport.txt
Default: -df pdsdd.full
The –df command line option enables the user to give the Data Dictionary full file instead of the index file as an input. When this is given, DDICT will make the index file internally by making a system call to the make_index program. It will then use this internally created index file to continue verifying the labels. For this reason, the make_index program (version 1.7 or higher) must be located in the same directory as DDICT or the path to the program must be set in the PATH variable of the user system.
If there was an error in creating the Data Dictionary index file, then it will display something like the following:
**************Creating Data Dictionary index file**************
PDS Make Dictionary Index - Version 1.7
The data dictionary file you specified does not exist or cannot be read.
The data dictionary file you specified does not contain any object or element definitions.
Data Dictionary index file could not be created, Verification will cease
Check that the Data Dictionary full file exists in the directory path you have specified
If this happens, make sure that the correct path to the Data Dictionary full file is specified, and make sure that the make_index program is in the same directory as DDICT or on the PATH.
Either -d or -df must be specified. DDICT will not run if neither parameter is given.
Default: None.
Off Switch: None. Specify a label file for processing.
The -f command line option specifies one or more files to be processed. DDICT will search all subdirectories of the pathname of the file specified in this option.
To specify just one filename:
ddict -f mylabel.lbl . . .
Or a wildcard pattern:
ddict -f /home/joeuser/*.lbl . . . (in Unix: -f /home/joeuser/"*.lbl")
Or a series of filenames and / or wildcard patterns:
ddict -f /home/joeuser/*.lbl *.fmt catalog.lbl
In the last case, all files with extension “.lbl” in the directory /home/joeuser, all files in the current directory with the extension “.fmt”, and the file “catalog.lbl” in the current directory will be processed. The directory specifications and wildcards should be those expected by the particular operating system.
-ivf <file name>
Any file with an extension in this list will not be processed. File extensions that might be included in this list are, for example: c, h, txt, img. The list is a single comma-separated string of file extensions. Matching is case sensitive, so include both upper and lower case forms where needed. The list may not contain blanks and may be at most 200 characters long. A sample string is:
C,H,TXT,IMG,c,h
This string tells DDICT not to process any file having an extension of C (or c), H (or h), TXT, or IMG. The string is saved in an ASCII file with a name of your choice.
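The constraints on the extension list (comma separated, no blanks, at most 200 characters) are easy to check before the file is written. The Python sketch below is one way to do so; the function name make_skip_list is hypothetical, and the checks simply restate the rules given in this section.

```python
def make_skip_list(extensions, max_length=200):
    """Build the comma-separated extension string for the -ivf file.

    Raises ValueError if an entry is empty, contains a blank or a comma,
    or if the finished string is longer than max_length characters.
    """
    items = []
    for ext in extensions:
        if not ext or "," in ext or any(ch.isspace() for ch in ext):
            raise ValueError("invalid extension: %r" % (ext,))
        if ext not in items:  # keep first occurrence, preserve order
            items.append(ext)
    line = ",".join(items)
    if len(line) > max_length:
        raise ValueError("list is longer than %d characters" % max_length)
    return line


skip_line = make_skip_list(["C", "H", "TXT", "IMG", "c", "h"])
print(skip_line)
```

The resulting string can then be saved to an ASCII file and that file named after the -ivf option.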
Default: -na
This option disables alias substitution. To turn on alias substitution use the -a option.
The -r command parameter is mandatory. It is followed by the name of the report file to which DDICT will write its report:
ddict -r /home/joeuser/label.rpt ...
or
ddict -r c:\pds\testarea\label.rpt ...
By default, DDICT searches all subdirectories for labels to process. The -di option prevents DDICT from searching subdirectories. The search begins in the directory specified with the file name(s) in the -f option, or in the current directory if no path is specified. The -di option only works for DOS at this time.
Default: no file.
The file named by the -ivd option contains a list of directories that should not be checked for files to process. Each directory is on a separate line, and names are case sensitive. There can be up to 30 entries. Each directory specification must be less than 100 characters in length. Sample file contents:
D:\SOFTWARE
D:\CATALOG
D:\INDEX
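The limits on the directory list (one directory per line, up to 30 entries, each less than 100 characters) can be checked with a small script before the file is passed to DDICT. This Python sketch restates those limits; the function name make_directory_list is hypothetical.

```python
def make_directory_list(directories, max_entries=30, max_length=100):
    """Return the text of a directory skip file, one directory per line.

    Raises ValueError if there are more than max_entries directories or if
    any entry is empty or is max_length characters or longer.
    """
    if len(directories) > max_entries:
        raise ValueError("more than %d directories" % max_entries)
    for directory in directories:
        if not directory or "\n" in directory:
            raise ValueError("invalid directory: %r" % (directory,))
        if len(directory) >= max_length:
            raise ValueError("directory too long: %r" % (directory,))
    return "".join(directory + "\n" for directory in directories)


contents = make_directory_list(["D:\\SOFTWARE", "D:\\CATALOG", "D:\\INDEX"])
print(contents, end="")
```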
Default: Do not create a log file.
The -lef option directs DDICT to keep a log file of all files that are skipped because of the -ivf, -ivd, or -l3d options, and specifies a file name for the log.
The -sf option specifies a file containing keywords to be skipped. The keywords are comma separated in a single string that must not exceed 200 characters and must not include any blanks. The program is case sensitive.
By default, DDICT checks that each file being processed is in fact a label. Specifying the -nol3d option disables this check.
TBD
DDICT now gives the user the option to input either the pdsdd.idx file or the pdsdd.full file. To use the full file, use the -df option instead of the -d option, for example:
ddict -df c:\data\pdsdd.full
In this case DDICT uses the MAKE_INDEX program (version 1.7 or higher) to create the data dictionary index file. Because of this dependency, the MAKE_INDEX program must be in the same directory as the DDICT program, or the PATH environment variable must include the directory where MAKE_INDEX is located.
Additional changes:
Added the following command-line options:
The following command line arguments are no longer valid and should not be used, even though the program will still display and accept them: -nd, -t, -nt, -p, -np, -nv, -v.
PDSDD.FULL and MAKE_INDEX were not distributed as part of this release of DDICT.
TBD
Appendix C- Glossary
TBD