Fields Grid

A single button is displayed above the fields grid:

      Lock or unlock fields properties: this toggle button is shown above and to the right of the Fields grid.  It allows you to control as to whether it is possible to make changes to the properties of the fields listed in the Fields grid at second or subsequent data project execution.

At the second or subsequent execution of a data project, if the Lock or Unlock... property is set to Unlocked, a field's Data Type can be changed.  However, only the following changes are supported:

If an Integer, can only be converted to Decimal

If Date, can only be converted to Date Time

Note that the Data Type of the table's primary key cannot be changed.

The fields grid itself contains a list of the fields in the uploaded file.  It is populated during the file’s initial analysis.  However, its initial contents vary depending upon whether the file is delimited or fixed-width.

      If the file is delimited, a full set of columns is displayed in the grid.

You can make changes therein as required.

      If the file is fixed-width, a limited set of read-only columns is displayed in the grid after initial analysis.

Note the message displayed above the grid.  The message is removed from display after invocation of Re-analyze.  It will subsequently be redisplayed should field boundaries change again.

The grid is populated as follows:

Number: an incrementing integer value that begins at 1.  Number maps to the watermark number displayed in the fixed-width Preview dialog.

Start Position: the position within the file at which the field begins.  In all fields except the first, Start Position equals the previous field’s Start Position plus its Length.

Length: the length of the field in characters.

The limited column set is replaced with the full set of columns following successful re-analysis of the file.

Typically, you will invoke Preview & Define to view a sample of the file in the Preview dialog, make manual adjustments to field boundaries, and then invoke Re-analyze to confirm your stipulations.  Note that you can make subsequent changes to field boundaries using Preview & Define, but, if you do so, the limited column set will be redisplayed and any manual changes made within the full column set (e.g. to a field’s Data Type or Classification) will be lost.  The ‘Preview the file…’ message will also be redisplayed.

When all columns are displayed, you can make amendments to the file’s schema directly within the grid.  The following columns are presented:

      Number: a read-only, incrementing integer indicating the field’s ordinal position within the file.  Note that Number corresponds to the watermark displayed when Preview & Define is invoked for a fixed-width file.

      Index: this checkbox allows you to specify that a field is to be used for table indexing.  Post-validate and load, for each field checked as an Index, a database index will be created on the data project table.

      Key: clicking a cell within the Key column allows you to define that the selected field is a key field.  When clicked, a key icon is displayed in the column (clicking it again removes the icon from display).  You can specify that a file has a compound key by clicking on more than one field.  A SQL Server primary key constraint is created at load of the initial data project file in accordance with your specifications.  If no fields are defined as key, a primary key is not created.

      Name: a mandatory text field that can be a maximum of 50 characters in length.  Name is used to specify column names within the table into which data will be loaded.  Name is either set by default in accordance with a file’s header row or set to a default value if no header row exists (full details of how this is carried out are provided elsewhere).  Note that a field’s name must be unique within the data project.

      Data Type: set automatically during file analysis (full details of the logic employed by RPI in doing so are provided elsewhere).  Data type can be set manually using a drop-down that exposes the following options:

Date

DateTime

Decimal

Integer

String

Time   

      Size: relevant for data types String, Integer and Decimal.  Different validation criteria are enforced depending on the field’s data type:

String: must be an integer, with a minimum value of 1 and a maximum value of 4000.

Integer/Decimal: must be an integer, with a minimum value of 1 and a maximum value of 16.  In the case of a decimal, size represents the field’s precision (the total number of digits before and after the decimal point in a number).

      Scale: relevant only if Decimal; an integer, with a minimum value of 0, and a maximum value of 38.  Scale represents the number of digits after the decimal point in a number.

      Format: only enabled (and mandatory) when the field’s data type is one of Date, Time or DateTime.  Format can be a maximum length of 30 characters.  Format is used to define how data within the field is to be structured – e.g. “d/M/yyyy”.  Full details on supported date formats are provided elsewhere.

      Column Name: displays a read-only representation of the final column name.  Based on Name, but with the following caveats:

Shown in upper case

Any database-incompatible characters are replaced with underscores

If a Name is set to a reserved word, a validation error is raised

If Name begins with a numeric character, a ‘X’ is added as its first character

      Exclude: a checkbox, unchecked by default.  If you exclude a field, it will not be exposed through the creation of an RPI attribute after data is loaded.  Right-clicking this column provides access to Exclude all rows and Clear all excluded rows options.

      Distribution Key: this checkbox is used when importing data into a Netezza data warehouse only It is unchecked by default.  Up to four fields can be selected to serve as a data project's distribution key.  When a data project at which distribution keys are defined is executed, distribution keys will be created for the data project's 'UD_XXXXX' table.