Delimited File Analysis

If the file is delimited, the system analyzes the first [x] records, then one record in every [y] thereafter, where [x] is defined by the system configuration setting NumberFirstLinesToRead and [y] by the system configuration setting AnalyzeEveryNLines.
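The sampling rule can be sketched in Python as follows. This is a minimal illustration, not RPI's implementation: the function name sample_records and its parameters are hypothetical, and it assumes that "one record in every [y]" means the first record of each block of [y] records after the initial [x].

```python
from typing import Iterable, Iterator


def sample_records(lines: Iterable[str], first_n: int, every_n: int) -> Iterator[str]:
    """Yield the first `first_n` records, then one record per block of `every_n`.

    first_n stands in for NumberFirstLinesToRead and every_n for
    AnalyzeEveryNLines. Which record of each block is picked is an assumption.
    """
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    for index, line in enumerate(lines):
        if index < first_n:
            # Always analyze the leading records.
            yield line
        elif (index - first_n) % every_n == 0:
            # Afterwards, take the first record of each block of every_n.
            yield line
```

With 20 records, first_n = 5 and every_n = 5, this sketch selects records 0 to 4, then records 5, 10 and 15 (zero-based).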

RPI is able to determine the following high-level information during analysis of a delimited file:

• Delimiter: supported delimiters are defined by the system configuration settings FileAnalysisDelimiters and FileAnalysisDelimitersSeparator (the latter is used to parse the list of delimiters provided in the former). If the file is delimited using any other character, RPI is unable to analyze it. In this case, define the delimiter manually and invoke Re-analyze. Note that the Tab character is defined using “\t”.

• Header row: RPI is able to determine whether the file has a header row by testing for the following condition:

o If all of the first row's fields contain string values, and at least one other row contains a non-string value, the file is determined to contain a header row.

• Skip lines: RPI can make a rudimentary determination of the number of rows at the beginning of a file to disregard as non-data in nature. If a header row is present, Skip lines is set to 1; if a header row is not present, it is set to 0.
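The three determinations above can be sketched in Python as follows. This is an illustration under stated assumptions, not RPI's actual logic: all function names are hypothetical, the delimiter heuristic (every sampled line splits into the same number of fields, greater than one) is an assumption, each delimiter is assumed to be a single character, and "non-string" is approximated as "parses as a number".

```python
import csv
from typing import List, Optional, Sequence


def parse_delimiter_setting(delimiters: str, separator: str) -> List[str]:
    """Split a FileAnalysisDelimiters-style list using the configured separator.

    The two-character text backslash-t is mapped to a real tab character.
    """
    return ["\t" if d == "\\t" else d for d in delimiters.split(separator) if d]


def detect_delimiter(sample: Sequence[str], candidates: Sequence[str]) -> Optional[str]:
    """Return the first candidate giving a consistent field count above 1.

    Returns None when no supported delimiter fits, which corresponds to the
    case where the delimiter has to be defined manually. Heuristic is assumed.
    """
    for delim in candidates:
        counts = {len(row) for row in csv.reader(sample, delimiter=delim)}
        if len(counts) == 1 and counts.pop() > 1:
            return delim
    return None


def is_non_string(value: str) -> bool:
    """Assumption: a value counts as non-string if it parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def has_header_row(rows: Sequence[Sequence[str]]) -> bool:
    """First row is all strings and some later row holds a non-string value."""
    if not rows:
        return False
    first_all_strings = all(not is_non_string(v) for v in rows[0])
    other_has_non_string = any(is_non_string(v) for row in rows[1:] for v in row)
    return first_all_strings and other_has_non_string


def skip_lines(rows: Sequence[Sequence[str]]) -> int:
    """Skip lines is 1 when a header row is detected, otherwise 0."""
    return 1 if has_header_row(rows) else 0
```

For example, given the sampled lines "id|name|age", "1|Ann|34" and "2|Bob|41" and the candidate delimiters comma, pipe and tab, the sketch selects the pipe, reports a header row, and sets Skip lines to 1. A file in which every field of every row is text would not be reported as having a header, because no later row contains a non-string value.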