Remote data file


Check “Remote data source” to select a text file on a remote server, then configure the related options in the same box to convert the file to an mtx file.

Batch generating

Click “Batch generating”. By default, all CSV files and text files under the data file directory on the selected server are listed. Configure “File options”, select a file and click “Execute”. The newly generated modeling tables are saved under the same data file directory.

“Filter by file name”: Filter the list of data files by name. By default, all data files under the data file directory on the server are listed.
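The listing and filtering behavior described above can be pictured with a small Python sketch. This is not the product's own code: the function name list_data_files, the extensions .csv and .txt, and the case-insensitive substring match are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: "CSV files and text files" means these two extensions.
DATA_EXTENSIONS = {".csv", ".txt"}


def list_data_files(directory, name_filter=""):
    """Return the names of data files in a directory, optionally filtered by name.

    Hypothetical helper; the filter is assumed to be a case-insensitive
    substring match, and an empty filter lists every data file.
    """
    needle = name_filter.lower()
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file()
        and p.suffix.lower() in DATA_EXTENSIONS
        and needle in p.name.lower()
    )
```

With an empty filter the sketch lists every data file in the directory, which mirrors the default described above.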

File options

In the “Generate Modeling Table” window, select “Remote data file” and click “File options” to configure the data format and define missing values for a data file, as shown below:

“Import the first line as variable name”: When this option is checked, the first line of the data file is loaded as headers; when it is unchecked, the headers are displayed in the format “_n” (n is the position of the field), such as _1, _2, ….

“Omit all quotation marks”: When this is checked, both single and double quotation marks are automatically removed from field names and string-type field values.

“Check Column Count”: If this is checked, the system automatically checks whether every row of the data file has the same number of columns when generating the modeling table.

“Delete a line when column count does not match value count at line 1”: When this option is checked, any line whose number of columns differs from that of the first line is automatically deleted when the modeling table is generated.

“Use double quotation marks as escape characters”: If this is checked, double quotation marks are treated as the escape character; if unchecked, the default escape character is a slash (/).

“Delimiter”: Select the separator used in your data file.

“Charset”: Select the character set used by the data file to be processed.

“Date format”, “Time format”, “Date time format”: You can define custom formats for date, time and datetime data.

“Locale”: Select a language you need.

“Missing values (bar-separated)”: The default is Null|N/A, meaning that a Null or N/A in a data file represents a missing value. You can add missing values manually; when there are several, separate them with “|”.
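To make the effect of these options concrete, here is a minimal Python sketch of how a delimited file might be read under them. It illustrates the behavior described above and is not the product's implementation. The function load_rows and its parameter names are hypothetical. Raising an error when “Check Column Count” finds an inconsistent row is also an assumption, since this section does not say what the check reports. Escape characters, charset, locale and date formats are left out.

```python
def strip_quotes(value):
    """Remove single and double quotation marks ("Omit all quotation marks")."""
    return value.replace('"', "").replace("'", "")


def load_rows(text, first_line_as_names=True, omit_quotes=True,
              check_column_count=True, delete_mismatched=False,
              delimiter=",", missing_values="Null|N/A"):
    """Hypothetical reader mirroring the file options; returns (names, data)."""
    # "Missing values (bar-separated)": split the setting on "|".
    missing = set(missing_values.split("|"))

    # Split every non-empty line on the chosen delimiter.
    rows = [line.split(delimiter) for line in text.splitlines() if line]
    if not rows:
        return [], []

    if omit_quotes:
        rows = [[strip_quotes(v) for v in row] for row in rows]

    # Line 1 defines the expected number of columns.
    width = len(rows[0])
    if delete_mismatched:
        # Drop lines whose column count differs from line 1.
        rows = [row for row in rows if len(row) == width]
    elif check_column_count:
        # Assumption: an inconsistent row is reported as an error.
        bad = [i + 1 for i, row in enumerate(rows) if len(row) != width]
        if bad:
            raise ValueError(f"column count mismatch at line(s) {bad}")

    if first_line_as_names:
        names, data = rows[0], rows[1:]
    else:
        # Without a header line, fields are named _1, _2, ...
        names, data = [f"_{i}" for i in range(1, width + 1)], rows

    # Replace configured missing-value tokens with None.
    data = [[None if v in missing else v for v in row] for row in data]
    return names, data
```

For instance, calling load_rows on the text 'id,"name",age' followed by the lines 1,'Ann',N/A and 2,Bob with delete_mismatched=True would drop the two-column line, strip the quotation marks, and turn N/A into a missing value.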

Execute

Select a file for generating the modeling table and click “Execute”. The modeling table has been created successfully when the message “The file was created successfully.” appears in the “Description” box in the lower part of the “Generate Modeling Table” window.

Below is part of the local file train.csv:

In a CSV file, the comma is the separator; the other options keep their default settings. Click “Execute” and you get the following: