Processing Common Data

To analyze data, you first need to import it. Most of the time, the data comes from text files or databases, and esProc can retrieve data from both quickly and easily.

Text data

esProc can read data from a text file as a table sequence. Here’s an example. The text file employee.txt contains employee information:

EID  NAME      SURNAME  GENDER  STATE       BIRTHDAY    HIREDATE    DEPT       SALARY
1    Rebecca   Moore    F       California  1974-11-20  2005-03-11  R&D        7000
2    Ashley    Wilson   F       New York    1980-07-19  2008-03-16  Finance    11000
3    Rachel    Johnson  F       New Mexico  1970-12-17  2010-12-01  Sales      9000
4    Emily     Smith    F       Texas       1985-03-07  2006-08-15  HR         7000
5    Ashley    Smith    F       Texas       1975-05-13  2004-07-30  R&D        16000
6    Matthew   Johnson  M       California  1984-07-07  2005-07-07  Sales      11000
7    Alexis    Smith    F       Illinois    1972-08-16  2002-08-16  Sales      9000
8    Megan     Wilson   F       California  1979-04-19  2004-04-19  Marketing  11000
9    Victoria  Davis    F       Texas       1983-12-07  2009-12-07  HR         3000

esProc uses the import function to import data from files:

     A
1    =file("employee.txt")
2    =A1.import@t()
3    =A1.import()

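In A2, the import function uses the @t option, which makes the first line of the text file the column names of the table sequence. A2's table sequence therefore has the columns EID, NAME, SURNAME, GENDER, STATE, BIRTHDAY, HIREDATE, DEPT and SALARY, with one record per employee.

Without the @t option, as in A3, the first line is treated as an ordinary data row, and the columns get default names (_1, _2, and so on) instead of the names in the file.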
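For readers more familiar with Python, the difference between importing with and without a header line can be sketched with the standard csv module. This is a rough analogue, not esProc code. It assumes employee.txt is tab-delimited (the text above does not say), it embeds only the first two data rows, and the helper names read_with_header and read_without_header are made up for illustration.

```python
import csv
import io

# First lines of employee.txt; the tab-delimited layout is an assumption.
SAMPLE = (
    "EID\tNAME\tSURNAME\tGENDER\tSTATE\tBIRTHDAY\tHIREDATE\tDEPT\tSALARY\n"
    "1\tRebecca\tMoore\tF\tCalifornia\t1974-11-20\t2005-03-11\tR&D\t7000\n"
    "2\tAshley\tWilson\tF\tNew York\t1980-07-19\t2008-03-16\tFinance\t11000\n"
)


def read_with_header(text):
    """Use the first line as column names (the role @t plays in A2)."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def read_without_header(text):
    """Treat every line, including the first, as data (like A3)."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    # Placeholder column names _1, _2, ... stand in for the missing header.
    names = [f"_{i}" for i in range(1, len(rows[0]) + 1)]
    return [dict(zip(names, row)) for row in rows]


with_header = read_with_header(SAMPLE)
without_header = read_without_header(SAMPLE)

print(len(with_header), with_header[0]["NAME"])
print(len(without_header), without_header[0]["_2"])
```

With the header option, the sample yields two records keyed by the real column names. Without it, the same text yields three records, and the first one holds the header text as plain data.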

Database data

esProc can access various databases through JDBC. To open the datasource manager, click Datasource Connection in the Tool menu.

Through the datasource manager you can connect to or disconnect from a datasource and configure the database connection. demo is esProc's built-in datasource; start its database by running esProc\bin\startDataBase.bat under the installation directory. Once the datasource is connected, esProc can access the database and fetch data using SQL:

     A
1    =demo.query("select * from CITIES")
2    $select * from CITIES

The query function executes the SQL command and returns the result set as a table sequence, as in A1. When the database is connected, you can also write the SQL statement directly after $, as in A2. A1 and A2 return the same table sequence.
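The idea of running a SQL statement and getting the whole result set back as a table of named columns can be sketched in Python with the standard sqlite3 module. This is only an analogue under stated assumptions: it uses an in-memory SQLite database rather than esProc's demo datasource, the CITIES columns CID and NAME and the two sample rows are hypothetical, and the query helper is made up for illustration.

```python
import sqlite3


def query(conn, sql):
    """Run sql and return the full result set as a list of dicts."""
    cur = conn.execute(sql)
    names = [col[0] for col in cur.description]  # column names of the result
    return [dict(zip(names, row)) for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")
# Hypothetical table layout and rows; the real CITIES table is not shown above.
conn.execute("CREATE TABLE CITIES (CID INTEGER, NAME TEXT)")
conn.executemany(
    "INSERT INTO CITIES VALUES (?, ?)",
    [(1, "New York"), (2, "Los Angeles")],
)

cities = query(conn, "select * from CITIES")
conn.close()

for record in cities:
    print(record["CID"], record["NAME"])
```

Each record carries the column names of the result set, which is roughly what a table sequence provides for the rows fetched in A1 and A2.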

Instead of using the datasource manager, you can use the connect function to connect to a datasource. In this case, you should close the connection with the close function once the data has been retrieved from the database:

     A
1    =connect("demo")
2    =A1.query("select * from CITIES")
3    >A1.close()

With this method, A2 returns the same table sequence of city information.
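The connect, query, close sequence follows a common pattern: whoever opens a connection explicitly is also responsible for closing it. A minimal Python sketch of that pattern with the standard sqlite3 module is shown here. It is an analogue, not esProc code, and the in-memory database, the single NAME column and the sample row are assumptions made so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # plays the role of A1: open the connection
try:
    # Hypothetical table and row, created only so the query has data to return.
    conn.execute("CREATE TABLE CITIES (NAME TEXT)")
    conn.execute("INSERT INTO CITIES VALUES ('New York')")
    rows = conn.execute("select * from CITIES").fetchall()  # role of A2
finally:
    conn.close()  # role of A3: release the connection even if the query fails

# A closed connection can no longer be used for queries.
try:
    conn.execute("select 1")
    closed = False
except sqlite3.ProgrammingError:
    closed = True

print(rows, closed)
```

Placing the close call in a finally block is a design choice of this sketch: it ensures the connection is released even when the query raises an error.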