7.2.3 Cursor-based operations

Besides fetching data from a cursor, esProc also offers performing common data computations by directly using certain functions. The following example is a conditonal query based on the cursor:

 

A

B

1

=file("Order_Books.txt")

=A1.cursor@t()

2

=B1.select(SalesID==1)

=A2.fetch()

3

=A2.fetch()

=B1.fetch()

B1 creates an external file cursor and A2 selects the sales data of the salesperson whose ID number is 1 from B1’s cursor. Here’s A2’s result:

You can see that A2 and B1 get the same result. Different from the select function with a table sequence or a record sequence, the select function used with a cursor returns the cursor itself. A2’s operation doesn’t really get data from the cursor. It is actually a bundled operation. It is in B2 where cs.fecth() function is being executed that the data queried and fetched from B1’s cursor according to the condition. Here’s B2’s result:

No number of rows to be returned or a data fetching condition is specified for cs.fetch() function used in B2. Thus all data in A2’s cursor, i.e. all the sales data of the specified salesperson, will be returned, and the cursor will close automatically. The cursor in A2 is equivalent to the one in B1, and actually during the data query, it is B1’s cursor data that is traversed during which the bundled data fetching is executed. So as the data in A2’s cursor is fetched out, B1’s cursor will be fetched out too, and close. Therefore the effort of A3 and B3 trying to fetch data from the cursors in both A2 and B1 will definitely fail.

If the data in a file is already sorted, you can use f.iselect(A,x,Fi,…;s) or f.iselect(a:b, x,Fi,…;s) to select data to generate cursor, which is more efficient. For example:

 

A

B

1

=file("Order_Books.txt").iselect@t(["S020000012", "S020000022"],ID; ID,PID,Date,Amount)

=A1.fetch()

2

=file("Order_Books.txt").iselect@ti(date(2013,2,1): date(2013,2,28),Date; ID,PID,Date,Amount)

=A2.fetch@x(100)

Since the file contains field names, @t option is used. Similarly, if it is a bin file, @b option can be added. A1 uses f.iselect(A,x,Fi,…;s) to get the sales records by order ID. By default iselect function requires the desired data is distinct when performing a search. A2 searches for sales records during a specified time period using f.iselect(a:b, x,Fi,…;s) function with an @i option, meaning there could be more than one Date and all eligible records will be returned. When executed, B1 and B2 get results as follows:

You can also create new records or append new fields based on a cursor. For example:

 

A

B

1

=file("employee.txt")

=A1.cursor@t(EID, NAME,SURNAME, GENDER, STATE,BIRTHDAY)

2

=B1.new(EID, NAME+" "+SURNAME:FullName, GENDER,STATE,BIRTHDAY)

=B1.derive(age(BIRTHDAY):Age)

3

=A2.fetch(100)

=B2.fetch(100)

4

>A2.close()

=B2.fetch()

5

>B2.close()

 

B1 creates a file cursor using some of the fields of the external file employee.txt. A2 generates employees’ full names by joining NAME field and SURNAME field in B1’s cursor. B2 computes each employee’s age according to the BIRTHDAY field in B1’s cursor and enter the results to Age field. Similar to cs.select(), both cs.new() and cs.derive() return a cursor, rather than returning data directly. When executed, A2 and B2’s cursor values are the same as B1’s:

Here are the first 100 rows fetched from A3:

Generating data with cs.new() function means executing new statement on every record.

Here are the first 100 rows fetched from B3:

Generating data with cs.derive() function means appending computed columns to the returned table sequence.

B3’s data has the same structure as A3’s data. It begins from the 101th employee. That’s because both A2 and B2 have a bundled operation with their cursors while both fetch data from the same cursor, and a cursor will be traversed only once in a forward direction. After A3 is executed, the first 100 rows have been traversed. So B3 can only continue the traversal in order by beginning with the 101th employee when executing the cs.fecth() function.

But as data hasn’t been all fetched out in both cursors, they need to be closed using cs.close() function. As A4 closes the cursor in A2, the cursor in B1 is also closed simultaneously, for A2 and B1 are in reality the same cursor. A try of fetching more data thus fails.

What’s more, cs.run() function can be used to modify values of a certain field in cursor:

 

A

B

1

=file("employee.txt")

=A1.cursor@t(EID, NAME,SURNAME, GENDER, STATE,BIRTHDAY)

2

=demo.query("select STATEID, NAME, ABBR from STATES")

=B1.run((a=STATE,STATE=A2.select@1(NAME==a).ABBR))

3

=B2.fetch(100)

>B2.close()

B2 replaces names of the states in the returned data with their abbreviations and returns a cursor with a bundled operation:

A3 fetches the first 100 rows:

Similar to a table sequence, the cursor allows the switch between one of its fields and the corresponding records in another table sequence by using cs.switch() function. For example:

 

A

B

1

=file("employee.txt")

=A1.cursor@t(EID, NAME,SURNAME, GENDER, STATE,BIRTHDAY)

2

=demo.query("select STATEID, NAME, ABBR from STATES").keys(STATEID)

=B1.switch(STATE,A2:NAME)

3

=B2.fetch(100)

>B2.close()

B2 uses cs.switch() to convert certain data of the cursor. The result is still a cursor with a bundled operation:

Here’s the result of data fetching in A3:

In the result values of STATE field have been converted into the corresponding state records.

The cs.switch() function can convert more than one field of the cursor. For example:

 

A

B

1

=file("employee.txt")

=A1.cursor@t(EID, NAME,SURNAME, GENDER, STATE,BIRTHDAY)

2

[F,Female,M,Male]

=create(ID,Gender).record(A2).keys(ID)

3

=demo.query("select STATEID, NAME, ABBR from STATES").keys(STATEID)

=B1.switch(GENDER,B2;STATE,A3:NAME)

4

=B3.fetch(100)

>B3.close()

B3 switches values of several fields of the cursor to the corresponding records in another table sequence. A4 fetches the first 100 records:

Primary Key and Index explained the use of switch function in the ordinary table sequences, which is similar to that of cs.switch() in external memory computations. Both functions will create the index table for certain field(s) of the dimension table in order to increase efficiency. It is more important to adopt this processing method while using the cursor because usually the data under processing is big.