Loop Computations

Loop computation works on each member of a sequence, or each record of a table sequence and a record sequence. esProc provides simple loop functions to perform the computation, without having to use the loop statements.

2.9.1 Returned results and relative reference

In performing loop computation on a sequence, you can reference each member of the sequence through a parameter of the loop function. “~” is used to represent a member handled currently, or base member; “#” is used to represent ordinal number of this current member.

When calculating a table sequence or a record sequence by loop, you can directly use the field name F to reference the field value of the current record. Note: For a record sequence, the prerequisite for referencing value of a certain field F is that all records must have this field F. You can use ~.F, r.F or A.F to specify record r, and record sequence (or table sequence) A, to which the referenced field belongs.

Special Note: If there is a namesake variable in the memory, the simplified form of referencing the field by a loop function should be avoided. The complete form, ~.F or A.F is required because an F alone could be interpreted as the namesake variable. 

Creating a sequence with results

Compute each member of sequence A by loop and return a new sequence composed of the results. The computation can be realized using function A.(x). For example:

 

A

1

[2,5,-3,8]

2

=A1.()

3

=A1.(#*#)

4

=A1.(power(~,3))

Because A2’s function doesn’t specify a loop expression, it returns the sequence in A1 directly. A3 generates a new sequence composed of the square values of ordinal numbers. And A4 generates a new sequence composed of the powers of members. The results of A1, A2, A3 and A4 are as follows:

 

 

Among these results, the sequences in A1 and A2 are same, while those in A3 and A4 are newly-created with the computed results.

A.(x) function can also be used with a table sequence or a record sequence. In that case, an expression can reference one or more fields of the records, as shown below:

 

A

1

=demo.query("select EID,NAME,SURNAME,GENDER, BIRTHDAY,STATE from EMPLOYEE")

2

=A1.(STATE)

3

=A1.(age(BIRTHDAY))

A1 is a table sequence retrieved from the database:

A2 finds the state where each employee stays and A3 computes the age of each employee. Results of A2 and A3 are as follows:

 

As can be noticed, the returned result of calculating a table sequence or a sequence by A.(x) function is a sequence instead of a table sequence; and the result doesn’t contain the field name.

In a loop computation, we can set multiple continuous expressions in the format of A.(x, …). In this case the expressions are computed in order on each of the members during the loop, result of the last expression is stored, and a sequence of all results is returned. For example:

 

A

1

[2,3,5,7,11]

2

>sum=0

3

=A1.(sum=sum+~, round(~/sum,2))

Below are A1 and A3’s results after computation:

 

A3 first calculates the cumulative sum and then the proportion of the current member to the existing cumulative sum. For the computation of A.(x, …), basically the first (several) expression(s) performs value assignment or gets the intermediate result while the last expression returns the final result.

Returning the original sequence

You can also use A.run(x) function to loop through each member of sequence A. Meanwhile, the function will calculate expression x based on each member but will return the original sequence A. For example:

 

A

1

[2,5,-3,8]

2

=A1.run(~=~*2)

After the program is executed, you can get the same sequence in A1 and A2:

 

Since the run function returns the original sequence rather than a new sequence, it is usually used to modify members of a sequence. In this case, for instance, members of the original sequence are doubled.

A.run(x) function is often used to modify records by loop in a table sequence, or a sequence. For example:

 

A

1

=demo.query("select EID,NAME,SURNAME,GENDER, BIRTHDAY,STATE from EMPLOYEE")

2

>A1.run(GENDER=case(GENDER,"F":"Female","M":"Male"),BIRTHDAY= string(BIRTHDAY,"dd/MM/yyyy"))

In A2, run function modifies the value of GENDER field to Male or Female and changes the format of BIRTHDAY field. After the code is executed, the table sequence in A1 becomes as follows:

Relative reference

In loop computations, A[i] or ~[i] can be used to reference the member that is i records after the current member in a sequence. i can be a negative number. For example:

 

A

1

[2,5,-3,8]

2

=A1.(~[1]-~)

3

>A1.run(~=~+~[-1])

A2 generates a sequence consisting of the differences of the current member and the following one in A1’s original sequence:

That the expression in A3 starts with a > means a call to run function only modifies the original sequence but won’t return it. After execution the sequence in A1 becomes one made up of the accumulated values of the original members:

With a sequenec or a record sequence, besides using ~[i] to relatively reference a record, you can use F[i] to relatively reference the F field of a record. The latter is equal to A[i].F. For example:

 

A

1

=demo.query("select EID,NAME,SURNAME,GENDER, BIRTHDAY,STATE from EMPLOYEE")

2

=A1.sort(BIRTHDAY)

3

=A2.(interval(BIRTHDAY[-1],BIRTHDAY))

4

=A2.(interval(~[-1].BIRTHDAY,BIRTHDAY))

A2 sorts the table sequence by BIRTHDAY:

The expressions in A3 and A4 are equal. Both compute the number of days between the birth date of each employee and that of the employee who is ahead of him/her in age:

Instead, using BIRTHDAY[1] in A3’s expression represents the birth date of an employee who is behind the current employee in age. 

Similarly, you can relatively reference multiple records, rather than only one record. In this case, A[a:b], ~[a:b] are used to reference a sequence or a record sequence composed of members from the ath member to the bth member after the current one. For example:

 

A

1

=demo.query("select EID,NAME,SURNAME,GENDER, BIRTHDAY,STATE from EMPLOYEE")

2

=A1.groups(year(BIRTHDAY);count(~):Count)

3

=A2.(~[-1,1])

4

=A2.(Count[-1,1])

A2 counts the number of employees who were born in each year:

A3 selects the statistic data according to a time range, which includes each year, its previous year and its following year:

In a similar way, you can also relatively reference the fields of multiple records with F[…]. For example, A4 counts the number of employees who were born in a time range of each year, its previous year and its following year:

2.9.2 Loop functions

A loop function performs a certain kind of computation on each member of a sequence in order. It is generally written as A.f(x). Actually, the aggregate operations are loop functions too. The behavior of a loop function is specified by the function name, such as the sum function for calculating a sum and the avg function for calculating the average value, etc. Sequence-related functions introduces the general uses of the loop functions, which allow the use of fields or expressions when performing operations over a table sequence or a record sequence.

Aggregate operations

An aggregate operation loop through each member of a sequence or a table sequence or a record sequence to calculate the result according to what the function specifies. For example:

 

A

1

[2,5,-3,8]

2

=A1.sum()

3

=demo.query("select EID,NAME,SURNAME,GENDER, BIRTHDAY,STATE from EMPLOYEE")

4

=A3.avg(age(BIRTHDAY))

5

=A3.count(STATE=="California")

A2 sums up members of the sequence using A.sum() function. The result is as follows:

With A.avg() function, A4 computes the average age of all employees in the table sequence. A5 finds the number of the Californian employees. The results of A4 and A5 are as follows:

 

There are many other loop functions for performing the aggregate operations, like A.min(), A.max(), A.ranks(), A.variance(), etc.

 

A

1

[2,5,,-3,8]

2

=A1.min()

3

=A1.min@0()

The computation of A.min() returns the smallest value among members of sequence A by ignoring null values. If null values need to be taken into consideration, @0 option is required. In that case the null value is actually regarded as the smallest value. For the above example, A1, A2 and A3 get results as follows:

   

In particular, the aggregate operation in the form of [x1, x2,…xn].f() can also be written as f([x1, x2,…xn]) or f(x1, x2,…xn) on the premise that no ambiguity can be caused about it.

 

A

1

[2,5,-3,8]

2

=sum(A1)

3

=sum(2,5,-3,8)

4

=demo.query("select EID,NAME,SURNAME,GENDER, BIRTHDAY,STATE from EMPLOYEE")

5

=avg(A4.(age(BIRTHDAY)))

The results of A2 and A3 are the same as that of A2 in the previous example. And the result of A5 is the same as that of A4 in the previous example. 

Integer loop

For operations that need to specify the number of loop times, a sequence like [1,2,3,…,n] is used. to(n).f(x) ,which is called the integer loop, can be abbreviated to n.f(x). For example:

 

A

1

=10.sum()

2

=demo.query("select EID,NAME,SURNAME,GENDER, BIRTHDAY,STATE from EMPLOYEE")

3

=5.(A2(rand(500)).NAME)

A1 sums up the numbers from 1 to 10. The result is as follows:

A3 randomly selects 5 employees and lists their names, as shown below:

Nested Loop

A loop function can be used in a nested form, which means there are multiple layers of loop operations written in one expression in a format of A1.f1(A2.f2(x)). In a nested loop function, “~” and “#” represent respectively the current member and ordinal number of the innermost sequence, while the referencing of an outer sequence requires putting the sequence name before the signs, which are written as A.~ and A.#. For example:

 

A

1

=demo.query("select EID,NAME,SURNAME,GENDER, BIRTHDAY,STATE from EMPLOYEE")

2

=A1.group(year(BIRTHDAY))

3

=create(Year,Male,Female)

4

>A2.run(A3.insert(0,year(BIRTHDAY),A2.~.count(GENDER== "M"),A2.~.count(GENDER=="F")))

A2 groups the records of employees by the birth year:

A3 creates an empty table sequence for storing the computed results. A4 loops over the grouped records of each year to calculate the total number of male employees and female employees who were born in this year and insert the results into the resulting table sequence. After the code is executed, you can view the results in A3, as shown below:

Iterative functions

A loop function performs an operation by retrieving members from a sequence or a records sequence in order. An iterative function is used to perform an iterative operation during the computing process. The most basic iterative function is iterate(x,a;Gi,…) that needs to be called during the loop process. The iterative operation computes expression x circularly during which ~~ is used to call the last value and store the last result of the computing expression x. If the expression Gi,… changes during the computation, expression x will be recalculated to get the initial value according to expression a. For example:

 

A

1

=demo.query("select * from CITIES")

2

=A1.sort(STATEID, left(NAME,1),CID)

3

=A2.derive(iterate(~~+1, STATEID*100; STATEID):Code1)

4

=A3.derive(iterate(~~+1, 0; STATEID,left(NAME,1)):Code2)

Expression in A3 and A4 use the iterate function within the derive function for adding fields. In A3, the iterate function numbers the cities in a state and gets the initial value for each city by multiplying the state code by 100. A4 adds a Code2 field using the iterative function that numbers cities of a state with the same initial letter; the number begins from 0 for each loop. Here is A4’s result:

If the count is performed by accumulation and begins from 0 for each loop, the iteration operation can be expressed by the simpler seq(Gi,…). By entering the expression =A3.derive(seq(STATEID,left(NAME,1)):Code2) in A4, you can get the same result.

Besides the seq() function that number members in order when certain values remain unchanged, you can use rank(F;Gi,…) or ranki(F;Gi,…) to sort records according to changes of a specified expression F and number records having same F value with the same number. For example:

 

A

1

=demo.query("select * from CITIES")

2

=A1.sort(STATEID, left(NAME,1),CID)

3

=A2.derive(iterate(~~+1, STATEID*100; STATEID):Code1)

4

=A3.derive(seq(STATEID,left(NAME,1)):Code2)

5

=A4.derive(rank(left(NAME,1);STATEID):Rank1)

6

=A5.derive(ranki(left(NAME,1);STATEID):Rank2)

A5 uses the iterative function rank to add a Rank1 field. A6 uses the function to add Rank2 field. Below is the result of A6:

In both iterative operations in A5 and A6, the parameter Gi,… only uses STATEID. By comparing Rank1 and Rank2, you can see that, when numbering cities in a state, the rank function records a number after two same numbers by putting it off to the next number while the ranki function records the same numbers as one.

Apart from generating code, an iterative function can be used to compute acumulative sum. For example:

 

A

1

=to(6)

2

=A1.(iterate(~~+~*~))

3

=A1.(cum(~*~))

4

=to(10).iterate(~~+~*~,0,~>6)

A2 calculates the the accumulative sum of squares for A1’s numbers. Here’s the result:

The iterative accumulation beginning from 0 can be simply expressed by cum function. A3 gets the same result as A2 does. To get only the last accumulative result, use A.iterate(x,a,c) function. A4 calculates the accumulative sum of squares iteratively until the current member of the given sequence is greater than 6. Here’s A4’s result: