A .groups()

Description:

Group a table sequence and then get the aggregate result cumulatively.

Syntax:

A.groups(x:F,…;y:G,…)

Note:

The function groups and aggregates table sequence A simultaneously by one or multiple fields/expressions to generate a new table sequence with F,.. G… as the fields. Namely, during the traversal through members of A, they will be placed to the corresponding result set one by one while a result set is aggregated cumulatively. Compared with the method of first grouping and then aggregation, the function has a better performance.

Options:

@o

Group records by comparing adjacent ones, which is equal to the merging operation, and the result set won’t be sorted.

@n

x gets assigned with group numbers which can be used to define the groups. @n and @o are mutually exclusive.

@u

Do not sort the result set by x. It doesn’t work with @o/@n

@i

x is a bool expression. If the result of x is true, then start a new group. There is only one x.

@m

Use parallel algorithm to handle data-intensive or computation-intensive tasks; no definite order for the records in the result set; can’t be used with @oi options.

@0

Discard the group over which the result of grouping expression x is null

@h

Used over a grouped table with each group ordered to speed up grouping

Parameters:

A

A sequence.

x

Grouping expression.

F

Field name of the result table sequence.

y

y is the function with which A is traversed. When y is an aggregate function, it only supports sum/count/max/min/top/avg/iterate/icount/median. When the function work with iterate(x,a;Gi,…) function, the latter’s parameter Gi should be omitted. When y isn’t an aggregate function, perform calculation over only the first record in each group.

G

Summary field name in the result table sequence.

Return value:

Post-grouping table sequence.

Example:

 

A

 

1

=demo.query("select * from SCORES where CLASS = 'Class one'")

2

=A1.groups(STUDENTID:StudentID;sum(SCORE):TotalScore)

group by a single field.

3

=demo.query("select * from SCORES")

 

4

=A3.groups(CLASS:Class,STUDENTID:StudentID;sum(SCORE):TotalScore)

group by multiple fields.

5

=A3.groups@m(STUDENTID:StudentID;sum(SCORE):TotalScore)

Use @m option to increase performance of big data handling.

6

=A3.groups@o(STUDENTID:StudentID;sum(SCORE):TotalScore)

Only compare and merge with the neighboring element, and the result set is not sorted.

7

=demo.query("select * from STOCKRECORDS where STOCKID<'002242'")

 

8

=A7.groups@n(if(STOCKID=="000062",1,2):StockID;sum(CLOSING):TotalPrice)

The value of x is the group sequence number.

9

=demo.query("select * from EMPLOYEE")

 

10

=A9.groups@u(STATE:State;count(STATE):TotalScore)

Do not sort result set by the sorting field.

11

=A9.groups@i(STATE=="California":IsCalifornia;count(STATE):count)

Start a new group when STATE=="California".

12

=A3.groups(CLASS:Class,STUDENTID:StudentID;iterate(~~*2,10): Score1)

Perform iterate operation within each group

13

=file("D:\\Salesman.txt").import@t()

14

=A13.groups@0(Gender:Gender;sum(Age):Total)

Discard groups where Gender values are nulls

15

=file("D:/emp10.txt").import@t()

For data file emp10.txt, every 10 records are ordered by DEPT

16

=A15.groups@h(DEPT:dept;sum(SALARY):bouns)

A15 is grouped and ordered by DEPT, for which @h option is used to speed up grouping

17

=A4.groups(STUDENTID:StudentID;SUBJECT,sum(SCORE):SUMSCORE)

Parameter y isn’t an aggregate function, so the function performs operation over the first record; the result is English

Note:

Compared with A.group(x:F,…;y:G,…), A.groups(x:F,…;y:G,…) computes in a accumulative way and thus gets a better performance.

Related functions:

A.group(xi,)

A.group(x:F,...;y:G,…)