• esProc
Tutorial
Function Reference
Sample Program
Code Reference
User Reference
External Library Guide
Data File Tool Manual
DQL Tutorial
Cluster Server Manager Manual
SPL Command Executor Manual
• YModel
User Reference
JSON-style Parameter Guide
• Official Website

# groups()

Here’s how to use groups() function.

## A .groups()

Description:

Group a table sequence and then get the aggregate result cumulatively.

Syntax:

A.groups(x:F,…;y:G,…)

Note:

The function groups and aggregates table sequence A simultaneously by one or multiple fields/expressions to generate a new table sequence with F,.. G… as the fields. Namely, during the traversal through members of A, they will be placed to the corresponding result set one by one while a result set is aggregated cumulatively. Compared with the method of first grouping and then aggregation represented by A.group(x:F,…;y:G,…) function, the function has a better performance.

Options:

 @o Group records by comparing adjacent ones, which is equal to the merging operation, and the result set won’t be sorted. @n x gets assigned with group numbers which can be used to define the groups. @n and @o are mutually exclusive. @u Do not sort the result set by x. It doesn’t work with @o/@n @i x is a Boolean expression. If the result of x is true, then start a new group. There is only one x. @m Use parallel algorithm to handle data-intensive or computation-intensive tasks; no definite order for the records in the result set; can’t be used with @o and @i options. @0 Discard the group over which the result of grouping expression x is null @h Used over a grouped table with each group ordered to speed up grouping @t Return an empty table sequence with data structure if the grouping and aggregate operation over the sequence returns null

Parameters:

 A A sequence. x Grouping expression. F Field name of the result table sequence. y y is the function with which A is traversed. When y is an aggregate function, it only supports sum/count/max/min/top/avg/iterate/icount/median/maxp/minp. When the function work with iterate(x,a;Gi,…) function, the latter’s parameter Gi should be omitted. When y isn’t an aggregate function, perform calculation over only the first record in each group. G Summary field name in the result table sequence.

Return value:

Post-grouping table sequence.

Example:

 A 1 =demo.query("select * from SCORES where CLASS = 'Class one'") 2 =A1.groups(STUDENTID:StudentID;sum(SCORE):TotalScore) group by a single field. 3 =demo.query("select * from SCORES") 4 =A3.groups(CLASS:Class,STUDENTID:StudentID;sum(SCORE):TotalScore) group by multiple fields. 5 =A3.groups@m(STUDENTID:StudentID;sum(SCORE):TotalScore) Use @m option to increase performance of big data handling. 6 =A3.groups@o(STUDENTID:StudentID;sum(SCORE):TotalScore) Only compare and merge with the neighboring element, and the result set is not sorted. 7 =demo.query("select * from STOCKRECORDS where STOCKID<'002242'") 8 =A7.groups@n(if(STOCKID=="000062",1,2):StockID;sum(CLOSING):TotalPrice) The value of x is the group sequence number. 9 =demo.query("select * from EMPLOYEE") 10 =A9.groups@u(STATE:State;count(STATE):TotalScore) Do not sort result set by the sorting field. 11 =A9.groups@i(STATE=="California":IsCalifornia;count(STATE):count) Start a new group when STATE=="California". 12 =A3.groups(CLASS:Class,STUDENTID:StudentID;iterate(~~*2,10): Score1) Perform iterate operation within each group 13 =file("D:\\Salesman.txt").import@t() 14 =A13.groups@0(Gender:Gender;sum(Age):Total) Discard groups where Gender values are nulls 15 =file("D:/emp10.txt").import@t() For data file emp10.txt, every 10 records are ordered by DEPT 16 =A15.groups@h(DEPT:dept;sum(SALARY):bouns) A15 is grouped and ordered by DEPT, for which @h option is used to speed up grouping 17 =A4.groups(STUDENTID:StudentID;SUBJECT,sum(SCORE):SUMSCORE) Parameter y isn’t an aggregate function, so the function performs operation over the first record; the result is English 18 =demo.query("select * from SCORES where CLASS = 'Class three'") Return an empty table sequence 19 =A18.groups@t(STUDENTID:StudentID;sum(SCORE):TotalScore) Return an empty table sequence with the data structure Related functions:

A.group(x:F,...;y:G,…)

## ch. groups()

Description:

Group records in a channel.

Syntax:

ch.groups(x:F,…;y:G…;n)

Note:

The function groups records in channel ch according to grouping expression x, by which the records are ordered, to get a channel having F,...G,… fields.

Sort records in the new channel by x. Values of G field are the results of computing expression y over each group. It aims to fetch the grouping result set from the channel.

Options:

 @n With the option the value of expression x is a group number, which points to the desired group @u Won’t sort the resulting set by expression x; the option and @n are mutually exclusive

Parameters:

 ch Channel x Grouping expression, by which an aggregation over the whole grouped set is performed if x:F is omitted. In that case the semicolon “;” should not be omitted F Field names of the resulting table sequence y An aggregate function on channel ch, which only supports sum/count/max/min/top /avg/iterate; the parameter Gi should be given up if function iterate(x,a;Gi,…) is used G The aggregate fields in the resulting table sequence

Return value:

Channel

Example:

 A 1 =demo.cursor("select * from EMPLOYEE ") 2 =channel() Create a channel 3 =channel() Create a channel 4 =channel() Create a channel 5 =channel() Create a channel 6 =A1.push(A2,A3,A4,A5) Be ready to push the data in A1’s cursor into channel A2, A3 and A5, but the action needs to wait. 7 =A2.groups(;sum(SALARY):TotalSalary) As x:F is omitted, calculate the sum of salaries of all employees 8 =A3.groups(DEPT:dept;sum(SALARY):TotalSalary) Group and sort records by DEPT field 9 =A4.groups@n(if(GENDER=="F",1,2):SubGroups;sum(SALARY):TotalSalary) The value of x is group number; put records where GENDER is “F” into the first group and others into the second group, and then aggregate each group 10 =A5.groups@u(STATE:State;count(STATE):count) Won’t sort the resulting set by grouping field 11 A1.select(month(BIRTHDAY)==2) 12 A11.fetch() Attach a fetch operation to A11’s cursor 13 =A2.result() 14 =A3.result() 15 =A4.result() 16 =A5.result() ## cs. groups()

Description:

Group records in a cluster cursor, sort them by the grouping field and peform aggregation over each group and add each aggregate to the result set.

Syntax:

cs.groups(x:F,…;y:G…;n)

Note:

The function groups records in a cluster cursor by expression x, sorts result by the grouping field, and calculates the aggregate value on each group.

This creates a new table sequence consisting of fields F,...G,… and sorted by the grouping field x.The G field gets values by computing y on each group.

Options:

 @c Perform the group operation over data in every node and compose the result sets into a cluster memory table in the segmentation way of the cursor; suppport a cluster dimension table

Parameters:

 cs Records in a cluster cursor x Grouping expression; if omitting parameters x:F, aggregate the whole set; in this case, the semicolon “;” must not be omitted F Field name in the result table sequence y An aggregate function on cs, which only supports sum/count/max/min/top /avg/iterate; when the function works with iterate(x,a;Gi,…) function, the latter’s parameter Gi should be omitted G Aggregate field name in the result table sequence n The specified maximum number of groups; stop executing the function when the number of data groups is bigger than n to prevent memory overflow; the parameter is used in scenarios when it is predicted that data will be divided into a large number of groups that are greater than n.

Return value:

A table sequence/cluster memory table

Example:

 A 1 =file("emp1.ctx","192.168.0.111:8281") Below is emp1.ctx: 2 =A1.open() Open a cluster composite table 3 =A2.cursor() Return a cluster cursor 4 =A3.groups(Dept:dept;count(Name):count) Group data by DEPT and perform aggregation A 1 [192.168.0.110:8281,192.168.18.143:8281] 2 =file("emp.ctx":[1,2], A1) 3 =A2. open () Open a cluster composite table 4 =A3.cursor() Create a cluster cursor 5 =A4.groups(GENDER:gender;sum(SALARY):totalSalary) Group data by GENDER and perform aggregation and return result as a table sequence 6 =A3.cursor() 7 =A6.groups@c(GENDER:gender;sum(SALARY):totalSalary).dup() Retain the way of segmentation of the distributed cursor and return a cluster memory table Related functions:

A.group(xi,…)

A.groups()

## cs.groups()

Description:

Group records in a cursor.

Syntax:

cs.groups(x:F,…;y:G…;n)

Note:

The function groups records in a cluster cursor by expression x, sorts result by the grouping field, and calculates the aggregate value on each group. This creates a new table sequence consisting of fields F,...G,… and sorted by the grouping field x.The values of F field are the value of x field of the first record in each group and G field gets values by computing yon each group. The aggregation over a cluster cursor will first be performed by the main process on the local machine and the result will then be returned to the machine that initiates the invocation; the process is called reduce.

Options:

 @n The value of grouping expression is group number used to locate the group; you can use n to specify the number of groups and generate corresponding number of zones first. @u Do not sort the result set by the grouping expression; it doesn’t work with @n @o Compare each record only with its neighboring record to group, which is equivalent to the merge operation, and won’t sort the result set @i With this option, the function only has one parameter x that is a boolean expression; start a new group if its result is true @h Used over a grouped table with each group ordered to speed up grouping

Parameters:

 cs Records in a cursor x Grouping expression; if omitting parameters x:F, aggregate the whole set; in this case, the semicolon “;” must not be omitted F Field name in the result table sequence y An aggregate function on cs, which only supports sum/count/max/min/top/avg/iterate; when the function works with iterate(x,a;Gi,…) function, the latter’s parameter Gi should be omitted G Aggregate field name in the result table sequence n The specified maximum number of groups; stop executing the function when the number of data groups is bigger than n to prevent memory overflow; the parameter is used in scenarios when it is predicted that data will be divided into a large number of groups that are greater than n; it doesn’t work with @n option. Support a multicuror when this parameter is absent.

Return value:

Post-grouping table sequence

Example:

 A 1 =demo.cursor("select * from SCORES where CLASS = 'Class one'") 2 =A1.groups(;sum(SCORE):TotalScore) As parameters x:F absent, calculate the total score of all students 3 =demo.cursor("select * from FAMILY") 4 =A3.groups(GENDER:gender;sum(AGE):TotalAge) Group and order data by specified fields 5 =demo.cursor("select * from STOCKRECORDS where STOCKID<'002242'") 6 =A5.groups@n(if(STOCKID=="000062",1,2):SubGroups;sum(CLOSING):ClosingPrice) The value of grouping expression is group number; put records whose STOCKID is “000062” to the first group and others to the second group; and meanwhile aggregate each group 7 =demo.cursor("select * from EMPLOYEE") 8 =A7.groups@u(STATE:State;count(STATE):Total) The result set won’t be sorted by the grouping field 9 =demo.cursor("select * from EMPLOYEE") 10 =A9.groups@o(STATE:State;count(STATE):Total) Compare each record with its next neighbor and won’t sort the result set 11 =demo.cursor("select * from EMPLOYEE") 12 =A11.groups@i(STATE=="California":IsCalifornia;count(STATE):count) Start a new group if the current record meets the condition STATE=="California" 13 =file("D:/emp10.txt").cursor@t() For data file emp10.txt, every 10 records are ordered by DEPT 14 =A13.groups@h(DEPT:DEPT;sum(SALARY):bouns) As A13 is grouped and ordered by DEPT, use @h option to speed up grouping 15 =demo.query("select  * from employee") 16 =A15.cursor@m(3) Return a multicursor 17 =A16.groups(STATE:state;sum(SALARY):salary) Group the multicursor with groups function ## T.groups()

Description:

Group records in a pseudo table.

Syntax:

T. groups(x:F,…;y:G…;n)

Note:

The function groups records in a pseudo table by expression x, sorts result by the grouping field, and perform an aggregate operation on each group.

This creates a new table sequence consisting of fields F,...G,… and sorted by the grouping field x. G field gets values by computing aggregate function y on each group.

Options:

 @n The value of grouping expression is group number used to locate the group; you can use n to specify the number of groups and generate corresponding number of zones first @u Do not sort the result set by the grouping expression; it doesn’t work with @n @o Compare each record only with its neighboring record to group, which is equivalent to the merge operation, and won’t sort the result set @i With this option, the function only has one parameter x that is a Boolean expression; start a new group if its result is true @h Used over a grouped table with each group ordered to speed up grouping @b Enable returning a result set containing aggregates only without group-level data @v Store the composite table in the column-wise format when loading it the first time, which helps to increase performance

Parameters:

 T A pseudo table x Grouping expression; if omitting parameters x:F, aggregate the whole set; in this case, the semicolon “;” must not be omitted F Field name in the result table sequence y An aggregate function on T, which only supports sum/count/max/min/top/avg/iterate; when the function works with iterate(x,a;Gi,…) function, the latter’s parameter Gi should be omitted G Aggregate field name in the result table sequence n The specified maximum number of groups; stop executing the function when the number of data groups is bigger than n to prevent memory overflow; the parameter is used in scenarios when it is predicted that data will be divided into a large number of groups that are greater than n

Return value:

A pseudo table

Example:

 A 1 =create(file).record(["D:/file/pseudo/empT.ctx"]) 2 =pseudo(A1) Generate a pseudo table 3 =A2.groups(DEPT:dept;avg(SALARY):AVG_SALARY) Group records in A2’s pseudo table by DEPT field, calculate average of SALARY in each group – which is the aggregate method, and return result as a table sequence made up of dept field, and AVG_SALARY field and sorted by dept 4 =A2.groups@n(if(GENDER=="F",1,2):GenderGroup;avg(SALARY):AVG_SALARY) Devide recrods of A2’s pseudo table into two groups according to whether GENDER is F, and calculate average SALARY in each group 5 =A2.groups@u(DEPT:dept;avg(SALARY):AVG_SALARY) With @u option, the grouping result won’t be sorted 6 =A2.groups@o(DEPT:dept;avg(SALARY):AVG_SALARY) Withe @o option, compare each record with its next neighbor and won’t sort the result set 7 =A2.groups@i(GENDER=="M":isMAN;count(EID):Count) Start a new group if the current record meets the condition GENDER=="M" 8 =A2.groups@b(DEPT:dept;avg(SALARY):AVG_SALARY) With @b option, return only a column of aggregates without group data 