Dataset querying functions

Read(612) Label: dataset querying, optional,

This section introduces dataset querying functions used for report making by listing their descriptions, syntax, parameters, return values and options, and giving related examples.

Note: A function parameter enclosed by braces {} is optional.

avg()

Description:

Perform filtering to get a set of records meeting the specified condition from the current set of record rows in the data set and compute average of a given field or expression.

Syntax:

datasetName.avg(selectExp{,filterExp})

Parameter:

selectExp

A field or an expression on which the average is computed; numeric type

filterExp

A filter condition expression

Return value:

A real number

Option:

@r

A root data set expression or not; a root data set expression represents data in the outermost layer

@a

Perform COUNT on the data set, including null values

@b

Use binary search to speed up computation; this requires that data be ordered by the field(s) specified in filterExp and == be used in it. In the filter condition expression, there can be one or more fields and && are used to concatenate them; there is the  dataset-related expression to the left of the equal sign and there is the value to the right of it

Example:

Example 1: ds1.avg(score,gender=="F")

Get a set of records where gender is "F" from the current set of rows in data set ds1 and compute average of score field values

Example 2: ds1.avg(quantity*price)

Compute average of expression quantity*price on the current set of rows in data set ds1

Example 3: ds1.avg@a(score)

Compute average of score field values on the current set of rows in data set ds1, counting records with null score values

Example 4: ds1.avg@r(score)

Compute average of score field values on the set of record rows in the root data set

Example 5: ds1.avg@b(score,gender=="F")

Same requirement as Example 1; but it uses binary search to speed up the computation

bfind()

Description:

Use binary search to get the record(s) meeting the specified filter condition from a data set according to the order of records and return values of the specified field or expression.

The function is equivalent to select(), but it uses a different algorithm; the binary search is suitable for perform computations on records of a data set.

It is faster than select() when records are already ordered by the reference field.

Syntax:

datasetName.bfind(selectExp,referExp1:referValueExp1:referOrderExp1{,referExp2:referValueExp2:referOrderExp2{....}})

Parameter:

selectExp

A selected field or its expression

referExp1

The reference field expression

referOrderExp1

Data order for the reference field expression - >0 for ascending order and <0 for descending order

referValueExp1

The reference field value expression; once a record having same value as this parameter is found, return the record’s selectExp value

......

There can be multiple reference fields; in that case, find the record where the multiple reference fields match the reference field values to return

Return value:

Indefinite data type, which is determined by the result of computing selectExp

Option:

@r

A root data set expression or not; a root data set expression represents data in the outermost layer

Example:

Example 1: ds1.bfind(name,id:value():1)

Use binary search to find the record whose id field is the same as the current cell value from data set ds1 and return the record’s name field value

Example 2: ds1.bfind(name,id:value():1;class:A1:1; gender:B1:-1)

Use binary search to find the record where id field is the same as the current cell value, class field is the same as A1 and gender field is the same as B1 from data set ds1 and return the record’s name field value. Note: the order of the three field expression conditions should be consistent with the sorting order in the data set – that is, the data set is first ordered by id ascendingly, then by class ascendingly, and then by gender descendingly.

Example 3: ds1.bfind@r(NAME,EID:2:1)

Use binary search to find the record whose EID is 2 from the root data set ds1 and return the record’s NAME field value.

count()

Description:

Calculate the number of records meeting the specified condition in the current set of record rows in a data set.

Syntax:

datasetName.count({filterExp})

Parameter:

filterExp

A filter condition, which can be absent when all records need to be selected

Return value:

Integer type

Option:

@r

A root data set expression or not; a root data set expression represents data in the outermost layer

@b

Use binary search to speed up computation; this requires that data be ordered by the field(s) specified in filterExp and == be used in it. In the filter condition expression, there can be one or more fields and && are used to concatenate them; there is the dataset-related expression to the left of the equal sign and there is the value to the right of it

Example:

Example 1: ds1.count()

Count all records in the current set of record rows in data set ds1 and return the number 

Example 2: ds1.count(true)

Same function as the expression in Example 1 but slower than ds1.count(); recommend that parameter true is omitted when all records need to be selected

Example 3: ds1.count(quantity>500)

Get and count records where quantity>500 in the current set of record rows in data set ds1 and return the number

Example 4: ds1.count@r(quantity>500)

Count records where quantity>500 in the set of root record rows

Example 5: ds1.count@b(quantity==500)

Count records where quantity>500 in the current set of record rows

Here @b option is used to enable binary search for speeding up the computation

dcount()

Description:

Calculate the number of records meeting the specified condition in the current set of record rows in a data set, treating records having same selected field value as one record.

Syntax:

datasetName.dcount(<select_exp>{,filterExp})

Parameter:

select_exp

Name or ordinal number of the field to be selected, or an expression

Field number is represented by #n, such as #0 for column 0, #1 for column 1, and so on

filterExp

A filter condition, which can be absent when all records need to be selected

Return value:

Integer type

Option:

@r

A root data set expression or not; a root data set expression represents data in the outermost layer

@b

Use binary search to speed up computation; this requires that data be ordered by the field(s) specified in filterExp and == be used in it. In the filter condition expression, there can be one or more fields and && are used to concatenate them; there is the dataset-related expression to the left of the equal sign and there is the value to the right of it

Example:

Example 1: ds1.dcount(name)

Get all name field values from the current set of record rows in data set ds1, treating records having same name field value as one record

Example 2: ds1.dcount(#5)

Get all values of the 5th column from the current set of record rows in data set ds1, treating records having same value in the 5th column as one record

Example 3: ds1.dcount( name,name!=”Ryan”)

Get all values of the field where name is not Ryan from the current set of record rows in data set ds1, treating records having same name field value as one record

Example 4: ds1.dcount@r( name )

Get all name field values from root data set ds1, treating records having same name field value as one record

Example 5: ds1.dcount@b( name,name==”Ryan”)

Get all values of the field where name is Ryan from the current set of record rows in data set ds1, treating records having same name field value as one record; here binary search is used to speed up the computation

fcount()

Description:

Get the number of columns in a data set.

Syntax:

datasetName.fcount()

Return value:

Integer type

Example:

Example 1: ds1.fcount()  Get the number of columns in data set ds1; result is integer type

field()

Description:

Get data of the specified column in the current row of a data set or the first row of the current set of rows in a data set.

Syntax:

datasetName.field(fieldNameExp)

datasetName.field( filedNoExp)

Parameter:

fieldNameExp

A string, whose content is the expression of data set column name; the parameter is suitable for getting a column whose name contains special characters

filedNoExp

An integer representing the ordinal number of a data set column, such as 2 for the 2nd column and 0 for the row header column

Return value:

A column object of a data set, which can be used only as an element in an expression but not as the final value of a cell

Note:

The function has low performance, make sure you think twice before using it.

Example:

Example 1: Enter expressions into a report as follows:

 

A

B

1

 

= to(1,ds2.fcount())

2

 

= ds2.fname(B1)

3

= ds2.select(#0)

= ds2.field(B2)

Set expanding mode for B1 as horizontal and for A3 as vertical. For this report, B1 expands horizontally according to the number of columns in data set ds2, A3 expands vertically according to the number of rows in this data set, and B2 and B3 dynamically obtain column names and column objects and calculate column values. In this way, a report with dynamic columns will be achieved.

Preview the report and we get this:

   

Example 2: ds1.field(“DEPT”)  Return data corresponding to DEPT field in the current row in data set ds1

Example 3: ds1.field(3)  Return data corresponding to the 3rd column in the current row in data set ds1

fname()

Description:

Get a column name in a data set according to the column number.

Syntax:

datasetName.fname(fieldNoExp)

Parameter:

fieldNoExp

The data set column number expression

Return value:

Character type

Example:

Example 1: ds1.fname(3)  Return "score"

Which means that name of the 3rd column of data set ds1 is "score"

 

Example 2: Enter expressions into a report as follows:

 

A

B

1

 

= to(1,ds2.fcount())

2

 

= ds2.fname(B1)

3

= ds2.select(#0)

= ds2.field(B2)

Set expanding mode for B1 as horizontal and for A3 as vertical. For this report, B1 expands horizontally according to the number of columns in data set ds2, A3 expands vertically according to the number of rows in this data set, and B2 and B3 dynamically obtain column names and column objects and calculate column values. In this way, a report with dynamic columns will be achieved.

Preview the report and we get this:

group()

Description:

Get a grouped subset according to the grouping expression.

Syntax:

datasetName.group(<selectExp:order1>{,filterExp};;{ groupSortExp }:{ groupOrderExp })

// Fit for computing scenarios where sorting is not needed or data is already sorted. The function is the optimization of the following syntax, where the grouping field and the sorting field is same field.

datasetName.group(<selectExp>{,filter_exp};{sort1}{:order1},…;{ groupSortExp }:{ groupOrderExp })

Parameter:

selectExp

Expression of the selected field, which can be a field name, field number or an expression; field number is represented by #n, such as #0 for column 0, #1 for column 1, and so on

filterExp

The filter expression

sort1

The sorting expression by which records are ordered before they are grouped

order1

The sorting direction in which records are sorted before they are grouped; do not sort when the parameter is absent, >0 for ascending order and <0 for descending order; represent ascending order when order1,..,ordern are omitted

groupSortExp

The sorting expression by which groups are sorted after records are grouped; generally, it is an aggregate expression, such as sum on each group

groupOrderExp

The sorting direction in which groups are ordered; >0 for ascending order and <0 for descending order

Return value:

A set of data in one group; the data type is determined by the result of computing expression selectExp

Option:

@r

A root data set expression or not; a root data set expression represents data in the outermost layer

@b

Use binary search to speed up computation; this requires that data be ordered by the field(s) specified in filterExp and == be used in it. In the filter condition expression, there can be one or more fields and && are used to concatenate them; there is the dataset-related expression to the left of the equal sign and there is the value to the right of it

Example:

Example 1: ds1.group(class)

Group all records of data set ds1 by class field by putting neighboring ones having same value in a certain field and without sorting, and return a set consisting of class values of every group

Example 2: ds1.group(class:-1)

Sort all records of data set ds1 by class field in descending order, group them, and return a set consisting of class values of every group

Example 3: ds1.group(class, gender =='F'; class)

Get records where gender is "F" from data set ds1, sort them by class field in ascending order and group them, and return a set consisting of class field values in every group; sort records in ascending order when parameter order1 is omitted

Example 4: ds1.group(class, gender =='F'; id:-1)

Get records where gender is "F" from data set ds1, sort them by id field in descending order, group them by class, and return a set consisting of class field values in every group

Example 5: ds1.group(state;state:1; ds1.sum(indProduction):-1)

Group data set ds1 by state, calculate sum of industrial productions in each group, and sort groups by industrial production sum in descending order

Example 6: ds1.group@r(GENDER;GENDER;ds1.sum(SALARY))

Group root data set ds1 by GENDER, calculate sum of SALARY values in each group, and sort groups by salary sum in ascending order

Example 7: ds1.group@b(class, gender =='F'; class)

Same function as Example 3, but it uses @b option to enable binary search for speeding up the computation

isfield()

Description:

Check whether the specified column name exists in a data set.

Syntax:

datasetName.isfield( stringExp )

Parameter:

stringExp

The column name expression to be checked

Return value:

Boolean type

Example:

Example1: ds1.isfield("score")  Return true, meaning that column name score exists in ds1

max()

Description:

Obtain a set of records meeting the specified condition from the current set of record rows in a data set and get the maximum value of the given field or an expression.

Syntax:

datasetName.max(selectExp{,filterExp})

Parameter:

selectExp

The field or expression whose maximum value will be obtained; to use a column number, represent it using #n, such as #0 for column 0, #1 for column 1, and so on

filterExp

A filter expression

Return value:

The maximum value in a set, whose data type is indefinite and determined by the result of computing expression selectExp

Option:

@r

A root data set expression or not; a root data set expression represents data in the outermost layer

@b

Use binary search to speed up computation; this requires that data be ordered by the field(s) specified in filterExp and == be used in it. In the filter condition expression, there can be one or more fields and && are used to concatenate them; there is the dataset-related expression to the left of the equal sign and there is the value to the right of it

Example:

Example 1: ds1.max(score)

Get the maximum score field value in the current set of rows in data set ds1

Example 2: ds1.max(score, gender =="F")

Get the maximum score field value in records meeting the condition gender =="F" from in the current set of rows of data set ds1

Example 3: ds1.max(#3)

Get the maximum value of the 3rd column in the current set of rows in data set ds1

Example 4: ds1.max@r(#3)

Get the maximum value of the 3rd column in the current set of rows in root data set ds1

Example 5: ds1.max@b(score, gender =="F")

Same function as Example 2, but it uses @b option to enable binary search for speeding up the computation

min()

Description:

Obtain a set of records meeting the specified condition from the current set of record rows in a data set and get the minimum value of the given field or an expression.

Syntax:

datasetName.min(selectExp{,filterExp})

Parameter:

selectExp

The field or expression whose maximum value will be obtained; to use a column number, represent it using #n, such as #0 for column 0, #1 for column 1, and so on

filterExp

The filter expression

Return value:

The minimum value in a set, whose data type is indefinite and determined by the result of computing expression selectExp

Option:

@r

A root data set expression or not; a root data set expression represents data in the outermost layer

@b

Use binary search to speed up computation; this requires that data be ordered by the field(s) specified in filterExp and == be used in it. In the filter condition expression, there can be one or more fields and && are used to concatenate them; there is the dataset-related expression to the left of the equal sign and there is the value to the right of it

Example:

Example 1: ds1.min(score)  Get the minimum score field value in the current set of rows in data set ds1

Example 2: ds1.min(score, gender =="F")  Get the minimum score field value in the records meeting the condition gender =="F" from in the current set of rows of data set ds1

Example 3: ds1.min(#3)  Get the minimum value of the 3rd column in the current set of rows in data set ds1

Example 4: ds1.min@r(#3)  Get the minimum value of the 3rd column in the current set of rows in root data set ds1

Example 5: ds1.min@b(score, gender =="F")  Same function as Example 2, but it uses @b option to enable binary search for speeding up the computation

plot()

Description:

Group data according to intervals of values, which is also known as “group by segment”.

Syntax:

datasetName.plot(valueExp,listExp{,filterExp})

Parameter:

valueExp

A value expression for grouping

listExp

A list of intervals of values, which are arrays having same data type as valueExp and where elements should be arranged in ascending order

filterExp

The filter expression

Return value:

An integer array representing the order of groups after records are grouped according to the intervals of values, such as list(0,1,2,3), which represents the 1st group, the 2nd group, the 3rd group and the 4th group. So, you need to define a display value expression after defining data value expression for cells, such as fail for the 1st group and pass for the 2nd group.

Option:

@r

A root data set expression or not; a root data set expression represents data in the outermost layer

@c

The threshold value only belongs to the first corresponding interval when comparing with each element and same threshold values do not belong to the 2nd corresponding interval. By default, the threshold value in an interval of values only belongs to the second corresponding interval

@z

Order an array returned from listExp in descending order; default is ascending order

@b

Use binary search to speed up computation; this requires that data be ordered by the field(s) specified in filterExp and == be used in it. In the filter condition expression, there can be one or more fields and && are used to concatenate them; there is the dataset-related expression to the left of the equal sign and there is the value to the right of it

Example:

Example 1: ds1.plot(Score,list(60,80,90), Subject=="Math")  Return list(0,1,2,3)  Perform filtering on data set ds1 to get records where Subject is Math and group them by intervals of scores – <60,≥60 & ≤ 79, ≥80 & ≤ 89 and ≥90, while defining display value expression as map(list(0,1,2,3),list("Fail","Pass","Good","Excellent"))

Example 2: ds1.plot@c(Score,arg1)  arg1 is a parameter whose data type is integer array and whose value is "60,70,80,90"; the display value expression is map(to(0,count(arg1)),arg2), where arg2 is a string array whose value is {"Fail","Pass","Satisfactory","Good","Excellent"}

In this example, we dynamically control intervals of values for grouping and display value in the result set through parameters

Example 3: ds1.plot@b(Score,list(60,80,90), Subject=="Math")  Same function as Example 1, but it uses @b option to enable binary search for speeding up the computation

select()

Description:

Get records meeting the specified condition from the current set of record rows in a data set. When a parameter or a cell name is namesake with a column name, add the data set name to define it.

Syntax:

datasetName.select(<select_exp>{: order_exp},{ filter_exp },{num_exp},{distinct_exp})

datasetName.select(<select_exp>{, filter_exp },{ num_exp },{distinct_exp}; {sort1_exp}:{order1_exp},….) // Fit for scenarios where sorting is not needed or data is already sorted

Parameter:

select_exp

To-be-selected column name, column number or column expression; to use a column number, represent it using #n, such as #0 for column 0, #1 for column 1, and so on

order _exp

Specified order in which data will be sorted; <0 for descending order and >0 for ascending order

filter_exp

A filter expression; when all fields will be selected, it can be omitted but its place needs to be held by “,”

num_exp

The first n records to be retrieved rom the query result

distinct_exp

The to-be-filtered field for which distinct values will be returned

sort1_exp

The sorting expression; when the parameter is absent, first check whether order_exp is absent or not; if order_exp is absent, do not sort, otherwise sort data by select_exp

Return value:

A set of records, whose data type is determined by the result of computing select_exp

Option:

@r

A root data set expression or not; a root data set expression represents data in the outermost layer

@z

Get data backwards; return all data when num_exp is absent

@b

Use binary search to speed up computation; this requires that data be ordered by the field(s) specified in filter_exp and == be used in it. In the filter condition expression, there can be one or more fields and && are used to concatenate them; there is the dataset-related expression to the left of the equal sign and there is the value to the right of it

Example:

Example 1: ds1.select(name)

Get all name field values from data set ds1 without sorting

Example 2: ds1.select(ds1.F1)

Get all values of F1 field (F1 is a field column name) from data set ds1; since the to-be-selected field name and a column name are namesake, use data set name to define it

Example 3: ds1.select( #2:-1)

Get all values of the 2nd field column from data set ds1 and sort them in descending order

Example 4: ds1.select(name:1, gender =='M')

Get values of name field whose corresponding gender is male from data set ds1 and sort them in ascending order

Example 5: ds1.select(name:-1, gender =='M',50, id )

Remove duplicate id field values from data set ds1, get 50 name field values whose corresponding gender is male, and sort them by name in descending order

Example 6: ds1.select(int(EMPID), gender =='M',50,id;BONUS:1)

Remove duplicate id field values from data set ds1, sort the data set by BONUS in ascending order, and get 50 EMPID field values whose corresponding gender is male

Example 7: ds1.select@r(#9:-1,SALARY>=10000)

Get values of the 9th column in records where SALARY is greater than 10000 from root data set ds1, and sort them in descending order

Example 8: ds1.select@z(#9:1,SALARY>=10000,3)

Get the last 3 values for the 9th column in records where SALARY is greater than 10000 from data set ds1, and sort them in ascending order

Example 9: ds2.select@b(amount,c_code==A3)

Use binary search to get amount field values whose corresponding c_code is A3 from data set ds2, during which code field is ordered

sum()

Description:

Get records meeting the specified condition from the current set of record rows in a data set, and calculate sum on a given field or expression.

Syntax:

datasetName.sum(selectExp{,filterExp})

Parameter:

selectExp

A field or field expression on which sum will be computed; numeric type

filterExp

A filter expression

Return value:

A real number

Option:

@r

A root data set expression or not; a root data set expression represents data in the outermost layer

@b

Use binary search to speed up computation; this requires that data be ordered by the field(s) specified in filterExp and == be used in it. In the filter condition expression, there can be one or more fields and && are used to concatenate them; there is the dataset-related expression to the left of the equal sign and there is the value to the right of it

Example:

Example 1: ds1.sum(quantity)

Calculate sum on quantity field in the current set of rows in data set ds1

Example 2: ds1.sum(quantity,productid==1)

Get a set of records meeting the condition productid==1 from the current set of record rows in data set ds1, and calculate sum on quantity field

Example 3: ds1.sum@r(score)

Calculate sum on score field in the current set of record rows in the root data set

Example 4: ds1.sum@b(quantity,productid==1)

Same function as Example 2, but it uses @b option to enable binary search for speeding up the computation