Complex computations

Read(668) Label: filter, sort, group,

After the data source is opened and accessed, let~{!/~}s move on to look at how to further compute data with a series of operations such as filtering, sorting and grouping.

Filter

The operation filters data by one or more fields. It can be used to select certain records according to a specified filtering expression.

Example: Filter data in orders2020.txt according to the condition Amount > 1000.

Select Text File, and click Calculate -> Filter or click the Filter icon  on the Menu Bar to enter the Filter window, where you can edit a filter expression, as shown below:

Click OK and records with filtered Amount field are displayed.

Sort

The item re-arranges data in a specified direction and by a specific field.

Example: Sort data in orders2020.txt by SID field in descending order and by Amount field in ascending order.

Select Text File, and click Calculate -> Sort or click the Sort icon  on the Menu Bar to enter the Sort window, where you add a sorting field, as shown below:

Keep the original order: It is used to sort multilevel data by the specified field, arranging records according to the original position it appears the first time.

Use parallel processing: Select this option to enhance performance of data-intensive complex sorting operations.

Place null values in the end: Put records where the sorting field values are null in the end.

 

Click OK to get the sorting result. In the following result, SID field is ordered in descending order and Amount field is arranged in ascending order.

 

Group

The item group and summarize records according to a specified field or expression.

Example: Group records in orders2020.txt by SID field and sum Amount in each group.

Select Text File, and click Calculate -> Group or click the Group icon  on the Menu Bar to enter the Group window, where you add a grouping field and a field to aggregated, as shown below:

 

Do not sort result set by grouping field: The result set won~{!/~}t be sorted by the specified grouping field when it is selected.

Use parallel processing: It is selected to enhance performance of data-intensive complex grouping operations.

Compare with the neighboring value only: Group records by comparing each with its next one, which is equivalent to a merge, and do not sort the result set.

Discard a group whose members are null: If result of the grouping field expression matches none, discard the group whose members are null.

Output details: Add a detail data field to store the original records in each group before summarization.

 

Click OK to get a new data panel, as shown below:

 

Set Operations

This item is for performing set-oriented operations on two data sets.

Example: Compute union of orders2020.txt and orders2021.txt.

First, import data sets to be unioned, select one of the data sets, and click Calculate -> Set Operations or click Set Operations icon on the Menu Bar to pop up Set Operations window, where you select Target Table, which is the other one of the two data sets, Operator and Target Field, as shown below:

 

Click OK to get a new data panel where the union result is displayed.

 

Join

The item is used to associate multiple data sources through joining field.

Example: Left join orders2020.txt and orders2021.txt by SID field.

First, import data tables to associated, select one of the data sources, and click Calculate -> Join or click Join icon on the Menu Bar to pop up Join window, where you select Target Table, which is the other one of the two table to be associated, Join Type, Join Field and Target Fields, as shown below:

   

Click OK to get a new data panel where the association result is displayed.