User-defined Functions

Read(380) Label: user-defined functions,

Besides the system-provided functions, esProc supports invoking user-defined functions to handle some special operations or the encapsulation of certain computations. This part briefly explains how to invoke user-defined functions with invoke function, and how to use parameter in a user-defined function and return result.  

Basic method of invoking user-defined functions

The invoke function can be used to invoke the static method(s) in a specified user-defined Java class. Take the following Java class test.Calc01 as an example:

package test;

public class Calc01 {

public static Double distance1(Number loc) {

double len = Math.abs(loc.doubleValue());

len = Math.round(len*1000)/1000d;

return Double.valueOf(len);

}

}

In this simple class, the static method distance1 computes the distance between a given coordinate and the origin, with the value being rounded to three decimal places. The methods invoked in esProc must be both static and public. Before invocation, the class of user-defined functions needs to be placed in esProc’s classpath, i.e. [installation directory]\esProc\classes; if invoked in the web environment, it needs to be placed in WEB-INF/classes path. After that, the user-defined functions can be invoked using invoke function in the cellset:

 

A

1

-12.34567

2

=invoke(test.Calc01.distance1,A1)

3

=invoke(test.Calc01.distance1, -512)

As shown in this example, when using invoke function, first specify the full path of the class being invoked and the static method name and then list the parameters in order. The parameters can be cell values in a cellset and cellset parameters, or can be directly entered. After computation, results of A2 and A3 are as follows:

 

There can be multiple user-defined functions in a single class. Now add another method:

public static Double distance2(Number loc1, Number loc2) {

double len = Math.abs(loc1.doubleValue()-loc2.doubleValue());

len = Math.round(len*1000)/1000d;

return Double.valueOf(len);

}

The newly-added distance2 computes the distance between two coordinates in a numerical axis. As esProc invokes methods by their names, different user-defined functions need to have different method names. Once the class(es) has been place in the application’s classpath, the user-defined functions can be invoked in the same cellset, no matter they are from a single class or multiple classes:

 

A

1

=invoke(test.Calc01.distance1,-12.3456)

2

=invoke(test.Calc01.distance2,12,-12)

In the invoke function, the number and type of parameters need to match the methods being invoked. Here’re results of A1 and A2:

 

Returning results as needed

The user-defined functions may or may not return results. For example:

public static void distance3(Number loc1, Number loc2) {

double len = Math.abs(loc1.doubleValue()-loc2.doubleValue());

len = Math.round(len*1000)/1000d;

System.out.println(Double.toString(len));

}

When the computation with static method distance3 is completed, the result will only be printed out to the system console but won’t be returned. Such methods are similar to the executable cells in a cellset, and > can be used when they are invoked. For example:

 

A

1

>invoke(test.Calc01.distance3,-12.3,15)

Now the output data can be viewed in the console. To access the console, click Tool>Options on the menu bar, and check Console takeover on the General page. The result is as follows:

 

If a user-defined function uses parameters or returns result, it should use data types supported by esProc; otherwise, errors may occur during presentation or invocation. Following lists the commonly-used data types in esProc and their counterparts in Java:

Integer

java.lang.Integer

Long integer

java.lang.Long

Floating point number

java.lang.Double

Big decimal

java.math.BigInteger

Real number

java.lang.Number

Boolean

java.lang.Boolean

String

java.lang.String

Date

java.sql.Date

Time

java.sql.Time

Datetime

java.sql.TimeStamp

Sequence

com.raqsoft.dm.Sequence

Table sequence

com.raqsoft.dm.Table

Binary data

byte[]

In the previous examples, all the returned results are of Double object, which corresponds to the floating point number in esProc; all parameters are Number data type, which corresponds to Real number in esProc. User-defined function can return results of different data types as needed. For instance, add another two user-defined functions to test.Calc01 to return results of different data types:

public static String distance4(Number loc1, Number loc2) {

double len = Math.abs(loc1.doubleValue()-loc2.doubleValue());

len = Math.round(len*1000)/1000d;

return Double.toString(len);

}

public static Sequence distance5(Number loc1, Number loc2) {

double len = Math.abs(loc1.doubleValue()-loc2.doubleValue());

len = Math.round(len*1000)/1000d;

com.raqsoft.dm.Sequence result = new com.raqsoft.dm.Sequence();

result.add(loc1);

result.add(loc2);

result.add(Double.valueOf(len));

return result;

}

These two static methods are also used to compute the distance between two points in a numerical axis. distance4 returns a result of string type and distance5 returns a result in a form of a sequence storing the coordinates of the two points and the distance between them. The invoke function is still used to invoke the two methods in a cellset:

 

A

1

=invoke(test.Calc01.distance2,-12.3,15)

2

=invoke(test.Calc01.distance4,-12.3,15)

3

=invoke(test.Calc01.distance5,-12.3,15)

For the convenience of comparison, A1 calls distance2 used in the previous example. After the code is executed, results of A1~A3 are as follows: 

   

Notice that the results of A1 and A2 are of different data types and are thus displayed differently. The return results of user-defined functions can be used for later computation.

Sequence type parameters

User-defined functions use esProc-specific parameters. The correspondence between data types of the parameters and those of Java objects is the same as that mentioned above. For an invocation, the parameter used in the user-defined function needs to be in consistent with the one used in the static method.

Particularly, sequence is the most frequently used data type in esProc. Besides returning a sequence as the result, user-defined functions can also use sequence type parameters. For example:

public static Double distance6(com.raqsoft.dm.Sequence seq1, com.raqsoft.dm.Sequence seq2) {

int len1 = seq1.length();

int len2 = seq2.length();

double x1 = len1 > 0 ? ((Number) seq1.get(1)).doubleValue(): 0;

double x2 = len2 > 0 ? ((Number) seq2.get(1)).doubleValue(): 0;

double y1 = len1 > 1 ? ((Number) seq1.get(2)).doubleValue(): 0;

double y2 = len2 > 1 ? ((Number) seq2.get(2)).doubleValue(): 0;

double len = Math.sqrt((x1-x2)*(x1-x2)+(y1-y2)*(y1-y2));

len = Math.round(len*1000)/1000d;

return Double.valueOf(len);

}

The user-defined function distance6 computes the distance between two points in a rectangular coordinate system. Coordinates of the two points need to be input using sequence type parameters during invocation. For example:

 

A

1

[1,1]

2

[3,3]

3

=invoke(test.Calc01.distance6,A1,A2)

Here’s A3’s result after computation:

User-defined cursor

To return a cursor based on a user-defined function, the return class should be one that implements the interface com.raqsoft.dm.IlineInput. The following test.RandDataCursor file is such an example:

package test;

public class RandDataCursor implements com.raqsoft.dm.ILineInput {

private int rowno = 0;

private int range = 1000;

public RandDataCursor(Integer rg) {

this.range = rg;

}

 

public Object[] readLine() throws java.io.IOException {

rowno++;

Object[] result = new Object[2];

result[0] = Integer.valueOf(rowno);

result[1] = Integer.valueOf((int) (Math.random() * range ));

return result;

}

public boolean skipLine() throws java.io.IOException {

rowno++;

if (rowno <= 10000) return true;

return false;

}

 

public void close() throws IOException {

rowno = 10001;

}

}

The IlineInput class should implement three methods: readLine, skipLine, and close. Here the simple RandDataCursor file aims to generate a sequence comprising random integers, whose range can be specified through a parameter. Every record returned from readLine method is made up of a row number, which increases sequentially, and a random integer, which are similar to two fields of a table sequence. the skipLine method skips one record and then returns whether or not there is still data in the cursor. In the above file, it specifies that the cursor will return 10000 records at most. The close method is used to closes the cursor to delete unwanted resources, like the database connection.

Thus the user-defined cursor that returns a cursor can take RandDataCursor as a return class. For example:

package test;

public class Calc02 {

public static test.RandDataCursor getCursor(Integer range) {

RandDataCursor rdc = new RandDataCursor(range);

return rdc;

}

}

Here the user-defined function needs to define only one parameter, the range for the random intergers, and returns a cursor RandDataCorsor, which will generate random records. But the returned IlineInput class can’t be directly used as a cursor. You need to call the newCursor method of the com.raqsoft.dm.UserUtils class to transfer it to a cursor. For example:

 

A

1

=invoke(test.Calc02.getCursor,1000000)

2

=invoke(com.raqsoft.dm.UserUtils.newCursor,A1,"")

3

>A2.skip(100)

4

=A2.fetch@x(100)

A1 returns a user-defined cursor class, RandDataCursor. A2 invokes UserUtils’s newCursor method to pass A1’s result as a parameter; its result is an esProc cursor. A3 skips the first 100 records in the cursor. A4 then fetches 100 records from the cursor, and closes it. Below is A4’s result:

Similar to retriveing text data, @t option can be used here when returning data, which will take the first row it fetches as the headers. You can also use @1 option to fetch data out as one column. By rewriting A2’s code as =invoke(com.raqsoft.dm.UserUtils.newCursor,A1,"t"), the funciton will work as an @t option is used. Below is A4’s result:

In fact RandDataCursor doesn’t specify any column names. But the first row will be returned as the column names. As you can see, the second column, which consists of random integers, in the result set is different from the previous one.