Showing posts with label java. Show all posts

Tuesday, May 24, 2016

RCaller 3.0 is released!

RCaller 3.0 is released with new features.

Please visit the page

http://mhsatman.com/rcaller-3-0

for the source code, compiled binaries, other downloads and the blog post.

Hope you enjoy the project!


Sunday, March 22, 2015

Introduction to Fuzuli : JFuzuli REPL


JFuzuli is the JVM implementation of our programming language Fuzuli, which combines LISP syntax with the programming logic of the Algol family. Fuzuli is a modern blend of these two separate families of languages.

Let's try JFuzuli:




1. Download the Jar 

The current compiled jar of the JFuzuli interpreter is release candidate 1.0. You can download it from https://github.com/jbytecode/fuzuli/releases/tag/v1.0_release_candidate. The newest releases are always listed on the JFuzuli Releases page.

2. Open the Command Prompt

After downloading the jar file, open your operating system's command prompt and go to the directory that contains the jar file using the cd (change directory) command.

3. Start trying it!

In command prompt, type

java -jar JFuzuli.jar

to start. You will see the options:

Usage:
java -jar JFuzuli.jar fzlfile
java -jar JFuzuli.jar --repl
java -jar JFuzuli.jar --editor


You can specify a Fuzuli source file to run. The option --repl opens a command shell and the option --editor opens the GUI. Let's try the command shell.

java -jar JFuzuli.jar --repl
F: 

The prompt F: waits for a valid Fuzuli expression. Now we can try some basic commands:

F: (+ 2 7)
9.0
F: (- 7 10)
-3.0
F: (require "lang.nfl")
0.0
F: (let mylist '(1 2 3))
[1.0, 2.0, 3.0]
F: (first mylist)
1.0
F: (last mylist)
3.0
F: (length mylist)
3
F: (nth mylist 0)
1.0
F: (nth mylist 1)
2.0


We have introduced some basic operators, data types and commands here, but not all of them. An operator or command always comes right after an opening parenthesis, its arguments follow, and a closing parenthesis ends the expression. This is the well-known syntax of LISP and Scheme. So what are the properties of the language, what are the commands, and how can you try more Fuzuli code in JFuzuli?
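To make the prefix notation concrete, here is a minimal Java sketch that evaluates expressions such as (+ 2 7). It is not how JFuzuli is implemented; the class PrefixEval and its methods are made up for this illustration, and only the four arithmetic operators on numbers are handled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative only: a tiny evaluator for prefix expressions such as (+ 2 7).
// This is NOT JFuzuli's implementation; every name here is hypothetical.
final class PrefixEval {

    // Split "(+ 2 (* 3 4))" into the tokens "(", "+", "2", "(", "*", "3", ...
    static Deque<String> tokenize(String source) {
        Deque<String> tokens = new ArrayDeque<>();
        String spaced = source.replace("(", " ( ").replace(")", " ) ").trim();
        for (String t : spaced.split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Evaluate one expression: either a number or (operator arg1 arg2 ...).
    static double eval(Deque<String> tokens) {
        String token = tokens.poll();
        if (token == null) throw new IllegalArgumentException("Unexpected end of input");
        if (!token.equals("(")) return Double.parseDouble(token);

        String operator = tokens.poll();      // the operator follows the opening parenthesis
        List<Double> args = new ArrayList<>();
        while (!")".equals(tokens.peek())) {  // the arguments follow the operator
            if (tokens.isEmpty()) throw new IllegalArgumentException("Missing )");
            args.add(eval(tokens));           // an argument may itself be an expression
        }
        tokens.poll();                        // consume the closing parenthesis
        return apply(operator, args);
    }

    // Fold the arguments from left to right with the given operator.
    static double apply(String operator, List<Double> args) {
        if (args.isEmpty()) throw new IllegalArgumentException("No arguments");
        double result = args.get(0);
        for (int i = 1; i < args.size(); i++) {
            double x = args.get(i);
            switch (operator) {
                case "+": result += x; break;
                case "-": result -= x; break;
                case "*": result *= x; break;
                case "/": result /= x; break;
                default: throw new IllegalArgumentException("Unknown operator " + operator);
            }
        }
        return result;
    }

    static double eval(String source) {
        return eval(tokenize(source));
    }

    public static void main(String[] args) {
        System.out.println(eval("(+ 2 7)"));   // prints 9.0
        System.out.println(eval("(- 7 10)"));  // prints -3.0
    }
}
```

Nested expressions work the same way, because every argument is evaluated with the same rule before the operator is applied.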

Fuzuli Language home page: http://fuzuliproject.org/
Have a nice read!


Thursday, March 19, 2015

Why is R awesome?

For some it is magic; others hate its notation (maybe you!). It has some weird rules, and maybe it is just a programming language like the others (that is also my opinion). Like other programming languages, R has good and bad properties, but I can say it is the best candidate for the toolbox of a statistician or a researcher who works on data analysis.

In this blog post, I collect 8 (from 0 to 7) nice properties of R. As a lecturer and researcher, I have seen that many students understand statistical concepts better when I demonstrate them, and let the students work on them, using Monte Carlo simulations. In R, we can write compact code to demonstrate these concepts, which would be more difficult to implement in another programming language. R is not a simple toy either: we can always extend our knowledge and programming skills, and write faster code by calling external code written in "real" programming languages (the old joke about real men using C).


So, why is R awesome?



0. Syntax of Algol Family

R has a weird assignment operator, but the rest of its syntax is similar to Algol family languages such as C, C++, Java and C#. R also has a facility similar to operator overloading (it is not exactly operator overloading): a symbol of one or more characters wrapped in percent signs can be bound to a function and used as a binary operator, like this:


> '%_%' <- function(a,b){
+    return(exp(a+b))
+ }
> 5 %_% 2
[1] 1096.633


1. Vectors are primitive data types

In other members of the Algol family, vectors are built from primitives using bracket notation: they are arrays of primitives in C/C++ and objects in Java. In contrast, vectors are themselves a basic data type in R, and binary operators are directly applicable to vectors and matrices. For example, estimating least squares coefficients is a single-line expression in R:


> assign("x",cbind(1,1:30))
> assign("y",3+3*x[,2]+rnorm(30))
> solve(t(x) %*% x) %*% t(x) %*% y
         [,1]
[1,] 2.858916
[2,] 3.003787
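For comparison, here is a sketch of the same normal-equations estimate in plain Java for the intercept-plus-slope case. The class name SimpleOls is made up, and the data are noise-free (y = 3 + 3x) so that the result is exactly the pair (3, 3) instead of a random estimate as in the R session.

```java
// Illustrative sketch: least squares for y = b0 + b1 * x via the 2x2 normal equations.
final class SimpleOls {

    // Solves (X'X) b = X'y where X has a column of ones and the column x.
    // Returns {intercept, slope}.
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sy += y[i];
            sxx += x[i] * x[i];
            sxy += x[i] * y[i];
        }
        // X'X = [[n, sx], [sx, sxx]] and X'y = [sy, sxy]; invert the 2x2 matrix by hand.
        double det = n * sxx - sx * sx;
        double intercept = (sxx * sy - sx * sxy) / det;
        double slope = (n * sxy - sx * sy) / det;
        return new double[]{intercept, slope};
    }

    public static void main(String[] args) {
        double[] x = new double[30];
        double[] y = new double[30];
        for (int i = 0; i < 30; i++) {
            x[i] = i + 1;            // 1, 2, ..., 30 as in the R example
            y[i] = 3 + 3 * x[i];     // no random noise here, unlike rnorm(30) in R
        }
        double[] b = fit(x, y);
        System.out.println("intercept = " + b[0] + ", slope = " + b[1]);
    }
}
```

The Java version needs explicit loops and a hand-written 2x2 inverse, which is the point of the comparison with the one-line R expression.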

This example shows the difference between a scalar and a vector:


> assign("x", c(1,2,3))
> assign("a", 5)
> typeof(x)
[1] "double"
> typeof(a)
[1] "double"
> class(x)
[1] "numeric"
> class(a)
[1] "numeric"

No difference!


2. Theorems get alive in minutes

Suppose that X is a random variable that follows an Exponential distribution with rate = 5. The sum or mean of random samples of size N approximately follows a normal distribution when N is large. This is the Central Limit Theorem stated through an example. Theorems are theorems, but you may want to see a quick demonstration (a "proof" for educational purposes only) by writing a small program. Writing code like this takes minutes if you use R.


> assign("nsamp", 5000)
> assign("n", 100)
> assign("theta", 5.0)
> assign("sums", rep(0,nsamp))
> 
> for (i in 1:nsamp){
+     sums[i] <- sum(rexp(n = n, rate = theta)) 
+ }
> hist(sums)

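The same simulation can be sketched in Java to see what R does for us in one line. This is only an illustration: the class CltDemo is made up, exponential values are drawn with the inverse-CDF formula X = -ln(U) / rate, and a fixed seed is used so that runs are repeatable. Instead of a histogram, the sketch compares the sample mean and variance of the sums with the theoretical values n / rate and n / rate^2.

```java
import java.util.Random;

// Illustrative sketch of the simulation above: sums of n exponential draws.
final class CltDemo {

    // One Exponential(rate) draw by inverting the CDF; 1 - U lies in (0, 1], so the log is finite.
    static double nextExponential(Random rng, double rate) {
        return -Math.log(1.0 - rng.nextDouble()) / rate;
    }

    // Repeat nsamp times: draw n exponential values and store their sum.
    static double[] simulateSums(int nsamp, int n, double rate, long seed) {
        Random rng = new Random(seed);
        double[] sums = new double[nsamp];
        for (int i = 0; i < nsamp; i++) {
            double total = 0.0;
            for (int j = 0; j < n; j++) {
                total += nextExponential(rng, rate);
            }
            sums[i] = total;
        }
        return sums;
    }

    static double mean(double[] v) {
        double s = 0.0;
        for (double x : v) s += x;
        return s / v.length;
    }

    // Unbiased sample variance (divides by length - 1).
    static double variance(double[] v) {
        double m = mean(v);
        double s = 0.0;
        for (double x : v) s += (x - m) * (x - m);
        return s / (v.length - 1);
    }

    public static void main(String[] args) {
        int nsamp = 5000, n = 100;
        double theta = 5.0;
        double[] sums = simulateSums(nsamp, n, theta, 42L);
        System.out.println("mean of sums     = " + mean(sums) + " (theory: " + n / theta + ")");
        System.out.println("variance of sums = " + variance(sums) + " (theory: " + n / (theta * theta) + ")");
    }
}
```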



3. There is always a second plan for faster code

Now suppose that we are drawing 50,000 samples randomly using the code above. What would be the computation time?


> assign("nsamp", 50000)
> assign("n", 100)
> assign("theta", 5.0)
> assign("sums", rep(0,nsamp))
> 
> s <- system.time(
+     for (i in 1:nsamp){
+         sums[i] <- sum(rexp(n = n, rate = theta)) 
+     }
+ )
> 
> print(s)
   user  system elapsed 
  0.582   0.000   0.572 




Drawing 50,000 samples of size 100 takes 0.582 seconds. Is that fast enough? Let's try to write it in C++!


#include <Rcpp.h>
using namespace Rcpp;


// [[Rcpp::export]]
NumericVector CalculateRandomSums(int m, int n) {
   NumericVector result(m);
   int i;
   for (i = 0; i < m; i++){
     result[i] = sum(rexp(n, 5.0));
   }
   return(result);
}


After compiling the code with Rcpp, we can call the function CalculateRandomSums() from R.


> s <- system.time(
+     vect <- CalculateRandomSums(50000, 100)
+ )
> print(s)
   user  system elapsed 
  0.185   0.000   0.184 

Our pure R code is about 3.1 times slower (0.582 / 0.185 in user time) than the code written in C++.


4. Interaction with C/C++/Fortran is enjoyable

Since a large part of R is written in C, migrating old C libraries is easy by writing wrapper functions using SEXP data types. Rcpp hides these routines in a clever way. Fortran code is also linkable. Interaction with other languages makes old libraries usable in R and makes it possible to write faster new libraries. It is also possible to create instances of R in C and C++ applications.
For an enjoyable example, have a look at section 3, "There is always a second plan for faster code".
The R package eive includes a small portion of C++ code and is a compact example of calling C++ functions from within R. Accessing C++ objects from R is also possible thanks to Rcpp. Click here to see the explanation and an example.


5. Interaction with Java

Calling Java from R (rJava) and calling R from Java (JRI, RCaller) are both possible. Renjin takes a different approach, as it is an R interpreter written in Java (another way of calling R from Java, huh?). A detailed comparison of these methods is given in this documentation and this.


6. Sophisticated variable scoping

In R, functions have their own variable scopes, and accessing variables at the top level is possible. In addition, variable scoping is handled by list-like structures called environments, and user-defined environments can be created anywhere in the code. For detailed information visit Environment in R.
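The lookup rule can be sketched in Java as a chain of maps. This is a simplified model and not R's internal implementation; the class Env and its methods are made up for this illustration. A name is searched in the current environment first and then in its parents, which is how a function body can see top-level variables.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of chained environments; not R's actual data structure.
final class Env {
    private final Map<String, Object> vars = new HashMap<>();
    private final Env parent;   // null for the top-level (global) environment

    Env(Env parent) {
        this.parent = parent;
    }

    // Bind a name in THIS environment only, like a local assignment inside a function.
    void assign(String name, Object value) {
        vars.put(name, value);
    }

    // Look the name up here first, then walk up through the enclosing environments.
    Object get(String name) {
        for (Env e = this; e != null; e = e.parent) {
            if (e.vars.containsKey(name)) {
                return e.vars.get(name);
            }
        }
        throw new IllegalArgumentException("object '" + name + "' not found");
    }

    public static void main(String[] args) {
        Env global = new Env(null);
        global.assign("a", 3);

        Env function = new Env(global);        // scope of a function defined at the top level
        function.assign("b", 10);

        System.out.println(function.get("a")); // found in the parent: prints 3
        System.out.println(function.get("b")); // found locally: prints 10

        function.assign("a", 5);               // shadows the global a inside the function only
        System.out.println(function.get("a")); // prints 5
        System.out.println(global.get("a"));   // still prints 3
    }
}
```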


7. Optional Object Oriented Programming (O-OOP) 

R functions take the values of variables as parameters rather than their addresses. If a vector of size 100,000 is passed to a function, the function behaves as if it works on its own copy (R makes the actual copy when the function modifies the vector), and the copy is later released by the garbage collector. As C/C++ programmers know, passing objects by address rather than by value is a good way to use less memory and spend less computation time. Reference class objects in R are passed to functions by reference, in a way similar to passing C++ references and Java objects to functions and methods:



Person <- setRefClass(
    Class = "Person",
    fields = c("name","surname","email"),
    methods = list(
        initialize = function(name, surname, email){
            .self$name <- name
            .self$surname <- surname
            .self$email <- email
        },
        
        setName = function(name){
            .self$name <- name
        },
        
        setSurname = function(surname){
            .self$surname <- surname
        },
        
        setEMail = function (email){
            .self$email <- email
        },
        
        toString = function (){
            return(paste(name, " ", surname, " ", email))
        }   
    ) # End of methods
) # End of class



p <- Person$new("John","Brown","brown@server.org")
print(p$toString())

The output is

[1] "John   Brown   brown@server.org"

Java and C++ programmers probably like this notation!
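For readers coming from the R side, the difference between the two calling styles can be sketched in Java with arrays. The class and method names are made up for this illustration: scaledCopy mimics the value semantics of ordinary R functions by working on a copy, while scaleInPlace mimics a reference object, because the caller's own array is modified.

```java
import java.util.Arrays;

// Illustrative contrast between value-like and reference-like parameter passing.
final class ValueVsReference {

    // Value-like: the caller's array is left untouched and a modified copy is returned.
    static double[] scaledCopy(double[] v, double factor) {
        double[] copy = Arrays.copyOf(v, v.length);   // costs extra memory and time
        for (int i = 0; i < copy.length; i++) {
            copy[i] *= factor;
        }
        return copy;
    }

    // Reference-like: no copy is made, the change is visible to the caller.
    static void scaleInPlace(double[] v, double factor) {
        for (int i = 0; i < v.length; i++) {
            v[i] *= factor;
        }
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3};

        double[] doubled = scaledCopy(data, 2);
        System.out.println(Arrays.toString(data));     // prints [1.0, 2.0, 3.0]
        System.out.println(Arrays.toString(doubled));  // prints [2.0, 4.0, 6.0]

        scaleInPlace(data, 2);
        System.out.println(Arrays.toString(data));     // prints [2.0, 4.0, 6.0]
    }
}
```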


Have a nice read!


Saturday, March 14, 2015

Handling all variables in a workspace in R with RCaller

R assigns a value to a variable name with the assignment operator <-, which corresponds to the assign function.

RCaller handles results as list objects. Since R environments are list-like, they can easily be converted to R lists (see the previous blog post on R lists here).

Here is an example of using RCaller to get all variables that are created at run time on the R side.





package rcallerenvironments;

import rcaller.RCaller;
import rcaller.RCode;

public class RCallerEnvironments {

    public static void main(String[] args) {
        RCaller rcaller = new RCaller();
        RCode code = new RCode();
        rcaller.setRscriptExecutable("/usr/bin/Rscript");

        code.addRCode("a <- 3");
        code.addRCode("b <- 10.45");
        code.addRCode("d <- TRUE");
        code.addRCode("avector <- c(9,6,5,6)");
        code.addRCode("allvars <- as.list(globalenv())");

        rcaller.setRCode(code);

        rcaller.runAndReturnResult("allvars");

        System.out.println(rcaller.getParser().getNames());
        try {
            System.out.println(rcaller.getParser().getXMLFileAsString());
        } catch (Exception e) {
            System.out.println("Error in accessing XML");
        }
    }

}

The output is 



As seen in the output, the created variables avector, a, b and d are returned to the Java side in a single call without any manual translation.

Have a nice read!


Friday, March 13, 2015

RCaller 2.5 is available for downloading

We are happy to announce that our easy-to-use Java library for calling R from Java is now available for download. Developers can get the compiled jar file at


 https://github.com/jbytecode/rcaller/releases/tag/2.5


This release does not extend the main functionality of the library, but it adds some handy functions for performing calculations and for later development of the library.



What is new:

* Official document bibtex added to cite RCaller in any projects or papers

* RealMatrix class is implemented. Matrix operations are performed in more 'java-ish style'

* RService is implemented for developing wrapper functions


Where to start?

* Read the web page on RCaller http://mhsatman.com/tag/rcaller/
* Read blog entries in http://stdioe.blogspot.com.tr/search/label/rcaller
* Have a look at the source tree in https://github.com/jbytecode/rcaller
* Download the library in  https://github.com/jbytecode/rcaller/releases/tag/2.5

Have a nice try!


Migration of RCaller and Fuzuli Projects to GitHub

Google announced that they are shutting down the code hosting service Google Code, where our two projects RCaller and the Fuzuli Programming Language were hosted, so we migrated both projects to the popular code hosting site GitHub.

Source code of these projects will no longer be committed to the Google Code site. Please check the new repositories.

GitHub pages are listed below:





RCaller:

https://github.com/jbytecode/rcaller




Fuzuli Project:

https://github.com/jbytecode/fuzuli



Monday, March 9, 2015

Nearest-Neighbor Clustering using RCaller - A library for Calling R from Java

RCaller is a software library for calling R from Java. A blog post with the latest downloadable jar and documentation is available here. The latest news can always be followed using the RCaller label in the Practical Code Solutions blog.

A blog post on performing a k-means clustering analysis using RCaller is also available at this link.

In the code below, two double arrays, x and y, are created on the Java side. These variables are then passed to R. On the R side, the distance matrix d is calculated. The R function hclust performs the main calculations. Finally, the calculated heights of the clustering tree and a dendrogram plot are returned to Java. The source code, output text and the returned plot are presented here:




package kmeansrcaller;

import java.io.File;
import rcaller.RCaller;
import rcaller.RCode;

public class SingleLinkageClustering {

    public static void main(String[] args) {
        RCaller caller = new RCaller();
        RCode code = new RCode();
        File dendrogram = null;

        double[] x = new double[]{1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = new double[]{2, 4, 6, 8, 10, 20, 40, 60, 80, 100};

        code.addDoubleArray("x", x);
        code.addDoubleArray("y", y);

        code.addRCode("d <- dist(cbind(x,y))");
        code.addRCode("h <- hclust(d, method=\"single\")");

        try {
            dendrogram = code.startPlot();
            code.addRCode("plot(h)");
            code.endPlot();
        } catch (Exception e) {
            System.out.println("Plot Error: " + e.toString());
        }

        caller.setRCode(code);

        caller.setRscriptExecutable("/usr/bin/Rscript");

        caller.runAndReturnResult("h");
        System.out.println(caller.getParser().getNames());

        if (dendrogram != null) {
            code.showPlot(dendrogram);
        }

        double[] heights = caller.getParser().getAsDoubleArray("height");
        for (int i = 0; i < heights.length; i++) {
            System.out.println("Height " + i + " = " + heights[i]);
        }
    }
}




The output is 

[merge, height, order, method, call, dist_method]
Height 0 = 2.23606797749979
Height 1 = 2.23606797749979
Height 2 = 2.23606797749979
Height 3 = 2.23606797749979
Height 4 = 11.1803398874989
Height 5 = 22.3606797749979
Height 6 = 22.3606797749979
Height 7 = 22.3606797749979
Height 8 = 22.3606797749979
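The heights can be cross-checked without R. The sketch below is a naive single-linkage implementation in plain Java written for this purpose; it is not what hclust does internally, and the class name SingleLinkage is made up. At every step it merges the two clusters that contain the closest pair of points and records that distance as the merge height.

```java
// Illustrative naive single-linkage clustering; only the merge heights are computed.
final class SingleLinkage {

    static double[] mergeHeights(double[] x, double[] y) {
        int n = x.length;
        int[] cluster = new int[n];              // cluster label of each point
        for (int i = 0; i < n; i++) cluster[i] = i;

        double[] heights = new double[n - 1];
        for (int step = 0; step < n - 1; step++) {
            // Single linkage: the distance between two clusters is the smallest
            // distance between any pair of their points.
            double best = Double.POSITIVE_INFINITY;
            int bi = -1, bj = -1;
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    if (cluster[i] == cluster[j]) continue;
                    double d = Math.hypot(x[i] - x[j], y[i] - y[j]);
                    if (d < best) {
                        best = d;
                        bi = i;
                        bj = j;
                    }
                }
            }
            heights[step] = best;
            // Merge: relabel every point of the second cluster.
            int from = cluster[bj], to = cluster[bi];
            for (int k = 0; k < n; k++) {
                if (cluster[k] == from) cluster[k] = to;
            }
        }
        return heights;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = {2, 4, 6, 8, 10, 20, 40, 60, 80, 100};
        double[] heights = mergeHeights(x, y);
        for (int i = 0; i < heights.length; i++) {
            System.out.println("Height " + i + " = " + heights[i]);
        }
    }
}
```

For these data the neighbouring points are sqrt(5), sqrt(125) and sqrt(500) apart, which gives the values 2.236, 11.180 and 22.361 reported by R above.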



The screen shot of the plotted graphics is here:



Have a nice read!


Saturday, March 7, 2015

K-means clustering with RCaller - A library for calling R from Java

Here is an example of RCaller, a library for calling R from Java.

In the code below, we create two variables x and y. K-means clustering function kmeans is applied on the data matrix that consists of x and y. The result is then reported in Java.






package kmeansrcaller;

import rcaller.RCaller;
import rcaller.RCode;

public class KMeansRCaller {

    public static void main(String[] args) {
        RCaller caller = new RCaller();
        RCode code = new RCode();

        double[] x = new double[]{1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = new double[]{2, 4, 6, 8, 10, 20, 40, 60, 80, 100};

        code.addDoubleArray("x", x);
        code.addDoubleArray("y", y);

        code.addRCode("result <- kmeans(cbind(x,y), 2)");

        caller.setRCode(code);

        caller.setRscriptExecutable("/usr/bin/Rscript");

        caller.runAndReturnResult("result");
        System.out.println(caller.getParser().getNames());

        int[] clusters = caller.getParser().getAsIntArray("cluster");
        double[][] centers = caller.getParser().getAsDoubleMatrix("centers");
        double[] totalSumOfSquares = caller.getParser().getAsDoubleArray("totss");
        // RCaller automatically replaces dots with underlines in variable names
        // So the parameter tot.withinss is accessible as tot_withinss
        double[] totalWithinSumOfSquares = caller.getParser().getAsDoubleArray("tot_withinss");
        double[] totalBetweenSumOfSquares = caller.getParser().getAsDoubleArray("betweenss");

        for (int i = 0; i < clusters.length; i++) {
            System.out.println("Observation " + i + " is in cluster " + clusters[i]);
        }

        System.out.println("Cluster Centers:");
        for (int i = 0; i < centers.length; i++) {
            for (int j = 0; j < centers[0].length; j++) {
                System.out.print(centers[i][j] + " ");
            }
            System.out.println();
        }

        System.out.println("Total Within Sum of Squares: " + totalWithinSumOfSquares[0]);
        System.out.println("Total Between Sum of Squares: " + totalBetweenSumOfSquares[0]);
        System.out.println("Total Sum of Squares: " + totalSumOfSquares[0]);
    }

}



The output is



[cluster, centers, totss, withinss, tot_withinss, betweenss, size, iter, ifault]
Observation 0 is in cluster 2
Observation 1 is in cluster 2
Observation 2 is in cluster 2
Observation 3 is in cluster 2
Observation 4 is in cluster 2
Observation 5 is in cluster 2
Observation 6 is in cluster 2
Observation 7 is in cluster 1
Observation 8 is in cluster 1
Observation 9 is in cluster 1
Cluster Centers:
40.0 6.42857142857143 
80.0 12.8571428571429 
Total Within Sum of Squares: 2328.57142857143
Total Between Sum of Squares: 11833.9285714286
Total Sum of Squares: 14162.5
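The reported sums of squares can be verified by hand from the cluster assignment. The Java sketch below does not run k-means; it only recomputes the three quantities for the assignment printed above. The class name and the 0-based relabeling are choices made for this illustration.

```java
// Illustrative check of totss, tot.withinss and betweenss for a given assignment.
final class KMeansSumOfSquares {

    // For each cluster: sum of squared Euclidean distances of its points to its centroid.
    // cluster[i] is a 0-based label in [0, k).
    static double[] withinSs(double[] x, double[] y, int[] cluster, int k) {
        double[] sumX = new double[k];
        double[] sumY = new double[k];
        int[] count = new int[k];
        for (int i = 0; i < x.length; i++) {
            sumX[cluster[i]] += x[i];
            sumY[cluster[i]] += y[i];
            count[cluster[i]]++;
        }
        double[] ss = new double[k];
        for (int i = 0; i < x.length; i++) {
            int c = cluster[i];
            double dx = x[i] - sumX[c] / count[c];
            double dy = y[i] - sumY[c] / count[c];
            ss[c] += dx * dx + dy * dy;
        }
        return ss;
    }

    // Total sum of squares: all points treated as one single cluster.
    static double totalSs(double[] x, double[] y) {
        return withinSs(x, y, new int[x.length], 1)[0];
    }

    static double totalWithinSs(double[] x, double[] y, int[] cluster, int k) {
        double total = 0.0;
        for (double s : withinSs(x, y, cluster, k)) total += s;
        return total;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = {2, 4, 6, 8, 10, 20, 40, 60, 80, 100};
        // Labels from the output above shifted to 0-based: R's cluster 2 -> 1, cluster 1 -> 0.
        int[] cluster = {1, 1, 1, 1, 1, 1, 1, 0, 0, 0};

        double totss = totalSs(x, y);
        double within = totalWithinSs(x, y, cluster, 2);
        System.out.println("Total Within Sum of Squares: " + within);
        System.out.println("Total Between Sum of Squares: " + (totss - within));
        System.out.println("Total Sum of Squares: " + totss);
    }
}
```

The between sum of squares is obtained as the total minus the within part, which matches the identity totss = tot.withinss + betweenss. Note that the cluster numbers themselves are arbitrary and may differ between runs of kmeans.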



Have a nice read!






Friday, August 22, 2014

Javascript and Fuzuli Integration

JFuzuli, the Java implementation of the Fuzuli Programming Language, now supports limited Javascript integration.

JFuzuli currently supports passing Fuzuli variables to the Javascript environment, passing Javascript variables to the Fuzuli environment, and embedding Javascript code in any part of Fuzuli source code.

Full support is planned to include calling Fuzuli functions directly from within Javascript.

Here are the examples. This is the simplest one to demonstrate the basic usage of Javascript support:



In the example above, the variable a is set to 10 in the Fuzuli part, incremented by 1 in the Javascript part and printed in the Fuzuli part again. In the end, the value of a is 11.





In the example above, the variable message is first defined in the Javascript section, using the var keyword, and it is null in the Fuzuli section at the top. Afterwards, in the Fuzuli section, message is printed with the value that was set in the Javascript section.


The example above is more interesting, as it has a function which is written in the Fuzuli language but whose body is written in Javascript! In this example, the square function has a single parameter x. x is passed to the Javascript body and the result is calculated. The value of result is then returned to Fuzuli. At the end, the Fuzuli function call (square 5) simply returns 25, which is calculated by Javascript.


Passing Arrays 

Because the list object in Fuzuli is simply a java.util.ArrayList, all public fields and methods of ArrayList are directly accessible in the Javascript section. Look at the example below. In this example, a list object is created with the values 1, 2 and 3. In the Javascript section, the list is cleared first and then 10 and 20 are added to it. Finally, in the Fuzuli section, the object is printed with only the values 10 and 20.


List objects can also be created directly in the Javascript section. Look at the example below. Since the JFuzuli interpreter uses the javax.script framework, a Java object can be created with the new keyword. The variable a is again a list object in the Fuzuli section, and the printed output includes the two values 10 and 20.



You can try similar examples using our online interpreter at

http://fuzuliproject.org/index.php?node=tryonline

Hope you have fun with Fuzuli...







Monday, April 21, 2014

Matrix Inversion with RCaller 2.2

Here is an example of passing a double[][] matrix from Java to R, having R calculate the inverse of this matrix, and handling the result in Java. Note that the code is written for version 2.2 of RCaller.


RCaller caller = new RCaller(); 
Globals.detect_current_rscript(); 
caller.setRscriptExecutable(Globals.Rscript_current); 

RCode code = new RCode(); 
double[][] matrix = new double[][]{{6, 4}, {9, 8}};

code.addDoubleMatrix("x", matrix); 
code.addRCode("s <- solve(x)");

caller.setRCode(code); 

caller.runAndReturnResult("s"); 

double[][] inverse = caller.getParser().getAsDoubleMatrix("s",
                                       matrix.length, matrix[0].length);

// Print the inverse row by row
for (int i = 0; i < inverse.length; i++) {
    for (int j = 0; j < inverse[0].length; j++) {
        System.out.print(inverse[i][j] + " ");
    }
    System.out.println();
}
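To know what to expect from R, the inverse can also be computed in plain Java. The sketch below uses Gauss-Jordan elimination with partial pivoting; it is an independent illustration and has nothing to do with how R's solve or RCaller work internally, and the class name MatrixInverse is made up. For the matrix {{6, 4}, {9, 8}} the determinant is 12, so the inverse is {{8/12, -4/12}, {-9/12, 6/12}}.

```java
// Illustrative Gauss-Jordan inversion for small dense matrices.
final class MatrixInverse {

    static double[][] invert(double[][] a) {
        int n = a.length;
        // Build the augmented matrix [A | I].
        double[][] m = new double[n][2 * n];
        for (int i = 0; i < n; i++) {
            System.arraycopy(a[i], 0, m[i], 0, n);
            m[i][n + i] = 1.0;
        }
        for (int col = 0; col < n; col++) {
            // Partial pivoting: pick the row with the largest absolute value in this column.
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            }
            if (Math.abs(m[pivot][col]) < 1e-12) {
                throw new ArithmeticException("Matrix is singular");
            }
            double[] tmp = m[col];
            m[col] = m[pivot];
            m[pivot] = tmp;

            // Scale the pivot row so that the pivot element becomes 1.
            double p = m[col][col];
            for (int j = 0; j < 2 * n; j++) m[col][j] /= p;

            // Eliminate this column from all other rows.
            for (int r = 0; r < n; r++) {
                if (r == col) continue;
                double f = m[r][col];
                for (int j = 0; j < 2 * n; j++) m[r][j] -= f * m[col][j];
            }
        }
        // The right half of the augmented matrix now holds the inverse.
        double[][] inverse = new double[n][n];
        for (int i = 0; i < n; i++) {
            System.arraycopy(m[i], n, inverse[i], 0, n);
        }
        return inverse;
    }

    public static void main(String[] args) {
        double[][] inverse = invert(new double[][]{{6, 4}, {9, 8}});
        for (double[] row : inverse) {
            for (double value : row) System.out.print(value + " ");
            System.out.println();
        }
    }
}
```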
 

Fuzuli for Java Online Interpreter

JFuzuli, the Java version of the Fuzuli interpreter, is now the main implementation of the Fuzuli Programming Language. There is always gossip about the efficiency of C and C++ over Java, and people who start writing computer programs are encouraged to choose between C and C++ rather than Java. However, when you don't do things correctly, C and C++ are the worst programming languages in terms of efficiency. Memory allocation and release are important issues, and in C and C++ they must be handled by the user or by third-party libraries, not by the language itself. Not being a computer scientist, I make algorithmic errors in both my C++ and Java code, and my Java code runs faster than my C++ code. I confess that this is my fault! But it is fair to say that a large share of programmers do not write correct code either, and their C++ code does not reach its maximum efficiency. In the end, our Java implementation is faster than the C++ version.

Let's try our online interpreter! Fuzuli is a little bit Lisp, Scheme, C and Java! Try it, learn it and join the development team. The link for the online interpreter is http://fuzuliproject.org/index.php?node=tryonline

The screenshot of the online interpreter is given below.


Sunday, April 20, 2014

Hello world application with Google Dart


Google Dart

Dart is a new programming language developed by Google, with a syntax similar to Java and Javascript. Dart is targeted directly at the web; however, console-based applications can also be written, and Dart code can be compiled into Javascript. Google's web browser Chromium is able to run Dart code directly, but there is no add-on available for other browsers such as Firefox and Internet Explorer. The Dart SDK supports compiling Dart to Javascript, so any browser can run the compiled code without knowing that the original was written in Dart. The community will determine whether the language becomes a standard for web-based applications in place of Javascript.

In this blog entry, we write a basic Dart web application which can be considered a "Hello World", although it is a little more complicated. The application creates a textbox and a button. When the user types her name and clicks the button, the program pops up a message box.

When a new application is created in Dart Editor, you will see something like this:



Let's change the HTML file. Put a textbox and a command button:



Clear the Dart Code:



Fill the main method. Here we handle the input element using the variable name. The name.onclick.listen method defines the event handler.



Complete the code


In the screen capture above, name and say are global variables defined at the top of the code. They are accessible in both main() and button_click(). The body of the method button_click() looks much like its counterparts written in Java or Javascript. Let's run the code:




Have a nice read!


Saturday, August 17, 2013

A User Document For RCaller

A new research paper that serves as RCaller documentation is freely available at http://www.sciencedomain.org/abstract.php?iid=550&id=6&aid=4838#.U5YSoPmSy1Y




RCaller: A library for calling R from Java

by M.Hakan Satman

August 17, 2013



Abstract

RCaller is an open-source, compact, and easy-to-use library for calling R from Java. It offers not only an elegant solution for the task but its simplicity is key for non-programmers or programmers who are not familiar with the internal structure of R. Since R is not only a statistical software but an enormous collection of statistical functions, accessing its functions and packages is of tremendous value. In this short paper, we give a brief introduction on the most widely-used methods to call R from Java and highlight some properties of RCaller with short examples. User feedback has shown that RCaller is an important tool in many cases where performance is not a central concern.

1 Introduction


R [R Development Core Team(2011)] is an open source and freely distributed statistics software package for which hundreds of external packages are available. The core functionality of R is written mostly in C and wrapped by R functions which simplify parameter passing. Since R manages the exhaustive dynamic library loading tasks in a clever way, calling an external compiled function is as easy as calling an R function in R. However, integration with JVM (Java Virtual Machine) languages is painful.
The R package rJava [Urbanek(2011a)] provides a useful mechanism for instantiating Java objects, accessing class elements and passing R objects to Java methods in R. This library is convenient for the R packages that rely on external functionality written in Java rather than C, C++ or Fortran.
The library JRI, which is now a part of the package rJava, uses JNI (Java Native Interface) to call R from Java [Urbanek(2009)]. Although JNI is the most common way of accessing native libraries in Java, JRI requires that several system and environment variables are correctly set before any run, which can be difficult for inexperienced users, especially those who are not computer scientists.
The package Rserve [Urbanek(2011b)] uses TCP sockets and acts as a TCP server. A client establishes a connection to Rserve, sends R commands, and receives the results. This way of calling R from the other platforms is more general because the handshaking and the protocol initializing is fully platform independent.
Renjin (http://code.google.com/p/renjin) is another interesting project that addresses the problem. It solves the problem of calling R from Java by re-implementing the R interpreter in Java! By this definition, the project includes the tasks of writing the interpreter and implementing the internals. Renjin is intended to be 100% compatible with the original. However, it is under development and needs help. An online demo is available and is updated whenever the source code is updated.
Finally, RCaller [RCaller Development Team(2011)] is an LGPL'd library which is very easy to use. It does not do much but wraps the operations well. It requires no configuration beyond installing an R package (Runiversal) and locating the Rscript binary distributed with R. Although it is known to be relatively inefficient compared to other options, its latest release features significant performance improvements.

2 Calling R Functions


Calling R code from other languages is not trivial. R includes a huge collection of math and statistics libraries with nearly 700 internal functions and hundreds of external packages. No comparable library exists in Java. Although libraries such as Apache Commons Math [Commons Math Developers(2010)] do provide many classes for those calculations, their scope is quite limited compared to R. For example, it is not easy to find such a library that calculates quantiles and probabilities of non-central distributions. [Harner et al.(2009)] affirm that using R's functionality from Java prevents the user from writing duplicate code in statistical software.
RCaller is another open source library for performing R operations from within Java applications in a wrapped way. RCaller prepares R code using the user input. The user input is generally a Java array, a plain Java object or the R code itself. It then creates an external R process by running the Rscript executable. It passes the generated R code and receives the output as XML documents. While the process is alive, the standard output and standard error streams are handled by an event-driven mechanism. The returned XML document is then parsed and the returned R objects are extracted to Java arrays.
The short example given below creates two double vectors, passes them to R, and returns the residuals calculated from a linear regression estimation.
RCaller caller = new RCaller();
RCode code = new RCode();
double[] xvector = new double[]{1,3,5,3,2,4};
double[] yvector = new double[]{6,7,5,6,5,6};

caller.setRscriptExecutable("/usr/bin/Rscript");

code.addDoubleArray("X", xvector);
code.addDoubleArray("Y", yvector);
code.addRCode("ols <- lm ( Y ~ X )");

caller.setRCode(code);

caller.runAndReturnResult("ols");

double[] residuals =
   caller.getParser().
     getAsDoubleArray("residuals");  

The lm function returns an R list with a class of lm whose elements are accessible with the $ operator. The method runAndReturnResult() takes the name of an R list which contains the desired results. Finally, the method getAsDoubleArray() returns a double vector with values filled from the vector residuals of the list ols.
RCaller uses the R package Runiversal [Satman(2010)] to convert R lists to XML documents within the R process. This package includes the method makexml() which takes an R list as input and returns the XML document as a string. Although some R functions return the results in other types and classes of data, those results can be returned to the JVM indirectly. Suppose that obj is an S4 object with members member1 and member2. These members are accessible with the @ operator like obj@member1 and obj@member2. These elements can be returned to Java by constructing a new list like result <- list(m1=obj@member1, m2=obj@member2).
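The step of turning an XML document into Java arrays can be sketched with the standard Java XML parser. The XML layout used here (variable elements with a name attribute and v children) is an assumption made for this illustration; the real document produced by Runiversal may look different, and the class XmlResultSketch is not part of RCaller.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative only: extracts a numeric vector from a HYPOTHETICAL XML layout.
final class XmlResultSketch {

    static double[] getAsDoubleArray(String xml, String name) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Can not parse XML", e);
        }
        // Find the variable element whose name attribute matches.
        NodeList variables = doc.getElementsByTagName("variable");
        for (int i = 0; i < variables.getLength(); i++) {
            Element variable = (Element) variables.item(i);
            if (!name.equals(variable.getAttribute("name"))) continue;

            // Each v child holds one value of the vector.
            NodeList values = variable.getElementsByTagName("v");
            double[] result = new double[values.getLength()];
            for (int j = 0; j < result.length; j++) {
                result[j] = Double.parseDouble(values.item(j).getTextContent().trim());
            }
            return result;
        }
        throw new IllegalArgumentException("No variable named " + name);
    }

    public static void main(String[] args) {
        // Assumed document shape, not the actual Runiversal output.
        String xml = "<root>"
                + "<variable name=\"m1\"><v>1.5</v><v>-2.25</v></variable>"
                + "<variable name=\"m2\"><v>10</v></variable>"
                + "</root>";
        for (double value : getAsDoubleArray(xml, "m1")) {
            System.out.println(value);
        }
    }
}
```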

3 Handling Plots


Although the graphics drivers and the internals are implemented in C, most of the graphics functions and packages are written in the R language, and its graphics library is part of what makes R unique. RCaller handles a plot with the function startPlot() and receives a java.io.File reference to the generated plot. The function getPlot() returns an instance of the javax.swing.ImageIcon class which contains the generated image in a fully isolated way. A Java example is shown below:
RCaller caller = new RCaller();
RCode code = new RCode();
File plotFile = null;
ImageIcon plotImage = null;

caller.setRscriptExecutable("/usr/bin/Rscript");

code.R_require("lattice");

try{
 plotFile = code.startPlot();
 code.addRCode("xyplot(rnorm(100)~1:100, type='l')");
 code.endPlot();
}catch (IOException err){
 System.out.println("Can not create plot");
}

caller.setRCode(code);
caller.runOnly();

plotImage = code.getPlot(plotFile);
code.showPlot(plotFile);

The method runOnly() is quite different from the method runAndReturnResult(). Because the user only wants a plot to be generated, there is nothing returned by R in the example above. Note that more than one plot can be generated in a single run.
Handling R plots with a java.io.File reference is also convenient in web projects. Generated content can be easily sent to clients using output streams opened from the file reference. However, RCaller uses the temp directory and does not delete the generated files automatically. This may cause a "too many files" OS-level error which cannot be caught by a Java program. Cleaning the generated output with a scheduled task solves this problem.
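A scheduled cleanup of this kind can be sketched with the standard library alone. This is not an RCaller feature: the class PlotCleaner, the file name pattern rplot*.png and the age limits are all assumptions made for this illustration, so check how your plot files are actually named before using anything like it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: periodically delete old generated plot files.
final class PlotCleaner {

    // Deletes regular files in dir that match the glob and are older than maxAge.
    // Returns the number of deleted files.
    static int deleteOldFiles(Path dir, String glob, Duration maxAge) {
        Instant cutoff = Instant.now().minus(maxAge);
        int deleted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, glob)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)
                        && Files.getLastModifiedTime(file).toInstant().isBefore(cutoff)) {
                    Files.delete(file);
                    deleted++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return deleted;
    }

    // Runs the cleanup immediately and then every periodMinutes minutes.
    static ScheduledExecutorService schedule(Path dir, String glob, Duration maxAge, long periodMinutes) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                deleteOldFiles(dir, glob, maxAge);
            } catch (RuntimeException e) {
                // Keep the schedule alive even if one run fails.
                System.err.println("Cleanup failed: " + e.getMessage());
            }
        }, 0, periodMinutes, TimeUnit.MINUTES);
        return scheduler;
    }

    // Small self-check in a throwaway directory: one old file, one fresh file.
    static int demo() {
        try {
            Path dir = Files.createTempDirectory("plots");
            Path oldPlot = Files.createFile(dir.resolve("rplot-old.png"));
            Path newPlot = Files.createFile(dir.resolve("rplot-new.png"));
            Files.setLastModifiedTime(oldPlot,
                    FileTime.from(Instant.now().minus(Duration.ofHours(2))));
            int deleted = deleteOldFiles(dir, "rplot*.png", Duration.ofHours(1));
            Files.delete(newPlot);
            Files.delete(dir);
            return deleted;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Deleted files: " + demo());
    }
}
```

In a web application, schedule() would be called once at startup and the returned scheduler shut down when the application stops.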

4 Live Connection


Each time the method runAndReturnResult() is called, an Rscript instance is created to perform the operations. This is the main source of the inefficiency of RCaller. A better approach when R commands are called repeatedly is to use the method runAndReturnResultOnline(). This method creates an R instance and keeps it running in the background. This approach avoids the time required to create an external process, initialize the interpreter, and load packages in subsequent calls.
The example given below returns the determinants of a given matrix and its inverse in sequence, that is, it uses a single external instance to perform more than one operation.
double[][] matrix =
    new double[][]{{5,4,5},{6,1,0},{9,-1,2}};
caller.setRExecutable("/usr/bin/R");
caller.setRCode(code);

code.clear();
code.addDoubleMatrix("x", matrix);
code.addRCode("result<-list(d=det(x))");
caller.runAndReturnResultOnline("result");

System.out.println("Determinant is " +
    caller.getParser().getAsDoubleArray("d")[0]);

code.addRCode("result<-list(t=det(solve(x)))");
caller.runAndReturnResultOnline("result");

System.out.println("Determinant of inverse is " +
    caller.getParser().getAsDoubleArray("t")[0]);
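The values that R should return for this matrix can be checked independently. Expanding along the first row gives det(x) = 5(1·2 − 0·(−1)) − 4(6·2 − 0·9) + 5(6·(−1) − 1·9) = 10 − 48 − 75 = −113, and the determinant of the inverse is its reciprocal, −1/113. The small class below is a hypothetical helper for this check only. It does not use RCaller.

```java
/** Hypothetical helper for checking the example by hand; not part of RCaller. */
public class DeterminantCheck {

    /** Determinant of a 3x3 matrix by cofactor expansion along the first row. */
    public static double det3(double[][] m) {
        return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
             - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
             + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
    }

    public static void main(String[] args) {
        double[][] matrix = {{5, 4, 5}, {6, 1, 0}, {9, -1, 2}};
        double d = det3(matrix);
        System.out.println("Determinant is " + d); // prints -113.0
        // det(solve(x)) equals 1 / det(x) for an invertible matrix
        System.out.println("Determinant of inverse is " + 1 / d);
    }
}
```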

Keeping a single R process alive in this way is fast and convenient for repeated commands. Since R is not thread-safe, its functions cannot be called by more than one thread at a time. Therefore, each thread must create its own R process to perform calculations simultaneously in Java.
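One way to give each thread its own process is a ThreadLocal that creates the per-thread object on first use. The sketch below shows only this confinement pattern: Session is a hypothetical stand-in for whatever wraps one live R process (for example an RCaller and RCode pair), and its square() method is a placeholder for a real call into R.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of one session per worker thread; not part of RCaller. */
public class PerThreadSessions {

    /** Stand-in for one live R process; a real version would wrap RCaller. */
    public static final class Session {
        private final Thread owner = Thread.currentThread();

        public double square(double x) {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("session used from a foreign thread");
            }
            return x * x; // placeholder for a call into R
        }
    }

    private final AtomicInteger created = new AtomicInteger();

    // Each worker thread lazily creates its own session on first use.
    private final ThreadLocal<Session> session = ThreadLocal.withInitial(() -> {
        created.incrementAndGet();
        return new Session();
    });

    /** Number of sessions created so far, at most one per worker thread. */
    public int sessionsCreated() {
        return created.get();
    }

    /** Squares every input on a fixed pool; results keep the input order. */
    public List<Double> squareAll(List<Double> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (double x : inputs) {
                Callable<Double> task = () -> session.get().square(x);
                futures.add(pool.submit(task));
            }
            List<Double> results = new ArrayList<>();
            for (Future<Double> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        PerThreadSessions demo = new PerThreadSessions();
        List<Double> out = demo.squareAll(Arrays.asList(1.0, 2.0, 3.0, 4.0), 2);
        System.out.println(out + " computed with " + demo.sessionsCreated() + " session(s)");
    }
}
```

A real version would also need to stop each R process when its worker thread is retired, which this sketch leaves out.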

5 Monitoring the Output


RCaller receives the desired content as XML documents. The content is a list of the variables of interest which are manually created by the user or returned automatically by a function. Apart from the generated content, R produces some output to the standard output (stdout) and the standard error (stderr) devices. RCaller offers two options to handle these outputs. The first one is to save them in a text file. The other is to redirect all of the content to the standard output device. The example given below shows a conditional redirection of the outputs generated by R.
if(console){
 caller.redirectROutputToConsole();
}else{
 caller.redirectROutputToFile(
     "output.txt" /* filename */,
     true  /* append? */);
}

6 Conclusion


In addition to being statistical software, R is an extensible library with its internal functions and external packages. Since the R interpreter was written mostly in C, linking to custom C/C++ programs is relatively simple. Unfortunately, calling R functions from Java is not straightforward. The prominent methods use JNI and TCP sockets to solve this problem. In addition, Renjin offers a different perspective on this issue. It is a re-implementation of R in Java which is intended to be 100% compatible with the original. However, it is still under development and needs contributors. Finally, RCaller is an alternative way of calling R from Java. It is packaged in a single jar and does not require any setup beyond the one-time installation of the R package Runiversal. It supports loading external packages, calling functions, handling plots and debugging the output generated by R. It is not the most efficient method compared to the alternatives, but users report that the performance improvements in the latest revision and its simplicity of use make it an important tool in many applications.

References


[Commons Math Developers(2010)]   Commons Math Developers. Apache Commons Math, Release 2.1. Available from http://commons.apache.org/math/download_math.cgi, Apr. 2010. URL http://commons.apache.org/math.
[Harner et al.(2009)Harner, Luo, and Tan]   E. Harner, D. Luo, and J. Tan. JavaStat: A Java/R-based statistical computing environment. Computational Statistics, 24(2):295–302, May 2009.
[R Development Core Team(2011)]   R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2011. URL http://www.R-project.org/. ISBN 3-900051-07-0.
[RCaller Development Team(2011)]   RCaller Development Team. RCaller: A library for calling R from Java, 2011. URL http://code.google.com/p/rcaller.
[Satman(2010)]   M. H. Satman. Runiversal: A Package for converting R objects to Java variables and XML., 2010. URL http://CRAN.R-project.org/package=Runiversal. R package version 1.0.1.
[Urbanek(2009)]   S. Urbanek. How to talk to strangers: ways to leverage connectivity between R, Java and Objective C. Computational Statistics, 24(2):303–311, May 2009.
[Urbanek(2011a)]   S. Urbanek. rJava: Low-level R to Java interface, 2011a. URL http://CRAN.R-project.org/package=rJava. R package version 0.9-2.
[Urbanek(2011b)]   S. Urbanek. Rserve: Binary R server, 2011b. URL http://CRAN.R-project.org/package=Rserve. R package version 0.6-5.





A new research paper documenting RCaller is freely available at http://www.sciencedomain.org/abstract.php?iid=550&id=6&aid=4838#.U5YSoPmSy1Y