Datatype conversion

R and Java have quite different philosophies on datatypes, which means that lossless round tripping of data through the limited expressibility of the JNI bridge is a non-trivial exercise. This is why a code generation library is valuable: it removes the need for developers to understand the grim complexity of the details. The following areas proved to be the most difficult:

  • In R everything is a vector. In Java vectors are closest in nature to collections, either Lists or insertion ordered Maps.
  • In Java there is a fundamental difference between a single value and an array of values. In R they are interchangeable.
  • In R everything can be named. In Java data is not “named” as such unless it is in a Map.
  • Java distinguishes between primitive and boxed data types. R has no primitive type except raw bytes.
  • JNI only really natively supports primitive data types, and arrays thereof.
  • R expects data to be in vector structures containing the same data types, and thus operations on R vectors and dataframes are intrinsically column based operations. Java is generally written to operate on collections of objects which compose different data types into a single structure, and thus operations tend to be intrinsically row based.
  • R lists are complex un-typed structures with more similarity to JSON trees than to Java Collections.
  • R indexes are 1-based; Java indexes are 0-based.
  • R has typed NA values. Java null is un-typed.
  • Java void and R NULL are not exactly the same conceptually, as an R variable can be NULL but a Java variable can never be void. Boxed Void types in Java are the programming equivalent of Schrodinger’s cat.
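
The column-versus-row mismatch in the list above can be made concrete with a small, library-independent sketch. The `Reading` class and the `toColumns` helper are hypothetical names introduced only for illustration and are not part of the generated API; the sketch simply pivots a list of row objects into insertion-ordered named columns, which is the shape R expects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowToColumnSketch {

    // Hypothetical row type: one Java object per observation (row-based view).
    static final class Reading {
        final String group;
        final double value;

        Reading(String group, double value) {
            this.group = group;
            this.value = value;
        }
    }

    // Pivot row objects into named columns (column-based view).
    // A LinkedHashMap keeps the column order stable, as in an R dataframe.
    static Map<String, List<Object>> toColumns(List<Reading> rows) {
        Map<String, List<Object>> columns = new LinkedHashMap<>();
        columns.put("group", new ArrayList<>());
        columns.put("value", new ArrayList<>());
        for (Reading r : rows) {
            columns.get("group").add(r.group);
            columns.get("value").add(r.value);
        }
        return columns;
    }

    public static void main(String[] args) {
        List<Reading> rows = Arrays.asList(new Reading("A", 0.5), new Reading("B", 1.5));
        System.out.println(toColumns(rows)); // {group=[A, B], value=[0.5, 1.5]}
    }
}
```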

By inspecting the Java code at compilation time, and by imposing constraints on the types of data transferred between R and Java, it is possible to use a combination of convention, data-type coercion, and type checking to ensure that inputs to Java code from R are type safe and (almost) 100% faithful copies of their R equivalents, and vice versa. This differs in approach from the jsr223 package on CRAN, which performs dynamic data conversion from R types to generic Java data structures. That approach is harder to make consistent and requires a degree of introspection during the marshaling and un-marshaling process. Enforcing a rigid type system on the interface between R and Java allows a simpler transformation to be made, based on the strongly typed Java code at compile time, which should result in faster but less flexible data transfer.

Round tripping data from R to Java to R

R data types converted into Java.

Nulls and voids

Ensuring an R NULL value is correctly returned requires a placeholder class in Java. This is uk.co.terminological.rjava.types.RNull and it enforces some of the identity constraints. Java void types can be best represented as invisible(NULL) which is almost the same as not returning anything from a method.

    
    @RMethod
    public RNull bounceNull(RNull x) {
        log.info("java: "+x);
        System.out.println(x.rCode());
        return x;
    }
    
    @RMethod
    public void bounceVoid() {
        log.info("java void");
    }
    
# devtools::install("~/Git/r6-generator-docs/")
J = testRapi::JavaApi$get()
## Initialised testRapi
b = J$BounceTest$new()
## Tests the round tripping of supported datatypes
b$bounceNull(NULL)
## java: NULL
## NULL
b$bounceVoid()
## java void

Character data

Strings are the least contentious format, with high similarity between R and Java, so long as we are all using UTF-8. An R character vector can be transferred to Java seamlessly, except for the fact that it is transferred as an array of characters over JNI. A Java equivalent, uk.co.terminological.rjava.types.RCharacter, is provided to mark the object as an R compatible datatype, although in general java.lang.String can be used instead for singleton strings. For character vectors we have uk.co.terminological.rjava.types.RCharacterVector, which is a specialised Java collection of uk.co.terminological.rjava.types.RCharacter.

    
    @RMethod
    public String bounceString(String x) {
        log.info("java: "+x);
        return x;
    }
    
    
    @RMethod
    public RCharacter bounceCharacter(RCharacter x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
    @RMethod
    public RCharacterVector bounceCharacterVector(RCharacterVector x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
b$bounceString("Hello")
## java: Hello
## [1] "Hello"
b$bounceCharacter("Hello")
## java: Hello
## [1] "Hello"
b$bounceCharacterVector(c("Hello","World"))
## java: <rcharacter[2]>{Hello, World}
## [1] "Hello" "World"

Numeric data

Numerics in R can be represented in a number of ways in Java, with different degrees of precision. Various Java types, such as java.lang.Double, java.lang.Float, java.math.BigDecimal, java.lang.Long, and their primitive counterparts double, float, and long, are all best represented by R numeric values. The uk.co.terminological.rjava.types.RNumeric type allows Java programmers to dynamically convert input R numeric values to a specific native Java type. The uk.co.terminological.rjava.RConverter class provides various methods to convert native Java types to uk.co.terminological.rjava.types.RNumeric for output. These functions also handle the special values Inf, -Inf, NaN, and NA_real_ and their equivalent values in Java. To support the use of native Java, singleton RNumeric values can be substituted with primitive Java double in method signatures, as long as the inputs can never be NA_real_ and the code is not run asynchronously. Vector numeric inputs from R are specified by the uk.co.terminological.rjava.types.RNumericVector class, which is a collection of RNumeric values.

    
    @RMethod
    public double bounceDouble(double x) {
        log.info("java: "+x);
        return x;
    }
    
    @RMethod
    public RNumeric bounceNumeric(RNumeric x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
    @RMethod
    public RNumericVector bounceNumericVector(RNumericVector x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
b$bounceDouble(1.23)
## java: 1.23
## [1] 1.23
# throws an error as primitive double cannot be null
try(b$bounceDouble(NA_real_))
## Error in self$.api$.toJava$double(x) : 
##   cant use NA as input to java double
b$bounceNumeric(4.6)
## java: 4.60
## [1] 4.6
b$bounceNumeric(NA_real_)
## java: NA
## [1] NA
b$bounceNumeric(Inf)
## java: Infinity
## [1] Inf
b$bounceNumeric(-Inf)
## java: -Infinity
## [1] -Inf
b$bounceNumeric(0/0)
## java: NaN
## [1] NaN
# Wrongly typed input is coerced to numerics
b$bounceNumericVector(c(2L,5L,34L))
## java: <rnumeric[3]>{2.00, 5.00, 34.0}
## [1]  2  5 34
b$bounceNumericVector(c(2.3,4.6,NA_real_,34,NaN,Inf,-Inf))
## java: <rnumeric[7]>{2.30, 4.60, NA, 34.0, NaN, Infinity, -Infinity}
## [1]  2.3  4.6   NA 34.0  NaN  Inf -Inf
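
To see why a primitive double cannot faithfully carry NA_real_, it helps to look at how R itself encodes the value: NA_real_ is a NaN whose low 32 bits hold 1954, so it is distinguishable from an ordinary NaN only by its bit pattern. The sketch below is an illustration under that assumption and is not the library's implementation; `isNa` and `toBoxed` are hypothetical helper names.

```java
class NaRealSketch {

    // R encodes NA_real_ as a NaN whose low 32 bits hold the value 1954 (0x7A2).
    static final long NA_REAL_BITS = 0x7FF00000000007A2L;
    static final double NA_REAL = Double.longBitsToDouble(NA_REAL_BITS);

    // True only for R's NA payload, not for an ordinary NaN.
    // doubleToRawLongBits is required: doubleToLongBits collapses every NaN
    // to one canonical value and would discard the payload.
    static boolean isNa(double x) {
        return Double.isNaN(x) && (Double.doubleToRawLongBits(x) & 0xFFFFFFFFL) == 1954L;
    }

    // A boxed Double can represent NA as null; NaN and infinities pass through.
    static Double toBoxed(double x) {
        return isNa(x) ? null : x;
    }

    public static void main(String[] args) {
        System.out.println(isNa(NA_REAL));                   // NA payload detected
        System.out.println(isNa(Double.NaN));                // ordinary NaN is not NA
        System.out.println(toBoxed(NA_REAL));                // boxed NA is null
        System.out.println(toBoxed(Double.POSITIVE_INFINITY));
    }
}
```

Arithmetic on such a value in Java yields a NaN but gives no guarantee about preserving the payload, which is one reason a wrapper type that tracks NA explicitly is the safer design.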

Integer data

In an identical way to the above, uk.co.terminological.rjava.types.RInteger types hold integer values. R inputs to Java functions are coerced to integers where this is possible, so whole-valued numerics such as c(2,3,4) are accepted. Non-integer numeric values are not coerced to integer; instead a runtime error is thrown to the user if they try to use the wrong type. The primitive int Java type can be used for singletons if they are not NA, and vectors are handled in the same way as before with a specialist uk.co.terminological.rjava.types.RIntegerVector.

    
    @RMethod
    public int bounceInt(int x) {
        log.info("java: "+x);
        return x;
    }
    
    @RMethod
    public RInteger bounceInteger(RInteger x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
    @RMethod
    public RIntegerVector bounceIntegerVector(RIntegerVector x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
b$bounceInt(1)
## java: 1
## [1] 1
b$bounceInteger(3)
## java: 3
## [1] 3
b$bounceIntegerVector(c(2L,3L,4L))
## java: <rinteger[3]>{2, 3, 4}
## [1] 2 3 4
b$bounceIntegerVector(c(2,3,4))
## java: <rinteger[3]>{2, 3, 4}
## [1] 2 3 4
b$bounceIntegerVector(c(2L,NA,4L))
## java: <rinteger[3]>{2, NA, 4}
## [1]  2 NA  4
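
The coercion rule for integers can be sketched in plain Java. This is a minimal illustration, not the generated code: `toIntStrict` and `fromR` are hypothetical names, and the only fact relied on about R is that NA_integer_ is stored as the smallest 32 bit integer.

```java
class IntegerCoercionSketch {

    // R stores NA_integer_ as the smallest 32 bit integer.
    static final int NA_INTEGER = Integer.MIN_VALUE;

    // Coerce an R numeric to an int only when no information is lost;
    // anything with a fractional part, or out of range, is rejected.
    static int toIntStrict(double x) {
        if (Double.isNaN(x) || Double.isInfinite(x) || x != Math.rint(x)) {
            throw new IllegalArgumentException("not a whole number: " + x);
        }
        // Integer.MIN_VALUE itself is excluded because it is reserved for NA.
        if (x <= Integer.MIN_VALUE || x > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("out of integer range: " + x);
        }
        return (int) x;
    }

    // Map the raw R encoding onto a nullable boxed Integer.
    static Integer fromR(int raw) {
        return raw == NA_INTEGER ? null : raw;
    }

    public static void main(String[] args) {
        System.out.println(toIntStrict(3.0));   // 3
        System.out.println(fromR(NA_INTEGER));  // null
        try {
            toIntStrict(2.5);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```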

Factors

Factors are somewhat complicated, as an individual factor only makes sense in the context of a vector of possible options. However, the uk.co.terminological.rjava.types.RFactor type and uk.co.terminological.rjava.types.RFactorVector collection allow information to be retained about the values and labels of R factors. uk.co.terminological.rjava.RConverter provides support for mapping R factors to Java Enum classes, and for creating R ordered factors from Enums.

    
    @RMethod
    public RFactor bounceFactor(RFactor x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
    @RMethod
    public RFactorVector bounceFactorVector(RFactorVector x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
factorVec = as.factor(c("a","b","c","b","a"))
b$bounceFactor(as.factor(factorVec[[1]]))
## java: a
## [1] "a"
b$bounceFactorVector(factorVec)
## java: <rfactor[5]>{a, b, c, b, a}
## [1] a b c b a
## Levels: a < b < c
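
The relationship between a Java Enum and an R ordered factor can be sketched without the library. In this hypothetical example (the `Size` enum and the `levels`, `code` and `fromCode` helpers are invented for illustration), the enum constants supply the levels in declaration order, and the 0-based Java ordinal is shifted to the 1-based integer code that R uses.

```java
import java.util.ArrayList;
import java.util.List;

class EnumFactorSketch {

    // Hypothetical enum standing in for the levels of an ordered factor.
    enum Size { SMALL, MEDIUM, LARGE }

    // Level labels in declaration order, which gives the factor its ordering.
    static <E extends Enum<E>> List<String> levels(Class<E> type) {
        List<String> out = new ArrayList<>();
        for (E constant : type.getEnumConstants()) {
            out.add(constant.name());
        }
        return out;
    }

    // R factor codes are 1-based positions in the levels; Java ordinals are 0-based.
    // A null value is mapped to R's NA_integer_ encoding (Integer.MIN_VALUE).
    static int code(Enum<?> value) {
        return value == null ? Integer.MIN_VALUE : value.ordinal() + 1;
    }

    // Reverse mapping from an R code to the enum constant, or null for NA.
    static <E extends Enum<E>> E fromCode(Class<E> type, int code) {
        return code == Integer.MIN_VALUE ? null : type.getEnumConstants()[code - 1];
    }

    public static void main(String[] args) {
        System.out.println(levels(Size.class));      // [SMALL, MEDIUM, LARGE]
        System.out.println(code(Size.LARGE));        // 3
        System.out.println(fromCode(Size.class, 2)); // MEDIUM
    }
}
```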

Dates

Date support is provided by uk.co.terminological.rjava.types.RDate, which allows R Date and POSIXt types to be represented in Java as java.time.LocalDate values. Vectors of dates are supported as before, through the uk.co.terminological.rjava.types.RDateVector class. Dates earlier than 0001-01-01 are converted to NA. There is currently no support for datetime classes, but this is a possible enhancement.

    
    @RMethod
    public RDate bounceDate(RDate x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
    @RMethod
    public RDateVector bounceDateVector(RDateVector x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
b$bounceDate(as.Date("2001-02-03"))
## java: 2001-02-03
## [1] "2001-02-03"
class(b$bounceDate(as.Date("2001-02-03")))
## java: 2001-02-03
## [1] "Date"
b$bounceDateVector(as.Date(c("2001-02-03","2001-02-04","2001-02-05")))
## java: <rdate[3]>{2001-02-03, 2001-02-04, 2001-02-05}
## [1] "2001-02-03" "2001-02-04" "2001-02-05"
b$bounceDateVector(as.Date(c("2001-02-03",NA,"2001-02-05")))
## java: <rdate[3]>{2001-02-03, NA, 2001-02-05}
## [1] "2001-02-03" NA           "2001-02-05"
b$bounceDate(as.Date("0000-01-01")-5)
## dates smaller than 0001-01-01 will be converted to NA
## java: NA
## [1] NA
b$bounceDateVector(as.Date(c("0000-01-01","1-01-01","11-01-01","101-01-01","1001-01-01")))
## dates smaller than 0001-01-01 will be converted to NA
## java: <rdate[5]>{NA, 0001-01-01, 0011-01-01, 0101-01-01, 1001-01-01}
## [1] NA           "1-01-01"    "11-01-01"   "101-01-01"  "1001-01-01"
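
An R Date is stored as a number of days since 1970-01-01, which lines up directly with java.time.LocalDate epoch days. The sketch below shows one way such a conversion could look, including the lower bound seen in the output above; `fromRDate` and `toRDate` are hypothetical helpers rather than the RDate implementation, and null stands in for NA.

```java
import java.time.LocalDate;

class DateSketch {

    // Earliest date kept; anything before it becomes NA (null here).
    static final LocalDate MIN_SUPPORTED = LocalDate.of(1, 1, 1);

    // R Date (days since 1970-01-01, possibly NA) to a nullable LocalDate.
    static LocalDate fromRDate(double daysSinceEpoch) {
        if (Double.isNaN(daysSinceEpoch) || Double.isInfinite(daysSinceEpoch)) {
            return null;
        }
        LocalDate date = LocalDate.ofEpochDay((long) Math.floor(daysSinceEpoch));
        return date.isBefore(MIN_SUPPORTED) ? null : date;
    }

    // Nullable LocalDate back to days since 1970-01-01, or null for NA.
    static Double toRDate(LocalDate date) {
        return date == null ? null : (double) date.toEpochDay();
    }

    public static void main(String[] args) {
        System.out.println(fromRDate(11356));                // 2001-02-03
        System.out.println(toRDate(LocalDate.of(1970, 1, 1))); // 0.0
        System.out.println(fromRDate(-800000));              // null
    }
}
```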

Logicals

R logicals are mapped to uk.co.terminological.rjava.types.RLogical objects which can represent NA_logical_ values faithfully. If NA values are not needed then primitive boolean types can be substituted as before, and vectors work as before.

    
    
    @RMethod
    public RLogical bounceLogical(RLogical x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
    @RMethod
    public RLogicalVector bounceLogicalVector(RLogicalVector x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        return x;
    }
    
    @RMethod
    public RFile bounceFile(RFile x) {
        log.info("java: "+x);
        return x;
    }
    
b$bounceLogical(TRUE)
## java: true
## [1] TRUE
b$bounceLogicalVector(c(TRUE,TRUE,FALSE))
## java: <rlogical[3]>{true, true, false}
## [1]  TRUE  TRUE FALSE
b$bounceLogicalVector(c(TRUE,NA,FALSE))
## java: <rlogical[3]>{true, NA, false}
## [1]  TRUE    NA FALSE
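
R logicals are three-valued, which is why a primitive boolean is only a safe substitute when NA cannot occur. As a rough sketch (the helper names are hypothetical and this is not the RLogical implementation), R's internal encoding of 0, 1 and NA as integers can be mapped onto a nullable Boolean, and R's NA-aware `&` can be reproduced on top of it.

```java
class LogicalSketch {

    // R stores logicals as integers, with NA sharing the NA_integer_ encoding.
    static final int NA_LOGICAL = Integer.MIN_VALUE;

    // Raw R encoding to a nullable Boolean (null stands for NA).
    static Boolean fromR(int raw) {
        if (raw == NA_LOGICAL) return null;
        return raw != 0;
    }

    // Nullable Boolean back to the raw R encoding.
    static int toR(Boolean value) {
        if (value == null) return NA_LOGICAL;
        return value ? 1 : 0;
    }

    // Three-valued AND as in R: FALSE & NA is FALSE, TRUE & NA is NA.
    static Boolean and(Boolean a, Boolean b) {
        if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) return false;
        if (a == null || b == null) return null;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(fromR(1));          // true
        System.out.println(fromR(NA_LOGICAL)); // null
        System.out.println(and(false, null));  // false
        System.out.println(and(true, null));   // null
    }
}
```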

Files

Files in R need to be converted to absolute paths before they can be used in Java. Tilde path expansion is also needed to correctly pick up the user's home directory. Relative paths are considered to be relative to whatever the current working directory is in R at the time the function is called. Because these conversions happen before the path reaches Java, all of these edge cases can be ignored in Java code. However, we do not enforce that the parent directory must exist; that is up to the Java developer.

b$bounceFile("~/tmp/test1")
## java: /home/vp22681/tmp/test1
## /home/vp22681/tmp/test1
b$bounceFile("../tmp/test2")
## java: /home/vp22681/Dropbox/Git/r6-generator-docs/tmp/test2
## /home/vp22681/Dropbox/Git/r6-generator-docs/tmp/test2
b$bounceFile("tmp/test3")
## java: /home/vp22681/Dropbox/Git/r6-generator-docs/vignettes/tmp/test3
## /home/vp22681/Dropbox/Git/r6-generator-docs/vignettes/tmp/test3
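
The normalisation described here (tilde expansion, resolution against the working directory, and collapsing of `..`) is applied before the path reaches Java, but the same logic can be sketched with java.nio for readers who want to see what it involves. The `normalise` helper and the directory names are hypothetical, and the example assumes Unix-style paths.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathSketch {

    // Expand a leading tilde, resolve relative paths against the given
    // working directory, and collapse "." and ".." segments.
    // No check is made that the file or its parent directory exists.
    static Path normalise(String path, String home, String workingDir) {
        String expanded = path;
        if (path.equals("~")) {
            expanded = home;
        } else if (path.startsWith("~/")) {
            expanded = home + path.substring(1);
        }
        // resolve() returns its argument unchanged when it is already absolute.
        return Paths.get(workingDir).resolve(expanded).normalize();
    }

    public static void main(String[] args) {
        String home = "/home/user";
        String wd = "/work/docs/vignettes";
        System.out.println(normalise("~/tmp/test1", home, wd));
        System.out.println(normalise("../tmp/test2", home, wd));
        System.out.println(normalise("tmp/test3", home, wd));
    }
}
```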

Dataframes

In Java, R dataframes are modelled as a named list of uk.co.terminological.rjava.types.RVector<?>, each holding columnar data of unspecified type. This is represented internally as a column-wise Map, but the uk.co.terminological.rjava.types.RDataframe class contains a number of methods to make using dataframes intuitive in Java. These include support for the Iterable and Stream interfaces, operating row-wise over the data. The uk.co.terminological.rjava.types.RBoundDataframe can map typed columns to a stream of proxy objects satisfying an interface specification. This can be used to convert an RDataframe into a stream of custom POJOs (more examples TBD). The dataframe can support any column with the vector data types mentioned above. At present, however, it does not support named rows, as the focus is on tidy dataframes, nor does it support purrr style list columns.

    
    @RMethod
    public RDataframe bounceDataframe(RDataframe x) {
        log.info("java: "+x);
        //System.out.println(x.rCode());
        //System.out.println(x.rConversion());
        return x;
    }
    
testDf = tibble::tibble(
  grp = c("A","A","A","B","B","B"),
  x=c(0,1,2,4,5,6),
  y=c(3L,2L,1L,-1L,-2L,-3L)
)
testDf = dplyr::group_by(testDf,grp,x)

b$bounceDataframe(testDf)
## java: groups: [grp, x]
## grp: <rcharacter[6]>{A, A, A, B, B, B}
## x: <rnumeric[6]>{0.00, 1.00, 2.00, 4.00, 5.00, 6.00}
## y: <rinteger[6]>{3, 2, 1, -1, -2, -3}
## # A tibble: 6 × 3
## # Groups:   grp, x [6]
##   grp       x     y
##   <chr> <dbl> <int>
## 1 A         0     3
## 2 A         1     2
## 3 A         2     1
## 4 B         4    -1
## 5 B         5    -2
## 6 B         6    -3
b$bounceDataframe(tibble::tibble(
  u=factorVec[1:3],
  v=c(TRUE,NA,FALSE),
  w=c("alpha",NA,"gamma"),
  x=c(0,1,2),
  y=c(3L,2L,1L),
  z=as.Date(c("2001-02-03",NA,"2001-02-05"))
))
## java: groups: []
## u: <rfactor[3]>{a, b, c}
## v: <rlogical[3]>{true, NA, false}
## w: <rcharacter[3]>{alpha, NA, gamma}
## x: <rnumeric[3]>{0.00, 1.00, 2.00}
## y: <rinteger[3]>{3, 2, 1}
## z: <rdate[3]>{2001-02-03, NA, 2001-02-05}
## # A tibble: 3 × 6
##   u     v     w         x     y z         
##   <ord> <lgl> <chr> <dbl> <int> <date>    
## 1 a     TRUE  alpha     0     3 2001-02-03
## 2 b     NA    NA        1     2 NA        
## 3 c     FALSE gamma     2     1 2001-02-05
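
The idea of a column-wise Map with a row-wise view can be illustrated with a short standalone sketch. `ColumnarFrameSketch` is a hypothetical class written for this example and is far simpler than RDataframe; it stores equal-length columns in insertion order and materialises each row lazily when the stream is consumed.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ColumnarFrameSketch {

    // Column name to column values, in insertion order.
    private final Map<String, List<?>> columns = new LinkedHashMap<>();

    // Add a column; every column must have the same length.
    ColumnarFrameSketch withCol(String name, List<?> values) {
        if (!columns.isEmpty() && values.size() != nrow()) {
            throw new IllegalArgumentException("column length mismatch: " + name);
        }
        columns.put(name, values);
        return this;
    }

    int nrow() {
        return columns.isEmpty() ? 0 : columns.values().iterator().next().size();
    }

    // Row-wise view: each row is built on demand as a name to value map.
    Stream<Map<String, Object>> rows() {
        return IntStream.range(0, nrow()).mapToObj(i -> {
            Map<String, Object> row = new LinkedHashMap<>();
            for (Map.Entry<String, List<?>> column : columns.entrySet()) {
                row.put(column.getKey(), column.getValue().get(i));
            }
            return row;
        });
    }

    public static void main(String[] args) {
        new ColumnarFrameSketch()
            .withCol("grp", Arrays.asList("A", "B"))
            .withCol("x", Arrays.asList(0.0, 1.0))
            .rows()
            .forEach(System.out::println); // {grp=A, x=0.0} then {grp=B, x=1.0}
    }
}
```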

Arrays

test2DArr = matrix(seq(0,9.9,0.1),c(10,10))
b$bounceArray(test2DArr)
## java: <rnumeric[10]>{
## <rnumeric[10]>{0.00, 0.100, 0.200, 0.300, 0.400, 0.500, 0.600, 0.700, 0.800, 0.900},
## <rnumeric[10]>{1.00, 1.10, 1.20, 1.30, 1.40, 1.50, 1.60, 1.70, 1.80, 1.90},
## <rnumeric[10]>{2.00, 2.10, 2.20, 2.30, 2.40, 2.50, 2.60, 2.70, 2.80, 2.90},
## <rnumeric[10]>{3.00, 3.10, 3.20, 3.30, 3.40, 3.50, 3.60, 3.70, 3.80, 3.90},
## <rnumeric[10]>{4.00, 4.10, 4.20, 4.30, 4.40, 4.50, 4.60, 4.70, 4.80, 4.90},
## <rnumeric[10]>{5.00, 5.10, 5.20, 5.30, 5.40, 5.50, 5.60, 5.70, 5.80, 5.90},
## <rnumeric[10]>{6.00, 6.10, 6.20, 6.30, 6.40, 6.50, 6.60, 6.70, 6.80, 6.90},
## <rnumeric[10]>{7.00, 7.10, 7.20, 7.30, 7.40, 7.50, 7.60, 7.70, 7.80, 7.90},
## <rnumeric[10]>{8.00, 8.10, 8.20, 8.30, 8.40, 8.50, 8.60, 8.70, 8.80, 8.90},
## <rnumeric[10]>{9.00, 9.10, 9.20, 9.30, 9.40, 9.50, 9.60, 9.70, 9.80, 9.90}
## }
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
##  [1,]  0.0  1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0   9.0
##  [2,]  0.1  1.1  2.1  3.1  4.1  5.1  6.1  7.1  8.1   9.1
##  [3,]  0.2  1.2  2.2  3.2  4.2  5.2  6.2  7.2  8.2   9.2
##  [4,]  0.3  1.3  2.3  3.3  4.3  5.3  6.3  7.3  8.3   9.3
##  [5,]  0.4  1.4  2.4  3.4  4.4  5.4  6.4  7.4  8.4   9.4
##  [6,]  0.5  1.5  2.5  3.5  4.5  5.5  6.5  7.5  8.5   9.5
##  [7,]  0.6  1.6  2.6  3.6  4.6  5.6  6.6  7.6  8.6   9.6
##  [8,]  0.7  1.7  2.7  3.7  4.7  5.7  6.7  7.7  8.7   9.7
##  [9,]  0.8  1.8  2.8  3.8  4.8  5.8  6.8  7.8  8.8   9.8
## [10,]  0.9  1.9  2.9  3.9  4.9  5.9  6.9  7.9  8.9   9.9
test3DArr = array(seq(0,63),c(8,4,2))
b$bounceArray(test3DArr)
## java: <rnumeric[2]>{
## <rnumeric[4]>{
## <rnumeric[8]>{0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00},
## <rnumeric[8]>{8.00, 9.00, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0},
## <rnumeric[8]>{16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0},
## <rnumeric[8]>{24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0}
## },
## <rnumeric[4]>{
## <rnumeric[8]>{32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0},
## <rnumeric[8]>{40.0, 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0},
## <rnumeric[8]>{48.0, 49.0, 50.0, 51.0, 52.0, 53.0, 54.0, 55.0},
## <rnumeric[8]>{56.0, 57.0, 58.0, 59.0, 60.0, 61.0, 62.0, 63.0}
## }
## }
## , , 1
## 
##      [,1] [,2] [,3] [,4]
## [1,]    0    8   16   24
## [2,]    1    9   17   25
## [3,]    2   10   18   26
## [4,]    3   11   19   27
## [5,]    4   12   20   28
## [6,]    5   13   21   29
## [7,]    6   14   22   30
## [8,]    7   15   23   31
## 
## , , 2
## 
##      [,1] [,2] [,3] [,4]
## [1,]   32   40   48   56
## [2,]   33   41   49   57
## [3,]   34   42   50   58
## [4,]   35   43   51   59
## [5,]   36   44   52   60
## [6,]   37   45   53   61
## [7,]   38   46   54   62
## [8,]   39   47   55   63

Lists

In R, lists and named lists are complex objects with optionally named sequences of arbitrarily typed data. They are analogous to JSON objects, and it is tempting to serialise all R lists to JSON and use a JSON library to interpret them in Java. This would be possible but would lose some of the support built into the R classes mentioned above. As such we took a hybrid approach, where R lists and named lists are dynamically and recursively mapped to collection types from R to Java, and exported back from Java to R serialised as a character string containing R code, which is evaluated by the R interpreter. Despite being somewhat hacky, this does a surprisingly good job of transferring lists from R to Java and back to R faithfully. However, it is probably not well suited to very large lists, and it definitely could not support lists that have cyclical structures in the object graph. To support fluent use of R lists in Java, all classes that derive from RObject support the visitor pattern, which can be used to select out datatypes of interest relatively simply. Support for an XPath-like syntax to access specific elements of nested lists is planned.

    
    @RMethod
    public RList bounceList(RList x) {
        log.info("java: "+x);
        System.out.println(x.rCode());
        return x;
    }
    
    @RMethod
    public RNamedList bounceNamedList(RNamedList x) {
        log.info("java: "+x);
        System.out.println(x.rCode());
        return x;
    }
    
b$bounceList(list("a","b","c",c(1,2,3)))
## java: <rlist>{
## a,
## b,
## c,
## <rnumeric[3]>{1.00, 2.00, 3.00}
## }
## list('a', 'b', 'c', c(1.0, 2.0, 3.0))
## [[1]]
## [1] "a"
## 
## [[2]]
## [1] "b"
## 
## [[3]]
## [1] "c"
## 
## [[4]]
## [1] 1 2 3
input = list("a",list("b",1,"z"),"c",c(1,2,3))
output = b$bounceList(input)
## java: <rlist>{
## a,
## <rlist>{
## b,
## 1.00,
## z
## },
## c,
## <rnumeric[3]>{1.00, 2.00, 3.00}
## }
## list('a', list('b', 1.0, 'z'), 'c', c(1.0, 2.0, 3.0))
if (!identical(input,output)) stop("FAIL")

b$bounceNamedList(list(x="a",b=c("a",NA,"c"),c=1))
## java: <rnamedlist>{
## x: a,
## b: <rcharacter[3]>{a, NA, c},
## c: 1.00
## }
## list(x='a', b=c('a', NA, 'c'), c=1.0)
## $x
## [1] "a"
## 
## $b
## [1] "a" NA  "c"
## 
## $c
## [1] 1
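
The Java to R direction, serialising a nested structure as R code, can be sketched with ordinary Java collections. This is an illustration of the technique only: `rCode` here is a hypothetical static helper, not the library's method, and it assumes list names are syntactically valid R names and that the structure contains no cycles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

class RCodeSketch {

    // Recursively render a Java value as R source code.
    static String rCode(Object value) {
        if (value == null) return "NULL";
        if (value instanceof String) return quote((String) value);
        if (value instanceof Double) return number((Double) value);
        if (value instanceof Boolean) return ((Boolean) value) ? "TRUE" : "FALSE";
        if (value instanceof double[]) {
            List<String> parts = new ArrayList<>();
            for (double d : (double[]) value) parts.add(number(d));
            return "c(" + String.join(", ", parts) + ")";
        }
        if (value instanceof Map) {
            // Named list; keys are assumed to be valid R names.
            List<String> parts = new ArrayList<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                parts.add(e.getKey() + "=" + rCode(e.getValue()));
            }
            return "list(" + String.join(", ", parts) + ")";
        }
        if (value instanceof List) {
            List<String> parts = new ArrayList<>();
            for (Object element : (List<?>) value) parts.add(rCode(element));
            return "list(" + String.join(", ", parts) + ")";
        }
        throw new IllegalArgumentException("unsupported type: " + value.getClass());
    }

    // R spells the special values differently from Java.
    static String number(double d) {
        if (Double.isNaN(d)) return "NaN";
        if (Double.isInfinite(d)) return d > 0 ? "Inf" : "-Inf";
        return Double.toString(d);
    }

    // Single-quoted R string with backslashes and quotes escaped.
    static String quote(String s) {
        return "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    public static void main(String[] args) {
        List<Object> nested = Arrays.asList(
            "a", Arrays.asList("b", 1.0, "z"), "c", new double[] {1.0, 2.0, 3.0});
        // list('a', list('b', 1.0, 'z'), 'c', c(1.0, 2.0, 3.0))
        System.out.println(rCode(nested));
    }
}
```

Because the recursion follows every reference it meets, a list that contains itself would never terminate, which matches the limitation on cyclical structures noted above.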

Generating data in Java

So far we have concentrated on the use case of transferring data from R to Java and back again. However, we also wish to be able to rapidly create data in Java that is going to be faithfully preserved in R. To this end we have created a number of type converters, builder functions, and collectors that help to marshal native Java data into RObjects.

Primitive equivalents

The RPrimitive interface possesses a range of factory methods to generate appropriately typed RPrimitives from Java primitives, boxed types, and Enums.

    
    @RMethod
    public RCharacter generateCharacter() {
        return RPrimitive.of("Hello world");
    }
    
    @RMethod
    public RNumeric generateNumeric() {
        return RPrimitive.of(123.0);
    }
    
    @RMethod
    public RInteger generateInteger() {
        return RPrimitive.of(345);
    }
    
    public static enum Test {
        ONE,TWO,THREE
    }
    
    @RMethod
    public RFactor generateFactor() {
        return RPrimitive.of(Test.ONE);
    }
    
    @RMethod
    public RLogical generateLogical() {
        return RPrimitive.of(true);
    }
    
g = J$FactoryTest$new()
## Tests the java creation of supported datatypes
g$generateCharacter()
## [1] "Hello world"
g$generateNumeric()
## [1] 123
g$generateInteger()
## [1] 345
g$generateFactor()
## [1] "ONE"
g$generateLogical()
## [1] TRUE

Vectors

Similarly the RVector class supports a range of fluent builder methods which allow de novo creation of correctly typed RVectors. The RConverter class also provides a range of collectors that facilitate mapping Java Streams to R Vectors.

    
    @RMethod
    public RCharacterVector generateCharacterVec() {
        return RVector.with("Hello world","Ola el mundo","Bonjour le monde", null);
    }
    
    @RMethod
    public RNumericVector generateNumericVec() {
        return DoubleStream
                .of(3.0, 4.3, 2.1, 2.3).boxed()
                .collect(RConverter.doubleCollector());
    }
    
    @RMethod
    public RIntegerVector generateIntegerVec() {
        return RVector.with(345, 678, null, 89);
    }
    
    @RMethod
    public RFactorVector generateFactorVec() {
        return RVector.with(Test.ONE,Test.THREE,null,Test.TWO);
    }
    
    @RMethod
    public RLogicalVector generateLogicalVec() {
        return RVector.with(true,false,null);
    }
    
g$generateFactorVec()
## [1] ONE   THREE <NA>  TWO  
## Levels: ONE < TWO < THREE
g$generateIntegerVec() 
## [1] 345 678  NA  89
g$generateCharacterVec()
## [1] "Hello world"      "Ola el mundo"     "Bonjour le monde" NA
g$generateNumericVec()
## [1] 3.0 4.3 2.1 2.3
g$generateLogicalVec()
## [1]  TRUE FALSE    NA

Dataframes and lists

Both RDataframe and RList classes implement fluent methods to allow the creation of complex data structures in a manner familiar to Java programmers. Again, RConverter provides specialised collectors to map a stream of objects, each representing a row of data, to the columnar format of the RDataframe. The mapping from object to dataframe column is defined by MapRules, specified using functional lambda syntax.

    
    @RMethod
    public RDataframe generateDataframe() {
        return RDataframe.create()
            .withCol("A", RVector.with(3.0, 4.3, 2.1))
            .withCol("B", RVector.with(Test.ONE,Test.THREE,Test.TWO))
            .withCol("C", RVector.with("Hello world","Ola el mundo","Bonjour le monde"));
    }
    
    @RMethod
    public RDataframe generateStreamDataframe() {
        return 
            Arrays.asList("Hello","World","Stream","Support","in","Java")
            .stream()
            .collect(RConverter.dataframeCollector(
                RConverter.mapping("original", s-> s),
                RConverter.mapping("lowercase", s-> s.toLowerCase()),
                RConverter.mapping("uppercase", s-> s.toUpperCase()),
                RConverter.mapping("subst", s-> s.substring(0,Math.min(3,s.length()))),
                RConverter.mapping("length", s-> s.length())
            ));
        
    }
    
g$generateDataframe()
## # A tibble: 3 × 3
##       A B     C               
##   <dbl> <ord> <chr>           
## 1   3   ONE   Hello world     
## 2   4.3 THREE Ola el mundo    
## 3   2.1 TWO   Bonjour le monde
g$generateStreamDataframe()
## # A tibble: 6 × 5
##   original lowercase uppercase subst length
##   <chr>    <chr>     <chr>     <chr>  <int>
## 1 Hello    hello     HELLO     Hel        5
## 2 World    world     WORLD     Wor        5
## 3 Stream   stream    STREAM    Str        6
## 4 Support  support   SUPPORT   Sup        7
## 5 in       in        IN        in         2
## 6 Java     java      JAVA      Jav        4
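
The mechanics behind a stream-to-columns collector can be sketched with the standard java.util.stream.Collector API. The `Rule` class and the `mapping` and `columnCollector` methods below are hypothetical stand-ins written for this example; they mirror the shape of the calls above but make no claim about how RConverter is implemented, and they collect into plain lists rather than typed R vectors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Stream;

class ColumnCollectorSketch {

    // A named rule that extracts one column value from a row object.
    static final class Rule<T> {
        final String name;
        final Function<T, Object> extractor;

        Rule(String name, Function<T, Object> extractor) {
            this.name = name;
            this.extractor = extractor;
        }
    }

    static <T> Rule<T> mapping(String name, Function<T, Object> extractor) {
        return new Rule<>(name, extractor);
    }

    // Collect a stream of row objects into insertion-ordered named columns.
    @SafeVarargs
    static <T> Collector<T, ?, Map<String, List<Object>>> columnCollector(Rule<T>... rules) {
        return Collector.<T, Map<String, List<Object>>>of(
            () -> {
                // One empty column per rule.
                Map<String, List<Object>> cols = new LinkedHashMap<>();
                for (Rule<T> rule : rules) cols.put(rule.name, new ArrayList<>());
                return cols;
            },
            (cols, row) -> {
                // Append one value to every column for each incoming row.
                for (Rule<T> rule : rules) cols.get(rule.name).add(rule.extractor.apply(row));
            },
            (left, right) -> {
                // Merge partial results from parallel streams column by column.
                for (Rule<T> rule : rules) left.get(rule.name).addAll(right.get(rule.name));
                return left;
            });
    }

    public static void main(String[] args) {
        Map<String, List<Object>> columns = Stream.of("Hello", "in")
            .collect(columnCollector(
                mapping("original", (String s) -> s),
                mapping("length", (String s) -> s.length())));
        System.out.println(columns); // {original=[Hello, in], length=[5, 2]}
    }
}
```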

Finally, a note on list generation: Enum values placed in lists in Java cannot always be converted to factors in R. If this is not possible, the conversion falls back to a character string holding the label of the factor value.

    
    /**
     * Lists are much harder to type check than vectors hence RList builder methods throw checked exceptions 
     * @return a RList containing the supplied Java objects converted into RObjects
     * @throws UnconvertableTypeException if those objects are not themselves or cannot be converted into an RObject 
     */
    @RMethod
    public RList generateList() throws UnconvertableTypeException {
        return RList.withRaw("one", Test.TWO, 3.0);
    }
    
    @RMethod
    public RNamedList generateNamedList() throws UnconvertableTypeException {
        return RNamedList
                .withRaw("A","one")
                .andRaw("B", Test.TWO)
                .andRaw("C", RVector.with(3.0, 4.3, 2.1));
    }
    
g$generateList()
## [[1]]
## [1] "one"
## 
## [[2]]
## [1] "TWO"
## 
## [[3]]
## [1] 3
g$generateNamedList()
## $A
## [1] "one"
## 
## $B
## [1] "TWO"
## 
## $C
## [1] 3.0 4.3 2.1