Alexandre DuBreuil is a French Canadian software developer living in France and working as a Java Architect at LesFurets.com. He enjoys coding and talking about code at meetups, user groups and conferences.
Apache Spark offers a Java API as a first-class citizen, but is it as powerful as the Scala API? Does it use every feature of the language, such as lambdas? Does it integrate properly with our unit test tooling and existing Java code base? We will dive into the Spark Java API through examples and live coding from our code base, covering basic usage and dependency management, unit testing with JUnit, launching from an IDE, and integrating Spark code with our existing Java code base. Since Spark 2.0, the unified DataFrame API has made Spark easier to use and faster to execute in Java, but there is still little documentation on specific use cases, and many syntax quirks make Scala code difficult to convert to Java. The slides and live coding will present the good, the bad and the ugly moments our Java development team encountered while using Spark.