Refactoring to Eclipse Collections with Java 25 at the dev2next Conference
Showing what makes Java great after 30 years is the vibrant OSS ecosystem
This blog will show you how Vlad and I live-refactored a single test case with nine method category unit tests at dev2next 2025. The test starts off passing using the built-in JDK Collections and Streams. We refactored it live in front of an audience to use Eclipse Collections. I will be refactoring the same test case as I write this blog, and explaining different lessons learned along the way. You can follow along as I refactor the code here, or accomplish this on your own by starting with the pre-refactored code available on GitHub. Here are the slides we used for the talk, available on GitHub.
Note: A Decade as OSS at the Eclipse Foundation
The Eclipse Collections library has been available in open source since December 2015, managed as a project at the Eclipse Foundation. Prior to that the GS Collections library, which was the Eclipse Collections predecessor, was open sourced in January 2012. That will be 14 years total in open source at the end of this year.
I have been conditioned for the past decade to start all conversations about Eclipse Collections with a statement that should be obvious, but unfortunately isn’t. You do not need to use the Eclipse IDE or any other IDE to use Eclipse Collections. Eclipse Collections is a standalone open source collections library for Java. See the following blog for more details.
Now that the preamble is out of the way, let’s continue.
The Idea of Refactoring to Eclipse Collections
The idea of “Refactoring to Eclipse Collections” started out as an article by Kristen O’Leary and Vlad in June 2018. The two Goldman Sachs alumni wrote the following article for InfoQ.
Kristen and Vlad didn’t know it at the time, but they recognized something fundamentally important in this article that I would go on to use to organize the chapters of my book “Eclipse Collections Categorically”: Method Categories.
You can see where Vlad and Kristen organized the methods in Eclipse Collections into Method Categories in their article.
Neither Vlad, Kristen, nor I understood at the time this article was written, or even over the past seven years, how important the idea of grouping methods by method category would be for me when I wrote “Eclipse Collections Categorically.” When I wrote the book, I didn’t appreciate that Kristen and Vlad had a similar basic idea in their article. The book took this idea to its natural conclusion: Method Categories are a fundamentally missing feature in Java and most other file-based programming languages. This feature needs to be added to Java and other languages for developers to be able to better organize their APIs both in the IDE and in documentation (e.g. Javadoc).
Read on to learn more.
Refactoring to Eclipse Collections, Revisited
Vlad approached me with the idea of submitting a talk to dev2next on “Refactoring to Eclipse Collections”, and I agreed.
When the talk was accepted, I thought it would be good to revise the code examples with a familiar domain concept that I had used in my book: Generation. As Java 25 was released a couple of weeks before the talk, I upgraded the code examples to use Java 25 with and without Compact Object Headers (JEP 519) enabled. You can find some memory comparison charts in the slide deck linked above.
All of the code examples for Refactoring to Eclipse Collections can be found in the following GitHub repo.
Generation Alpha to the Rescue
Everything we have done in the past decade in Java has become a part of the history of Generation Alpha. We don’t hear much about Generation Alpha, because no one from this generation has graduated from high school yet. The beginning of Generation Alpha was 2013, which means no one in Generation Alpha will remember a time before Java had support for concise lambda expressions. Lambdas arrived in March 2014, with the release of Java 8.
Below is the full code for the Generation enum that Vlad and I used in our talk at dev2next 2025. This Java enum is somewhat similar to the Generation enum I use in my book, “Eclipse Collections Categorically.”
package refactortoec.generation;
import java.util.stream.IntStream;
import org.eclipse.collections.impl.list.primitive.IntInterval;
public enum Generation
{
UNCLASSIFIED("Unclassified", 0, 1842),
PROGRESSIVE("Progressive Generation", 1843, 1859),
MISSIONARY("Missionary Generation", 1860, 1882),
LOST("Lost Generation", 1883, 1900),
GREATEST("Greatest Generation", 1901, 1927),
SILENT("Silent Generation", 1928, 1945),
BOOMER("Baby Boomers", 1946, 1964),
X("Generation X", 1965, 1980),
MILLENNIAL("Millennials", 1981, 1996),
Z("Generation Z", 1997, 2012),
ALPHA("Generation Alpha", 2013, 2029);
private final String name;
private final YearRange years;
Generation(String name, int from, int to)
{
this.name = name;
this.years = new YearRange(from, to);
}
public int numberOfYears()
{
return this.years.count();
}
public IntInterval yearsInterval()
{
return this.years.interval();
}
public IntStream yearsStream()
{
return this.years.stream();
}
public boolean yearsCountEqualsEc(int years)
{
return this.yearsInterval().size() == years;
}
public boolean yearsCountEqualsJdk(int years)
{
return this.yearsStream().count() == years;
}
public String getName()
{
return this.name;
}
public boolean contains(int year)
{
return this.years.contains(year);
}
}
For our talk, we introduced a Java record called YearRange, which is used to store the start and end years for each Generation. This is different from the Generation in my book, which just stores an IntInterval. You will see IntInterval can be created from a YearRange by calling the method interval(). Similarly, an IntStream can be created from YearRange by calling stream(). Both of these code paths look very similar. The difference between them is subtle. An instance of IntInterval can be used as many times as a developer needs. An instance of IntStream can only be used once, before the IntStream becomes exhausted and you have to create a new one.
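Before looking at the record itself, here is a minimal JDK-only sketch of the single-use behavior described above. The class name SingleUseStreamSketch and its methods are hypothetical and are not part of the talk's repository; the sketch only assumes the standard behavior of java.util.stream.IntStream.

```java
import java.util.stream.IntStream;

// Hypothetical illustration class, not part of the refactortoec project.
final class SingleUseStreamSketch
{
    // Returns true if running a second terminal operation on the same
    // IntStream instance fails with an IllegalStateException.
    static boolean secondUseFails()
    {
        IntStream years = IntStream.rangeClosed(1981, 1996);
        years.count(); // first terminal operation consumes the stream
        try
        {
            years.count(); // second terminal operation on the same instance
            return false;
        }
        catch (IllegalStateException e)
        {
            return true;
        }
    }

    // Creating a fresh stream for every use always works.
    static long freshStreamCount()
    {
        return IntStream.rangeClosed(1981, 1996).count();
    }
}
```

This is why YearRange.stream() builds a new IntStream on every call, while the interval it returns can be held and reused.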
import java.util.stream.IntStream;
import org.eclipse.collections.impl.list.primitive.IntInterval;
public record YearRange(int from, int to)
{
public int count()
{
return this.to - this.from + 1;
}
public boolean contains(int year)
{
return this.from <= year && year <= this.to;
}
public IntStream stream()
{
return IntStream.rangeClosed(this.from, this.to);
}
public IntInterval interval()
{
return IntInterval.fromTo(this.from, this.to);
}
}
GenerationJdk
For our talk, we created a class called GenerationJdk that contains the JDK specific elements of the code. GenerationJdk looks as follows.
package refactortoec.generation;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;
import java.util.stream.Gatherers;
import java.util.stream.Stream;
public class GenerationJdk
{
public static final Set<Generation> GENERATION_SET =
Set.of(Generation.values());
public static final Map<Integer, Generation> BY_YEAR =
GenerationJdk.groupEachByYear();
private static Map<Integer, Generation> groupEachByYear()
{
Map<Integer, Generation> map = new HashMap<>();
GENERATION_SET.forEach(generation ->
generation.yearsStream()
.forEach(year -> map.put(year, generation)));
return Map.copyOf(map);
}
public static Generation find(int year)
{
return BY_YEAR.getOrDefault(year, Generation.UNCLASSIFIED);
}
public static Stream<List<Generation>> windowFixedGenerations(int size)
{
return Arrays.stream(Generation.values())
.gather(Gatherers.windowFixed(size));
}
public static <IV> IV fold(IV value, BiFunction<IV, Generation, IV> function)
{
return GENERATION_SET.stream()
.gather(Gatherers.fold(() -> value, function))
.findFirst()
.orElse(value);
}
}
GenerationEc
There is an equivalent class that uses Eclipse Collections types and methods called GenerationEc, which looks as follows.
package refactortoec.generation;
import org.eclipse.collections.api.RichIterable;
import org.eclipse.collections.api.block.function.Function2;
import org.eclipse.collections.api.factory.Sets;
import org.eclipse.collections.api.map.primitive.ImmutableIntObjectMap;
import org.eclipse.collections.api.map.primitive.MutableIntObjectMap;
import org.eclipse.collections.api.set.ImmutableSet;
import org.eclipse.collections.impl.factory.primitive.IntObjectMaps;
import org.eclipse.collections.impl.list.fixed.ArrayAdapter;
public class GenerationEc
{
public static final ImmutableSet<Generation> GENERATION_IMMUTABLE_SET =
Sets.immutable.with(Generation.values());
public static final ImmutableIntObjectMap<Generation> BY_YEAR =
GenerationEc.groupEachByYear();
private static ImmutableIntObjectMap<Generation> groupEachByYear()
{
MutableIntObjectMap<Generation> map = IntObjectMaps.mutable.empty();
GENERATION_IMMUTABLE_SET.forEach(generation ->
generation.yearsInterval()
.forEach(year -> map.put(year, generation)));
return map.toImmutable();
}
public static Generation find(int year)
{
return BY_YEAR.getIfAbsent(year, () -> Generation.UNCLASSIFIED);
}
public static RichIterable<RichIterable<Generation>> chunkGenerations(int size)
{
return ArrayAdapter.adapt(Generation.values())
.asLazy()
.chunk(size);
}
public static <IV> IV fold(IV value, Function2<IV, Generation, IV> function)
{
return GENERATION_IMMUTABLE_SET.injectInto(value, function);
}
}
Set vs. ImmutableSet
The primary differences between GenerationJdk and GenerationEc are the types used for GENERATION_SET and GENERATION_IMMUTABLE_SET. In the talk, the differences between Set and ImmutableSet are explained in the following slides. First, we explain the difference of type, and how to be explicit about whether a type is Mutable or Immutable. We show how Eclipse Collections types can be used as drop-in-replacements for JDK types (Step 1 in slide), and how the types on the left can be migrated to more intention revealing types once the types on the right have been refactored (Step 2 in slide).
Note: The squirrel at the bottom left of this slide is what I used to mark slides I was presenting during our talk. I couldn’t easily screenshot this squirrel out of the picture. I hope it is not too distracting. :)
An ImmutableSet conveys its intent much more clearly than Set. Set is a mutable interface whose mutating methods are optional: an immutable implementation still exposes them, and throws exceptions when they are called. This is a surprise better exposed up front by a more explicit type like ImmutableSet, which has no mutating methods.
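A small JDK-only sketch of that surprise follows. The class OptionalMutabilitySketch and the method acceptsAdd are hypothetical names used only for illustration. Both arguments have the static type Set, yet only one of them accepts add.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class, not part of the refactortoec project.
final class OptionalMutabilitySketch
{
    // Returns true if the Set accepts the call to add, and false if the
    // implementation rejects mutation with UnsupportedOperationException.
    static boolean acceptsAdd(Set<String> set, String element)
    {
        try
        {
            set.add(element);
            return true;
        }
        catch (UnsupportedOperationException e)
        {
            return false;
        }
    }

    public static void main(String[] args)
    {
        Set<String> mutable = new HashSet<>(Set.of("X"));
        Set<String> unmodifiable = Set.of("X");
        // Same declared type, different behavior discovered only at runtime.
        System.out.println(acceptsAdd(mutable, "Z"));
        System.out.println(acceptsAdd(unmodifiable, "Z"));
    }
}
```

With a type like ImmutableSet the second call would not compile at all, because there is no add method to call.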
The biggest difference between Set and ImmutableSet is the number of methods available directly for developers on the collection types. The following Venn diagram shows the difference in the number of non-overloaded methods.
The large number of methods on ImmutableSet may seem daunting. This is where method categories help. Instead of sorting and scrolling through 158 methods, the methods can be grouped into just nine categories. The following slide shows how I accomplished this in IntelliJ using Custom Code Folding Regions to emulate Methods Categories, which are available natively in Smalltalk IDEs.
What may be less obvious is that a developer has to look in five places to find all of the behaviors for JDK Set. There are methods in Set, Collections, Stream, Collectors, and Gatherers, for a total of 170 methods. Note, not all of the methods in the Collections utility class work for Set. Some are specific to List and Map. There is no organized way of viewing the 64 methods there. Just scroll.
Other Differences in GenerationJdk and GenerationEc
Another difference between these two classes is the groupEachByYear methods. We kept these methods equivalent in that they use nested forEach calls to build a Map. The keys in the map are individual years as int values, and the values are Generation instances corresponding to each year. In the case of the JDK, a Map<Integer, Generation> is used. In the case of EC, an ImmutableIntObjectMap<Generation> is used. The ImmutableIntObjectMap<Generation> reveals the intent that this map cannot be modified, whereas the Map<Integer, Generation> cannot do this, even though the Map.copyOf() call creates an immutable copy of the Map. The primitive IntObjectMap used by EC will generate a map that takes less memory than the Map used by the JDK because the int values will not be boxed as Integer objects.
The two other differences in these classes are the methods used for windowFixed/chunk and fold. The method chunk in Eclipse Collections can either be used directly by calling chunk on the collection (eager), or by calling asLazy first (lazy). The lazy version is arguably better in the example we use because we don’t hold onto the chunked results after computation is finished. Waste not, want not.
In Eclipse Collections, we categorize chunk as a grouping operation. It groups elements of a collection together based on an int value. So if you have a collection of 10 items and call chunk(3), you will wind up with a collection with 4 collections of size 3, 3, 3, 1.
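The 3, 3, 3, 1 result can be reproduced with a minimal JDK-only sketch. The helper ChunkSketch.chunk below is a hypothetical, eager stand-in written for illustration; it is not the Eclipse Collections implementation of chunk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not the Eclipse Collections implementation.
final class ChunkSketch
{
    // Eagerly splits the source into consecutive sub-lists of at most `size` elements.
    static <T> List<List<T>> chunk(List<T> source, int size)
    {
        if (size <= 0)
        {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < source.size(); start += size)
        {
            int end = Math.min(start + size, source.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(List.copyOf(source.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args)
    {
        List<Integer> tenItems = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        // Four chunks: three of size 3 and a final chunk holding the leftover element.
        System.out.println(chunk(tenItems, 3));
    }
}
```

The last chunk simply holds whatever is left over, so no elements are dropped when the size does not divide the collection evenly.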
The method fold is useful for aggregating results. In the test class I will refactor in this blog, we will see how to use fold to calculate the max, min, and sum of items in a collection. In Eclipse Collections, the method that is the equivalent of fold in the JDK is named injectInto.
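To make the shape of a fold concrete, here is a minimal JDK-only sketch written as a plain loop. The class FoldSketch and its fold method are hypothetical; they mirror the signature style of the fold methods shown earlier but use neither Gatherers.fold nor injectInto.

```java
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical illustration class, not part of the refactortoec project.
final class FoldSketch
{
    // Starts with an initial value and combines it with each item in turn.
    static <IV, T> IV fold(IV initial, Iterable<T> items, BiFunction<IV, T, IV> function)
    {
        IV result = initial;
        for (T item : items)
        {
            result = function.apply(result, item);
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<Integer> yearCounts = List.of(17, 23, 18, 27);
        Integer sum = fold(0, yearCounts, Integer::sum);
        Integer max = fold(Integer.MIN_VALUE, yearCounts, Integer::max);
        Integer min = fold(Integer.MAX_VALUE, yearCounts, Integer::min);
        System.out.println(sum + " " + max + " " + min);
    }
}
```

The only things that change between sum, max, and min are the initial value and the two-argument function, which is what makes fold a general aggregation building block.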
Refactoring to Eclipse Collections
There is a single test class in the GitHub repository that we leveraged for live refactoring from JDK to Eclipse Collections. The test class is linked below.
The Javadoc for this class is intended to act as a guide for developers to refactor this class on their own. Check out the whole project from this GitHub repo and give it a try!
The class level Javadoc explains how the test is organized into method categories that will test multiple methods.
/**
* In this test we will refactor from JDK patterns to Eclipse Collections
* patterns. The categories of patterns we will cover in this refactoring are:
*
* <ul>
* <li>Counting - 🧮</li>
* <li>Testing - 🧪</li>
* <li>Finding - 🔎</li>
* <li>Filtering - 🚰</li>
* <li>Grouping - 🏘️</li>
* <li>Converting - 🔌</li>
* <li>Transforming - 🦋</li>
* <li>Chunking - 🖖</li>
* <li>Folding - 🪭</li>
* </ul>
*
* Note: We work with unit tests so we know code works to start, and continues
* to work after the refactoring is complete.
*/
Refactoring to use a drop-in-replacement
The first refactoring we did during our talk was to replace all references in this class to GENERATION_SET, which is stored on GenerationJdk, with GENERATION_IMMUTABLE_SET, which is stored on GenerationEc.
For a small example, the following code would be transformed as follows:
// BEFORE
// Counting with Predicate -> Count of Generation instances that match
long count = GENERATION_SET.stream()
.filter(generation -> generation.contains(1995))
.count();
// AFTER
// Counting with Predicate -> Count of Generation instances that match
long count = GENERATION_IMMUTABLE_SET.stream()
.filter(generation -> generation.contains(1995))
.count();
After the search and replace in the test, we run all of the methods and see that they all still pass.
Now we will continue refactoring each of the method categories included in this test class.
Refactoring Counting 🧮
The first category of methods we will refactor are counting methods.
JDK Collections / Streams
/**
* There are two use cases for counting we will explore.
* <ol>
* <li>Counting with a Predicate -> return is a primitive value</li>
* <li>Counting by a Function -> return is a Map<Integer, Long></li>
* </ol>
*/
@Test
public void counting() // 🧮
{
// Counting with Predicate -> Count of Generation instances that match
long count = GENERATION_IMMUTABLE_SET.stream()
.filter(generation -> generation.contains(1995))
.count();
assertEquals(1L, count);
// Counting by a Function -> Number of years in a Generation ->
// Count of Generations
Map<Integer, Long> generationCountByYears =
GENERATION_IMMUTABLE_SET.stream()
.collect(Collectors.groupingBy(Generation::numberOfYears,
Collectors.counting()));
var expected = new HashMap<>();
expected.put(17, 2L);
expected.put(16, 3L);
expected.put(19, 1L);
expected.put(18, 2L);
expected.put(23, 1L);
expected.put(27, 1L);
expected.put(1843, 1L);
assertEquals(expected, generationCountByYears);
assertNull(generationCountByYears.get(30));
}
Refactoring Counting to Eclipse Collections
@Test
public void counting() // 🧮
{
// Counting with Predicate -> Count of Generation instances that match
int count = GENERATION_IMMUTABLE_SET
.count(generation -> generation.contains(1995));
assertEquals(1, count);
// Counting by a Function -> Number of years in a Generation ->
// Count of Generations
ImmutableBag<Integer> generationCountByYears =
GENERATION_IMMUTABLE_SET.countBy(Generation::numberOfYears);
var expected = Bags.mutable.withOccurrences(17, 2)
.withOccurrences(16, 3)
.withOccurrences(19, 1)
.withOccurrences(18, 2)
.withOccurrences(23, 1)
.withOccurrences(27, 1)
.withOccurrences(1843, 1);
assertEquals(expected, generationCountByYears);
assertEquals(0, generationCountByYears.occurrencesOf(30));
}
Lessons Learned from Counting
Counting with a Java Stream first requires you to learn how to use filter. The method count() on Stream takes no parameter and returns a long. It is the size of the Stream.
With Eclipse Collections, the count method takes a Predicate as a parameter, and counts the elements that match the Predicate.
Notice that the bun methods disappear here. Eclipse Collections gets to the point immediately. We are using count or countBy. These are active verbs, not gerunds. They do not require bun methods like stream and collect. These methods are available directly on the collections themselves. Both of these methods are eager, not lazy. They have a specific terminal result at the end of computation (int or Bag).
A Stream will return a long for a count, because a Stream can be sourced from things other than collections (e.g. files). Collection types in Java have a max size of int. In the case of Eclipse Collections, the only thing the library deals with are collections, so the result of count will never be bigger than the max size of a collection, which is int.
The less obvious thing that is happening here is the covariant nature of countBy, and other methods on Eclipse Collections Collection types. When a collection type is returned from a method, the source collection determines the result type. In the case of an ImmutableSet<Generation>, which is what GENERATION_IMMUTABLE_SET returns, the result type for countBy is an ImmutableBag<Integer>. The Map returned by the Stream version of the code is not immutable, but you wouldn’t know that from the interface named Map, because it can’t tell you.
Lastly, a Bag is a safer data structure to return than a Map for countBy. This is because a Map will return null for missing keys, where a Bag knows it is a counter, so will return 0 for missing keys when occurrencesOf is used.
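The missing-key behavior of Map can be sketched with the JDK alone. The class MissingKeySketch and the method occurrencesOf are hypothetical names chosen for illustration; occurrencesOf only imitates what a Bag does natively by supplying a default of zero.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration class, not part of the refactortoec project.
final class MissingKeySketch
{
    // Counts words by their length, as Collectors.groupingBy/counting would.
    static Map<Integer, Long> countByLength(List<String> words)
    {
        return words.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    // Imitates Bag.occurrencesOf by falling back to zero for missing keys.
    static long occurrencesOf(Map<Integer, Long> counts, int key)
    {
        return counts.getOrDefault(key, 0L);
    }

    public static void main(String[] args)
    {
        Map<Integer, Long> counts = countByLength(List.of("X", "Z", "ALPHA"));
        // get returns null for a missing key; assigning it to a primitive long
        // would throw a NullPointerException during unboxing.
        System.out.println(counts.get(30));
        System.out.println(occurrencesOf(counts, 30));
    }
}
```

With a Map, every caller has to remember the default. With a Bag, the zero is part of the data structure's contract.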
Refactoring Testing 🧪
The next category of methods we will refactor are testing methods. A testing method will always return a boolean result.
JDK Collections / Streams
/**
* Testing methods return a boolean. We will explore three testing methods.
* Testing methods are always eager, but can often short-circuit execution,
* meaning they don't have to visit all elements of the collection if the
* condition is met.
*<ol>
*<li>Stream.anyMatch(Predicate) -> RichIterable.anySatisfy(Predicate)</li>
*<li>Stream.allMatch(Predicate) -> RichIterable.allSatisfy(Predicate)</li>
*<li>Stream.noneMatch(Predicate) -> RichIterable.noneSatisfy(Predicate)</li>
*</ol>
*/
@Test
public void testing() // 🧪
{
assertTrue(GENERATION_IMMUTABLE_SET.stream()
.anyMatch(generation -> generation.contains(1995)));
assertFalse(GENERATION_IMMUTABLE_SET.stream()
.allMatch(generation -> generation.contains(1995)));
assertFalse(GENERATION_IMMUTABLE_SET.stream()
.noneMatch(generation -> generation.contains(1995)));
}
Refactoring Testing to Eclipse Collections
@Test
public void testing() // 🧪
{
assertTrue(GENERATION_IMMUTABLE_SET
.anySatisfy(generation -> generation.contains(1995)));
assertFalse(GENERATION_IMMUTABLE_SET
.allSatisfy(generation -> generation.contains(1995)));
assertFalse(GENERATION_IMMUTABLE_SET
.noneSatisfy(generation -> generation.contains(1995)));
}
Lessons Learned from Testing
There are other methods for testing that we did not cover in this refactoring. Examples are contains, isEmpty, notEmpty, containsBy, containsAll, containsAny, containsNone.
The simple pattern to remember when refactoring any/all/none is that the suffix Match in the JDK becomes Satisfy in Eclipse Collections. The biggest difference is that the call to stream is removed as it is unnecessary. The methods are available directly on the collections themselves in Eclipse Collections.
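The Javadoc for the test mentions that testing methods can short-circuit. A minimal JDK-only sketch of that behavior follows, assuming a sequential stream over an ordered List. The class ShortCircuitSketch and its method are hypothetical and exist only to count how many times the Predicate is evaluated.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class, not part of the refactortoec project.
final class ShortCircuitSketch
{
    // Returns how many elements the Predicate was evaluated against
    // before anyMatch produced its answer.
    static int evaluationsForAnyMatch(List<Integer> years, int target)
    {
        AtomicInteger evaluations = new AtomicInteger();
        years.stream().anyMatch(year ->
        {
            evaluations.incrementAndGet();
            return year == target;
        });
        return evaluations.get();
    }

    public static void main(String[] args)
    {
        List<Integer> startYears = List.of(1946, 1965, 1981, 1997, 2013);
        // The match is the second element, so the remaining three are never tested.
        System.out.println(evaluationsForAnyMatch(startYears, 1965));
    }
}
```

When no element matches, every element has to be visited, so the saving depends on where (and whether) the condition is met.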
Refactoring Finding 🔎
The next category of methods are finding methods. A finding method is one that returns an element of the collection. There are methods that can search for elements based on Predicate or Function.
JDK Collections / Streams
/**
* Finding methods return some element of a collection. Finding methods are
* always eager.
* <ol>
* <li>Stream.filter(Predicate).findFirst() -> RichIterable.detect(Predicate) / detectOptional(Predicate)</li>
* <li>Collectors.maxBy(Comparator) -> RichIterable.maxBy(Function)</li>
* <li>Collectors.minBy(Comparator) -> RichIterable.minBy(Function)</li>
* <li>Stream.filter(Predicate.not()) -> RichIterable.reject(Predicate)</li>
* </ol>
*/
@Test
public void finding() // 🔎
{
Generation findFirst =
GENERATION_IMMUTABLE_SET.stream()
.filter(generation -> generation.contains(1995))
.findFirst()
.orElse(null);
assertEquals(MILLENNIAL, findFirst);
Generation notFound =
GENERATION_IMMUTABLE_SET.stream()
.filter(generation -> generation.contains(1795))
.findFirst()
.orElse(UNCLASSIFIED);
assertEquals(UNCLASSIFIED, notFound);
List<Generation> generationsNotUnclassified =
Stream.of(Generation.values())
.filter(gen -> !gen.equals(UNCLASSIFIED))
.toList();
Generation maxByYears =
generationsNotUnclassified.stream()
.collect(Collectors.maxBy(
Comparator.comparing(Generation::numberOfYears)))
.orElse(null);
assertEquals(GREATEST, maxByYears);
Generation minByYears =
generationsNotUnclassified.stream()
.collect(Collectors.minBy(
Comparator.comparing(Generation::numberOfYears)))
.orElse(null);
assertEquals(X, minByYears);
}
Refactoring Finding to Eclipse Collections
@Test
public void finding() // 🔎
{
Generation findFirst = GENERATION_IMMUTABLE_SET
.detect(generation -> generation.contains(1995));
assertEquals(MILLENNIAL, findFirst);
Generation notFound = GENERATION_IMMUTABLE_SET
.detectIfNone(
generation -> generation.contains(1795),
() -> UNCLASSIFIED);
assertEquals(UNCLASSIFIED, notFound);
MutableList<Generation> generationsNotUnclassified =
ArrayAdapter.adapt(Generation.values())
.reject(gen -> gen.equals(UNCLASSIFIED));
Generation maxByYears =
generationsNotUnclassified.maxBy(Generation::numberOfYears);
assertEquals(GREATEST, maxByYears);
Generation minByYears =
generationsNotUnclassified.minBy(Generation::numberOfYears);
assertEquals(X, minByYears);
}
Lessons Learned from Finding
Again, we see that finding in the JDK is dependent on the method filter. The method findFirst is terminal in the JDK and takes no parameters. It returns an Optional, which we then have to query to see if something was actually returned from the call to filter. We write cases where something is found, and something is not found.
The Eclipse Collections detect method takes a Predicate as a parameter, and returns the found element, or null if nothing is found. If we want to protect against the null return case, we can use detectIfNone, which takes a Predicate and a Function0 as parameters. The Function0 is evaluated only when nothing is found.
We see that the filter method has no equivalent of a filterNot. Instead, we have to negate a Predicate using an ! in the lambda, or we could wrap a Predicate in a call to Predicate.not().
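Both JDK idioms mentioned above, negating with Predicate.not() and unwrapping the Optional from findFirst, can be seen in this small JDK-only sketch. The class FilterNotSketch and its method names are hypothetical and chosen only for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical illustration class, not part of the refactortoec project.
final class FilterNotSketch
{
    // An exclusive filter built from an inclusive one via Predicate.not().
    static List<String> nonBlank(List<String> names)
    {
        return names.stream()
                .filter(Predicate.not(String::isBlank))
                .toList();
    }

    // findFirst returns an Optional, which has to be unwrapped explicitly.
    static String firstStartingWith(List<String> names, String prefix, String fallback)
    {
        Optional<String> found = names.stream()
                .filter(name -> name.startsWith(prefix))
                .findFirst();
        return found.orElse(fallback);
    }

    public static void main(String[] args)
    {
        List<String> names = List.of("Boomer", "", "X");
        System.out.println(nonBlank(names));
        System.out.println(firstStartingWith(names, "Z", "Unclassified"));
    }
}
```

Predicate.not() works well with method references, where a leading ! is not available.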
Eclipse Collections has a method named reject that filters exclusively. As we will see in the next category (filtering), Eclipse Collections also has a method named select which filters inclusively.
Refactoring Filtering 🚰
The filtering category includes methods like filter and partition. In Eclipse Collections, the method names are select (inclusive filter), reject (exclusive filter) and partition (one pass select and reject).
JDK Collections / Streams
/**
* Filtering methods return another Stream or Collection based on a Predicate.
* Filtering can be eager or lazy. We will explore three filtering methods.
* <ol>
* <li>Stream.filter(Predicate) -> RichIterable.select(Predicate)</li>
* <li>Stream.filter(Predicate.not()) -> RichIterable.reject(Predicate)</li>
* <li>Collectors.partitioningBy(Predicate) -> RichIterable.partition(Predicate)</li>
* </ol>
*/
@Test
public void filtering() // 🚰
{
Set<Generation> filteredSelected =
GENERATION_IMMUTABLE_SET.stream()
.filter(generation -> generation.yearsCountEqualsJdk(16))
.collect(Collectors.toUnmodifiableSet());
var expectedSelected = Set.of(X, MILLENNIAL, Z);
assertEquals(expectedSelected, filteredSelected);
Set<Generation> filteredRejected =
GENERATION_IMMUTABLE_SET.stream()
.filter(generation -> !generation.yearsCountEqualsJdk(16))
.collect(Collectors.toUnmodifiableSet());
var expectedRejected = Sets.mutable.with(
ALPHA, UNCLASSIFIED, BOOMER, GREATEST, LOST,
MISSIONARY, PROGRESSIVE, SILENT);
assertEquals(expectedRejected, filteredRejected);
Map<Boolean, Set<Generation>> partition = GENERATION_IMMUTABLE_SET.stream()
.collect(Collectors.partitioningBy(
generation -> generation.yearsCountEqualsJdk(16),
Collectors.toUnmodifiableSet()));
assertEquals(expectedSelected, partition.get(Boolean.TRUE));
assertEquals(expectedRejected, partition.get(Boolean.FALSE));
}
Refactoring Filtering to Eclipse Collections
@Test
public void filtering() // 🚰
{
ImmutableSet<Generation> filteredSelected =
GENERATION_IMMUTABLE_SET
.select(generation -> generation.yearsCountEqualsJdk(16));
var expectedSelected = Set.of(X, MILLENNIAL, Z);
assertEquals(expectedSelected, filteredSelected);
ImmutableSet<Generation> filteredRejected =
GENERATION_IMMUTABLE_SET
.reject(generation -> generation.yearsCountEqualsJdk(16));
var expectedRejected = Sets.mutable.with(
ALPHA, UNCLASSIFIED, BOOMER, GREATEST, LOST,
MISSIONARY, PROGRESSIVE, SILENT);
assertEquals(expectedRejected, filteredRejected);
PartitionImmutableSet<Generation> partition = GENERATION_IMMUTABLE_SET
.partition(generation -> generation.yearsCountEqualsJdk(16));
assertEquals(expectedSelected, partition.getSelected());
assertEquals(expectedRejected, partition.getRejected());
}
Lessons Learned from Filtering
While the name filtering makes sense for a method category, the name filter is ambiguous as a method. It is not clear by the name alone whether the method is meant to be an inclusive or exclusive filter. The methods select and reject in Eclipse Collections disambiguate through their names.
The method partition in Eclipse Collections returns a special type, in this case a PartitionImmutableSet. Again, we see that methods in EC are covariant, and return specialized types based on the source type.
The filtering methods on Eclipse Collections collection types are all eager. If we want lazy versions of the methods, we can call asLazy() first, and then will have to do something similar to Java Stream and call a terminal method like toList(). There are many more methods available on LazyIterable than Stream, as LazyIterable extends RichIterable.
Now, to address the return type of Map<Boolean, Set<Generation>> from the Collectors.partitioningBy() method. It is difficult (although not impossible) to think of a worse return type for this method. A Map<Boolean, Anything> is a bad idea. I think it is so bad, that Eclipse Collections primitive maps do not support BooleanToAnythingMaps. We explicitly decided not to support these types. There are much better alternatives like using a Record with explicit names, or introducing a specific type as we did in Eclipse Collections for PartitionIterable. If you want me to explain more about why Map<Boolean, Anything> is bad, there is a blog for that, with the title “Map-Oriented Programming in Java.” Enjoy!
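As a sketch of the “Record with explicit names” alternative, the JDK-only example below wraps the result of Collectors.partitioningBy in a record. The names PartitionSketch, Partition, selected, and rejected are hypothetical and chosen to echo the Eclipse Collections vocabulary; this is not how PartitionIterable is implemented.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration class, not part of the refactortoec project.
final class PartitionSketch
{
    // Explicit names replace the Boolean.TRUE / Boolean.FALSE map keys.
    record Partition<T>(List<T> selected, List<T> rejected)
    {
    }

    static <T> Partition<T> partition(List<T> source, Predicate<? super T> predicate)
    {
        Map<Boolean, List<T>> byResult = source.stream()
                .collect(Collectors.partitioningBy(predicate));
        // partitioningBy always populates both keys, so neither get returns null.
        return new Partition<>(
                List.copyOf(byResult.get(true)),
                List.copyOf(byResult.get(false)));
    }

    public static void main(String[] args)
    {
        Partition<Integer> partition = partition(List.of(16, 17, 16, 19), n -> n == 16);
        System.out.println(partition.selected());
        System.out.println(partition.rejected());
    }
}
```

Callers now read partition.selected() instead of map.get(Boolean.TRUE), and the Map<Boolean, Anything> stays hidden as an implementation detail.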
Refactoring Grouping 🏘️
The grouping category was limited to just groupBy in this talk. There are other methods that are categorized as grouping in Eclipse Collections. You can see the full list of EC methods included in the grouping category in the slide above with the Custom Code Folding Regions demonstrated in IntelliJ.
JDK Collections / Streams
/**
* Grouping methods return a Map with some key calculated by a Function and
* the values contained in a Collection. We will explore one grouping method.
*
* <ol>
* <li>Collectors.groupingBy(Function) -> RichIterable.groupBy(Function)</li>
* </ol>
*/
@Test
public void grouping() // 🏘️
{
Map<Integer, Set<Generation>> generationByYears =
GENERATION_IMMUTABLE_SET.stream()
.collect(Collectors.groupingBy(
Generation::numberOfYears,
Collectors.toSet()));
var expected = new HashMap<>();
expected.put(17, Set.of(ALPHA, PROGRESSIVE));
expected.put(16, Set.of(X, MILLENNIAL, Z));
expected.put(19, Set.of(BOOMER));
expected.put(18, Set.of(SILENT, LOST));
expected.put(23, Set.of(MISSIONARY));
expected.put(27, Set.of(GREATEST));
expected.put(1843, Set.of(UNCLASSIFIED));
assertEquals(expected, generationByYears);
assertNull(generationByYears.get(30));
}
Refactoring Grouping to Eclipse Collections
@Test
public void grouping() // 🏘️
{
ImmutableSetMultimap<Integer, Generation> generationByYears =
GENERATION_IMMUTABLE_SET.groupBy(Generation::numberOfYears);
var expected = Multimaps.immutable.set.empty()
.newWithAll(17, Set.of(ALPHA, PROGRESSIVE))
.newWithAll(16, Set.of(X, MILLENNIAL, Z))
.newWithAll(19, Set.of(BOOMER))
.newWithAll(18, Set.of(SILENT, LOST))
.newWithAll(23, Set.of(MISSIONARY))
.newWithAll(27, Set.of(GREATEST))
.newWithAll(1843, Set.of(UNCLASSIFIED));
assertEquals(expected, generationByYears);
assertTrue(generationByYears.get(30).isEmpty());
}
Lessons Learned from Grouping
I will refer you to the blog on Map-Oriented Programming in Java again. The groupBy method in Eclipse Collections returns a special type called Multimap. A Multimap is a collection type that knows its value types are some type of collection. A Multimap can gracefully handle a sparsely populated data set, by returning an empty collection when a key is missing. A Map will return null for missing keys. The test case illustrates this.
We see yet again, that the groupBy method is covariant on Eclipse Collections types. An ImmutableSet returns an ImmutableSetMultimap when calling groupBy on it.
Creating a Multimap is more involved than creating other types. We use the Multimaps factory class here and choose immutable and set to refine the Multimap type we want to an ImmutableSetMultimap. If you go to the first paragraph of this blog, you will find a link to the slides for our talk, which include a slide with all of the combinations of Eclipse Collections factories explained.
Refactoring Converting 🔌
The category of converting includes 29 methods in Eclipse Collections. We only cover the toList and toImmutableList converter methods in this talk. The converter methods in the JDK are limited to toList on Stream, and a bunch of toXyz methods on Collectors.
JDK Collections / Streams
/**
* Converting methods convert from a source Collection type to a target
* Collection type. Converting methods in both Java and Eclipse Collections
* usually have a prefix of "to". We'll explore a few converting methods
* in this test.
* <ol>
* <li>Collectors.toList() -> RichIterable.toList()</li>
* <li>Stream.toList() -> RichIterable.toImmutableList()</li>
* </ol>
*/
@Test
public void converting() // 🔌
{
List<Generation> mutableList =
GENERATION_IMMUTABLE_SET.stream()
.collect(Collectors.toList());
List<Generation> immutableList =
GENERATION_IMMUTABLE_SET.stream()
.toList();
List<Generation> sortedMutableList =
mutableList.stream()
.sorted(Comparator.comparing(
gen -> gen.yearsStream().findFirst().getAsInt()))
.collect(Collectors.toList());
var expected = Lists.mutable.with(values());
assertEquals(expected, sortedMutableList);
List<Generation> sortedImmutableList =
immutableList.stream()
.sorted(Comparator.comparing(
gen -> gen.yearsStream().findFirst().getAsInt()))
.toList();
assertEquals(expected, sortedImmutableList);
}
Refactoring Converting to Eclipse Collections
@Test
public void converting() // 🔌
{
MutableList<Generation> mutableList =
GENERATION_IMMUTABLE_SET.toList();
ImmutableList<Generation> immutableList =
GENERATION_IMMUTABLE_SET.toImmutableList();
MutableList<Generation> sortedMutableList =
mutableList.toSortedListBy(
gen -> gen.yearsInterval().getFirst());
var expected = Lists.mutable.with(values());
assertEquals(expected, sortedMutableList);
ImmutableList<Generation> sortedImmutableList =
immutableList.toImmutableSortedListBy(
gen -> gen.yearsInterval().getFirst());
assertEquals(expected, sortedImmutableList);
}
Lessons Learned from Converting
The methods for converting from one collection type to another are extremely helpful. They are also extremely limited on the Stream interface. It is confusing that the method named toList on Collectors does not return the same kind of List as the method named toList on Stream. Collectors.toList makes no guarantee about the mutability of the List it returns, while Stream.toList always returns an unmodifiable List.
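The difference between the two toList methods is easy to miss, so here is a small JDK-only sketch of it. The class ToListSketch and the helper canAdd are hypothetical names used for illustration. One assumption to flag: the Collectors.toList specification makes no promise about mutability, and the sketch relies on the behavior of current OpenJDK builds, where the returned List happens to be mutable.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToListSketch
{
    // Hypothetical helper: reports whether the given List accepts a new element.
    static boolean canAdd(List<String> list)
    {
        try
        {
            list.add("extra");
            return true;
        }
        catch (UnsupportedOperationException e)
        {
            return false;
        }
    }

    public static void main(String[] args)
    {
        // Mutable in current OpenJDK builds, but not guaranteed by the specification.
        List<String> collected = Stream.of("a", "b").collect(Collectors.toList());

        // Specified to be unmodifiable.
        List<String> direct = Stream.of("a", "b").toList();

        System.out.println("Collectors.toList() accepts add: " + canAdd(collected));
        System.out.println("Stream.toList() accepts add: " + canAdd(direct));
    }
}
```

In Eclipse Collections the return types MutableList and ImmutableList carry this distinction, so the compiler can tell you which one you have.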
While we limited the converting category to methods for converting to mutable and immutable Lists, the following blog shows the large number of potential targets for converting methods prefixed with to in Eclipse Collections.
Refactoring Transforming 🦋
The transforming category includes methods like JDK map and EC collect. These methods transform the element type of a collection to a different type (e.g. Generation -> String).
JDK Collections / Streams
/**
* Transforming methods convert the elements of a collection to another type by
* applying a Function to each element. We'll explore the following methods.
*
* <ol>
* <li>Stream.map() -> RichIterable.collect()</li>
* <li>Collectors.toUnmodifiableSet() -> ???</li>
* </ol>
*
* Note: Certain methods on RichIterable are covariant, so return a type that
* makes sense for the source type.
* Hint: If we collect on an ImmutableSet, the return type is an ImmutableSet.
*/
@Test
public void transforming() // 🦋
{
Set<String> names =
GENERATION_IMMUTABLE_SET.stream()
.map(Generation::getName)
.collect(Collectors.toUnmodifiableSet());
var expected = Sets.immutable.with(
"Unclassified", "Greatest Generation", "Lost Generation", "Millennials",
"Generation X", "Baby Boomers", "Generation Z", "Silent Generation",
"Progressive Generation", "Generation Alpha", "Missionary Generation");
assertEquals(expected, names);
Set<String> mutableNames = names.stream()
.collect(Collectors.toSet());
assertEquals(expected, mutableNames);
}
Refactoring Transforming to Eclipse Collections
@Test
public void transforming() // 🦋
{
ImmutableSet<String> names =
GENERATION_IMMUTABLE_SET.collect(Generation::getName);
var expected = Sets.immutable.with(
"Unclassified", "Greatest Generation", "Lost Generation", "Millennials",
"Generation X", "Baby Boomers", "Generation Z", "Silent Generation",
"Progressive Generation", "Generation Alpha", "Missionary Generation");
assertEquals(expected, names);
MutableSet<String> mutableNames = names.toSet();
assertEquals(expected, mutableNames);
}
Lessons Learned from Transforming
We see that the collect method in Eclipse Collections, like select, reject, partition, countBy, and groupBy, is covariant. Using collect on an ImmutableSet returns an ImmutableSet. The collect method is the equivalent of map on the JDK Stream type. It is not the same as the collect method on the Stream type.
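As a quick reminder of the two JDK roles that the name collect can be confused with, here is a minimal JDK-only sketch. The class MapVersusCollectSketch and the method lengths are hypothetical names. On a Stream, map does the transforming, and collect is the terminal step that gathers the results into a container.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class MapVersusCollectSketch
{
    // Hypothetical example: transform each word to its length.
    static Set<Integer> lengths(List<String> words)
    {
        return words.stream()
                .map(String::length)                       // transforming: String -> Integer
                .collect(Collectors.toUnmodifiableSet());  // terminal step: gather into a Set
    }

    public static void main(String[] args)
    {
        // "one" and "two" both map to 3, so the resulting Set has two elements.
        System.out.println(lengths(List.of("one", "two", "three")).size());
    }
}
```

The Eclipse Collections collect in the refactored test does the job of the map call here, and the covariant return type makes the second step unnecessary.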
The following section on collect from the “Eclipse Collections Categorically” book explains the difference between collect on Stream and collect in Eclipse Collections.
Refactoring Chunking 🖖
The category of chunking can also be grouped in the category of grouping. We differentiated it in our talk because the capability of chunking was added as a method named windowFixed to the new Gatherers type in Java. The method that provides the same behavior as windowFixed in Eclipse Collections is simply named chunk.
Note: The hand emoji above reminded me of taking a collection of five fingers and chunking them by two each. This leaves three chunks, with 2, 2, 1 fingers.
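The five fingers example can be sketched in plain Java. This is a hand-written illustration of fixed-size chunking, not the implementation of chunk in Eclipse Collections or of windowFixed in the JDK. The class ChunkSketch and its chunk method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkSketch
{
    // Hypothetical helper: split a List into consecutive chunks of at most `size` elements.
    static <T> List<List<T>> chunk(List<T> source, int size)
    {
        if (size <= 0)
        {
            throw new IllegalArgumentException("size must be positive: " + size);
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < source.size(); start += size)
        {
            // The last chunk may be shorter than size.
            int end = Math.min(start + size, source.size());
            chunks.add(List.copyOf(source.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args)
    {
        List<String> fingers = List.of("thumb", "index", "middle", "ring", "pinky");
        // prints [[thumb, index], [middle, ring], [pinky]]
        System.out.println(chunk(fingers, 2));
    }
}
```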
JDK Collections / Streams
/**
* Chunking is a kind of grouping method, but for our purposes we will put
* the methods in their own category. Chunking is great for breaking
* collections into smaller collections based on a size parameter.
* We'll explore the following methods.
*
* <ol>
* <li>Stream.gather(Gatherers.windowFixed()) -> RichIterable.chunk()</li>
* <li>Collectors.joining() -> RichIterable.makeString()</li>
* </ol>
*/
@Test
public void chunking() // 🖖
{
Stream<List<Generation>> windowFixedGenerations =
GenerationJdk.windowFixedGenerations(3);
String generationsAsString = windowFixedGenerations.map(Object::toString)
.collect(Collectors.joining(", "));
String expected = """
[UNCLASSIFIED, PROGRESSIVE, MISSIONARY], [LOST, GREATEST, SILENT], \
[BOOMER, X, MILLENNIAL], [Z, ALPHA]""";
assertEquals(expected, generationsAsString);
String yearsAsString = MILLENNIAL.yearsStream()
.boxed()
.gather(Gatherers.windowFixed(4))
.map(Object::toString)
.collect(Collectors.joining(", "));
String expectedYears = """
[1981, 1982, 1983, 1984], [1985, 1986, 1987, 1988], \
[1989, 1990, 1991, 1992], [1993, 1994, 1995, 1996]""";
assertEquals(expectedYears, yearsAsString);
}
The additional code to explore is in GenerationJdk.
public static Stream<List<Generation>> windowFixedGenerations(int size)
{
return Arrays.stream(Generation.values())
.gather(Gatherers.windowFixed(size));
}
Refactoring Chunking to Eclipse Collections
@Test
public void chunking() // 🖖
{
RichIterable<RichIterable<Generation>> chunkedGenerations =
GenerationEc.chunkGenerations(3);
String generationsAsString = chunkedGenerations.makeString(", ");
String expected = """
[UNCLASSIFIED, PROGRESSIVE, MISSIONARY], [LOST, GREATEST, SILENT], \
[BOOMER, X, MILLENNIAL], [Z, ALPHA]""";
assertEquals(expected, generationsAsString);
String yearsAsString = MILLENNIAL.yearsInterval()
.chunk(4)
.makeString(", ");
String expectedYears = """
[1981, 1982, 1983, 1984], [1985, 1986, 1987, 1988], \
[1989, 1990, 1991, 1992], [1993, 1994, 1995, 1996]""";
assertEquals(expectedYears, yearsAsString);
}
The additional code to explore is in GenerationEc.
public static RichIterable<RichIterable<Generation>> chunkGenerations(int size)
{
return ArrayAdapter.adapt(Generation.values())
.asLazy()
.chunk(size);
}
Lessons Learned from Chunking
This is the first time we used Gatherers in this talk. The first thing we notice about the gather method on Stream is that there is no equivalent of gather on IntStream, LongStream, or DoubleStream, which is why the JDK test has to call boxed before it can call gather. The chunk method, on the other hand, is available for both Object and primitive collections in Eclipse Collections.
The method named chunk is available as an eager method directly on collections, and also lazily via a call to asLazy. The code could be changed to be eager as follows, but there would be a slight performance hit because a temporary collection would be created as a result.
public static RichIterable<RichIterable<Generation>> chunkGenerations(int size)
{
return ArrayAdapter.adapt(Generation.values())
.chunk(size);
}
Notice how the return type of chunk is still RichIterable<RichIterable<Generation>> when we remove the call to asLazy. This is because a LazyIterable is a RichIterable, and an ImmutableSet is also a RichIterable. They behave differently for certain methods, but have a consistent API.
Refactoring Folding 🪭
The folding category is actually called aggregating in Eclipse Collections. For this talk we separated it out as a category to explain the fold method in the JDK on the Gatherers class. The method that is equivalent to fold in Eclipse Collections is called injectInto.
JDK Collections / Streams
/**
* Folding is a mechanism for reducing a type to some new result type.
* We'll explore folding to calculate a min, max, and sum.
* Methods we'll cover:
* <ol>
* <li>Stream.gather(Gatherers.fold()) -> RichIterable.injectInto()</li>
* </ol>
*/
@Test
public void folding() // 🪭
{
Integer maxYears = GenerationJdk.fold(
Integer.MIN_VALUE,
(Integer value, Generation generation) ->
Math.max(value, generation.numberOfYears()));
Integer minYears = GenerationJdk.fold(
Integer.MAX_VALUE,
(Integer value, Generation generation) ->
Math.min(value, generation.numberOfYears()));
Integer sumYears = GenerationJdk.fold(
Integer.valueOf(0),
(Integer value, Generation generation) ->
Integer.sum(value, generation.numberOfYears()));
assertEquals(1843, maxYears);
assertEquals(16, minYears);
assertEquals(2030, sumYears);
}
The additional code to explore is in GenerationJdk.fold().
public static <IV> IV fold(IV value, BiFunction<IV, Generation, IV> function)
{
return GENERATION_SET.stream()
.gather(Gatherers.fold(() -> value, function))
.findFirst()
.orElse(value);
}
Refactoring Folding to Eclipse Collections
@Test
public void folding() // 🪭
{
Integer maxYears = GenerationEc.fold(
Integer.MIN_VALUE,
(Integer value, Generation generation) ->
Math.max(value, generation.numberOfYears()));
Integer minYears = GenerationEc.fold(
Integer.MAX_VALUE,
(Integer value, Generation generation) ->
Math.min(value, generation.numberOfYears()));
Integer sumYears = GenerationEc.fold(
Integer.valueOf(0),
(Integer value, Generation generation) ->
Integer.sum(value, generation.numberOfYears()));
assertEquals(1843, maxYears);
assertEquals(16, minYears);
assertEquals(2030, sumYears);
}
The additional code to explore is in GenerationEc.fold().
public static <IV> IV fold(IV value, Function2<IV, Generation, IV> function)
{
return GENERATION_IMMUTABLE_SET.injectInto(value, function);
}
Lessons Learned from Folding
The approach taken for folding in the JDK is unnecessarily convoluted. If we compare fold and injectInto next to each other, this will be clearer.
// JDK fold
public static <IV> IV fold(IV value, BiFunction<IV, Generation, IV> function)
{
return GENERATION_SET.stream()
.gather(Gatherers.fold(() -> value, function))
.findFirst()
.orElse(value);
}
// EC injectInto
public static <IV> IV fold(IV value, Function2<IV, Generation, IV> function)
{
return GENERATION_IMMUTABLE_SET.injectInto(value, function);
}
The methods fold and injectInto are hard enough to explain, without adding the overhead of Stream, Gatherers, and Optional into the mix.
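If it helps to see the idea with all of the machinery stripped away, a fold can be sketched as a plain loop: start with an initial value, then replace it with the result of applying the function to the current value and each element in turn. This is an illustration of the concept only, not how Gatherers.fold or injectInto are implemented. The class FoldSketch, the method foldLeft, and the sample numbers are hypothetical.

```java
import java.util.List;
import java.util.function.BiFunction;

class FoldSketch
{
    // Hypothetical helper: fold the elements of source into a single result.
    static <T, R> R foldLeft(Iterable<T> source, R initial, BiFunction<R, T, R> function)
    {
        R result = initial;
        for (T each : source)
        {
            // The running result is "injected into" the function with the next element.
            result = function.apply(result, each);
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<Integer> years = List.of(16, 20, 18);

        Integer max = foldLeft(years, Integer.MIN_VALUE, (value, each) -> Math.max(value, each));
        Integer min = foldLeft(years, Integer.MAX_VALUE, (value, each) -> Math.min(value, each));
        Integer sum = foldLeft(years, 0, (value, each) -> Integer.sum(value, each));

        System.out.println(max); // prints 20
        System.out.println(min); // prints 16
        System.out.println(sum); // prints 54
    }
}
```

The same three lambdas drive min, max, and sum. Only the initial value and the function change, which is the pattern the folding test exercises.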
The following blog explains the method injectInto in more detail. I refer to injectInto as the “Continuum Transfunctioner.” Read the following blog to find out why.
Refactoring a Conclusion
After having given a 75 minute talk at dev2next, and then turning the talk into a blog where I repeat the live refactoring that Vlad and I did in front of an audience, there is very little left for me to say. There is a lot to digest in this blog. I dare say this is probably the longest blog I have ever written.
I will simply leave you with our takeaways slide from the talk, and an important section of the book “Eclipse Collections Categorically.”
Note: The following is an excerpt from Chapter one of the book, “Eclipse Collections Categorically.” This section of Chapter one is available in the online reading sample for the book on Amazon.
I hope you enjoyed reliving the talk Vlad and I gave at dev2next titled “Refactoring to Eclipse Collections.” I enjoyed writing it, and will see if I can go back and make improvements over time. This blog will hopefully be a good resource for folks seeking to build or reinforce a set of basic skills across several method categories for Eclipse Collections. This blog isn’t as comprehensive as the book I just wrote, but should hopefully be a good starter for what you might have been missing just using Java Collections and Streams for the past 21 years.
Thanks for reading!
I am the creator of and committer for the Eclipse Collections OSS project, which is managed at the Eclipse Foundation. Eclipse Collections is open for contributions. I am the author of the book, Eclipse Collections Categorically: Level up your programming game.
