Wix Engineering

Breaking Chains with Pipelines in Modern JavaScript

Updated: Apr 12, 2020



JavaScript provides a growing list of iteration methods, which can be used to manipulate array content in a functional manner. However, these built-in iteration methods suffer from several notable limitations and restrictions. Third-party libraries, such as lodash, provide alternatives which can overcome these limitations, but introduce limitations and restrictions of their own.

In this post I show how modern JavaScript capabilities can be used to implement a simple library which doesn’t suffer from these restrictions, and how the upcoming pipeline operator will make it much more pleasant to use such a library. This will result in code that is easier to both write and read.



I prefer using JavaScript’s iteration methods as a means of operating on and manipulating collections of items. This means that when I need to apply a transformation to values contained in an array, for example, I tend to use Array.prototype.map rather than a for loop:
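
A minimal sketch of this style, assuming a hypothetical prices array (the data and variable names are illustrative only):

```javascript
// Hypothetical sample data, for illustration only.
const prices = [10, 20, 30];

// Declarative: the method name says that every value is being transformed.
const doubled = prices.map(price => price * 2);

console.log(doubled); // [20, 40, 60]
```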



Instead of:
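
A sketch of the equivalent loop, using the same hypothetical prices array:

```javascript
// Hypothetical sample data, for illustration only.
const prices = [10, 20, 30];

// Imperative: the reader has to trace the loop body to see that
// this is a one-to-one transformation of the input.
const doubled = [];
for (let i = 0; i < prices.length; i++) {
  doubled.push(prices[i] * 2);
}

console.log(doubled); // [20, 40, 60]
```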



The main reason I like this approach is that it’s more declarative and explicit: once you get used to the syntax, and become acquainted with the various iteration methods, just seeing which method is being used provides a clear indication of what type of operation is being performed. For example, when I see filter I immediately know that the purpose is to retain only the items that match specific criteria.


Likewise, when I see map I know that the purpose is to apply a transformation to the values of items. The end result is that the code becomes both more succinct and more readable, especially when used with the ECMAScript arrow function syntax, as shown above.


Another major benefit to this approach is that most JavaScript iteration methods don’t modify the collection to which they are applied. The immutability of the original collection reduces the likelihood of bugs stemming from unintended side-effects. The price we pay for this benefit is the overhead of generating a new collection of items for each application of an iteration method.


Last but not least, these methods can be chained together, so that complex operations on a collection can be constructed from a sequence of simple ones, each one expressed as an operation on a single item:
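
A sketch of such a chain, assuming an illustrative numbers array and two placeholder callbacks, isEven and square:

```javascript
// Illustrative data and callbacks (assumptions, not from a specific codebase).
const numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const isEven = n => n % 2 === 0;
const square = n => n * n;

// Each link in the chain is a simple operation on a single item.
const result = numbers
  .filter(isEven)   // keep only the even numbers
  .map(square)      // square each of them
  .slice(0, 3);     // take the first three results

console.log(result);         // [4, 16, 36]
console.log(numbers.length); // 10 (the original array is untouched)
```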



Yet, JavaScript’s current iteration methods have two significant shortcomings, which limit their usefulness:


  • The built-in iteration methods are only available on arrays. This means that if you want to apply them to any other type of collection, you must first transform it into an array.

  • They are eager rather than lazy. In the example above, the entire array is filtered into a temporary array, and after that the entire resulting array is transformed into yet another temporary array, even though only the first three items are required. This can result in excessive CPU and memory use.
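
Both shortcomings can be observed with a small experiment. This is a sketch with illustrative data; the counters tested and squared are hypothetical names added only to make the eager behavior visible:

```javascript
const numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

let tested = 0;   // how many items filter examined
let squared = 0;  // how many items map transformed

const result = numbers
  .filter(n => { tested++; return n % 2 === 0; })
  .map(n => { squared++; return n * n; })
  .slice(0, 3);

console.log(result);  // [4, 16, 36]
console.log(tested);  // 10: every item was tested
console.log(squared); // 5: every even item was squared, though only 3 are kept

// A Set has no filter or map, so it must be copied into an array first.
const unique = new Set([1, 2, 2, 3]);
const doubledUnique = Array.from(unique).map(n => n * 2);
console.log(doubledUnique); // [2, 4, 6]
```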



Lodash To The Rescue?


Fortunately, there are third-party libraries out there that can alleviate these limitations. A great example is the highly popular lodash library, which provides iteration methods that can operate on more generalized collections, and not just on arrays. In addition, lodash has a chaining feature, which is lazy rather than eager, meaning it evaluates and allocates space only for items that it actually needs.


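In a lodash chain such as _(numbers).filter(isEven).map(square).slice(0, 3).value(), where isEven and square stand for placeholder callbacks, slice only takes the first three items provided by map. As a result, the map method will only actually apply the transformation to up to three items, regardless of the total size and content of the original array.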



Also, no arrays are generated by any of the iteration methods in the chain. Instead, the terminating call to value instructs lodash to actually trigger the lazy computation and provide the resulting array.


Unfortunately, lodash has its own limitations and downsides. In particular, lodash only supports collections which are either arrays or plain objects used as property bags. If you implement your own custom collection, for example, it won’t work with lodash. Even the built-in Map and Set collections aren’t supported by lodash. Instead, they must be converted into arrays before lodash methods can be applied to them.


Another limitation is that tree-shaking isn’t really compatible with lodash chains. This is because lodash uses the dot operator to construct the chains, and so each link in the chain emits an object which references all the chainable iteration methods. These references prevent bundlers, like webpack, from identifying which methods are actually being used and excluding the rest. (There are ways to overcome this, but they are convoluted and cumbersome to use, which is why most developers don’t use them.)


Are we stuck then, with a limited solution? Fortunately not!



Better Iteration With Iterators


It turns out that implementing a better mechanism for iteration is actually very straightforward in modern ECMAScript / JavaScript, thanks to iterators and generators. This new capability of the language provides a method for generic iteration over most types of collections, either built-in or user defined, including both Map and Set. (For more information on what iterators and generators are, what they can do, and how they work, you can view my talk on this topic.)


We can use JavaScript iterators and generators to quickly construct our own simple iteration library. Note that this isn’t a complete, production-grade library. Rather, it’s a sample implementation intended to demonstrate what such a library should look like and how it should operate. Here is an implementation of map as a JavaScript generator:
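
A minimal sketch of such a generator; the parameter names iterable and operator are assumptions chosen for readability:

```javascript
// Lazily applies operator to each item of any iterable.
// Nothing is computed until a consumer asks for the next value.
function* map(iterable, operator) {
  for (const item of iterable) {
    yield operator(item);
  }
}

// Works on any iterable, not just arrays (here, a Set).
const squares = map(new Set([1, 2, 3]), n => n * n);
console.log(squares.next()); // { value: 1, done: false }
console.log([...squares]);   // [4, 9] (the remaining values)
```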



All map does is take an input argument, which is any iterable (a collection that implements the iteration protocol, or an iterator that is itself iterable, such as one returned by a generator), and a second argument, which is the operator to apply to each item. Since it’s a generator, when map is invoked it doesn’t actually compute anything, or return an intermediate collection. Instead it creates an output iterator, which directly provides the result of each transformation only when required, as specified by the argument to yield.


Similarly, here is the implementation of filter:
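
A minimal sketch along the same lines; the parameter name predicate is an assumption:

```javascript
// Lazily yields only the items for which predicate returns a truthy value.
function* filter(iterable, predicate) {
  for (const item of iterable) {
    if (predicate(item)) {
      yield item;
    }
  }
}

const evens = filter([1, 2, 3, 4, 5, 6], n => n % 2 === 0);
console.log([...evens]); // [2, 4, 6]
```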



Now we can finally use these functions together to reproduce the lodash example from before:
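
A sketch of that expression, using compact versions of the generators described in this post and an illustrative numbers array. The slice shown here is one possible version, included only so that the snippet runs on its own:

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}
function* map(iterable, operator) {
  for (const item of iterable) yield operator(item);
}
// One possible slice, needed to make this snippet self-contained.
function* slice(iterable, start = 0, finish = Infinity) {
  if (start >= finish) return;
  let index = 0;
  for (const item of iterable) {
    if (index >= start) yield item;
    if (++index >= finish) break;
  }
}

const numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

// Read from the inside out: filter, then map, then slice, then Array.from.
const result = Array.from(
  slice(
    map(
      filter(numbers, n => n % 2 === 0),
      n => n * n
    ),
    0, 3
  )
);

console.log(result); // [4, 16, 36]
```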



Here filter is applied directly to the numbers array, and provides an iterator, which is passed to map. Similarly, map provides an iterator which is passed to slice. And slice also provides an iterator, which is passed as an argument to the built-in Array.from function. It turns out that Array.from can use the iterator to construct an array. As a result, no intermediate collections are generated, and only the values needed are actually computed. (You can also use the spread operator instead of Array.from, but I wanted to construct this expression from a sequence of function calls.)


Yet there are still a couple of problems with this expression. First and foremost, we haven’t yet defined the slice function. Can you come up with an implementation for it yourself? Try giving yourself a couple of minutes to do so. Are you done? Hopefully the implementation you came up with is similar to the following:
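
One possible sketch is shown here. The default values for start and finish, and the early return for an empty range, are assumptions added for robustness:

```javascript
// Lazily yields the items whose position is in the range [start, finish).
function* slice(iterable, start = 0, finish = Infinity) {
  if (start >= finish) return; // empty range: pull nothing from the input
  let index = 0;
  for (const item of iterable) {
    if (index >= start) yield item;
    // Stop as soon as the last requested item has been delivered,
    // so that no further values are pulled from the input.
    if (++index >= finish) break;
  }
}

// Strings are iterable too.
console.log([...slice('abcdef', 1, 4)]); // ['b', 'c', 'd']
```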



Please be aware that this is a simplistic implementation, intended to show how such a function can be defined. Note the use of break to exit the loop immediately when index is greater than or equal to finish. This ensures that values that aren’t needed aren’t pulled from the input iterator. As a result, the preceding functions in the sequence, filter and map in this example, won’t compute unneeded values. This is how we get lazy evaluation in this implementation.
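
The effect can be checked with a sketch that feeds the sequence from an infinite source. The naturals generator and the tested counter are hypothetical helpers introduced only for this demonstration:

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}
function* map(iterable, operator) {
  for (const item of iterable) yield operator(item);
}
function* slice(iterable, start = 0, finish = Infinity) {
  if (start >= finish) return;
  let index = 0;
  for (const item of iterable) {
    if (index >= start) yield item;
    if (++index >= finish) break;
  }
}

// Hypothetical infinite source: 1, 2, 3, ...
function* naturals() {
  let n = 1;
  while (true) yield n++;
}

let tested = 0; // counts how many items filter actually examined

const result = Array.from(
  slice(
    map(
      filter(naturals(), n => { tested++; return n % 2 === 0; }),
      n => n * n
    ),
    0, 3
  )
);

console.log(result); // [4, 16, 36]
console.log(tested); // 6: only the items 1 through 6 were ever examined
```

An eager implementation could never finish on this input, because it would try to filter the whole infinite sequence first.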


And because each iteration function is wholly independent of the others, there won’t be any problem tree-shaking this code, unlike lodash chaining.



Laying The Pipeline


Another big problem with this implementation is that it’s difficult to read and follow, unlike the lodash example. This is because when reading from left to right, the functions appear in the reverse of the order in which they are applied. Moreover, the parentheses and the placement of the arguments make this code confusing and verbose.


Fortunately, a much better syntax for this type of expression is being proposed for ECMAScript, known as the pipeline operator. Although the syntax for this proposed operator hasn’t fully settled yet, and it’s still at an early stage of the acceptance process, you can already try it out using Babel. With the pipeline operator, the result of each function is passed as an argument to the next function in the expression. Here is the same example above written using the pipeline operator:
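
Since the operator is still a proposal, the pipeline form appears only in comments in this sketch, using the # placeholder described in this post (the exact token may differ depending on the proposal variant and Babel configuration). The runnable lines spell out what each pipeline step would do:

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}
function* map(iterable, operator) {
  for (const item of iterable) yield operator(item);
}
function* slice(iterable, start = 0, finish = Infinity) {
  if (start >= finish) return;
  let index = 0;
  for (const item of iterable) {
    if (index >= start) yield item;
    if (++index >= finish) break;
  }
}

const numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const isEven = n => n % 2 === 0;
const square = n => n * n;

// Proposed syntax (not valid in today's engines without a transpiler):
//
//   const result = numbers
//     |> filter(#, isEven)
//     |> map(#, square)
//     |> slice(#, 0, 3)
//     |> Array.from;
//
// Step-by-step equivalent in standard JavaScript:
const filtered = filter(numbers, isEven); // numbers |> filter(#, isEven)
const mapped = map(filtered, square);     //         |> map(#, square)
const sliced = slice(mapped, 0, 3);       //         |> slice(#, 0, 3)
const result = Array.from(sliced);        //         |> Array.from

console.log(result); // [4, 16, 36]
```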



The # symbol is a placeholder, used to indicate where the result of the previous expression in the pipeline sequence should be used when the next function takes more than one argument. In this case, # is either the original array (passed to filter), or the iterator returned from the previous step.


While this syntax certainly requires some getting used to, it definitely makes the code much easier to read and follow, and the operations are provided in the order in which they are used. And so we have the solution we wanted all along.


Here is another example showing how this library can be used to operate on a JavaScript object in order to pick only the numeric fields:
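
A sketch of this, assuming a hypothetical person object. The pipeline form is shown in a comment, followed by an equivalent that runs in standard JavaScript (Object.fromEntries accepts any iterable of key/value pairs):

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}

// Hypothetical sample object, for illustration only.
const person = { name: 'Ada', age: 36, city: 'London', score: 99.5 };

// Proposed pipeline syntax:
//
//   const numericFields = person
//     |> Object.entries
//     |> filter(#, ([, value]) => typeof value === 'number')
//     |> Object.fromEntries;
//
// Equivalent in standard JavaScript:
const numericFields = Object.fromEntries(
  filter(Object.entries(person), ([, value]) => typeof value === 'number')
);

console.log(numericFields); // { age: 36, score: 99.5 }
```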



Note that Object.entries will actually generate an intermediate array. It’s possible to create an entries function which provides an iterator instead. I leave this as an exercise, or you can cheat and look at the CodePen that I have prepared with this and additional functions.
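
For reference, one way such a function could be sketched is with for...in restricted to own properties. This is an assumption about a reasonable approach, and it may differ from the version in the CodePen:

```javascript
// Lazily yields [key, value] pairs for an object's own enumerable
// string-keyed properties, without building an intermediate array.
function* entries(object) {
  for (const key in object) {
    if (Object.prototype.hasOwnProperty.call(object, key)) {
      yield [key, object[key]];
    }
  }
}

function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}

// Hypothetical sample object, for illustration only.
const person = { name: 'Ada', age: 36, city: 'London', score: 99.5 };

const numericFields = Object.fromEntries(
  filter(entries(person), ([, value]) => typeof value === 'number')
);

console.log(numericFields); // { age: 36, score: 99.5 }
```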



Needle In A Haystack


One useful function that has been added to JavaScript relatively recently is Array.prototype.find. This function returns the value of the first item in a provided array that satisfies the specified testing function. Since this is such a useful function, lodash has had it from the get-go. Let’s see how to best implement this function in the context of our small and simple library.


One obvious way to implement find functionality using the functions that we already have is using filter, for example:
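
A sketch of this approach, with illustrative data and a hypothetical tested counter that records how many items the predicate examined:

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}
function* slice(iterable, start = 0, finish = Infinity) {
  if (start >= finish) return;
  let index = 0;
  for (const item of iterable) {
    if (index >= start) yield item;
    if (++index >= finish) break;
  }
}

const numbers = [1, 3, 4, 7, 8];
let tested = 0;

// Proposed pipeline syntax:
//   numbers |> filter(#, isEven) |> slice(#, 0, 1) |> Array.from
const found = Array.from(
  slice(
    filter(numbers, n => { tested++; return n % 2 === 0; }),
    0, 1
  )
);

console.log(found);  // [4]: a one-item array holding the first match
console.log(tested); // 3: items after the first match were never tested
```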



Normally, using filter to implement find is a bad idea because it tests every item in the array, even if a matching item has already been found. However, because this library uses lazy evaluation to only compute needed values, the use of slice prevents this from happening. In this code segment, slice indicates that only the value of the first matching item is required, and so, once an item is found, no additional items are tested by filter.


But this code is overly verbose, and creates an extra one-item array that contains the value of the matching item. Let’s improve on this by implementing a simple head function:
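
A minimal sketch of such a function:

```javascript
// Returns the first item of any iterable, or undefined if it is empty.
function head(iterable) {
  for (const item of iterable) {
    // Returning from inside for...of also closes the underlying iterator,
    // so no further items are pulled from the input.
    return item;
  }
  // An empty iterable falls through and returns undefined.
}

console.log(head([7, 8, 9]));            // 7
console.log(head(new Set(['a', 'b'])));  // 'a'
console.log(head([]));                   // undefined
```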



This function returns the value of the first item in a collection, or undefined if the collection is empty. Now we can achieve the same result as the previous example using the much simpler:
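
A sketch of that simpler form, again with illustrative data and a hypothetical tested counter:

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}
function head(iterable) {
  for (const item of iterable) return item;
}

const numbers = [1, 3, 4, 7, 8];
let tested = 0;

// Proposed pipeline syntax:
//   numbers |> filter(#, isEven) |> head
const firstEven = head(
  filter(numbers, n => { tested++; return n % 2 === 0; })
);

console.log(firstEven); // 4: the value itself, not a one-item array
console.log(tested);    // 3
```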



In this case, it’s the head function that prevents filter from processing any item after the first matching item is found. And, since it directly returns the value of the first item, no intermediate array is required.


We can also package this code into a find function, and use it as follows:
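
A sketch of such a find, built from the two functions above it in the sequence; the sample inputs are illustrative:

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}
function head(iterable) {
  for (const item of iterable) return item;
}

// find is just head applied to a lazy filter.
const find = (iterable, predicate) => head(filter(iterable, predicate));

const numbers = [1, 3, 4, 7, 8];

// Proposed pipeline syntax:
//   numbers |> find(#, n => n > 3)
console.log(find(numbers, n => n > 3));   // 4
console.log(find(numbers, n => n > 100)); // undefined

// Unlike Array.prototype.find, this works on any iterable, such as a Map.
const ages = new Map([['ann', 31], ['bob', 42]]);
console.log(find(ages, ([, age]) => age > 40)); // ['bob', 42]
```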




This is a great example of how easily extensible such a library is.



Summary


Using very little modern JavaScript code, we’ve created a super-simple library that provides much of the functionality of lodash chaining, is just as readable, and has some notable advantages over that excellent library. Indeed, once pipelining becomes a standard part of ECMAScript, and is supported by JavaScript environments, it’s highly likely that public libraries of this kind will appear.


In the interim, you can use this CodePen to try out this functionality already. It contains a more complete implementation of the library I discussed in this post, with additional functions such as reduce and flat, and is still very small, at around 100 lines of unminified code! Feel free to play with this library, and use it as you see fit. And if it inspires you to create a library of your own then more power to you.


Photo by JJ Ying on Unsplash

 


This post was written by Dan Shappir



 
