When developers use DataWeave, they often come to rely on the reduce() function to fill in any gaps left by the standard Core library. Although filter(), joinBy(), splitBy(), and even groupBy() could all be implemented with reduce() alone, we favor the Core library functions as the best implementation of those patterns.
On the other hand, polishing your reduce() game will get you through tight spots sometimes when the requirement is idiosyncratic, or when the standard function just won't do.
So we'll take a data set that represents a crude inventory, and use reduce() to extract a dashboard snapshot showing projected revenue, and potential growth through untapped inventory.
(BTW, if more about DataWeave is not your cup of tea, never fear. We are talking about the onset of automation in the workplace next. So drop by again.)
So here we go.
Quick review in case you haven't pondered the mysteries of reduce() for a while.
The function expects two input arguments: an Array and a lambda. The lambda in turn expects two args: an element from the array (of type Any) and an accumulator. Each element is processed in turn, and reduce() returns the final value of the accumulator.
Your choice about how to initialize the accumulator makes a huge difference in the outcome, so it's important to be clear about the matter.
If you do not initialize the accumulator, then it will take the value of the first element of the input array, and begin the first iteration of the lambda using the second element.
While this can be handy, even sometimes elegant, it also carries the implicit condition that the accumulator will be of the same type as the first element in the array.
If you want the accumulator to be of some type other than the elements in the array, you'll need to initialize it yourself.
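Here is a minimal sketch of both styles side by side. The words array and the field names joined and totalLength are made up for illustration; they are not part of the flight data set.

```dataweave
%dw 2.0
output application/json

// Hypothetical sample input, not part of the flight data
var words = ["alpha", "beta", "gamma"]
---
{
    // No initial value: acc starts out as "alpha" and the lambda
    // first runs with item = "beta". The accumulator stays a String.
    joined: words reduce ((item, acc) -> acc ++ "-" ++ item),

    // Initialized to 0: the accumulator is a Number even though
    // every element of the array is a String.
    totalLength: words reduce ((item, acc = 0) -> acc + sizeOf(item))
}
```

The first call should yield "alpha-beta-gamma" and the second 14, the sum of the three word lengths.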
With that in mind, let's get some code.
Our starting place is an array of records that each represent an airline flight. Here's what one looks like:
{
    "available-seats": "+40.00",
    "airline-name": "delta",
    "flight-code": "a134ds",
    "departure-date": "apr 11, 2018",
    "destination": {
        "open-flights-airport-id": "3484",
        "airport-code": "lax",
        "airport-name": "los angeles international airport",
        "city": "los angeles",
        "dst": "a",
        "altitude": "125",
        "icao": "klax",
        "longitude": "33.94250107"
    },
    "plane-type": null,
    "origination": "mua",
    "price": "+750.00"
}
There are records for each of three different airlines. So in our experiment, we are interested in the airline, the number of available seats, and the price of each seat.
We'll begin by simply calculating global statistics to get our framework in place. Our array will be held in a variable called "flights".
flights reduce (f, a = {numberOfFlights: 0, flightsOpenRevenue: 0}) -> {
    numberOfFlights: a.numberOfFlights + 1,
    flightsOpenRevenue: (a.flightsOpenRevenue + f.price as Number * f."available-seats" as Number)
}
Here's what we get:
{
    "numberOfFlights": 9,
    "flightsOpenRevenue": 153071.38
}
The code is a little dense, but if you trace it through, you can see how the accumulator is initialized and updated. We can do much better, though, if we isolate the accumulator initialization into a variable of its own. This pays dividends downstream, as you will see.
We also took the liberty of reorganizing the calculation to update the revenue. Here's that improved approach:
var acc = {numberOfFlights: 0, flightsOpenRevenue: 0}
---
flights reduce (f, a = acc) -> {
    numberOfFlights: a.numberOfFlights + 1,
    flightsOpenRevenue: a.flightsOpenRevenue +
        f.price as Number *
        f."available-seats" as Number
}
It may seem like a simple stylistic difference. Believe me, when you are in the middle of forming your logic for something like this, layout can make a big difference. You will comment out portions of the expression to observe intermediate state, check the datatype of an expression when you are not sure, and sometimes wrestle with a bright idea you have along the way.
Our outcome from this refactoring move does not change, but now we are poised to expand the complexity of our dashboard significantly. We are out to get something like this:
var dashboard = {
    United: {numberOfFlights: 0, flightsOpenRevenue: 0},
    American: {numberOfFlights: 0, flightsOpenRevenue: 0},
    Delta: {numberOfFlights: 0, flightsOpenRevenue: 0}
}
Of course if you are like me, the repeated phrase makes the teeth hurt! Especially since each repeating phrase is already held in our acc variable. So what we really need is this:
var acc = {numberOfFlights: 0, flightsOpenRevenue: 0}
var d2 = {United: acc, American: acc, Delta: acc}
Your best DataWeave code is always smaller.
Now we can write a lambda that updates each of the Airline objects appropriately.
flights reduce (f, a = d2) ->
    f."airline-name" match {
        case "american" -> {
            United: a.United,
            Delta: a.Delta,
            American: {
                numberOfFlights: a.American.numberOfFlights + 1,
                flightsOpenRevenue: a.American.flightsOpenRevenue +
                    f.price as Number * f."available-seats" as Number
            }
        }
        case "united" -> {
            American: a.American,
            Delta: a.Delta,
            United: {
                numberOfFlights: a.United.numberOfFlights + 1,
                flightsOpenRevenue: a.United.flightsOpenRevenue +
                    f.price as Number * f."available-seats" as Number
            }
        }
        case "delta" -> {
            United: a.United,
            American: a.American,
            Delta: {
                numberOfFlights: a.Delta.numberOfFlights + 1,
                flightsOpenRevenue: a.Delta.flightsOpenRevenue +
                    f.price as Number * f."available-seats" as Number
            }
        }
        else -> a
    }
It does the job, but again, too much repetition. So now we introduce our data to the update operator. It allows us to surgically update an object in a fashion similar to the PATCH operation in HTTP. That is to say, we need only supply the altered portion of the object. The remaining fields will remain unchanged.
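To see the operator on its own before we fold it into reduce(), here is a minimal sketch. The stats variable and its values are made up for illustration.

```dataweave
%dw 2.0
output application/json

// Hypothetical object, not taken from the flight data
var stats = {numberOfFlights: 3, flightsOpenRevenue: 100}
---
// Only the matched field is rewritten; n is bound to its current value.
// flightsOpenRevenue is carried over untouched.
stats update {
    case n at .numberOfFlights -> n + 1
}
```

This should return the same object with numberOfFlights bumped to 4 and flightsOpenRevenue still at 100.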
Here's what our final approach looks like when using update:
flights reduce (f, a = d2) ->
    f."airline-name" match {
        case "american" -> a update {
            case amer at .American -> {
                numberOfFlights: amer.numberOfFlights + 1,
                flightsOpenRevenue: amer.flightsOpenRevenue +
                    f.price as Number * f."available-seats" as Number
            }
        }
        case "united" -> a update {
            case unit at .United -> {
                numberOfFlights: unit.numberOfFlights + 1,
                flightsOpenRevenue: unit.flightsOpenRevenue +
                    f.price as Number * f."available-seats" as Number
            }
        }
        case "delta" -> a update {
            case delt at .Delta -> {
                numberOfFlights: delt.numberOfFlights + 1,
                flightsOpenRevenue: delt.flightsOpenRevenue +
                    f.price as Number * f."available-seats" as Number
            }
        }
        else -> a
    }
In either case, the final outcome gives what we hoped for.
{
    "United": {
        "numberOfFlights": 3,
        "flightsOpenRevenue": 22143.48
    },
    "American": {
        "numberOfFlights": 3,
        "flightsOpenRevenue": 90000
    },
    "Delta": {
        "numberOfFlights": 3,
        "flightsOpenRevenue": 40927.9
    }
}
We might consider that the repetition between the three update cases looks like a good candidate for a function. So if you feel inspired to refactor just a little more, you may discover something amazing about this tiny little experiment.
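If you want a head start, here is one possible first step, offered as a sketch rather than the definitive refactoring. The helper name addFlight is my own invention, and the three-record flights array is a made-up stand-in with plain numeric strings so the script runs on its own.

```dataweave
%dw 2.0
output application/json

// Illustrative stand-in for the real data set; these values are made up
var flights = [
    {"airline-name": "delta", price: "100", "available-seats": "2"},
    {"airline-name": "united", price: "50", "available-seats": "4"},
    {"airline-name": "delta", price: "10", "available-seats": "1"}
]

var acc = {numberOfFlights: 0, flightsOpenRevenue: 0}
var d2 = {United: acc, American: acc, Delta: acc}

// Hypothetical helper: returns one airline's stats updated for one flight
fun addFlight(stats, f) = {
    numberOfFlights: stats.numberOfFlights + 1,
    flightsOpenRevenue: stats.flightsOpenRevenue +
        (f.price as Number) * (f."available-seats" as Number)
}
---
flights reduce ((f, a = d2) ->
    f."airline-name" match {
        case "american" -> a update { case s at .American -> addFlight(s, f) }
        case "united" -> a update { case s at .United -> addFlight(s, f) }
        case "delta" -> a update { case s at .Delta -> addFlight(s, f) }
        else -> a
    }
)
```

The three cases now differ only in the key they touch, which is a hint about where the next round of refactoring might lead.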
And in the meantime, when someone talks to you about "Low Code/No Code" you can smile to yourself. We won't tell them how the sausage is actually made.