GroovyGrid DSL poll: what is the right way to implement branching control structures in Groovy

It is well known that you can't implement your own if/else-style control structure in Java. Continuations are another useful example of such a structure. It is much less well known that you can implement structures of this kind in Groovy. In this short article I will show several ways to do it, all applied to the same use case: checkpoint definition for GroovyGrid. The main question of the post is which option is best from a readability point of view. I would appreciate comments from Groovy users on which one they prefer.

Use case

GroovyGrid is a Groovy DSL for the GridGain framework. I am developing GroovyGrid to simplify the coding of grid applications by using Groovy instead of Java. The use case for this article is checkpoint definition.

So what is a checkpoint? A checkpoint lets a long-running computing job save its intermediate state, so that if the job fails and GridGain starts it again, it can continue from the previously saved state instead of from scratch. In general, the logic of such a job looks like the following pseudo-code:

Check if saved state exists
If it doesn't
     Execute first part of the job
     Save result of first part on checkpoint
Execute second part of job using either loaded or calculated result of first part
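
The pseudo-code above can be sketched in plain Groovy before any DSL is involved. This is only an illustration: a `Map` named `store` stands in for GridGain's checkpoint storage, and `runJob`, `firstPart` and `secondPart` are hypothetical names that do not come from GroovyGrid.

```groovy
// In-memory stand-in for checkpoint storage (assumption: the real storage is GridGain's)
def store = [:]
int firstPartRuns = 0

// Hypothetical two-part job
def firstPart = { firstPartRuns++; 21 }
def secondPart = { state -> state * 2 }

def runJob = {
    def state = store['checkpoint']        // check if saved state exists
    if (state == null) {
        state = firstPart()                // execute the first part of the job
        store['checkpoint'] = state        // save its result as the checkpoint
    }
    secondPart(state)                      // second part uses loaded or calculated state
}

def firstResult = runJob()    // computes the first part
def secondResult = runJob()   // simulates a restart: the first part is skipped
```

Running the job twice against the same `store` shows the point of the checkpoint: the first part is executed only once.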

Now we are ready to look at the implementation options.

Option I: Closure after closure

We can use the following syntax:

checkPoint ("checkpoint") {
     // first part calculation
}
{ state ->
    // second part calculation
}

As you can see, we put two closures one after another without any syntactic indication of a connection between them. Fortunately for us, the Groovy parser understands this construction as a call of the ‘checkPoint’ method with one parameter of type String and two parameters of type Closure. So our implementation of the ‘checkPoint’ method is pretty trivial (leaving out exception handling and the like, of course):

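def checkPoint(String name, Closure before, Closure after) {
    // Try to restore the state saved by a previous run of the job
    Serializable state = gridTaskSession.loadCheckpoint(name)
    if (state == null) {
        // Nothing saved yet: run the first part and save its result under the same name
        state = before()
        gridTaskSession.saveCheckpoint(name, state)
    }
    after(state)
}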
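
To try this method outside a grid, here is a minimal sketch with stand-ins. `FakeSession` and `Job` are hypothetical classes, not part of GridGain or GroovyGrid, and the sketch passes both closures inside the parentheses so that it does not depend on how the parser attaches trailing closures. It saves the state under `name`, the same key it loads from.

```groovy
// Hypothetical in-memory stand-in for the GridGain task session
class FakeSession {
    Map saved = [:]
    def loadCheckpoint(String name) { saved[name] }
    void saveCheckpoint(String name, Object state) { saved[name] = state }
}

// Hypothetical holder for the DSL method
class Job {
    def gridTaskSession = new FakeSession()

    def checkPoint(String name, Closure before, Closure after) {
        def state = gridTaskSession.loadCheckpoint(name)
        if (state == null) {
            state = before()
            gridTaskSession.saveCheckpoint(name, state)
        }
        after(state)
    }
}

def job = new Job()
int firstPartRuns = 0

def run = {
    job.checkPoint('checkpoint', { firstPartRuns++; 21 }, { state -> state * 2 })
}

def first = run()     // first part is executed and its result is saved
def second = run()    // saved state is found, so the first part is skipped
```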

What I like about this option is that it is extremely simple to implement and very straightforward. It is important to note that if we want to use several checkpoints, they are very easy to nest:

checkPoint("state2") {
    checkPoint("state1") {
        // step1
    }
    { state1 ->
        // step2
    }
}
{ state2 ->
    // step3
}

Option II: Control object

What I don’t like about Option I is that the closures are not ‘visually’ connected in the code. So here is an improved syntax:

checkPoint ("checkpoint") {
    // first part calculation
}.andContinue { state ->
    // second part calculation
}

What we gain is ‘andContinue’, which probably improves readability. What we complicate is the implementation: the ‘checkPoint’ method now returns not a value but a control object, which has to be called to complete the calculation.

def checkPoint(String name, Closure before) {
    // The control object is a map whose 'andContinue' entry is a closure
    [
      andContinue : { Closure after ->
          Serializable state = gridTaskSession.loadCheckpoint(name)
          if (state == null) {
              state = before()
              // Save under the same name that loadCheckpoint uses
              gridTaskSession.saveCheckpoint(name, state)
          }

          after(state)
      }
    ]
}
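
The control object does not have to be a map. As a variation, here is a minimal sketch in which it is a small class with a real ‘andContinue’ method. `CheckPointControl`, `FakeSession` and `Job` are hypothetical names introduced only for this illustration, and the session is an in-memory stand-in rather than the real GridGain one.

```groovy
// Hypothetical in-memory stand-in for the GridGain task session
class FakeSession {
    Map saved = [:]
    def loadCheckpoint(String name) { saved[name] }
    void saveCheckpoint(String name, Object state) { saved[name] = state }
}

// Hypothetical control object returned by checkPoint
class CheckPointControl {
    def session
    String name
    Closure before

    def andContinue(Closure after) {
        def state = session.loadCheckpoint(name)
        if (state == null) {
            state = before()
            session.saveCheckpoint(name, state)
        }
        after(state)
    }
}

class Job {
    def gridTaskSession = new FakeSession()

    def checkPoint(String name, Closure before) {
        new CheckPointControl(session: gridTaskSession, name: name, before: before)
    }
}

def job = new Job()
def result = job.checkPoint('checkpoint') {
    21                 // first part calculation
}.andContinue { state ->
    state * 2          // second part calculation
}
```

Nothing runs until ‘andContinue’ is called, so a ‘checkPoint’ call without it silently does nothing. A named class at least gives a place to add a check for that kind of misuse.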

There is a nice trick we can use to vary the same idea: use the name ‘rightShift’ or ‘rightShiftUnsigned’ instead of ‘andContinue’. Groovy maps the >> and >>> operators to methods with these names, which allows us to write code like this (the ‘rightShiftUnsigned’ case):

checkPoint ("checkpoint") {
    // first part calculation
}  >>> { state ->
    // second part calculation
}

Again, nesting is pretty simple:

checkPoint("state2") {
    checkPoint("state1") {
        // step1
    } >>> { state1 ->
        // step2
    }
} >>> { state2 ->
    // step3
}
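
Here is a runnable sketch of the nested ‘>>>’ form. It assumes a class-based control object that declares ‘rightShiftUnsigned’ as a regular method, which is the method name Groovy uses for the ‘>>>’ operator. `FakeSession`, `Control` and `Job` are hypothetical names and the session is an in-memory stand-in.

```groovy
// Hypothetical in-memory stand-in for the GridGain task session
class FakeSession {
    Map saved = [:]
    def loadCheckpoint(String name) { saved[name] }
    void saveCheckpoint(String name, Object state) { saved[name] = state }
}

// Hypothetical control object; Groovy translates 'a >>> b' to a.rightShiftUnsigned(b)
class Control {
    def session
    String name
    Closure before

    def rightShiftUnsigned(Closure after) {
        def state = session.loadCheckpoint(name)
        if (state == null) {
            state = before()
            session.saveCheckpoint(name, state)
        }
        after(state)
    }
}

class Job {
    def gridTaskSession = new FakeSession()

    def checkPoint(String name, Closure before) {
        new Control(session: gridTaskSession, name: name, before: before)
    }
}

def job = new Job()
def result = job.checkPoint('state2') {
    job.checkPoint('state1') {
        1                      // step1
    } >>> { state1 ->
        state1 + 1             // step2
    }
} >>> { state2 ->
    state2 * 10                // step3
}
```

The result of step2 becomes the state saved for the outer checkpoint, so after one run both ‘state1’ and ‘state2’ are stored.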

Option III: No after-branch at all

A careful reader might notice that for our particular use case special syntax for the ‘after branch’ is not necessary, because we always execute it. So we can simply write:

def state = checkPoint ("checkpoint") {
    // first part calculation
}
// second part calculation

The main reason I consider the other options here is that the article is more about methodology than about this particular use case. For example, if we want to execute the first part asynchronously on the grid or on a thread pool and continue with the second part only after the first one has completed (without blocking to wait for it, of course), this option will not work at all.
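
To illustrate the asynchronous case, here is a rough sketch built on `java.util.concurrent.CompletableFuture` from the JDK rather than on GridGain. `checkPointAsync` and `AsyncControl` are hypothetical names, and checkpoint loading and saving are left out to keep the focus on the control flow: the second closure is registered as a callback instead of being executed by the calling thread.

```groovy
import java.util.concurrent.CompletableFuture
import java.util.function.Function
import java.util.function.Supplier

// Hypothetical control object wrapping the running first part
class AsyncControl {
    CompletableFuture future

    // Registers the second part; it runs only after the first part has completed
    CompletableFuture andContinue(Closure after) {
        future.thenApply({ after(it) } as Function)
    }
}

// Hypothetical asynchronous variant: starts the first part on the common thread pool
def checkPointAsync(Closure before) {
    new AsyncControl(future: CompletableFuture.supplyAsync({ before() } as Supplier))
}

def pending = checkPointAsync({
    21                 // first part, executed on a pool thread
}).andContinue { state ->
    state * 2          // second part, executed once the first part is done
}

// Blocking here only so that the sketch can show the final value
def result = pending.get()
```

With Option III there is no place to hook such a callback, because the code after the ‘checkPoint’ call is ordinary sequential code.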

Anyway, for completeness, here is the implementation:

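def checkPoint(String name, Closure before) {
    Serializable state = gridTaskSession.loadCheckpoint(name)
    if (state == null) {
        state = before()
        // Save under the same name that loadCheckpoint uses
        gridTaskSession.saveCheckpoint(name, state)
    }
    // Return the loaded or freshly calculated state to the caller
    state
}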

Main questions

  1. Which option do you prefer?
  2. What are other options?

Please let me know, and enjoy Groovy!

2 Responses to GroovyGrid DSL poll: what is right way to implement branching control structures in Groovy

  1. apohorecki says:

    The second option seems much more readable than the first one. IMHO if you used syntax like:
    checkpoint(‘name’, {}, {})
    it would be more readable too.

    Cheers!
    Adam

  2. ybryukhov says:

    I’d pick the option which would allow to detect user errors (in checkpoint usage) and report them, preferably at compile time
