Oliver Plohmann's Lightweight Ponderings on Software Development

Saturday, 7 February 2015

Overview

Is Google's Go object-oriented now or not?
Inner Pattern to mimic Method Overwriting in Go
Implicits in Scala: Conversion on Steroids
Go-style Goroutines in Java and Scala using HawtDispatch
JDK8 lambdas and anonymous classes
Why I like Eclipse and sometimes not
Groovy 2.0 Performance compared to Java
STIters: Smalltalk-style collection iterators in Kotlin
Beyond C/C++: High-level system programming in D

Is Google's Go object-oriented now or not?

Note: There is a discussion about this post on reddit, which you can find here.

Introduction

The question whether Go, which has no direct support for inheritance, is object-oriented or not pops up frequently on the Go user forum, Stack Overflow, various blogs, and other places on the Internet. Even some books about Go don't explain this well. Having developed in Smalltalk, which is widely considered to be a pure object-oriented language, for about 10 years, I thought my OOP background was sufficient to write an article that sheds some light on this matter. It's a long read, I fear, but that way the matter gets dealt with thoroughly. If you get tired along the way you can also just skip to the concluding chapter ;-).

Message Passing

It is often said that object-oriented programming is about inheritance. At the beginning of OOP inheritance received great attention. It was often sold to management as the tool to reduce development costs: you inherit already existing functionality from super classes and this way have much less development work to do. This idea quickly turned out to be too optimistic. In my old Smalltalk times it was already essential in job interviews to be of the opinion that deep class inheritance trees are hard to change and should therefore be avoided. And that was more than 15 years ago. Maybe because inheritance was new and somewhat "spectacular" it received so much attention although OOP was always about message passing [1, 2]. Let's have a look at a sample that shows message passing in Smalltalk:

| foo |
foo := Foo new.

That's all? Yep, but there is a lot happening. The message with the selector (i.e. the message name) new is sent to the class Foo. The resulting instance of class Foo is stored in the locally defined variable foo. In Smalltalk a class is an instance of its metaclass (which in turn is an instance of class Metaclass). Being an instance, a class is by definition also an object. So Smalltalk is true to the "everything is an object" paradigm. This is what the same thing looks like in Go:

foo := new(Foo)

Here new is not a message that is sent to a receiver object. It is a built-in function that is called with the parameter Foo, where Foo is a type and not an object. So Go does not apply message passing here. But relax, the case is purposefully a slightly extreme one. Many OOP languages do instance creation in much the same way as Go, for example Java, C#, C++, Scala, Groovy, D (or even shorter, simply Foo() as in Kotlin). It is a trade-off between purity and efficiency and in this sense has little to do with being object-oriented or not. It is more about the degree of object-orientedness. In fact, the only languages I know besides Smalltalk that do instance creation by applying message passing and "everything is an object" are Objective-C [3] and Ruby [4].
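To make the comparison concrete, here is a minimal sketch of the usual ways to create an instance in Go. The field Name and the constructor function NewFoo are made up for illustration. None of the three forms sends a message to a class object.

```go
package main

import "fmt"

// Foo is a hypothetical type used only for illustration.
type Foo struct {
	Name string
}

// NewFoo is a plain function that plays the role of a constructor.
// This is a naming convention, not a language feature.
func NewFoo(name string) *Foo {
	return &Foo{Name: name}
}

func main() {
	a := new(Foo)          // built-in function new: pointer to a zeroed Foo
	b := &Foo{}            // composite literal: same result as new(Foo)
	c := NewFoo("example") // constructor function by convention

	fmt.Println(a.Name == "", *a == *b, c.Name)
}
```

In all three cases Foo appears as a type name that is resolved at compile time, which is the trade-off between purity and efficiency mentioned above.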

The important point is that Go allows for the creation of instances. Otherwise we would be doing modular programming as in good old Modula-2. As a side note, Go allows for some flexibility with regard to methods that most languages don't offer: a method can be defined on any named type, not only on structs:

type myint int

func (self *myint) catch22() int {
    return int(*self) + 22
}

func main() {
    myi := myint(123)
    println(myi.catch22())
}

But this has nothing to do with message passing and being object-oriented or not. The broader applicability of methods in Go was only something I thought was worth noting.

Inheritance

Now we are going to attack the real beast: inheritance. Long story short, inheritance is not built into the Go language as it is in OO languages. But Go has interfaces and therefore dynamic dispatch. This gives us the tools to implement, with little extra effort, what inheritance is basically about, and that is method overwriting (more commonly called method overriding; not to be confused with method overloading). From my experience answering questions on the Go user forum about whether Go is object-oriented or not, I know that we first need to spend some time explaining very well why delegation as such (or specifically embedding in Go) is not the same thing as inheritance. The sample code I use to demonstrate this is taken from this well-written Go Primer. It looks like this:

type Base struct {}
func (Base) Magic() { fmt.Print("base magic") }

type Foo struct {
Base
}
func (Foo) Magic() { fmt.Print("foo magic") }

Now let's run the code. First we call Magic() on an instance of Base:

base := new(Base)
base.Magic() // prints "base magic"

No big surprise it prints "base magic" to the console. Next, let's call Magic() on Foo:

foo := new(Foo)
foo.Magic() // prints "foo magic"

And here we go: Go prints "foo magic" to the console and not "base magic". In the sample code Foo delegates to Base. It does not inherit from Base. Nevertheless, the same thing happens here as with inheritance: foo.Magic() prints "foo magic". So we have the same effect as with inheritance and the whole discussion about Go supporting inheritance or not is void. Right? No, not at all! In fact, the sample code above does not introduce any code that makes the difference between delegation and inheritance visible. This was done on purpose as a little pedagogical trick knowing from experience that a stepwise approach makes people better understand the matter.

Okay, but now we are getting real and are adding another method named MoreMagic() to the show:

func (self Base) MoreMagic() {
self.Magic()
self.Magic()
}

Note that it is added to Base and, very importantly, not to Foo. Otherwise we would get the same misleading effect as in the code shown before. The decisive factor about MoreMagic() is that it is defined in Base (and not in a "subclass") and calls a method Magic(), which is defined in both Base and Foo. All right, now let's see what happens when we call MoreMagic():

base := new(Base)
base.Magic() // prints "base magic"
base.MoreMagic() // prints "base magic base magic"

foo := new(Foo)
foo.Magic() // prints "foo magic"
foo.MoreMagic() // prints "base magic base magic" !!

And here we have the proof that method overwriting does not take place in Go, because what is happening here is delegation and not inheritance. What's that? Well, foo.MoreMagic() prints "base magic base magic". In a language that supports inheritance (i.e. method overwriting) "foo magic foo magic" would be printed to the console. We can work around it by adding MoreMagic() to Foo as well, but this results in code duplication. More importantly, the idea of method overwriting is that the calling method is written once and for all in a superclass and no subclass needs to know about it. In fact, there is a way to get method overwriting to work with little extra coding effort. It is described in the blog post "Inner Pattern to mimic Method Overwriting in Go" that I wrote some time ago. I believe hardcore Gophers would call the solution described in that post "against the language". However, it shows that method overwriting can be done in Go without too much effort. And this is what matters for this article.
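The idea behind that workaround can be sketched with the Magic/MoreMagic sample. This is only one possible shape of it, assuming that Base is handed the object it should call back as an interface value. The names Magician and MoreMagicInner, and the use of string return values instead of printing, are made up for this sketch.

```go
package main

import "fmt"

// Magician is a hypothetical interface. It gives Base a way to call
// back into whatever concrete type embeds it.
type Magician interface {
	Magic() string
}

type Base struct{}

func (Base) Magic() string { return "base magic" }

// MoreMagic uses plain delegation: it always calls Base.Magic.
func (b Base) MoreMagic() string {
	return b.Magic() + " " + b.Magic()
}

// MoreMagicInner dispatches through the interface, so the Magic
// method of the concrete type passed in is the one that runs.
func (Base) MoreMagicInner(inner Magician) string {
	return inner.Magic() + " " + inner.Magic()
}

type Foo struct {
	Base
}

func (Foo) Magic() string { return "foo magic" }

// MoreMagic is the one extra method the "subclass" has to provide.
func (f Foo) MoreMagic() string {
	return f.MoreMagicInner(f)
}

func main() {
	foo := new(Foo)
	fmt.Println(foo.Base.MoreMagic()) // delegation only: base magic base magic
	fmt.Println(foo.MoreMagic())      // via the interface: foo magic foo magic
}
```

The logic of MoreMagic lives only in Base. Foo adds a one-line forwarding method, which is the "little extra coding effort" referred to above.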

Inheritance is also about inheriting variables, which I haven't talked about so far. This is a less spectacular issue. There is little use in defining a variable in a subclass with the same name as one in a superclass. In fact, many OO languages don't allow this as it is a source of sometimes hard-to-find bugs. So "variable overwriting" is nothing interesting to look into except for the issue of scoping. If an inherited variable has protected scope it can be seen from subclasses, but not from classes outside of the inheritance path. This cannot be modelled in Go as variables and methods are either private or public. The Go creators applied a smart little trick to get around this: private variables and methods are visible to all code within the same package (and not only within the type where they are defined).

Classes

When talking about OOP we cannot miss out on classes. It is sometimes said that Go does not have them, because Go has a language construct called "struct", but none called "class". In fact, structs in Go serve as structs and classes at the same time. You can define methods for a struct and invoke them on its instances, and therefore a struct in Go can also do what classes can do (except for inheritance). The Base struct shown earlier in this article did just this:

type Base struct {}
func (Base) Magic() { fmt.Print("base magic") }

base := new(Base)
base.Magic()

Here method Magic() is defined for struct Base. You need to create an instance of Base before you can invoke Magic() on it. In Go methods are defined outside structs (aka classes). This is unusual, as in most class-based languages a method that can be invoked on an instance of some class needs to be defined inside that class. The way it is done in Go is just a different way of denoting that a method is associated with some struct. In fact, this is nothing new. It was done like that in Oberon a long time before Go. When I looked at Go for the first time it reminded me a lot of Oberon.

Polymorphism

A major concept in OOP is polymorphism. Actually, it is probably the least understood idea of OOP. I never understood why, since polymorphism is very important in enabling the developer to do things in a transparent way. Polymorphism means that different objects respond to the same message with their own specific behavior. This is why this feature is named polymorphism, which means something like "many shapes". Needless to say, Go supports this:

type Cat struct {
}

func (self Cat) MakeNoise() {
fmt.Print("meow!")
}

type Dog struct {
}

func (self Dog) MakeNoise() {
fmt.Print("woof!")
}

cat := new(Cat)
cat.MakeNoise() // prints "meow!"

dog := new(Dog)
dog.MakeNoise() // prints "woof!"
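The snippet above calls each method on a variable of a known concrete type. Polymorphism becomes visible once the same message is sent through a common interface. The following is a minimal sketch of that. The interface NoiseMaker and the function chorus are made up for illustration, and the methods return strings instead of printing so the result is easy to inspect.

```go
package main

import "fmt"

// NoiseMaker is a hypothetical interface. Any type with a matching
// MakeNoise method satisfies it implicitly.
type NoiseMaker interface {
	MakeNoise() string
}

type Cat struct{}

func (Cat) MakeNoise() string { return "meow!" }

type Dog struct{}

func (Dog) MakeNoise() string { return "woof!" }

// chorus sends the same message to every element and collects the
// type-specific answers.
func chorus(animals []NoiseMaker) []string {
	noises := make([]string, 0, len(animals))
	for _, a := range animals {
		noises = append(noises, a.MakeNoise())
	}
	return noises
}

func main() {
	fmt.Println(chorus([]NoiseMaker{Cat{}, Dog{}})) // [meow! woof!]
}
```

The function chorus knows nothing about Cat or Dog. Which MakeNoise runs is decided at runtime by the value stored in the interface.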

This and That

This section deals with some other things that are often provided by OO languages, but do not each deserve an entire chapter.

Method Overloading. Another feature supported by many OO languages is method overloading, where methods of the same name defined in the same class or inheritance path may have different parameter lists (different length, different types). Go does not have that, and good old pure object-oriented Smalltalk doesn't have it either. So let's agree that it is not essential in OOP. It is just a nicety. Some people even argue that method overloading can have bad side effects (I remember having read a good article about it by Gilad Bracha, but couldn't find it again even after searching the Internet for a long time).

No Class Methods. Go does not have class methods or class variables. A class method is a method that does not act on an instance of a class and produce an instance-specific result, but belongs to the class itself and produces the same result regardless of any instance. In the same way a class variable contains the same value for all instances of a class [5]. The lack of class methods can, for example, be compensated for in Go this way:

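package juice

// Apple and Orange are placeholder types; the original post leaves them undefined.
type Apple struct{}
type Orange struct{}

type Juice struct {
}

// FromApples acts as a constructor for Juice (body left empty in the original post).
func FromApples(apples []Apple) Juice {
    return Juice{}
}

// FromOranges acts as a second constructor for Juice.
func FromOranges(oranges []Orange) Juice {
    return Juice{}
}

// In some other package that imports juice:
juice := juice.FromApples(...)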

The code above was shamelessly stolen from a post on the Go user forum. This looks almost as if FromApples were a class method invoked on a class named Juice. In truth, it is a function defined in a package named juice. What is decisive is that you don't have to know by heart that a constructor FromApples exists somewhere in the namespace. You want to create an instance of Juice and hence you look for constructors of Juice in package juice, where you will find it, because it's the first place to look. Similarly, class variables can easily be mapped to package-level variables.
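The mapping of a class variable to a package-level variable can be sketched as follows. The counter created, the field Fruit, and the functions NewJuice and Created are made up for this illustration. Everything is kept in package main so the sketch runs as a single file, whereas in real code it would live in package juice.

```go
package main

import "fmt"

// created plays the role of a class variable: one value shared by
// all Juice instances. The lower-case name keeps it private to the package.
var created int

type Juice struct {
	Fruit string
}

// NewJuice plays the role of a class method that creates instances.
func NewJuice(fruit string) Juice {
	created++
	return Juice{Fruit: fruit}
}

// Created exposes the shared counter for reading only.
func Created() int {
	return created
}

func main() {
	NewJuice("apple")
	NewJuice("orange")
	fmt.Println(Created()) // 2
}
```

Note that the counter in this sketch is not safe for concurrent use. It only shows where the shared state lives.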

Go Base Library not always really OO

At the language level Go has almost everything it takes to be called object-oriented. To me it does not look quite the same at the level of Go's base library. Here is some code to show what I mean:

package main

import (
    "fmt"
    "strconv"
    "strings"
)

func main() {
    i := strconv.Itoa(123)
    fmt.Println(i)

    i2, _ := strconv.Atoi(i)
    fmt.Println(i2)

    str := "foo"
    str = strings.ToUpper(str)
    fmt.Println(str)
}

When converting between strings and integers you have to know that the functions you need are in package strconv. When converting strings to upper or lower case you have to know to look in package strings for the functions that do that. All right, strings and numbers are built-in types in Go as in many other languages, object-oriented or not. But you should, for example, not have to know about strconv. The functions Itoa and Atoi should be in package strings as well, or maybe in a sub-package of it like strings.conv.

Here is another sample where a non-existent directory named "none" is read and afterwards a check for errors is done:

package main

import (
    "fmt"
    "io/ioutil"
    "os"
)

func main() {
    _, err := ioutil.ReadDir("none")
    if err != nil {
        fmt.Println(os.IsNotExist(err))
    }
}

The way to figure out what kind of error happened is to do the call os.IsNotExist(err). Here you need to know that this function IsNotExist exists in package os. You wouldn't have to know what function exists to do that and where to find it if you could just ask the error value what kind of error it is. The error type would simply have a method like this (the method below does just what os.IsNotExist does on Windows):

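// IoError is a made-up wrapper type; Err holds the underlying error.
type IoError struct {
    Err error
}

func (self IoError) IsNotExist() bool {
    err := self.Err
    // unwrap the error types used by package os
    switch pe := err.(type) {
    case nil:
        return false
    case *os.PathError:
        err = pe.Err
    case *os.LinkError:
        err = pe.Err
    }
    // the two syscall constants exist on Windows only
    return err == syscall.ERROR_FILE_NOT_FOUND ||
        err == syscall.ERROR_PATH_NOT_FOUND || err == os.ErrNotExist
}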

You would just need to have a look at what methods struct IoError provides (IoError is a name I just made up). You see that it has a method IsNotExist(), which is what you need, and then you can call it to get the job done:

if err != nil {
        fmt.Println(err.IsNotExist())
}

That's all. If you don't want to call this object-oriented programming, then you could call it programming with abstract data types or something. However, what is done in many places in the Go standard library is programming with global functions as in C.

So the Go base library leaves some mixed feelings. In many ways it has a function-oriented organisation structure as in C and not a class-based organisation structure as in modular languages or OO languages. To me this is really unfortunate, because the lack of inheritance would not be that much of a problem given that you can work around it. But the function-oriented organisation of the base library in many cases inevitably throws you back into function-oriented programming as in C. Rust, which has some areas of overlap with Go, is really different in this respect and does this kind of thing the err.IsNotExist()-style way everywhere in the standard library.

Conclusion

Go has everything it takes to be called a hybrid object-oriented language except for inheritance. Method overwriting, the core feature of inheritance, can be implemented with little extra effort (one extra method per overwritten method) as Go has interfaces and this way also dynamic dispatch. How to do this is explained in this blog. So if you really can't live without inheritance, the good news is that it can be done. The function-oriented rather than class-oriented organisation structure of the Go base library leaves a bit of a bad aftertaste, though.

A well-balanced Last Comment

Programming languages are like human beings: When you are too strict with them about a particular trait you might overlook many of the good traits. The real good trait to mention about Go is concurrency. IMHO, if some application is heavily concurrent Go might be a suitable choice and the language being in some ways relatively simple becomes a secondary issue. CSP-style inter-process communication as in Go using channels and goroutines makes things so much simpler than working with locks, mutexes, semaphores, etc.

I have spent some years working on a quite concurrent application where I was mostly busy reproducing deadlocks and race conditions and looking for ways to fix them (neither is easy and both often take quite some time). From that point of view concurrency control in Go through the use of channels greatly simplifies concurrent programming and saves an enormous amount of time.
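As a small illustration of this style, here is a minimal sketch in which several goroutines hand their results to a single collecting goroutine over a channel. The function sumSquares is made up for this example.

```go
package main

import "fmt"

// sumSquares starts one goroutine per input and collects the results
// over a channel. No mutex is needed because only the collecting
// goroutine ever touches the running total.
func sumSquares(nums []int) int {
	results := make(chan int)
	for _, n := range nums {
		go func(n int) {
			results <- n * n
		}(n)
	}
	total := 0
	for range nums {
		total += <-results
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4})) // 30
}
```

The shared state (total) is owned by one goroutine, and the others communicate with it instead of locking it.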

Also, Go solves the C10K problem out of the box. This comes at the price of a reduced level of pre-emptiveness. But this can be dealt with in most cases and for some applications this issue does not even matter.

Related Articles

Inner Pattern to mimic Method Overwriting in Go
Is Go an Object Oriented language?
Go Tutorial: Object Orientation and Go's Special Data Types

[1] Inheritance was invented in 1967 for Simula and then was picked up later by Smalltalk, see this article on Wikipedia.
[2] See this article about message passing on Wikipedia.
[3] Objective-C: Foo *foo = [[Foo alloc] init];
[4] Ruby: foo = Foo.new()

[5] Class methods and class variables are called static methods and static variables in Java, borrowing the terminology from C/C++, where it means exactly the same thing.

Saturday, 7 September 2013

Inner Pattern to mimic Method Overwriting in Go

Note: I wrote a similar blog post "Is Google's Go object-oriented now or not?" that is not specifically about inheritance and Go, but about Go and object-orientedness in general.

Go [1, 2] is a relatively new statically typed programming language developed at Google that compiles to the metal. It feels like a modernized C in that it resembles C a lot, but has closures, garbage collection, communicating sequential processes (CSP) [3, 4, 5] and indeed very fast build times (and no exception handling, no templates, but also no macros). Build times in Go are so fast that building is almost instant and feels like working with a dynamically typed scripting language with almost no turnaround times. Performance in Go 1.2 is somewhat in the same league as Java [6, 7, 8, 9]. It could be faster, as there is still room for optimization, but it is already a lot faster than scripting languages like Ruby, Python, or Lua. From looking at job ads Go seems to appeal most to people in the Python, PHP or Ruby camp. Contrary to Python, Ruby, and Lua, Go does multi-threading well and makes thread synchronization easier through the use of CSP [10].

Motivation

Go relies on delegation, which Go aficionados call embedding (see chapter "Inheritance" in [2]). There is no inheritance and hence no method overriding. This means that "f.MoreMagic()" in the last line of main in the code snippet below (sample shamelessly stolen from [2]) does not print "foo magic" to the console, as one might expect, but "base magic":

package main

import "fmt"

type Base struct {}
func (Base) Magic() { fmt.Print("base magic") }
func (self Base) MoreMagic() {
self.Magic()
}

type Foo struct {
Base
}
func (Foo) Magic() { fmt.Print("foo magic") }

func main() {
    f := new(Foo)
    f.Magic() //=> foo magic
    f.MoreMagic() //=> base magic
}

So is there a way to mimic method overriding in Go at a reasonable cost? Many Go developers would consider the mere idea of mimicking it in Go as not in line with the language and beside the point. But method overriding offers a great deal of flexibility: it makes it possible to redefine inherited default behavior and to make an object or algorithm behave as appropriate for its kind in a transparent way.

Inner Pattern

What first struck me when looking at the Magic/MoreMagic sample was the inner construct [12] in the Beta programming language, which is a kind of opposite of super. Applying that idea in this case I came up with a solution, which I named the "inner pattern":

package main

import "fmt"

type Animal interface {
    Act()

    // Definition of ActInner tells developer to define a function Act with the
    // body calling ActInner as in Dog.Act() or Cat.Act()
    ActInner(inner Animal)

    makeNoise()
}

type Mamal struct {
    Animal
}

func (self Mamal) makeNoise() {
    fmt.Println("default noise")
}

func (self Mamal) ActMamal() {
    self.makeNoise()
}

func (self Mamal) ActInner(inner Animal) {
    inner.makeNoise()
}

type Dog struct {
    Mamal
}

func (self Dog) makeNoise() {
    fmt.Println("woof! woof!")
}

func (self Dog) Act() {
    self.ActInner(self)
}

type Cat struct {
    Mamal
}

func (self Cat) makeNoise() {
    fmt.Println("meow! meow!")
}

func (self Cat) Act() {
    self.ActInner(self)
}

func main() {
    dog := new(Dog)
    dog.ActMamal() // prints "default noise" but not "woof! woof!"
    dog.Act() // prints "woof! woof!" as expected

    cat := new(Cat)
    cat.ActMamal() // prints "default noise" but not "meow! meow!"
    cat.Act() // prints "meow! meow!" as expected
}

Note that the method makeNoise has to be public (i.e. exported as MakeNoise) in case the structs Cat and Dog are placed in separate packages, which for the sake of simplicity is not the case in the sample code above. Otherwise, the code would still compile, but at runtime Mamal.makeNoise would always be called instead of Cat.makeNoise or Dog.makeNoise, regardless of the type of the receiver object.

So we get "method overriding" this way at the cost of having to stick to some kind of convention: If there is a method in a struct delegated to that has a parameter named inner like ActInner(inner Animal), we need to add a method Act() in our "subclass" calling ActInner:

func (self Dog) Act() {
self.ActInner(self)
}

func (self Cat) Act() {
self.ActInner(self)
}

This solution is not as nicely transparent as, for example, in Java, where you would just add a method act() to your subclass that overrides the inherited method act() and that's it. Come to think of it, in C++ you can only override an inherited method if the inherited one is marked as virtual. So in C++ and other languages like Kotlin [13] or Ceylon [14] you also need to "design ahead" and think about whether a method is intended to be overridable. And with this solution in Go the cost of dynamic dispatch is only paid where it is explicitly asked for: the call inner.makeNoise() inside ActInner(inner Animal) goes through the interface, while all other method calls remain statically bound.

Also, in case struct Dog or Cat does not implement the method makeNoise, the method Mamal.makeNoise() will be called at runtime. The Go compiler won't complain about some "subclass" Dog or Cat not implementing the "abstract" method makeNoise, as would be the case in Java or other OO languages that support abstract classes. There is no way round that, as in the end a price has to be paid for avoiding the performance penalty of dynamically dispatched method calls that OO languages with built-in method overriding incur.
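The silent fallback can be seen in the following reduced sketch of the inner pattern. It is simplified compared to the listing above (Mamal does not embed Animal here, and the methods return strings), and the type Cow is made up to represent a "subclass" whose author forgot to implement makeNoise.

```go
package main

import "fmt"

type Animal interface {
	makeNoise() string
}

type Mamal struct{}

func (Mamal) makeNoise() string { return "default noise" }

// ActInner calls makeNoise on whatever concrete type is passed in.
func (Mamal) ActInner(inner Animal) string { return inner.makeNoise() }

// Dog overrides makeNoise.
type Dog struct{ Mamal }

func (Dog) makeNoise() string { return "woof! woof!" }

func (d Dog) Act() string { return d.ActInner(d) }

// Cow is a hypothetical "subclass" without its own makeNoise. It
// still satisfies Animal through the promoted Mamal.makeNoise, so
// the compiler accepts it without complaint.
type Cow struct{ Mamal }

func (c Cow) Act() string { return c.ActInner(c) }

func main() {
	fmt.Println(Dog{}.Act()) // woof! woof!
	fmt.Println(Cow{}.Act()) // default noise
}
```

Nothing marks makeNoise as abstract, so the omission in Cow only shows up in the program's behavior.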

Conclusion

My preliminary conclusion of all this is to use the "inner pattern" in Go as an ex post refactoring measure when code starts to lack too much transparency. Applying it from the start is too much pain when the gain in the end is uncertain. Only apply it ex ante when it is clear from the beginning that the flexibility will be needed anyway, for example with templated algorithms.

By the way, I think Rob Pike is right with what he says about CSP [10]. So I had a look how it can be done in Java. Groovy has it already [15] and for Java there is [16]. When casting CSP into an object-oriented mold you end up with something like actors. The best actor implementation for Java/Scala is probably Akka.

Update: Discussion on Reddit

There has been a discussion on Reddit concerning the approach proposed in this blog post. Below are my remarks on the various comments made in that discussion.

« The link within the post for "Inner Pattern to mimic Method Overwriting in Go" is a bit odd. The Mamal struct contains an Animal interface but it is never assigned to or used, instead he's passing around an Animal interface explicitly and having to redeclare the Act func for both Dog and Cat. This is a bit cleaner, it uses the previously unused member interface Animal, and only has one Act func at the Mamal level which can dispatch generically to either Dog or Cat. »

The main function in the suggested solution ("This is a bit cleaner") looks like this:

func main() {
        dog := new(Dog)
        dog.Animal = dog
        dog.Mamal.makeNoise() // prints "default noise"
        dog.makeNoise()       // prints "woof! woof!"
        dog.Act()
}

The problem with the solution above is that it is not transparent. The developer needs to do some explicit variable switching as in "dog.Animal = dog", which breaks encapsulation. In the approach proposed in this blog post a "subclass" only needs to implement a method Act() that calls ActInner(...) and that's it. Variable switching from the outside (which breaks encapsulation) is not required and the developer does not need to know that it has to be done (information hiding). This is conceptually the idea of method overwriting in OOP. I'm not proficient in C, but I would guess that such a solution with field assignment in records is what has to be resorted to when using procedural languages.
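The commenter's full code is not reproduced in this post. The following is therefore only a reconstruction of how such a solution might look, based on the main function quoted above. The body of Mamal.Act and the string return values are assumptions.

```go
package main

import "fmt"

type Animal interface {
	makeNoise() string
}

type Mamal struct {
	Animal // has to be set from the outside before Act is called
}

func (Mamal) makeNoise() string { return "default noise" }

// Act dispatches through the embedded interface value.
func (m Mamal) Act() string { return m.Animal.makeNoise() }

type Dog struct{ Mamal }

func (Dog) makeNoise() string { return "woof! woof!" }

func main() {
	dog := new(Dog)
	dog.Animal = dog             // the explicit wiring step criticised above
	fmt.Println(dog.Mamal.makeNoise()) // default noise
	fmt.Println(dog.Act())             // woof! woof!
}
```

If the assignment dog.Animal = dog is left out, the embedded interface is nil and the call to Act panics at runtime, which illustrates why the caller has to know about this step.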

« You are correct about the interfaces, here's an example based on the code in the article: http://play.golang.org/p/de5-18d6aP »

The main function in the suggested solution ("http://play.golang.org/p/de5-18d6aP") looks like this:

          func main() {
                  noise(Dog{})
                  noise(Cat{})
          }

The solution above applies a procedural approach where compared to object-oriented message passing receiver object and method argument are swapped. The receiver object (the object, the message is being sent to) is passed as the parameter to a procedure (aka function), which for OOP would be the wrong way round. The solution in my proposed approach defines an additional indirection to "restore" message passing as in OOP with Dog{} or Cat{} being the receiver objects (e.g. dog.Act(), cat.Act()).
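The linked playground code is not reproduced here, so the following is only a guess at its general shape, written to show the difference in call direction. The function noise takes the animal as an argument, whereas in the inner pattern the animal is the receiver of Act().

```go
package main

import "fmt"

type Animal interface {
	makeNoise() string
}

type Dog struct{}

func (Dog) makeNoise() string { return "woof! woof!" }

type Cat struct{}

func (Cat) makeNoise() string { return "meow! meow!" }

// noise is a free function: the animal is passed in as an argument
// and is not the receiver of the call.
func noise(a Animal) string {
	return a.makeNoise()
}

func main() {
	fmt.Println(noise(Dog{})) // procedural style: function(object)
	fmt.Println(noise(Cat{}))
}
```

Both styles end up in the same interface call. The difference discussed above is only about which side of the call the object appears on.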

« Yes, he is just showing that two independent structs can have a method with the same name, which isn't really surprising. »

Yes, I agree with that. My intention was only to show that this is supported, in contrast to languages that are modular and not class-based such as Modula-2. I might be missing further criteria that need to be fulfilled.

References

[1]		Effective Go
[2]		Google Go Primer

[3]		Communicating Sequential Processes by Hoare
[4]		Goroutines
[5]		Race Detector
[6]		Benchmarks Game
[7]		Performance of Rust and Dart in Sudoku Solving
[8]		Benchmarks Round Two: Parallel Go, Rust, D, Scala
[9]		Benchmarking Level Generation in Go, Rust, Haskell, and D

[10]		Robert Pike in "Origins of Go Concurrency" (Youtube video, from about position 29:00): "The thing that is hard to understand until you've tried it is that the whole business about finding deadlocks and things like this doesn't come up very much. If you use mutexes and locks and shared memory it comes up about all the time. And that's because it is just too low level. If you program like this (using channels) deadlocks don't happen very much. And if they do it is very clear why, because you have got this high-level state of your program 'expressing this guy is trying to send here and he can't, because he is here'. It is very very clear as opposed to being down on the mutex and shared memory level where you don't know who owns what and why. So I'm not saying it is not a problem, but it is not harder than any other class of bugs you would have."

[11]		Go FAQ Overloading

[12]		Super and Inner — Together at Last! (PDF)

[13]		Inheritance in Kotlin
[14]		Inheritance in Ceylon
[15]		CSP in Groovy
[16]		Go-style Goroutines in Java and Scala using HawtDispatch

Friday, 30 August 2013

Implicits in Scala: Conversion on Steroids

Also published on java.dzone.com, see link.

With the use of implicits in Scala you can define custom conversions that are applied implicitly by the Scala compiler. Other languages also provide support for conversion, e.g. C++ provides conversion operators. Implicits in Scala go beyond what C++ conversion operators make possible. At least I don't know of any other language where implicit conversion goes as far as in Scala. Let's have a look at some sample Scala code to demonstrate this:

class Foo {
    def foo {
        print("foo")
    }
}

class Bar {
    def bar {
        print("bar")
    }
}

object Test
{
    implicit def fooToBarConverter(foo: Foo) = {
        print("before ")
        foo.foo
        print(" after ")
        new Bar
    }

    def main(args: Array[String]) {
        val foo = new Foo
        foo.bar
    }
}

Running Test.main will print to the console: "before foo after bar". What is happening here? When Test.main is run the method bar is invoked on foo, which is an instance of class Foo. However, there is no method bar defined in class Foo (nor in any superclass). So the compiler looks for an implicit conversion that converts Foo to some other type. It finds the implicit fooToBarConverter and applies it. Then it tries again to invoke bar, but this time on an instance of class Bar. As class Bar defines a method named bar the problem is resolved and compilation continues. For a more detailed description of the compilation rules for implicits see this article by Martin Odersky, Lex Spoon, and Bill Venners. Note that no part of the conversion code from Foo to Bar is defined in either class Foo or class Bar. This is what makes Scala implicits so powerful in the given sample (and also a bit dangerous as we shall see in the following).

If we tried to accomplish something similar in C++ we would end up with something like this (C++ code courtesy of Paavo Helde, some helpful soul on comp.lang.c++):

#include <iostream>

class Foo {
public:
    void foo() {
        std::cout << "foo\n";
    }
};

class Bar {
public:
    Bar(Foo foo) {
        std::cout << "before\n";
        foo.foo();
        std::cout << "after\n";
    }
    void bar() {
        std::cout << "bar\n";
    }
};

void bar(Bar bar) {
    bar.bar();
}

int main() {
    Foo foo;
    bar(foo);
}

There are also "to" conversion operators defined by the syntax like:

class Foo {
operator Bar() const {return Bar(...);}
};

Note that such conversions are often considered "too automatic" for robust C++ code, and thus commonly the "operator Bar()" style conversion operators are just avoided and the single-argument constructors like Bar(Foo foo) are marked with the 'explicit' keyword so the code must explicitly mention Bar in its invocation, e.g. bar(Bar(foo)).

The C++ code, including the comment in the paragraph above, is courtesy of Paavo Helde. As can be seen, it is not possible in C++ to achieve the same result as with implicits in Scala: there is no way to move the conversion code completely out of both class Foo and class Bar and still get things to compile. So conversion in C++ is less powerful than in Scala on the one hand. On the other hand it is also less scary than implicits in Scala, where it might get difficult to maintain a large Scala code base over time if implicits are not handled with care.

Looking for a matching implicit to resolve some compilation error can keep the compiler busy if it repeatedly has to look through a large code base. This is also why the compiler only tries the first matching implicit conversion it can find and aborts compilation if applying the found implicit doesn't resolve the issue. Also, if implicits are overused you can run into a situation where you need to step through your code with the debugger to figure out which conversion results in a different output than expected. This is an issue that made the developers of Kotlin drop implicits from their Scala-like language (see reference). The problem that you can shoot yourself in the foot when overusing implicits is well known in the Scala community. For instance, "Programming Scala" [1] says on page 189: "Implicits can be perilously close to "magic". When used excessively, they obfuscate the code's behavior. (...) In general, implicits can cause mysterious behavior that is hard to debug! (...) Use implicits sparingly and cautiously.".

What remains on the positive side is a powerful language feature that has often proven very useful when applied with care. For instance, Scala implicits do a great job of gluing together disparate APIs transparently or of achieving genericity. This article only dealt with one specific aspect of implicits. Scala implicits have many other applications; see, for example, this article by Martin Odersky, Lex Spoon, and Bill Venners.

[1] "Programming Scala", Dean Wampler & Alex Payne, O'Reilly, September 2009, 1st Edition.

Sunday, 9 June 2013

Go-style Goroutines in Java and Scala using HawtDispatch

Also published on java.dzone.com, see link.

In Google Go any method or closure can be run asynchronously when prefixed with the keyword go. According to the documentation "(...) they're called goroutines because the existing terms—threads, coroutines, processes, and so on—convey inaccurate connotations" (see reference). For that reason I also stick to the term goroutine. Goroutines are the recommended method for concurrent programming in Go. They are lightweight and you can easily create thousands of them. To make this efficient goroutines are multiplexed onto multiple OS threads. Networking and concurrency is really what Go is about.

This post describes how to use HawtDispatch to achieve a very similar result in Java. HawtDispatch is a thread pooling and NIO event notification framework that does the thread multiplexing which in Go is built into the language. There is also a Scala version of HawtDispatch, so the approach described here for Java can be applied in the same way in Scala. The code shown in this post can be downloaded here from GitHub (it includes a Maven pom.xml to get HawtDispatch installed). Go provides channels as a means for goroutines to exchange information. We can model channels in Java through the BlockingQueues introduced in JDK5.

Let's have a look at some Go code that makes use of goroutines and channels (the sample code is shamelessly stolen from this article, see the chapter named "Channels"):

ch := make(chan int)

go func() {
  result := 0
  for i := 0; i < 100000000; i++ {
    result = result + i
  }
  ch <- result
}()

/* Do something for a while */

sum := <-ch // This will block if the calculation is not done yet
fmt.Println("The sum is: ", sum)

Making use of JDK8 default methods, we can define something like a go keyword in our Java world. For that purpose I created a method named async (with a pre-JDK8 release we would have to stick to slightly less elegant static methods):

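import org.fusesource.hawtdispatch.Dispatch;

public interface AsyncUtils {
    // Runs the given Runnable asynchronously on HawtDispatch's global queue.
    default void async(Runnable runnable) {
        Dispatch.getGlobalQueue().execute(runnable);
    }
}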

The async method will execute Runnables on a random thread of a fixed size thread pool. If you wanted to implement something like actors using HawtDispatch you would use serial dispatch queues. Here is a simplistic actor implemented using HawtDispatch (with queueing being serial through the use of the queue class DispatchQueue):

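import org.fusesource.hawtdispatch.Dispatch;
import org.fusesource.hawtdispatch.DispatchQueue;

public class HelloWorldActor {

    // A serial queue: tasks submitted to it run one at a time, in order.
    private final DispatchQueue queue = Dispatch.createQueue();

    public void sayHello() {
        queue.execute(() -> System.out.println("hello world!"));
    }

    public static void main(String[] args) {
        HelloWorldActor actor = new HelloWorldActor();
        actor.sayHello(); // asynchronously prints "hello world!"
    }
}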
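
For readers without HawtDispatch at hand, the serial-queue idea can be sketched with the JDK alone. This is only an approximation under my own assumptions: the class SerialRecorder and its methods are made up for illustration, and a single-threaded ExecutorService stands in for a HawtDispatch serial queue (it is not how HawtDispatch is implemented).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical JDK-only analogue of an object that owns a serial queue.
final class SerialRecorder {

    // One daemon worker thread: tasks run one at a time, in submission order.
    private final ExecutorService queue = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "serial-queue");
        t.setDaemon(true);
        return t;
    });

    // Only ever touched from the queue's thread, so it needs no lock.
    private final List<Integer> seen = new ArrayList<>();

    // Fire-and-forget: the state change is scheduled, not executed by the caller.
    void record(int value) {
        queue.execute(() -> seen.add(value));
    }

    // A query is just another task on the same queue, so it runs after
    // every record() call that was submitted before it.
    List<Integer> snapshot() {
        Callable<List<Integer>> copy = () -> new ArrayList<>(seen);
        try {
            return queue.submit(copy).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        SerialRecorder recorder = new SerialRecorder();
        for (int i = 1; i <= 5; i++) {
            recorder.record(i);
        }
        System.out.println(recorder.snapshot()); // prints [1, 2, 3, 4, 5]
    }
}
```

Because all access to the list goes through the one queue, the object's state is never touched by two threads at once, which is the property a serial queue is meant to provide.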

To be precise, the HelloWorldActor in the snippet above is more of an active object, as functions are scheduled rather than messages as with actors. This little actor sample was shown to demonstrate that you can do much more with HawtDispatch than just run methods asynchronously. Now it is time to implement the Go sample in Java with what we have built up so far. Here we go:

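import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.fusesource.hawtdispatch.internal.DispatcherConfig;
import org.junit.After;
import org.junit.Test;

public class GoroutineTest implements AsyncUtils {

    @Test
    public void sumAsync() throws InterruptedException {
        BlockingQueue<Long> channel = new LinkedBlockingQueue<>();

        async(() -> {
            // long, not int: the sum of 0..99999999 overflows a 32-bit Java int
            long result = 0;
            for (int i = 0; i < 100000000; i++) {
                result = result + i;
            }
            channel.add(result);
        });

        /* Do something for a while */

        long sum = channel.take(); // blocks if the calculation is not done yet
        System.out.println("The sum is: " + sum);
    }

    @After
    public void tearDown() throws InterruptedException {
        DispatcherConfig.getDefaultDispatcher().shutdown();
    }
}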

The code presented here would also work with a pre-JDK8 release, since JDK8 is not a requirement for HawtDispatch. I just preferred to use JDK8 lambdas and default methods (formerly known as defender methods) to make the sample code more compact.
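
For anyone who wants to try the idea without HawtDispatch on the classpath, here is a minimal sketch that uses only the JDK. Everything in it is a stand-in of mine and not HawtDispatch API: the class name JdkGoroutines, the cached daemon thread pool behind async, and the SynchronousQueue, which I assume comes closer to Go's unbuffered make(chan int) than an unbounded LinkedBlockingQueue does. The accumulator is a long because the sum of 0..99999999 does not fit into a Java int.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.SynchronousQueue;

// Hypothetical JDK-only stand-in for the HawtDispatch-based async() method.
final class JdkGoroutines {

    // Daemon threads, so an idle pool never keeps the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "goroutine");
        t.setDaemon(true);
        return t;
    });

    // Rough equivalent of Go's "go func() {...}()".
    static void async(Runnable body) {
        POOL.execute(body);
    }

    // Sums 0..n-1 on another thread and hands the result over a rendezvous "channel".
    static long sumAsync(int n) {
        BlockingQueue<Long> channel = new SynchronousQueue<>();
        async(() -> {
            long result = 0;
            for (int i = 0; i < n; i++) {
                result += i;
            }
            try {
                channel.put(result); // like "ch <- result": blocks until taken
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        /* Do something for a while */

        try {
            return channel.take(); // like "<-ch": blocks until a value arrives
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the sum", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("The sum is: " + sumAsync(100_000_000));
        // prints "The sum is: 4999999950000000"
    }
}
```

Unlike HawtDispatch, this sketch dedicates a pool thread to each running task, so it illustrates the programming model rather than the lightweight multiplexing.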

Wednesday, 2 January 2013

JDK8 lambdas and anonymous classes

Preview releases of the upcoming JDK8, including the long-awaited lambdas, have been available for several months now. Time to have a look at lambdas to see what they are and what you can expect from them.

So today, I downloaded the latest preview release of the JDK8 from jdk8.java.net/lambda to have a look at the upcoming lambdas in JDK8. To my despair, this code snippet did not compile:

        List<Integer> ints = new ArrayList<>();
        ints.add(1);
        ints.add(2);
        ints.add(3);

        int sum = 0;
        ints.forEach(i -> { sum += i; });

The compiler error was: "value used in lambda expression should be effectively final". The compiler complains here that the variable sum is assigned to inside the lambda and is therefore neither final nor effectively final. Also see this blog post, which is part of the JDK8 lambda FAQ and explains the matter (I perpetually insist on having found the issue independently of this post ;-)). So lambdas in JDK8 carry exactly the same restriction as anonymous classes, and you have to resort to the same kind of workaround:

        int[] sumArray = new int[] { 0 };
        ints.forEach(i -> { sumArray[0] += i; });
        System.out.println(sumArray[0]);

This works and prints 6 as expected. Note that the compiler does not complain here about sumArray not being declared final, as it is effectively final: "A variable is effectively final if it is never assigned to after its initialization" (see link). This is a new feature in JDK8: with a pre-JDK8 compiler the code below does not compile unless value is declared final:

        final long[] value = new long[] { 0 };
        Runnable runnable = new Runnable() {
            public void run() {
                value[0] = System.currentTimeMillis();
                System.out.println(value[0]);
            }
        };

However, this means that JDK8 lambdas are not true closures, since they can only read free variables that are effectively final and cannot assign to them, whereas unrestricted access to free variables is a requirement for an expression to be a closure:

"When a function refers to a variable defined outside it, it's called a free variable. A function that refers to a free lexical variable is called a closure.". Paul Graham, ANSI Common Lisp, Prentice Hall, 1996, p.107.

The free variable gives the closure expression access to its environment:

"A closure is a combination of a function and an environment.". Paul Graham, ANSI Common Lisp, Prentice Hall, 1996, p.108.

In the end we can conclude that JDK8 lambdas are less verbose than anonymous classes (and there is no instantiation overhead as with anonymous classes, as lambdas compile to method handles), but they carry the same restrictions as anonymous classes. The lambda specification (JSR 335) also says so explicitly: "For both lambda bodies and inner classes, local variables in the enclosing context can only be referenced if they are final or effectively final. A variable is effectively final if it is never assigned to after its initialization.". Here is also a link to an article where Neal Gafter himself (who was a member of the BGGA team) tried to explain why inner classes are not closures (read the comments section). However, all this is only a minor drawback, as the usefulness of closures is preserved to a large extent. An immense amount of pre-JDK8 boilerplate code can now be replaced with much more concise expressions. And in the end, you can still do this:

int sum = ints.stream().reduce(0, (x, y) -> x + y);
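
As a small sketch of my own (the class and method names are made up for illustration), here are three equivalent ways to compute the sum with the Stream API of the released JDK8, none of which needs a mutable captured variable:

```java
import java.util.Arrays;
import java.util.List;

final class SumWithoutMutation {

    // Fold with an explicit identity and accumulator lambda.
    static int sumWithReduce(List<Integer> ints) {
        return ints.stream().reduce(0, (x, y) -> x + y);
    }

    // Same fold, with a method reference as the accumulator.
    static int sumWithMethodRef(List<Integer> ints) {
        return ints.stream().reduce(0, Integer::sum);
    }

    // Unboxes to an IntStream first, which has sum() built in.
    static int sumWithIntStream(List<Integer> ints) {
        return ints.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        System.out.println(sumWithReduce(ints));    // prints 6
        System.out.println(sumWithMethodRef(ints)); // prints 6
        System.out.println(sumWithIntStream(ints)); // prints 6
    }
}
```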

Nevertheless, the difference between JDK8 lambdas and closures is worth noting. There is a nice write-up about many of the things you can do with JDK8 lambdas in this blog post. Here is some sample code from it:

        List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "Dave");
        names
            .mapped(e -> { return e.length(); })
            .asIterable()
            .filter(e -> e.getValue() >= 4)
            .sorted((a, b) -> a.getValue() - b.getValue())
            .forEach(e -> { System.out.println(e.getKey() + '\t' + e.getValue()); });
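
Note that mapped and asIterable come from a preview build and are not part of the List API in the released JDK8. A roughly equivalent pipeline with the final Stream API might look like the sketch below. The class name NamesByLength and the method describe are my own invention, and I assume the intent of the original is "keep names with at least four characters, shortest first".

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class NamesByLength {

    // Keeps names with at least minLength characters, shortest first,
    // each formatted as "name<TAB>length".
    static List<String> describe(List<String> names, int minLength) {
        return names.stream()
                .filter(name -> name.length() >= minLength)
                .sorted(Comparator.comparingInt(String::length))
                .map(name -> name + '\t' + name.length())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "Dave");
        // Prints Dave, Alice and Charlie, each followed by a tab and its length.
        describe(names, 4).forEach(System.out::println);
    }
}
```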

We can also reference a static method:

        executorService.submit(MethodReference::sayHello);

        private static void sayHello() {
            System.out.println("hello");
        }
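
A self-contained variant of the same idea, as a sketch with made-up names (MethodReferenceDemo, helloViaExecutor): here the static method returns a value, so the method reference is adapted to a Callable and the result can be fetched from the Future.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class MethodReferenceDemo {

    static String sayHello() {
        return "hello";
    }

    // Submits the static method as a task and waits for its result.
    static String helloViaExecutor() {
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        try {
            // The target type (Callable) tells the compiler how to adapt the reference.
            Callable<String> task = MethodReferenceDemo::sayHello;
            return executorService.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executorService.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(helloViaExecutor()); // prints hello
    }
}
```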

The lambda FAQ says about the restriction on local variable capture explained in this article: "The restriction on local variables helps to direct developers using lambdas away from idioms involving mutation; it does not prevent them. Mutable fields are always a potential source of concurrency problems if sharing is not properly managed; disallowing field capture by lambda expressions would reduce their usefulness without doing anything to solve this general problem.".

The author is making the point here that immutable variables, like those declared final, cannot be changed inadvertently by some other thread. A free variable referenced from within a closure expression (but declared outside the closure) is allocated somewhere on the heap, which means that it is not local to some specific stack (hence it is free). Being allocated on the heap, a free variable can be seen by all other threads as well. This way a free variable can effectively become a variable shared between threads, for which access needs to be synchronized to prevent threads from seeing stale data. So the finalness restriction for JDK8 lambdas helps in avoiding that kind of trouble. Note, however, that this is only true for simple types or objects that are not nested, as can be seen in the sample code at the beginning of this text with the single-element array sumArray: the final variable sumArray cannot be re-assigned, but the single element it holds can be changed at any time.
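
To make the last point concrete, here is a small sketch (class and method names are hypothetical). The first method uses the array-holder workaround, which is fine as long as the lambda runs on a single thread. The second uses an AtomicLong from the JDK, which is one way to keep the shared update correct when the lambda may run on several threads, as it does with a parallel stream.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.IntStream;

final class CapturedState {

    // The array reference is effectively final, but its single slot is not.
    // Fine on one thread; with parallel() the "+=" would be a data race.
    static long sumWithArrayHolder(int n) {
        long[] holder = { 0 };
        IntStream.range(0, n).forEach(i -> holder[0] += i);
        return holder[0];
    }

    // AtomicLong keeps the shared update correct even when the lambda
    // is executed by several threads at once.
    static long sumWithAtomic(int n) {
        AtomicLong total = new AtomicLong();
        IntStream.range(0, n).parallel().forEach(i -> total.addAndGet(i));
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(sumWithArrayHolder(1000)); // prints 499500
        System.out.println(sumWithAtomic(1000));      // prints 499500
    }
}
```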

Sunday, 21 October 2012

Why I like Eclipse and sometimes not

I learned from the comedy movie Borat that a typical way to turn a statement into a humorous one is to append "not" at the end of it. So I did this as well in the title of this article. Admittedly, the main reason is that no one would otherwise read an article titled "Why I like Eclipse" ... ;-).

I often meet people in projects who are really into NetBeans or IntelliJ IDEA and not into Eclipse at all. These people don't understand why someone like me would work with Eclipse (I also use IntelliJ IDEA quite a bit). The problem is that explaining why I like Eclipse results in a long talk. First it is about Eclipse background knowledge, which demands a lot of patience and distracts people from their work for too long. Then follows a long talk about why I feel very productive when good code browsers, as in Eclipse, are at my disposal. So I'm trying to explain it in this little article once and for all for the benefit of the world (you may need to append "not" again at this point). Don't worry, it's not going to be one-sided, as I will also talk about the things in Eclipse that are not that amusing. It's more about code browsers and their differences than specifically about Eclipse.

Code Browsing

The real reason I like Eclipse is its powerful code navigation and code browsing capability, comparable only to the code browsing of its big idol, the excellent Smalltalk environment. I'm willing to sacrifice a lot of other things as long as I have that. Let me quote Bertrand Meyer: “Smalltalk is not only a language but a programming environment, covering many of the aspects traditionally addressed by the hardware and the operating system. This environment is Smalltalk’s most famous contribution”. [1] This statement implies that Smalltalk is not only a language with an IDE on top, but a computing environment as such. This coherence has been lost with Eclipse, which has good and bad consequences. But this is a different topic, too long to discuss here as well. The people who have worked with Smalltalk understand what this means. But the people who have not only gaze fixedly at you for a moment and then continue working. So my contribution in this article is aimed at explaining what this is about. In the past, people had often heard of Smalltalk and were willing to listen for a while. Nowadays, you have to say something like "Smalltalk is the system that had closures from the beginning, already in 1980, and from which this clone starting with a 'J' was later made". Otherwise people would not even stop coding for a second. Or you have to say something like "Smalltalk is the system Steve Jobs was looking at when visiting Xerox Parc (see also this article about Xerox Parc) when he said that this is the way he wants the user interface to be on the computers he is producing" (a user interface with icons, movable windows that can be collapsed, and a mouse).

What's the catch about excellent code browsing capability then? The problem is that when your code starts to grow, at some point it becomes hard to keep an overview. Well, that is what structured programming is for: you can structure your code, and then there is abstraction, information hiding, modularity, inheritance, polymorphism and more. But at some point you can't remember any more which method was placed in which class, and it is sometimes still hard to keep an overview even with abstraction and all that. I have seen people who are nevertheless able to understand their code very well using only simple development tools. Therefore, I agree that you don't necessarily need an IDE with good code browsers. For some people it's a necessity. For others it's a matter of comfort and maybe also developer productivity.

Eclipse’s heritage from the Smalltalk IDE

Eclipse was developed by the people of a company named OTI in Ottawa, Canada, which used to develop and market the other big Smalltalk development system on the market at that time (besides ParcPlace Smalltalk, now Cincom Smalltalk), which was first Envy/Developer and then OTI Smalltalk (I’m not sure about the name here). When the development of Eclipse started, OTI had already been acquired by IBM, as IBM wanted to sell OTI’s Smalltalk system as IBM Smalltalk as a replacement for its failed CASE tool strategy. The product named IBM VisualAge for Smalltalk was also very successful (especially in the finance sector) at a time when there was only C++ and Smalltalk for serious production-quality OO development. Later Java came along, and IBM abandoned its Smalltalk system, sold it to Instantiations, and jumped onto the Java train by developing IBM VisualAge for Java. VisualAge for Java was very much like the Smalltalk IDE, only with Java as the programming language: it was an interactive development environment where almost any statement could be selected and executed at development time. You could look at your data in inspectors, in which you could also send messages to objects dynamically at development time. From what I have heard, VisualAge for Java itself was developed in IBM Smalltalk, but I cannot provide evidence for this. This was IMHO a very productive development environment, and everything was fine as long as your application only consisted of the code you were writing. But then web development came along and this was no longer true, as now, besides source code files, a plethora of other file types came into play: html, jsp, xml, css, jar, war, ear, and many more, and they all have to be bundled together. Both the additional file types and the bundling were a problem for the way Smalltalk and VisualAge for Java create a runtime package. So VisualAge for Java was abandoned and Eclipse was developed.
If you managed to get to this line, the bits of Eclipse history I had to provide are now behind you ;-).

Code Browsing in Eclipse

So far I have not mentioned why code browsing in Eclipse is so fantastic (let’s say it is at least better than in many other IDEs). There are different browsers for different things. If you are working on code files only, you can use the "Java Browsing" perspective. You see the packages of your project and their classes at a glance, and everything else is removed. You can still have the "Java" perspective, where your Java code and all the other types of files are visible at once. You can have all the browsers you work with side by side, each in a window of its own. Select Window > Preferences > General > Perspectives > Open a new perspective and select "In a new window". From now on every perspective you open will open in a new window of its own. Most people I have seen working with Eclipse don't know this feature at all. But this is the usual way the Smalltalk IDE was intended to be used. Eclipse also has an equivalent of the Smalltalk class hierarchy browser. It is not activated by default either. To activate it, go to Window > Preferences > Java > When opening a type hierarchy and select "Open a new Type Hierarchy Perspective". I always found this browser very useful when working on an abstract class and some of its concrete subclasses, because you can really concentrate on just what matters in that regard.

I once had a situation where Eclipse ran out of memory, which was probably caused by the memory demands of the OSGi implementation when building from within Eclipse. But because I was using several browsers at the same time in Eclipse, as I used to do earlier when developing with Smalltalk, a colleague was absolutely sure that having that many browsers open consumed too much memory. When I switch between perspectives that are displayed in the same window, memory remains allocated for all of them in just the same way as when they are opened in their own windows. Otherwise there is no way you could switch between perspectives that quickly. But that argument just didn't fly. Some people are so used to working with a single-window IDE that anything different simply appears weird to them.

And why I sometimes don’t like Eclipse

Eclipse provides a solid base on which to build an IDE for all kinds of things. Its Java plugin is also very useful. But it is not always as good at specific tasks such as code completion (IntelliJ IDEA is IMHO awesome here), refactoring (needless to say that “Refactoring was conceived in Smalltalk circles” [2]), or “intra-file navigation” (jumping from some JSF xhtml statement to the underlying Java code, etc.). It does not have an excellent Swing GUI builder like the one in NetBeans. When you develop a web application, the plugins that come into play are not as nicely integrated and coordinated as in IntelliJ IDEA. MyEclipse does not do much about this in the end either. The weakness of Eclipse, in short, is that it stops after providing a plugin platform and a Java plugin. From then on everyone is left to their own devices. A lot of nice people have developed very respectable plugins for all kinds of things, but these lack “calibration” with related plugins, which renders them isolated solutions.

Then there is the fact that Eclipse has become sluggish and sometimes unresponsive. I’m not amused by how often Eclipse is unresponsive and I have to wait until it responds again. I don’t know exactly what the reason is in every case. Maybe someone's plugin is just not well written and is causing this. Whatever the cause, as already said, other IDEs don’t have this problem, as all the plugins that come into play are coordinated with each other and tastefully furnished.

Last but not least, at the time of writing (21st October 2012) Eclipse still has no support for JDK8 lambdas and default methods. This is because Eclipse’s Java compiler is built into Eclipse and cannot be easily separated (you can define a custom builder for your project that calls the javac of the JDK you defined, but the JDT will still not be able to deal with JDK8-style lambdas). So the whole thing has to be exchanged. This is probably some heritage from Smalltalk as well, where the whole thing was a single system. Back then, this was unmatched coherence (compared to piping together a myriad of little Unix tools). Nowadays it's considered inflexible and monolithic. I use IntelliJ IDEA 12 EAP for my current little spare-time JDK8 lambda project. So far there has never been a problem getting any lambda expression to compile and run. Simply amazing.

Last and least, I really wish NetBeans and IntelliJ IDEA additionally had a class browser like the one in Smalltalk, or something like the “Java Browsing” perspective in Eclipse. When you are working on code only and no html, xml, css, or other files are part of your application, there is IMHO nothing like it. But in today's world there is no way to develop an application without any xml (or json nowadays), for example. Still, I'm convinced there is a way to merge the best of Eclipse/NetBeans/IntelliJ IDEA with the best of the Smalltalk IDE.

1. Bertrand Meyer, Object-oriented Software Construction, Prentice Hall, 1988, p.439.

2. Martin Fowler, Kent Beck (Contributor), John Brant (Contributor), William Opdyke, Don Roberts, Refactoring: Improving the Design of Existing Code, Addison-Wesley, 1999, p.6.