Why Erlang?

The chance that you are reading this blog post on a device with a multicore CPU is increasing daily, which is why everybody is talking about concurrency now. For our web applications and API backends, concurrency means that we’d like our htop to look like this:

htop screenshot

I recently went to a really awesome Ruby conference, and three or four of the 21 talks were about concurrency. The Ruby community is quite open, so many possibilities were discussed: using threads, using different Ruby runtimes to circumvent the GIL, using more processes, using the actor model via libraries like Celluloid, or even using Akka through JRuby.

While the actor model seems to be a good fit for building concurrent network applications, it often suffers if the runtime it is implemented in has no “native” support for it. There are implementations for Ruby, Python and Java, but they all have to jump through several hoops to get the job done and do not necessarily yield the best performance. This is one of many reasons why Erlang would be a much better choice, but first let’s talk about the actor model for a bit to understand why it is such a good fit.

The Actor Model

There is this nice quote from Wikipedia which offers a first glimpse:

»The Actor model adopts the philosophy that everything is an actor. This is similar to the everything is an object philosophy used by some object-oriented programming languages, but differs in that object-oriented software is typically executed sequentially, while the Actor model is inherently concurrent.«

While there are some resemblances between actors and objects, like modularity, encapsulation and message passing, the main feature of actors is that they run concurrently.

Strictly using message passing for sharing state with other actors which run in parallel enables asynchronous communication, meaning that the sender does not have to wait for a response from the receiver.

Another big difference to the OOP world is that in the actor model there is no global state and therefore also no shared memory between actors. In languages like Java, Ruby and Python there is always global state and threads have access to shared memory. This is often a cause for trouble in the form of deadlocks or race conditions and is maybe the biggest pain of using threads.

In the actor model each actor has its own internal state and shares it only via messages. It thereby acts as a serializer for access to its state, which avoids the lock-induced deadlocks and race conditions of shared-memory threading.
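To make this a bit more concrete, here is a minimal sketch of the idea in Python. It is only an illustration: the names (`CounterActor`, `send`, `demo`) are made up for this post, and a thread plus a queue is merely a stand-in for what Erlang gives you natively. The actor keeps a private counter, and the only way to touch it is to drop a message into its mailbox.

```python
import queue
import threading


class CounterActor:
    """A toy actor: private state, a mailbox, and one thread draining it."""

    def __init__(self):
        self._count = 0                # private state, never touched from outside
        self._mailbox = queue.Queue()  # the only way in
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message, reply_to=None):
        # Asynchronous: enqueue the message and return immediately.
        self._mailbox.put((message, reply_to))

    def join(self):
        self._thread.join()

    def _run(self):
        # Messages are handled strictly one at a time, so no lock is needed
        # around self._count even though many threads may call send().
        while True:
            message, reply_to = self._mailbox.get()
            if message == "increment":
                self._count += 1
            elif message == "get":
                reply_to.put(self._count)
            elif message == "stop":
                break


def demo(senders=4, per_sender=1000):
    """Let several threads hammer one actor, then ask it for the total."""
    actor = CounterActor()

    def hammer():
        for _ in range(per_sender):
            actor.send("increment")

    threads = [threading.Thread(target=hammer) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    reply = queue.Queue()
    actor.send("get", reply_to=reply)  # queued behind every increment
    total = reply.get()
    actor.send("stop")
    actor.join()
    return total
```

Calling `demo()` returns 4000: four threads send 1000 increments each, and because the actor processes its mailbox sequentially, no update is lost and no mutex appears anywhere in the code.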

It is also worth noting that the actor model makes particular sense for functional languages, as they embrace the concept of immutable data.

There is a lot more to read about actors, but I would say these are the most important bits to know. In general the actor model makes designing and implementing concurrent applications a lot easier. Compared to threads, there is no need to manage access to information with mutexes, locks, semaphores or other complex abstractions.

Ok, so what about Erlang?

First let me tell you that for years I have been a passionate Ruby developer. I really like the language and community a lot. From time to time though I felt I was hitting some invisible walls when it came to network applications like web apps, web servers, proxies etc. Basically everything that had to handle a lot of requests and/or did non-trivial tasks.

I had Erlang on my radar for quite some time, but coming from my ivory tower with a ruby rooftop it took several attempts to convince me that it was worth a try. Conceptually it already made a lot of sense to me, and I’m sure that most people who read about Erlang will agree. I have to admit that I was so appalled by the weird syntax that it stopped me from trying. This was a big mistake though, and a large part of my motivation for writing this blog post is to tell you that you should try out Erlang as soon as possible.

Anyway, first let’s describe Erlang in one line:

»Erlang is a functional language, implementing the actor model for concurrency.«

It’s a language that was developed by Ericsson for their carrier-grade telecom switches, and the design goal was a language for building fault-tolerant, highly available and concurrent systems.

You can read all about it on Wikipedia or on this awesome website: http://learnyousomeerlang.com/ – they do a much better job of describing the language.

Case study for Erlang at Wooga

This post is about getting you to try it and I will do that by telling a story about Erlang at Wooga.

Wooga makes social games with millions of daily active users. The games constantly talk to the servers to transform and persist the user’s game state. Some of our game backends are developed in Ruby, and that has worked really well so far. Ruby, like I said, is a really nice programming language, and although it is certainly not the fastest, you can squeeze a lot of performance out of it when you know what you are doing.

Our biggest game in terms of users, revenue and backend complexity runs on about 80 to 200 application servers though. It handles about 5000-7000 requests per second, and almost all of them change the user’s game state. I’d say the number of application servers is still reasonable for that load, but it’s certainly not the most impressive number.

Then some day a new backend had to be built for a game with similar complexity, and my colleague Paolo suggested using Erlang this time, as he thought it would be a really great fit for us. We hired an experienced Erlang developer (Knut) and together they implemented the backend. By now this game has approximately 50% of the users of the other game and the number of application servers they need is: 1!

They run the backend on two or three servers for redundancy, but it could perfectly well run on one. Even if it actually needed four, it would still be drastically more efficient and performant than the other backend(s).

Now of course they also knew about all the mistakes we had made in previous games, and it’s not Erlang alone that gave them so much better performance. Rather, they could implement the backend in a unique way that is really easy with the actor model and rather hard everywhere else.

Basically they’ve built a stateful web server, which means that each user who is playing the game is represented by an actor inside the Erlang VM. When the user starts playing, an actor holding the user’s game state is spawned. All subsequent requests for as long as the user is playing go directly to this actor. Since the game state is held in the actor’s own memory, all requests, which would otherwise hit the database, can be processed and answered extremely quickly.

If the actor crashes, the other actors are not harmed, since there is no shared or global state. When the user stops playing, the actor saves the game state to a persistent data store and terminates, which makes garbage collection easy. Since the data is immutable, it is always possible to revert to the game state from before the transformation started in case something goes wrong.

It is really awesome and there is a lot more to tell about it. Fortunately Knut and Paolo have spoken at a couple of conferences about it and shared their slides, so you can get some more insights:
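The shape of that design can be sketched in a few lines of Python. To be clear, this is a toy analogy and not Wooga's actual implementation: `DATABASE`, `apply_action`, `UserActor`, `Sessions` and the `"collect_coin"` / `"logout"` actions are all invented for illustration, and an OS thread per user is far heavier than an Erlang process. It assumes a single-threaded router in front of the actors.

```python
import queue
import threading

DATABASE = {}  # stand-in for the persistent data store (hypothetical)


def apply_action(state, action):
    """Pure transformation: returns a new state, never mutates the old one."""
    if action == "collect_coin":
        return {**state, "coins": state["coins"] + 1}
    raise ValueError(f"unknown action: {action}")


class UserActor:
    """One actor per playing user; the game state lives in its own memory."""

    def __init__(self, user_id):
        self.user_id = user_id
        self._mailbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def request(self, action):
        # Send a request and wait for the answer, like an HTTP handler would.
        reply = queue.Queue()
        self._mailbox.put((action, reply))
        return reply.get()

    def _run(self):
        # Hit the data store once; afterwards every request is served from memory.
        state = DATABASE.get(self.user_id, {"coins": 0})
        while True:
            action, reply = self._mailbox.get()
            if action == "logout":
                DATABASE[self.user_id] = state  # persist once, at the end
                reply.put(state)
                return                          # the actor terminates
            try:
                state = apply_action(state, action)
            except ValueError:
                pass  # failed transformation: the previous state is untouched
            reply.put(state)


class Sessions:
    """Routes each request to the user's actor, spawning it on first contact."""

    def __init__(self):
        self._actors = {}  # not thread-safe; assumes a single-threaded router

    def handle(self, user_id, action):
        actor = self._actors.get(user_id)
        if actor is None:
            actor = self._actors[user_id] = UserActor(user_id)
        result = actor.request(action)
        if action == "logout":
            del self._actors[user_id]
        return result
```

With `s = Sessions()`, a call like `s.handle("alice", "collect_coin")` spawns Alice's actor and answers from memory, an unknown action leaves her state as it was, and `s.handle("alice", "logout")` writes the state to `DATABASE` exactly once before the actor goes away.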

* http://www.slideshare.net/wooga/erlang-factory-sanfran
* http://www.slideshare.net/hungryblank/getting-real-with-erlang

More Erlang at Wooga

After Paolo’s and Knut’s success the Erlang virus spread inside the company. We have started new game backends in Erlang and built smaller additional services with it. Personally I can confirm that the more you learn about Erlang, the more it makes sense and feels right. It even made me feel a little bit sorry for those at the Ruby conference who were struggling with different runtimes and libraries to introduce the level of concurrency and ease of development that Erlang delivers in one package. A package that has been in production use for more than 20 years.

The hard part of learning a new language is finding a reasonably sized project to start with. Learning just by reading books is always slow, as you forget most of what you read when you don’t play around with it. Apart from the weird syntax, which I don’t find that weird anymore, not having an actual project to try Erlang on was the biggest show stopper for me. So I encourage you to pick a small project and play around with Erlang. I think you will not regret it.

I hope I will soon find the time for a follow-up blog post about how I learned Erlang and about getting started with it. In the meantime go to learnyousomeerlang.com and get started on your own. Trust me – this site is better than any book about Erlang which you can buy right now.

PS: Thanks to Elise Huard for proofreading! If you have feedback, drawings of an ivory tower with a ruby rooftop to make this blog post more colorful, or any other contributions, send them right away!

45 Responses to Why Erlang?

  1. As eye-opening and wonderful as it is going to Erlang, I can tell you from very recent (on-going) experience how painful it is to go the other way.

    I have been lucky enough to spend the last few years in Erlang nearly exclusively, but a new opportunity has arisen for me that is firmly bolted (years of work) to Python.

    So, I began researching how to do concurrency in python (as you mentioned… the GIL) and my findings have been horrifying. The problem space requires long-lived connections per user … I wandered into a nightmare of async, callbacks, deferreds, monkey-patching, and all that just to take advantage of a single core… so on top of that you strap a pre-forker and load balancer… and now you are starting to get to something useful. Then you strap on top of that packaging and deployment tools, maybe strap on some custom code for in-place upgrades… blarg.

    “Any sufficiently complicated concurrent program in another language contains an ad hoc informally-specified bug-ridden slow implementation of half of Erlang.” — Robert Virding :: First Rule of Programming

  2. I was at the same conf, and i’m in process of writing similar post on my blog.

    In general there was a lot about concurrency but only this guy said something more than “wikipedia” level :)

  3. Erlang is a fine language … it does some nice concurrency things too and so does Go. Yet it suffers from a number of shortcomings.

    The first problem for Erlang is the lack of a credible workforce. There are simply not enough qualified programmers to do the work. Many of them happened to get there through a CS degree or accident. (I’m the latter)

    The second problem is that the toolset is still very limited and many people I’ve interacted with demonstrated that most production tools are still proprietary. Every time something goes wrong with Erlang someone says, “it was not intended for that”.

    I hold an opposing position; there are many more ways to scale than just going to Erlang and there are many more problems that Erlang introduces. It’s just something different that has not been fully explored. The argument is very similar to the NoSQL vs SQL or CAP vs ACID… where searching a BTree takes the same number of compares. The difference between CAP and ACID is the ACID.

    • You compare Erlang, a ~20-year-old language, with Go, which is a few years old at most including lab tests, and you say “lack of credible workforce”.

      I think if you look at functional languages in general, there are not many programmers out there at all.

      I see that in Europe there is a problem finding any programmer, not only specialised ones :)

    • Lack of credible workforce. I disagree with this as someone who has hired Erlang developers — the community is exploding, and has an amazing signal to noise ratio.

      Lack of toolset. This one made me gasp. OTP & a bunch of the default Erlang tooling is amazing, astounding, better than anything I have found in any other language (I have used dozens professionally, and have poked at over 70). Release building, process monitoring, clustering, messaging, true isolation, true preemptive multitasking, live debugging, live upgrades… these all come out of the box.

      Maybe by “toolset” you meant 3rd party libraries. Another area Erlang is exploding in, due mainly to rebar making it drop-dead simple to have dozens of dependencies in your application, chaining dependencies, and all that happening automatically and pointing at specific versions. So, wow, I just can’t agree with you on your second point.

      You speak as someone who seems mighty ignorant about what Erlang actually is…

  4. Sorry, but why no Scala? I had no practice with it, but it seems that with akka you can achieve the same or even better performance.

    • Akka is interesting, it is inspired by / uses the philosophy of Erlang (http://doc.akka.io/docs/akka/2.0.1/intro/what-is-akka.html).

      “we adopt the “Let it crash” model which have been used with great success in the telecom industry to build applications that self-heals, systems that never stop” — AKA: “we are trying to copy the awesome that is Erlang, without directly referencing Erlang”.

      The reason I don’t like Akka is that it uses Java GC, which I am not a fan of… at all. I much prefer Erlang per-process GC for a much smoother / consistent response under heavy load.

      I prefer the OTP infrastructure to the Java ecosystem. I like being able to do in place updates, and in place debugging, and in place introspection.

      I prefer the hands off approach to scheduling Erlang provides, versus the dispatcher model of Akka.

      • >> The reason I don’t like Akka is that it uses Java GC, which I am not a fan of… at all. I much prefer Erlang per-process GC for a much smoother / consistent response under heavy load.
        >>
        >> I prefer the OTP infrastructure to the Java ecosystem. I like being able to do in place updates, and in place debugging, and in place introspection.
        >>
        >> I prefer the hands off approach to scheduling Erlang provides, versus the dispatcher model of Akka.

        But those are preferences, not an actual “this is better” argument. Say I don’t have a strong need for concurrency, then what benefits remain for using Erlang? There’s definite obstacles to using it (many are mentioned in the article), so aside from concurrency, are there enough “additional benefits” that outweigh these concerns?

  5. Thanks for the post. I want to re-write my web app (http://chizzl.com/) in chicago boss just so I can use Erlang and get over the hump to begin using Erlang exclusively for once. This leap is so frigging hard (as you mentioned) and it’s a shame. I know how great the returns could be, but life is so short and you end up using the tools that are no-brainers. But thanks for posting about the awesomeness of Erlang!

  6. Great article.

    I’m also a Rubyist struggling to *master* Erlang. When you come from Ruby, Erlang’s syntax is pretty weird at first. But how you handle processes, concurrency and communication blows me away. Too bad I haven’t discovered Erlang earlier :(

  7. Nice post!!! I used Scala and Akka to get erlang for a project where the customer was apprehensive about erlang. Erlang is awesome

  8. Can you tell me the difference between Erlang and Clojure? Is Erlang easier to do the concurrency programming than on Clojure?

  9. Well, Erlang truly deserves better assessment than it gets. In today’s world Erlang is the best platform for the application server or services. You can possibly achieve what you achieve with Erlang/OTP in terms of performance, scalability/throughput, concurrency, distribution, fault-tolerance, live updates, etc. in Java, C#, C++, with SQL, etc. but only through a considerable pain, effort and cost. This is the practical essence of Robert Virding’s law! Yes Erlang did not invent Actors or message passing. I learnt about Gul Agha/Hewitt’s Actors in the late 80′s in grad school. So also Smalltalk-80′s MVC and message passing. But neither did Java O-O or MVC, etc.

    Erlang/OTP does so much for you and productivity is incredibly high. Compare the equivalent lines of code in, say, Yaws Web Server or Ejabberd versus Apache, etc. To think that Mnesia, a fully distributed database which supports dual main-memory/disk storage engines, with sharding, etc., is only 20 – 30,000 lines of code is mind-boggling.

    For productivity alone it is best for small and one-man development shops. Productivity and time-to-market are, for me, Erlang/OTP’s overriding advantage. Nothing beats Erlang, server-side.

  10. Erlang is a terrific language. I’d second the notion that everyone should at least be familiar with it.

    However, for those of you who are Ruby programmers, I’d also strongly suggest that you take a hard look at Celluloid and DCell. I’ve been working with them for awhile now, and both are really quite amazing. To give you an idea of how expressive they are, I am working on a paxos implementation for dcell that includes support for dynamic reconfiguration and leader election. The whole thing is probably under 1000 lines of code.

    Here is a link to Tony’s blog post announcing DCell.

    * For those interested in the actor model in general, I’d also check out Termite, which runs on top of Gambit scheme. Gambit serializes continuations, which allows you to do some pretty neat things, such as in-place upgrade of individual actors. I don’t think it’s really under active development any more, but it’ll expand your ideas of what can be done with this model.

  11. I think this article is a bit off. The actors model is a pattern that can be implemented in pretty much every language with basic threading or forking support. Indeed there are multiple implementations of actors in many languages.
    It is pointless to learn a new language because of a pattern like this. Grab an actor implementation for your favourite language and be happy, or if you prefer, implement your own.

    However, erlang actors are built on top of erlang processes which are very lightweight. In practice, this means that erlang actors are the cheapest memory wise. Other implementations of actors need to rely on threading which is typically more expensive.

    • What you won’t get with an actor library is Erlang’s runtime.
      Processes are garbage collected individually, and the VM is multicore-ready

  12. Did you consider Node.js at any point? Or how would it compare to the Erlang version? Although, as far as I understood, the decision was made some time ago, so Node might not have been in the state it is now.

      • Well, thanks for the insults. I guess I know now how the world looks like on your high horse…

        Isn’t that exactly what you are doing? Waiting for network IO, caching that to memory and sending them to “database” of sorts when done. When you consider that Node has been doing multi-core/process things for some time now, it’s exactly what you were describing of. Node applications even scale the same way.

        Personally, I’d say that healthy developer ecosystem (potential workforce & ready made modules) is far more important than marginally faster solution.

        • It’s not about a faster solution, it’s about ease of development. For example, in Node how would you handle partitioning the different users to different node processes? A single node process would not handle thousands of users if each user requires any sort of CPU. And Erlang has many ready made modules.

          • Well, you could use child-processes or even cluster. I haven’t tried those for specifically this kind of a problem (spawning n number of them as you go), but that’s why I’m asking. I know it can be done, I just don’t know how much resources you waste (if any) on the main application, having thousands of child-processes running at the same time.

            But hey, at least someone is willing to answer without insults. So thank you for that! :)

        • Erlang, like Node, does asynchronous IO, but unlike Node it has a preemptive scheduler, so CPU-bound computations are not an issue.

          Also Erlang uses multiple threads of execution to good effect to monitor and recover from failures
          (see Error Handling in http://ferd.ca/an-open-letter-to-the-erlang-beginner-or-onlooker.html)

          Finally, Erlang allows transparent distribution, sending messages to remote or local processes is very similar

          So Node and Erlang are quite different beasts.

  13. Pingback: MM086 Deutschland wurde heruntergeladen | mobileMacs

  14. Pingback: Seeking Scalablity Part 1: Resources - webJABr

  15. Pingback: MM088 Pregnant Hill | mobileMacs

  16. You should be aware that the “serbo-croatian translation” is a thinly-veiled SEO scam to get you to put a link to their site (WebHostingGeeks), which will improve their Google PageRank.

    Don’t you think it’s a little random that they would choose that specific page to translate, and no other pages?

    • Well I had email contact with the person translating it. Her motivation was very believable and the request for translating it was personal and polite. Did you read/understand the translation? Do you really think its scam? I happily improve their page rank for making the article available in another language.

      • Having an email conversation doesn’t mean it’s not a fraud. The translation itself is only marginally better than Google Translate, and would be of no use to any Serbo-Croatian person. Also, anyone from that region who is remotely interested in computers would speak English and would never need a translation.

        Here are a few articles detailing the scam: 1, 2, 3.

        You’re certainly welcome to keep the translated link if you like, but it won’t help anyone except the scammers.

  17. Hi,
    Great article! I am currently learning Erlang and so far have been a devout Java programmer. I still find it hard to convince myself that Erlang is better than Java when you are building concurrent web-based applications like the one you developed. I had a question about the example you gave: why do you think that you could not have achieved a similar performance improvement had you used Java with the same stateful web-server design, with the actor being represented by a Java thread? Sorry for the dumb question, but I still can’t understand why Java can’t just be modeled to work with the actor paradigm.
    thanks in advance,
    -v-

    • Your question is completely valid, and it is possible to write something similar in Java. The benefit of using Erlang is that the language was built right from the start for these kinds of things. It feels very natural, easy and robust to do it in Erlang. One aspect of it is the functional programming and immutable variables. The other one is the whole process model, monitoring, supervision trees etc. I’m quite confident you’d agree once you tried building the same thing in Java and Erlang.

  18. I just wanted to thank you for sending some signals :) For our part, the best way to get a nice handle on Erlang programming with proper ‘real world’ application was zotonic(.com), a cms in erlang … it’s a great example to get your head wrapped around a damn fine language and runtime system. Also, as a cms, it’s multi-site, out of the box and supports all the usual separation of concerns logic you’d expect. And of course, it’s erlang multi-node scalable. But it’s so fast, in comparison, say, with a php alternative, it’ll make your head spin :)

    Of course, we also use ejabberd :) which is also not a bad start, but the cms seemed a better fit for us since we deal with that problem space all the time and just ‘use’ jabber :)

    Ok, my two bits to maybe inspire someone to get programming :)

  19. “Another big difference to the OOP world is that in the actor model there is no global state and therefore also no shared memory between actors. In languages like Java, Ruby and Python there is always global state and threads have access to shared memory.”

    Global state and shared memory are not features of the OOP world. They are features of specific OO languages like Java, Ruby and Python, but that doesn’t mean they are a necessary part of OOP.

    “Strictly using message passing for sharing state with other actors which run in parallel enables asynchronous communication, meaning that the sender does not have to wait for a response from the receiver.”

    The original definition of OOP strictly used message passing for sharing state with other objects — Smalltalk. So not surprisingly, there have been experiments to implement massively parallel Smalltalk:

    https://github.com/smarr/RoarVM

    Functional programming may be a good fit for parallel programming, but OOP is not inherently a bad fit.

  20. Pingback: Why Polyglot? » Mediafly
