Sunday, November 27, 2011

Nonacceptance of functional programming makes the software industry suck

Abstract


Functional programming (FP), a very useful programming paradigm devised in the 60s, has been underestimated for a long time, and that has turned out to be the biggest mistake in the software industry as software has become more and more complex. FP simply allows one to program what to do, not how to do it, leading to software that is easier to read, write, extend, modify and debug.

How? You try to avoid, as much as possible, keeping state information or having side effects. The responsibility for having demoted FP lies with the object-oriented paradigm (OOP), which was instead promptly and widely adopted by industry. OOP deceptively appears to be a simpler way to structure software but, being too simple to be generic enough, often misused, and retaining the old idea of keeping and manipulating state information explicitly, it has largely failed to meet its objectives in the real world.

Only FP makes it possible to really program the way we think, not the way computers work, and it multiplies the advantage of using a compiler or interpreter instead of pure machine code.

Content


In the object-oriented paradigm (OOP) you just wrap code and data into classes, thus sweeping under the carpet all the problems generated by keeping too much state. But you haven't achieved much this way: nasty bugs continue to pop up inside classes and haunt you. And, what's worse, by using OOP you are mostly forcing the decomposition of your problem into a hierarchical structure, which is not the best one for all situations. OOP is a useful paradigm for certain particular kinds of problems, like simulation - indeed it is here that the full idea of objects originated, in a language called Simula - but it is pretty unsuitable as a general programming technique.

By the way, most companies use OOP just as a framework that lets mediocre programmers with little or no design skills fill in the gaps. Looking at most OO software in the real world, you see that the hierarchies they created often don't decompose problems in a natural way: e.g. top-level operations are not primitives. OOP is very easy to misuse and misunderstand, and the way the industry took it in has been more a source of bad design than of better solutions compared to the old procedural programming, which kept code and data completely separate.

Every smart programmer has been disappointed by OOP and has discovered that using it as the only paradigm or way of structuring code is a mere imposition with no added benefits. On the contrary, OOP introduces a lot of additional complexity in handling the instantiation, communication and, where applicable, synchronization of all the objects your software is artificially fragmented into.

Not everything is an object in the real world: if you force your view of the world to be made up of objects only, you will soon find out that it doesn't help to manage complexity at all, since these objects often communicate among themselves in elaborate ways and have to change their own state as a result of this interaction. These are all complications that arouse the suspicion that someone is giving you the illness in order to sell you the cure afterwards. I can't believe many universities teach OOP to their students while ignoring FP completely, when the opposite should be done. Fortunately the best ones (e.g. MIT) don't make such a silly mistake.

Therefore we get back to the crux of the matter: the evil lies exactly in having to maintain too much state! That is a bad programming technique, and OOP does not help to remove or at least relieve this problem. On the contrary, OOP has made it worse, encouraging people to basically continue to code the same way computers work at the lowest level, adding only some artificial and unnecessary structure to programs. There is no good reason for, and no advantage in, keeping state information spread all over your program. It is OK to introduce a bit of state and side effects when they are absolutely necessary for a program to actually work (e.g. input/output itself is a side effect that cannot be avoided).

OOP, being just another variant of imperative programming, introduces a lot of state to hold the intermediate results of a computation, and this is a very bad thing that programmers have not been accustomed to thinking about! We have had garbage collection technology for a long time and do not need to manage memory in such a low-level fashion. That's the truth no bombastic OO programming book will ever tell you, and you'll have to learn it the hard way. As I did, when I found myself reinventing a bit of FP in non-functional programming languages and trying to avoid using classes and objects for everything.
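
To make the point about intermediate state concrete, here is a minimal sketch in Python (chosen only for illustration; the function names and the task are made up). Both functions compute the same thing, but the first keeps the intermediate result in a variable that is updated step by step, while the second just states what the result is.

```python
def total_even_squares_imperative(numbers):
    # Imperative style: a mutable accumulator holds the intermediate result
    # and is updated on every pass through the loop.
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total


def total_even_squares_functional(numbers):
    # Functional style: the result is described as a single expression;
    # there is no variable for the programmer to update by hand.
    return sum(n * n for n in numbers if n % 2 == 0)
```

On such a small example the difference is cosmetic; the argument is that it stops being cosmetic once many such variables interact across a large program.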

FP tries to use pure mathematical functions as much as possible. Your program is naturally decomposed into reusable modules that contain almost all code and very little mutable data, especially in the form of state variables. A bit of state information (that is mutable data) shows up, but only in the outer high-level layers of the software, where it is easier to control, or it is hidden in the insides of some functional library and only if absolutely needed. In the latter case state information is usually made private and gets changed in predictable ways only.
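
One possible reading of this layering, sketched in Python under assumptions of my own (the account-balance task and the names `apply_event`, `replay` and `main` are hypothetical, not taken from any real system): the inner functions are pure, and the only variable that ever changes lives in the outermost layer.

```python
from functools import reduce


def apply_event(balance, event):
    # Pure function: returns a new balance and modifies nothing.
    kind, amount = event
    if kind == "deposit":
        return balance + amount
    if kind == "withdraw":
        return balance - amount
    return balance  # unknown events leave the balance unchanged


def replay(initial_balance, events):
    # Also pure: folds a sequence of events into a final balance.
    return reduce(apply_event, events, initial_balance)


def main():
    # Outer layer: the only place where a variable is rebound over time.
    balance = 100
    batches = ([("deposit", 50)], [("withdraw", 30), ("deposit", 5)])
    for batch in batches:
        balance = replay(balance, batch)
    return balance
```

Everything below `main` can be read and tested without knowing what the current balance happens to be.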

E.g. think of a pseudo-random number generator that keeps track of the last value generated so as to give you a different value the next time you invoke its primitive. If you want more than one generator (each being an object, you may say in OOP lingo) you can easily use a higher-order function that returns many independent random number generators, each with its own state hidden under the bonnet. In FP functions can return other functions (this is the equivalent of a constructor), and the bonnet under which some state can be kept is called a closure: a function bundled with the environment of bound variables it captured, which the system manages automatically. Functions can thus be objects, destruction is automatic (there are no destructors to invoke explicitly) and you can implement an object system with inheritance and polymorphism and even more if you want - read: if it is appropriate, because it isn't always so, since the whole point is to avoid keeping too much state and OOP doesn't do that.
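
The generator idea can be sketched in a few lines of Python (a toy for illustration only: `make_generator` is a hypothetical name, and the linear congruential formula with these constants is just one well-known choice, not a generator to rely on for serious work).

```python
def make_generator(seed):
    # Higher-order function playing the role of a constructor:
    # it returns a new function with its own private state.
    state = seed

    def next_value():
        # 'state' lives in the closure; nothing outside can touch it.
        nonlocal state
        # Linear congruential step (assumed constants, 32-bit modulus).
        state = (1664525 * state + 1013904223) % 2**32
        return state

    return next_value


# Two independent generators, each with its own hidden state.
gen_a = make_generator(1)
gen_b = make_generator(1)
first_a = gen_a()
second_a = gen_a()
first_b = gen_b()  # unaffected by the two calls made on gen_a
```

There is no class and no destructor here: when `gen_a` is no longer referenced, its captured state is reclaimed automatically.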

Otherwise, most of your code is made up of functions whose result depends only on their arguments, just like in mathematics. These are independent pieces of code that can easily and separately be developed, reused, understood, modified, tested, debugged and even parallelized for faster execution, without changing a line of code or having to write any involved additional test code! Big programs structured this way finally make sense to the initiate: with very few exceptions (objects implemented using FP, or global variables, both to be used sparingly), you don't have to keep track of the current state of many objects when reading a program, which is exactly what makes a program difficult to understand.
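
A small Python sketch of what this buys you (the function `word_count` and the sample data are made up for illustration). Because the function depends only on its argument, a test is a one-line comparison, and the same unchanged function can be handed to a worker pool. A thread pool is used here only to keep the example short; in CPython it will not speed up CPU-bound work, so take it as a demonstration that the code does not change, not as a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    # Pure function: the result depends only on the argument.
    return len(text.split())


documents = ["to be or not to be", "hello world", ""]

# Sequential evaluation.
sequential = list(map(word_count, documents))

# Concurrent evaluation of the very same function, unchanged.
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = list(pool.map(word_count, documents))
```

Both lists hold the same counts in the same order, since `pool.map` returns results in input order and `word_count` touches no shared state.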

A program not written by you (or written by you a few months ago) can be grokked by looking at the "main" code and abstracting over all the other details. You read the program source code and it tells you exactly what the program does, instead of puzzling you with stateful objects whose interactions will not be clear until you have a grasp of the whole code AND you have tried some of the involved execution paths that lead to the creation of a particular state of interest among many possible ones. This is practically impossible to do for any large-enough software project.

You should not need to read all the code, or jump around classes like hell and look at state variables during long and complex execution paths, in order to understand how a piece of software works! If you have to do that, it means that the code is not well structured and you aren't working at the right abstraction level. State information and side effects are the source of all evils and should be avoided, controlled and confined as much as possible, while imperative programming (including OOP) makes unnecessary use of them!

FP is such a simple miracle that one must be blind not to see it! If you use pure or mostly pure functional programming you can make sense of a piece of code extracted from a large piece of software without knowing anything about the rest! You can even go on and fix a bug in that code on your first working day. Managers who want new programmers to be productive right away should adopt FP for this reason, even if they don't understand what the many other benefits are. If they want to be able to replace people easily, they should use FP, not OOP only.

FP is not a theoretical device; rather, OOP is! FP works in practice because of a combination of many features that good functional languages offer. Some of them may seem strange at first because we are not used to being able to do these things in imperative programming. Imperative language implementors didn't want to abstract too much from physical machines. It's not that they chose to make their lives simpler when implementing interpreters or compilers at the expense of the programmer, because a LISP interpreter or compiler is actually much easier to implement than, say, a C++ one. They limited the expressiveness and/or extensibility of languages just out of ignorance of more advanced programming techniques.

Well, to be honest, not only because of that. In the early days of computing, having languages that express algorithms in the same imperative way as machine languages was justifiable by the necessity of saving some CPU cycles and/or a bit of memory. Now both of these are very cheap compared to a human programmer's time, but they weren't when computers were the size of a big refrigerator or occupied a whole room. These limitations have been unnecessary for a long time, and they now seem as ridiculous as wanting to program at the assembly level.

Nonetheless FP has been literally ignored by the dumb business world for about 50 years, until a recent, very mild rediscovery, mostly in the area of concurrent programming. How have programming languages progressed in all this time? Well, we are now very slowly adding some basic functional programming features to less powerful, bloated imperative and object-oriented languages, when we would be better off using an old, full-featured, multi-paradigm functional language like Lisp.

That is why I am giving up my job as a programmer after many years of practice and experience. I cannot easily find high-quality code bases to work on, and even in new developments I am told what programming language to use according to the most recent passing fad. And they are all bad languages, a giant step backward from the Lisp of 1958. Moreover I have to do overtime and work under pressure for a very low wage. Programming is no fun any more this way: we programmers are just slaves forced to work without good tools. But these good tools exist. It's just that they don't want us to use them, for no logical reason. So it's not my failure, it's the industry that really sucks, and it sucks badly. I have made up my mind to abandon programming for businesses, although I like it, because of the difficulty of finding a good job.
As of March 2012, I am permanently retired from my profession. Anyway, I am still interested in any kind of casual job NOT INVOLVING computers. But I am no longer interested in any kind of IT job as an employee, not even for IBM or Google (not to mention Microsoft). Maybe in 50 years, if the software industry gets better, provided I am still alive. I am still interested in hardware, though, especially sledgehammers. To all IT managers on this planet: to drive a nail one should use a hammer, not a chainsaw. To cut trees one should use a chainsaw, not a nail or a hammer.
