Tuesday, 20 September 2011
The VBScript from hell: typed vs. typeless languages
I grew up on structured programming. After finding my way into the wonderful world of programming by way of old-school, line-numbered, GOTO and GOSUB BASIC, peppered with assembly for obvious performance reasons, I started to learn Pascal using Turbo Pascal (thanks, Borland!) and from there jumped to other structured languages (C and its cousins) and to a few non-structured ones: hello to Prolog, Scheme, SQL and a few esoteric theorem-proving engines whose names I've forgotten.
Notice a pattern here? Yes, except at the very beginning, I have had a healthy dose of strongly typed languages. Even when working with SQL, I tend to use the available language features or extensions to be as strongly typed and structured as I can. This habit came from working on "large" Pascal and C projects, where I noticed that beyond 100K LOC, it was becoming more and more difficult to introduce code changes without breaking something else. I learned to love the C compiler warnings. I learned to obey the rules, because each time I tried to jump over them, with very few exceptions, I ended up burning myself.
I have to agree that for absolute beginners, strong typing and rigid program structures impose a lot of overhead that obscures the essentials of what is going on when you execute your program. In other words, teaching programming should not begin with a strongly typed language. Teaching programming should not require you to declare main() correctly, or to link against the right runtime, or to import or include anything. All those things are accidental complexities in the process of writing a computer program and, while necessary in some environments, they contribute nothing to learning about variables, control flow or state.
As with any kind of learning, learning to program has its stages. The transition from one stage to the next is sometimes difficult, but it is part of the learning process. Discovering and thinking in structured terms is a revolution for the line-numbered BASIC programmer, of the same calibre as the one that happens inside the mind of the programmer when he or she becomes capable of thinking in terms of objects, or starts thinking in declarative terms. Like the practitioners of other arts or crafts, the good programmer wants and needs to know all the rules and techniques, why they are there, and when and why it is actually good to disobey them. There's no single answer to all questions, and each problem has a tool that suits it best. To stop treating everything as a nail, one needs to know how to use something other than a hammer.
Typeless languages are not intrinsically bad. Quite the opposite, they are a killer solution in many situations. And there are even middle-of-the-road solutions, such as Python's duck typing (it does not force you to declare types, but it does force you to be disciplined), with which I feel comfortable enough. Scripting languages are great for short, quick programs.
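For readers who have not met duck typing, here is a minimal sketch of what I mean. The names Duck, Robot and announce are invented for this illustration. The function declares no types and accepts anything that happens to have a speak() method; the discipline lies in making sure callers only pass such objects, because a mistake surfaces only when the offending line actually runs.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def announce(thing):
    # No declared type: any object with a speak() method is accepted.
    return thing.speak().upper()


print(announce(Duck()))
print(announce(Robot()))

try:
    # An int has no speak() method; nothing complains until this call runs.
    announce(42)
except AttributeError as exc:
    print("fails only at run time:", exc)
```

Duck and Robot share no base class and no declared interface, yet both work. The integer is rejected too, but only at run time.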
But as the learning experts say, there comes a point in your journey where you think you know enough, but really you don't know what you don't know. Unfortunately, there are people out there who never get past this stage and spend their whole lives growing the monster, believing that they are actually doing the right thing. Some of them eventually realize that there is a better way; others will never progress beyond that.
True, there are people who say that as long as you have a set of test cases covering all the code, it is not that relevant whether your language is type safe or not, and in fact they argue that strongly typed languages are more verbose and limiting. To which I say, yes, anyone with enough experience and discipline to have everything covered by test cases is well past the learning point of typed vs. typeless. That's someone who knows when and why rules can be broken. The rest will hear only the last part of the sentence and continue feeding the monster, hoping to have some time in the future to build those all-covering test cases.
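A small sketch of why that coverage matters, using hypothetical names (label and branch_fails are made up for this post). The function hides a type error on a branch that is rarely taken. A compiler for a strongly typed language would reject it outright; here, only a test that actually exercises the branch will notice.

```python
def label(count, verbose=False):
    text = "items: " + str(count)
    if verbose:
        # Bug on a rarely used branch: an int is concatenated to a str.
        text = text + " (total " + count + ")"
    return text


def branch_fails(verbose):
    # Report whether calling label() on the given branch raises TypeError.
    try:
        label(3, verbose=verbose)
    except TypeError:
        return True
    return False


print(branch_fails(False))  # common path: the bug stays hidden
print(branch_fails(True))   # rare path: the bug finally shows up
```

With a test for only the common path, the suite is green and the monster keeps growing.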
All this reflection is because I am here, facing a 3000-line monster of a VBScript program. It has 20 lines at the beginning declaring global variables, with more than one variable in each declaration, of course. Not a single data type. It is not even indented, and comments are scarce. Yet it is a key piece in a high-value process, and no one is quite sure what it does. Sigh.
Oh, of course, you can also write your 3000 lines enclosed in a begin/end block in Pascal too. Nothing stops you doing that. But your fellow Pascal practitioners will point their fingers and shame you. Because they already know that's not good practice. Because they know it will bite back. Because they know that you will regret it in the future. Fellow VBScripters? Most of them will say "it works, so what". The ones that don't say that are writing in VBScript because they know it is the best tool for that particular job, not because they don't know anything else.