Database Performance Tuning

Tuesday 28 November 2023

Trunk based development sounds great. Here's why you should probably not do it

Mostly via Dave Farley's YouTube channel I'm lately involved in a lot of discussion regarding the "Trunk Based" development practice, touted together with continuous integration (CI) and continuous delivery (CD) as the next step in software development productivity.

As a short summary, this practice means that your branches are never older than a day, because you commit changes to main very, very frequently, sometimes more than once a day. This has a number of advantages, but the main one is that it mostly avoids merge conflicts. If your changes are small there is a correspondingly small likelihood of conflicting with others, and if conflicts arise, they should be easy to resolve. Plus, other team members are always up to date on your development so their own short lived branches are not going to conflict when they are merged.

Sounds good, in principle, but herein lies a danger that I call the "21 method parameter" danger.

Let's see. Dev 1 creates a method that takes, say, six parameters. All is well and good. Dev 2 needs a slight variation of that method and adds another parameter... ten changes later, you have a method with 15 parameters. Of course, everybody thinks that is excessive, but to refactor it means going back to changes made by Dev 2... Dev X which means a lot of refactoring, changing tests, etc.

All this runs against the "Trunk based" approach because it needs a long lived branch that is going to have merge conflicts. So nobody changes anything and they keep happily coding ahead.

So there it goes, until you have, and I've seen it in the real world, 21 parameters.

The context has not changed, and living in a "trunk based" world means this likely won't change. It will get even worse.

A few people during the debate pointed out that there are tools to prevent that kind of thing from happening. Yes, you may use linters or other static analysis tools to detect such quality issues cropping up. But then you run into Bob adding the 7th parameter to the constructor and asking you why you can't make an exception, because the commit will become much older and there's no time to refactor this... so you allow it and go ahead towards the 8th parameter.
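As a rough illustration of the kind of check such tools perform, here is a minimal sketch built on Python's standard ast module. The limit of six parameters and the name functions_over_limit are assumptions made up for this example; they are not taken from any particular linter.

```python
import ast

MAX_PARAMETERS = 6  # hypothetical team limit, pick your own


def functions_over_limit(source: str, limit: int = MAX_PARAMETERS) -> list:
    """Return (function name, parameter count) for every offender in source."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            # Counts positional-only, regular and keyword-only parameters.
            # Note that "self" is counted too; *args and **kwargs are not.
            count = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            if count > limit:
                offenders.append((node.name, count))
    return offenders
```

A check like this only reports the problem. Whether Bob gets his exception is still a human decision.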

Then there were people saying that this kind of nightmare situation cannot be prevented with any methodology, to which I'd say that at least you should not use a methodology that almost encourages it. Due to the above, "trunk based" development is one of them, outside a very few scenarios. Let's see.

1. Early stages of development, where in fact nothing is settled but you know for sure that your code is going to have some shape. Yes, both Alice and Bob know the user role is going to contain an "Accounts Payable" member and an "Accounts Receivable" member, so it is ok for each one to do a commit adding the role.
2. Elite teams made up of senior people with the judgement to stop before adding the 22nd parameter to the method. They know something is wrong and set out to fix it, steering away from the "trunk based" approach.
3. Code that is so badly structured that even the simplest of user story implementations needs lots of changes in lots of modules. In this case, most developers are statistically very likely to induce merge conflicts because you can't touch the accounts receivable module without touching the user module and the database repository interface and who knows what else.

Reality is, rarely one has the chance of starting something from scratch. Even if we are so lucky, that stage won't last more than a few months.

Reality is, we regular people are not part of elite teams. We're regular developers that have to meet deadlines and are willing to trade off a bit more technical debt (there are already 16 parameters, what's wrong with having 17 instead?) in order to fulfil our commitments.

So if you're in one of the three cases, perhaps trunk based development is right for you. For the rest of the cases, trunk based development merely rewards the behaviours that end up in the third case: a sprawling code base of layers upon layers of patches made in the name of "making progress" that won't be easy to fix. These code bases are usually the ones that end up being declared in need of a "total rewrite" over time.

And no, it's not good to be proud of doing "trunk based" development because you're in case (3). I feel sorry for you, because the brittleness of the system makes it very, very difficult to make any progress with it.

As a final note, the solution to long lived branches creating merge conflicts sounds easy to me: it is not the age of the branch that matters, it is how often you merge the latest changes from main into it. Do it daily and your branch will never, ever, be more than a day behind main, no different from a branch created yesterday. Such a simple solution that it is hard to understand why the "trunk based" proponents keep ignoring it.

Wednesday 21 June 2023

On uniqueness and MD5 hashes

Came across this today: someone needs a unique key, but thinks that Python's uuid4 is not unique enough. The proposed solution? Just get the MD5 hash of the uuid.

No, passing your non unique identifier thru the MD5 formula won't make it more unique. In fact, it will create something that is, at best, exactly as unique as your initial uuid. And not even larger: an MD5 digest is 128 bits, the same size as the uuid.
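A minimal sketch of why, using only the standard uuid and hashlib modules (the helper name md5_of_uuid is made up for the example): the hash is a deterministic function of its input, so two equal uuids still collide after hashing, and no randomness is added anywhere.

```python
import hashlib
import uuid


def md5_of_uuid(value: uuid.UUID) -> str:
    """Return the MD5 hex digest of the uuid's canonical string form."""
    return hashlib.md5(str(value).encode("ascii")).hexdigest()


original = uuid.uuid4()
duplicate = uuid.UUID(str(original))  # simulate the dreaded uuid collision

# Hashing is deterministic: a duplicated uuid yields a duplicated digest.
print(md5_of_uuid(original) == md5_of_uuid(duplicate))
```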

Sunday 5 March 2023

The "You do not need foreign keys" crowd

Can't help but notice a current trend advocating that foreign keys on database schemas are just a burden that you can do without. The last argument I read about it was along the lines of:

FKs are enabled on dev, testing and staging. Your automated test suites and your manual tests of the application in those environments, together with the FKs, help you catch data integrity problems.
Once you fix all the problems, you can just deploy to production dropping all FKs. After all, you've taken care of all the problems in the other environments, so why would you need FKs there? They just add overhead, don't they?

Ok, let's start by admitting a basic truth: it is strictly true that you can run your database without foreign keys. Yes, they add some performance overhead. Yes, they force you to do database operations in a certain order.

But beyond that, all these arguments are just showing lack of experience and ignorance. Let's see why.

For a start, anyone who thinks that all possible problems can be anticipated by means of testing in dev/staging environments is simply not experienced enough to have seen code bases with 100% test coverage and complete manual end user testing failing due to... unexpected reasons. Anyone who does not know that he/she "does not know what he/she does not know" is simply lacking maturity and experience. Thinking that you can anticipate all possible error states, all possible system states is just hubris.

But that is just the beginning. Anyone who has watched a code base evolve under different hands and eyes knows that anything that is left as optional will, at some point in the development cycle, be ignored. Data integrity checks included.

And anyone who has worked on anything more complex than a web shopping cart knows that parallelism is hard, concurrency is even harder, and locking is even harder than that, and that these topics intersect with each other. You just cannot easily roll your own transactional engine or your own locking mechanism. It takes a lot of work from a lot of talented people to create truly bullet proof database and transaction engines. Not saying that you cannot do it, but thinking that you will do better with your limited resources and experience in your project is really placing yourself very, very high on the developer competence scale.
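To make the point concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The customer and invoice tables and the insert_orphan_invoice helper are invented for the illustration. SQLite is used only because it ships with Python, and it happens to make foreign key enforcement a per-connection switch, which mimics "FKs on in staging, off in production".

```python
import sqlite3


def insert_orphan_invoice(enforce_foreign_keys: bool) -> bool:
    """Try to insert an invoice for a customer that does not exist.

    Returns True if the database accepted the orphan row.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce_foreign_keys else 'OFF'}")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE invoice ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(id))"
    )
    try:
        # Customer 42 was never created, so this row points at nothing.
        conn.execute("INSERT INTO invoice (id, customer_id) VALUES (1, 42)")
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()
```

With enforcement on, the engine refuses the row and the function returns False. With enforcement off, the orphan invoice goes in silently and returns True, and nobody finds out until a report fails to add up.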

It is useful to compare these arguments with a topic that, while apparently unrelated, is an argument that has the exact same kind of flaws: strongly typed vs. loosely typed vs. untyped languages.

Yes, you can in theory create big and complex systems without using any data types. But the end result will be much more difficult to understand, harder to evolve and test, way more error prone and expensive in the long run than using a strongly typed language to do it.

Why is that? Because when using a strongly typed language, the compiler acts as a first test bed for the validity of your assumptions about the shape of the things that get passed around your code. Without that validation layer, you're simply more exposed to problems, introducing more chances of getting things wrong and forcing the reader of your code (including your future self) to dive deep into each and every function call just to see what the callee is expecting to receive. Time consuming and error prone, to say the least.

So, data types are like foreign keys: a device that is used to make your code more robust, consistent and changeable. You can do without them, but be prepared to pay a cost much higher than that of declaring types and relationships.

In summary, "you don't need foreign keys in production" is the flat-earthism of software development. It only shows how much you don't know and how little real world experience you have. Don't embarrass yourself.

Tuesday 25 May 2021

Fighting your framework: don't do it

New role, new opportunities. This time I'm off Python, or at least it is no longer my main development language, and into Java. But I'm starting to see the same patterns emerge.

My new job involves using Java, specifically, Spring Boot. You may have a lot of arguments against Spring in general, and I'd agree with most of them. However, it is hard to find another framework with such extensive coverage of a number of very essential topics. From relational to unstructured data access, MVC, security, testing... all in one of the fastest (second only to compiled ones) and most popular languages out there, which also has a dazzling number of libraries available covering almost anything else not covered by Spring itself. What? You say bloated? Yes, gimme bloat if I can spin up a basic app in a couple of hours. You say complex? Yes, it is, so much that a side project (Spring Boot) was born to simplify the management of dependencies and generally avoid your head exploding.

In a sense, this is not different from Django: you may have a lot of valid points against using it for production applications, but it does have a lot of very powerful counterpoints that completely justify its adoption: insanely fast development cycle, a test runner that lets you use actual databases to do actual testing without taking 20 seconds to set up the testing harness (I wish Spring had something like that), a load of libraries and projects you plug in and start using in seconds...

So yes, if you're starting an app from scratch these days, you'd better have very, very good reasons not to use Spring (if you want Java) or Django (if you prefer Python).

And one of the most interesting parts of joining an existing development team is that you get the chance to understand the architectural and layout choices made at the time the app was created and evaluate how these decisions have contributed to the evolution of the app. Or have instead become obstacles to developing new functionality.

I'm starting to notice a pattern that invariably comes true: the decision to "fight" the framework and solve a problem by trying to replace some of its built in functionality is one of the biggest blockers to evolve an app. Invariably, the custom home made replacement lacks the flexibility, performance or reliability of the one provided by the framework, not surprising as you're not as experienced and skilled in the details of the standards and protocols as your framework's creator. Invariably, it becomes an obstacle when doing major framework version upgrades. Invariably, it creates additional layers of complexity instead of letting you concentrate on application functionality.

You've chosen a framework. Use it. Extend it. But don't patch it or overcome it. If you find yourself doing that, throw away your framework and pick another one.

Above all, please, please, please, do not fight it. Use it as intended. If you don't like the way it is set up, just do not override 30 methods in 9 classes across 528 lines of code just because you think authentication should follow your three-redirections-across-two-domains flow to decode a session token. Do not create your own versions of a cache just because you can or you don't understand how the built in one works.

Monday 22 June 2020

Microsoft is changing. This time it's for real

After installing the latest W10 update (this time without the usual associated multi-reboot drama, luckily) I see that finally, Windows has a decent, capable browser that is fast, secure and compatible with modern standards. The only small wrinkle in this glossy picture is that said browser no longer comes completely from Microsoft, but is really a kind of fork of Google's Chrome.

Leaving aside the discussion on how much content on the latest Edge update is coming from Microsoft vs. Google, anyone that lived through the "browser wars" at the end of the previous century can't help but shake his/her head thinking about what just happened. Microsoft, one of the largest talent pools in software development in the world, has given up after five years of trying to create a browser that was competitive with Chrome and FF, unable to keep up on the development of what was in the past one of the key components of a desktop computer: one that is the gateway to most, if not all, content consumed on the computer. The very same web browser that in the past was a key piece of technology and an avenue to introduce new technologies and control the user experience. The subject of a federal investigation on monopoly abuse.

Yes, that desktop browser. Today, it has become irrelevant. And this change signals that Microsoft, this time for real, is changing. No longer trying to be the king of the hill for all hills under the sun, they finally seem to be focused on a few things that look like a reasonable long-term bet: milking their cash cows (Office, mainly), selling services instead of products (same thing, different name, subscription model) and offering a seamless integration of your legacy server estate with reasonable cloud offerings. That seems to be a much safer and more stable revenue source. No more attempts to be more Apple-y than Apple, more Google-y than Google, no more desktop changes trying to apple-ize or google-ize the Windows desktop, no more XBox (yes, that will happen), no more Windows Phone, no more Bings, no more Cortanas...

If anything, this should send a signal to the other big IT companies: they are all now old and mature enough that it is time to end dispersion and focus on a few things that can work in the long term. The pace of innovation is settling down. In a sense, Google is already doing that with their Alphabet split, but they are not on the same level: I do not see Google adopting a piece of MS software for a key system component.

Who knows, let's wait another decade and see what happens.

Wednesday 12 February 2020

Four years is quite a long time

But yes, four years. That's how long I've been absent from this blog. To be blunt, this blog started as a means to achieve a double goal: share some experiences while at the same time vent some frustrations coming from the frictions that happen in the day to day operations of an IT shop. Mostly centered around the role of databases in the business IT world, but ended up covering all topics that I touched, even tangentially.

I was -naively- expecting to achieve some degree of recognition and influence by posting here, but this clearly did not happen. At least not as a result of what was written here. And personal life took a few turns and twists that left little time to write anything. But that does not mean the action stopped.

Ah, but this is also a tremendous opportunity to reflect on what four years mean in "technology-time". In summary, from what I see in my surroundings, there are a few key changes that have taken place in these four years.

First, the web and mobile took over almost everything. Outside a few specific domains, no one in his/her right mind even thinks of not doing his/her application as a web app, preferably one that scales across a range of devices from small smartphones to slightly bigger tablets to full fledged desktop PCs. It is not easy and there are a few trade offs unless you're willing to invest in custom development for a platform. Something that only the people with the deepest pockets do. Everyone else uses the browser as a front end and something else on the back end. Except of course if you're SAP, Oracle, Adobe, AutoDesk, Microsoft or write games.

Second, and related to the first point, speed of development and deployment have become key differentiators in the SW development arena. No business will tolerate the age-old bi-annual development cycle. There is also that thing called DevOps which basically means "Sysadmins can and should write code unless they like their roles to be abstracted away by some big cloud provider in a few config pages"

Everything is converging to a global tendency that tries to automate everything, remove manual -and usually error prone- intervention, scale in all directions and optimize away all the inefficiencies in the development process itself. New methodologies like XP/Scrum/Agile bend the classic waterfall cycle rules in an attempt to extract the most out of each cent invested in people and technology.

Together all those things paint a picture where the classic roles (IT, QA, Ops, DBA, Dev) blend and merge into some sort of team that is required to perform at levels unheard of ten years ago. It is not uncommon to find web-based businesses doing app deployments four times a day and three week development iterations delivering significant value at a rate previously unthinkable.

Paradoxically, all this means that the classic role of a DBA is simply disappearing. Teams are supposed to have the skills and knowledge to design, maintain and monitor a database without a specialized role dedicated to the task. And if your DB does not perform, it is cheaper than ever to scale it up. And most of the time, it works. Problem is, when it fails, it does so spectacularly. And here is the paradox: a developer with solid DB knowledge can make the difference for a team between an unreliable and underperforming app and something that makes customers smile. The more you know about DBs, the better developer you become.

So these four years for me have been a transition from a guy who was very good at programming but really added value when fiddling with databases to someone whose role and job title is simply senior developer, and who happens to have a lot of very useful knowledge about how DBs and SQL work across a few different engines and architectures.

And still enjoys all this quite a bit.

Monday 22 June 2015

Django and Python best practices

After a few weeks reading an extensive, and quite complex, Python/Django code base, I’ve realized that there are a few simple practices that can make a significant difference in how effectively and quickly one can pick up an application. Not being up to now an intensive Python user, I was expecting to catch up on the code with more or less the same level of effort it takes me to grasp a piece of C, SQL, or Java.

But it hasn’t happened as quickly as I expected.

I’ve found myself tracing the application with a debugger, not looking for bugs, but trying to understand what it does. In my mind, this is an admission of defeat: I can’t understand the code by reading it, I have to watch it in motion to be sure my mental image of what the code does and what the code actually does match.

Debugging is the task of finding out why your mental image of what the code should do does not match what it actually does. Not the opposite. When you don’t know what some code does, you should be able to find out by reading it.

And I’ve realized why it was taking so much time. Python is so powerful and expressive that it has its own shoot-yourself-in-the-foot factor, which can be, with a big enough code base, just as dangerous as the shoot-yourself-in-the-foot pitfalls of C, SQL or Java.

So I’ve put together this short guide with the list of things I’d want to see in code that I’ve not written myself. Which is really the list of things I want to keep an eye on when I write Python code in the future.

So this being a case of either my code reading abilities being weak or the code not being well written, I of course prefer to blame the code. Not the people who wrote it, of course: I have real world examples of each of these entries, but the point is not to shoot blame around, but rather to make the code more readable, shareable, and in the future, easier for newcomers to understand without resorting to tracing it with a debugger.

Don’t fight or otherwise reinvent Django or the standard library

Django provides built-in functionality to validate your data. To enable referential integrity. To deliver web pages. To do a lot of things. Django has been around for years, and improved over and over by a lot of people. So before implementing anything, think twice. Look around in the Django docs and check if there’s something already built in to do that.

In particular, use Django forms to validate data. Use Django validators in models. Use Django ForeignKey, use cleaned_data to actually … clean data. Don’t reinvent the wheel if there’s a perfectly good, already debugged and reusable, wheel available.

Use the Python standard comment syntax

Document every single parameter your function accepts. If your function has side effects, document them. If your function throws exceptions, state so in the documentation.

The only acceptable exception to this rule is for methods that override or implement an existing Django convention. That would be an unnecessary restatement of what is already said. Except of course if you override a standard framework method and add some special contract.

It is much, much worse to have misleading documentation than no documentation at all because it creates cognitive dissonance. If you’re changing a method and not updating the documentation, you’re just confusing future readers that will discover sooner or later that the documentation does not match what the code does, and they’ll throw away the documentation anyway and throw a few expletives at you or your family. So it is better to throw away the documentation than to keep it obsolete, before someone else loses time unnecessarily discovering that it is outdated.
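As an illustration of what that can look like, here is a small sketch of a docstring that covers parameters, return value, exceptions and side effects. The function transfer_stock and its inventory dictionary are invented for the example, and the Args/Returns/Raises layout is just one common convention, not the only valid one.

```python
def transfer_stock(inventory: dict, item: str, quantity: int = 1) -> int:
    """Remove units of an item from the inventory.

    Args:
        inventory: Mapping of item name to units in stock.
        item: Name of the item to remove.
        quantity: Units to remove. Defaults to 1.

    Returns:
        The units of the item left in stock.

    Raises:
        KeyError: If the item is not in the inventory.
        ValueError: If there are fewer units in stock than requested.

    Side effects:
        Modifies ``inventory`` in place.
    """
    if inventory[item] < quantity:
        raise ValueError(f"only {inventory[item]} units of {item!r} in stock")
    inventory[item] -= quantity
    return inventory[item]
```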

Use Python parameter passing to… actually pass parameters

Python is famous for its readability, and its duck typing prevents a lot of mistakes. Named parameters and default values are a convenient way to plainly state what a method does. Method signatures can also be read by your IDE and used in code completion.

There are native ways to pass arguments to method calls. Don’t use JSON or HttpRequest to pass parameter values to a function that is not a URL handler. Period.

See the point on kwargs for more details.

Be explicit. Be defensive

When you consider augmenting the signature of a function by adding more parameters, just add them and provide sensible defaults.

You may think that you’re just making your code future-proof by using variable parameter lists. You are wrong. You’re just making it more confusing and difficult to follow.

Consider this code

 class base(object):
   def blah(self, param1, **kwargs):
     ...

 class derived(base):
   def blah(self, param1, **kwargs):
     ...

Don’t do it. If when you create it, blah does not need more than 1 argument, declare it like so. Leaving **kwargs forces the reader of the code to go thru the whole function body to verify if you’re actually using it. Don’t worry about future proofing your code, any half decent IDE will tell you which methods could have issues by your change in much less time than you can think about it. So just declare this:

 class base(object):
   def blah(self, param1, param2=None):
     ...

And future readers of your code will be able to tell what your function accepts.

MVC: the whole application is NOT a web page

According to Django’s own site:

In our interpretation of MVC, the “view” describes the data that gets presented to the user. It’s not necessarily how the data looks, but which data is presented. The view describes which data you see, not how you see it. It’s a subtle distinction.

So, in our case, a “view” is the Python callback function for a particular URL, because that callback function describes which data is presented.
Furthermore, it’s sensible to separate content from presentation – which is where templates come in. In Django, a “view” describes which data is presented, but a view normally delegates to a template, which describes how the data is presented.

Where does the “controller” fit in, then? In Django’s case, it’s probably the framework itself: the machinery that sends a request to the appropriate view, according to the Django URL configuration.

That does not mean that you have to use the same parameter passing conventions as an HTTP request. If you do that, you’re giving up on all the parameter validation and readability that Python provides.

There are native ways to pass arguments to method calls. Don’t use JSON format or HttpRequest to pass parameter values to a function that is not a URL handler. Yes, I'm repeating a sentence from the previous point here because it is very important to keep this in mind.

Avoid the temptation to create über-powerful handler() or update() methods/objects that can do everything inside a single entry point in a "util" or "lib" module (with a possibly associated evil kwargs parameter list). The fact is, this single entry point will branch to a myriad places and it will be a nightmare to follow and change in the future, becoming the dreaded single point of failure that no one wants to touch even with the end of a long stick.

Instead, move the functionality related to each data item as close to the data as possible. Which means, to the module where it is declared. These should be much smaller and easier to test and manage than the über monster methods.

Then, use the controller to glue together all these small pieces to build a response to your clients.

kwargs is EVIL. Deeply EVIL. Root-canal-extraction-level evil.

kwargs is a Python facility designed to provide enormous flexibility in some situations. In particular, decorators, generators and other kinds of functions benefit greatly from being able to accept an arbitrary number of parameters. But it is not a general purpose facility to call methods.

The ONLY acceptable use of kwargs in normal application development, that is, outside framework code, is when the function actually can accept an arbitrary number of arguments and its actions and results are not affected by the values received in the kwargs parameter list.

In particular, the following code is NOT ACCEPTABLE:

 def blah(**kwargs):
   if 'destroy_world' in kwargs:
     do_something()
   if 'save_world' in kwargs:
     do_something_else()

See how many things are wrong with this function? Let’s see: first, the caller has to read the documentation you provided in the function to know which values are acceptable in the kwargs dictionary. Second, a small typo when composing the arguments for the function call can make a significant difference in what the function does. Third, anyone reading your code will have to go thru the whole function body to understand what the valid kwargs arguments are.

And finally, why stop there? Why don’t you define all of your methods accepting **kwargs and be done with parameter lists? Can you imagine how completely unreadable your code will become?

Seriously, each time you use kwargs in application code, a baby unicorn dies somewhere.
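For contrast, here is a sketch of both sides of the rule. The names respond_to_crisis and logged are made up for the illustration. The first function spells out its options in the signature, and the second is a decorator, the kind of code where forwarding **kwargs untouched is legitimate because its behaviour does not depend on what is inside.

```python
import functools


def respond_to_crisis(destroy_world: bool = False, save_world: bool = False) -> list:
    """Hypothetical explicit version: every option is visible in the signature."""
    actions = []
    if destroy_world:
        actions.append("destroy")
    if save_world:
        actions.append("save")
    return actions


def logged(func):
    """Decorator that forwards whatever it receives and never inspects it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper
```

With the explicit signature, a call such as respond_to_crisis(save_wrld=True) fails with a TypeError right at the call site, instead of silently doing nothing as the kwargs version would.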

DRY - Don’t repeat yourself

If you’re doing it more than twice, it is worth thinking about it. Consider this code:

 def blah(self, some_dict):
   if 'name' in some_dict:
     data['name'] = some_dict['name']
   if 'address' in some_dict:
     data['address'] = some_dict['address']
   # ....
   # ....
   if 'postal_code' in some_dict:
     data['postal_code'] = some_dict['postal_code']

Why not use this instead?

 def blah(self, some_dict):
   allowed_entries = ['name', 'address', 'postal_code']  # ... plus the rest of the allowed keys
   for entry in allowed_entries:
     if entry in some_dict:
       data[entry] = some_dict[entry]

Or even better and surely more pythonic and satisfying:

 def blah(self, some_dict):
   allowed_entries = ['name', 'address', 'postal_code']  # ... plus the rest of the allowed keys
   data = {key: some_dict[key] for key in allowed_entries if key in some_dict}

There are a lot of advantages of doing this. You can arbitrarily extend the list of things that you transfer. You can easily test this code. The code coverage report will keep giving you 100% no matter how many values you include in some_dict. The code is explicit and simple to understand.

And even better, someone reading your code will not have to go thru a page or two of if statements just to see what you’re doing.

Avoid micro optimizations

You may code this thinking you’re just writing efficient code by saving a function call:

 if a in some_dict:
   result = some_dict[a]
 else:
   result = some_default_value

instead of

 result = some_dict.get(a, some_default_value)

Now, go back to your console and time these two examples executing a few thousand times. Measure the difference and think how many .0001’s of a second you’re saving, if any. Now, go back to your app and remember the point about using the Python standard library and the Django provided functionality.
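If you want to run that measurement, here is one possible sketch using the standard timeit module. The dictionary contents, the default value and the function names are placeholders invented for the example. Wrapping each variant in a function adds the same call overhead to both, so look at the difference rather than at the absolute numbers.

```python
import timeit

some_dict = {"a": 1}        # placeholder data
some_default_value = 0      # placeholder default


def with_membership_test(key):
    if key in some_dict:
        return some_dict[key]
    return some_default_value


def with_get(key):
    return some_dict.get(key, some_default_value)


# Time both variants for a key that exists and for one that does not.
for func in (with_membership_test, with_get):
    for key in ("a", "missing"):
        seconds = timeit.timeit(lambda: func(key), number=100_000)
        print(f"{func.__name__}({key!r}): {seconds:.4f}s for 100,000 calls")
```

The numbers will vary with your machine and Python version, so no expected output is shown here.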

Monday 4 May 2015

Javascript: has everyone forgotten how it became what it is?

From time to time there's the occasional question from people that think that I'm really smart and ask me for advice on which programming language they should learn to ensure they have a good career ahead. Of course, I always try to answer these questions instead of focusing on the real question, which is why they think I'm so smart when in fact I am not.

And things always end up being a debate about how important it is to know Javascript and how much of a future it has. Then you stumble upon debates about how good Node.js is because you are using the same language on client and server, and what a great language Javascript is for writing the server side of an application.

And while I think that it is important that everyone knows Javascript, I don't think it is going to be the only programming language they are going to need. Or that they are going to work in Javascript a lot. Because Javascript is good for what it was created for, not for writing performance sensitive server code or huge code bases.

And when raising that point, most people seem to forget how Javascript became what it is today.

See, in the beginning of the web, there was only one browser. It was called Mosaic, and did not have any scripting capabilities. You could not do any client side programming in web pages. Period. If you wanted your web pages to change, you had to write some server code. Usually using something called CGI and a language that was able to read/write to standard input/output. But let's not digress.

Then came Netscape. A company where many of the authors of the original Mosaic code ended up working. These guys forgot about their previous Mosaic code, started from scratch and created a web browser that was the seed that started the web revolution. Besides being faster and more stable than Mosaic, the Netscape browser, known as Navigator, had a lot of new features, some of which became crucial for the development of the world wide web as we know it today. Yes, Javascript was one of those.

So they needed a programming language. They created something with a syntax similar to Java, and even received permission from Sun Microsystems (owner of Java at the time) to call it Javascript. Legend says Javascript was created in 10 days, which is in itself no small feat and speaks volumes about the technical abilities of the Netscape team, most notably in this case of Brendan Eich.

At that point, Javascript was a nice and welcome addition to the browser, and to your programming toolbox, because it enabled things that previously were simply not possible with a strict client-server model.

Then it all went boomy and bubbly, and later crashy. The web was the disruptive platform that ... changed everything. The server side (running Perl, Java, ASP or whatever) plus the client side executing Javascript was soon used to create sophisticated applications that replaced their desktop counterparts while also being universally available from anywhere, instantly accessible, and requiring nothing from the client but a browser and a TCP/IP network stack.

Javascript provided the missing piece in the puzzle necessary for replacing applications running in desktops and laptops of the time with just a URL typed in the address bar of a web browser. Instantly available and updated, accessible from any device, anywhere. Remember, there were no mobile smartphones back then.

That of course rang a lot of bells at Microsoft. They saw the internet and the browser as a threat to their Windows desktop monopoly and Microsoft, being the smart people they are, set out to counter that threat. The result was Internet Explorer.

Internet Explorer was Microsoft's vision of a web browser integrated in Windows. It was faster than Netscape's Navigator. It was more stable. It crashed less. It came already installed with Windows, so you did not have to download anything to start browsing the web. Regardless of the anti-monopoly lawsuits arising from how Microsoft pushed Internet Explorer in the market, the truth was that Internet Explorer was a better browser than Netscape's Navigator in almost any dimension. And I say almost because I'm sure someone can remember something where Navigator was better, but I sincerely can't.

And it contained a number of technologies designed by Microsoft to regain control of their Windows desktop monopoly. Among them, the ill-fated ActiveX technology (later to become one of the greatest sources of security vulnerabilities of all time) and the VB scripting engine. That was part of Microsoft's "embrace, extend, extinguish" tactic. Now, you could write your web page scripts in a Visual Basic dialect instead of Javascript.

Internet Explorer practically crushed Navigator out of the browser market, leaving it with 20% or so of its previous 99% market share. It was normal at the time for web developers to place "works best with Internet Explorer" stickers on their pages, or even directly refuse to load a page with any browser other than Explorer and pop up a message asking you to use Explorer to view their pages. Microsoft was close to realizing their dreams of controlling the web and keeping their Windows desktop monopoly untouched.

And then came Mozilla. And then came the iPhone. Which are other stories, and very interesting by themselves, but not the point of this post...

What is interesting from a historical perspective at that point is that developers were using many proprietary IE features and quirks, yet their web page scripts were still mostly written in Javascript. Not in VBScript. And VBScript faded away like ActiveX, FrontPage and other Microsoft ideas about how web pages should be created. Web developers were happily using Microsoft proprietary extensions but kept using Javascript.

Why did that happen? Why embrace lots of proprietary extensions to the point of making your pages unreadable outside of a specific browser but keep your code in Javascript instead of the Microsoft-nurtured VBScript? Basically two reasons: first, there was still a significant minority of non-Internet Explorer users browsing the web, and Javascript programs worked on both browsers with few changes from one to another. Second: VBScript sucked. You may think that developers immersed in a proprietary web were choosing Javascript over VBScript because the language was superior. And it was. But this was not a case of choosing the very best available. It was just a matter of keeping that remaining 20% happy and at the same time picking the one of the two languages that sucked less.

Mind you, Javascript had no notion of a module or package system. No strong typing. Almost no typing at all. No notion of what a thread was. No standard way of calling libraries written in other languages.

But if after reading all these missing items you think Javascript sucks, you have to see VBScript to appreciate the difference. A language sharing all the deficiencies of Javascript, and then having more of its own. VBScript sucked more than Javascript. Javascript was the lesser of two evils.

And as of today, Javascript still has all those deficiencies. Don't think of Javascript as the language of choice for writing web page front ends. Think of it as your only choice. You don't have any other alternatives when working with web pages. Period. Javascript is not used because it is the best language, it is used because it is the only one available.

It was much later when Google created the V8 Javascript engine, making Javascript fast enough to be considered acceptable for anything beyond animations and data validations. It was even later when Ryan Dahl, the creator of Node.js, had the crazy idea of running V8 on a server and having it handle incoming HTTP requests. Node.js works very well on a very limited subset of problems, and fails completely outside those.

The corollary is: Javascript will be around for ages. You need to know it, and know it well, together with the framework of the week, if you want to do anything at all on the client side. But it will not be the language in which the web servers of the future are programmed.

Phew. And all this still does not completely answer the question of which programming languages you need to know. Javascript is one of them, for sure, but not the most important or the most relevant. It is a necessary evil.

Three guys in a garage, the NIH syndrome and big projects

As the old adage says, experience is the root (or perhaps was it the mother?) of science. Not so in software development, it seems. For what it's worth, not a week passes without another report of a disastrous software project being horrendously late, over budget, under performing or all of these at the same time.

Usually, this bad news is about the public sector. Which is great news to those ideologically inclined to think that government should be doing as little as possible, even nothing, in our current society. This argument usually does not take into account that these government run projects are almost always awarded to private contractors. Apparently, these same contractors are able to deliver as expected when the money does not come from public funds, hence the blame should sit squarely on the way government entities manage those projects, not on these contractors, right?

I have bad news for this kind of argument: these publicly funded failures are simply more visible than the private ones. With various degrees of transparency, and by their own nature, these kinds of projects are much more likely to be audited and reviewed than any privately funded project.

That is, for each Avon Canada or LSE failure that is made public, we have many, many more reports of public sector failures such as healthcare.gov or Queensland. So next time you consider that argument, think if it is simply that you don't hear about private sector projects going wrong so often. Is it really because all goes well? Or is it simply because the private sector is more opaque and thus can hide its failures better?

Anyway, I'm digressing here. The point is that big projects, no matter where the funding comes from, are much more likely to fail. Brooks explained it in The Mythical Man-Month (mandatory reading) years ago. Complex projects mean complex requirements, which mean bigger systems with more moving parts and interactions. These are technically challenging enough, but they are nothing compared to the challenges of human communication and interaction, which grow far faster than the number of people involved.
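
To put a number on the communication problem: the model Brooks uses in The Mythical Man-Month counts pairwise communication paths, n(n-1)/2 for a team of n people, so the count grows with the square of the team size. A minimal sketch of that arithmetic (the function name is mine, and real projects obviously do not have everyone talking to everyone):

```python
def communication_paths(team_size: int) -> int:
    """Pairwise communication paths in a team of `team_size` people: n(n-1)/2."""
    return team_size * (team_size - 1) // 2


# Compare a garage-sized team with progressively bigger projects.
for n in (3, 10, 50, 200):
    paths = communication_paths(n)
    print(f"{n:>3} people -> {paths:>5} possible pairwise conversations")
```

Three people have three conversations to keep in sync. A 200 person project has almost twenty thousand.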

What is more surprising is how often, when one reads the post mortem of these projects, there is some kind of odd technical decision that contributes to the failure. This is usually discussed in detail, with long discussion threads pondering how one solution can be better than another and inevitably pointing to the NIH syndrome. This can take the shape of using XML as a database, a home grown transaction processor or home grown database, using a document database as structured storage (or vice versa), using an unstructured language to develop an object oriented system and so on.

There is an explanation for focusing on technical vs. organizational issues when discussing these failures: technical bits are much more visible and easier to understand. Organizational, process or methodology issues, except for those involved directly in the project, are much more opaque. While technical decisions usually contribute to a project's failure, the fact is that very, very complex and long projects have been successfully executed with way more rudimentary technologies in the past, so it is only logical to conclude that the technology choices should not be that decisive in the fate of a project.

And we usually quickly forget something that we tend to apply in our own projects: the old "better the devil you know" adage. More often than not, technologies are chosen conservatively, as it is far easier to deal with their known weaknesses than to battle with new unknowns. We cannot, of course, disregard other reasons in these choices. Sometimes there are commercial interests from vendors interested in shoehorning their technology, and these are difficult to detect. But we have to admit that sometimes the project team believed the chosen option was the best for the problem at hand. Which leads to the second point in the post: the NIH syndrome.

What looks like a strange choice of technology to someone unfamiliar with it can be dismissed as just another instance of the "Not invented here" syndrome. But what is NIH for someone claiming to "know best" in a specific technology area is perhaps the most logical decision for someone else. What looks attractive as a standalone technology for a specific use case may not look so good when integrated into a bigger solution. This is why people still choose to add text indexing plugins to a relational database instead of using a standalone unstructured text search engine, for example.

Another often cited reason for failure in these projects -and in all software projects in general- is that there are huge volumes of changes introduced once the project has started. What is missing here is not some magic ingredient, but a consistent means of communication that states clearly that changing anything already done is going to cost time and money. Projects building physical things -as opposed to software- seem to be able to get along with this quite well, if only because one does not have any issues explaining the effects of change on physical things. But the software world has not yet managed to create the same culture.

So now that we've reasonably concluded that technology choices are not usually the reasons for a project failing, that change is a constant that needs to be factored in and there is no way to avoid it, and that there is a strong correlation between project size and complexity, is there a way of keeping these projects from failing?

In my opinion, there is only one way: start small. Grow as it succeeds, and if it does not, just discard it. But I hear you crying, there is no way for these projects to "start small", as their up front requirements usually cover 100s of pages full of technical and legal constraints that must be met from day one. These projects don't have "small" states. They just go in a big bang and have to work as designed from day one. And that's exactly why they fail so often.

Otherwise, three guys in a garage will always outperform and deliver something superior. They are not constrained by 100s of requirements, corporate policies and processes, or financial budget projections.

Sunday 2 November 2014

Accidental complexity, Excel and shadow IT

You probably have felt like this. You've devoted some intense time to solve a very complex and difficult problem. In the process, you've researched the field and made attempts to solve it in a few different ways. You've come across and tested some frameworks as a means of getting close to the solution without having to do it all by yourself.

You've discarded some of these frameworks and kept others. Cursed and blessed the framework's documentation, Google and StackOverflow all at the same time. You have made a few interesting discoveries along the way, as always, the more interesting ones coming from your failures.

You've learned a lot and are now ready to transfer that knowledge to your customer, so you start preparing documentation and code so that it is in a deliverable state. Your tests become more robust and reach close to 100% coverage. Your documents start growing and get data from your notebook, spreadsheets and test results. Everything comes together nicely and ready to be delivered as a coherent and integrated package, something that your customer will value, appreciate and use for some time in the future (and pay for, of course).

And along the way, you've committed to your version control of choice all your false starts. All the successes. You've built a rather interesting history.

Of course, at this point your customer will not see any of the failures, except when you need to refer to them as supporting evidence for taking the approach you propose. But that's ok, because the customer is paying for your results and your time, and he's not really interested in knowing how to get these. If your customer knew how to do it in the first place, you would not be there doing anything, after all.

But it is exactly in these final stages where the topic of accidental versus essential complexity rears its head. You've spent some time solving a problem, solved it and yet you still need to wrap up the deliverables so that they are easily consumed by your customer(s). This can take many different shapes: a code library, a patch set or a set of documents stating basic guidelines and best practices. Or all of them.

The moment you solved the problem you mastered the essential complexity, yet you have not even started mastering the accidental complexity. And that still takes some time. And depending on the project, the problem, and the people and organization(s) involved this can take much more time and cost than solving the problem itself.

Which nicely ties to the "Excel" part of the post. At this point you're likely asking yourself what exactly Excel has to do with accidental complexity.

The answer is: Excel has nearly zero accidental complexity. Start Excel, write some formulas, some headers, add some format, perhaps write a few data export/import VBA scripts, do a few test runs and you can proudly claim you're done. Another problem solved, with close to 100% of your time devoted to the essential complexity of the problem. You did not write any tests. You did not use any kind of source control. You did not analyze code coverage. You did not document why cell D6 has a formula like =COUNTIF(B:B;"="&A4&TEXT(A6)). You did not document how many DLLs, add-ons or JDBC connections your spreadsheet needs. You did not care about someone using it in a different language or culture where dates are expressed differently. Yet all of it works. Now. Today. With your data.

That is zero accidental complexity. Yes, it has its drawbacks, but it works. This kind of solution is usually what is described as "shadow IT", and hardly a day passes without you coming across one of them.

What I've found empirically is that the amount of shadow IT in an organization remains roughly proportional to the size of the organization. You may assume that larger organizations should be more mature, and by virtue of that maturity would have eliminated or reduced shadow IT. Not true. And that is because the bigger the organization, the bigger the accidental complexity is.

If you look at the many layers of accidental complexity, you'll find some of them common to organizations of all sizes, mostly at the technical level: your compiler is accidental complexity. Test Driven Development is accidental complexity. Ant, make, cmake and maven are all accidental complexity. Your IDE is accidental complexity. Version control is accidental complexity.

But then there's organizational accidental complexity. Whereas in a small business you'll likely have to talk to very few individuals in order to roll out your system or change, the larger the organization the thicker the layers of control are going to be. So you'll have to have your thing reviewed by some architect. Some coding standards may apply. You may have to use some standard programming language, IDE and/or framework, perhaps particularly unsuited to the problem you are solving. Then you'll have to go through change control, and then... hell may freeze over before you overcome the accidental complexity, and that means more time and more cost.

So at some point, the cost of the accidental complexity is way higher than the cost of the essential complexity. That is when you fire up Excel/Access and start doing shadow IT.

Monday 18 August 2014

The code garage - What to do with old code?

From time to time I have to clean up my hard disk. No matter how big my partitions are, or how big the hard disk is, there always comes a point where I start to be dangerously close to running out of disk space.

It is in these moments when you find that you forgot to delete the WAV tracks of that CD you ripped. That you don't need to have duplicate copies of everything you may want to use from both Windows and Linux because you can keep these in an NTFS partition and Linux will be happy to use them without prejudice.

And it is in these moments when I realise how much code I've abandoned over the years. Mainly in exploratory endeavours, I've sometimes written what in retrospect seem to be substantial amounts of code.

Just looking at abandoned Eclipse and Netbeans folders I find unfinished projects from many years ago. Sometimes I recognise them instantly, and always wonder at how subjective the perception of time is: in my mind that code is fairly fresh, but then looking at the timestamps I realize that I wrote that code seven years ago. Sometimes I wonder why I even thought that the idea was worth even trying at the time.

Yet here they are: a JPEG image decoder written in pure Java whose performance is only about 20% slower than a native C implementation. A colour space based image search algorithm complete with a web front end and back end for analysis. A Python arbitrage engine that can scrape websites and alert of price differences, applying Levenshtein comparisons across item descriptions. Enhancements to a remote control Android app that is able to drive a Lego Mindstorms vehicle over Bluetooth. That amalgamation of scripts that reads EDI messages and extracts key data from them. Something like seven different scripts to deal with different media formats, one for each camera that I've owned. And many more assorted pieces of code.

The question is, what should I do with this code? I'm afraid of open sourcing it, not because of patents or lawyers but because its quality is diverse. From slightly above alpha stage to close to rock solid. Some has test cases, some does not. In short, I don't feel it is production quality.

And I can't evade the thought that everything one writes starts in that state: we tend to judge the final product and tend to think that it was conceived in that pristine shape and form from the beginning. I know that's simply false: just look at the version control history of any open source project. But I want to have that smooth finish, clean formatting, impeccable documentation and fully automated build, test and deploy scripts from day one.

Yet some of this could be potentially useful to someone, even to me at some time in the future. So it is a shame to throw it away. So it always ends up surviving the disk cleanup. And I'll see it again in a few years and ask myself the same question... why not have the equivalent of the code garage? Some place where you could throw all the stuff you no longer use or you don't think you are going to use again and leave it there so anyone passing by can take a look and get the pieces if he/she is interested in them?

Monday 14 April 2014

Heartbleed: the root cause

I can't resist commenting on this, because Heartbleed is the subject of countless debates in forums. In case you've been enjoying your privately owned tropical island for the past week or so, Heartbleed is the name given to a bug discovered in the OpenSSL package. OpenSSL is an Open Source package that implements the SSL/TLS protocols, and is used across many, many products and sites to encrypt communications between two endpoints across insecure channels (that is, anything connected by the internet is by definition insecure).

The so-called Heartbleed bug accidentally discloses part of the server memory contents, and thus can leak information that is not intended to be known by anyone else but the OpenSSL server. Private keys, passwords, anything stored in a memory region close to the one involved in the bug can potentially be transmitted back to an attacker.
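
The mechanism is easier to see in a toy model. The real bug lives in OpenSSL's C code for the TLS heartbeat extension, where the server echoes back a payload using the length claimed by the client instead of the length actually received. The sketch below is not that code: it is a Python simulation with a made-up memory layout and made-up names, meant only to show the class of mistake and the shape of the fix.

```python
# Toy model of a heartbeat handler. "Server memory" is one flat buffer in which
# the request payload happens to sit right next to unrelated secret data.
SECRET = b"user=admin;password=hunter2"  # invented data, for illustration only


def build_memory(payload: bytes) -> bytes:
    """Lay out the pretend server memory: request payload, then other data."""
    return payload + SECRET


def heartbeat_vulnerable(payload: bytes, claimed_length: int) -> bytes:
    """Echo back `claimed_length` bytes, trusting the length sent by the client."""
    memory = build_memory(payload)
    # No check against len(payload): a lying client reads past its own data.
    return memory[:claimed_length]


def heartbeat_fixed(payload: bytes, claimed_length: int) -> bytes:
    """Same handler, but requests with an impossible length are discarded."""
    if claimed_length > len(payload):
        return b""
    memory = build_memory(payload)
    return memory[:claimed_length]


print(heartbeat_vulnerable(b"bird", 4))                 # honest request
print(heartbeat_vulnerable(b"bird", 4 + len(SECRET)))   # lying request
print(heartbeat_fixed(b"bird", 4 + len(SECRET)))        # lying request, fixed handler
```

One missing comparison between two numbers is the whole difference between the two handlers.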

This is serious. Dead serious. Hundreds of millions of affected machines serious. Thousands of millions of password resets serious. Hundreds of thousands of SSL certificates renewed serious. Many, many man years of work serious. Patching and fixing this is going to cost real money, not to mention the undisclosed and potential damage arising from the use of the leaked information.

Yet the bug can be reproduced in nine lines of code. That's all it takes to compromise a system.

Yet with all its dire consequences, the worst part of Heartbleed for me is what we're NOT learning from it. Here are a few of the wrong lessons that interested parties extract:

Security "experts": this is why you need security "experts", because you can never be safe and you need their "expertise" to mitigate this and prevent such simple mistakes from surfacing, and to audit everything right and left and write security and risk assessment statements.
Programmers: this Heartbleed bug happened because the programmer was not using memory allocator X, or framework Y, or programming language Z. Yes, all these could have prevented this mistake, yet none of them were used, or could be retrofitted easily into the existing codebase.
Open Source opponents: this is what you get when you trust the Open Source mantra "given enough eyeballs, all bugs are shallow". Because in this case a severe bug was introduced without anyone realizing it, hence you can't trust Open Source code.

All these arguments are superficially coherent, yet they are at best wrong but well intentioned and at worst simply lies.

In the well intentioned area we have the "Programmers" perspective. Yes, there are more secure frameworks and languages, yet no programmer in his right mind would want to rewrite something of this complexity without at least a sizeable test case baseline to verify it. Where's that test case baseline? Who has to write it? Some programmer around there, I guess, yet no one seems to have bothered with it in the decade or so that OpenSSL has been around. So these suggestions are similar to saying that you will not be involved in a car crash if you rebuild all roads so that they are safer. Not realistic.

Then we have the interested liars. Security "experts" were not seen anywhere during the two years that the bug has existed. None of them analyzed the code, assuming of course that they were qualified to even start understanding it. None of them had a clue that OpenSSL had a bug. Yet they descend like vultures on a carcass on this and other security incidents to demonstrate how necessary they are. Which in a way is true: they were necessary much earlier, when the bug was introduced. OpenSSL being open source means anyone at any time could have "audited" the code and highlighted all the flaws -of which there could be more of this kind- and raised all the alerts. None did that. Really, 99% of these "experts" are not qualified to do such a thing. All bugs are trivial when exposed, yet to expose them one needs code reading skills, test development skills and theoretical knowledge. Which is something not everyone has.

And finally, in the deep end of the lies area, we have the Open Source opponents perspective. Look at how this Open Source thing is all about a bunch of amateurs pretending that they can create professional level components that can be used by the industry in general. Because you know, commercial software is rigorously tested and has the backing support of commercial entities whose best interest is to deliver a product that works as expected.

And that is the most dangerous lie of all. Well intentioned programmers can propose unrealistic solutions, the "security" experts can parasite the IT industry a bit more but that creates at best inconvenience and at worst a false sense of security. But assuming that these kinds of problems will disappear using commercial software puts everyone in danger.

First, because all kinds of software have security flaws. Ever heard of patch Tuesday? Second, because when there is no source code, there is no way of auditing anything and you rely on trusting the vendor. And third, because the biggest OpenSSL users are precisely commercial entities.

However, as easy as it is to say it after the fact, it remains true that there are ways of preventing future Heartbleed-class disasters: more testing, more tooling and more auditing could have prevented this. And do you know what is the prerequisite to do all these things? Resources. Currently the core OpenSSL team consists of ... two individuals. Neither of whom is paid directly for development of OpenSSL. So the real root cause of Heartbleed is lack of money, because there could be a lot more people auditing and crash proofing OpenSSL, if only they were paid to do it.

But ironically, it seems that there is plenty of money at some OpenSSL users, whose business relies heavily on a tool that allows them to communicate securely over the Internet. Looking from this perspective, Heartbleed could have been prevented if any of the commercial entities using OpenSSL had invested some resources in auditing or improving OpenSSL instead of just profiting from it.

So the real root cause of Heartbleed lies in these entities taking away without giving back. And when you look at the list, boy, how they could have given back to OpenSSL. A lot. Akamai, Google, Yahoo, Dropbox, Cisco or Juniper, to name a few, have been using OpenSSL for years, benefitting from the package yet not giving back to the community some of what they got. So think twice before basing part of your commercial success on unpaid volunteer effort, because you may not have to pay for it at the beginning, but later on it could bite you. A few hundred million bites. And don't think that by keeping the source code secret you're doing any better, because in fact you're doing much worse.

Monday 26 August 2013

What is wrong with security: "don't use bcrypt"

You know, security is lately one of my biggest sources of irritation. More so when I read articles like this one. On the surface, the article is well written, even informative. But it also shows off most of what is currently wrong with computer security.

Security, like most other areas of the IT world, is an area of specialization. If you look around, you'll see that we have database, operating system, embedded system, storage and network experts. While it is true that the job role that has the best future prospect is the generalist that can also understand and even learn deeply any subject, it is also true that after a few years of working focused on a specific subject, there is a general tendency to develop deeper knowledge in some subjects than others.

Security is no different in that regard, but has one important difference with all the others: what it ultimately delivers is the absence of something that is not even known. While the rest of the functions have more or less clearly defined goals in any project or organization, security can only provide as proof of effectiveness the lack -or a reduction- of security incidents over time. The problem is, while incidents in other areas of computing are always defined by "something that is not behaving as it should", in security an incident is "something that we did not even know could happen is actually happening".

Instead of focusing on what they don't know, the bad security people focus on what they know. They know what has been used so far to exploit an application or OS, so here they go with their vulnerability and antivirus scanners and willingly tell you if your system is vulnerable or not. Something that you can easily do yourself, using the exact same tools. But it is not often you hear from them an analysis of why a piece of code is vulnerable, or what the risky practices are that you should avoid. Or how the vulnerability was discovered.

And that is part of the problem. Another part of the problem is their seeming lack of any consideration of the environment. In a similar way to the "architecture astronauts", the security people seem to live in a different world. One where there is no actual cost-benefit analysis of anything and you only have a list of known vulnerabilities to deal with, and at best a list of "best practices" to follow. Such as "don't use bcrypt".

And finally, security guys are often unable to communicate in a way that is meaningful to their target audience. Outside a few specialists, most people in the IT field (me included) lack the math skills required to understand the subtle points of encryption, much less the results of the years of analysis and heavy number theory required to even attempt to efficiently crack encryption.

Ironically, the article gets some of these points right. At the beginning of the article, there is an estimation of cracking cost vs. derivation method that should help the reader make an informed decision. There is advice about the bcrypt alternatives and how they stack up against each other.

But as I read further into the article, it seems to fall into all these security traps: for example, the title says "don't use bcrypt", only to say in its first paragraph "If you're already using bcrypt, relax, you're fine, probably". Hold on, what was the point of the article then? And if you try to read the article comments, my guess is that unless you're very strong on crypto, you'll not fully understand half of them and will come out confused and even more disoriented.

But what best summarizes what is wrong with security is the second paragraph: "I write this post because I've noticed a sort of "JUST USE BCRYPT" cargo cult (thanks Coda Hale!) This is absolutely the wrong attitude to have about cryptography"

How is detailing the reasons for using bcrypt a wrong attitude about cryptography? The original article is a good description of the tradeoffs of bcrypt against other methods. That is not cargo cult. Not at least in the same way as "just use a SQL database", "just use a NoSQL database", "just use Windows" or "just use Linux" are cargo cult statements. Those statements are cargo cult only when taken out of context. Like the DBA that indexes each and every field in a table in the hope that sacrificing his disk space, memory and CPU to the cargo cult church will speed things up.

But the original article was not cargo cult. No more than the "don't use bcrypt" article is cargo cult.

I guess that what I'm trying to say is that there is "bad" and "good" security. The "bad" security will tell you all about what is wrong with something and that you should fix all this immediately. The good security should tell you not only what is vulnerable, but also how to avoid creating vulnerabilities in the future. And provide you with ready made and usable tools for the job. And articles like "don't use bcrypt" are frustrating in that they give you almost what you need, but in a confusing and contradictory way.

When I choose a database, or operating system, or programming language, or whatever tool to do some job, I do it having only a superficial knowledge of the trade offs of each option. But I don't have to be a specialist in any of these to decide. I don't know the nuts and bolts of round robin vs. priority based scheduling, or how O(1) task schedulers work. Or the details of B-Tree vs. hash table index implementations. Or the COW strategy for virtual memory. I know the basics and what works best in each situation, mostly out of experience and education. True, with time I will learn the details of some of these as needed. But a lot of the time software developers are making really educated best guesses. And the more complex the subject -and crypto is one of the most complex- the more difficult these decisions are.

If I want to encrypt something, I want to have an encrypt function, with the encryption method as a parameter and a brief explanation of the trade offs of each method. And make it fool proof, without any way of misusing it. Yes, someone will find a way of misusing it and it will probably be a disaster. So find ways of detecting these misuses.
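
A minimal sketch of what such a function could look like, in Python, using password hashing as the example since that is what the bcrypt discussion is about. Everything here is an assumption made for illustration: the names hash_password and verify_password, the method table and the cost settings are hypothetical and do not come from either article. It uses PBKDF2 and scrypt only because they ship with the Python standard library, while bcrypt does not.

```python
import hashlib
import hmac
import os

# Hypothetical method table. The cost settings are illustrative assumptions,
# not recommendations taken from either article.
METHODS = {
    # PBKDF2: widely available, cheap in memory, so cheaper to attack on GPUs.
    "pbkdf2": lambda pw, salt: hashlib.pbkdf2_hmac("sha256", pw, salt, 200_000),
    # scrypt: also costs memory, which makes dedicated cracking hardware pricier.
    "scrypt": lambda pw, salt: hashlib.scrypt(pw, salt=salt, n=2**14, r=8, p=1),
}


def hash_password(password: str, method: str = "scrypt") -> str:
    """Return a self-describing string of the form 'method$salt$digest'."""
    if method not in METHODS:
        raise ValueError(f"unknown method {method!r}, pick one of {sorted(METHODS)}")
    salt = os.urandom(16)  # the caller never handles salts, so cannot reuse one
    digest = METHODS[method](password.encode("utf-8"), salt)
    return f"{method}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Check a password against a string produced by hash_password."""
    try:
        method, salt_hex, digest_hex = stored.split("$")
        salt = bytes.fromhex(salt_hex)
        expected = bytes.fromhex(digest_hex)
    except ValueError:
        return False  # malformed record: refuse instead of guessing
    if method not in METHODS:
        return False
    actual = METHODS[method](password.encode("utf-8"), salt)
    # Constant-time comparison, so timing does not leak how much matched.
    return hmac.compare_digest(actual, expected)
```

The point of the design is that the only decision left to the caller is the method name. Salts, encodings and comparison are handled inside, which removes the most common ways of misusing this kind of function.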

So please security guys, give us tools and techniques to prevent security issues. With a balanced view of their costs and benefits. And let the rest of the world sleep safely in their ignorance of 250 years of number theory. That is your real job. Creating huge repositories of vulnerabilities and malware signatures is not good enough. That in fact does little to protect us from future threats. Give us instead the tools to prevent these in the first place. And in a way that everyone can understand them. Thank you.

Friday 17 May 2013

IT Security: the ones following the rules are those without enough power to override them

With all the talk about IT governance, risk management, security compliance and all that terminology, it seems that most IT people ignore the realities of the environment they are working in.

As an example, take a corporate security department that defines security standards and imposes them on the IT organization for almost all possible situations. All in the name of keeping the company away from security incidents, yes. They dismiss all objections about usability, convenience, and even whether the security standards are relevant to the company's business.

That latter point is a pet peeve of mine. It is very easy to define security standards if you ignore everything else and just apply the highest levels of security to everyone. By doing that, nobody is ever going to come back to you and say that the security is not good enough, because you are simply applying the strongest one. However, unless your company or organization is actually a secret security agency, you're seriously restricting usability and the ability of the systems to actually help people do their jobs. But hey, that's not in my mission statement, right?

What they forget is that applying these standards implies adding overhead for the company. All these security policies not only add time and implementation cost to the company, but also create day to day friction in how people use their tools to accomplish their work.

Not surprisingly, the end result is that all these policies end up being overridden by exception. Let's see a few examples coming from real life. Or real life plus a bit of exaggeration to add some humor (note, in the following paragraphs you can replace CEO with whatever role has enough power to override a policy).

Everyone has to enter a 16 character password that has at least two digits and special characters, and uses words that do not appear in the Bible. That is, until the CEO gets to type that.
Everyone has to use two factor authentication, until the CEO loses his/her RSA token or forgets to take it to the beach resort.
Nobody can relay or forward mail automatically to external accounts. Until the CEO's mailbox becomes full and Outlook does not allow him/her to respond to a critical message.
Nobody can connect their own devices to the office network. Until the CEO brings his/her latest iPad to the office.
Nobody can share passwords, until the CEO's assistant needs to update the CEO's location information in the corporate executive location database. Security forbids delegation for some tasks and this is one of them.
Nobody can use the built in browser password store, until the CEO forgets his/her password for the GMail account that is receiving all the mail forwarded from his/her corporate account.
All internet access is logged, monitored and subject to blacklist filters. Until the CEO tries to download his/her son's latest Minecraft update.
No end user can have admin rights on his/her laptop, until the CEO tries to install the latest Outlook add-on that manages his/her important network of contacts.
USB drives are locked, that is, until the CEO wants to see the interesting marketing materials given away in a USB thumb drive in the last marketing agency presentation, or wants to upload some pictures of the latest executive gathering from a digital camera.

I'm sure you can relate these examples to your real world experience. Now, except for a few perfectly understandable cases of industries or sectors where security is actually essential for the operations of the company, what do you think will happen? Experience tells me that the CEO will get an exception for all these cases.

The corollary is: security policies are only applicable to people without enough power to override them. Which often means that the most likely place for a security incident to happen is in... the higher levels of the company hierarchy. Either that or you make sure the security policy does not allow exceptions. None at all, for anyone. I'm sure that would make the higher company executive levels much more interested in the actual security policies and what they mean for the company they are managing.

Monday 15 April 2013

Record retention and proprietary data formats

My recent experience with an application upgrade left me considering the true implications of using proprietary data formats. I have realized that they are an often overlooked topic, with profound implications that are rarely addressed.

Say you live in a country where the law requires you to keep electronic records for 14 years. Do you think it is an exaggeration? Sarbanes-Oxley says auditors must keep all audit or review work papers for 5 to 7 years. You are carefully archiving and backing up all that data. You are even copying the data to fresh tapes from time to time, to avoid changes in tape technology leaving you unable to read that perfectly preserved tape -or making it very hard, or having to depend on an external service to restore it.

But I've not seen a lot of people ask themselves the question: once you restore the data, which program will you use to read it? Which operating system will that program run on? Which machine will run that operating system?

First, what is a proprietary data format? Simple: anything that is not documented well enough to allow anyone with general programming skills to write a program to extract data from a file.

Note that I'm leaving patents out of the discussion here. Patents create additional difficulties when you want to deal with a data format, but do not completely lock you out of it. They merely make things more expensive, but you'll definitely be able to read your data, even if you have to deal with patent issues, which are a different discussion altogether.

Patented or not, an undocumented data format is a form of customer lock in. The most powerful there is, in fact. It means that you depend forever on the supplier of the programs that read and write that data. But the lock in does not stop here. It also means that you are tying your choices of platform, hardware, software, operating system, middleware, or anything else to whatever your supplier has decided is a dependency for reading your data.

In the last few years, virtualization has helped somewhat with the hardware part. But it still does not remove the problem completely, in that there could be custom hardware or dongles attached to the machine. Yes, it can get even worse. Copy protection schemes are an additional complication, in that they make it even more difficult for you to get at your data in the long term.

So in the end, the "data retention" and "data archiving" activities are really trying to hit a moving target, one that is very, very difficult to actually hit. Most of the plans that I've seen only focus on some specific problems, but all of them fail to deliver an end to end solution that really addresses the ability to read the legacy data in the long term.

I suppose that at this point, most of the people reading this are going back to check their data retention and archiving plans and looking for gaping holes in them. You found them? Ok, keep reading then.

A true data archiving solution has to address all the problems of the hardware and software necessary to retrieve the data over the retention period. If any of the steps is missing, the whole plan is not worth spending on. Unless of course you want your plan to be used as a means for auditors to tick the corresponding box in their checklist. It is ok for the plan to say "this only covers xxx years of retention, we need to review it in the next yyy years to make sure data is still retrievable". That is at least much better and more realistic than saying "this plan will ensure that the data can be retrieved in the following zzz years" without even considering that way before zzz years have passed the hardware and software used will become unsupported, or the software supplier could disappear, with no one able to read the proprietary data format.
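
The "xxx years" in such a plan can be worked out mechanically: the plan covers only as long as its weakest dependency is supported. Here is a small Python sketch of that reasoning. The Dependency class, the function names and the years in the usage lines are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Dependency:
    """Something needed to read the archived data (hypothetical model)."""
    name: str
    supported_until: int  # last year the supplier is expected to support it


def covered_until(dependencies):
    """The plan is only good until the first dependency loses support."""
    return min(d.supported_until for d in dependencies)


def weakest_link(dependencies, retention_end):
    """Return the dependency that expires first if it expires before the
    retention period ends, or None if the whole period is covered."""
    weakest = min(dependencies, key=lambda d: d.supported_until)
    if weakest.supported_until >= retention_end:
        return None
    return weakest


# Usage with made-up numbers: records must stay readable until 2027.
plan = [
    Dependency("tape drive", 2020),
    Dependency("operating system", 2023),
    Dependency("viewer application", 2016),
]
print(covered_until(plan))            # 2016
print(weakest_link(plan, 2027).name)  # viewer application
```

A plan that reports its own weakest link is the honest "review it in the next yyy years" version, as opposed to the checkbox version.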

There is an easy way of visualizing this. Instead of talking about the business side of record retention, think about your personal data. All your photos and videos of your relatives and loved ones, taken over the years. All the memories that they contain, they are irreplaceable and also they are something you're likely to want to access in the long term future.

Sure, photos are ok. They are on paper, or perhaps in JPG files, which are by the way very well documented. But what about video? Go and check your video camera. It is probably using some standard format, but some of them use weird combinations of audio and video codecs, with the camera manufacturer providing a disk with the codecs. What will happen when the camera manufacturer goes out of business or stops supporting that specific camera model? How will you be able to read the video files and convert them to something else? That should make you think about data retention from the right point of view. And dismiss anything that is in an undocumented file format.

Monday 11 February 2013

I just wanted to compile a 200 line C program on Windows

Well, 201 lines to be exact. How foolish I was.

Short story: we have a strange TIFF file. There has to be an image somehow stored there, but double clicking on it gives nothing. By the way, this file, together with a million more of them, contains the entire document archive of a company. Some seven years ago they purchased a package to archive digitized versions of all their paper documents, and have been dutifully scanning and archiving all their documents there since then. After making the effort of scanning all those documents, they archived the paper originals off site, but only organized them by year. Why pay any more attention to the paper archive after all? In the event of someone wanting a copy of an original document, the place to get it is the document archiving system. Only in extreme cases are the paper originals required, and in those cases yes, one may need a couple of hours to locate the paper original, as you have to visually scan a whole year of documents. But it is not that big of a deal, especially thinking about the time saved by not having to classify paper.

All was good during these seven years, because they used the document viewer built into the application, which works perfectly. However, now they want to upgrade the application, and for the first time in seven years have tried to open one of these files (that have the .tif extension) with a standard file viewer. The result is that no standard file viewer can open the documents, yet the old application displays them fine. At best a standard viewer displays garbage, at worst it crashes. The file is 700K in size and the app displays it perfectly, so what exactly is in there?

Some hours of puzzling, a few hexdumps and a few wild guesses later, the truth emerges: the application is storing files with the .tif extension, but was using its own "version" of the .tif standard format. Their "version" uses perhaps the first ten pages of the .tif standard and then goes its own way. The reasons for doing this could be many, however I always try to keep in mind that wise statement: "never attribute to malice what can be adequately explained by incompetence".

The misdeed was, however, easy to fix. A quite simple 200 line C program (including comments) was able to extract the image and convert it to a standard file format. At least on my Linux workstation.
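
The C program itself is not reproduced here, and nothing below reflects the vendor's private variant of the format. As a hedged illustration of the kind of check the hexdump session boils down to, this Python sketch reads the part of a file that the public TIFF specification defines (the byte order mark, the magic number 42 and the offset of the first image file directory) and lists the tag numbers found there. The function names are hypothetical.

```python
import struct


def read_tiff_header(data: bytes):
    """Return (endian_prefix, first_ifd_offset) for a standard TIFF header."""
    if len(data) < 8:
        raise ValueError("too short to be a TIFF file")
    order = data[:2]
    if order == b"II":      # little-endian ("Intel") byte order
        endian = "<"
    elif order == b"MM":    # big-endian ("Motorola") byte order
        endian = ">"
    else:
        raise ValueError("missing TIFF byte order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("magic number is not 42")
    return endian, ifd_offset


def list_ifd_tags(data: bytes):
    """List the tag numbers in the first image file directory (IFD)."""
    endian, offset = read_tiff_header(data)
    if offset + 2 > len(data):
        raise ValueError("IFD offset points past the end of the file")
    (count,) = struct.unpack_from(endian + "H", data, offset)
    tags = []
    for i in range(count):
        entry = offset + 2 + 12 * i  # every IFD entry is 12 bytes long
        if entry + 12 > len(data):
            raise ValueError("IFD entry runs past the end of the file")
        # Each entry holds: tag, field type, value count, value or offset.
        tag, _ftype, _count, _value = struct.unpack_from(endian + "HHII", data, entry)
        tags.append(tag)
    return tags
```

A file that passes the header check but carries unknown tags, or offsets that lead nowhere sensible, is a good hint that the writer followed the first pages of the specification and then went its own way.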

I was very happy with the prospect of telling the good news to the business stakeholders: your data is there, you've not lost seven years of electronic document archives, it is actually quite easy and quick to convert these to a standard format and you can forget about proprietary formats after doing that. However, I then realized that they used Windows, so I had to compile the 200 line C program in Windows just to make sure everything was right.

Checking the source, I could not spot any Linux specific things in the program, all appeared to be fairly vanilla POSIX. However, what if they are not able to compile it, or the program does something differently? This is one of the moments when you actually want to try it, if only to be absolutely sure that your customer is not going to experience another frustration after their bitter experience with their "document imaging" system, and to also learn how portable your C-fu is across OSs. Too many years of Java and PL/SQL and you get used to thinking that every line of code you write has to run unchanged anywhere else.

So I set out to compile the C source on Windows before delivering it. That's where, as almost always, the frustration began. The most popular computing platform became what it is now, among other things, by being developer friendly. Now it seems that it is on its way to becoming almost developer hostile.

First, start with your vanilla Windows OS installation that likely came with your hardware. Then remove all the nagware, crappleware, adware and the rest of things included by your friendly hardware vendor in order to increase their unit margins. Then deal with Windows registration, licensing or both. Then patch it. Then patch it again, just in case some new patches have been released between the time you started the patching and now that the patching round has finished. About four hours and a few reboots later, you likely have an up to date and stable Windows instance, ready to install your C compiler.

Still with me? In fairness, if you already have a Windows machine all of the above is already done, so let's not make much ado about that. Now we're on the interesting part, downloading and installing your C compiler. Of course, for a 200 line program you don't need a full fledged IDE. You don't need a profiler, or debugger. You need something simple, so simple that you think one of the "Express" editions of the much renowned Microsoft development tools will do. So off we go to the MS site in order to download one of these "Express" products.

So you get here and look at your options. Now, be careful, because there are two versions of VS Express 2012. There's VS Express 2012 for Windows 8 and there's VS Express 2012 for Windows Desktop, depending on whether you're targeting the Windows store or want to create... what, an executable? But, I thought Windows was Windows. In fact, I can run a ten year old binary on Windows and it will still work. Oh, yes, that's true, but now MSFT seems to think that creating Windows 8 applications is so different from creating Windows Desktop applications that they have created a different Express product for each. Except for paying VS customers, who have the ability to create both kinds of applications with the same product. Express is Express and is different. And you don't complain too much, after all this is free stuff, right?

As I wanted to create a command line application, with little interest in the Windows Store, and without being sure of whether an inner circle of hell awaited if I chose one or the other, I simply chose VS Express 2010. That would surely protect me from the danger of accidentally creating a Windows Store application, or discovering that command line apps, for example, were no longer considered "Windows Desktop Applications". You may think that I was being too cautious or risk averse at this point, but really, after investing so much time in compiling a 200 line C command line utility on Windows I was not willing to lose much more time with this.

Ah, hold on, the joy did not end there. I finally downloaded VS 2010 Express and started the installation, which dutifully started and informed me that it was about to install .Net 4.0. How good that the .Net 4.0 install required a reboot, as I was starting to really miss a reboot once in a while after all the other reboots I had to do due to the patching. At least the install program was nice enough to resume installation by itself after the reboot. Anyway, 150 MB of downloads later, I had my "Express" product ready to use.

What is a real shame is that the "Express" product seems to be, once installed, actually quite good. I say "seems" because I did not play with it much. My code was 100% portable in fact, and it was a short job to discover how to create a project and compile it. Admittedly I'm going to ship the customer an executable built with debug symbols, as I was not able to find where to turn off debug information. Since the program is 30K in size, that's hardly going to be a problem, and if it is, it's 100% my fault. To be honest, I lost interest in VS Express 2010 quickly once I was able to test the executable and verify that it did exactly the same as the Linux version.

But the point is, in comparison, I can build a quite complete Linux development environment in less than two hours, operating system installation included, incurring zero licensing cost and using hardware much cheaper than the one needed to run Windows. Why is it that to create a Windows program I need to spend so much time?

What happened to the "developers, developers, developers" mantra? Where is it today? Anyone old enough can remember the times when Microsoft gave away free stacks of floppy disks to anyone remotely interested in their Win32 SDK. And those were the days without internet and when CD-ROMs were a luxury commodity. And the days when IBM was charging $700 for their OS/2 developer kit. Guess who won the OS wars?

Things have changed, for the worse. Seriously, Microsoft needs to rethink this model if they want to at least slow their decline. I guess I've discovered one pattern that can probably be applied to any future OS or platform. Today, to write iOS/MacOS programs you need to buy a Mac and pay Apple $100. The day it becomes more difficult, complex, or expensive (as if Apple hardware were cheap), that day will be the beginning of the end for Apple.

Tuesday 5 February 2013

The results of my 2012 predictions - 3 wrong, 8 right

A bit late, but time to review what has happened with my 2012 predictions. Since the score is clearly favorable to me, please allow me the time to indulge in some self congratulation, and offer also my services as a technology trend predictor at least better than big name market analysis firms. No, not really. But nonetheless having scored so high deserves some self appraisal, at least.

The bad

Windows becoming legacy. I was wrong on this one, but only on the timing. Microsoft's latest attempt to revive the franchise is flopping on the market, to the tune of people paying for getting Windows 8 removed from computers and replaced by Windows 7. Perhaps Redmond can reverse the trend over time, perhaps Windows 9 will be the one correcting the trend. But they have already wasted a lot of credibility, and as time passes it is becoming clear that many pillars of the Windows revenue model are not sustainable in the future.

Selling new hardware with the OS already installed worked well for the last twenty years, but the fusion of mobile and desktop, together with Apple and Chromebooks, is already eroding that to a point where hardware manufacturers are starting to have the dominant position in the negotiation.
The link between the home and business market is broken. Ten years ago people were buying computers essentially with the same architecture and components for both places, except perhaps with richer multimedia capabilities at home. Nowadays people are buying tablets for home use, and use smartphones as complete replacements of things done in the past with desktops and laptops.
On the server side, the open source alternatives gain credibility and volume. Amazon EC2 is a key example where Windows Server, however good it is, is being sidelined in the battle for the bottom of the margin pool.

JVM based languages. I was plain wrong on this one. I thought that the start of Java's decline would give way to JVM based alternatives, but those alternatives, while not dead, have not flourished. Rails keeps growing, PHP keeps growing and all kinds of JavaScript client and server based technologies are starting to gain followers.

As for computer security... well, the shakeup in the industry has not happened. Yet. I still think that most of the enterprise level approach to security is plain wrong, focused more on "checklist" security than on actual reflection on the dangers and implications of their actions. But it seems that no one has started to notice except me. Time will tell. In the end, I think this one was more of a personal desire than a prediction in itself.

The good

Mayan prophecy. Hey, this one was easy. Besides, if it were true, I wouldn't have to acknowledge the mistake in a predictions result post.

Javascript. Flash is now irrelevant. Internet connected devices with either no Flash support at all or weak Flash support have massively outnumbered the Flash enabled devices. jQuery and similar technologies now provide almost the same user experience. Yes, there are still some pockets of Flash around, notably games and the VMWare console client, but Flash no longer is the solution that can be used for everything.

NoSQL. I don't have hard data to prove it, but some evidence -admittedly a bit anecdotal- from its most visible representative, MongoDB, strongly suggests that the strengths and weaknesses of both NoSQL and SQL are now better understood. NoSQL is no longer the solution for all the problems, but a tool that, like any other, has to be applied when it is most convenient.

Java. I have to confess that I did not expect Java to decline so quickly, but as I said a year ago, Oracle had to change a lot to avoid that, and it has not. The latest batches of security vulnerabilities (plus Oracle's late, incomplete and plain wrong reaction) have finally nailed the coffin shut for Java in the browser, no chance of going back. A pity, now that we have Minecraft. On the server side, the innovation rate in Java has stagnated and the previously lightweight and powerful framework alternatives are now seen as being as bloated and complex as their standards derived by committee brethren.

Apple. Both on the tablet and mobile fronts. Android based alternatives already outnumber Apple's products in volume, if not in revenue. And Apple still continues to be one of the best functioning marketing and money making machines on the planet.

MySQL. This one really is tied down again to Oracle's attitude. But it has happened, both for the benefit of Postgres and the many MySQL forks (MariaDB, Percona, etc) that keep in their core what made MySQL so successful.

Postgres. In retrospect, that was easy to guess, given the consistent series of high quality updates received in the last few years and the void left by Oracle's bad handling of MySQL and the increasingly greedy SQL Server licensing terms.

Windows Phone. Again, an easy one. A pity, because more competition is always good. As with Windows 8, it remains to be seen if Microsoft can -or wants to- rescue this product from oblivion.

Will there be any 2013 predictions now that we're in February?

On reflection, some of these predictions were quite easy to formulate, if somewhat against what the general consensus was at the time. That's why there are likely not going to be 2013 predictions. I still firmly think that Windows will go niche. That is happening today, but we have not yet reached the "Flash is no longer relevant" tipping point. You'll know that we've arrived there when all the big name technologists start saying that they saw it coming for years. But they have not started saying that. At least not yet.

Anyway, this prediction exercise left my psychic powers exhausted. Which is to say, I don't have that many ideas of how the technology landscape will change during 2013. So as of today, the only prediction I can reliably make is that there won't be 2013 predictions.