Thursday 3 July 2008

Maintenance programmer and performance

It's almost universal. And unavoidable. Yet I'm not sure if it's right. But it's true. At almost any organization whose goal is to develop and maintain software, there is a knowledge-based hierarchy. The best and brightest individuals tend to take care of the more complex programming and architecture issues. The juniors and newcomers are assigned the lowest value-added tasks. Usually, organizations offer career options and other incentives to their best people, in the hope of retaining them and extracting more value from the time they spend in the office. The less experienced, productive or talented people are invariably relegated to jobs that are perceived as adding less value.

Agile/Extreme Programming methodologies try to do away with this segregation by using "pair programming" and making everyone accountable for all the code in the system, but this is difficult to achieve in practice. It needs a team that is more or less homogeneous in experience and capability, because without an equal footing "pair programming" ends up morphing into some sort of mentoring scheme. In that scheme, pairs are allocated in such a way that there's always a "master" on the subject working together with someone less experienced. In an ideal world, they will become true pairs given enough time, but time is money for a business that wants results here, now and cheap. Nonetheless, and although I don't have field experience with it, I think that pair programming properly done can be one of the greatest assets of a programming team.

After all, it is in the best interest of a business to maximize the return on every penny it puts at risk (read: invests in a software package), but this hierarchical scheme usually has an impact on the way software maintenance is done, because the best people are usually assigned to the "best problems" and maintenance is often regarded as a necessary evil.

I've never understood that point of view. All the literature about business software development sends the same message: emphasize readable and clear code, based on field data showing that maintenance is often half of the total cost of any application, if not more. It should follow logically that, if maintenance costs are that high, maintenance treated as its own profit centre should be given the importance it deserves. But in practice that is not often done.

Part of the reason, I suppose, stems from the wide scope of the word "maintenance". From simple changes in a user interface or report layout to whole data model changes, everything often falls into the same bucket of "maintenance". While the profile and capabilities needed for one task versus another are obviously very different, the fact is that the same resources are usually assigned to both kinds of tasks.

And that invariably impacts performance. I'm sure that even strict adherence to the most rigid change management methodology will still leave business systems exposed to unexpected performance problems, if only because assessing the performance impact of changes is still an unexplored area.

Add a bit here, add a bit there.


Let's tell a story that illustrates this very well. Bob is an entry-level developer assigned to minor enhancements in the company ERP system. One day, Bob receives a call from a change manager who has already completed the immense paperwork required to add a simple data point to a data entry screen. When a customer order is entered, the system has to display the outstanding value of all orders still pending delivery to that customer.

While Bob is relatively new to programming, he is able to complete the task and creates a simple subroutine (or method, or user exit) that computes the outstanding value of all orders for an individual customer. The change is tested, moved to live, and for Bob it's all well and good.

Months later, an internal control manager who is part of the internal audit team needs to act on a point raised in the last audit: the excessive number of customer orders entered that exceed the order volume agreed per customer. He calls for a change so that the data entry clerk can see not only the outstanding amount but also the outstanding volumes.

Bob receives the request and changes the subroutine (or method, or user exit) to display that information on screen. It makes sense to do it at the same time that the order value is calculated, since the two are closely related. The change is tested, moved to live, and for Bob it's all well and good. So good, in fact, that he's asked to create a report listing all that information, since the customer service manager will start his daily operations meeting with that information on the desk so the team can arrange and prepare shipments.

Being smart, Bob creates a report that calls his function once for each customer in the customer master, checking for outstanding order volume and adding the customer to the report if there is any. The report runs a bit slow, but that's not a problem because it is run overnight. The change is tested, moved to live and everything is good.
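To make the shape of Bob's report concrete, here is a minimal sketch in Python. The original story does not say what language or data access the ERP used, so everything here is an assumption for illustration: the `Order` record, the in-memory list of orders and all function names are hypothetical. It contrasts the per-customer loop from the story with a single grouped pass over the orders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical record; a real ERP would read this from its order tables.
    customer: str
    value: float
    volume: float
    delivered: bool


def outstanding_for_customer(orders, customer):
    """Bob's routine: scan the orders to total one customer's pending ones."""
    value = volume = 0.0
    for o in orders:
        if o.customer == customer and not o.delivered:
            value += o.value
            volume += o.volume
    return value, volume


def report_per_customer(orders, customer_master):
    """Bob's report: one scan per customer in the master, even idle ones."""
    report = {}
    for customer in customer_master:
        value, volume = outstanding_for_customer(orders, customer)
        if volume > 0:
            report[customer] = (value, volume)
    return report


def report_single_pass(orders):
    """Alternative: one pass over the orders, grouped by customer."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for o in orders:
        if not o.delivered:
            totals[o.customer][0] += o.value
            totals[o.customer][1] += o.volume
    return {c: (v, vol) for c, (v, vol) in totals.items() if vol > 0}
```

In this sketch the first report does work proportional to customers times orders, while the second does work proportional to the number of orders only. Whether that difference matters depends on the context in which the code runs, which is exactly where the story goes next.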

Bob moves on to other projects, and eventually leaves the company. He's replaced by Sean, a person as talented and motivated as Bob, who later on receives a request to change the report. As the customer service manager focuses his daily meeting on the activities to be done in the next 24 hours, he needs the outstanding order volume split by delivery date, so as not to waste time reviewing something that is not an immediate concern.

Sean examines the report and, without hesitation, changes Bob's function so that it breaks the totals down by delivery date and adds them up at the end. Of course, the function is even slower and the report more so, but for Sean it is the most logical way to complete the work, since it involves the least amount of change and effort. Also, the report is giving 24 times more information, so it's reasonable that it takes more time. Oh, and besides, the report is run overnight, so performance should not be too big a concern...

Sean's change is tested, moved to live and all is good... well, no. Now the whole data entry team is screaming because they cannot enter orders. The system has become inexplicably slow, and what used to be acceptable, if not exactly snappy, now takes ages. Of course, the explanation is that the same program logic is being used in two very different contexts (a nightly batch report and live data entry) with very different requirements. Each change to the program makes perfect sense in its own context, yet the cumulative changes lead to a problem.
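One way to picture the conflict is to imagine the two contexts with their own entry points. The following is only a sketch under assumptions, not what the real system did: orders are plain dictionaries, and both function names are hypothetical. The interactive screen needs two totals for one customer, while the overnight report needs the breakdown by delivery date, so the expensive part is kept out of the path the data entry clerks depend on.

```python
from collections import defaultdict
from datetime import date


def outstanding_totals(orders, customer):
    """Interactive path: plain totals only, no per-date breakdown."""
    value = volume = 0.0
    for o in orders:
        if o["customer"] == customer and not o["delivered"]:
            value += o["value"]
            volume += o["volume"]
    return value, volume


def outstanding_by_delivery_date(orders, customer):
    """Batch path: the per-date breakdown the overnight report asks for."""
    by_date = defaultdict(float)
    for o in orders:
        if o["customer"] == customer and not o["delivered"]:
            by_date[o["delivery_date"]] += o["volume"]
    return dict(by_date)


# Hypothetical sample data, for illustration only.
orders = [
    {"customer": "acme", "value": 100.0, "volume": 2.0,
     "delivered": False, "delivery_date": date(2008, 7, 4)},
    {"customer": "acme", "value": 40.0, "volume": 1.0,
     "delivered": False, "delivery_date": date(2008, 7, 5)},
    {"customer": "acme", "value": 60.0, "volume": 5.0,
     "delivered": True, "delivery_date": date(2008, 7, 1)},
]

screen_totals = outstanding_totals(orders, "acme")
report_rows = outstanding_by_delivery_date(orders, "acme")
```

The trade-off is some duplicated filtering logic in exchange for being able to change the report without touching the order entry screen.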

I've seen this pattern happen many times in the real world (in fact, this story is true except for the people's names). The problem would have been avoided if the whole system had been thoroughly tested, or if a close inspection of the code had revealed that it was used in multiple places, possibly with conflicting purposes.

Of course, had Sean or Bob been mentored, they would have included some comments or documentation somewhere. Had complete performance testing of the whole system been done, the problem would have been detected. But who does that for a seemingly innocent change in some report output or for an information-only label on a data entry screen?

The irony is that, when the problem is detected, poor Sean is blamed for his poor coding practices, even though he had few choices in how to proceed with the change. Had he created another procedure, he would have been seen as a "copy & paste" programmer, not taking advantage of functionality already there and investing more time than necessary in the change. Had he said "look, it's very inefficient to loop over customers just trying to find one with outstanding orders, let's do this the other way round as it will be more efficient", he would have been told that performance was not that important. Had he spotted the actual problem, that the code he was reusing was also used in an interactive data entry screen, he would have been told that he was not assigned to that part of the code.

The curse of the maintenance programmer


All of the above conspires against the maintenance developer. Regardless of their experience and skill level, they will always be caught in the middle of discussions. The IS side will argue about costs and resource limitations and the business side will argue about costs and deadlines.
Plus, an existing application forms part of a closely interwoven web of processes, knowledge and business practices. It's not easy to justify any refactoring once an application is live, if only because the benefits are never apparent in the short term.

I've seen very talented developers despair when assigned to maintenance roles. Not because of the quality of the challenge or problem they have to face, as some systems can be more complex to maintain than to develop given the number of additions made after go-live. No, the reason for their despair is that they have to deal with a codebase inherited not only from the original developers but also from anybody else who has sat in their chairs before. When they are asked to do anything with the code, and after overcoming the "complete rewrite syndrome", they may come to feel comfortable with it. However, when faced with a change they are always compelled by management to take the shortest (read: cheapest) route. As we know from the travelling salesman problem, always taking the shortest next step does not necessarily lead to the optimal solution overall.
