Monday, November 29, 2010

10 mistakes every programmer makes
Admit it, you've made mistakes like these


When you start programming, you get disillusioned quickly. No longer is the computer the all-infallible perfect machine – "do as I mean, not as I say" becomes a frequent cry.
At night, when the blasted hobgoblins finally go to bed, you lie there and ruminate on the errors you made that day, and they're worse than any horror movie. So when the editor of PC Plus asked me to write this article, I reacted with both fear and knowing obedience.
I was confident that I could dash this off in a couple of hours and nip down to the pub without the usual resultant night terrors. The problem with such a request is, well, which language are we talking about?
I can't just trot out the top 10 mistakes you could make in C#, Delphi, JavaScript or whatever – somehow my top ten list has to encompass every language. Suddenly, the task seemed more difficult. The hobgoblins started cackling in my head. Nevertheless, here goes…

1. Writing for the compiler, not for people
When they use a compiler to create their applications, people tend to forget that the verbose grammar and syntax required to make programming easier is tossed aside in the process of converting prose to machine code.
A compiler doesn't care if you use a single-letter identifier or a more human-readable one. The compiler doesn't care if you write optimised expressions or whether you envelop sub-expressions with parentheses. It takes your human-readable code, parses it into abstract syntax trees and converts those trees into machine code, or some kind of intermediate language. Your names are by then history.
So why not use more readable or semantically significant identifiers than just i, j or x? These days, the extra time you would spend waiting for the compiler to complete translating longer identifiers is minuscule. However, the much-reduced time it takes you or another programmer to read your source code when the code is expressly written to be self-explanatory, to be more easily understandable, is quite remarkable.
Another similar point: you may have memorised the operator precedence to such a level that you can omit needless parentheses in your expressions, but consider the next programmer to look at your code. Have they memorised it too? Or do they know the precedence of operators in some other language better than this one, so that they misread your code and make invalid assumptions about how it works?
Personally, I assume that everyone knows that multiplication (or division) is done before addition and subtraction, but that's about it. Anything else in an expression and I throw in parentheses to make sure that I'm writing what I intend to write, and that other people will read what I intended to say.
The compiler just doesn't care. Studies have shown that the time spent maintaining code over its lifecycle is easily five times the time spent writing it in the first place. It makes sense to write your code for someone else to read and understand.
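To make that concrete, here is a small sketch in Python. The function and parameter names are invented purely for this illustration; the point is only that both versions produce the same result, while the second one tells the reader what it is for.

```python
# Terse version: correct, but the reader has to decode every name
# and remember that 'and' binds more tightly than 'or'.
def f(a, b, c, d):
    return a > b or c and not d


# Same logic, written for the next person to read it.
# All names here are hypothetical, chosen only for the example.
def should_send_reminder(days_overdue, grace_period_days, is_subscribed, has_opted_out):
    is_past_grace_period = days_overdue > grace_period_days
    wants_reminders = is_subscribed and (not has_opted_out)
    return is_past_grace_period or wants_reminders
```

The parentheses around `not has_opted_out` are redundant as far as the interpreter is concerned, which is exactly why they are there: they are for the reader.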

2. Writing big routines
Back when I was starting out, there was a rule of thumb where I worked that routines should never be longer than one printed page of fan-fold paper – and that included the comment box at the top that was fashionable back then. Since then, and especially in the past few years, methods tend to be much smaller – merely a few lines of code.
In essence, just enough code that you can grasp its significance and understand it in a short time. Long methods are frowned upon and tend to be broken up.
The reason is extremely simple: long methods are hard to understand and therefore hard to maintain. They're also hard to test properly. If you consider that testing is a function of the number of possible paths through a method, the longer the method, the more tests you'll have to write and the more involved those tests will have to be.
There's actually a pretty good measurement you can make of your code that indicates how complex it is, and therefore how probable it is to have bugs – the cyclomatic complexity.
Developed by Thomas J. McCabe Sr in 1976, cyclomatic complexity has a big equation linked to it if you're going to run through it properly, but there's an easy, basic method you can use on the fly. Just count the number of 'if' statements and loops in your code. Add 1 and this is the CC value of the method.
It's a rough count of the number of execution paths through the code. If your method has a value greater than 10, I'd recommend you rewrite it.
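The quick count described above can be sketched in a few lines of Python using the standard `ast` module. This is only the rule of thumb (ifs and loops, plus one), not McCabe's full definition: it ignores things like boolean operators, conditional expressions and comprehensions, and the name `rough_complexity` is my own.

```python
import ast

# Node types counted as decision points in this rough sketch:
# 'if' statements and loops, as in the rule of thumb above.
DECISION_NODES = (ast.If, ast.For, ast.While)


def rough_complexity(source):
    """Return 1 + the number of 'if' statements and loops in the source."""
    tree = ast.parse(source)
    decisions = sum(isinstance(node, DECISION_NODES) for node in ast.walk(tree))
    return decisions + 1


example = '''
def classify(numbers):
    result = []
    for n in numbers:
        if n < 0:
            result.append("negative")
        elif n == 0:
            result.append("zero")
        else:
            result.append("positive")
    return result
'''

# One loop, one 'if' and one 'elif' (parsed as a nested 'if'), plus 1.
print(rough_complexity(example))  # prints 4
```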

3. Premature optimisation
This one's simple. When we write code, sometimes we have a niggling devil on our shoulder pointing out that this clever code would be a bit faster than the code you just wrote. Ignore the fact that the clever code is harder to read or harder to comprehend; you're shaving off milliseconds from this loop. This is known as premature optimisation.
The famous computer scientist Donald Knuth said, "We should forget about small efficiencies, say about 97 per cent of the time: premature optimisation is the root of all evil."
In other words: write your code clearly and cleanly, then profile to find out where the real bottlenecks are and optimise them. Don't try to guess beforehand.
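As a minimal sketch of "profile first", Python's standard library ships a profiler, `cProfile`, along with `pstats` for reading its output. The workload below (`build_report` and `slow_sum_of_squares`) is a made-up stand-in for your own code.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(limit):
    # Deliberately naive: builds a full list before summing.
    return sum([n * n for n in range(limit)])


def build_report(limit):
    # Hypothetical workload standing in for "your program".
    return {"total": slow_sum_of_squares(limit), "label": "squares"}


def profile_report(limit):
    """Run build_report under the profiler and return the stats as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    build_report(limit)
    profiler.disable()

    output = io.StringIO()
    stats = pstats.Stats(profiler, stream=output)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return output.getvalue()


if __name__ == "__main__":
    print(profile_report(200_000))
```

The listing shows where the time actually goes, which is frequently not where the devil on your shoulder said it would.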

4. Using global variables
Back when I started, lots of languages had no concept of local variables at all and so I was forced to use global variables. Subroutines were available and encouraged but you couldn't declare a variable just for use within that routine – you had to use one that was visible from all your code. Still, they're so enticing, you almost feel as if you're being green and environmentally conscious by using them. You only declare them once, and use them all over the place, so it seems you're saving all that precious memory.
But it's that "using all over the place" that trips you up. The great thing about global variables is that they're visible everywhere. This is also the worst thing about global variables: you have no way of controlling who changes them or when they are accessed. You may assume that a global has a particular value before a call to a routine, but it may be different by the time you get control back, and you won't notice.
Of course, once people had worked out that globals were bad, something came along with a different name that was really a global variable in a different guise. This was the singleton, an object that's supposed to represent something of which there can only be one in a given program.
A classic example, perhaps, is an object that contains information about your program's window, its position on the screen, its size, its caption and the like. The main problem with the singleton object is testability. Because they are global objects, they're created when first used, and destroyed only when the program itself terminates. This persistence makes them extremely difficult to test.
Later tests will be written implicitly assuming that previous tests have been run, which set up the internal state of the singleton. Another problem is that a singleton is a complex global object, a reference to which is passed around your program's code. Your code is now dependent on some other class.
Worse than that, it's coupled to that singleton. In testing, you would have to use that singleton. Your tests would then become dependent on its state, much as the problem you had in testing the singleton in the first place. So, don't use globals and avoid singletons.
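One common alternative, sketched here in Python, is to pass the state a routine needs into it explicitly rather than reaching for a global or a singleton. The tax example and every name in it are hypothetical.

```python
# Version 1: hidden dependency on a module-level (global) setting.
tax_rate = 0.20


def price_with_tax_global(net_price):
    # Any code anywhere can change tax_rate, so the result of this
    # call depends on whatever happened to run earlier.
    return net_price * (1 + tax_rate)


# Version 2: the dependency is passed in explicitly.
class PriceCalculator:
    def __init__(self, tax_rate):
        self._tax_rate = tax_rate

    def price_with_tax(self, net_price):
        return net_price * (1 + self._tax_rate)


# Each test can build its own calculator with whatever state it needs,
# without relying on tests that ran before it.
standard = PriceCalculator(tax_rate=0.20)
reduced = PriceCalculator(tax_rate=0.05)
```

With the second version, a test of `reduced` cannot be broken by something that happened to `standard`, because there is no shared state left to fight over.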

5. Not making estimates
You're just about to write an application. You're so excited about it that you just go ahead and start designing and writing it. You release and suddenly you're beset with performance issues, or out-of-memory problems.
Further investigations show that, although your design works well with a small number of users, or records, or items, it does not scale – think of the early days of Twitter for a good example. Or it works great on your super-duper developer 3GHz PC with 8GB of RAM and an SSD, but on a run-of-the-mill PC, it's slower than a Greenland glacier in January.
Part of your design process should have been some estimates, some back-of-the-envelope calculations. How many simultaneous users are you going to cater for? How many records? What response time are you targeting?
Try to provide estimates to these types of questions and you'll be able to make further decisions about techniques you can build into your application, such as different algorithms or caching. Don't run pell-mell into development – take some time to estimate your goals.
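A back-of-the-envelope calculation really can be this short. Every figure in the following Python sketch is an invented assumption; the value lies in writing your own numbers down before you design anything.

```python
# Every figure below is a made-up assumption for illustration;
# substitute your own targets.
simultaneous_users = 5_000
requests_per_user_per_minute = 6
records = 20_000_000
bytes_per_record = 500

requests_per_second = simultaneous_users * requests_per_user_per_minute / 60
dataset_gib = records * bytes_per_record / 1024**3

print(f"Peak load: about {requests_per_second:.0f} requests per second")  # about 500
print(f"Data set: about {dataset_gib:.1f} GiB before indexes")  # about 9.3
```

Numbers like these immediately tell you whether the data fits in memory on one machine and whether a single server is likely to cope, long before the first line of production code is written.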

6. Off by one
This mistake is made by everyone, regularly, all the time. It's writing a loop with an index in such a way that the index is incremented once too often or once too seldom. Consequently, the loop is traversed an incorrect number of times.
If the code in the loop is visiting elements of an array one by one, a non-existent element of the array may be accessed – or, worse, written to – or an element may be missed altogether. One reason why you might get an off-by-one error is forgetting whether indexes for array elements are zero-based or one-based.
Some languages even have cases where some object is zero-based and others where the assumption is one-based. There are so many variants of this kind of error that modern languages or their runtimes have features such as 'foreach loops' to avoid the need to count through elements of an array or list.
Others use functional programming techniques called map, reduce and filter to avoid the need to write explicit loops over collections. Where you can, use these modern 'functional' constructs rather than hand-counted loops.
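Here is a short Python sketch of the difference. The `readings` list is invented data; the first loop contains a deliberate off-by-one bug, and the later versions have no index to get wrong.

```python
from functools import reduce

readings = [12, 7, 30, 18]

# Index-based loop with a classic off-by-one bug: starting at 1
# silently skips the first element.
total_buggy = 0
for i in range(1, len(readings)):
    total_buggy += readings[i]

# Iterating over the elements directly leaves no index to get wrong.
total = 0
for reading in readings:
    total += reading

# The functional style: filter, map and reduce.
large = filter(lambda r: r > 10, readings)
doubled = map(lambda r: r * 2, large)
doubled_total = reduce(lambda acc, r: acc + r, doubled, 0)

print(total_buggy, total, doubled_total)  # prints 55 67 120
```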

7. Suppressing exceptions
Modern languages use an exception system as an error-reporting technique, rather than the old traditional passing and checking of error numbers. The language incorporates new keywords to dispatch and trap exceptions, using names such as throw, try, finally and catch.
The remarkable thing about exceptions is their ability to unwind the stack, automatically returning from nested routines until the exception is trapped and dealt with. No longer do you have to check for error conditions, making your code into a morass of error tests.
All in all, exceptions make for more robust software, providing that they're used properly. Catch is the interesting one: it allows you to trap an exception that was thrown and perform some kind of action based upon the type of the exception.
The biggest mistakes programmers make with exceptions are twofold. The first is that the programmer is not specific enough in the type of exception they catch. Catching too general an exception type means that they may be inadvertently dealing with particular exceptions that would be best left to other code, higher up the call chain. Those exceptions would, in effect, be suppressed and possibly lost.
The second mistake is more pernicious: the programmer doesn't want any exceptions leaving their code and so catches them all and ignores them. This is known as the empty catch block. They may think, for example, that only certain types of exceptions might be thrown in their code; ones that they could justifiably ignore.
In reality, other deadly runtime exceptions could happen – things such as out-of-memory exceptions, invalid code exceptions and the like, for which the program shouldn't continue running at all. Tune your exception catch blocks to be as specific as possible.
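In Python terms, the two habits look something like the sketch below. The port-parsing functions and the default of 8080 are hypothetical, chosen only to give the exceptions something to do.

```python
def parse_port_swallowing(text):
    # Anti-pattern: catches every ordinary exception and ignores it,
    # so a wrong type or any other bug is silently lost.
    try:
        return int(text)
    except Exception:
        pass  # the "empty catch block"


def parse_port(text, default=8080):
    # Catch only the failure we know how to handle: text that is
    # not a valid integer. Anything else propagates to the caller.
    try:
        return int(text)
    except ValueError:
        return default


print(parse_port("443"))            # prints 443
print(parse_port("not a number"))   # prints 8080
print(parse_port_swallowing(None))  # prints None; the TypeError was hidden
```

Calling `parse_port(None)` raises a `TypeError`, which is what you want: passing `None` is a bug in the caller, not a badly typed port number.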

8. Storing secrets in plain text
A long time ago, I worked in a bank. We purchased a new computer system for the back office to manage some kind of workflow dealing with bond settlements. Part of my job was to check this system to see whether it worked as described and whether it was foolproof. After all, it dealt with millions of pounds daily and then, as now, a company is more likely to be defrauded by an employee than an outsider.
After 15 minutes with a rudimentary hex editor, I'd found the administrator's password stored in plain text. Data security is one of those topics that deserves more coverage than I can justifiably provide here, but you should never, ever store passwords in plain text.
The standard for passwords is to store the salted hash of the original password, and then do the same salting and hashing of an entered password to see if they match.
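A minimal sketch of that scheme in Python follows. I've picked PBKDF2 from the standard library's `hashlib` as the hash; that choice, the iteration count and the function names are my assumptions for the example, not a recommendation for any particular system.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password):
    """Return (salt, digest) for storage instead of the password itself."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Re-run the same salting and hashing and compare the results."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(candidate, stored_digest)
```

You store the salt and the digest, never the password. Two users with the same password end up with different digests because their salts differ.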
Here's a handy hint: if a website promises to email you your original password should you forget it, walk away from the site. This is a huge security issue. One day that site will be hacked. You'll read about how many logins were compromised, and you'll swallow hard and feel the panic rising. Don't be one of the people whose information has been compromised and, equally, don't store passwords or other 'secrets' in plain text in your apps.

9. Not validating user input
In the good old days, our programs were run by individuals, one at a time. We grew complacent about user input: after all, if the program crashed, only one person would be inconvenienced – the one user of the program at that time. Our input validation was limited to number validation, or date checking, or other kinds of verification of input.
Text input tended not to be validated particularly. Then came the web. Suddenly your program is being used all over the world and you've lost that connection with the user. Malicious users could be entering data into your program with the express intent of trying to take over your application or your servers.
A whole crop of devious new attacks were devised that took advantage of the lack of checking of user input. The most famous one is SQL injection, although unsanitised user input could precipitate an XSS attack (cross-site scripting) through markup injection.
Both types rely on the user providing, as part of normal form input, some text that contains either SQL or HTML fragments. If the application does not validate the user input, it may just use it as is and either cause some hacked SQL to execute, or some hacked HTML/JavaScript to be produced.
This in turn could crash the app or allow it to be taken over by the hacker. So, always assume the user is a hacker trying to crash or take over your application and validate or sanitise user input.
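Both attacks can be demonstrated in a few lines of Python with the standard `sqlite3` and `html` modules. The table, the data and the malicious strings are all invented, and parameterised queries and escaping are just two common defences, not a complete validation strategy.

```python
import html
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (name TEXT, email TEXT)")
connection.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

malicious_name = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so the quote in the
# input ends the string literal and the rest is executed as SQL.
unsafe_sql = "SELECT email FROM users WHERE name = '" + malicious_name + "'"
unsafe_rows = connection.execute(unsafe_sql).fetchall()

# Safer: a parameterised query passes the input as data only.
safe_rows = connection.execute(
    "SELECT email FROM users WHERE name = ?", (malicious_name,)
).fetchall()

# For markup injection, escape text before placing it in HTML.
comment = "<script>alert('hi')</script>"
escaped_comment = html.escape(comment)

print(unsafe_rows)  # prints [('alice@example.com',)] although no user matched
print(safe_rows)    # prints []
```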

10. Not being up to date
All of the previous mistakes have been covered in depth online and in various books. I haven't discovered anything new – they and others have been known for years. These days you have to work pretty hard to avoid coming into contact with various modern design and programming techniques.
I'd say that not spending enough time becoming knowledgeable about programming – and maintaining that expertise – is in fact the biggest mistake that programmers make. They should be learning about techniques such as TDD or BDD, about what SLAP or SOLID means, about various agile techniques.
These skills are of equal or greater importance than understanding how a loop is written in your language of choice. So don't be like them: read McConnell and Beck and Martin and Jeffries and the Gang of Four and Fowler and Hunt & Thomas and so on. Make sure you stay up to date with the art and practice of programming.
And that concludes my top 10 list of mistakes programmers make, no matter what their language stripe. There are others, to be sure, perhaps more disastrous than mine, but I would say that their degree of dread is proportional to the consequences of making them.
All of the above were pretty dire for me the last time I made them. If you have further suggestions or calamities of your own, don't hesitate to contact me and let me know.

Friday, November 26, 2010

The Next Really, Really, Really, Big Thing



Everybody should be excited about the next big thing. And why not? It’s very, extremely big. Even bigger than anything that came before. No, really, it’s that freakin’ HUGE.

If you don’t want to get left behind, you’ve got to hop on this right away. Of course, you will need to be fast and smart and work late nights, but it will be worth it. You can’t go halfway on a thing like this. It’s all or nothing, baby!

I’m here to tell you what this big thing is. But first, let’s take a quick look at past big things so that we can see why this one is so much bigger.

A Short History of Big Things

We live in interesting times. Conventional wisdom says that it takes about 20 years for new technology to take its full effect. These days, innovation cycles are much shorter, so we’re getting new stuff before we really know what to do with the old.

Many economists believe that these time lags account for the productivity paradox (i.e. it’s notoriously difficult to measure what we really get out of all this new stuff). So it is always hard to see the next big thing until it’s already really big and you’ve missed out.

Nevertheless, there are always pundits and gurus to point the way. Unfortunately, they are usually only partly right, which makes the history of big things somewhat muddled:

Digital Media: Sometime back in the ’90s, an extremely confident young man appeared on the TV show 60 Minutes and announced that he was going to put their company (CBS) out of business.

I don’t remember what actually happened to the guy, but last year CBS earned about a billion dollars in operating profit (Yahoo made about a tenth as much). 60 Minutes, of course, is still on the air and still gets huge ratings.

E-Commerce: During the dot-com boom, many pointed out that a lot of the web revenues were driven by advertising (which, for some reason, is supposed to be a bad business). However, selling things over the web was infinitely more promising.

Of course, many of those e-commerce start-ups failed, some did okay and some did extremely well. Today, Amazon.com is enormously successful, but really not in Wal-Mart’s league. I was at the mall the other day and it seemed pretty crowded.

Search: After the crash in 2000, Search emerged as the new, new thing. Google has made a bundle on this one (and some regional players, like Yandex in Russia and Baidu in China, have also done well). Yahoo and Microsoft… not so much.

Social Media: This is the most recent big thing (and, of course, has a big movie to prove it). Facebook has 500 million members, but profits remain elusive. Others, such as MySpace, Friendster and Digg… well, we’ll see.

Big Things That Last

Of course, the biggest things get so big that they last for a very long time. Jim Collins profiled a bunch of them in his book Built to Last. He studied firms like Hewlett Packard, Sony and General Electric and found that much of what we hear about really big things is untrue.

For instance, they often don’t start with very good ideas. In fact, sometimes they begin with lousy ones (apparently Sony’s first product was a rice cooker). Nor do they tend to have charismatic, visionary leaders. What they do have is a lot of talented people who work as a team.

It seems to me, this is where a lot of technology-driven companies go wrong. We glamorize the vision and forget that it is people who actually make it happen. Moreover, because our globalized, digitized world is so complex, these people have very diverse skills and perspectives and need to operate in an uncertain environment.

Getting really smart, driven people to work together well is the truly BIG thing.

Winning the Talent War

A while back, I wrote a post about how to win the war for talent, and I made the point that talent isn’t something you acquire, it’s something you build. I think it’s worth summarizing the main points here:

In-House Training: While third-party training can sometimes be helpful, having an in-house training program is much more valuable. Companies like GE and McDonald’s have put enormous resources into training campuses, but even small companies can build good programs with a little effort and focus.

An often overlooked benefit of in-house training is the trainers themselves, who are usually mid- and senior-level employees. They get to refresh basic concepts in their own minds while they teach more junior people. This also helps the old guard get invested in the next generation.

Perhaps most importantly, training helps to bring people together who would ordinarily not meet and improves connectivity throughout the company.

Focus On Intrinsic Motivation: Most people want to do a good job. Of course, money is important, but the best people want to achieve things and to be recognized for doing so. Often, time and effort wasted on designing elaborate compensation schemes could be better spent on getting people recognized for true accomplishments.

A senior executive taking a minute or two to stop and recognize a job well done can often mean more than a monetary reward. That doesn’t mean that people don’t need to be paid what they’re worth, but anybody can sign a check. Paying big salaries is not, and will never be, a long-term competitive advantage.

Best Practice Programs: One way for people to shine is to have regular meetings where they can present successful initiatives to their peers. This also helps increase connectivity and gets good ideas spread throughout the company.

Another approach is to build an in-house social network where people can share ideas and rate each other’s work (there are plenty of applications similar to SlideShare that can be adapted easily and cheaply to a company intranet).

Coaches and Mentors: Getting regular feedback is essential for development. We’re generally pretty bad judges of our own efforts. Some companies have formal mentoring programs that are quite successful. However, what is most important is a realization throughout the company that senior people are responsible for helping to develop junior ones.

Firing Nasty People: A long time ago, I decided that I didn’t want to work with nasty people, so I started firing them regardless of competency. I’ve been amazed at what a positive effect it had and have never looked back. Nasty people invariably destroy more than they create.

A Community of Purpose: Most of all, people need to believe in what they do; that their work has a purpose and makes a positive impact. Nothing motivates better than a common cause that people value above themselves.

So the next big thing is really not much different than the previous ones. There will be an interesting idea that has real value and most of the companies who jump on it will screw it up and lose a lot of money.

The difference, of course, will be made by the people who are working to solve everyday problems, how they are developed and how they treat each other.