Thursday, September 14, 2006

Into bit twiddling

*Note * Please avoid if you are not interested in technical drivel.

Let us continue with the earlier topic which I digressed from.
Now , bit twiddling at the low level is just that - fool around with bits and bytes to do things more 'efficiently'.

Like the classic bitcount :

Naive :
int bitcount(unsigned int mask){
int retval = 0;
int indx = 1;
int count = 32;
while (count --){ retval += (indx & mask) ? 1 : 0; indx <<= 1; }
return retval;
Bad indeed !

Better :
int bitcount(unsigned int mask){
int retval = 0;
while (mask){ mask &= (mask - 1); retval ++; }
return retval;

Just to show the relative differences between both algo's , you do have better options :-)

Similarly , you can have one for finding out the lsb (least significant bit).
a) iterate from bitpos 0 to 31 ,
b) Extract only lsb (mask &= (-mask);) use '&' to identify which all block's have a bit set : a few '&' , compare and '+''s (all relatively low cost when done in ctx of registers).
c) Have a lsb_pos[16k] (or so) array which hold's the lsb positions and use a few conditions and shifts to reduce problem to a array lookup + offset.
d) drop to assembly and use that (single instruction in a lot of cases) ;-)

If you are going to have a random input space , then iterating from 0 to 31 makes no sense for finding lsb. On the contrary , if you are sure that there is high probability of success within the first few lsb's , then a brute force approach from 0 to 31 might be good !
Similarly , if you are going to have a large number of array's and the lsb_pos array might get kicked out of L2 , then relying on that approach might not be too good - (b) or even (a) might be better ! A cache miss is quite expensive in modern processors.

What I am getting at is , runtime aspects of a piece of code is not just dependent on the complexity - but also on various other aspects. Ofcourse complexity plays a large part (I will show an example below where it is inverse) and for most cases , it should be the first tool to inspect any piece of code/idea - but dont rely on it blindly. It is asymptotic behaviour in the face of large input space : which rarely happens in actual practise.
Even when the complexity of the different approachs might look the same , the runtime aspects could be markedly different : more so if start considering the actual implementation aspects.

Now to move on to a slightly counter intutive example.
Consider that at every so often , you will generate a list of probable options and 'tentative weights' associated with each. Each weight is a heurestic guess as to how good the option is - and you will essentially use this to select the option and do a search to find out the actual value (those familiar with game theory can consider this as weighting the options in a minimax or alphabeta tree).
If you find a option which returns a value above some minimum - you are done and can 'return'.
Also assume that your 'guesses' are reasonably good (but not fool proof ofcourse ;-) )
Now , if you are given this problem , how will you pick the option to be tried ?

When I ask this question to interviewee's , without batting an eyelid they answer with all sorts of sorting algorithms : ranging from bubblesort to heapsort and some would even go as far as to write down the code in their enthusiasm to get an easy answer.
But , they miss a fundamental problem - they did not study the probabilistic aspects of the problem space properly.

In this problem , if the first option is the right choice it makes no sense to search the others at all !
And if you 'guess' is a reasonably good one , then the probability of you needing to go all the way down to the last option is pretty low !!
So , in this particular case , a brute force "scan for highest and return" approach might actually make a lot of sense as compared to first sorting all options and then picking from the sorted set.
Hence , your order 'n ^ 2' approach could work out better than 'n log n'.
Ofcourse , implict in the 'n log n' approach is the 'cost' of doing the actual sort : both in terms of processing and memory.
Actually , this is the approach used by most chess programs in selecting the best move from the generated set - so , no I am not yanking your chain about this problem :-).

These might look like corner cases , and maybe they are corner cases : but understanding your problem and the runtime aspects go a long way in designing a solution which will have low 'cost' (performance tuning , rewriting , etc).

Recently , I was faced with another interesting problem - not really a problem , but more like reviewing a design. There was this proof of concept being discussed to add something to the server. First of all , it was something which would be one of the many options and secondly it was not going to be something which would be 'touched' too much.
The interface was going to be a simple 'send to all servers' and another being 'notify me when you recieve anything from other servers'.
Simple 2 methods - add another to initialise and the whole subsystem would have like 3 exposed interface methods.
This new subsystem would essentially initialise an external 'provider' , configure it with some configuration values (typically as part of init() ) and then it would be ready to send and recieve.

But voila , the design presented had like 10+ classes , a bunch of interfaces , etc : all for what a simple util class with 3 method's would have sufficed : that too for something which was initially defined as potentially throwaway work.
It was overdesigning at its best : without even understanding what the problem was to be solved , what the infrastructure provides , what were its contracts with the api users , etc.
Worse was to convince them that it sucked - the design had all the common 'design patterns' in it , so how could it be wrong ! :-)

Somewhere down the line , a lot of developers forget that it is not our job to design marvellous class hierarchies and abuse design patterns. The intent is to solve problems - and everything else is an aid in this quest : if you find that something is making your life miserable , dont do it !
Introducing 10 classes to do something which could be done by 1 , and offering nothing of utility to anyone - but just to make the code 'look good' is overkill.
I find that a lot of java programmers in particular are prone to this sort of overdesigning malady.
Sad indeed is the state of a lot of projects just 'cos of attitude like this.
You should be pragmatic in your design and yet make sure that you do not design yourself into a corner !

So moral of the story ? No moral - code a lot and code well !
Learn from the code written by more expierenced developers , and you will learn a lot.
Programming still remains a craft - which is why some codebases are very intutive and a joy to work with while still remaining highly performent , while others suck ass.


Post a Comment

Links to this post:

Create a Link

<< Home