bit twiddling ...
I find myself at home relaxing most of the time with nothing to do ... and it is irritating!
Luckily, I still have a cygwin compiler here at my parents' comp and some old chess code that I had written ... so it is back to chess!
Typically when you write code, the general attitude is: get it right first, and worry about performance later. This does help in a lot of cases, and as a general rule of thumb it might be fine.
But from my experience, I see something like this happening to a lot of novice chess programmers:
A) You finish your code, profile/optimise a few times, and then hit a brick wall.
You end up with the inefficiency spread across the whole codebase such that nothing registers as a hotspot, but the whole thing sucks.
It really does make sense to take performance into account when writing code. Premature optimisation really is an evil, but inefficient code is not the alternative!
If you are going to use so many lookup tables that they get evicted from L2, the memory latency and, worse, the TLB thrashing will kill your program.
If you use branches without restraint, the branch prediction table (BPT) will not be able to accommodate your loops and you will end up messing up your pipelines.
I have seen people muck up even simple things like loop invariants, which might not be simple enough for the compiler to optimise and yet are trivial enough for a human to fix.
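To make the loop invariant point concrete, here is a made-up little sketch (the names are placeholders, not anything from my engine):

extern int piece_weight(int row);   /* opaque scoring call, lives elsewhere */

/* Invariant work redone on every iteration: the call and the row*8
   offset never change inside the loop. */
int sum_row_slow(const int *board, int row) {
    int col, sum = 0;
    for (col = 0; col < 8; col++)
        sum += board[row * 8 + col] * piece_weight(row);
    return sum;
}

/* The same loop with the invariants hoisted out by hand. */
int sum_row_fast(const int *board, int row) {
    const int *p = board + row * 8;
    int w = piece_weight(row);
    int col, sum = 0;
    for (col = 0; col < 8; col++)
        sum += p[col] * w;
    return sum;
}

gcc will often hoist this sort of thing itself at -O2, but only if it can prove the call has no side effects, which it cannot do here.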
Compilers are pretty dumb when it comes to a lot of optimisations, and if you don't want to sprinkle your code with assembly (I hate that), it might make sense to 'guide' the compiler into generating optimal code. For example, make your code amenable to being converted to cmov (again, a rule of thumb :-P), and reduce branches if you can help it. Sometimes the overhead of a few extra cycles spent executing some code is worth it if you can preserve the pipelines.
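For instance (again a made-up sketch), a conditional update written as a plain select of two already-computed values, with no side effects, is the kind of shape that maps naturally onto a cmov:

/* Branchy form: may become a conditional jump, which hurts when the
   comparison is unpredictable. */
int clamp_branchy(int score, int limit) {
    if (score > limit)
        score = limit;
    return score;
}

/* Same thing as a plain select with no side effects - the sort of
   shape compilers happily turn into a cmov.  (Check the listing to be
   sure; a good compiler may well emit a cmov for both versions.) */
int clamp_selecty(int score, int limit) {
    return score > limit ? limit : score;
}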
I remember once replacing a branch/assignment with a shift/and ... it helped quite a bit, though on the face of it, it looked more expensive!
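I don't have that code any more, so here is a contrived example of the same flavour of trick (not the original):

/* Branchy: count how many scores in a list are negative. */
int count_negative_branchy(const int *v, int n) {
    int i, count = 0;
    for (i = 0; i < n; i++)
        if (v[i] < 0)
            count++;
    return count;
}

/* Branch-free: a shift/and pulls the sign bit out directly.  It looks
   like more work per element, but there is nothing left for the branch
   predictor to guess.  (Assumes 32-bit ints and the usual arithmetic
   right shift of signed values that mainstream compilers give you.) */
int count_negative_shifty(const int *v, int n) {
    int i, count = 0;
    for (i = 0; i < n; i++)
        count += (v[i] >> 31) & 1;   /* 1 if v[i] is negative, else 0 */
    return count;
}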
Read the processor optimisation manuals, even if you code in a high level language!
The insight into how the processor works is enlightening ... don't rely on it too much while coding, but knowing how your code might get executed helps in writing better code.
Look at the assembly listing of your program once in a while, especially the parts where most of the time is spent. You might be surprised and shocked at what you see :-)
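With gcc (which is what cygwin gives you), getting at the listing is a one-liner; the file names here are just placeholders:

gcc -O2 -S -fverbose-asm movegen.c -o movegen.s
objdump -d -S engine.exe   (source interleaved, if the binary was built with -g)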
B) Which brings us to the next major goof-up: micro-optimisations!
On the flip side, I see programmers who are obsessed with optimisation, so much so that they write 'wonderful' tight code for their small functions and try to squeeze the most out of it.
They hack around with assembly for a function that is rarely called, forget the more fundamental design aspects, and micro-optimise their inefficient design.
When they sew everything up, the final program sucks ass.
Their movegen would be blazingly fast, their makemove would be efficient, yet their perft or alphabeta search crawls like a snail.
In general, writing inline assembly is not very good from an optimisation point of view: you are interfering with what the compiler could have done and end up ruining it for the compiler. Dabble in it only if you are darn sure there is no other alternative.
For example, it might make sense to use it for bsf/bsr (find lsb/msb) ... the alternative would be a bunch of cmovs.
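A rough sketch of what I mean, with gcc and 32-bit values (a real 64-bit bitboard would want the 64-bit variants, and this is just illustration):

#include <stdint.h>

/* Index of the least significant set bit; undefined if b == 0. */
static inline int lsb_asm(uint32_t b) {
    uint32_t idx;
    __asm__ ("bsfl %1, %0" : "=r" (idx) : "rm" (b));   /* x86 bsf */
    return (int)idx;
}

/* gcc's builtin usually compiles down to the same bsf and keeps the
   source portable - I'd try this before resorting to raw asm. */
static inline int lsb_builtin(uint32_t b) {
    return __builtin_ctz(b);
}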
Of course, taking all these in isolation does not really make sense ... the rest of the pipeline in which the code executes plays a vital role in modern processors.
If you can parallelise your code such that multiple pipelines can potentially execute it, you could end up with a nice speedup!
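A trivial, contrived illustration: summing with two independent accumulators gives the processor two dependency chains to overlap instead of one long serial one:

/* One accumulator: every add waits for the previous one, a single long
   dependency chain. */
long sum_serial(const int *v, int n) {
    long s = 0;
    int i;
    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Two independent accumulators: two shorter chains the execution units
   can work on in parallel. */
long sum_paired(const int *v, int n) {
    long a = 0, b = 0;
    int i;
    for (i = 0; i + 1 < n; i += 2) {
        a += v[i];
        b += v[i + 1];
    }
    if (i < n)               /* odd element left over */
        a += v[i];
    return a + b;
}

Whether this actually wins depends on the loop and the compiler (some will unroll and do it for you), so as always, measure.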
The way your code gets executed, the context in which it is executed, how the data it needs is organised, the data layout, memory alignment, etc. all play a crucial role ...
Though I spoke primarily from the point of view of C, most of the above is relevant to other high level languages, even ones which run atop a virtual machine.
You might be tempted to treat all of this as a black box while writing code, but knowing how the whole system might eventually behave helps in writing better code ...
What I have found is that most of the rules of thumb are just that: rules of thumb.
You should know when to stick with them and when to break them ... and therein lies experience. Blind faith in these 'rules' will only lead to problems, and worse, it is sometimes tough to convince their proponents that they really are shooting themselves in the foot!
And most of the above is just that: rules of thumb ...
I wanted to talk about bit twiddling in particular, but figured it made sense to briefly ramble first, to clarify that what I talk about next is subjective and extremely context sensitive.
What works for one need not work for another!