To improve algorithm performance, one direct way is to shorten calculation time. The part that consumes the most computation power would be the trig functions.

The built-in trig functions are generally very good in terms of accuracy. But we don't need that level of precision, so we can sacrifice accuracy to achieve faster speed.

We only use SIN, COS, ACOS and ATAN in our algorithm, so this post will describe only these functions.

__SIN and COS__One popular, easy and really fast way is to use look-up table.

Firstly I generated a list of 256 input from 0 to 360, with equal intervals, and get a list of results for each input.

and then use the list of results as our table, and input (in degree) as our index. so let's say we need to get SIN(30), we just need to look up the 30th value in the table.

The reason that I use 1 degree instead of smaller interval, is that not only it saves spaces, but also I won't need that kind of precision. we can obviously do smaller interval like 0.5, or 0.2. just do some index conversion before looking up.

Actually, we don't need the full cycle of SIN, we just need the first quarter cycle, because the whole cycle can be replicated by the first quarter, so we save a lot of memory! All we have to do is to add some if/else statements, and do some tricks on the index system.

Since SIN and COS are very close brothers (or sisters), that COS is just a shifted version of SIN, and we can lookup its value from the same table. The difference is just on the indexing.

Because we are only using integer degree input, and the result for it was computed from the built-in SIN function, so it's guaranteed to be correct!

[Upload Source Code]

__ACOS__ACOS is different, because input will be between -1 and 1, if we are going to use look-up table, we will need to do index conversion that involves multiplication and division which will kill the performance. Also the table need to be huge because we can't reduce table size like what we did in the SIN function. So might be better to use infinite series stated here.

It should be noted that the more accurate, the higher power term needs to be computed. Power is a terrible performance killer, so I will only keep the first 4 terms.

To verify the accuracy, we can compare the built-in method with the 'polynomial' method in Excel.

As we can see, the error rises exponentially after about 0.3. It is obvious what we have a larger error as we go near 1. Because we are only keeping the first few polynomial terms, as we go nearer to 1, the neglected terms will become significant, thus making the difference obvious.

what we can do here, is to replace the polynomial method with look-up table, let say when input is larger than 0.8, when the error could have reached about 1%.

New method is about 15 times faster! (measured in VC++)

[Upload Source Code]

__ATAN__I have been search for a way to do it. But none seems to be faster than C++ ATAN. So I will stick with it for now.

## No comments:

## Post a Comment

Note: only a member of this blog may post a comment.