Unix usage tips blog: stupid vs smart, case vs ftable, algorithms vs processors, testing price vs gain

Actually, the task below can be easily solved by the paper and pen.

It was written in home at evening, just to measure performance as additional argument for rewriting stupid code for improving maintainability and performance of old legacy code.

Hypothetical example of the optimization of the following code was considered, and the return of the optimization effort (as resulting 'optimized' code) was timed.

     for(unsigned int i=0;i<65536;i++)
        for(unsigned int j=0;j<65536;j++)
         switch(j%15+1)
                {
                case 1:
                f1();
                break;
                case 2:
                f2();
                ...
                }

The following are examples of performance, that can be achieved for integer massive operations due to different types of optimization.

Tried on AMD dual core DeskTop centOS, with gcc 4.1.3,
with optimization, and without.

4 types of code were considered:

Using switch(code) and case to define function to call;

Using case(code) and inlining function in call;

Using call to function by code ( fun(code) );

Just doing job without obfuscation by ftabs and cases.

Compilation done in following way:


[rtg@rtgCent cpp_exa]$ g++ -O3 case.cpp -o caseo
[rtg@rtgCent cpp_exa]$ g++ -O3 ftab.cpp -o ftabo
[rtg@rtgCent cpp_exa]$ g++ ftab.cpp -o ftab
[rtg@rtgCent cpp_exa]$ g++ case.cpp -o case
[rtg@rtgCent cpp_exa]$ su -

Some examples of output:

Red and green just presents timing for worst and best timing.


[root@rtgCent cpp_exa]# time nice -n -10 ./case
-458752

real    2m23.422s
user    2m23.205s
sys     0m0.112s

[root@rtgCent cpp_exa]# nice -n -10 ./caseo
-458752
[root@rtgCent cpp_exa]# time nice -n -10 ./caseo
-458752

real    1m3.688s
user    1m3.614s
sys     0m0.054s

[root@rtgCent cpp_exa]# time nice -n -10 ./ftab
-458752

real    1m41.664s
user    1m41.521s
sys     0m0.101s
[root@rtgCent cpp_exa]# time nice -n -10 ./ftabo 
-458752

real    0m56.068s
user    0m56.004s
sys     0m0.047s
[root@rtgCent cpp_exa]# time nice -n -10 ./plaino 
-458752

real    1m3.801s
user    1m3.732s
sys     0m0.047s

[root@rtgCent cpp_exa]# time nice -n -10 ./plain
-458752

real    1m53.174s
user    1m53.060s
sys     0m0.079s

[root@rtgCent cpp_exa]# time nice -n -10 ./opto 
-458752

real    0m14.527s
user    0m14.509s
sys     0m0.014s

[root@rtgCent cpp_exa]# time nice -n -10 ./opt
-458752

real    0m31.956s
user    0m31.923s
sys     0m0.022s

It were just examples of the simpliest 'non algorithmic' optimization.

Simplified 'Algorithmic' optimization of the simplest case gives more than 100 time's faster output:


[root@rtgCent cpp_exa]# time nice -n -10 ./alg
-458752

real    0m0.011s
user    0m0.004s
sys     0m0.005s
[root@rtgCent cpp_exa]# time nice -n -10 ./algo
-458752

real    0m0.011s
user    0m0.003s
sys     0m0.005s

If you have some different opinion/results, please report in comments :)

By the way, who needs optimization today of critical systems, with today's pricing on retesting of system, and today modern cheap fast processors?

Code for examples:
alg.cpp
ftab.cpp
case.cpp
plain.cpp
opt.cpp

cpuinfo:

cpuinfo

Unix usage tips blog

Thursday, July 10, 2008

stupid vs smart, case vs ftable, algorithms vs processors, testing price vs gain

1 comment:

Label Cloud