Thread: GCC Optimisations CFLAGS / CXXFLAGS

  1. #1
    Very good friend of the forum drgr33n's Avatar
    Join Date
    Jan 2010
    Location
    Dark side of the moon ...
    Posts
    699

    GCC Optimisations CFLAGS / CXXFLAGS

    Thought I'd post a little about CFLAGS as I've been messing with them and learnt a few things.

    What are CFLAGS ?

    CFLAGS are arguments you pass to your C compiler to control how it builds your programs, most often to optimize them. You can add these flags to your Makefiles, or pass them to tools like configure so they end up in the generated Makefile. Your C compiler then picks up these flags and compiles your software the way you have specified. CFLAGS are for the C compiler and CXXFLAGS are for the C++ compiler. There are also LDFLAGS, CPPFLAGS, etc., but we won't be going over them here.

    So why use CFLAGS ?

    CFLAGS are used for two things: optimizing and debugging software. We are going to focus on optimization. You can tell the compiler which processor you have and which of its features to use, so the generated code takes advantage of everything your box has to offer and runs faster.

    Safe CFLAGS.

    Safe CFLAGS are flags like "-march=", which specifies your processor architecture and is the main safe CFLAG to use globally. We will talk about global flags later in this tutorial. There are lots of CFLAGS, but what works for one program can break another, so use them with caution. Also be aware that if you plan on sharing your compiled binaries with others and you specify something that the other person's box doesn't support, they will not be able to run the program. However, you can still specify a base and an optimal setting by using the -march and -mtune flags. For example, if I wanted to optimize my program for my Intel Centrino Duo but wanted to share the compiled program with a few mates with various newish computers, I would use these flags:

    Code:
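    -march=i686 -mtune=prescott
    -march tells the C compiler the oldest processor the program has to run on, so only instructions available on a standard i686 are used, while -mtune tells it to tune the code for my Centrino Duo (prescott in the list below) without using any instructions the base arch lacks. If you are just compiling for yourself, just use -march with your own processor, which also sets -mtune to match; if you are sharing the binary, set -march to the base and -mtune to your own processor.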

    -pipe

    The -pipe option saves compile time by telling your C compiler not to create temp files while compiling and to pipe the output of each stage straight into the next one instead. This option will eat RAM, so if you only have a little to spare don't use this flag.

    The -O Flag

    -O sets the optimization level for the C compiler. You can use this to compile programs with the most optimizations ("-O3") or the least ("-O"). Remember that more optimizations generally mean a bigger compiled program, so don't go setting "-O3" on something as large as GCC or you will very quickly run out of HDD space. Here are the four main optimization settings.

    -O

    Is the least amount of optimization you can set (it is the same as -O1). It has the least impact on the size of the compiled program and on compile time, but the program will generally run slower than one built with -O2.

    -O2

    Turns on all -O optimizations plus all other optimizations that don't greatly increase the program's size or interfere with debugging. This is the most commonly used flag in the Linux world and is probably the best one to use as a global CFLAG.

    -O3

    Turns on all -O2 optimizations and even more. This is the highest optimization setting you can use, but it's not necessarily the best. It makes your programs bulky and may not be the fastest flag to use. Debugging is also affected, to the point of being near impossible.

    -Os

    Uses the -O2 optimizations except the ones that tend to increase code size, plus further tricks to make your programs smaller. If you are compiling for older machines, PDAs, etc., this may be a good option to use as long as it doesn't cause any compile errors; if it does, use -O2.

    Making these flags global.

    If you would like to use these flags every time you compile your programs, you can add them to your /etc/profile bash script. Open it up in kwrite or something and add something like this at the bottom.

    Code:
    CFLAGS="-O2 -fomit-frame-pointer -pipe -ffast-math -malign-double -march=prescott -msseregparm -msse3 -minline-all-stringops -fgcse-lm -fgcse-sm -fforce-addr"
    CXXFLAGS="${CFLAGS}"
    export CFLAGS CXXFLAGS
    and save. Remember to add your own flags and not use mine.

    Here's a little list to help you with choosing the right arch for your cpu.

    i386 = i386
    i486 = i486
    487 = i486
    Pentium = pentium
    Pentium-MMX = pentium-mmx
    Pentium Pro = pentiumpro
    Pentium II = pentium2
    Celeron = pentium2
    Pentium III = pentium3
    Pentium 4 = pentium4
    Via C3 = c3
    Winchip 2 = winchip2
    Winchip C6-2 = winchip-c6
    AMD K5 = i586
    AMD K6 = k6
    AMD K6 II = k6-2
    AMD K6 III = k6-3
    AMD Athlon = athlon
    AMD Athlon 4 = athlon
    AMD Athlon XP/MP = athlon
    AMD Duron = athlon
    AMD Tbird = athlon-tbird
    Centrino Duo = prescott

    And benchmarks on my laptop...

    Benchmark 1 without CFLAGS...

    Dhrystone Bench

    BYTE UNIX Benchmarks (Version 3.11)
    System -- Linux drgr33n 2.6.29.1 #1 SMP Sat Apr 4 13:53:11 GMT 2009 i686 Intel(R) Core(TM)2 CPU T5200 @ 1.60GHz GenuineIntel GNU/Linux
    Start Benchmark Run: Mon Apr 6 19:29:56 GMT 2009
    5 interactive users.
    Dhrystone 2 without register variables 4131510.6 lps (10 secs, 6 samples)
    Dhrystone 2 using register variables 4045697.1 lps (10 secs, 6 samples)

    TEST BASELINE RESULT INDEX

    Dhrystone 2 without register variables 22366.3 4131510.6 184.7
    =========
    SUM of 1 items 184.7
    AVERAGE 184.7

    Benchmark with "-pipe -O3 -fomit-frame-pointer -ffast-math -malign-double -march=prescott -mtune=prescott -msseregparm -msse3 -minline-all-stringops -fgcse-lm -fgcse-sm -fforce-addr" flags set.

    BYTE UNIX Benchmarks (Version 3.11)
    System -- Linux drgr33n 2.6.29.1 #1 SMP Sat Apr 4 13:53:11 GMT 2009 i686 Intel(R) Core(TM)2 CPU T5200 @ 1.60GHz GenuineIntel GNU/Linux
    Start Benchmark Run: Mon Apr 6 19:51:50 GMT 2009
    5 interactive users.
    Dhrystone 2 without register variables 5755118.4 lps (10 secs, 6 samples)
    Dhrystone 2 using register variables 5327306.5 lps (10 secs, 6 samples)

    TEST BASELINE RESULT INDEX

    Dhrystone 2 without register variables 22366.3 5755118.4 257.3
    =========
    SUM of 1 items 257.3
    AVERAGE 257.3

    As you can see, by using custom CFLAGS I'm getting about a 39% increase in the Dhrystone index (184.7 to 257.3). You can use the BYTE UNIX Benchmarks to test your own CFLAGS. Happy tinkering.

    ADDED:

    With all the above CFLAGS and -march=native set.

    Dhrystone 2 without register variables 22366.3 5101827.8 228.1
    =========
    SUM of 1 items 228.1
    AVERAGE 228.1

    With all the above CFLAGS and -march=prescott set, using -O2 optimizations.

    Dhrystone 2 without register variables 22366.3 5376992.4 240.4
    =========
    SUM of 1 items 240.4
    AVERAGE 240.4

    These speeds can be affected by what my system is doing at the time of the benchmark. I am running each one twice and posting the larger result.

    References:

    Gentoo WIKI: http://en.gentoo-wiki.com/wiki/Safe_Cflags

  2. #2
    Developer
    Join Date
    Mar 2007
    Posts
    6,126

    Just to make an addition: if you are using gcc 4.3 or newer you can make use of the -march=core2 cflag, which has taken the place of the -march=nocona flag that used to be the optimal for multicore systems.

    gcc has also taken things one step further with an experimental cflag in 4.3, which is -march=native. This is supposed to let the compiler detect the processor it is running on and pick the best settings for it. I have tried it on one system and it seems to work pretty well, however YMMV.

    Just thought I'd throw in my 2 cents

  3. #3
    Very good friend of the forum drgr33n's Avatar

    Cheers purehate, I did know about the core2 flag but was advised against it as my Centrino Duo still matches prescott.

    I will try the native flag and post the benchmark results now

    EDIT: The native flag actually slowed the benchmark a bit??

  4. #4
    Super Moderator Archangel-Amael's Avatar
    Join Date
    Jan 2010
    Location
    Somewhere
    Posts
    8,012

    Nice addition Dr_GrEeN, Thanks for sharing.
    Simple and easy to follow.
    Pureh@te helped me out a lot when it came to CFLAGS when I set up my gentoo box.