<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    <title>Eduzine© - C/C++</title>
    <link>http://eduzine.edujini-labs.com/</link>
    <description>Articles from Edujini Team</description>
    <dc:language>en</dc:language>
    <admin:errorReportsTo rdf:resource="mailto:" />
    <generator>Serendipity 0.8.2 - http://www.s9y.org/</generator>
    <pubDate>Sun, 15 Oct 2006 22:03:48 GMT</pubDate>

    <image>
        <url>http://www.edujini-labs.com/images/newmasthead.gif</url>
        <title>RSS: Eduzine© - C/C++ - Articles from Edujini Team</title>
        <link>http://eduzine.edujini-labs.com/</link>
        <width></width>
        <height></height>
    </image>
<item>
    <title>Inline Assembly Code (GCC)</title>
    <link>http://eduzine.edujini-labs.com/archives/20-Inline-Assembly-Code-GCC.html</link>
<category>GCC</category>    <comments>http://eduzine.edujini-labs.com/archives/20-Inline-Assembly-Code-GCC.html#comments</comments>
    <wfw:comment>http://eduzine.edujini-labs.com/wfwcomment.php?cid=20</wfw:comment>
    <slash:comments>2</slash:comments>
    <wfw:commentRss>http://eduzine.edujini-labs.com/rss.php?version=2.0&amp;type=comments&amp;cid=20</wfw:commentRss>
    <author>eduzine@edujinionline.com (Eduzine)</author>
    <content:encoded>

This article describes how to use inline assembly code in your C/C++ programs. If it ever seemed difficult, it won't any more!

&lt;h2&gt;Introduction&lt;/h2&gt;

&lt;p&gt;
First of all, what does the term &amp;quot;inline&amp;quot; mean?
&lt;/p&gt;

&lt;p&gt;
In general, the keyword &lt;code&gt;inline&lt;/code&gt; asks the compiler to insert the code of a function
into its caller at the point where the call is made. Such functions are
called &amp;quot;inline functions&amp;quot;. The benefit of inlining is that it removes the function-call overhead.
&lt;/p&gt;

 
&lt;p&gt;
Now it is easier to guess what inline assembly is. It is a set of assembly instructions
written directly into C/C++ code, much as the body of an inline function is placed into its caller.
Inline assembly is used for speed and for direct access to the hardware, and it is frequently used in system programming.
&lt;/p&gt;

&lt;p&gt;
We can mix assembly statements into C/C++ programs using the keyword &amp;quot;&lt;code&gt;asm&lt;/code&gt;&amp;quot;.
Inline assembly is important because it can operate on C/C++ variables
and make its output visible in them.
&lt;/p&gt;

&lt;h2&gt;GCC Inline Assembly Syntax&lt;/h2&gt;

&lt;p&gt;
Assembly language comes in two flavors: Intel style and AT&amp;amp;T style.
The GNU C compiler (GCC) uses AT&amp;amp;T syntax by default, and that is what we use here.
Let us look at the major differences between this style and the Intel style.
&lt;/p&gt;

&lt;p&gt;
If you are wondering how you can use GCC on Windows, you can just download Cygwin from
&lt;a href=&quot;http://www.cygwin.com&quot;&gt;www.cygwin.com&lt;/a&gt;.
&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;b&gt;Register Naming &lt;/b&gt; :
      Register names are prefixed with &lt;code&gt;%&lt;/code&gt;, so registers are written &lt;code&gt;%eax&lt;/code&gt;, &lt;code&gt;%cl&lt;/code&gt;, etc.,
      instead of just &lt;code&gt;eax&lt;/code&gt;, &lt;code&gt;cl&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt; &lt;b&gt;Ordering of operands&lt;/b&gt; :
      Unlike the Intel convention (first operand is the destination),
      the order of operands is source(s) first, and destination last.
      For example, Intel syntax &amp;quot;&lt;code&gt;mov eax, edx&lt;/code&gt;&amp;quot; will look like
      &amp;quot;&lt;code&gt;mov %edx, %eax&lt;/code&gt;&amp;quot; in AT&amp;amp;T assembly.
&lt;/li&gt;
&lt;li&gt; &lt;b&gt;Operand Size&lt;/b&gt; :
      In AT&amp;amp;T syntax the size of memory operands is determined from the
      last character of the op-code name. The suffix is &lt;code&gt;b&lt;/code&gt; for (8-bit) byte,
      &lt;code&gt;w&lt;/code&gt; for (16-bit) word, and &lt;code&gt;l&lt;/code&gt; for (32-bit) long.
      For example, the correct syntax for the above instruction would have been
      &amp;quot;&lt;code&gt;movl %edx, %eax&lt;/code&gt;&amp;quot;.
&lt;/li&gt;
&lt;li&gt; &lt;b&gt;Immediate Operand&lt;/b&gt; :
      Immediate operands are marked with a &lt;code&gt;$&lt;/code&gt; prefix, as in
      &amp;quot;&lt;code&gt;addl $5, %eax&lt;/code&gt;&amp;quot;, which adds the immediate long value 5 to register &lt;code&gt;%eax&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt; &lt;b&gt;Memory Operands&lt;/b&gt; :
      An operand with no prefix is treated as a memory address; hence
      &amp;quot;&lt;code&gt;movl $bar, %ebx&lt;/code&gt;&amp;quot; puts the address of variable bar into register
      &lt;code&gt;%ebx&lt;/code&gt;, but &amp;quot;&lt;code&gt;movl bar, %ebx&lt;/code&gt;&amp;quot; puts the contents of variable
      bar into register &lt;code&gt;%ebx&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt; &lt;b&gt;Indexing&lt;/b&gt; :
      Indexing or indirection is done by enclosing the index register or indirection memory cell
      address in parentheses. For example, &amp;quot;&lt;code&gt;movl 8(%ebp), %eax&lt;/code&gt;&amp;quot;
      moves the contents at offset 8 from the cell pointed to by &lt;code&gt;%ebp&lt;/code&gt; into register &lt;code&gt;%eax&lt;/code&gt;.
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;Basic Inline Code&lt;/h2&gt;

&lt;p&gt;
We can use either of the following formats for basic inline assembly.
&lt;/p&gt;
&lt;pre&gt;asm(&amp;quot;assembly code&amp;quot;);
&lt;/pre&gt;

or

&lt;pre&gt;__asm__ (&amp;quot;assembly code&amp;quot;);
&lt;/pre&gt;

&lt;p&gt;
Example:
&lt;/p&gt;

&lt;pre&gt;asm(&amp;quot;movl %ebx, %eax&amp;quot;); /* moves the contents of ebx register to eax */
__asm__(&amp;quot;movb %ch, (%ebx)&amp;quot;); /* moves the byte from ch to the memory pointed to by ebx */
&lt;/pre&gt;

&lt;p&gt;
If there is more than one assembly instruction,
put a semicolon at the end of each instruction.
&lt;/p&gt;

&lt;p&gt;
Please refer to the example available in basic_arithmetic.c in downloads.
&lt;/p&gt;

&lt;p&gt;
Compile the program with the &amp;quot;&lt;code&gt;-g&lt;/code&gt;&amp;quot; option of the GNU C compiler &amp;quot;&lt;code&gt;gcc&lt;/code&gt;&amp;quot; to keep debugging
information in the executable, then use the GNU Debugger &amp;quot;gdb&amp;quot; to inspect the contents of the CPU registers.
&lt;/p&gt;

&lt;h2&gt;Extended Assembly&lt;/h2&gt;

&lt;p&gt;
In extended assembly, we can also specify the operands.
It allows us to specify the input registers, output registers and a list of clobbered registers.
&lt;/p&gt;

&lt;pre&gt;asm ( &amp;quot;assembly code&amp;quot;
           : output operands                  /* optional */
           : input operands                   /* optional */
           : list of clobbered registers      /* optional */
);
&lt;/pre&gt;

&lt;p&gt;
If there are no output operands but there are input operands,
we must place two consecutive colons surrounding the place where
the output operands would go.
&lt;/p&gt;
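
&lt;p&gt;
As a minimal sketch of the input-only form, the hypothetical helper &lt;code&gt;store_int&lt;/code&gt; below (it is not one of the article's downloads) leaves the output section empty, so two colons appear before the inputs.
It assumes an x86 target and falls back to plain C elsewhere. The &lt;code&gt;memory&lt;/code&gt; clobber, which tells GCC that the statement writes to memory, is an addition that the text above does not cover.
&lt;/p&gt;

```c
/* Hypothetical helper: store value at *dst using an asm statement
   that has input operands but no output operands. */
void store_int(int *dst, int value)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__ (
        "movl %0, (%1);"             /* write %0 to the address held in %1 */
        :                            /* no output operands */
        : "r" (value), "r" (dst)     /* inputs: %0 and %1 */
        : "memory"                   /* the statement writes to memory */
    );
#else
    *dst = value;                    /* portable fallback for non-x86 targets */
#endif
}
```

&lt;p&gt;
Calling &lt;code&gt;store_int(p, 42)&lt;/code&gt; should leave 42 in the integer that &lt;code&gt;p&lt;/code&gt; points to.
&lt;/p&gt;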

&lt;p&gt;
The list of clobbered registers is syntactically optional, but GCC does not analyze the assembly text itself:
any register the code modifies that is not already listed as an output operand must be named there.
&lt;/p&gt;


&lt;h3&gt;Example (1)&lt;/h3&gt;

&lt;pre&gt;asm (&amp;quot;movl %%eax, %0;&amp;quot; : &amp;quot;=r&amp;quot; ( val ));
&lt;/pre&gt;

&lt;p&gt;
In this example, the variable &amp;quot;&lt;code&gt;val&lt;/code&gt;&amp;quot; is kept in a register,
the value in register &lt;code&gt;eax&lt;/code&gt; is copied onto that register,
and the value of &amp;quot;&lt;code&gt;val&lt;/code&gt;&amp;quot; is updated into the memory from this register.
&lt;/p&gt;

&lt;p&gt;
When the &amp;quot;&lt;code&gt;r&lt;/code&gt;&amp;quot; constraint is specified, gcc may keep the variable
in any of the available General Purpose Registers.
We can also specify the register names directly by using specific register constraints.
&lt;/p&gt;

&lt;p&gt;
The register constraints are as follows :
&lt;/p&gt;

&lt;pre&gt;+------------+--------------------+
| Constraint |    Register(s)     |
+------------+--------------------+
|     a      |   %eax, %ax, %al   |
|     b      |   %ebx, %bx, %bl   |
|     c      |   %ecx, %cx, %cl   |
|     d      |   %edx, %dx, %dl   |
|     S      |   %esi, %si        |
|     D      |   %edi, %di        |
+------------+--------------------+
&lt;/pre&gt;
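
&lt;p&gt;
The specific-register constraints matter most when an instruction uses fixed registers. As a hedged sketch, the hypothetical function &lt;code&gt;asm_mod&lt;/code&gt; below uses &lt;code&gt;idivl&lt;/code&gt;, which always takes its dividend from edx:eax and leaves the quotient in eax and the remainder in edx.
It assumes an x86 target (with a plain C fallback) and a non-zero divisor; the &lt;code&gt;cc&lt;/code&gt; clobber for the flags register is an addition not discussed in this article.
&lt;/p&gt;

```c
/* Hypothetical helper: remainder of dividend / divisor via idivl.
   The divisor must be non-zero. */
int asm_mod(int dividend, int divisor)
{
    int quotient, remainder;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ (
        "cltd;"                  /* sign-extend eax into edx:eax */
        "idivl %%ecx;"           /* eax = quotient, edx = remainder */
        : "=a" (quotient), "=d" (remainder)   /* results arrive in eax and edx */
        : "a" (dividend), "c" (divisor)       /* inputs pinned to eax and ecx */
        : "cc"                                /* the flags are modified */
    );
#else
    quotient = dividend / divisor;            /* portable fallback */
    remainder = dividend % divisor;
#endif
    (void) quotient;             /* listed only so GCC knows eax changes */
    return remainder;
}
```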

&lt;h3&gt;Example (2)&lt;/h3&gt;

&lt;pre&gt;	int no = 100, val ;
        asm (&amp;quot;movl %1, %%ebx;&amp;quot;
              &amp;quot;movl %%ebx, %0;&amp;quot;
             : &amp;quot;=r&amp;quot; ( val )        /* output */
             : &amp;quot;r&amp;quot; ( no )         /* input */
             : &amp;quot;%ebx&amp;quot;         /* clobbered register */
         );
&lt;/pre&gt;

&lt;p&gt;
In the above example, &amp;quot;val&amp;quot; is the output operand, referred to by %0, and &amp;quot;no&amp;quot;
is the input operand, referred to by %1.
&amp;quot;r&amp;quot; is a constraint on the operands, which tells GCC to use any register for storing the operands.
&lt;/p&gt;

&lt;p&gt;
An output operand constraint should have the constraint modifier &amp;quot;=&amp;quot;, which marks the
output operand as write-only. Register names are prefixed with two %s,
which helps GCC to distinguish between operands and registers. Operands have a single % as prefix.
&lt;/p&gt;
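
&lt;p&gt;
To try Example (2) directly, it can be wrapped in a function. The sketch below is one way to do that; the name &lt;code&gt;copy_via_ebx&lt;/code&gt; is hypothetical, an x86 target is assumed (other targets take a plain C fallback), and the clobber is written as &lt;code&gt;ebx&lt;/code&gt; without the percent sign, a spelling GCC also accepts.
&lt;/p&gt;

```c
/* Hypothetical wrapper around Example (2): copy no into val through ebx. */
int copy_via_ebx(int no)
{
    int val;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ (
        "movl %1, %%ebx;"        /* input operand -> ebx */
        "movl %%ebx, %0;"        /* ebx -> output operand */
        : "=r" (val)             /* output, %0 */
        : "r" (no)               /* input, %1 */
        : "ebx"                  /* ebx is overwritten by the code */
    );
#else
    val = no;                    /* portable fallback for non-x86 targets */
#endif
    return val;
}
```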

&lt;p&gt;
The clobbered register %ebx after the third colon informs GCC that the value
of %ebx is modified inside &amp;quot;asm&amp;quot;, so GCC won't keep any operand or other live value in this register across the statement.

&lt;/p&gt;&lt;h3&gt;Example (3)&lt;/h3&gt;

&lt;pre&gt;int arg1, arg2, add ;
__asm__ ( &amp;quot;addl %%ebx, %%eax;&amp;quot;
		: &amp;quot;=a&amp;quot; (add)
		: &amp;quot;a&amp;quot; (arg1), &amp;quot;b&amp;quot; (arg2) );
&lt;/pre&gt;

&lt;p&gt;
Here &amp;quot;add&amp;quot; is the output operand, held in register eax.
arg1 and arg2 are the input operands, held in registers eax and ebx respectively.
&lt;/p&gt;
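
&lt;p&gt;
A runnable variant of Example (3) might look like the sketch below. The function name &lt;code&gt;add_ints&lt;/code&gt; is hypothetical, an x86 target is assumed, and the &lt;code&gt;cc&lt;/code&gt; clobber (the addition changes the flags) is an extra that the example above omits.
&lt;/p&gt;

```c
/* Hypothetical wrapper around Example (3): add two integers with addl. */
int add_ints(int arg1, int arg2)
{
    int sum;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ (
        "addl %%ebx, %%eax;"             /* eax = eax + ebx */
        : "=a" (sum)                     /* result is read from eax */
        : "a" (arg1), "b" (arg2)         /* inputs loaded into eax and ebx */
        : "cc"                           /* the flags are modified */
    );
#else
    sum = arg1 + arg2;                   /* portable fallback */
#endif
    return sum;
}
```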

&lt;p&gt;
See the file &lt;code&gt;arithmetic.c&lt;/code&gt; for extended inline assembly statements.
It performs simple arithmetic operations on integer operands and displays the result.
&lt;/p&gt;

&lt;h2&gt;Volatile&lt;/h2&gt;

&lt;p&gt;
If an assembly statement must not be removed or relocated by the optimizer
(for example, it must not be moved out of a loop),
put the keyword &amp;quot;volatile&amp;quot; or &amp;quot;__volatile__&amp;quot; after &amp;quot;asm&amp;quot; or &amp;quot;__asm__&amp;quot; and before the parentheses.

&lt;/p&gt;&lt;pre&gt;asm volatile ( &amp;quot;...;&amp;quot;
		&amp;quot;...;&amp;quot; : ... );
&lt;/pre&gt;

or

&lt;pre&gt;__asm__ __volatile__ ( &amp;quot;...;&amp;quot;
			&amp;quot;...;&amp;quot; : ... );
&lt;/pre&gt;
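
&lt;p&gt;
One common case where this matters is an instruction whose result changes even though its operands do not. As an illustration only, the hypothetical &lt;code&gt;read_tsc&lt;/code&gt; below reads the x86 time-stamp counter with &lt;code&gt;rdtsc&lt;/code&gt;, an instruction this article does not otherwise cover.
It assumes an x86 target; on other targets a simple counter stands in so the sketch still compiles.
&lt;/p&gt;

```c
/* Hypothetical helper: read the x86 time-stamp counter. */
unsigned long long read_tsc(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int lo, hi;
    /* rdtsc has no inputs, so without volatile GCC may assume that two
       such statements give the same result and keep only one of them. */
    __asm__ __volatile__ ("rdtsc;" : "=a" (lo), "=d" (hi));
    return (unsigned long long) hi * 4294967296ULL + lo;   /* hi * 2^32 + lo */
#else
    static unsigned long long fake_counter;   /* stand-in on non-x86 targets */
    return ++fake_counter;
#endif
}
```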

&lt;p&gt;
Refer to the example in &lt;code&gt;gcd.c&lt;/code&gt;, which computes the Greatest Common Divisor
using the well-known Euclid's Algorithm (often honoured as the first algorithm).
&lt;/p&gt;
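
&lt;p&gt;
The downloadable &lt;code&gt;gcd.c&lt;/code&gt; is not reproduced here. As a rough sketch of how Euclid's Algorithm could be written with extended inline assembly, the hypothetical &lt;code&gt;gcd_asm&lt;/code&gt; below loops with &lt;code&gt;divl&lt;/code&gt; until the remainder is zero.
It assumes an x86 target (plain C is used elsewhere), and the numeric local labels &lt;code&gt;1:&lt;/code&gt; and &lt;code&gt;2:&lt;/code&gt; are a GNU assembler feature not introduced above.
&lt;/p&gt;

```c
/* Hypothetical sketch: greatest common divisor by Euclid's Algorithm. */
unsigned int gcd_asm(unsigned int a, unsigned int b)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int result, scratch;
    __asm__ (
        "1: testl %%ebx, %%ebx;"   /* loop while b != 0 */
        "je 2f;"
        "xorl %%edx, %%edx;"       /* clear the high half of the dividend */
        "divl %%ebx;"              /* eax = a / b, edx = a % b */
        "movl %%ebx, %%eax;"       /* a = b */
        "movl %%edx, %%ebx;"       /* b = remainder */
        "jmp 1b;"
        "2:"
        : "=a" (result), "=b" (scratch)   /* ebx is overwritten, so it is an output too */
        : "a" (a), "b" (b)
        : "edx", "cc"                     /* edx and the flags are modified */
    );
    (void) scratch;
    return result;
#else
    while (b != 0) {                      /* portable fallback */
        unsigned int t = a % b;
        a = b;
        b = t;
    }
    return a;
#endif
}
```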

&lt;p&gt;
Here are some more examples that use the FPU (Floating Point Unit) instruction set.
&lt;/p&gt;&lt;ol&gt;
&lt;li&gt;Example program to perform simple floating point arithmetic is available in &lt;code&gt;float.c&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Example program to compute trigonometrical functions like sin and cos is available in &lt;code&gt;maths.c&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
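
&lt;p&gt;
Those files are not shown here, but a minimal sketch of an FPU call could look like the following. The hypothetical &lt;code&gt;asm_sin&lt;/code&gt; uses the x87 &lt;code&gt;fsin&lt;/code&gt; instruction; the &lt;code&gt;t&lt;/code&gt; constraint (top of the x87 register stack) and the matching constraint &lt;code&gt;0&lt;/code&gt; (same location as operand 0) are GCC features this article does not cover.
An x86 target is assumed; the fallback is only a crude polynomial for small arguments.
&lt;/p&gt;

```c
/* Hypothetical helper: sine of x (in radians) using the x87 fsin instruction. */
double asm_sin(double x)
{
    double result;
#if defined(__i386__) || defined(__x86_64__)
    /* "=t": result is read from the top of the x87 stack.
       "0":  the input is loaded into the same place as operand 0. */
    __asm__ ("fsin;" : "=t" (result) : "0" (x));
#else
    /* Crude Taylor polynomial, adequate only for small |x| (non-x86 stand-in). */
    double x2 = x * x;
    result = x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0 * (1.0 - x2 / 42.0 * (1.0 - x2 / 72.0))));
#endif
    return result;
}
```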
&lt;h2&gt;Summary&lt;/h2&gt;

&lt;p&gt;
GCC uses AT&amp;amp;T-style assembly statements, and we can use the asm keyword to write both basic and extended assembly instructions.
Using inline assembly can reduce the number of instructions the processor has to execute.
In our GCD example, an implementation using inline assembly can need fewer instructions
than normal C code for Euclid's Algorithm.
&lt;/p&gt;

&lt;h2&gt;Downloads&lt;/h2&gt;

&lt;p&gt;
The code can be downloaded from the downloads section &lt;a href=&quot;http://downloads.edujinionline.com/index.php?act=category&amp;id=9&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt;.
&lt;/p&gt;

    </content:encoded>
    <pubDate>Sun, 15 Oct 2006 14:11:46 -0700</pubDate>
    <guid isPermaLink="false">http://eduzine.edujini-labs.com/archives/20-guid.html</guid>
    </item>
</channel>
</rss>
