
Thursday, December 19, 2013

video tutorials on youtube

Linux programming and kernel development video tutorials will be uploaded to YouTube on 10 Feb 2014.
Channel name: "kernelarmy".

          " KNOWLEDGE INCREASES WHEN SHARED "

UNDERSTANDING LINKER...............ARTICLE 16

UNDERSTANDING LINKER

A linker is a tool which links libraries to the executable file. This is the definition I learned as a beginner, and I used to believe that linking libraries was the sole purpose of the linker.
Only later did I realize that the linker is the most important tool in the entire tool chain. It has a number of functions, and we will discuss most of them in this article.
(Note: I won't be discussing linker scripts, as I am not very familiar with them.)


WHAT IS THE FUNCTIONALITY OF THE LINKER...?

The linker is the most important tool in the gcc tool chain. We already know that we use many library functions in C and C++, for example printf(), scanf() and pow(). These functions are defined in the standard C library, and we use the linker to link that library with our object code (not with the source code). Let me give you more clarity about this. Two of the most remarkable works of the GNU project are the GNU compiler (gcc) and its standard C library, which is called glibc. We can download glibc from the GNU project website (gnu.org).
We know why we write functions: for reusability. Libraries are written for the same reason. We do not write the code for library functions like printf() and scanf(); it is already present in glibc. All we need is for our executable to be connected to that code. With static linking the library code is copied into the executable; with dynamic linking, which gcc uses by default on Linux, the executable only records which shared library it needs and the code is loaded at run time. Who makes this connection? The linker. gcc passes the standard C library to the linker automatically, and any other library is named on the command line (for example -lm for math functions such as pow()). The linker can also be used to create user-defined (shared) libraries.

WHICH TOOL THE LINKER USES...?

I have already shown you the step by step compilation of a program add.c. We have performed the first three stages of compilation, i.e. the pre-processor, the compiler and the assembler (refer to the previous posts on understanding the pre-processor, compiler and assembler). The assembler's output is an object file. The linker takes this object file as input and produces a binary executable file as output. After the assembly phase we got a file called add.o, which is an object file. This is what the linker takes for further processing. Please refer to the previous articles in case you did not understand.
In the above screenshot you can see that there are four files in my directory. add.c is my source code file. The files are produced in the following order.

    add.c (source code)
       ||
    cc1 (pre-processor)
       ||
    add.i
       ||
    cc1 (compiler)
       ||
    add.s
       ||
    as (assembler)
       ||
    add.o
       ||
    ld (linker)
       ||
    add.exe
  
add.exe is the executable file generated by the linker. On Linux the '.exe' extension is not required; the file could just as well be called add. Let's now perform the fourth phase of compilation. So far we have used different flags to tell gcc to invoke a specific tool in order to perform the step by step compilation. For the linker we need not give any stage flag: when none of -E, -S or -c is given, gcc runs through to the linking stage. Look at the screenshot below.
I am passing the assembler's output to gcc and generating an executable file. Executable files are displayed in green.
Here, when I executed the command 'ls', it displayed all the contents of the directory, and we can see add.exe, which is the executable file. Let's now execute this file and see what output it generates.
We can execute the file by typing ./add.exe (with no space after the slash). The output says "THE VALUE OF C IS 30". If you remember, we wrote a C program which adds the values of A and B and stores the result in C.
  

THE VERBOSE OUTPUT OF THE LINKER..

command: gcc -v add.o -o add.exe

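gcc will first read its specs (the verbose output usually reports built-in specs), check the flags and locate the required libraries, very much like the previous tools. For the linking itself it uses a tool called ld (on many systems gcc runs it through a wrapper called collect2). The ld-linux.so.2 that appears in the output is not the linker itself but the dynamic linker (loader), which loads the shared libraries when the program is run. The most important thing to notice here is how the program is linked. We will discuss all that in the next article.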

Tuesday, December 17, 2013

UNDERSTANDING ASSEMBLER.............ARTICLE 15

ASSEMBLER

 WHAT IS AN ASSEMBLER??????

The assembler is a tool which takes part in the compilation process. It takes the output generated by the compiler and converts it into machine instructions which the computer understands. When I use the terms compiler and compilation, please be aware that they are different.
The compiler is a tool, and compilation is the process of converting source code written in a high level language into machine-understandable instructions.
We have already seen the stages of compilation. The assembler is invoked in the third stage of the compilation process. The output generated by the compiler, which has the extension '.s' (i.e. filename.s), is taken as input by the assembler and converted into machine-readable instructions. The file generated is called the object file.

 HOW THE ASSEMBLER WORKS

Now we will see how the assembler works. I am assuming that anyone reading this article has read the previous articles. If any confusion arises, please go back to my older articles; I am covering each topic systematically.
We have already seen what the compiler generates as output. That output is used as the input to the assembler.

We are going to do a step by step compilation of a C program. In my last two posts, on understanding the pre-processor and the compiler, I used a C program "add.c". If you have not read those two posts, you may find this article difficult to follow. After the compiler stage we get the output file add.s, which will be used as the input file for the assembler.

Here I ran the ls command to display the files I currently have in my directory. We have the output generated by the compiler, i.e. by the tool called cc1: the file add.s, which will be the input to the assembler.


HOW TO INVOKE THE ASSEMBLER???


We have previously invoked the pre-processor and the compiler independently. To invoke the assembler the command is: gcc -c add.s -o add.o
The output generated is an object file called add.o. This file contains the machine instructions, and it is what the linker will take as input.


The path followed for the assembler stage is similar to that of the pre-processor and the compiler, except for the tool: the assembler stage uses a tool called "as", whereas the pre-processor and compiler stages use a tool called cc1. I am going to execute the above command with the "-v" flag, which gives the verbose output.

gcc  -c -v  add.s  -o add.o


The above screenshot will help you locate the tool "as". The rest of the process is similar to that of the pre-processor and the compiler. Please refer to the previous articles for more clarity.

Monday, December 2, 2013

UNDERSTANDING COMPILER.............ARTICLE 14

COMPILER

In my last post I discussed the pre-processor and how it works. Let's now look at how the compiler plays the major part in the process of compilation. When I use the terms compilation and compiler, they are not the same. Compilation is the whole process of converting the source code of a program into a binary executable file.

The compiler is the tool we use to convert the pre-processed code into assembly instructions. If you have not read my last post on the pre-processor, I recommend you read that article first. This article is a continuation of it.

WHAT IS THE COMPILER???

A compiler is a tool. The compilation process has 4 major phases, and the compiler is the tool used in the second phase. It converts the output generated by the pre-processor into assembly instructions.

HOW THE COMPILER WORKS???

By now I am assuming that whoever is reading this is already familiar with the first phase of compilation, i.e. the pre-processor and how it works. The compiler comes into action in the second phase. The output generated by the pre-processor has a ".i" extension. A file with a ".i" extension is pre-processed code.

For example: anyfilename.i is a pre-processed output.

The compiler takes the pre-processed output and converts it into assembly instructions, which are an intermediate form (not yet machine level code).
In the last article I wrote a program to add two numbers and pre-processed the source code, which generated the output add1.i. As I mentioned earlier, it is a pre-processed output, and it will be the input to the compiler.

INVOKING THE COMPILER:

Here I will first write the command that performs the second phase of compilation and stops after it.

gcc -S add1.i -o add2.s

The "-S" flag instructs gcc to invoke the compiler tool, take add1.i as its input, and stop once the assembly has been produced. The compiler processes the pre-processed file add1.i and converts it into assembly instructions. The assembly code is written to a file called "add2.s". The ".s" extension indicates that the file contains assembly code.

Check out the screenshots below:
This means that in my temp directory I have the source code file add.c and the pre-processed output add1.i. Now let's invoke the compiler and convert add1.i into the assembly file add2.s.

Now I want you to observe the above screenshot closely. I executed the command to perform the second phase of compilation and then a command to list the files in my directory. I now have three files; add2.s has been added to the directory temp.

add2.s is the assembly output given by gcc. Let's use the "-v" flag to get the verbose output and see exactly what the compiler tool does.

command: gcc -S -v add1.i -o add2.s

When I execute the above command, it displays the steps taken by the compiler tool to generate the assembly instructions.


After executing the above command we get an output like this.


The first thing gcc does is read its "specs", which we can think of as its specifications. They come with gcc itself (the verbose output reports them as built-in specs), so they are available as soon as gcc is installed. They tell the driver which tools to run and what options to pass to them. The next step is the target: gcc reports the platform for which it is generating code. On my machine it shows "Target: i686-linux-gnu".

After this step gcc performs some checks and then lists the options (flags, or switches) we have provided to it.

COLLECT_GCC_OPTIONS='-S' '-v' '-o' 'add2.s' '-mtune=generic' '-march=i686'

The flags we did not type ourselves (here -mtune=generic and -march=i686) are added by gcc itself.

Observe carefully the below screen shot.
 
 
gcc will now call the tool called "cc1". cc1 is the tool which compiles the pre-processed code into assembly code. It can perform both the pre-processing and the compiling; which one it performs depends on the flag we provide. If we give the "-E" flag, cc1 performs the pre-processing, and if we give the "-S" flag, cc1 acts as a compiler. One more important thing to observe is that the output shows the file is already pre-processed (in gcc's verbose output this typically appears as the -fpreprocessed option). I have highlighted it in the screenshot.

Next, it loads all the dependent components and converts the pre-processed code into assembly code.

LETS OPEN THE ASSEMBLY FILE add2.s

file is a command we can use to determine the type of a file. The syntax is very simple: file <filename>

It shows that add2.s is ASCII program text, which means that it can be opened using a normal text editor. We will use the Vim editor to open the file (syntax: vim <filename>).

The assembly that gcc generates is specific to the architecture. My machine uses the x86 architecture, so gcc generates x86 assembly. If gcc targets any other architecture, it generates the assembly for that architecture.

The assembly output is architecture specific because different architectures have different instruction sets, and different names and numbers of registers. A processor only understands its own instruction set. The registers in the ARM architecture and in x86, for example, are different. Hence the compiler generates assembly code specific to the target architecture.

Observe the assembly output carefully. You can find the main function as the label "main:".
At this point it will be difficult to follow the assembly code if you are new to it. I will write an article in the future on how to understand assembly code.