Saturday 19 December 2009

C pointers - Part - I

C pointers – An introduction
=====================
This is fairly basic stuff, university level if you want to call it that. If you are a professional developer using C, you won’t find anything new here and you might be wasting your time. If you are new to C and struggling to come to terms with pointers, as I was during my Uni days, you might find this helpful.
Okay, so let’s get into the topic straight away.
What happens when you declare a variable, say of type char? This is how you would do it:

char var = 0;


So you have declared a variable of type char, which is a single byte, named ‘var’ and you have initialized that to 0. When you build this code (obviously not just this declaration, with a main function and the necessary header files and all that), the compiler would allocate a memory location say at 0x1234, where the variable ‘var’ would reside.

    0x1234:  0000 0000

So if you are not going to modify the value of ‘var’ inside the code, this memory location would have the value 0. &var denotes the address where ‘var’ is residing and in this case the value of &var will be 0x1234. If later on in the code we change the value of ‘var’ say,

var    =   10;

then the memory location 0x1234 will now have the value 10 instead of 0. Whenever you use the variable ‘var’ in your code, the compiler knows that it needs to pick the value sitting in the memory location 0x1234 (well, only if the variable is still in scope, but let’s imagine that we only have a main function in the code and no other functions).

    0x1234:  0000 1010


Now what would happen if ‘var’ was an int, i.e. if we had declared ‘var’ as follows?

int var    =   256;

The size of int can vary from system to system. But let’s assume we are using a 16-bit (2 bytes) processor and the size of int is 2 bytes. The compiler will allocate a memory location for ‘var’, say at 0x2345. So what happens when you read the variable ‘var’ somewhere in your code? Would we just need to read the location 0x2345? By declaring it as int, we have told the compiler that ‘var’ is now 2 bytes. So the compiler knows that it needs to read both 0x2345 and 0x2346 (all the examples here assume a big-endian system, where the most significant byte sits at the lower address).
    
    0x2345 (MSB):  0000 0001
    0x2346 (LSB):  0000 0000


Now let’s start with pointers.
This is how a pointer is declared:

char *ptr   =   NULL;

Here, we are declaring an identifier ptr as a pointer of type char. By doing so, we are telling the compiler that ptr does not hold an ordinary value; it holds an address, specifically the address of a variable of type char. On building the code, the compiler allocates a memory location, say 0x3456, for ptr to live in, and it currently holds the value NULL (a macro defined in standard headers such as stddef.h and stdio.h).
Suppose you have the following bit of code in your program:

char * ptr     =   NULL;
char var       =   10;
ptr            =   &var;

On building the code, let’s assume the compiler allocated the address 0x2345 for var and 0x3456 for ptr. Hence initially, 0x2345 contains the value 10 and 0x3456 contains NULL. On executing the third line, ptr will change from NULL to the address of var, which is 0x2345. Hence, the address of ptr (&ptr) is 0x3456 and ptr contains the value 0x2345, which is an address. What is more, *ptr will give you the value residing at the address 0x2345, which is 10.
To summarize,
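Declaring char * ptr tells the compiler that ptr contains an address.
When you use &ptr, you are using the address 0x3456, which is where ptr itself lives.
When you use ptr, you are using the value stored at 0x3456, which is the address 0x2345.
When you use *ptr, you are using the value stored at the address 0x2345, which is 10.
The symbol ‘*’ is used to ‘dereference’ the pointer, i.e. to find out what value is stored at the address the pointer holds.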

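In this case we had a char pointer, which means the pointer points to a variable of type char. Hence when you dereference it, the compiler knows that it has to read a single byte at the address the pointer points to. What would happen if the pointer was of type int? Consider the following piece of code:

int * ptr   =   NULL;

Now the pointer points to a variable of type int, so when you dereference it the compiler reads as many bytes as an int occupies (two in our examples), starting at the address held in the pointer.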

Pointer arithmetic:
==============
This is quite an important concept and quite a useful one when you start writing some serious code in C. Consider the char * ptr case again:

char * ptr     =   NULL;
char var       =   10;
char * ptrVal  =   NULL;   /* a pointer, so it can hold the result of ptr + 1 */


ptr            =   &var;
ptrVal         =   ptr + 1;

The value of ptr was 0x2345 since we assigned the address of var to it. So what would be the value of ptrVal after the execution of the last statement? The answer is pretty straightforward in this case; ptrVal will be 0x2346 – address of the next memory location.
Now, consider a slightly different scenario:

int * ptr     =   NULL;
int var       =   10;      /* an int, so that &var has type int * */
int * ptrVal  =   NULL;


ptr           =   &var;
ptrVal        =   ptr + 1;

Here the pointer is of type int. Again, what would be the value of ptrVal after the last statement executes? ptr is 0x2345, so you might think ptrVal should be 0x2345 + 1, which is 0x2346, just as in our first case. Well, it is not. This is pointer arithmetic, so when we add 1 (or any other integer) to a pointer, the compiler advances the address by that many elements of the pointed-to type, not by that many bytes. Adding 1 therefore gives the address of the next int in memory. Since int is two bytes long in our examples, ptrVal will be 0x2347.
Similarly ptr + 2 will be 0x2349, ptr + 3 will be 0x234B and so on. The same rule applied in the previous example as well, but since the pointer was of type char, the size of the variable it points to is one byte. Hence ptr + 1 was simply the next consecutive address after ptr, which made it look like normal arithmetic.
Will post the next section 'Using pointers' soon!