The code of a data type is implemented by a method, which is executed by the Execution Engine. The CLR offers a large number of services to support the execution of code. Any code that uses these services is called managed code. Managed code allows the CLR to provide a set of features such as handling exceptions. It also makes sure that the code is verifiable. Only managed code has access to managed data.
There is no rule in the IL book that prevents a method from being global. It can certainly be written outside a class.
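A minimal sketch of a global method, assuming the ilasm assembler from the .NET Framework and the `cil managed` spelling accepted by current releases. The assembly name `hello` and the method name `vijay` are placeholders chosen for illustration.

```il
.assembly extern mscorlib {}
.assembly hello {}

// No .class directive encloses this method, so it is a global method.
// Global methods must be static.
.method public static void vijay() cil managed
{
    .entrypoint                       // execution starts here
    ldstr "hi"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}
```

Assembling the file with ilasm should produce an executable whose only method lives outside any user-defined class.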
In fact we can write the smallest IL program without using the class directive. It is mandatory to have a function with the entrypoint directive. Thus, had the designers of C# so desired, they could have provided the facility of global functions, but they chose not to. They decided, in their infinite wisdom, that all functions should be placed within a class. There is no such restriction imposed by IL.
The CLR recognizes three types of methods: static, instance and virtual. There are some special functions that are automatically called by the runtime, such as static constructors or type initializers (.cctor) and instance constructors (.ctor).
A method in IL is uniquely identified by its signature. A signature consists of five parts:
• The name of the method
• The type or class that the method resides in
• The calling convention used
• The return type
• The parameter types.
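Because the return type is part of the signature, two methods may differ in nothing else. The following is a minimal sketch, not the original listing; the class name `zzz`, the assembly name `sig` and the method bodies are assumptions made for illustration.

```il
.assembly extern mscorlib {}
.assembly sig {}

.class private auto ansi zzz extends [mscorlib]System.Object
{
    .method public static void vijay() cil managed
    {
        .entrypoint
        // The call names the full signature, return type included,
        // so the runtime can tell the two a2 methods apart.
        call int32 zzz::a2()
        call void [mscorlib]System.Console::WriteLine(int32)
        call string zzz::a2()
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }

    // Same name, same (empty) parameter list, different return types.
    .method public static int32 a2() cil managed
    {
        ldc.i4.s 10
        ret
    }

    .method public static string a2() cil managed
    {
        ldstr "hi"
        ret
    }
}
```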
For people like us, who are familiar with the world of C, C++ and Java, the concept of a method signature depending upon the return type of a function is alien.
Here, we have two functions, both named a2, which differ in the type of return value. This is perfectly valid in IL. The reason being that when calling a method in IL, we have to state the return type as well. But what is allowed in IL may be taboo in C#.
Method overloading is a concept where the same function name appears in a class more than once. In fact, you may not have clearly observed that, in the above programs, the this pointer is not passed to the global functions. Even then, things worked well.
The reason for this is that generally, global functions are static by default. In fact, static functions are found in classes, value types and interfaces. Static functions always have a body associated with them.
The second type of method very commonly used is an instance method. These are functions associated with an instance of a class. In this version of the CLR, we cannot declare them in interfaces. Unlike static methods, which are stand-alone methods and behave like global functions, an instance function is always passed a pointer or reference to the data associated with the object. Thus, it can use the this pointer to access a different set of data each time.
A runtime exception is thrown because the call expects the method to be static, whereas our method is an instance method. To avoid this runtime error, replace the modifier instance with static.
The this pointer is of the same type as the class in which the method resides. We therefore have to create an instance of a class before we can execute any instance method from the class.
As a rule, all instance functions must have the this pointer as the first parameter. Therefore, it is automatically added as a first hidden parameter. The this pointer can be a null reference too.
Whenever we refer to a field in a type through a function, the this pointer should first be available on the stack. This facilitates access to the instance fields. This explains the above error.
Here, we have placed a ldnull as the this pointer, and thus, are unable to access the instance members. On commenting the ldnull, no error is generated.
The instruction newobj places a this pointer on the stack. Therefore, prior to using it, ldarg.0 is checked for NULL. However, for a value type, the this pointer is a managed pointer to the value type. Unlike static or virtual, instance is not an attribute of a method. It is part of the calling convention of a method.
There are three ways to call a method in IL. These are: call, callvirt and calli. Two of these, call and callvirt, have already been dealt with in the past.
There are three other instructions that can be used to call a method in a special way. These are jmp, jmpi and newobj. Every method that we call has its own evaluation stack. The parameters to the function are placed on this stack, and instructions also obtain their arguments from the same stack.
On the execution of an instruction, the result is also placed on the same stack. The runtime creates and maintains this stack. When the method exits, the stack is released.
There is another stack that we do not concern ourselves with. This stack keeps track of the method being called, and hence, is known as the call stack.
The last instruction in any function is the ret instruction. This instruction is responsible for the method returning control back to the calling method. If a function returns a value, it must be placed on the stack before ret is called. When exiting a method, the stack must not contain any value other than the value to be returned.
We use the call instruction to call static or virtual functions. Before the call instruction, all the parameters to the method must be placed on the stack. The first argument to the function is placed first. The only difference between calling a static and an instance method is that the modifier instance is used for an instance method, whereas no modifier is required for a static method.
Virtual functions have to be handled with care as they are runtime entities. With virtual functions, the instruction callvirt is used in place of call. callvirt, unlike call, executes the overriding version of the method.
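The difference between the two instructions can be sketched as follows. This is an illustrative program under stated assumptions, not the original listing: the class names yyy (base), xxx (derived) and zzz follow the text, while the strings printed and the assembly name `virt` are made up. Using call on a virtual method of another object is not verifiable, so the sketch assumes the code runs with full trust.

```il
.assembly extern mscorlib {}
.assembly virt {}

.class private auto ansi yyy extends [mscorlib]System.Object
{
    .method public hidebysig specialname rtspecialname instance void .ctor() cil managed
    {
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ret
    }
    // newslot virtual: abc starts a new slot in the vtable.
    .method public hidebysig newslot virtual instance void abc() cil managed
    {
        ldstr "yyy abc"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}

.class private auto ansi xxx extends yyy
{
    .method public hidebysig specialname rtspecialname instance void .ctor() cil managed
    {
        ldarg.0
        call instance void yyy::.ctor()
        ret
    }
    // virtual without newslot: overrides yyy::abc.
    .method public hidebysig virtual instance void abc() cil managed
    {
        ldstr "xxx abc"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}

.class private auto ansi zzz extends [mscorlib]System.Object
{
    .method public static void vijay() cil managed
    {
        .entrypoint
        .locals init (class yyy a)
        newobj instance void xxx::.ctor()   // run time type is xxx
        stloc.0
        ldloc.0
        callvirt instance void yyy::abc()   // dispatches on the object: xxx::abc
        ldloc.0
        call instance void yyy::abc()       // calls exactly the method named: yyy::abc
        ret
    }
}
```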
We have pulled out this program from an earlier chapter, where we explained new, override and virtual functions. The callvirt instruction calls the function abc from xxx, as it overrides the one from the class yyy.
The reason being, in the class xxx, there is no modifier newslot for the function abc, hence it is not a different abc from the one in the base class; it overrides it. With call however, the instruction simply calls abc from the class specified, as it does not understand modifiers like virtual, newslot etc. instance is used with callvirt as the this pointer, under no circumstances, can be NULL.
In the above example, the super class function abc from the class yyy is called from the function abc of class xxx. This facilitates reusing code defined in the super class.
A virtual function may want to call all code in the base class. In IL parlance, it is termed as a super call. In the above code, we foresee a problem with callvirt as it will either call itself over and over again, or give us the following exception:
The reason for the above error is that the this pointer refers to an instance of class xxx and not of class yyy. Thus, the instruction call is used and not callvirt.
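A super call can be sketched as below. The class names follow the text; the printed strings and the assembly name `super` are assumptions for illustration. The key line is the call inside xxx::abc: replacing it with callvirt would dispatch on the run time type of the this pointer and re-enter xxx::abc.

```il
.assembly extern mscorlib {}
.assembly super {}

.class private auto ansi yyy extends [mscorlib]System.Object
{
    .method public hidebysig specialname rtspecialname instance void .ctor() cil managed
    {
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ret
    }
    .method public hidebysig newslot virtual instance void abc() cil managed
    {
        ldstr "yyy abc"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}

.class private auto ansi xxx extends yyy
{
    .method public hidebysig specialname rtspecialname instance void .ctor() cil managed
    {
        ldarg.0
        call instance void yyy::.ctor()
        ret
    }
    .method public hidebysig virtual instance void abc() cil managed
    {
        ldstr "xxx abc"
        call void [mscorlib]System.Console::WriteLine(string)
        ldarg.0                          // this pointer
        call instance void yyy::abc()    // super call: non-virtual, runs yyy::abc once
        ret
    }
}

.class private auto ansi zzz extends [mscorlib]System.Object
{
    .method public static void vijay() cil managed
    {
        .entrypoint
        newobj instance void xxx::.ctor()
        callvirt instance void yyy::abc()   // runs xxx::abc, which then calls yyy::abc
        ret
    }
}
```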
We have created an object like zzz using newobj. It places a reference to a zzz on the stack. The this pointer then calls the instance function abc.
Here we have displayed "hi" and then an instance method pqr is called using the jmp instruction.
After the method pqr finishes execution, control does not return to method abc. Instead, control returns to vijay, which is the method that called abc. Thus the string "bye", which comes after the jmp, does not get displayed.
The jmp instruction does not return control to the method from where the program initially branched out.
The above program is similar to its predecessor, but it uses the instruction jmpi instead of jmp. This instruction is similar to jmp, but differs in the following aspects:
• In the case of the jmp instruction, we supplied the method signature as a parameter to the instruction.
• In the case of the jmpi instruction, we first use the instruction ldftn to load the address of the function pqr on the stack, and then call jmpi.
The jmp family of instructions executes a jump or a branch across a method. We can only jump to the beginning of a method, and not to anywhere inside it. The signature of the method that we intend to jump to must be the same.
If the signature of the method being jumped to is not the same, the above exception is thrown. The jmp instruction is not verifiable.
The method abc takes two ints as parameters. We have placed the constant 3 on the stack, and then used the instruction starg to change the parameter j. Then, ldarg is used to place the new value on the stack. Thereafter, we have called the WriteLine function to confirm if the new value is 3. The jmp instruction is the next to be called.
Here we have not placed any parameters on the stack. The jmp instruction first places the numbers 1 and 2 on the stack, and then calls the function pqr, which simply displays the parameters that have been passed.
Even though we have changed the parameter j, the change is not reflected in the called function pqr. This is contrary to what the documentation states. The call does not pass parameters to the next method. The instruction jmp does so.
If function pqr returns a value, it will be passed to the function vijay and not to abc. We cannot place any values on the stack before executing the jump. Jumps can be executed only between methods that have the same signatures.
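The control flow of a jmp can be sketched with static methods as follows. The method names follow the text; the strings, the assembly name `jump` and the use of static rather than instance methods are assumptions made to keep the sketch short. It does not use starg, so it makes no claim about modified arguments.

```il
.assembly extern mscorlib {}
.assembly jump {}

.class private auto ansi zzz extends [mscorlib]System.Object
{
    .method public static void vijay() cil managed
    {
        .entrypoint
        ldc.i4.1
        ldc.i4.2
        call void zzz::abc(int32, int32)
        ldstr "back in vijay"            // pqr returns here, not to abc
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }

    .method public static void abc(int32 i, int32 j) cil managed
    {
        ldstr "hi"
        call void [mscorlib]System.Console::WriteLine(string)
        // The evaluation stack is empty at this point, as jmp requires.
        // The target has the same signature as abc.
        jmp void zzz::pqr(int32, int32)
        ldstr "bye"                      // never reached
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }

    .method public static void pqr(int32 i, int32 j) cil managed
    {
        ldarg.0
        call void [mscorlib]System.Console::WriteLine(int32)
        ldarg.1
        call void [mscorlib]System.Console::WriteLine(int32)
        ret
    }
}
```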
We can call a method indirectly by first placing its address on the stack, and then using the calli instruction. At first, the instruction ldftn places the address of a non-virtual function on the stack. Like in the case of instance functions, the this pointer has to be placed first on the stack, followed by the parameters to the function. When we tried using calli with the address of a virtual function, Windows generated an error.
We use the newobj instruction to create a new instance, and also to call the constructor of a class, which is nothing more than a special instance method.
The only difference between a constructor and an instance call is that the this pointer is not passed to the constructor. newobj first creates the object, and then automatically places the this pointer on the stack.
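The order of operands for an indirect call can be sketched as follows: the this pointer first, then the arguments, then the function address on top. The method pqr, its int32 parameter and the assembly name `indirect` are hypothetical. calli is not verifiable, so the sketch assumes full trust.

```il
.assembly extern mscorlib {}
.assembly indirect {}

.class private auto ansi zzz extends [mscorlib]System.Object
{
    .method public hidebysig specialname rtspecialname instance void .ctor() cil managed
    {
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ret
    }

    .method public static void vijay() cil managed
    {
        .entrypoint
        newobj instance void zzz::.ctor()     // creates the object; this pointer on the stack
        ldc.i4.7                              // argument for pqr
        ldftn instance void zzz::pqr(int32)   // address of a non-virtual method
        calli instance void(int32)            // signature of the method being called
        ret
    }

    .method public instance void pqr(int32 i) cil managed
    {
        ldarg.1                               // ldarg.0 is the this pointer
        call void [mscorlib]System.Console::WriteLine(int32)
        ret
    }
}
```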
The newobj instruction places the this pointer on the stack before calling the constructor. If we desire to call the constructor ourselves, we too need to place the this pointer on the stack.
In the above program, we have changed the value of the field i to 1, then again changed it to 2 using stfld and then displayed this value. Thereafter, we have called the constructor, which changes the value back to 1 again. This proves that a constructor is no different from any other function.
A method definition is called a method head in IL. The head also functions as an interface to other methods. The format of the head is as follows:
• It starts with a number of predefined method attributes.
• These are followed by an optional indication, specifying whether the method is an instance method or not.
• Thereafter, the calling convention is specified.
• This is followed by the return type and a few more optional parameters.
• Finally, we state the name and the parameters to the method and the implementation attributes.
Methods are instance by default. To change the default behavior, we use the modifiers static or virtual. As of today, the return type cannot have any attributes, but who knows what changes may take place tomorrow.
The code for the method is written in the method body. It can incorporate a large number of directives.
The code that we write gets converted into numbers. Every IL instruction is represented by a number. The ldc.i4.3 instruction is known by the number 19 hex. This information is available in the Instruction Set Reference. The directive emitbyte emits an unsigned 8 bit number directly into the code section of the method.
Thus, we can use the opcodes of an IL instruction directly in IL programs.
The return value of the entrypoint function can either be void, int32 or unsigned int32. This value is handed over to the Operating System. A value of ZERO normally indicates success and any other value indicates an error. The entrypoint method is unique, meaning it can have private accessibility, and yet be accessed by the runtime.
The .locals directive is used to create a local variable that can only be accessed from within that method. Thus, it is used to store data that exists only for the duration of a method call. After a method quits, all the memory allocated for a local is reclaimed by IL.
It is faster for the system to allocate memory on the stack, where locals get stored, than to allocate memory on the heap for the fields. We cannot specify attributes for local variables, like we do for parameters.
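The three points above (emitting an opcode by number, returning a value from the entrypoint, and declaring a local) can be combined in one short sketch. The names `vijay`, `i` and the assembly name `emit` are placeholders.

```il
.assembly extern mscorlib {}
.assembly emit {}

.class private auto ansi zzz extends [mscorlib]System.Object
{
    // private, yet the runtime can still start execution here.
    .method private static int32 vijay() cil managed
    {
        .entrypoint
        .locals (int32 i)       // lives only for the duration of this call
        .emitbyte 0x19          // the opcode of ldc.i4.3, written as a raw byte
        stloc.0                 // i = 3
        ldloc.0
        call void [mscorlib]System.Console::WriteLine(int32)
        ldc.i4.0                // value handed to the operating system: success
        ret
    }
}
```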
The .locals directive can be placed at the end of the code and does not have to be placed at the beginning. Thus, in a sense, a forward reference is allowed here.
Remove the comments and a value of zero will be displayed.
There is some overlap in IL. If we use the modifier init in the locals directive, then all the variables will be assigned their default values, depending upon their type. We have touched upon this point earlier.
The same effect is seen when we use the directive .zeroinit. This applies to all the locals in the method.
• If we place the comments, the variable i will be assigned whatever value is present on the stack.
• If we remove the comments, the runtime initialises all the value types to ZERO and all the reference types to NULL.
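The effect of the init modifier can be sketched as follows. Only the init form of the .locals directive is shown; the local names and the assembly name `zero` are assumptions for illustration.

```il
.assembly extern mscorlib {}
.assembly zero {}

.class private auto ansi zzz extends [mscorlib]System.Object
{
    .method public static void vijay() cil managed
    {
        .entrypoint
        // init asks the runtime to give every local its default value
        // before the first instruction of the method runs.
        .locals init (int32 i, string s)
        ldloc.0                  // value type: zero
        call void [mscorlib]System.Console::WriteLine(int32)
        ldloc.1                  // reference type: null
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}
```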
Some of the directives can only be used within certain entities. The directive .zeroinit can only be used within a method and not outside. The assembler checks whether the directive has been used at the right place or not. If not, it generates an error message that is hardly informative.
You may accuse us of being repetitive, but there is no harm in refreshing our memory. Class yyy is a base class and xxx the derived class. We have created a local of type yyy, which is the base class, but initialized it to the class xxx, which is the derived class. A better way to say it is, we are creating an object that looks like xxx, but storing it in a yyy local.
callvirt calls the function abc from the class xxx despite being called through the yyy class. This is because the instruction callvirt resolves its target at runtime. In that environment, the this pointer on the stack is of class xxx, and thus abc from the class xxx is called. The virtual function has its own unique way of deciding on the pointer to be placed on the stack.
If we remove the modifier virtual from the function abc in class xxx, then the function abc will be called from the yyy class. Changing the newobj to yyy does not make a difference, as both the run time and compile time data types should be the same. The run time data type takes precedence over the compile time data type.
We add the modifier newslot in function abc class xxx as follows:
Here, from the point of view of the run time, the function abc is treated as a new function. As there is no connection with the abc of class yyy, they are now treated as two distinct functions. The abc of class yyy is called. Placing the modifier newslot on the function abc in class yyy makes it a new function abc, distinct from any abc inherited from a base class, if one is present. Thus, it makes no difference here.
The above program is pretty large. The only difference between this program and its predecessor is that we have added one more class www derived from xxx. We have created two locals, one each of the types xxx and yyy, but the run time data type of both the locals is a www object.
The functions abc are virtual throughout. When we call the functions abc through callvirt, even though we are using the class prefix xxx and yyy, the function gets called from www. This is so because the run time data type, i.e. www, of the this pointer has been passed.
Then, we make our first small change: we add a newslot to the function abc in class www. The output now reads as follows:
This output has resulted because newslot dissociates the function abc of the class www from the earlier abc functions. Thus, since the abc of class xxx is the newest, it gets called.
Next, we add the modifier newslot to the function abc from class xxx and remove it from the class www. The output now reads as follows:
Isn't the output fascinating? Now you can probably understand why we are revisiting virtual functions.
By adding the modifier newslot to the function abc in class xxx, we are creating two families of abc:
• One that comprises only a single abc in class yyy
• Another made up of abc functions from classes xxx and www.
Thus, in every instance, the last member of the family gets called and, since the first family has only one member, this single member, i.e. the abc of class yyy, gets called.
In the second case, the abc of class www gets called. Now let us add the newslot modifier to function abc in class www, without removing the one from class xxx.
The output now reads as follows:
Now, we have three families of abc functions. Each of them has only one function abc that has nothing to do with the abc functions of the other families.
If we add the modifier newslot to the function abc in class yyy, we will not see any change in the output. This is because we are cutting off abc from its root, from class yyy onwards. There is no function abc in any of the classes that yyy derives from. Hence, there is no change in the output.
If we remove virtual from the function abc in class www, it has the same effect as adding the modifier newslot. The virtual modifier on a function signifies that the address of the function to be called should be read from the vtable. If we remove the virtual modifier from function abc in class xxx, the output will be as follows:
This output has resulted because of the following:
The object created is a www type.
• In the first case, the vtable has the address of a www abc. The vtable stores a single address of every virtual function. The runtime checks for the compile time data type of the pointer and on examining, it looks like yyy. Within yyy, it discovers that function abc is virtual. Thus it looks into the vtable for the address, which turns out to be that of www.
• In the second case, at compile time the type revealed is xxx. But within the class xxx, the function is not virtual and thus, the vtable does not come into play.
Now we remove virtual from the function abc of class yyy only. Remember, we are making only one change at a time. The output now will be as follows:
The same explanation as given earlier applies here too. We hope you will remember us and our brilliant explanation of the concept of virtual. At least, this is how we interpret it, and do not mind being the only ones to do so in this manner.
In IL, the scoping levels do not exhibit similar behavior to those found in traditional languages like C. Here i is created as a new variable each time with the { brace, even though all the variables are moulded together into one large local directive.
Thus we refer to the individual variables i in their respective blocks. The ldloc.0 stands for the first i whereas ldloc.2 stands for the inner i that is visible in the outer brace.
The above program displays different values for the local variable i. The output proves that they are created consecutively in memory.
Whenever you are in doubt, display the value of the variables and clear up the cobwebs in your mind. Thus, scope blocks are also known as syntactic sugar and are only used to increase the readability and to debug code written by others.
Internally, for a variable name, IL begins at the scope we are presently in, and recursively tries to resolve the name of the variable. Thus, even though a declaration hides the name of a variable, we can access it using the index. The scope does not change the lifetime of a variable. All the variables in a method are created when we first enter the method, and die when we exit from it. The variable is always accessible by the zero based index, which is allocated on a "first come first served" basis.
The above program demonstrates how a function accepts a variable number of parameters. Vararg is a calling convention that allows passing of multiple parameters to a function. We have created a variable called it, that looks like System.ArgIterator. We have then loaded its address on the stack using ldloca and then called arglist. This instruction returns an opaque handle, i.e. an unmanaged pointer which represents all the arguments passed to the method. This handle can be passed to other methods but is valid only during the lifetime of the current method. This opaque handle is of the type RuntimeArgumentHandle. The arglist instruction is valid on methods that take a variable number of arguments. The constructor of the value class ArgIterator is called with this handle as a parameter.
Once the value class is instantiated, we place the address of a local variable x on the stack. This is to store the parameter passed to our function. Subsequently, the address of variable it is put on the stack too. A function GetNextArg from class ArgIterator is called that places a typedref on the stack, which is then passed to the function ToObject.
Then, the object is cast to an int32 and unboxed, as we need a value type. This value is copied to the variable x. The vararg is a calling convention, and thus, part of the signature of the method. We are specifying it as part of the call instruction. The ellipsis denotes the end of the fixed parameters and the beginning of the variable number of parameters. This is because a function may want to have a certain fixed number of parameters also.
The other functions of the class ArgIterator can also give us useful information, such as the number of items on the stack.
We use method parameters to enable a method to accept data from the caller. Method parameters are checked for type safety. They make it mandatory for a method to be called with the correct parameters. The Execution Engine enforces the contract between the caller and the called methods.
We are not compelled to assign any name to the parameters. In the above program, we have a local as well as a parameter of type int32 which has no name or id. IL does not seem to care at all. However, the unnamed variables can be referenced only by index.
Parameters can also have attributes, as we shall now see, but these attributes have nothing to do with the signature.
The first attribute to a parameter is opt, which makes it optional. This means that it is not compulsory to pass a parameter to our function.
Always read the fine print. The opt attribute may indicate that the parameter is optional, but it is used for documentation purposes only. The compiler may place the opt attribute on a parameter, so that other tools make sense of it. As far as the runtime is concerned, however, all the parameters are mandatory, and it simply ignores the opt attribute. Thus, opt has no significance for the runtime.
Implementation attributes provide a lot of information about the nature of the method to the runtime. These attributes decide whether the method requires special handling at runtime or not.
You should run the above program with and without the synchronized attribute to appreciate its significance.
The attribute il managed tells the runtime that the method contains IL code that will run in the managed world. We have created two threads, V_1 and V_2. These execute the same function abc from class yyy.
In the function abc, we display numbers from 0 to 3, using a loop. After displaying a number, the Sleep function stalls all operations for 1000 milliseconds. Thus the first thread executes function abc, prints the value 0 and then sleeps. Now the second thread takes advantage of the fact that the first thread is sleeping, and it also displays 0 and falls asleep. This continues till we reach the value 3 and exit from the loop.
With the synchronized attribute, the second thread does not execute the function until the first thread has finished with it. Thus, the second thread has no choice but to wait until the first thread finishes execution. Try implementing the above in C#.
What we are trying to say is that if C# does not incorporate a feature of IL, there is no way you can use it in any .cs program.
If a code implementation attribute is not given, the default value is il managed. The other three options are native, optil and runtime. These are mutually exclusive. The runtime attribute specifies that the implementation of the code will be supplied by the runtime, and not by the programmer. We cannot place any code in this type of a method. It is used for constructors and delegates.
On running the ‘a.exe’ executable, three message boxes pop up with the following message:
The program reported the above errors on the introduction of the new attribute optil. It clearly says that it could not find a particular dll. The attribute optil means that the code is an optimized IL code that runs faster.
We normally end all our attributes for a method with the qualifier managed or unmanaged. The default value is managed. This signifies who will manage the execution of the method.
• Managed signifies that the CLR will manage it.
• Unmanaged signifies that someone else will manage it.
If we use the unmanaged attribute with pure IL code we get the above exception.
There are over a trillion lines of code already written in the programming language C, under the Windows Operating System. This code resides in files called dll's or Dynamic Link Libraries. To ensure that this code is also available to programs written in IL, C# provides an attribute called DllImport.
To be technically accurate, code written in a dll has nothing to do with a programming language. Once we obtain a dll, there is no way one can detect which programming language it was originally written in. The C# compiler converts our attribute DllImport to a method. This implies that C# understands attributes and, depending upon the attribute, it generates relevant IL code. The method is called MessageBoxA and has the same parameters that we specified in C#. The added attribute is pinvokeimpl, which is first passed the name of the dll that contains the function.
Then we have a calling convention that has three parameters. The parameters are pushed on the stack before the function gets called. The order of placing parameters on the stack that IL follows is "first written first placed", i.e. from left to right. The winapi calling convention follows the reverse order, i.e. right to left.
Then, the name of the function gets added with a number specifying the size of the parameters on the stack. Finally, who restores the stack, the caller or the callee?
The function MessageBoxA can be called in the same manner that any other static function of IL gets called.
There are two primary ways of calling unmanaged methods:
• One is using pinvokeimpl,
• The other is using IJW (It Just Works).
In IJW, the runtime stays out of our way, and we have to write code for handling everything. We stick to pinvokeimpl, the one we can work with. The runtime will automatically move us from managed to unmanaged code, convert data types and handle all the issues of transition management. The attributes to be used are native and unmanaged, as that is what the documentation recommends. The C# compiler, however, is not familiar with the documentation.
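A pinvokeimpl declaration and its call site can be sketched as follows. This assumes a Windows machine with user32.dll, and it uses the implementation attributes that the C# compiler typically emits (cil managed preservesig) rather than native unmanaged. The parameter names, the strings and the assembly name `pinv` are made up for illustration.

```il
.assembly extern mscorlib {}
.assembly pinv {}

.class private auto ansi zzz extends [mscorlib]System.Object
{
    // No body: the runtime locates MessageBoxA in user32.dll and
    // marshals the arguments during the managed to unmanaged transition.
    .method public hidebysig static pinvokeimpl("user32.dll" winapi)
            int32 MessageBoxA(native int hWnd, string text, string caption, uint32 type)
            cil managed preservesig
    {
    }

    .method public static void vijay() cil managed
    {
        .entrypoint
        ldc.i4.0
        conv.i                  // null window handle as a native int
        ldstr "hi"
        ldstr "title"
        ldc.i4.0                // a plain OK message box
        // Called like any other static method.
        call int32 zzz::MessageBoxA(native int, string, string, uint32)
        pop                     // discard the return value
        ret
    }
}
```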
The above example uses recursion to find out the factorial of a number. It uses the prefix tail., which marks a tail call. Languages such as Lisp and Prolog use tail calls extensively. In a non-tail call, the current stack frame is kept intact, and a new frame is allocated. This means that the stack position changes. In a tail call, the stack frame is replaced with a frame for the function to be called.
When a call terminates with a ret, the control returns to the caller function. In the case of tail calls, control continues to remain with the called method. Since a non-tail call needs to store information as to who the caller is, it uses up memory on the stack, and may limit the amount of recursion that is possible. Thus, tail calls handle recursion more effectively than non-tail calls.
The above program works even without the tail prefix.
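A tail-recursive factorial can be sketched as below. It is not the original listing: the accumulator parameter acc, the method name fact and the assembly name `tailfact` are assumptions, chosen so that the recursive call is the last thing the method does. The tail. prefix must be followed immediately by the call, and the call by ret.

```il
.assembly extern mscorlib {}
.assembly tailfact {}

.class private auto ansi zzz extends [mscorlib]System.Object
{
    .method public static void vijay() cil managed
    {
        .entrypoint
        ldc.i4.5
        ldc.i4.1
        call int32 zzz::fact(int32, int32)    // 5! with accumulator 1, i.e. 120
        call void [mscorlib]System.Console::WriteLine(int32)
        ret
    }

    // fact(n, acc) returns n! * acc.
    .method public static int32 fact(int32 n, int32 acc) cil managed
    {
        ldarg.0
        ldc.i4.1
        bgt.s recurse           // n > 1: keep going
        ldarg.1                 // n <= 1: the accumulator holds the result
        ret
    recurse:
        ldarg.0
        ldc.i4.1
        sub                     // n - 1
        ldarg.0
        ldarg.1
        mul                     // n * acc
        tail.                   // the current frame may be discarded
        call int32 zzz::fact(int32, int32)
        ret
    }
}
```

Removing the tail. line leaves the result unchanged; each recursive call then keeps its own frame on the stack until the innermost call returns.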