Common Intermediate Language

Article Id: WHEBN0000046004
Title: Common Intermediate Language  
Author: World Heritage Encyclopedia
Language: English
Subject: .NET Framework, Common Language Infrastructure, List of CIL instructions, Metadata (CLI), Common Language Runtime
Collection: Assembly Languages, Common Language Infrastructure
Publisher: World Heritage Encyclopedia

Common Intermediate Language (CIL, pronounced either "sil" or "kil"), formerly called Microsoft Intermediate Language (MSIL), is the lowest-level human-readable programming language defined by the Common Language Infrastructure (CLI) specification and is used by the .NET Framework and Mono. Languages that target a CLI-compatible runtime environment compile to CIL, which is assembled into object code that has a bytecode-style format. CIL is an object-oriented assembly language and is entirely stack-based. Its bytecode is translated into native code or, most commonly, executed by a virtual machine.

CIL was originally known as Microsoft Intermediate Language (MSIL) during the beta releases of the .NET languages. Due to standardization of C# and the Common Language Infrastructure, the bytecode is now officially known as CIL.[1]

Contents

  • 1 General information
  • 2 Instructions
  • 3 Computational model
    • 3.1 Object-oriented concepts
    • 3.2 Metadata
  • 4 Example
  • 5 Generation
  • 6 Execution
    • 6.1 Just-in-time compilation
    • 6.2 Ahead-of-time compilation
  • 7 Pointer instructions - C++/CLI
  • 8 See also
  • 9 External links
  • 10 References

General information

During compilation of CLI programming languages, the source code is translated into CIL code rather than into platform- or processor-specific object code. CIL is a CPU- and platform-independent instruction set that can be executed in any environment supporting the Common Language Infrastructure,[2] such as the .NET runtime on Windows or the cross-platform Mono runtime. In theory, this eliminates the need to distribute different executable files for different platforms and CPU types. CIL code is verified for safety at run time, providing better security and reliability than natively compiled executable files.

The execution process looks like this:

  1. Source code is converted to CIL, the CLI's equivalent of assembly language for a CPU.
  2. CIL is then assembled into bytecode, and a CLI assembly is created.
  3. Upon execution of a CLI assembly, its code is passed through the runtime's JIT compiler to generate native code. Ahead-of-time compilation may also be used, which eliminates this step at the cost of executable-file portability.
  4. The computer's processor executes the native code.

Instructions

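CIL bytecode has instructions for several groups of tasks. Those that appear in the examples in this article include loading and storing values (ldloc, stloc, ldarg, ldc), arithmetic (add, rem), object creation (newobj), control transfer (br, brfalse, blt), method invocation and return (call, ret), and pointer manipulation (ldind, stind, ldloca).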

Computational model

The Common Intermediate Language is object-oriented and stack-based. This means that operands are pushed onto a stack rather than read from registers, as they are in most CPU architectures.

In x86 assembly language, adding the contents of two registers might look like this:

add eax, edx

The corresponding code in IL, operating on two local variables, can be rendered as this:

ldloc.0
ldloc.1
add
stloc.0    // a = a + b or a += b;

Here, two local variables are pushed onto the stack. When the add instruction executes, both operands are popped and the result is pushed. The result is then popped and stored in the first local variable.
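
The push and pop behaviour described above can be modelled with a toy evaluator. The following Python snippet is only an illustrative sketch, not how a CLI runtime is implemented; the function name run and the list used to hold the local variables are assumptions made for this example.

```python
def run(instructions, local_vars):
    """Evaluate a tiny subset of CIL (ldloc.N, stloc.N, add) on an operand stack."""
    stack = []
    for op in instructions:
        if op.startswith("ldloc."):
            # Push a copy of local variable N onto the stack.
            stack.append(local_vars[int(op.split(".")[1])])
        elif op.startswith("stloc."):
            # Pop the top of the stack into local variable N.
            local_vars[int(op.split(".")[1])] = stack.pop()
        elif op == "add":
            # Pop both operands, push their sum.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return local_vars, stack


# a = 2, b = 3; the four instructions compute a = a + b.
final_locals, final_stack = run(["ldloc.0", "ldloc.1", "add", "stloc.0"], [2, 3])
print(final_locals, final_stack)  # [5, 3] []
```

The stack is empty again once the sequence finishes, which is the usual state between statements in code of this kind.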

Object-oriented concepts

The stack-based model extends to object-oriented concepts as well: code can create objects, call methods, and use other kinds of members such as fields.

CIL is designed to be object-oriented, and every method must (with some exceptions) reside in a class, as the following static method does:

.class public Foo
{
    .method public static int32 Add(int32, int32) cil managed
    {
        .maxstack 2
        ldarg.0 // load the first argument;
        ldarg.1 // load the second argument;
        add     // add them;
        ret     // return the result;
    }
}

Because this method is static, it belongs to the class itself and can be called without creating an instance of Foo. In C#, it can be used like this:

int r = Foo.Add(2, 3);    // 5

In CIL:

ldc.i4.2
ldc.i4.3
call int32 Foo::Add(int32, int32)
stloc.0
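
How call consumes its arguments from the stack and leaves the return value there can be sketched in Python. This is a simplified model under stated assumptions: the METHODS table, the tuple encoding of instructions, and the run function are hypothetical names invented for the sketch, and a real runtime resolves Foo::Add through metadata rather than a dictionary.

```python
def add(a, b):
    return a + b


# Hypothetical method table: name -> (callable, number of arguments).
METHODS = {"Foo::Add": (add, 2)}


def run(instructions):
    """Execute ldc.i4.N, call and stloc.N; return the locals as a dict."""
    stack, local_vars = [], {}
    for op, *operand in instructions:
        if op.startswith("ldc.i4."):
            # Push the small integer constant encoded in the opcode name.
            stack.append(int(op.rsplit(".", 1)[1]))
        elif op == "call":
            func, arg_count = METHODS[operand[0]]
            # Arguments were pushed left to right, so they pop in reverse order.
            args = [stack.pop() for _ in range(arg_count)][::-1]
            stack.append(func(*args))  # the return value replaces the arguments
        elif op.startswith("stloc."):
            local_vars[int(op.rsplit(".", 1)[1])] = stack.pop()
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return local_vars


result = run([("ldc.i4.2",), ("ldc.i4.3",), ("call", "Foo::Add"), ("stloc.0",)])
print(result)  # {0: 5}
```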

Instance classes

An instance class contains at least one constructor and some instance members. The following class has a set of methods representing the actions of a Car object.

.class public Car
{
    .method public specialname rtspecialname instance void .ctor(int32, int32) cil managed
    {
        /* Constructor */
    }

    .method public void Move(int32) cil managed
    {
        /* Omitting implementation */
    }

    .method public void TurnRight() cil managed
    {
        /* Omitting implementation */
    }

    .method public void TurnLeft() cil managed
    {
        /* Omitting implementation */
    }

    .method public void Brake() cil managed
    {
        /* Omitting implementation */
    }
}

Creating objects

In C#, class instances are created like this:

Car myCar = new Car(1, 4); 
Car yourCar = new Car(1, 3); 

These statements correspond roughly to the following instructions:

ldc.i4.1
ldc.i4.4
newobj instance void Car::.ctor(int32, int32)
stloc.0    // myCar = new Car(1, 4);
ldc.i4.1
ldc.i4.3
newobj instance void Car::.ctor(int32, int32)
stloc.1    // yourCar = new Car(1, 3);

Invoking instance methods

In C#, an instance method is invoked like this:

myCar.Move(3);

In CIL:

ldloc.0    // Load the object "myCar" on the stack
ldc.i4.3
call instance void Car::Move(int32)

Metadata

The CLI records information about compiled classes as metadata. Like the type library in the Component Object Model, this enables applications to support and discover the interfaces, classes, types, methods, and fields in the assembly. The process of reading such metadata is called reflection.

Metadata can also take the form of attributes. Custom attributes are defined by extending the Attribute class. This allows the creator of a class to adorn it with extra information that consumers of the class can use in whatever way suits the application domain.

Example

Below is a basic Hello, World program written in CIL. It will display the string "Hello, world!".

.assembly Hello {}
.assembly extern mscorlib {}
.method static void Main()
{
    .entrypoint
    .maxstack 1
    ldstr "Hello, world!"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}

The following example uses a larger number of opcodes.

This code can also be compared with the corresponding code in the article about Java bytecode.

static void Main(string[] args)
{
    for (int i = 2; i < 1000; i++)
    {
        for (int j = 2; j < i; j++)
        {
             if (i % j == 0)
                 goto outer;
        }
        Console.WriteLine(i);
outer: ;    // a label must be followed by a statement
    }
}

In CIL syntax it looks like this:

.method private hidebysig static void Main(string[] args) cil managed
{
    .entrypoint
    .maxstack  2
    .locals init (int32 V_0,
                  int32 V_1)

              ldc.i4.2
              stloc.0
              br.s       IL_001f
    IL_0004:  ldc.i4.2
              stloc.1
              br.s       IL_0011
    IL_0008:  ldloc.0
              ldloc.1
              rem
              brfalse.s  IL_001b
              ldloc.1
              ldc.i4.1
              add
              stloc.1
    IL_0011:  ldloc.1
              ldloc.0
              blt.s      IL_0008
              ldloc.0
              call       void [mscorlib]System.Console::WriteLine(int32)
    IL_001b:  ldloc.0
              ldc.i4.1
              add
              stloc.0
    IL_001f:  ldloc.0
              ldc.i4     0x3e8
              blt.s      IL_0004
              ret
}

This is just a representation of what CIL looks like near the VM level. When compiled, the methods are stored in tables and the instructions are stored as bytes inside the assembly, which is a Portable Executable (PE) file.
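
To follow the control flow of the CIL listing above, it can help to run it through a toy interpreter. The Python sketch below handles only the opcodes that appear in that listing and is not a model of a real CLI runtime: the names PROGRAM, parse, and run are assumptions for this example, the call instruction is approximated by appending the value to a list instead of writing to the console, and rem uses Python's % operator, which agrees with CIL only for the non-negative operands used here.

```python
PROGRAM = """
          ldc.i4.2
          stloc.0
          br.s       IL_001f
IL_0004:  ldc.i4.2
          stloc.1
          br.s       IL_0011
IL_0008:  ldloc.0
          ldloc.1
          rem
          brfalse.s  IL_001b
          ldloc.1
          ldc.i4.1
          add
          stloc.1
IL_0011:  ldloc.1
          ldloc.0
          blt.s      IL_0008
          ldloc.0
          call       void [mscorlib]System.Console::WriteLine(int32)
IL_001b:  ldloc.0
          ldc.i4.1
          add
          stloc.0
IL_001f:  ldloc.0
          ldc.i4     0x3e8
          blt.s      IL_0004
          ret
"""


def parse(text):
    """Split the listing into (opcode, operand) pairs and a label -> index map."""
    code, labels = [], {}
    for line in text.strip().splitlines():
        tokens = line.split()
        if tokens[0].endswith(":"):
            labels[tokens[0][:-1]] = len(code)
            tokens = tokens[1:]
        code.append((tokens[0], " ".join(tokens[1:])))
    return code, labels


def run(text):
    code, labels = parse(text)
    stack, local_vars, output = [], [0, 0], []  # V_0 is i, V_1 is j
    pc = 0
    while True:
        op, arg = code[pc]
        pc += 1
        if op in ("ldc.i4.1", "ldc.i4.2"):
            stack.append(int(op[-1]))
        elif op == "ldc.i4":
            stack.append(int(arg, 0))  # accepts the hexadecimal literal 0x3e8
        elif op.startswith("ldloc."):
            stack.append(local_vars[int(op[-1])])
        elif op.startswith("stloc."):
            local_vars[int(op[-1])] = stack.pop()
        elif op in ("add", "rem"):
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "add" else left % right)
        elif op == "br.s":
            pc = labels[arg]
        elif op == "brfalse.s":
            if stack.pop() == 0:
                pc = labels[arg]
        elif op == "blt.s":
            right, left = stack.pop(), stack.pop()
            if left < right:
                pc = labels[arg]
        elif op == "call":
            output.append(stack.pop())  # stands in for Console.WriteLine(int32)
        elif op == "ret":
            return output
        else:
            raise ValueError(f"unsupported opcode: {op}")


primes = run(PROGRAM)
print(primes[:10])  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The collected values are the prime numbers below 1000, matching what the C# source prints.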

Generation

A CIL assembly and instructions are generated by either a compiler or a utility called the IL Assembler (ILAsm) that is shipped with the execution environment.

Assembled IL can also be disassembled into code again using the IL Disassembler (ILDASM). Other tools, such as .NET Reflector, can decompile IL into a high-level language (e.g. C# or Visual Basic). This makes IL a very easy target for reverse engineering, a trait it shares with Java bytecode. However, there are tools that can obfuscate the code so that it is no longer easily readable but still runs.

Execution

Just-in-time compilation

Just-in-time (JIT) compilation involves turning the bytecode into code immediately executable by the CPU. The conversion is performed gradually during the program's execution. JIT compilation provides environment-specific optimization, runtime type safety, and assembly verification. To accomplish this, the JIT compiler examines the assembly metadata for any illegal accesses and handles violations appropriately.
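
The idea that conversion happens gradually can be illustrated with a compile-on-first-call cache. The Python sketch below is a loose analogy under stated assumptions, not a description of any actual JIT compiler: the LazyCompiler class is hypothetical, Python source text stands in for CIL method bodies, and eval stands in for native code generation.

```python
class LazyCompiler:
    """Compile each method body on its first call, then reuse the result."""

    def __init__(self, method_bodies):
        self.method_bodies = method_bodies  # name -> source text (stand-in for CIL)
        self.compiled = {}                  # name -> callable (stand-in for native code)
        self.compile_count = 0

    def call(self, name, *args):
        if name not in self.compiled:
            # First call: translate the body and cache the result.
            self.compiled[name] = eval(self.method_bodies[name])
            self.compile_count += 1
        return self.compiled[name](*args)


runtime = LazyCompiler({"Add": "lambda a, b: a + b", "Neg": "lambda a: -a"})
print(runtime.call("Add", 2, 3))  # 5
print(runtime.call("Add", 4, 5))  # 9
print(runtime.compile_count)      # 1
```

Add is translated once and reused on the second call, while Neg is never translated because it is never called.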

Ahead-of-time compilation

CLI-compatible execution environments also offer the option of ahead-of-time (AOT) compilation of an assembly, which makes it execute faster by removing the JIT process at runtime.

In the .NET Framework, a special tool called the Native Image Generator (NGEN) performs AOT compilation. Mono also has an option for AOT compilation.

Pointer instructions - C++/CLI

A major difference from Java bytecode is that CIL includes ldind, stind, ldloca, and several call instructions, which are sufficient for the data and function pointer manipulation needed to compile C and C++ code into CIL.

class A {
   public: virtual void __stdcall meth() {}
};
void test_pointer_operations(int param) {
        int k = 0;
        int * ptr = &k;
        *ptr = 1;
        ptr = &param;
        *ptr = 2;
        A a; A * ptra = &a; ptra->meth();
}

In CIL:

.method assembly static void modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 
        test_pointer_operations(int32 param) cil managed
{
  .vtentry 1 : 1
  // Code size       44 (0x2c)
  .maxstack  2
  .locals ([0] int32* ptr,
           [1] valuetype A* V_1,
           [2] valuetype A* a,
           [3] int32 k)
// k = 0;
  IL_0000:  ldc.i4.0 
  IL_0001:  stloc.3
// ptr = &k;
  IL_0002:  ldloca.s   k // load local's address instruction
  IL_0004:  stloc.0
// *ptr = 1;
  IL_0005:  ldloc.0
  IL_0006:  ldc.i4.1
  IL_0007:  stind.i4 // indirection instruction
// ptr = &param
  IL_0008:  ldarga.s   param // load parameter's address instruction
  IL_000a:  stloc.0
// *ptr = 2
  IL_000b:  ldloc.0
  IL_000c:  ldc.i4.2
  IL_000d:  stind.i4
// A a;  (construct a in place)
  IL_000e:  ldloca.s   a
  IL_0010:  call       valuetype A* modopt([mscorlib]System.Runtime.CompilerServices.CallConvThiscall) 'A.{ctor}'(valuetype A* modopt([mscorlib]System.Runtime.CompilerServices.IsConst) modopt([mscorlib]System.Runtime.CompilerServices.IsConst))
  IL_0015:  pop
// ptra = &a;
  IL_0016:  ldloca.s   a
  IL_0018:  stloc.1
// ptra->meth();
  IL_0019:  ldloc.1
  IL_001a:  dup
  IL_001b:  ldind.i4 // reading the VMT for virtual call
  IL_001c:  ldind.i4
  IL_001d:  calli      unmanaged stdcall void modopt([mscorlib]System.Runtime.CompilerServices.CallConvStdcall)(native int)
  IL_0022:  ret
} // end of method 'Global Functions'::test_pointer_operations
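
The address-taking and indirection instructions in the first part of this listing can be modelled with a flat memory array in which every variable has an address. The Python sketch below is a simplified illustration, not the CLI memory model: the run function, the slots mapping, and the tuple encoding are assumptions for this example, named operands replace numeric indices, and the argument param simply starts at 0.

```python
def run(instructions, slots):
    """slots maps each variable name to its address in a flat memory list."""
    memory = [0] * len(slots)
    stack = []
    for op, *operand in instructions:
        if op == "ldc.i4":
            stack.append(operand[0])
        elif op in ("ldloca", "ldarga"):
            stack.append(slots[operand[0]])          # push the address, not the value
        elif op == "ldloc":
            stack.append(memory[slots[operand[0]]])  # push the value
        elif op == "stloc":
            memory[slots[operand[0]]] = stack.pop()
        elif op == "stind.i4":
            # Pop the value, then the address, and store through the address.
            value, address = stack.pop(), stack.pop()
            memory[address] = value
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return {name: memory[address] for name, address in slots.items()}


program = [
    ("ldc.i4", 0), ("stloc", "k"),                      # k = 0;
    ("ldloca", "k"), ("stloc", "ptr"),                  # ptr = &k;
    ("ldloc", "ptr"), ("ldc.i4", 1), ("stind.i4",),     # *ptr = 1;
    ("ldarga", "param"), ("stloc", "ptr"),              # ptr = &param;
    ("ldloc", "ptr"), ("ldc.i4", 2), ("stind.i4",),     # *ptr = 2;
]
state = run(program, {"ptr": 0, "k": 1, "param": 2})
print(state)  # {'ptr': 2, 'k': 1, 'param': 2}
```

After the sequence, k holds 1 and param holds 2, and ptr holds 2, which in this model is the address of param.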

See also

External links

  • Common Language Infrastructure (Standard ECMA-335)
  • “ECMA C# and Common Language Infrastructure Standards” on MSDN
  • Hello world program in CIL
  • Kenny Kerr's intro to CIL (called MSIL in the tutorial)
  • Speed: NGen Revs Up Your Performance With Powerful New Features -- MSDN Magazine, April 2005

References

  1. ^ "What is Intermediate Language(IL)/MSIL/CIL in .NET". Retrieved 2011-02-17. CIL: ... When we compile [a] .NET project, it [is] not directly converted to binary code but to the intermediate language. When a project is run, every language of .NET programming is converted into binary code into CIL. Only some part of CIL that is required at run time is converted into binary code. DLL and EXE of .NET are also in CIL form. 
  2. ^ Benefits of CIL. Retrieved 2011-02-17. Furthermore, given that CIL is platform-agnostic, .NET itself is platform-agnostic... 