In-depth Understanding of the Entire Process of Java Class Loading

The whole process of Java class loading

A Java class goes through a complete life cycle from being loaded to being unloaded, which involves five stages:

> Loading -> Linking (Verification -> Preparation -> Resolution) -> Initialization (preparation before use) -> Use -> Unloading

The loading stage (apart from custom class loading) and the linking stage are handled entirely by the JVM. As for when a class must be initialized (loading and linking have already completed by then), the JVM specification has strict rules, covering four situations:

1. When one of the four bytecode instructions new, getstatic, putstatic, or invokestatic is executed and the class has not been initialized yet, it is initialized immediately. In practice this covers three situations: instantiating a class with new, reading or setting a static field of a class (except static fields marked final whose values are compile-time constants, since those have already been placed in the constant pool), and calling a static method.

2. When a reflective call is made on a class through the java.lang.reflect package and the class has not been initialized yet, it is initialized immediately.

3. When a class is being initialized and its parent class has not been initialized yet, the parent class is initialized first.

4. When the JVM starts, the user specifies a main class to execute (the class that contains public static void main(String[] args)), and the JVM initializes this class first.
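The first three situations can be observed with a small experiment. The following is only a sketch: every class name in it (InitLog, ByNew, ActiveReferenceDemo, and so on) is made up for illustration, and each static block simply records its class name when it runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: records the order in which the demo classes are initialized.
class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

class ByNew {
    static { InitLog.EVENTS.add("ByNew"); }
}

class ByStaticField {
    static int counter = 1; // not final, so reading it is a real static field access
    static { InitLog.EVENTS.add("ByStaticField"); }
}

class ByStaticMethod {
    static { InitLog.EVENTS.add("ByStaticMethod"); }
    static void touch() { }
}

class ByReflection {
    static { InitLog.EVENTS.add("ByReflection"); }
    static void touch() { }
}

class BaseType {
    static { InitLog.EVENTS.add("BaseType"); }
}

class DerivedType extends BaseType {
    static { InitLog.EVENTS.add("DerivedType"); }
}

class ActiveReferenceDemo {
    // Triggers each kind of active reference once and returns the initialization order.
    static List<String> run() throws ReflectiveOperationException {
        new ByNew();                                   // situation 1: new
        int ignored = ByStaticField.counter;           // situation 1: read a static field
        ByStaticMethod.touch();                        // situation 1: call a static method
        ByReflection.class.getDeclaredMethod("touch")  // the class literal alone does not initialize
                .invoke(null);                         // situation 2: reflective call
        new DerivedType();                             // situation 3: parent first, then child
        return new ArrayList<>(InitLog.EVENTS);
    }

    public static void main(String[] args) throws Exception {
        // prints [ByNew, ByStaticField, ByStaticMethod, ByReflection, BaseType, DerivedType]
        System.out.println(run());
    }
}
```

Each class appears in the log exactly once, at the moment of its first active reference; calling run() again would not add anything, because a class is initialized only once.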

The four situations above are called active references to a class. All other ways of referencing a class are called passive references, and they do not trigger initialization of the class. Here are some examples of passive references:

/** 
 * Passive reference scenario 1
 * Referencing a superclass's static field through a subclass will not cause the subclass to initialize. 
 * @author volador 
 * 
 */ 
class SuperClass{ 
  static{ 
    System.out.println("super class init."); 
  } 
  public static int value=123; 
} 
class SubClass extends SuperClass{ 
  static{ 
    System.out.println("sub class init."); 
  } 
} 
public class test{ 
  public static void main(String[]args){ 
    System.out.println(SubClass.value); 
  } 
}

Output result: "super class init." followed by 123. "sub class init." is not printed.

/** 
 * Passive reference scenario 2
 * Referencing a class through an array reference will not trigger the initialization of this class. 
 * @author volador 
 * 
 */ 
public class test{ 
  public static void main(String[] args){ 
    SuperClass[] s_list=new SuperClass[10]; // creating the array does not initialize SuperClass
  } 
}

Output result: No output

/** 
 * Passive reference scenario 3
 * Constants are stored in the constant pool of the calling class at compile time, and essentially do not reference the class that defines the constant. Therefore, it will not trigger the initialization of the class that defines the constant. 
 * @author root 
 * 
 */ 
class ConstClass{ 
  static{ 
    System.out.println("ConstClass init."); 
  } 
  public final static String value="hello"; 
} 
public class test{ 
  public static void main(String[] args){ 
    System.out.println(ConstClass.value); 
  } 
}

Output result: hello (tip: When compiling, ConstClass.value has been transformed into the constant 'hello' and placed in the constant pool of the test class).

The above concerns the initialization of classes. Interfaces also need to be initialized, and interface initialization differs somewhat from class initialization:

The code above uses static{} blocks to print initialization messages. An interface cannot contain a static{} block, but the compiler still generates a <clinit>() method for an interface to initialize its member variables, just as it does for a class. The real difference lies in the third rule above. Before a class is initialized, all of its parent classes must already be initialized. Initializing an interface, however, does not require its parent interfaces to be initialized; a parent interface is initialized only when it is actually used (for example, when a field of the interface whose value is not a compile-time constant is referenced).

Let's break down the whole process of loading a class: Loading -> Verification -> Preparation -> Resolution -> Initialization

First, loading:

In this stage the virtual machine needs to do three things:

1. Obtain the binary byte stream that defines a class through the class's fully qualified name.

2. Convert the static storage structure represented by this byte stream into the runtime data structure of the method area.

3. Generate a java.lang.Class object representing this class in the Java heap, as the access entry for this data in the method area.

The first step is very flexible, and many technologies build on it, because the source of the binary stream is not restricted:

From a class file -> ordinary file loading

From a zip package -> loading classes from a jar

From the network -> Applet

..........

Compared with the other stages of class loading, the loading stage is the most controllable, because the class loader can be either one provided by the system or one you write yourself; programmers can write their own loaders to control how the byte stream is obtained.
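As a rough illustration of a hand-written loader, the sketch below defines a class from a byte array that the program obtained itself. The names ByteArrayClassLoader, LoadTarget, and CustomLoaderDemo are invented for this example, and it assumes the compiled .class files are available as resources on the classpath (and Java 9 or later for InputStream.readAllBytes).

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class whose bytes will be loaded a second time by the custom loader.
class LoadTarget { }

// Defines one named class from the bytes it was handed; everything else is delegated to the parent.
class ByteArrayClassLoader extends ClassLoader {
    private final String targetName;
    private final byte[] classBytes;

    ByteArrayClassLoader(ClassLoader parent, String targetName, byte[] classBytes) {
        super(parent);
        this.targetName = targetName;
        this.classBytes = classBytes;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(targetName)) {
            return super.loadClass(name, resolve); // normal parent delegation
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                // The byte stream could come from anywhere: a file, a jar, the network...
                c = defineClass(name, classBytes, 0, classBytes.length);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}

class CustomLoaderDemo {
    // Reads the compiled bytes of a class from the classpath (assumes they are available as a resource).
    static byte[] readClassBytes(Class<?> type) throws IOException {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            return in.readAllBytes();
        }
    }

    // Loads a second copy of LoadTarget through the custom loader.
    static Class<?> loadCopy() throws IOException, ClassNotFoundException {
        String name = LoadTarget.class.getName();
        ClassLoader loader = new ByteArrayClassLoader(
                CustomLoaderDemo.class.getClassLoader(), name, readClassBytes(LoadTarget.class));
        return loader.loadClass(name);
    }

    public static void main(String[] args) throws Exception {
        Class<?> copy = loadCopy();
        System.out.println(copy.getName() + " loaded by " + copy.getClassLoader());
        System.out.println("same Class object as LoadTarget.class? " + (copy == LoadTarget.class));
    }
}
```

The copy has the same name as LoadTarget but is a different Class object, because a class is identified by its name together with the loader that defined it. This sketch deliberately skips parent delegation for one class name; a loader that only adds a new source of bytes would normally override findClass instead.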

After the binary stream is obtained, it is stored in the method area in the format required by the JVM, and at the same time a java.lang.Class object is instantiated in the Java heap and associated with that data in the method area.
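That a Class object can exist before the class is initialized can be checked with the three-argument Class.forName, whose second parameter controls whether initialization is requested. This is a small sketch; LoadEvents, LoadedOnly, and LoadWithoutInitDemo are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that collects events in the order they happen.
class LoadEvents {
    static final List<String> EVENTS = new ArrayList<>();
}

class LoadedOnly {
    static { LoadEvents.EVENTS.add("initialized"); }
}

class LoadWithoutInitDemo {
    static List<String> run() throws ClassNotFoundException {
        // A class literal does not initialize the class; it is used here only to get the name.
        String name = LoadedOnly.class.getName();
        ClassLoader loader = LoadWithoutInitDemo.class.getClassLoader();

        // initialize = false: the Class object is returned without running the static block
        Class<?> type = Class.forName(name, false, loader);
        if (!LoadEvents.EVENTS.contains("loaded")) {
            LoadEvents.EVENTS.add("loaded");
        }

        // initialize = true: the static block runs now (if it has not run already)
        Class.forName(name, true, loader);
        return new ArrayList<>(LoadEvents.EVENTS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // prints [loaded, initialized]
    }
}
```

"loaded" is recorded before "initialized", which shows that obtaining the Class object and initializing the class are separate steps.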

After loading completes, verification of the byte stream begins (in practice, parts of verification, such as file format verification, are interleaved with loading):

The purpose of verification is to ensure that the byte stream of the class file meets the JVM's requirements and cannot harm the JVM. If the class file was compiled from pure Java source, it will not contain problems such as jumps to non-existent code locations, because the compiler refuses to produce such code (and operations such as out-of-bounds array access are caught by run-time checks). However, as mentioned above, the class file stream does not necessarily come from Java source code; it may come from the network or elsewhere, and you can even write it by hand in a hex editor. If the JVM did not verify this data, a harmful byte stream could crash the JVM completely.

Verification mainly goes through four steps: file format verification -> metadata verification -> bytecode verification -> symbolic reference verification

File format verification: checks whether the byte stream conforms to the Class file format specification and whether its version can be handled by the current JVM. If it passes, the byte stream is stored in the method area in memory. The following three verification steps all operate on the data in the method area.

Metadata verification: performs semantic analysis on the information described by the bytecode to ensure that it conforms to the requirements of the Java language specification.

Bytecode verification: the most complex step. It analyzes method bodies to ensure that they will not do anything harmful at run time.

Symbolic reference verification: checks that references are real and usable. For example, when the code references other classes, it checks whether those classes actually exist; when the code accesses members of other classes, it checks whether those members are accessible. (This step lays the foundation for the later resolution work.)

The verification phase is very important, but it is not mandatory. If the code being run has been used repeatedly and its reliability is established, you can consider using the -Xverify:none parameter to turn off most class verification and shorten class loading time. (Note that this option is deprecated as of JDK 13.)
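File format verification is easy to provoke: hand the JVM bytes that are not a class file at all. In this sketch RawBytesLoader and FormatCheckDemo are hypothetical names, and "Broken" is just a placeholder class name.

```java
// Hypothetical loader that exposes the protected defineClass for the experiment.
class RawBytesLoader extends ClassLoader {
    Class<?> define(String name, byte[] bytes) {
        return defineClass(name, bytes, 0, bytes.length);
    }
}

class FormatCheckDemo {
    // Tries to define a class from arbitrary bytes and reports what the JVM did with them.
    static String tryDefine(byte[] bytes) {
        try {
            new RawBytesLoader().define("Broken", bytes);
            return "accepted";
        } catch (ClassFormatError e) {
            // The bytes failed the class file format checks
            return "rejected: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // These bytes do not even start with the class file magic number 0xCAFEBABE.
        byte[] garbage = {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07};
        System.out.println(tryDefine(garbage)); // prints rejected: ClassFormatError
    }
}
```

The later verification steps (metadata, bytecode, symbolic references) are harder to trigger by hand, since they need a byte stream that is well-formed but semantically wrong.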

After the above steps are completed, we will enter the preparation phase:

This stage allocates memory for class variables (that is, static variables) and sets their initial values; this memory is allocated in the method area. Note that only static variables are given an initial value here, while instance variables are allocated along with the object when it is instantiated. Setting the initial value of a class variable is not the same as executing its assignment, for example:

public static int value=123;

At this stage the value of value is 0, not 123, because no Java code has started executing yet and 123 is not yet visible. The putstatic instruction that assigns 123 to value is placed in <clinit>() when the program is compiled, so the assignment of 123 to value is only executed during initialization.

There is an exception here:

public static final int value=123;

Here value is set to 123 during the preparation phase. At compile time, javac generates a ConstantValue attribute for this field, and in the preparation phase the JVM assigns value according to that ConstantValue.
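The zero value from the preparation phase can actually be observed by reading a static variable before its assignment has run. The sketch below does this through a helper method (a direct forward reference would be rejected by the compiler); PreparationDemo and its members are illustrative names only.

```java
class PreparationDemo {
    // Runs first in <clinit>(): `value` has not been assigned yet,
    // so the zero set during preparation is still visible.
    static int seenBeforeAssignment = readValue();
    static int value = 123;

    // CONSTANT is a compile-time constant, so it is already 123
    // even though this line comes before its declaration.
    static int constantSeenEarly = readConstant();
    static final int CONSTANT = 123;

    static int readValue() { return value; }
    static int readConstant() { return CONSTANT; }

    public static void main(String[] args) {
        System.out.println(seenBeforeAssignment); // prints 0
        System.out.println(value);                // prints 123
        System.out.println(constantSeenEarly);    // prints 123
    }
}
```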

After preparation comes resolution. Resolution converts the symbolic references in the constant pool (to classes, fields, methods, and so on) into direct references. It is closely tied to the format and content of the Class file, and this article does not go into the details.

The initialization process is the final step of the class loading process:

In the class loading stages before this one, apart from the loading phase, in which the user can take part through a custom class loader, everything is driven entirely by the JVM. Only in the initialization phase does the Java code written in the class actually start to execute.

Recall that in the preparation phase class variables were already assigned their default initial values once; in the initialization phase they are assigned the values written in the program.

In fact, this step is the process of executing the class's <clinit>() method. Let's study the <clinit>() method next:

The <clinit>() method, often called the class constructor, is generated automatically by the compiler, which collects all class variable assignments and the statements in static code blocks and merges them in the order in which they appear in the source file.
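The effect of this source-order merging can be seen in a tiny sketch (ClinitOrderDemo is a made-up name). A static block may assign to a variable declared after it, but the later declaration's own assignment still runs afterwards.

```java
class ClinitOrderDemo {
    static { counter = 5; }           // legal: assigning to a variable declared later
    static int counter = 1;           // runs next and overwrites the 5
    static int doubled;
    static { doubled = counter * 2; } // runs last and sees counter == 1

    public static void main(String[] args) {
        System.out.println(counter); // prints 1
        System.out.println(doubled); // prints 2
    }
}
```

Reading counter inside the first static block (rather than assigning to it) would be rejected by the compiler as an illegal forward reference.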

The <clinit>() method differs from an instance constructor: it does not need to call the superclass's <clinit>() method explicitly. The JVM guarantees that the parent class's <clinit>() method has finished before the subclass's <clinit>() method runs, which means that the first <clinit>() method executed in the JVM is that of java.lang.Object.

Let's take an example to illustrate this:

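// Outer class added so that the static nested classes compile; its name is arbitrary 
public class ClinitDemo{ 
  static class Parent{ 
    public static int A=1; 
    static{ 
      A=2; 
    } 
  } 
  static class Sub extends Parent{ 
    public static int B=A; 
  } 
  public static void main(String[] args){ 
    System.out.println(Sub.B); // prints 2 
  } 
}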

First, Sub.B reads static data, so the Sub class needs to be initialized. Before that, its superclass Parent must be initialized. After Parent is initialized, A=2, so B=2. The process above is conceptually equivalent to the following pseudocode:

// Pseudocode only: <clinit>() is generated by the compiler and cannot be written in Java source 
static class Parent{ 
  <clinit>(){ 
    public static int A=1; 
    static{ 
      A=2; 
    } 
  } 
} 
static class Sub extends Parent{ 
  <clinit>(){ // the JVM runs the superclass's <clinit>() first before executing here 
    public static int B=A; 
  } 
} 
public static void main(String[] args){ 
  System.out.println(Sub.B); 
}

The <clinit>() method is not necessary for classes or interfaces; if there are no assignments to class variables and no static code blocks in the class or interface, the <clinit>() method will not be generated by the compiler.

Although static{} blocks cannot exist within an interface, an interface may still assign values when its variables are initialized, so a <clinit>() method is generated for interfaces as well. However, unlike classes, the <clinit>() method of the parent interface does not have to run before the <clinit>() method of the child interface; the parent interface is initialized only when the variables defined in it are used.

In addition, a class that implements an interface does not execute the interface's <clinit>() method when the class itself is initialized (since Java 8, an interface that declares default methods is an exception and is initialized along with the implementing class).
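Both interface rules can be checked with a short experiment. This is a sketch with invented names (IfaceLog, ParentIface, ChildIface, IfaceImpl); the interface fields call a method so that they are not compile-time constants and the interfaces really have a <clinit>() to run.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: records which type was initialized and passes the value through.
class IfaceLog {
    static final List<String> EVENTS = new ArrayList<>();
    static int mark(String name, int value) {
        EVENTS.add(name);
        return value;
    }
}

interface ParentIface {
    int PARENT_VALUE = IfaceLog.mark("ParentIface", 1); // not a compile-time constant
}

interface ChildIface extends ParentIface {
    int CHILD_VALUE = IfaceLog.mark("ChildIface", 2);
}

// ParentIface declares no default methods, so initializing this class leaves it alone.
class IfaceImpl implements ParentIface {
    static { IfaceLog.mark("IfaceImpl", 0); }
}

class InterfaceInitDemo {
    static List<String> run() {
        new IfaceImpl();                       // initializes IfaceImpl only
        int child = ChildIface.CHILD_VALUE;    // initializes ChildIface, not its parent
        int parent = ParentIface.PARENT_VALUE; // only now is ParentIface initialized
        return new ArrayList<>(IfaceLog.EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [IfaceImpl, ChildIface, ParentIface]
    }
}
```

If interfaces followed the class rule, ParentIface would appear first in the log; instead it appears last, when its own field is finally read.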

In addition, the JVM guarantees that a class's <clinit>() method is correctly locked and synchronized in a multi-threaded environment, because initialization is executed only once.

Let's illustrate this with an example:

class DeadLoopClass { 
  static{ 
    // if(true) keeps javac from rejecting an initializer that can never complete normally 
    if(true){ 
      System.out.println("Initialized by ["+Thread.currentThread()+"], entering an infinite loop"); 
      while(true){} 
    } 
  } 
} 
// main lives in a separate class (the name is arbitrary); if it were inside DeadLoopClass, 
// the main thread itself would get stuck in the static block before main could run 
public class DeadLoopTest { 
  public static void main(String[] args) { 
    System.out.println("toplaile"); 
    Runnable run=new Runnable(){ 
      @Override 
      public void run() { 
        System.out.println("["+Thread.currentThread()+"] We are going to instantiate that class"); 
        DeadLoopClass d=new DeadLoopClass(); 
        System.out.println("["+Thread.currentThread()+"] The initialization work of that class is completed"); 
      }}; 
    new Thread(run).start(); 
    new Thread(run).start(); 
  } 
}

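When this runs, the thread that is executing the static initializer never finishes, and any other thread that needs the class blocks waiting for initialization to complete.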
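A variant that terminates makes the "only once" guarantee easier to check. In this sketch (InitCounter, SlowInit, and InitOnceDemo are hypothetical names) several threads race to use a class whose static block is slow, and a counter records how many times the block actually ran.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counter kept outside the class under test.
class InitCounter {
    static final AtomicInteger RUNS = new AtomicInteger();
}

class SlowInit {
    static {
        InitCounter.RUNS.incrementAndGet();
        try {
            Thread.sleep(200); // keep the initializer busy so the other threads have to wait
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
    static void touch() { }
}

class InitOnceDemo {
    // Starts several threads that all use SlowInit, then reports how often its static block ran.
    static int run() throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> SlowInit.touch()); // first use triggers initialization
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return InitCounter.RUNS.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("static block ran " + run() + " time(s)"); // prints static block ran 1 time(s)
    }
}
```

One thread runs the static block while the others wait; once it finishes, the waiting threads continue without running the block again.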
