Getting started with Java bytecode

2016/05/18 18:30

Introduction to bytecode

Hello world

 public class Helloworld {
     public static void main(String[] args) {
         System.out.println("hello,world");
     }
 }

To view the bytecode of this class, run javap on the compiled class file

javap -c Helloworld.class

The output is

 public class com.beetl.myos.ch1.Helloworld {
   public com.beetl.myos.ch1.Helloworld();
     Code:
        0: aload_0
        1: invokespecial #8                  // Method java/lang/Object."<init>":()V
        4: return

   public static void main(java.lang.String[]);
     Code:
        0: getstatic     #16                 // Field java/lang/System.out:Ljava/io/PrintStream;
        3: ldc           #22                 // String hello,world
        5: invokevirtual #24                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        8: return
 }

Note

javap is the class file disassembler that ships with the JDK. Make sure that JAVA_HOME is set correctly and that the bin directory under JAVA_HOME is on your PATH.

javap prints two parts. The first is the constructor, which consists of three bytecode instructions. If you are familiar with Java programming, you know that although Helloworld declares no constructor, the compiler supplies a default one. The disassembly shows it as a method named <init>, which is generated when the source is compiled into a class file. Its three instructions are aload_0, invokespecial, and return.

  1. aload_0 tells the virtual machine to push the local variable this onto the operand stack. For each method (at the bytecode level a constructor is just another method), the parameters and the variables declared in the body are fixed at compile time and stored in the local variable table of the method's stack frame in order of appearance. In an instance method or constructor, slot 0 always holds this; the following slots hold the declared parameters, and the slots after those hold the method's own local variables. The data structures behind the instructions are covered in more detail in the next section. For now it is enough to know that aload_0 pushes the first entry of the local variable table, here this, onto the operand stack

  2. invokespecial #8 tells the virtual machine to call the method that constant pool entry #8 refers to, which javap prints as java/lang/Object."<init>":()V. invokespecial needs an object reference on the operand stack, namely the this that aload_0 just pushed. The operand after invokespecial is the constant pool index of the <init> method. As its name suggests, this instruction is used only for a few special kinds of calls: instance initialization methods, private methods, and superclass methods. Just before the call, the operand stack looks like this

Operand stack

Index   Content
0       this

  3. return returns from the method without a value. To return an object reference, use areturn; to return an int, use ireturn. Those instructions require the return value to be on the operand stack.
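The default constructor described above can also be written out by hand. The following is a minimal sketch, assuming a class name HelloworldExplicit that is not in the original. An explicit no-argument constructor that only calls super() should compile to the same three instructions as the generated one, although the constant pool indexes may differ.

```java
// Sketch only: HelloworldExplicit is a hypothetical name used for illustration.
public class HelloworldExplicit {

    // Written by hand, but equivalent to the constructor the compiler
    // would otherwise generate for a public class.
    public HelloworldExplicit() {
        super(); // aload_0, then invokespecial java/lang/Object."<init>":()V
    }            // the implicit end of the constructor becomes: return

    public static void main(String[] args) {
        System.out.println("hello,world");
    }
}
```

Running javap -c on this class is a quick way to compare the hand-written constructor with the generated one.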

The second part is the main method, for which javap prints four instructions

  1. getstatic #16 pushes the value of a static field onto the operand stack; #16 is the constant pool entry for the field System.out

  2. ldc #22 pushes a reference to the string constant at constant pool entry #22 onto the operand stack, because System.out.println also needs an argument.

  3. invokevirtual is the instruction commonly used for ordinary instance method calls. Its operand #24 is the constant pool entry for the method java/io/PrintStream.println:(Ljava/lang/String;)V. invokevirtual pops the argument and the object reference beneath it from the operand stack and calls the method on that object. Just before the call, the operand stack should look like this

Operand stack

Index   Content
0       Reference to System.out
1       Reference to the hello,world string

  4. return returns from the method without a value.
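To make the order of the four instructions easier to follow, the single println statement can be split into one Java statement per step. This is only an illustration: the class name MainSteps and the local variables out and text are assumptions that are not in the original, and because of those extra locals this version compiles to slightly longer bytecode (it adds store and load instructions).

```java
import java.io.PrintStream;

// Sketch only: MainSteps, out and text are hypothetical names for illustration.
public class MainSteps {
    public static void main(String[] args) {
        PrintStream out = System.out;  // getstatic: read the static field System.out
        String text = "hello,world";   // ldc: load the string constant
        out.println(text);             // invokevirtual: call println on the PrintStream
    }                                  // return: leave the method without a value
}
```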

Operand stack

If you have studied computer organization or know how registers work, it is not hard to see that the operand stack plays a role similar to CPU registers. Once a value is placed in a register, a CPU instruction reads the register and operates on it; JVM bytecode follows a similar principle with the operand stack. For example, aload_0, which we saw earlier, takes the first entry of the local variable table and pushes it onto the operand stack. That entry is usually this (except in static methods). Let's look at a more typical piece of Java code

 public class Ch1Simple {
     public int add(int a, int b) {
         int c = a + b;
         return c;
     }
 }

Run on the command line

javap -c Ch1Simple

The output is as follows. To save space, the constructor bytecode is omitted

 public int add(int, int);
   Code:
      0: iload_1
      1: iload_2
      2: iadd
      3: istore_3
      4: iload_3
      5: ireturn

  • iload_1 pushes the second entry of the local variable table onto the operand stack. The second entry is int a. Again, the first entry is this, the third is int b, and the fourth is int c. This layout is fixed at compile time

At this point the local variable table in the stack frame of the add method looks like this

Local variable table

Index   Content   Instruction that pushes it onto the operand stack
0       this      aload_0
1       a         iload_1
2       b         iload_2
3       c         iload_3

  • iload_2 pushes the third entry of the local variable table, b, onto the operand stack

  • iadd pops the two values on top of the operand stack, adds them, and pushes the result; the i prefix indicates that both operands are of type int

  • istore_3 pops the result from the operand stack and stores it in the local variable table at index 3, that is, the variable c

  • iload_3 pushes variable 3 back onto the operand stack, because the method returns a value and the return instruction takes that value from the operand stack

  • ireturn ends the method: it pops the int on top of the operand stack and returns it to the caller.
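The six instructions can be traced by hand with a small simulation. This is a minimal sketch, not how the JVM is implemented: the class name AddSimulation is an assumption, the local variable table is modelled as an int array, and the operand stack as a Deque. Slot 0, which would hold this, is left unused because the sketch only handles int values.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch only: AddSimulation is a hypothetical name for illustration.
public class AddSimulation {

    // Mimics the bytecode of add(int, int) step by step.
    static int add(int a, int b) {
        int[] locals = new int[4];               // slots: 0 = this (unused), 1 = a, 2 = b, 3 = c
        Deque<Integer> stack = new ArrayDeque<>();
        locals[1] = a;
        locals[2] = b;

        stack.push(locals[1]);                   // 0: iload_1
        stack.push(locals[2]);                   // 1: iload_2
        int right = stack.pop();                 // 2: iadd pops two ints ...
        int left = stack.pop();
        stack.push(left + right);                //    ... and pushes their sum
        locals[3] = stack.pop();                 // 3: istore_3
        stack.push(locals[3]);                   // 4: iload_3
        return stack.pop();                      // 5: ireturn
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));           // prints 5
    }
}
```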

Stack frame

At run time the JVM allocates a thread stack for each thread. Each time the thread invokes a method, a stack frame is created for it, and the frame is destroyed when the method returns. The stack frame contains the local variable table and the operand stack, and the size of both is determined at compile time. Take the same class as an example

 public class Ch1Simple {
     public int add(int a, int b) {
         int c = a + b;
         return c;
     }
 }

There are four variables: this, a, b, and c. The JVM measures the local variable table in slots rather than bytes. this is an object reference and occupies one slot; the int values a, b, and c (32 bits each) also occupy one slot each, so the local variable table needs four slots.

In the add method, iadd needs its two int operands on the operand stack at the same time, which takes two slots, while ireturn needs only one slot for the return value. At no point are more than two values on the operand stack, so an operand stack depth of two is enough. javap -v reports these limits for add as stack=2, locals=4.

Therefore, in general, the stack frame of the add method should look like this

Local variable table

Index   Content   Instruction that pushes it onto the operand stack
0       this      aload_0
1       a         iload_1
2       b         iload_2
3       c         iload_3

Operand stack

Index   Content
0       (empty)
1       (empty)

In addition to the local variable table and the operand stack, a stack frame also holds dynamic linking information.
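The fixed frame layout shown in the tables above, four local variable entries and two operand stack entries, can be sketched as a small data structure. This is an illustration under stated assumptions, not the JVM's actual frame layout: the class FrameSketch and its members are hypothetical, sizes are counted in int-sized slots, and dynamic linking is left out entirely.

```java
// Sketch only: FrameSketch and its members are hypothetical names for illustration.
public class FrameSketch {
    final int[] locals;  // local variable table, one entry per slot
    final int[] stack;   // operand stack with a fixed maximum depth
    int depth = 0;       // number of values currently on the operand stack

    FrameSketch(int maxLocals, int maxStack) {
        locals = new int[maxLocals];
        stack = new int[maxStack];
    }

    void push(int value) {
        if (depth == stack.length) {
            throw new IllegalStateException("operand stack overflow");
        }
        stack[depth++] = value;
    }

    int pop() {
        if (depth == 0) {
            throw new IllegalStateException("operand stack underflow");
        }
        return stack[--depth];
    }

    // Runs the add bytecode inside a frame with 4 local slots and 2 stack slots.
    static int runAdd(int a, int b) {
        FrameSketch frame = new FrameSketch(4, 2);
        frame.locals[1] = a;                     // slot 0 (this) is left unused
        frame.locals[2] = b;
        frame.push(frame.locals[1]);             // iload_1
        frame.push(frame.locals[2]);             // iload_2, depth reaches its maximum of 2
        frame.push(frame.pop() + frame.pop());   // iadd
        frame.locals[3] = frame.pop();           // istore_3
        frame.push(frame.locals[3]);             // iload_3
        return frame.pop();                      // ireturn
    }
}
```

Because both arrays are sized once in the constructor, a frame built with a smaller maxStack would fail on the second push, which shows why the compiler has to record the maximum depth in advance.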
