
风雨夜归人

[Original] Assembly Language Basics -- Translated from the CE 4.4 Help

2009-05-19 16:19:04 | Category: CCB's Tutorials

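I couldn't sleep last night, so in the middle of the night I was reading the CE help file and found that its last section is this tutorial on assembly basics, written by the author of CE. I had long wanted to write something on this topic for everyone, but I never got around to it, and I was never sure how deep to go. When I saw this, I stopped reading, got out of bed, and started translating it as I read. I can understand the text in a minute, but rendering it in clear, standard Chinese is not so easy, and to avoid misleading anyone through my own misunderstanding I had to check quite a few references to keep the translation accurate. It is now more or less finished. The content is fairly simple, but I think reading it will help you a lot later on. If anything in it is unclear, please post your question here and I will expand on it and answer. If your English is good, I strongly recommend reading the CE help yourself, or comparing it against this translation, so that my clumsy English does not mislead you :) Finally, if you want to repost this translation, please ask my permission first, credit the author as CCB, and cite 广海俱乐部 as the source. Thank you.

CCB, 2005-01-24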

=================================================

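Assembly Language Basics

Originally posted by Dark Byte (the author of CE), with additions by Smidge204

Most people think assembly is hard to learn, but it is actually quite simple. In this tutorial I will try to explain how some basic assembly works.

The processor works with memory and registers. Registers are like memory, but much faster. The registers are EAX, EBX, ECX, EDX, ESP, EBP, ESI and EDI, plus the segment registers. (There is also EIP, the instruction pointer, which points to the next instruction to be executed.)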

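Some examples:

sub ebx,eax (ebx=00000005,eax=00000002)

Let's break it down into its most basic parts:

opcode param1,param2

The opcode is the instruction that tells the processor what to do. Here it subtracts the value stored in EAX from the value stored in EBX.

In this example EBX=5 and EAX=2, so after the instruction executes EBX holds 3 (5-2).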

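Also note: whenever you see an opcode with two parameters, the first parameter is the destination of the instruction and the second is the source.

sub [esi+13],ebx (ebx=00000003,esi=008AB100)

In this example the first parameter is in square brackets. That means a memory location is used instead of a register, and the location is given by what is between the brackets, here esi+13 (note that the 13 is hexadecimal).

Because ESI=008AB100, the address referred to is 008AB113.

This instruction subtracts the value held in EBX, which is 3, from the value stored at address 008AB113.

If the value at 008AB113 is 100, then after the instruction executes the value at 008AB113 will be 97.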
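The address calculation and the subtraction can be checked with a small Python model. This is only a sketch: it treats registers and memory as plain dictionaries, which is an assumption made for illustration and not how CE or the processor represents them.

```python
# Toy model of "sub [esi+13],ebx": registers and memory are plain dicts.
regs = {"ebx": 0x00000003, "esi": 0x008AB100}
memory = {0x008AB113: 100}  # the value stored at the target address (decimal 100)

# The bracketed expression is evaluated first; the 13 is hexadecimal.
address = regs["esi"] + 0x13

# destination = destination - source, kept within 32 bits
memory[address] = (memory[address] - regs["ebx"]) & 0xFFFFFFFF

print(hex(address), memory[address])
```

The register EBX is only read here; the memory location is the one that changes, because it is the first (destination) parameter.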

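sub [esi+13],63 (esi=008AB100)

This is almost identical to the previous one, except that it uses an immediate value instead of a register.

Remember that 63 is actually 99, because values in the instruction are always written in hexadecimal.

Suppose the value at 008AB113 is 100 (64 in hexadecimal). After the instruction executes, the value at 008AB113 will be 1 (100-99).

sub ebx,[esi+13] (ebx=00000064 esi=008ab100)

This instruction subtracts the value stored at 008AB113 from the value stored in EBX (ESI+13=008AB100+13=008AB113, in case you forgot).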
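A short Python sketch of the last two examples. The dictionaries are an illustrative model, and running the two instructions back to back is an assumption: the text does not say what 008AB113 holds in the last example, so the sketch reuses the value left by the first one.

```python
regs = {"ebx": 0x64, "esi": 0x008AB100}
memory = {0x008AB113: 100}          # 100 decimal = 64 hexadecimal
addr = regs["esi"] + 0x13

# sub [esi+13],63 : the immediate is written in hexadecimal
immediate = int("63", 16)
memory[addr] -= immediate           # memory is the destination

# sub ebx,[esi+13] : now memory is the source and EBX is the destination
regs["ebx"] -= memory[addr]

print(immediate, memory[addr], hex(regs["ebx"]))
```

Swapping the two parameters changes which side is modified: in the first instruction the memory location changes, in the second only EBX does.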

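So far I have used only SUB, but the processor understands a great many other instructions.

Let's look at MOV, one of the most frequently used instructions. Although its name suggests that it moves data, it only copies data from one location to another.

MOV works exactly like SUB: the first parameter is the destination and the second is the source.

Examples:

MOV eax,ebx (eax=5,ebx=12)

Copies the value stored in EBX into EAX.

So once this instruction has executed, EAX holds 12 (and EBX still holds 12).

MOV [edi+16],eax (eax=00000064, edi=008cd200)

This instruction places the value held in EAX (64 hexadecimal, i.e. 100 decimal) at location EDI+16 (008CD200+16=008CD216).

So after the instruction executes, the value stored at 008CD216 will be 100 (64 hexadecimal).

As you can see, it works just like the SUB instruction.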
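The copy behaviour of MOV can be modelled in a few lines of Python. This is a sketch using dictionaries for registers and memory; reading the 12 in the first example as hexadecimal is an assumption, since the text does not say.

```python
regs = {"eax": 0x5, "ebx": 0x12, "edi": 0x008CD200}
memory = {}

# mov eax,ebx : copy the source into the destination
regs["eax"] = regs["ebx"]
eax_after_copy = regs["eax"]

# mov [edi+16],eax with eax=00000064 : store a register into memory
regs["eax"] = 0x64
memory[regs["edi"] + 0x16] = regs["eax"]

print(hex(eax_after_copy), hex(regs["ebx"]), memory[0x008CD216])
```

The source keeps its value in both cases, which is why MOV is really a copy and not a move.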

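Then there are instructions that take only one parameter, such as INC and DEC.

Examples:

inc eax : increases the value in EAX by 1

dec ecx : decreases the value in ECX by 1

dec [ebp] : decreases the value stored at the memory address EBP points to by 1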

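So far I have covered only the 32-bit registers (EAX, EBX, ECX, ...), but there are also 16-bit and 8-bit registers. The 16-bit registers are AX, BX, CX, DX, SP, BP, SI and DI; the 8-bit registers are AH, AL, BH, BL, CH, CL, DH and DL.

Note that changing AH or AL also changes AX, and changing AX also changes EAX. The same holds for BL+BH+BX+EBX, CH+CL+CX+ECX and DH+DL+DX+EDX.

(CCB's note: take AX as an example. AX is a 16-bit register and AH is an 8-bit register that refers to the high 8 bits of AX, while AL refers to the low 8 bits of AX. 32-bit CPUs added 32-bit registers: EAX extends AX with another 16 bits. For example:

If the value of EAX (in binary) is:

EAX 00000000000000001101000100100111

then

AX                  1101000100100111

and AH and AL are, respectively:

AH                  11010001

AL                          00100111

That is, AX contains AH and AL, and EAX contains AX and therefore AH and AL as well. Programs on Windows, however, rarely use the 8-bit and 16-bit registers.)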
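The same relationship can be written with shifts and masks. A minimal Python sketch, using the binary value from the note above:

```python
eax = 0b00000000000000001101000100100111

ax = eax & 0xFFFF          # low 16 bits of EAX
ah = (eax >> 8) & 0xFF     # high byte of AX
al = eax & 0xFF            # low byte of AX

print(f"{ax:016b} {ah:08b} {al:08b}")
```

AX, AH and AL are not separate storage; they are windows onto parts of EAX.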

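You can use these registers in almost the same way, but they change only 1 byte (8-bit registers) or 2 bytes (16-bit registers) instead of 4 bytes (32-bit registers).

Examples:

dec al : decreases the 8-bit register AL by 1

sub [esi+12],al : subtracts the value of AL from the 1-byte value stored at the location ESI+12 points to

mov al,[esi+13] : places the 1-byte value stored at the location ESI+13 points to into AL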
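To see that an 8-bit operation touches only one byte, here is a Python sketch of "dec al". The helper write_al and the starting value of EAX are hypothetical, chosen only for illustration.

```python
def write_al(eax, value):
    """Replace only the low byte of EAX, leaving the upper 24 bits alone."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)

eax = 0x12345600            # hypothetical starting value, so AL = 00
al = eax & 0xFF

# dec al : 00 - 1 wraps around to FF inside the single byte
eax = write_al(eax, al - 1)

print(hex(eax))
```

The borrow does not spill into the rest of EAX; a 32-bit "dec eax" on the same value would have changed the upper bytes too.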

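Note that an 8-bit or 16-bit register cannot be used here to point to a memory address. For example, mov [al+12],0 is invalid.

There are also 64-bit and 128-bit registers, but I won't discuss them, because they are hardly ever used and cannot be used with the instructions that work with the 32-bit registers.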

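Then there are the jumps, loops and calls.

JMP:

The JMP instruction simply changes the instruction pointer (EIP) to the location JMP points at, and execution continues from there.

There are also conditional jumps, which change the instruction pointer only when a particular condition holds (for example, a condition set by the compare instruction, CMP).

JA = jump if above

JNA = jump if not above

JB = jump if below

JE = jump if equal

JC = jump if carry (the carry flag is set)

and many other conditional jumps.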
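The conditional jumps listed above test flags that an earlier instruction such as CMP left behind. The Python sketch below models only the carry and zero flags for an unsigned comparison; the function names cmp_flags and taken are hypothetical.

```python
def cmp_flags(a, b, bits=32):
    """Model of 'cmp a,b': computes a-b and keeps only the flags."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    return {"ZF": a == b,   # zero flag: the operands are equal
            "CF": a < b}    # carry flag: unsigned borrow, a is below b

def taken(mnemonic, flags):
    """Return True if the given conditional jump would be taken."""
    conditions = {
        "ja": not flags["CF"] and not flags["ZF"],
        "jna": flags["CF"] or flags["ZF"],
        "jb": flags["CF"],
        "je": flags["ZF"],
        "jc": flags["CF"],
    }
    return conditions[mnemonic]

print(taken("ja", cmp_flags(5, 2)), taken("jb", cmp_flags(5, 2)))
```

"Above" and "below" refer to unsigned comparisons; the signed counterparts (JG, JL) look at the sign and overflow flags and are not modelled here.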

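LOOP:

Like a jump, the LOOP instruction points to another memory location to continue execution at. The difference is that it first decreases ECX by 1 and jumps only if ECX is then not 0.

(CCB's note: in other words, ECX is the loop counter. If ECX holds 3 when the loop starts, then after the first pass ECX is automatically decreased by 1 and execution jumps back to repeat the loop; after the second pass ECX is decreased by 1 again; once ECX reaches 0, execution no longer jumps back.)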
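The counting behaviour of LOOP can be sketched in Python. The function run_loop is a hypothetical model in which the loop body does nothing except count how often it runs.

```python
def run_loop(ecx):
    """Count how many times the body runs for 'body ... loop body'."""
    iterations = 0
    while True:
        iterations += 1                 # the loop body executes
        ecx = (ecx - 1) & 0xFFFFFFFF    # LOOP decrements ECX first
        if ecx == 0:                    # ECX reached 0: fall through
            break
    return iterations

print(run_loop(3))
```

Because the decrement happens before the test, starting with ECX=0 does not skip the loop: ECX wraps around to FFFFFFFF and the body runs 2^32 times, so real code usually guards against a zero count first.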

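Of course, there are also conditional loops:

LOOPE: loops while ECX is not 0 and the zero flag is set

LOOPZ: same as LOOPE

LOOPNE: loops while ECX is not 0 and the zero flag is not set

LOOPNZ: same as LOOPNE

(CCB's note: the CPU has another special register, and the zero flag is one bit in it. Many control-transfer instructions, such as jumps and loops, use certain bits of this register as their condition, for example the zero flag here and the carry flag above. A flag is said to be set when its bit is 1 and not set when its bit is 0.)

I should also explain what flags are. They are bits in the processor that can be used to check conditions left by a previous instruction. Take "cmp al,12": if AL=12 the zero flag is set to TRUE, otherwise it is set to FALSE.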
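As an illustration of a conditional loop, the Python sketch below models a search built from a compare followed by LOOPNE, which keeps looping while ECX is not 0 and the zero flag is clear. The function loopne_search and its arguments are hypothetical, and it assumes a non-empty list.

```python
def loopne_search(data, target):
    """Return the index of target in data, or None if it is not found."""
    ecx = len(data)                   # loop counter, as LOOPNE expects
    index = 0
    while True:
        zf = data[index] == target    # cmp sets the zero flag on equality
        index += 1
        ecx -= 1                      # LOOPNE decrements ECX, flags untouched
        if not (ecx != 0 and not zf): # stop when ECX is 0 or ZF is set
            break
    return index - 1 if zf else None

print(loopne_search([4, 8, 15, 16], 15))
```

After the loop, the zero flag tells the two exit reasons apart: set means the value was found, clear means ECX ran out.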

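CALL:

A call is the same as a jump, except that it uses the stack to get back (that is, to return and continue where it left off).

An explanation of the stack:

The stack is the memory location pointed to by the ESP register. You put values on it with the PUSH instruction and take them off with the POP instruction. PUSH decreases ESP and then places the value at the location ESP points to. POP copies the value ESP points to into the location given by its parameter and then increases ESP. In short, the last value pushed is the first one popped, and the second-to-last value pushed is the second one popped.

(CCB: the defining property of the stack is "last in, first out". Picture a single-lane parking garage: the first car parks at the far end, the second car parks behind it, and the third car parks nearest the entrance. When they leave, the third car has to drive out before the second can, and the second has to drive out before the first one can.)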
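A minimal Python model of PUSH and POP. The class name Stack and the starting ESP value are hypothetical, and the sketch assumes 4-byte (32-bit) values.

```python
class Stack:
    def __init__(self, esp):
        self.esp = esp        # stack pointer
        self.memory = {}      # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                   # ESP moves down first
        self.memory[self.esp] = value   # then the value is stored at [ESP]

    def pop(self):
        value = self.memory[self.esp]   # read the value at [ESP]
        self.esp += 4                   # then ESP moves back up
        return value

stack = Stack(esp=0x0012FF00)
for value in (1, 2, 3):
    stack.push(value)
popped = [stack.pop() for _ in range(3)]

print(popped, hex(stack.esp))
```

The stack grows toward lower addresses, which is why PUSH decreases ESP and POP increases it.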

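RET:

When CALL executes, it pushes the address of the next instruction (the one to run after returning) onto the stack and then jumps to the called location. RET (return) jumps back to the pushed address, that is, it sets the instruction pointer to it.

After a while the called code reaches a RET instruction and jumps to the location stored on the stack. (CALL pushes the address of the next instruction; RET pops it again and jumps there.)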
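CALL and RET can be modelled on top of the same push and pop operations. In this Python sketch the class Cpu and all addresses are hypothetical, and the 5-byte instruction length assumes a near CALL with a 32-bit relative target.

```python
class Cpu:
    def __init__(self, eip, esp):
        self.eip = eip        # instruction pointer
        self.esp = esp        # stack pointer
        self.memory = {}

    def push(self, value):
        self.esp -= 4
        self.memory[self.esp] = value

    def pop(self):
        value = self.memory[self.esp]
        self.esp += 4
        return value

    def call(self, target, instruction_length=5):
        # Save the address of the instruction after the CALL, then jump.
        self.push(self.eip + instruction_length)
        self.eip = target

    def ret(self):
        # Pop the saved address and continue there.
        self.eip = self.pop()

cpu = Cpu(eip=0x00401000, esp=0x0012FF00)
cpu.call(0x00402000)
inside = cpu.eip
cpu.ret()

print(hex(inside), hex(cpu.eip))
```

If the called code pushes values without popping them before RET, the wrong value is popped as the return address, which is why pushes and pops have to balance.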

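That is the tutorial on the very basics of assembly. If you have any questions about assembly, just ask and I will try to answer.

If you want more detail, here is a good file:

http://podgoretsky.com/ftp/Docs/Hardware/Processors/Intel/24547111.pdf

Note: it is very useful to understand how the values between brackets work, because you will need them later when using pointers in CE. (Pointers solve the DMA (Dynamic Memory Allocation) problem for most games, provided you can read the assembly instructions that modify the value you found.)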
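The link between the bracket notation and pointers can be sketched as follows: a fixed location holds the base address of an object, and the value sits at base+offset, just like [esi+13]. This Python model is an illustration only; the tiny memory buffer, the addresses and the helper read_u32 are all hypothetical.

```python
import struct

memory = bytearray(64)                        # tiny stand-in for process memory
POINTER_SLOT = 0x08                           # fixed location holding the pointer
OFFSET = 0x13                                 # offset seen in the instruction

struct.pack_into("<I", memory, POINTER_SLOT, 0x20)    # pointer -> object at 0x20
struct.pack_into("<I", memory, 0x20 + OFFSET, 100)    # the value we care about

def read_u32(address):
    """Read a little-endian 32-bit value from the model memory."""
    return struct.unpack_from("<I", memory, address)[0]

base = read_u32(POINTER_SLOT)       # like loading the base register
value = read_u32(base + OFFSET)     # like reading [base+13]

print(hex(base), value)
```

When the object moves on the next run, only the pointer's contents change; following the pointer and adding the same offset still lands on the value.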

----------------------------------------------------------

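The "flags" are a set of bits stored in a special register. If a bit is 1 the flag is said to be "set"; if it is 0 the flag is said to be "clear". Taken together, the flags describe the processor's internal state and give more information about the result of the previous instruction.

There are three kinds of flags: status flags tell you about the result of the last instruction, control flags tell you how the processor will behave, and system flags tell you about the environment your program is running in.

The flag register has 32 bits: (S=status flag, C=control flag, X=system flag)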

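Code:

Bit    Type  Flag
0      S     Carry
1            (reserved)
2      S     Parity
3            (reserved)
4      S     Auxiliary Carry
5            (reserved)
6      S     Zero
7      S     Sign
8      X     Trap
9      X     Interrupt Enable
10     C     Direction
11     S     Overflow
12-13  X     I/O Privilege Level
14     X     Nested Task
15           (reserved)
16     X     Resume
17     X     Virtual 8086
18     X     Alignment Check
19     X     Virtual Interrupt
20     X     Virtual Interrupt Pending
21     X     Identification
22-31        (reserved)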
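Given a raw value of the flag register, the individual status flags can be picked out by bit position. A Python sketch follows; the dictionary and function names are hypothetical, and the sample value 0x246 is only an example.

```python
# Bit positions of the status flags in the flag register.
STATUS_FLAGS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "OF": 11}

def decode_status_flags(eflags):
    """Return a dict telling which status flags are set in eflags."""
    return {name: bool((eflags >> bit) & 1) for name, bit in STATUS_FLAGS.items()}

flags = decode_status_flags(0x246)
print(flags)
```

In the sample value, bits 1, 2, 6 and 9 are 1: bit 1 is a reserved bit that always reads as 1, bits 2 and 6 are the parity and zero flags, and bit 9 is the interrupt enable flag, which the sketch ignores because it is not a status flag.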

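Let's go over the status flags, since those are used most often.

Overflow:

Set when an operation (addition, subtraction, multiplication, etc.) produces a result that does not fit in the register or memory location used when the value is read as a signed number; otherwise it is cleared automatically. For example, adding 1 to an 8-bit register holding 0x7F gives 0x80, which reads as -128 as a signed byte, so the overflow flag is set. The unsigned counterpart of this condition is reported by the carry flag, described below.

Sign:

Set if the result is negative, cleared if it is positive. It mirrors the sign bit of the value. (CCB's note: that is, it equals the most significant bit of the result.)

Zero:

Set if the result of the operation is 0.

Auxiliary Carry:

Similar to the carry flag, but it reports a carry out of the low 4 bits (out of bit 3) instead of out of 8, 16 or 32 bits. It is used for BCD (binary-coded decimal) arithmetic and is of little use otherwise.

Carry:

Set when the bit one past the limit of the register or memory location would have been set. For example, mov al,0xFF followed by add al,1 produces a carry, because a ninth bit would have been set. Note that the zero flag is also set (AL becomes 0) and the sign flag is cleared. The overflow flag stays clear, because read as signed numbers -1 + 1 = 0 fits in a byte.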
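The status flags after an 8-bit addition can be computed by hand. The Python sketch below models "add al,..." for a single byte; the function name add8 is hypothetical, and the flag rules are the standard definitions for an 8-bit ADD rather than anything specific to CE.

```python
def add8(a, b):
    """Model an 8-bit ADD; return the result and the status flags."""
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": total > 0xFF,                          # a ninth bit was needed
        "ZF": result == 0,                           # result is zero
        "SF": bool(result & 0x80),                   # top bit of the result
        "AF": (a & 0xF) + (b & 0xF) > 0xF,           # carry out of bit 3
        # Signed overflow: both inputs share a sign that the result lacks.
        "OF": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags

result, flags = add8(0xFF, 1)       # mov al,0xFF then add al,1
print(hex(result), flags)
```

For 0xFF + 1 the carry, zero and auxiliary carry flags are set while sign and overflow are clear; 0x7F + 1 is the opposite case, with overflow and sign set and carry clear.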

The original text is attached here for comparison.

Basic assembler

Originally posted by Dark Byte + addition by Smidge204

Most people think assembler is very difficult, but in fact it's very easy.

In this tutorial I'll try to explain how some basic assembler works.

The processor works with memory and registers. The registers are like memory but a lot faster than memory. Registers are EAX,EBX,ECX,EDX,ESP,EBP,ESI,EDI, and the segment registers. (There's also EIP, which is the Instruction Pointer. It points to the instruction that is about to be executed)

Some examples:

sub ebx,eax (ebx=00000005,eax=00000002)

Let's take it apart into its most basic elements:

opcode param1,param2

The opcode is the instruction telling the processor what to do, in this case decrease the value stored in register ebx by the value stored in register eax.

In this case ebx=5 and eax=2, so after this instruction ebx would be 3. (5-2)

Also note that whenever you see an opcode with 2 parameters: the first parameter is the target of the instruction. The 2nd is the source.

sub [esi+13],ebx (ebx=00000003,esi=008AB100)

In this case you see the first parameter is between brackets. This indicates that instead of a register a memory location is being used.

The memory location is pointed at by what's in between the brackets, in this case esi+13 (Note that the 13 is in hexadecimal)

ESI=008AB100 so the address pointed at is 008AB113.

This instruction would decrease the value stored at location 008AB113 by the value stored in ebx (which is 3).

If the value at location 008AB113 was 100 then the value stored at 008AB113 after this instruction would be 97.

sub [esi+13],63 (esi=008AB100)

This is almost the same as above but instead of using a register it uses a direct value.

Note that 63 is actually 99 because the instruction is always written using hexadecimal.

Let's say the value at 008ab113 is 100 (which is 64 in hexadecimal); then the value at 008ab113 after execution would be 1 (100-99)

sub ebx,[esi+13] (ebx=00000064 esi=008ab100)

This instruction decreases the value stored in ebx with the value stored at location 008ab113. (esi+13=008ab100+13=008ab113, in case you forgot)

Up until now I've only used SUB as an instruction, but there are lots and lots of other instructions the processor knows.

Let's take a look at MOV, one of the most often used instructions.

Although its name suggests that it moves data, it just COPIES data from one spot to another.

MOV works exactly the same as SUB: the first parameter is the destination, and the second parameter is the source.

examples:

MOV eax,ebx (eax=5,ebx=12)

Copies the value stored in ebx into eax.

So, after this instruction is executed eax would be 12. (and ebx would stay 12)

MOV [edi+16],eax (eax=00000064, edi=008cd200)

This instruction will place the value of eax (64 hex=100 decimal) at the location of edi+16 (008cd200+16=008cd216).

So after the instruction the value stored at 008cd216 will be 100 (64 hex)

As you see, it works just like the SUB instruction.

Then there are also those instructions that only have 1 parameter like inc and dec.

example:

inc eax : increases the value of eax by 1

dec ecx : decreases the value of ecx by 1

dec [ebp] : decreases the value stored at the address pointed to by ebp by 1

Right now I've only shown the 32-bit registers (eax, ebx, ecx, ...) but there are also 16-bit registers and 8-bit registers that can be used.

The 16-bit registers are: AX,BX,CX,DX,SP,BP,SI,DI

The 8-bit registers are: AH,AL,BH,BL,CH,CL,DH,DL

Note that when changing ah or al you'll also change AX, and if you change AX you'll also change EAX. The same goes for bl+bh+bx+ebx, ch+cl+cx+ecx, dh+dl+dx+edx

You can use them almost the same way as the 32-bit registers, but the instructions will only change 1 (8-bit) or 2 (16-bit) bytes, instead of 4 (32-bit) bytes.

example:

dec al : decreases the 8-bit register al by 1

sub [esi+12],al : decreases the 1-byte value stored at the location esi+12 points at by the value of al

mov al,[esi+13] : places the 1-byte value stored at the location esi+13 points at in the al register.

Note that it is IMPOSSIBLE to use a 16 or 8 bit register for instructions that point to an address. eg: mov [al+12],0 will NOT work.

(There are also 64 and 128 bit registers, but I won't discuss them since they are hardly ever used, and can't be used with the other instructions that also work with 32 bit)

Then there are the JUMPS, LOOPS, and CALLS:

JMP:

The JMP instruction is the easiest: it changes the Instruction Pointer (EIP) to the location the JMP instruction points at and continues from there.

There are also conditional jumps that will only change the instruction pointer if a special condition has been met. (for example set using the compare instruction (CMP))

JA=Jump if Above

JNA=Jump if not above

JB=Jump if below

JE=Jump if equal

JC=Jump if carry

and LOTS of other conditional jumps

LOOP:

The loop instruction also points to a memory location, just like the JMP, but it first decreases the ECX register by 1 and only jumps to that location if ECX is then not 0.

And of course, there are also special conditional loops:

LOOPE:Loop while ECX is not 0 AND the zero flag is set

LOOPZ:Same as LOOPE.

LOOPNE:Loop while ECX is not 0 AND the zero flag is not set.

LOOPNZ:Same as LOOPNE

I guess I should also explain what flags are. They are bits in the processor that can be used to check the condition of a previous instruction. Take 'cmp al,12': if al=12 then the zero flag (ZF) will be set to true, else the zero flag (ZF) will be set to false.

CALL:

Call is the same as JMP except it uses the stack to go back.

Explanation of the stack:

The stack is a location in memory pointed at by the ESP register.

You can put values in it using the PUSH command, and take them out using the POP command. If you use PUSH it will decrease the ESP register and place the value at the location of ESP. If you use POP it will place the value pointed at by ESP into the location pointed at by the parameter of POP and increase the value of ESP. In short: The last thing you push on the stack will be the first thing you pop from the stack, the 2nd last item in will be the 2nd item out.

RET:

After CALL has pushed the location of the next instruction onto the stack it jumps to the called location. (sets the instruction pointer to that location)

After a while it will encounter a RET instruction, and will then jump to the location that is stored on the stack. (CALL pushed the location on the stack, RET pops it out again and jumps to that location)

And that's the tutorial on the basics of assembler. If you have questions about assembler and stuff just ask and I'll try to answer.

Nice file to check out if you want more info:

http://podgoretsky.com/ftp/Docs/Hardware/Processors/Intel/24547111.pdf

note: It's really useful to understand how those values between brackets work, because then you can make the most use of the pointer stuff in CE 4.1 (It will remove the Dynamic Memory Allocation problem for most games, if you know how to look at the assembler code that accesses the values you found)

------------------------------------------------------------------

The "flags" are a set of bits stored in a special register. If the bit is "1" the flag is said to be "set", and if it's "0" then the flag is said to be "clear". Collectively, the flags tell you all about the processor's internal status and give more information about the results of previous instructions.

There are three types of flags: Status flags that tell you about the results of the last instruction, Control flags that tell you how the processor will behave, and System flags that tell you about the environment your program is executing in.

The flag register is 32 bits: (S=Status flag, C=Control flag, X=System flag)

Code:

Bit    Type  Flag
0      S     Carry
1            (Reserved)
2      S     Parity
3            (Reserved)
4      S     Auxiliary Carry
5            (Reserved)
6      S     Zero
7      S     Sign
8      X     Trap
9      X     Interrupt Enable
10     C     Direction
11     S     Overflow
12-13  X     I/O Privilege Level
14     X     Nested Task
15           (Reserved)
16     X     Resume
17     X     Virtual 8086
18     X     Alignment Check
19     X     Virtual Interrupt
20     X     Virtual Interrupt Pending
21     X     Identification
22-31        (Reserved)

Let's go over the status flags, since those are used most often.

Overflow:

When an operation (Addition, subtraction, multiplication, etc) produces a result that does not fit in the register (or memory location) used when the value is read as a signed number, the Overflow flag is set. (If not, it's cleared automatically) For example, adding 1 to an 8 bit register holding 0x7F gives 0x80, which is -128 as a signed byte, so the overflow flag is set. The unsigned counterpart of this condition is reported by the carry flag, described below.

Sign:

Set if the result is negative, cleared if positive. This is typically a mirror of the MSB (most significant bit) of a value.

Zero:

Set if result is 0.

Auxiliary Carry:

Similar to Carry, but it reports a carry out of the low 4 bits (out of bit 3) instead of out of 8, 16 or 32 bits. This is used for BCD (Binary coded decimal) stuff and is generally pretty useless otherwise.

Carry:

The carry flag is set if the bit one past the limit of the register/memory location would have been set. For example, mov al, 0xFF then add al, 1 will cause a carry because the 9th bit would have been set. Also note that the zero flag would be set and the sign flag cleared, too. The overflow flag stays clear, because as signed numbers -1 + 1 = 0 fits in a byte.
