Are all programs eventually converted to assembly instructions?

2018-06-25 21:08:13

It seems like every class I take is an introduction to a new topic that fails to provide comprehensive knowledge that would allow me to produce an actual program that can be executed outside of an IDE. It's both frustrating and intimidating to realize what I don't know, and it baffles me that an accredited computer science program could offer a curriculum that doesn't shed some light on this process from the very beginning.

Actual Question: Forgive the introduction but it provides a good indication of my (in)experience. I'm currently learning MIPS in a computer architecture class and have been given a rapid intro to assembly. The fine details of how a program executes are often described to me as magic and brushed under the rug for another teacher to explain, if possible.

It is my understanding that processor circuitry varies greatly from chip to chip and therefore may require different low level instructions to execute the same high level code. Are all programs eventually converted to assembly language before becoming raw machine code or is this step no longer necessary?

If so, at what point does the processor begin to execute its own unique set of instructions? This is the lowest level of code, so is it at this point that the program instructions are executed by the processor, bit by bit?

Finally, do all architectures have/need an assembly language?

Assembly language is, so to speak, a human-readable way of expressing the instructions a processor executes (which are binary data and very hard for a human to manage). So, if the machine instructions are not written by a human, an assembly step is not necessary, though it sometimes happens for convenience. If a program is compiled from a language such as C++, the compiler may generate machine code directly, without going through the intermediate stage of assembly code. Still, many compilers provide an option to generate assembly code so that a human can inspect what gets generated.

Many modern languages, for example Java and C#, are compiled into so-called bytecode. This is code that the CPU does not execute directly; it is an intermediate form, which may get compiled to machine code just-in-time (JIT-ted) when the program is executed. In such a case, CPU-dependent machine code does get generated, but usually without going through human-readable assembly code.
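To make the idea of bytecode concrete, here is a small sketch. It uses Python rather than Java or C# (that substitution is mine, not the answer's), because CPython also compiles source to a bytecode for its own virtual machine and ships a standard `dis` module for inspecting it. The function `add` is just a made-up example, and the exact opcode names differ between Python versions.

```python
import dis

def add(a, b):
    # Hypothetical example function; any small function would do.
    return a + b

# The compiled function carries raw bytecode bytes for the Python VM.
# These bytes are not CPU instructions; the interpreter consumes them.
raw = add.__code__.co_code

# dis turns the bytes back into readable opcode names, much like a
# disassembler does for real machine code.
opnames = [ins.opname for ins in dis.get_instructions(add)]

print(raw.hex())
print(opnames)
```

The printed names play the same role for the virtual machine that assembly mnemonics play for a real CPU: a readable view of instructions that are otherwise just bytes.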

Assembly language is simply a human-readable, textual representation of the raw machine code. It exists for the benefit of the (human) programmers. It's not at all necessary as an intermediate step to generate machine code. Some compilers do generate assembly and then call an assembler to convert that to machine code. But since omitting that step results in faster compilation (and is not that hard to do), compilers will (broadly speaking) tend to evolve towards generating machine code directly. It is useful to have the option of compiling to assembly though, to inspect the results.

For your last question, assembly language is a human convenience, so no architecture truly needs it. You could create an architecture without one if you really wanted to. But in practice, all architectures have an assembly language. First, it's very easy to create a new assembly language: give a textual name to each of your machine opcodes and registers, add some syntax to represent the different addressing modes, and you're already mostly done. Second, even if all code were converted from a higher-level language directly to machine language, you would still want an assembly language, if only as a way of disassembling and visualizing machine code when hunting for compiler bugs, etc.
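As a rough illustration of how little it takes to "give a textual name to opcodes and registers", here is a toy assembler for a handful of MIPS R-type instructions (since the question mentions MIPS). The tables cover only a few registers and operations, the function name `assemble` is made up for this sketch, and there is no error handling or support for other instruction formats.

```python
# Register names mapped to their MIPS register numbers (small subset).
REGISTERS = {"$zero": 0, "$t0": 8, "$t1": 9, "$t2": 10, "$s0": 16, "$s1": 17}

# R-type mnemonics mapped to their 6-bit "funct" field (small subset).
FUNCT = {"add": 0x20, "sub": 0x22, "and": 0x24, "or": 0x25}

def assemble(line):
    """Encode one R-type line such as 'add $t0, $t1, $t2' as a 32-bit word."""
    mnemonic, operands = line.split(None, 1)
    rd, rs, rt = (REGISTERS[r.strip()] for r in operands.split(","))
    # R-type layout: opcode(6)=0 | rs(5) | rt(5) | rd(5) | shamt(5)=0 | funct(6)
    return (rs << 21) | (rt << 16) | (rd << 11) | FUNCT[mnemonic]

word = assemble("add $t0, $t1, $t2")
print(f"{word:08x}")  # 012a4020
```

A real assembler also handles labels, immediates, directives and pseudo-instructions, but the core of it is this kind of table lookup plus bit packing.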

Every general purpose CPU has its own instruction set. That is, certain sequences of bytes, when executed, have a well known, documented effect on registers and memory. Assembly language is a convenient way of writing down those instructions, so that humans can read and write them and understand what they do without having to look up commands all the time. It's fairly safe to say that for every modern CPU, an assembly language exists.

Now, about whether programs are converted to assembly. Let's start by saying that the CPU does not execute assembly code. It executes machine code, but there's an essentially one-to-one correspondence between machine code instructions and assembly lines. As long as you keep that distinction in mind, you can say things like "and now the CPU executes a MOV, then an ADD" and so on. What the CPU actually executes is the machine code that corresponds to a MOV instruction, of course.
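The correspondence can be seen by going in the other direction: taking a raw 32-bit word and recovering the line of assembly it stands for. The sketch below does that for a few MIPS R-type instructions only; the tables and the name `disassemble` are assumptions made for this example. One caveat worth knowing in a MIPS class: pseudo-instructions such as `li` or `move` are expanded by the assembler into one or more real instructions, so the mapping is one-to-one only for real instructions.

```python
# MIPS register numbers mapped back to names (small subset).
REG_NAMES = {0: "$zero", 8: "$t0", 9: "$t1", 10: "$t2", 16: "$s0", 17: "$s1"}

# 6-bit "funct" values mapped back to R-type mnemonics (small subset).
FUNCT_NAMES = {0x20: "add", 0x22: "sub", 0x24: "and", 0x25: "or"}

def disassemble(word):
    """Decode a 32-bit R-type machine word into one line of assembly text."""
    opcode = (word >> 26) & 0x3F
    if opcode != 0:
        raise ValueError("only R-type instructions (opcode 0) are handled here")
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    rd = (word >> 11) & 0x1F
    funct = word & 0x3F
    return f"{FUNCT_NAMES[funct]} {REG_NAMES[rd]}, {REG_NAMES[rs]}, {REG_NAMES[rt]}"

print(disassemble(0x012A4020))  # add $t0, $t1, $t2
```

The processor only ever sees the number `0x012A4020`; the text `add $t0, $t1, $t2` exists purely for the reader.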

That said, if your language compiles to native code, your program is, indeed, converted to machine code before execution. Some compilers (not all) do that by emitting assembly sources and letting the assembler do the final step. This step, when present, is typically well hidden. The assembly representation only exists for a brief time during the compilation process, unless you tell the compiler to keep it intact.

Other compilers don't use an assembly step, but will emit assembly if asked to. Microsoft C++, for example, accepts the /FA option, which makes it emit an assembly listing along with the object file.

If it's an interpreted language, there's no explicit conversion to machine code. The source lines are executed by the language interpreter. The bytecode-oriented languages (Java, Visual Basic) live somewhere in between; they're compiled to code that is not the same as machine code, but is much easier to interpret than the high-level source. For those, it's also fair to say they're not converted to machine code ahead of time, although a JIT compiler may still translate the bytecode to machine code while the program runs.
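To show what "executed by the interpreter" means in practice, here is a minimal sketch of a bytecode interpreter for an invented stack machine. The four opcodes (`PUSH`, `ADD`, `MUL`, `HALT`) and the function `run` are made up for this example and do not correspond to any real virtual machine. The point is that the program's instructions are only ever data read by a loop; the only machine code the CPU runs is that of the interpreter itself.

```python
# Invented opcodes for a toy stack machine.
PUSH, ADD, MUL, HALT = range(4)

def run(bytecode):
    """Interpret a list of toy bytecode values and return the final result."""
    stack = []
    pc = 0  # index of the next bytecode element to read
    while True:
        op = bytecode[pc]
        pc += 1
        if op == PUSH:
            # The operand follows the opcode in the bytecode stream.
            stack.append(bytecode[pc])
            pc += 1
        elif op == ADD:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")

# Bytecode for (2 + 3) * 4
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
print(run(program))  # 20
```

A JIT compiler would replace this loop, for frequently executed code, with machine code generated at run time.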
