Interlopers.net - Half-Life 2 News & Tutorials

by **zombie@computer** on Fri May 11, 2012 11:20 am

From equation to hl2
Part two of the introduction to ‘life, the universe and everything’ will be about… hardware. Yay!
Although most people know what a CPU is, few realize how it works. A CPU is pretty complicated, but it works in a very simple manner.

Imagine a simple calculator. You punch in some number (4), some operator (+) and another number (2) and you get the answer on the screen (6). Such a calculator is so simple everybody understands how it works. It's also easy to understand how a more difficult equation, say (5*4)+2, requires two calculations on this simple calculator: first for 5*4, resulting in 20, and then 20+2, resulting in 22.
Many people don’t like these overly simple calculators because it takes a lot of work to calculate something like (5*4)-4/12+4-12. Especially since there are much smarter calculators that do calculations like this in one go.

So, back to our CPU. Surely your hexacore Pentium can calculate (5*4)-4/12+4-12 in one go? The answer may surprise you: no. A CPU cannot calculate such an equation in one go. The CPU works much like the simple calculator.

Calculating a simple equation
First, the equation has to be 'parsed'. That is, the list of characters has to be translated into some code that can be executed. Somehow…

A lexer (reader) splits the equation into chunks. Although pretty obvious to you, the computer needs to explicitly split the equation into brackets, numbers and operators.
Now that we have a list of chunks, it is time to do something with them; this is what the parser does. It looks at each chunk and decides what it is. 5 is a number, that is obvious, but when it encounters a '-', is that an operator (as in 5-3) or is it a sign (-5)? The parser also puts the actions required to solve the equation in the right order (for instance, start with 5*4, then 4/12, then subtract the second answer from the first, etc).

If you are interested in how this may work, you can look up the 'shunting yard algorithm' on Wikipedia. This algorithm rearranges a list of chunks into a more-or-less executable list of commands.

Now that we have broken down the equation into simple parts, the CPU can tackle the whole equation by executing each part separately, in the order the parser put them in. The result of the equation is simply the result of the last calculation performed. Of course there are quite a few steps to take before a CPU can do this…

Memory
Of course, just like a calculator, our computer has memory. In the equations above, our CPU can make use of the memory store and memory recall commands just like we can on our calculator. But unlike our calculator, the CPU has quite a few places where it can store data. It wouldn't be very smart if the CPU could store data but didn't know where it put it, right?
In your computer every byte of memory (a group of 8 bits) has its own address. Computers don't care much for readability, so the address is just a simple number. The first location in your memory has the address 0, the second has the address 1, etc. Note that the address zero (0) is, by convention, a reserved address specifying 'nowhere'. Reading from or writing to nowhere results in an error!

Using numbers for addressing allows for faster access. For instance, let's say we have three numbers in our memory of 4 bytes each, and the first one is stored at address 4. If we need to read the first byte of the third number, we can simply calculate its address:

Address = 4 + (2*4)

This would not have been possible if the address were called 'Main Street 12b'!
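For those who like to see it run, here is the same calculation as a tiny Python sketch. The function name `element_address` is just something I made up for this example:

```python
def element_address(base, index, size):
    """Address where element number `index` starts (counting from 0)."""
    return base + index * size

# Three numbers of size 4, the first one stored at address 4.
# The third number has index 2 because we count from 0.
third = element_address(base=4, index=2, size=4)
print(third)  # 12
```

The same formula works for any element: only `index` changes.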

Although it might sound strange, all memory in your computer 'lives in the same street'. That is, an address is unique within each computer; if the address 5 is on your RAM, it will not be on your video card RAM. In fact, a lot of devices on your motherboard that communicate with the CPU by transferring data each have a small range of this 'address space'.

Since each byte has its own address, the addressing numbers can get pretty large. In fact, if addresses are expressed as 32-bit numbers, we can only have a max of 4 294 967 296 different addresses, or 4 gigabytes of memory. (Yes, that's where that limitation comes from.)
For debugging purposes these almost unreadable numbers are presented as hexadecimal values (base 16, where A=10, B=11, C=12, D=13, E=14, F=15). Hexadecimal numbers are usually preceded by 0x, so you know it's a hexadecimal.

The Instruction at 0x####### referenced memory at 0x#######. The memory could not be 'read'

Now you almost know what that error means! If only you knew what this 'instruction' was!
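If you want to play with these numbers yourself, here is a small Python sketch using only built-in functions. The value 31 is an arbitrary example; in hexadecimal the digits A to F stand for 10 to 15:

```python
# Number of different addresses you can make with 32 bits.
address_count = 2 ** 32
print(address_count)               # 4294967296

# The highest 32-bit address, written in hexadecimal.
highest = address_count - 1
print(hex(highest))                # 0xffffffff

# From hexadecimal text back to a normal number: 1*16 + 15 = 31.
print(int("0x1F", 16))             # 31

# Padded to 8 hex digits, the way addresses show up in error messages.
padded = f"0x{31:08X}"
print(padded)                      # 0x0000001F
```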

The CPU, more in depth
So, exactly how does one make a CPU calculate something? You control a CPU using 'instructions', simple mathematical assignments the CPU chip knows how to execute. Instructions are stored in memory, and you simply tell the CPU to execute the instruction at address #. At this address the CPU finds a number, and this number corresponds to a unique assignment the CPU can carry out. When finished, the CPU moves on to the next assignment in memory.

Example: Calculate 1+3 = ?
First of all, you need the CPU to load data (you can't do math on nothing…). A CPU can load data into places called 'registers': tiny bits of memory (usually 16, 32 or 64 bits, named WORD, DWORD and QWORD on x86) in the core of the CPU itself. Then you need the CPU to execute a command. Let's say we have a simple CPU with three registers: two input registers and one output register.
The '+' command tells the CPU to add the number in register 1 to the number in register 2 and store the result in register 3:

Code:
Load 1 into first register
Load 3 into second register
Execute +
Store value in third register into memory at address 1

Seems simple, right? In actuality these instructions have to be read from memory somewhere, so it becomes even more complicated:

Imagine the command 'load value into first register' is command 1;
Imagine the command 'load value into second register' is command 2;
Imagine the command 'execute +' is command 4;
Imagine the command 'save value in third register to memory' is command 3;
Imagine 'x' is the next assignment the CPU should carry out.
Imagine that each command consists of two numbers: the first number is 'which command to execute' and the second number is some parameter, e.g. the memory address to read from or write to.

Now imagine our memory looks like this:

Code:
0;1;10;2;11;4;0;3;12;x;1;3;0

And we tell the CPU to 'start the instruction at address 1'. What would happen?

Code:
Execute instruction at address 1: load value at address 10 (=1) into first register
Execute instruction at address 3: load value at address 11 (=3) into second register
Execute instruction at address 5: execute +
Execute instruction at address 7: save value in third register to memory (address=12)
Execute instruction at address 9: do the rest

In case you were wondering, this REALLY is how simply your CPU works… Can you imagine how many of these instructions a game like hl2 will execute every second?
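To make the walk-through above a bit more tangible, here is a minimal Python sketch of such a pretend CPU. It uses the command numbers from the list above and the addresses from the walk-through (10, 11 and 12). The name `HALT` and its number 9 are my own stand-in for 'x', since the example never says what the rest of the program is:

```python
LOAD_A, LOAD_B, STORE, ADD = 1, 2, 3, 4
HALT = 9  # assumption: stands in for 'x', the rest of the program

def run(memory, start):
    """Execute two-number commands from `memory`, beginning at address `start`."""
    reg = [0, 0, 0]  # two input registers and one output register
    pc = start       # address of the command to execute next
    while memory[pc] != HALT:
        opcode, param = memory[pc], memory[pc + 1]
        if opcode == LOAD_A:
            reg[0] = memory[param]
        elif opcode == LOAD_B:
            reg[1] = memory[param]
        elif opcode == ADD:
            reg[2] = reg[0] + reg[1]   # the parameter is ignored
        elif opcode == STORE:
            memory[param] = reg[2]
        else:
            raise ValueError(f"unknown command {opcode} at address {pc}")
        pc += 2  # every command takes up two numbers
    return memory

# Addresses:  0  1  2   3  4   5  6  7  8   9     10 11 12
memory = [0, 1, 10, 2, 11, 4, 0, 3, 12, HALT, 1, 3, 0]
run(memory, start=1)
print(memory[12])  # 4
```

Note that the program and its data sit in the same list of numbers; the CPU only tells them apart by where it was told to start.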

CPUs don't have fancy decision trees and whatnot. They simply execute a list of commands indefinitely. They don't even know how to do nothing. They simply execute a 'WAIT' command which makes the CPU wait a bit.

If the CPU is constantly working, how will it know you pressed a button? Simple. Your keyboard issues an 'interrupt request' (IRQ). Every device in your computer can issue such a request. Basically, there's a bit of memory reserved in which a device can leave a short message telling the CPU to 'stop what you are doing and listen to me!' (again, since this is a computer, this message is a simple number). In this case, it is the address of a command that is supposed to be executed. Every now and then the CPU checks its IRQ registers to see if there has been an interrupt request. If there is, the CPU reads the address from the IRQ register and executes the command at that address.
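Here is a rough Python sketch of that idea: the 'CPU' works through its list of commands and, between two commands, checks whether a device left a request. Everything in it (the names `run_cpu`, `handlers`, `press_key` and the IRQ number 1) is made up for illustration; a real CPU does this in hardware:

```python
from collections import deque

def run_cpu(program, irq_queue, handlers):
    """Run `program` step by step, servicing pending requests between steps."""
    log = []
    for step in program:
        # Before each command, check whether a device left a request.
        while irq_queue:
            irq = irq_queue.popleft()
            log.append(handlers[irq]())
        log.append(step())
    return log

irqs = deque()
handlers = {1: lambda: "handle key press"}  # IRQ number 1 is an arbitrary choice

def press_key():
    # The key is pressed while the CPU is busy with 'work B'.
    irqs.append(1)
    return "work B"

program = [lambda: "work A", press_key, lambda: "work C"]
log = run_cpu(program, irqs, handlers)
print(log)  # ['work A', 'work B', 'handle key press', 'work C']
```

The key press is not handled the instant it happens, but as soon as the CPU gets around to checking.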

Registers are built for extreme speed, so the data they contain must be used optimally. You don't want to use registers for big numbers to calculate 1+1 (waste of CPU power), but the reverse may give wrong answers or crash your program (numbers too big to fit in the register). Because there are lots of different sizes and shapes of data, our CPUs have lots of different registers. There are integer registers (for whole numbers of 32 or 64 bits), floating point registers (for any non-whole number), etc. Modern CPUs even have entire subsections specialized in certain types of data or calculations.
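What 'too big to fit in the register' looks like can be imitated in Python. This is only a sketch for unsigned whole numbers, and `add_in_register` is a made-up helper: it throws away every bit that does not fit, which is roughly what happens when an addition overflows:

```python
def add_in_register(a, b, bits):
    """Add two unsigned numbers and keep only the bits that fit in the register."""
    mask = (1 << bits) - 1   # e.g. 8 bits -> 0b11111111 = 255
    return (a + b) & mask

print(add_in_register(1, 1, 8))     # 2
print(add_in_register(255, 1, 8))   # 0   (256 does not fit in 8 bits)
print(add_in_register(255, 1, 16))  # 256 (fits easily in 16 bits)
```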

I'm sure you can imagine that various complicated calculations are often used (many 3D calculations are based on dot products and vector multiplications, for instance), and you do not want to use a page-long list of instructions to calculate these. For instance, imagine you want to multiply three floating point numbers (e.g. 4.3, 6.8, 0.4) by another number (e.g. 3.2). That would require many lines of instructions using the above method. But with a special instruction, a lot fewer:

Code:
Load 4.3 into first floatingpoint register
Load 6.8 into second floatingpoint register
Load 0.4 into third floatingpoint register
Load 3.2 into fourth floatingpoint register
Execute VectorMultiply
Store value in fifth register into memory at address 1
Store value in sixth register into memory at address 5
Store value in seventh register into memory at address 9
Store value in eighth register into memory at address 13

See? The actual calculation cost just one instruction!
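As a loose Python analogy (not real vector hardware), think of the special instruction as one call that handles all the numbers at once. `vector_multiply` is a made-up name here, and under the hood Python still loops over the list:

```python
def vector_multiply(values, factor):
    """Multiply every value by the same factor in a single call."""
    return [v * factor for v in values]

products = vector_multiply([4.3, 6.8, 0.4], 3.2)
# Rounded for display, since floating point numbers are rarely exact.
rounded = [round(p, 2) for p in products]
print(rounded)  # [13.76, 21.76, 1.28]
```

The caller writes one line instead of three multiplications; on a real CPU the win is that the hardware does the three multiplications in one instruction.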

Usually, each new generation of CPUs comes with a whole set of these more advanced instructions. You have probably heard of these sets, though you may never have realized what they meant. Names like 3DNow!, SSE, x86 and x64 are just predefined lists of instructions a CPU should be able to handle.

Cache
If you have complicated calculations and you want to do them fast, it helps not to store intermediate results in your 'slow' RAM, but to keep them temporarily in the CPU's cache. A CPU has several levels of cache, each level larger but slower than the one before. You don't want the CPU to run out of cache, but you also don't want the CPU to wait for data being retrieved from slower cache levels.

And what about the rest of the pc?
Your CPU is a simple chip; it does not control the computer in any way. When you turn the computer on, the first thing that kicks in is the BIOS chip. It has a small set of instructions that wake up various parts of the computer and tell the CPU to 'execute the command at memory 1', or wherever the first command is.
The northbridge and southbridge are two chips on your motherboard that route traffic. The northbridge often requires the most cooling and routes traffic between the CPU, RAM, southbridge and the PCI Express controller. The southbridge connects those devices that don't need that much speed: sound chips, USB, SATA (hard disks and optical drives), etc. A lot of onboard devices are connected to the southbridge as well.
If the CPU writes a '2' to an address that belongs to your LAN-controller, these chips make sure the data gets there.

[Image: motherboard diagram showing the northbridge (IOH) and southbridge (ICH)]

So, for instance, how does your CPU turn off your computer? Simply by sending a command (again, just a predefined number) to the device that turns off your computer. How does the CPU send a packet of data over the network? Again, simply by sending a bunch of numbers to the memory addresses that belong to the network controller.
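The routing of addresses to devices can be imitated with a small Python sketch. The class `Bus`, the device names and the address ranges below are all invented for illustration; real address maps differ per machine:

```python
class Bus:
    """Routes each address to whichever device owns that part of the address space."""

    def __init__(self):
        self.ranges = []     # (first address, last address, device name)
        self.received = {}   # device name -> list of (offset, value) writes

    def attach(self, first, last, name):
        self.ranges.append((first, last, name))
        self.received[name] = []

    def write(self, address, value):
        for first, last, name in self.ranges:
            if first <= address <= last:
                # The device sees the offset within its own range.
                self.received[name].append((address - first, value))
                return name
        raise ValueError(f"no device owns address {hex(address)}")

bus = Bus()
bus.attach(0x0000, 0x7FFF, "ram")   # made-up ranges, for illustration only
bus.attach(0x8000, 0x80FF, "lan")
target = bus.write(0x8004, 2)       # the CPU just writes a '2' to an address
print(target, bus.received["lan"])  # lan [(4, 2)]
```

From the CPU's point of view there is no difference between writing to RAM and writing to the network controller; only the address differs.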

As you can see, the CPU is in trouble if it doesn't know what commands to send where. Luckily that's what drivers are for: pieces of code that Windows or Linux can use to determine which commands are required to control the devices in your computer. This code is stored in memory and is read from memory by the CPU when it is needed. The enormously complex body of code that magically controls your computer the way programs and the OS need it to is called a 'kernel'. It is the core of any OS.

From Code to Instructions
You may know how to program or not, but how does one transform code into a list of instructions for the CPU? Nowadays CPU instructions are rarely written by 'hand'. The closest thing we have at this moment is assembly, which is a coding language built up of CPU instructions. It is very rare that people code in assembly. Even the simplest tasks take pages of code, and if future CPUs have more optimized ways of doing what you coded in assembly, you get to rewrite those pages all over again!
Who still codes in assembly? Mostly people who need extreme speeds (e.g. those who write drivers, core parts of operating systems such as kernels, or small essential parts of a game) and who know exactly what they want. Oh, and nerds who want to show the world how nerdy they are. (I do not, b.t.w.; I'm too lazy.)

Years back everybody coded in assembly, but it didn’t take long for people to realize they coded the exact same things over and over again. Feel free to visit this link to see just how many instructions (each line in assembly is one instruction) it takes just to glue two pieces of text together!

People started grouping code they used often and releasing it in packages. This is how the first operating systems started: as collections of code. Over the years the packages expanded, and people again felt like they were repeating a lot of their work and wanted something else.

If things are too much work, you let a computer do it. It's the same with coding. The first 'compilers' started to appear: programs that transformed simple bits of text, e.g. equations or bits of code, into lists of assembly. Uniformity started to grow and languages started to appear. Nowadays programming languages form somewhat of a hierarchy (listed from low-level to high-level):

Assembly (more or less executable directly)
C (gets compiled into assembly)
C++ (early compilers translated it into C; modern compilers compile it into assembly directly, just like C)
Java (gets compiled into java byte code, which is interpreted and executed by a C++ based executable)
.NET Framework languages (e.g. C#, Visual Basic .NET, F#): a group of languages that get compiled into a semi-executable binary file that, right before being executed, gets compiled into machine code.
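Python is not in the list above, but it works a bit like the Java entry: code gets compiled into byte code that an interpreter executes. Its standard `dis` module lets you look at that byte code, which gives a feel for how one line of code turns into several small instructions. The exact instruction names differ between Python versions, so I won't list the output here:

```python
import dis

def add(a, b):
    return a + b

# Collect the byte code instruction names the interpreter will execute.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
for name in opnames:
    print(name)
```

Even this one-line function becomes a handful of load, add and return steps, much like the register example earlier.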

I could go on but I don't want to. I hope you found this interesting and now realize just how complicated the workings of your computer are, considering how simple its basic operation is. Also consider that everything you do, anything you see, everything you hear on your pc eventually comes down to a simple list of instructions. A freakishly long list, that is.
I tried to be as accurate as possible.

OK, my boredom is over, back to work!

by **SotaPoika** on Fri May 11, 2012 11:34 am

Thanks for this Z@C. Will give it a proper read later today.

by **Jordash** on Fri May 11, 2012 12:41 pm

Great read, thank you.

by **Epifire** on Fri May 11, 2012 3:36 pm

I knew a lot of the terms but couldn't account for what they did. Thanks zombie, I really did get a lot out of that.

by **skoften** on Fri May 11, 2012 5:47 pm

overflow the stack, hack the planet

by **stoopdapoop** on Tue May 22, 2012 7:55 pm

nice writeup.

Maybe you should get a job as a parking lot attendant or something else equally boring so we'll get more stuff like this.

also, just wanted to point out that C++ compilers don't compile into C anymore

by **joe_rogers_11155** on Wed May 23, 2012 1:34 pm

i am writing here so this appears in my list of threads to read later. i love this kind of entry-level stuff because it brings me back to earth. well-written, easy to understand.

by **MaK** on Thu May 24, 2012 5:30 am

I feel smarter now.

by **boing** on Thu May 24, 2012 8:36 am

i feel stupid now

by **Stormy** on Thu May 31, 2012 7:39 am

Great info, although it didn't assist my workflow in any way it is certainly good to know. Where is part 1?

by **stoopdapoop** on Thu May 31, 2012 10:52 am

viewtopic.php?f=2&t=36032

I'd love to know why he put it in the Hammer editor help forum though

Bored again, here's some info to chew on