Understanding Hardware/Computers as a System

Power On Self Test (POST)

To check that the components on a computer's motherboard are functioning properly, and assuming the processor is at least somewhat functional, the CPU runs the firmware stored in the BIOS flash chip. This firmware directs the computer to send messages to the available devices and wait for a response. The approach used varies with the device being tested (testing RAM requires a different set of instructions than testing the timer or interrupt controller), and the firmware on the BIOS ROM chip will likely vary with each board (and may even be changed if needed).

However, because we are assuming the CPU can process machine instructions, these test programs can be as complex as needed (up to what can be stored in the BIOS ROM). For example, the early IBM personal computers had a very small amount of rather unreliable RAM (especially compared to modern computers), so their firmware included instructions to test every byte of RAM on a cold boot (a start from complete power off) [citation].
On modern machines this is neither particularly feasible nor necessary, and separate tools exist for hardware testing RAM if needed, so it is more common to do a very simple write to and read from some section of RAM to confirm that it can be reached and controlled properly. As an example of how these instructions may be written now, here is some assembly by Michael Billington for the hobby computer he built using a WDC 65C816 processor and a custom motherboard:

POST Low RAM Test (Written in 6502 Assembly)
post_check_loram:
  ldx #%01010101         ; Power-on self test (POST) - do we have low RAM?
  ldy #%10101010
  stx a:$00              ; store known values at two addresses
  sty a:$01
  ldx a:$00              ; read back the values - unlikely to be correct if RAM not present
  ldy a:$01
  cpx #%01010101
  bne post_fail_loram
  cpy #%10101010
  bne post_fail_loram
  jmp post_check_hiram
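
The same write-then-read-back idea can be sketched on a host machine. The following is a minimal Python simulation, not firmware: the `SimulatedRAM` class, the `faulty` parameter, and the choice to have a faulty address read back as 0xFF are all assumptions made for illustration (real missing or broken RAM may read back differently). `spot_check` mirrors the two-address test in the assembly, while `full_sweep` tests every byte in the spirit of the early cold-boot tests described earlier.

```python
class SimulatedRAM:
    """Toy byte-addressable memory (hypothetical model, not real hardware).

    Addresses listed in `faulty` ignore writes and always read back 0xFF.
    """

    def __init__(self, size, faulty=()):
        self.cells = bytearray(size)
        self.faulty = set(faulty)

    def write(self, addr, value):
        if addr not in self.faulty:
            self.cells[addr] = value & 0xFF

    def read(self, addr):
        return 0xFF if addr in self.faulty else self.cells[addr]


PATTERNS = (0b01010101, 0b10101010)  # same alternating bit patterns as the assembly


def spot_check(ram, addrs=(0x00, 0x01)):
    """Store a known pattern at each address, then read both back and compare."""
    for addr, pattern in zip(addrs, PATTERNS):
        ram.write(addr, pattern)
    return all(ram.read(a) == p for a, p in zip(addrs, PATTERNS))


def full_sweep(ram, size):
    """Test every byte with both patterns; return the first failing address or None."""
    for addr in range(size):
        for pattern in PATTERNS:
            ram.write(addr, pattern)
            if ram.read(addr) != pattern:
                return addr
    return None
```

In this model, a fault at an address the spot check never touches passes `spot_check` but is caught by `full_sweep`, which is the trade-off between a quick reachability test and an exhaustive one.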

Ultimately, some test along these lines should be defined in the BIOS firmware for each major hardware element to ensure that it is "functional" up to a certain standard. This may include sending interrupts to initialize system devices such as a video card or optical disc drive. As another example, Michael Billington also attached a Versatile Interface Adapter to give his computer peripheral I/O; the POST assembly for it can be seen a bit further down on the same page as linked above.

It is important to note that this is still a broad generalization of what occurs in a POST, as each motherboard will likely differ in its approach. This is clear from how errors are reported: some motherboards use beeps from an attached speaker and others use flashing LEDs, and even within these approaches the same number of flashes or beeps may mean something different on each motherboard. As such, when working with unfamiliar (and even familiar) hardware, it is important to check the documentation whenever an error occurs.
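
One way to picture why documentation matters is a per-board lookup table. The sketch below is illustrative only: the board names `board_a` and `board_b` and every table entry are invented placeholders, not real vendor beep codes, and `decode_beeps` is a hypothetical helper.

```python
# Hypothetical tables: board names and meanings are placeholders,
# NOT real vendor beep codes. Real codes come from the board's manual.
BEEP_TABLES = {
    "board_a": {1: "POST passed", 3: "memory failure"},
    "board_b": {1: "memory failure", 3: "video adapter failure"},
}


def decode_beeps(board, short_beeps):
    """Look up what a number of short beeps means on a given board."""
    table = BEEP_TABLES.get(board)
    if table is None:
        return "unknown board: consult its documentation"
    return table.get(short_beeps, "unlisted code: consult the board's documentation")
```

With these placeholder tables, three short beeps decode to a memory failure on `board_a` but to a video adapter failure on `board_b`, so the same signal cannot be interpreted without knowing which table applies.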

If you would like a real-world example of all the assembly run during a POST (on an older machine), you can view all of the x86-16 assembly used in the IBM 5160's BIOS in section 5 of IBM's March 1986 Technical Reference, which has been archived here. If you'd like to skip to the POST assembly, it begins on page 232 of that linked PDF.

How to Handle a Failed POST

Depending on the motherboard and the tests included in its firmware, a POST may fail for any number of reasons. These failures may be reported in a variety of ways (beeping, a flashing LED, a code appearing on a screen or seven-segment display, etc.), and the meaning of each code may vary between motherboards. For example, let's imagine a scenario where we are working with a motherboard whose BIOS ROM chip holds AMIBIOS firmware. We connect the CPU and a single stick of RAM to the motherboard and power on the machine, allowing the POST process to run. Suddenly, the motherboard's attached speaker lets out 3 short beeps, signaling an error in the POST process (a single long beep would indicate success). To get a hint about which part has failed, we consult the manual from our motherboard (or BIOS) manufacturer, which may look similar to the one compiled here.

Looking at the table of beep codes provided, we can see that 3 short beeps indicate that a "memory failure has occurred in the first 64K of RAM". This lets us narrow down our search a little (ignoring that there weren't many possible problem components in this example to begin with), as the issue is likely a problem with the RAM or the RAM slot. To become more confident about which part is the culprit, we can take multiple approaches depending on what tools are available:

  • If you have another stick of RAM which is known to be working you can put it in the same slot to see if the issue persists.
  • If you have a separate machine which is known to be working you can slot the suspect RAM into that and test as well.
  • If you have a dedicated RAM tester (like a RAMCheck LX) you can plug the RAM stick into that and verify its quality.

If the POST succeeds when the suspect RAM is moved to a different slot, the culprit is most likely the original RAM slot or something leading into it. If the RAM continues to fail in a different slot, and especially if it also fails in a test machine or RAM checker, we know that the RAM stick itself has some issue. If instead the RAM works in a test machine or RAM checker, or the POST fails even with a known-working RAM stick, it becomes more likely that the motherboard or processor has a fault that is causing a misidentified POST error. In any case, each test narrows the focus for further searching.
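
The reasoning above can be summarized as a small decision function. This is a rough sketch rather than a diagnostic tool: the function name, its parameters, and the returned strings are all hypothetical, each argument is `None` when that test was not run, and conflicting observations are simply resolved by the order of the checks.

```python
from typing import Optional

SLOT = "original RAM slot or its connection to the board"
STICK = "RAM stick"
BOARD = "motherboard or processor"
UNKNOWN = "undetermined: run at least one swap test"


def likely_culprit(
    suspect_passes_in_other_slot: Optional[bool] = None,
    suspect_passes_elsewhere: Optional[bool] = None,  # test machine or RAM checker
    known_good_stick_passes_here: Optional[bool] = None,
) -> str:
    """Map swap-test outcomes to the most likely faulty part (hypothetical helper)."""
    if suspect_passes_in_other_slot is True:
        return SLOT  # same stick works once moved, so suspect the first slot
    if suspect_passes_elsewhere is False:
        return STICK  # stick fails even in validated hardware
    if suspect_passes_elsewhere is True or known_good_stick_passes_here is False:
        return BOARD  # stick is fine, or good RAM also fails here
    if known_good_stick_passes_here is True or suspect_passes_in_other_slot is False:
        return STICK  # weaker evidence, but still points at the stick
    return UNKNOWN
```

For instance, `likely_culprit(suspect_passes_in_other_slot=False, suspect_passes_elsewhere=False)` returns `"RAM stick"`, matching the case where the RAM fails both in another slot and in a test machine.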

While I have been focusing on the example of RAM, this method of testing with known-working components or dedicated hardware validators greatly improves the ability to determine the root cause of many kinds of issues. Different components will have different sets of tests to run (e.g. testing a RAM card requires a different suite of tests than a video card adapter), but the overall pattern is approximately the same for any failing component.
Note, however, that this pattern is not guaranteed to work cleanly every time, as hardware is rather delicate and transferring it to and from test machines may damage the hardware being tested or the test machine itself. This method is also not easily accessible to hobbyists, since having known-working components on hand is a luxury in itself. However, there are likely stores near you, such as Best Buy, which will allow parts to be tested on validated machines if needed. Regardless, these risks shouldn't be a prohibitive worry, just something to be wary of as you attempt to determine the cause of a failed POST.

As a final point, understand that the tests conducted in a POST cannot guarantee working parts; they only improve the confidence that each part is functional up to some standard. Some faults that cause a POST to fail are obscure and do not occur every time, and many more complex issues will not be detectable by a simple POST at all. In the modern day this is alleviated by the components themselves often having some validation mechanisms built in, but nothing is perfect, so unseen problems may still exist.
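
To illustrate how a fault that does not occur every time can slip past a single quick test, here is a toy Python model. `FlakyCell` and its `period` parameter are hypothetical: the cell corrupts every `period`-th read on a fixed schedule, which is far more regular than a real intermittent fault would be.

```python
class FlakyCell:
    """Toy memory cell that corrupts every `period`-th read (hypothetical model)."""

    def __init__(self, period):
        self.period = period
        self.reads = 0
        self.value = 0

    def write(self, value):
        self.value = value & 0xFF

    def read(self):
        self.reads += 1
        if self.reads % self.period == 0:
            return self.value ^ 0x01  # flip the lowest bit on the bad read
        return self.value


def passes(cell, rounds):
    """Write and read back a pattern `rounds` times; True only if every read matches."""
    for _ in range(rounds):
        cell.write(0b01010101)
        if cell.read() != 0b01010101:
            return False
    return True
```

In this model a single write-and-read-back round on `FlakyCell(5)` passes, while five rounds on a fresh `FlakyCell(5)` fail on the fifth read, so a passing result only raises confidence in the part rather than proving it sound.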