6502 Machine Language

Machine language is the lowest possible level of programming. It's called machine language because it involves working directly with the computer hardware. Different computers use different hardware and so those differences are reflected in the machine language of each computer.

Typically when people say "Machine Language" they are actually referring to Assembly Language. Machine language is just the electrical signals bouncing around on the circuit board of your computer. Because these electrical signals have two possible states, On or Off, we use binary numbers as an abstract representation of these electrical signals. Assembly language is a second level of abstraction from the same electrical signals.

What's a Computer?

A computer is made up of a number of components which work together to make a functional machine.

Input
Input is any method used to feed data to the computer for processing or storage. Input devices include keyboard, mouse, graphics tablet, light-pen and others
Output
Output is the method used by the computer to provide data to the user. Output devices include monitor, printer, speakers
Processing
Processing is the execution of a series of tasks (by the computer) on data provided by the user (through input devices). Image and text manipulation are the most common processing tasks. Data conversion (eg text to morse code), page layout, scientific analysis and database management are other processing tasks
Short-term Storage
This is where the computer stores the programs and data it is currently working with. All personal computers use some variety of electronic RAM memory for short-term storage. Modern personal computers also count a portion of the hard disk as part of short-term storage and they expend much effort moving data back and forth between the hard disk and RAM. Short-term storage devices are erased when the computer is shut down or power is otherwise lost
Long-term Storage
Long-term storage is any storage device which retains data even when the computer is not on. Long term storage includes hard disks, floppy disks, Zip drives, CD-ROMs and Compact Flash units. All personal computers contain some amount of electronic ROM memory which is also long-term storage

What's a microprocessor?

A microprocessor is the brain of any computer. It is the component responsible for executing programs and generally directing all functions of the computer. In a personal computer the microprocessor is accompanied by other components which perform specialized functions like video, sound and general Input/Output. Taken as a whole these components make up the personality of a personal computer. It is the microprocessor which (loosely) defines aspects such as memory limits, speed and program structure. Understanding the microprocessor is fundamental to understanding computer programming.

Microprocessors don't understand English, nor do they really understand any programming language. You probably have heard that computers only understand numbers but that's not exactly true either. As with any digital circuit, the microprocessor is simply a device which accepts electrical signals and provides a (usually) predictable output in the form of other electrical signals. These signals sometimes take the form of 5 volts or 0 volts. We use binary numbers as an abstract representation of these voltages, thus we communicate with the computer using the abstract language of binary arithmetic. The resulting "language" is represented by long strings of ones and zeros. Long strings of ones and zeros are difficult to read so most programs are composed in languages using English-like words.

Electrical Signals
The way your computer really works is the passing of electrical signals in various combinations between digital (two-state) circuits. The voltages used are typically between 0 Volts and 5 Volts
Bus
A group of related electrical signals inside the computer. Really it's just a collection of wires which carry said signals from one component of the computer to another. When related electrical signals are grouped together, we call it a Bus.
Binary Numbers
Because we are not machines we represent the state of the computer's circuits numerically. The two possible states of any circuit are represented by 1 or 0. When electrical signals are grouped together (as on a bus) the series of ones and zeroes become a binary number
Machine Language
A low-level programming language which allows you to work directly with the computer hardware. ML is really just an abstract representation of the electrical signals in the computer in the form of long strings of ones and zeros

What is Memory?

As defined above, memory is where programs and data are stored. When we say "memory" we are usually referring to electronic memory. Electronic memory is packaged in an integrated circuit built from thousands of tiny transistors. A transistor is an electronic semiconductor component which can be made to act as a switch. Such a switch may be turned "On" or "Off". It is this property of transistors which allows us to represent the state of the computer hardware using binary numbers.

Binary numbers are well suited for this purpose because in binary there are only two digits, "1" and "0". Humans usually express values in decimal, which represents quantities in powers of 10. Computers use binary, which represents quantities in powers of 2.

Bit (b)
A single Binary digIT. A bit represents a single transistor which may be switched on or off. The state of the transistor switch is represented by the value of the Binary digIT which may be 1 or 0. A bit containing a binary 1 is said to be Set or On ; a bit containing a binary 0 is said to be Clear or Off
Byte (B)
A byte is a complete binary word. It is the smallest quantity of data which can travel across the computer's data bus. Storage devices are measured in terms of bytes. Traditionally a byte is defined as 8 bits which can represent 256 possible values (2^8) ranging from 0 to 255
Most Significant bit (MSb)
In a binary word the MSb is the bit with the highest numerical value. In an 8-bit byte the MSb is bit 7, which has a numerical value of 128
Least Significant bit (LSb)
In a binary word the LSb is the bit with the lowest numerical value. That would be bit 0, which has a numerical value of 1
Kilo-Byte (KB)
Kilo is the metric prefix for "1,000 times". Because computers represent quantity in powers of 2, we round up to the nearest power of 2. One Kilo-Byte = 1,024 bytes (2^10)
Mega-Byte (MB)
1,024 KB or 1,048,576 Bytes (2^20)
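
The bit and byte arithmetic in the definitions above can be checked with a few lines of Python. This is only an illustrative sketch; the helper names `bit_weights` and `byte_value` are made up for this example and are not part of any 6502 tool.

```python
# Weight of each bit position in an 8-bit byte: bit n is worth 2**n.
BITS_PER_BYTE = 8

def bit_weights(width=BITS_PER_BYTE):
    """Return the numerical value of each bit position, MSb first."""
    return [2 ** n for n in reversed(range(width))]

def byte_value(bits):
    """Convert a string of ones and zeros (MSb first) to its decimal value."""
    value = 0
    for digit in bits:
        value = value * 2 + (1 if digit == "1" else 0)
    return value

weights = bit_weights()            # MSb (bit 7) first, LSb (bit 0) last
largest = byte_value("11111111")   # every bit Set
smallest = byte_value("00000000")  # every bit Clear
kilobyte = 2 ** 10                 # bytes in one KB
megabyte = 2 ** 20                 # bytes in one MB
```

Here `weights[0]` is the value of the MSb (128) and `weights[-1]` is the value of the LSb (1); adding all eight weights gives 255, the largest value a byte can hold.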

What are Assemblers, Compilers and Interpreters?

As stated above, most people don't like working directly with electrical voltage levels or long strings of ones and zeroes. Long ago computer scientists and engineers came up with some solutions. Essentially all three boil down to the same concept: computers should make work easier, and that means computers should make programming easier. We want to program in a language we understand (ie English) and we want a computer program to make the necessary conversion to binary numbers the computer can work with. The programs which convert English-like words to binary numbers are Assemblers, Compilers and Interpreters.

Assembler
Assemblers are used to write programs in Assembly Language, which is the least English-like programming language. Assemblers work by translating instructions from a text file (prepared by the programmer) into Machine Language (long strings of ones and zeroes), in a process called assembly. When the assembly process is complete the result is a purely binary file called an Executable, which the microprocessor can use directly. AL programs use cryptic abbreviations and contain lots of binary and hexadecimal numbers, which makes the program difficult to decipher by anyone not familiar with AL. The greatest advantage to AL is speed and compactness in the executable. Each microprocessor family has its own assembly language because the machine language executables they create work directly with the computer hardware. In this document we focus on the assembly language of the 6502 microprocessor
Compiler
A compiler works similarly to an assembler in that its job is to translate a text file into an executable. The greatest distinction between the two is the language used to compose the program. Compilers typically use a language containing English words and decimal numbers and such programs are easier for novice programmers to decipher. Compiled programs depend upon pre-packaged libraries of routines to perform common tasks, which results in larger and slower programs; that's the trade-off for easy learning. Compiled languages have the characteristic of portability because they don't depend on the hardware of any particular computer. C, Pascal, COBOL and Modula are all compiled languages
Interpreter
Interpreted languages are very different because they don't create executables. In an interpreted language a program called the Interpreter resides in the computer's memory at the same time as the program you are writing. The interpreter contains a text editor component called a Parser, which interactively translates the commands you enter into codes called Tokens, which are understood only by the interpreter itself. When you execute your program the interpreter reads back the tokens in sequence and uses them to call the machine language routines which actually perform the tasks you intended. Interpreted languages are the easiest to learn because the environment you write your program in is the same environment you execute the program in. Because the extra step of compiling (or assembling) is eliminated you can exterminate bugs and add to your program with great ease. Interpreted languages use English words and English-like "sentence structures". Programs written in interpreted languages run slowly because the program you write is never converted to machine language. The BASIC languages in Apple II, Commodore and Atari computers are all interpreted

About the 6502

The 6502 is among the most popular microprocessors used in personal computers of the 70's and 80's. It has appeared in computers manufactured by Apple, Atari, Commodore and others. Over the years many variants of the 6502 have appeared with various improvements and additions. The 6510 used in the C-64 adds an 8-bit bi-directional I/O port. The 65c02 used in the Apple IIc uses CMOS construction to run cooler (sometimes faster) and adds some useful instructions. Other incarnations added timers and internal memory to the 6502.

All variations of the 6502 share the same base instruction set and that is what makes them a "family". This base instruction set is structured in such a way that each instruction is one byte. Each instruction may be immediately followed by one or two operand bytes (or no operand at all.) The number of operand bytes is dictated by the instruction itself.

Talking to the Outside World

The 6502 is an 8-bit microprocessor, so named because the data bus and internal registers are 8 bits wide (exception: the Program Counter is 16 bits.) When data larger than 8 bits is to be processed the 6502 must perform multiple fetch operations and will do so automatically for instructions with 16-bit operands. There are three primary busses used to interface 6502 to the outside world:

Data Bus
8 bits wide, carries data bytes to/from memory and I/O devices
Address Bus
16 bits wide, carries address information from 6502 to memory
Control Bus
Used to control the operating mode of 6502 or disable it completely

Most times programmers only worry about the information on the Data and Address Busses, although the Control Bus really does affect programs and how (or whether) they function. The Control signals of the 6502 don't truly represent a "bus" (and that term is an abstraction anyway.) Some of the control signals can be queried through the Processor Status Register (P).

Addressing

The 6502 sees memory as a long series of sequential addresses. Each address represents a "box" where data is stored. Each "box" contains 8 bits which represent a value from 0 to 255. When the 6502 wants to store or retrieve a value from a particular memory location, it places the address of that memory location on the address bus. The control bus signals whether the access is to be a read (from memory into 6502) or write (from 6502 into memory).

The Address Bus of the 6502 is 16 bits wide. That means 6502 can access only 65,536 unique memory addresses (2^16 bytes.) All of the computer's RAM, ROM and memory-mapped I/O devices must fit within the 65,536 possible addresses. Most 6502-based computer systems manage to do this by Bank Switching. Bank Switching allows more than one device to occupy the same space in the memory map by using a Soft-Switch to control which device is currently present at that address. A Soft-Switch is a kind of memory-mapped I/O which can act as a toggle switch between two possible hardware states.

Memory Location
A single byte of memory, identified by a unique numerical address within the 6502 address space
Address
The unique 16-bit (two bytes) value used by the 6502 to identify a specific memory location. When 6502 wants to access a memory location, it places the address of that memory location on the Address Bus. In a 6502 ML program, addresses are stored low-byte first and high-byte second
Address Space (AKA Memory Map)
The totality of all possible addresses which the 6502 can uniquely identify. Because the Address Bus of the 6502 is 16 bits wide, it can uniquely identify 65,536 (2^16) addresses. The address space of 6502 is counted from 0 ($0000) to 65,535 ($FFFF)
Memory-Mapped I/O
An Input/Output device which can be manipulated directly by the 6502. Memory-mapped I/O devices provide a number of Registers into which 6502 may store specific values to control the state of that I/O device. These registers are incorporated into the 6502 Memory Map at specific reserved addresses. Memory-mapped I/O devices on 6502 computers include chips like PIA, VIA and CIA as well as the many video and sound chips which have been used in those computers over the years
Register
A memory location reserved for the control of a specific hardware feature. Memory-mapped I/O devices have registers external to the 6502 and are accessed through the computer's Address Bus and Data Bus. 6502 also has its own internal registers which control or reflect the operating mode of 6502
6502 Memory Map (Abstract)
$0000                $C000/D000                $FFFF
RAM                  I/O                       ROM
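
To make Bank Switching and Soft-Switches more concrete, here is a toy model in Python. It is a sketch under loud assumptions: the class name `BankedMemory`, the soft-switch address $D000, the ROM range starting at $E000 and the rule that writes always land in RAM are all hypothetical choices for illustration and do not describe any particular 6502 computer.

```python
class BankedMemory:
    """Toy 64K address space where ROM and RAM share one address range."""

    # Hypothetical layout, chosen only for illustration.
    BANK_START = 0xE000     # first address where ROM overlays RAM
    SOFT_SWITCH = 0xD000    # writing here selects which device is visible

    def __init__(self):
        self.ram = bytearray(0x10000)        # RAM behind every address
        self.rom = bytes([0xEA]) * 0x2000    # 8K of ROM filled with $EA
        self.rom_visible = True              # state held by the soft-switch

    def read(self, address):
        if address >= self.BANK_START and self.rom_visible:
            return self.rom[address - self.BANK_START]
        return self.ram[address]

    def write(self, address, value):
        if address == self.SOFT_SWITCH:
            # Bit 0 of the value written selects the hardware state.
            self.rom_visible = bool(value & 1)
        else:
            # Assumption of this toy model: writes always land in RAM.
            self.ram[address] = value & 0xFF

mem = BankedMemory()
mem.write(0xE000, 0x42)        # stored in the RAM hidden under the ROM
with_rom = mem.read(0xE000)    # ROM is banked in, so the ROM byte is seen
mem.write(0xD000, 0)           # flip the soft-switch: bank the ROM out
with_ram = mem.read(0xE000)    # same address, now the RAM shows through
```

The same address ($E000) returns two different values depending on the state of the soft-switch, which is how more than 64K of devices can share a 16-bit address space.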

The Internal Registers

6502 contains a number of internal registers which serve as temporary data repositories or which modify/reflect the processor operating mode. Most registers can be read/written by instructions but some are read-only and cannot be directly changed.

Accumulator (A)
8 bits, primarily used as a temporary holding area for data but also holds the result of any math instructions
X Index (X)
8 bits, primarily used to modify fixed addresses with a variable value but can also be used for temporary storage
Y Index (Y)
8 bits, primarily used to modify fixed addresses with a variable value but can also be used for temporary storage
Processor Status Register (P)
8 bits, consists of a series of flags which either reflect or affect the processor operating mode
Program Counter (PC)
16 bits, used internally by 6502 to mark the current address of program execution. Cannot be read directly but can be written by certain instructions
Stack Pointer (S)
9 bits (the ninth bit is always set to 1), marks the next available location in the 6502 external stack, can be read or written by certain instructions
6502 Programming Model (C-64 PRG p. 415)
15......8   7......0   Description
            A          Accumulator A
            Y          Index Register Y
            X          Index Register X
PCH         PCL        Program Counter PC
-------1    S          Stack Pointer S
            NV-BDIZC   Processor Status Flags P

The Processor Status Flags allow your program to control/monitor the various modes of processor operation. Here's a detailed description of their functions.

N
The Negative flag reflects the sign (+ or -) of a value in the Accumulator, X Index or Y Index. N is updated whenever an instruction produces a new result in A, X or Y (and by a few instructions which work on memory). N will be set (1) if the MSb (bit 7) of the result is set and clear (0) if the MSb is clear. N is affected by all these instructions: ADC AND ASL BIT CMP CPX CPY DEC DEX DEY EOR INC INX INY LDA LDX LDY LSR ORA PLA PLP ROL ROR RTI SBC TAX TAY TSX TXA TYA
V
The oVerflow flag indicates that a signed addition or subtraction produced a result outside the range -128 to +127, which happens when the carry into bit 7 differs from the carry out of bit 7. It is mostly used in signed arithmetic where bit 7 is the sign flag of a value. V is affected by: ADC BIT CLV PLP RTI SBC
-
- is not really a flag but is an unused bit of the Processor Status Register. - is bit 5 of P and is always set (1)
B
The Break flag is used to indicate that a program-generated interrupt has occurred. B is not stored inside the processor; it appears only in the copy of P which is pushed onto the stack. If the source of the interrupt is a BRK instruction then B is set in the pushed copy. Any interrupt caused by the 6502 IRQ or NMI inputs pushes P with B clear. PHP also pushes P with B set, since only the hardware interrupt sequence forces the bit clear
D
The Decimal flag is used to enable 6502 Decimal mode arithmetic. Decimal mode is enabled when D is set; binary mode is enabled when D is clear. D is set by SED and cleared by CLD but is also affected by PLP and RTI.
I
The Interrupt Disable flag is used to mask interrupts from the 6502 IRQ input and is ideal for use in interrupt handler routines. It has no effect on program interrupts generated by BRK. Use CLI to allow a program to be interrupted by IRQ and SEI to disallow interrupts from IRQ. I is set by BRK and is also affected by PLP and RTI.
Z
The Zero Flag is set whenever an operation causes a result of 0 (zero) in A, X or Y. Z is affected by all these instructions: ADC AND ASL BIT CMP CPX CPY DEC DEX DEY EOR INC INX INY LDA LDX LDY LSR ORA PLA PLP ROL ROR RTI SBC TAX TAY TSX TXA TYA
C
The Carry flag indicates that the result of a math operation exceeds 8 bits and in this sense it can be thought of as the "ninth" Accumulator bit. C is affected by all these instructions: ADC ASL CLC CMP CPX CPY LSR PLP ROL ROR RTI SBC SEC
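
As a rough illustration of how N, V, Z and C relate to an arithmetic result, the following Python sketch models a binary-mode (D clear) 8-bit add with carry. The function name `adc` simply mirrors the mnemonic; this is a simplified model for study, not a cycle-accurate emulation, and it ignores Decimal mode entirely.

```python
def adc(a, operand, carry_in=0):
    """Binary-mode 8-bit add with carry; returns (result, flags)."""
    total = a + operand + carry_in
    result = total & 0xFF                  # only 8 bits fit in the Accumulator
    flags = {
        "N": (result >> 7) & 1,            # copy of bit 7 of the result
        "Z": 1 if result == 0 else 0,      # set when the result is zero
        "C": 1 if total > 0xFF else 0,     # the "ninth" Accumulator bit
        # Signed overflow: both inputs share a sign which the result lacks.
        "V": 1 if (~(a ^ operand) & (a ^ result) & 0x80) else 0,
    }
    return result, flags

# $50 + $50 = $A0: fits in 8 bits unsigned, but overflows as signed (+80 + +80).
r1, f1 = adc(0x50, 0x50)
# $FF + $01 = $100: the result wraps to $00 and the carry holds the ninth bit.
r2, f2 = adc(0xFF, 0x01)
```

In the first case N and V are set while C is clear; in the second case Z and C are set while N and V are clear.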

Assembly Language Revealed

As stated above, assembly language is an abstract representation two levels above the electrical signals in your computer. No human could make heads or tails of the millions of signals jumping across the various circuits at the speed of light every machine cycle. You'd have to be a super-genius to write a program of any size in pure binary numbers. (Although some gifted folks can do this, I don't recommend trying it or you risk your own sanity ;)

Operation Codes
Machine Language programs are made up of instructions given to the microprocessor. These instructions are represented by binary values which are 8 bits wide (one Byte) for the 6502. We call these instructions Operation Codes, or OpCodes for short
Operand
An operand provides the data which the opcode will work on. The operand will either be the literal data itself or a Pointer to the data. Operands take the form of one or two 8 bit bytes which immediately follow the opcode in the ML program sequence. As will be seen later, not all opcodes require an operand
Pointer
A pointer is a one or two byte value which identifies the numerical address of a memory location and is sometimes called a Vector. Most pointers require two bytes because the address bus is 16 bits wide. Pointers to Zero Page addresses only require one byte if the high byte of the address is assumed to be $00. On the 6502, pointers are stored in memory low byte first
Mnemonics
In Assembly Language we don't use binary OpCodes. Instead each opcode is represented by a three-letter abbreviation. The abbreviations are called Mnemonics because each abbreviation suggests what the opcode does

Addressing Modes

Assembly language programs are made up of mnemonics. The function of most mnemonics can be modified by an Addressing Mode. An addressing mode tells the computer where to find the target of any particular mnemonic. In 6502 assembly language, we indicate the addressing mode of a mnemonic in the operand. Considering all addressing modes of all mnemonics, each possible combination of mnemonic/address mode corresponds directly to a unique opcode.
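
The mapping from mnemonic plus addressing mode to a single opcode byte, followed by zero, one or two operand bytes, can be sketched in Python. The opcode values shown are the standard 6502 encodings for these four instructions; the `assemble` helper and the mode names used as dictionary keys are hypothetical and exist only for this example.

```python
# A few (mnemonic, addressing mode) -> opcode pairs from the 6502 instruction set.
OPCODES = {
    ("LDA", "immediate"): 0xA9,
    ("LDA", "zero page"): 0xA5,
    ("LDA", "absolute"):  0xAD,
    ("INX", "implied"):   0xE8,
}

# Number of operand bytes which follow the opcode in each mode.
OPERAND_BYTES = {"implied": 0, "immediate": 1, "zero page": 1, "absolute": 2}

def assemble(mnemonic, mode, operand=0):
    """Return the machine language bytes for one instruction."""
    code = [OPCODES[(mnemonic, mode)]]
    size = OPERAND_BYTES[mode]
    if size >= 1:
        code.append(operand & 0xFF)          # low byte (or the only byte)
    if size == 2:
        code.append((operand >> 8) & 0xFF)   # high byte comes second
    return bytes(code)

lda_imm = assemble("LDA", "immediate", 7)       # LDA #7
lda_zp = assemble("LDA", "zero page", 0x77)     # LDA $77
lda_abs = assemble("LDA", "absolute", 0x7777)   # LDA $7777
inx = assemble("INX", "implied")                # INX
```

The same mnemonic (LDA) yields three different opcodes here, one per addressing mode, and the instruction length varies from one to three bytes.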

Addressing Modes of the 6502
Immediate
Symbol: #value (eg LDA #7). Operand size: 1 byte. The target of the mnemonic is the single byte immediately following the opcode
Implied
Symbol: none (eg INX). Operand size: 0 bytes. The target of the mnemonic is defined implicitly by the instruction itself
Accumulator
Symbol: none (eg ASL). Operand size: 0 bytes. The instruction operates explicitly on the Accumulator; really a form of Implied Addressing
Absolute
Symbol: address (eg LDA $7777). Operand size: 2 bytes. The target is the memory location whose address is the two bytes immediately following the opcode
Zero Page
Symbol: address (eg LDA $77). Operand size: 1 byte. The target is the Zero Page memory location whose address is the byte immediately following the opcode
Relative
Symbol: address (eg BEQ $7777). Operand size: 1 byte. The target is the memory location whose address is calculated by summing the Program Counter (which by then points at the next instruction) with the signed integer which immediately follows the opcode
Absolute Indexed by X
Symbol: address,X (eg LDA $7777,X). Operand size: 2 bytes. The target is the memory location whose address is calculated by summing the 16-bit integer immediately following the opcode with the contents of the X register
Zero Page Indexed by X
Symbol: address,X (eg LDA $77,X). Operand size: 1 byte. The target is the memory location whose address is calculated by summing the 8-bit integer immediately following the opcode with the contents of the X register. The sum is restricted to 8 bits, thus the target is always in Zero Page
Absolute Indexed by Y
Symbol: address,Y (eg LDA $7777,Y). Operand size: 2 bytes. The target is the memory location whose address is calculated by summing the 16-bit integer immediately following the opcode with the contents of the Y register
Zero Page Indexed by Y
Symbol: address,Y (eg LDX $77,Y). Operand size: 1 byte. The target is the memory location whose address is calculated by summing the 8-bit integer immediately following the opcode with the contents of the Y register. The sum is restricted to 8 bits, thus the target is always in Zero Page. This mode is available only for LDX and STX
Indirect
Symbol: (address) (eg JMP ($7777)). Operand size: 2 bytes. The target is the memory location whose address is contained in the two memory locations pointed to by the two bytes immediately following the opcode. This addressing mode has a known bug which causes the target to be miscalculated when those two memory locations cross a page boundary; the bug was corrected as of the WDC 65c02
Indirect Indexed (AKA "Indirect by Y")
Symbol: (address),Y (eg LDA ($77),Y). Operand size: 1 byte. The target is the memory location whose address is calculated by summing the 16-bit integer contained in the two Zero Page memory locations pointed to by the byte immediately following the opcode with the contents of the Y register
Indexed Indirect (AKA "Indirect by X")
Symbol: (address,X) (eg LDA ($77,X)). Operand size: 1 byte. The target is the memory location whose address is contained in the two memory locations whose Zero Page address is calculated by summing the byte immediately following the opcode with the contents of the X register
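
The address calculations described for the indexed, relative and indirect modes can be expressed as a short Python sketch. The function `effective_address` and its mode names are hypothetical, memory is modelled as a plain 64K `bytearray`, and the Zero Page wrap-around when reading a pointer is part of this simplified model. Treat it as a study aid under those assumptions rather than a full emulator.

```python
def effective_address(mode, operand, memory, x=0, y=0, pc=0):
    """Return the target address for a few 6502 addressing modes.

    pc is assumed to be the address of the instruction after the branch.
    """
    def zp_pointer(addr):
        # Read a 16-bit pointer from Zero Page, stored low byte first.
        # The high byte address wraps within Zero Page in this model.
        return memory[addr] | (memory[(addr + 1) & 0xFF] << 8)

    if mode in ("absolute", "zero page"):
        return operand
    if mode == "absolute,X":
        return (operand + x) & 0xFFFF
    if mode == "absolute,Y":
        return (operand + y) & 0xFFFF
    if mode == "zero page,X":
        return (operand + x) & 0xFF        # sum restricted to 8 bits
    if mode == "relative":
        # Treat the operand byte as a signed integer from -128 to +127.
        offset = operand - 0x100 if operand & 0x80 else operand
        return (pc + offset) & 0xFFFF
    if mode == "(indirect),Y":
        return (zp_pointer(operand) + y) & 0xFFFF
    if mode == "(indirect,X)":
        return zp_pointer((operand + x) & 0xFF)
    raise ValueError("mode not covered by this sketch: " + mode)

memory = bytearray(0x10000)
memory[0x77] = 0x00            # pointer at $77/$78 holds $2000,
memory[0x78] = 0x20            # stored low byte first

abs_x = effective_address("absolute,X", 0x7777, memory, x=1)
zp_x = effective_address("zero page,X", 0xF0, memory, x=0x20)
branch = effective_address("relative", 0xFE, memory, pc=0x1002)
ind_y = effective_address("(indirect),Y", 0x77, memory, y=5)
ind_x = effective_address("(indirect,X)", 0x70, memory, x=7)
```

Note how the zero page,X case wraps around ($F0 + $20 gives $10, not $110), and how the two indirect modes differ: (indirect),Y adds Y after fetching the pointer, while (indirect,X) adds X before fetching it.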

As you can see, there is an addressing mode for every occasion. The 6502 programmer should take great care in selecting the addressing mode which best suits his needs. Most times addressing modes are selected based on what is easiest to code and that is usually fine. Sometimes when speed and memory constraints are critical considerations it pays to spend some time considering what mode to use.

Page After Page

As you now know, a 6502 address is 16 bits wide. That means two 8-bit bytes are used to represent an address. The byte with the greatest numerical value is called the high-byte; the byte with the lowest numerical value is called the low-byte. You also should know that addresses in ML programs are stored low-byte first. In the address $C0F0 the high-byte is $C0, the low-byte is $F0 and the address is stored in memory as "... $F0,$C0 ..."

16-bit addresses are stored low-byte first
Binary        1100000011110000   becomes   11110000,11000000
Hexadecimal   $C0F0              becomes   $F0,$C0
Decimal       49392              becomes   240,192
High-Byte (aka Most Significant Byte)
In an address or other multi-byte value, the byte with the greatest numerical weight. The upper 8 bits of a 16-bit value
Low-Byte (aka Least Significant Byte)
In an address or other multi-byte value, the byte with the lowest numerical weight. The lower 8 bits of a 16-bit value
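
The low-byte-first ordering can be demonstrated in Python using the $C0F0 example from above. The helper names `to_low_high` and `from_low_high` are invented for this sketch; `int.to_bytes` with `"little"` is the built-in way to get the same byte order.

```python
def to_low_high(address):
    """Split a 16-bit address into (low byte, high byte)."""
    return address & 0xFF, (address >> 8) & 0xFF

def from_low_high(low, high):
    """Rebuild a 16-bit address from bytes stored low byte first."""
    return low | (high << 8)

address = 0xC0F0                          # 49392 in decimal
low, high = to_low_high(address)          # low is $F0, high is $C0
stored = address.to_bytes(2, "little")    # the two bytes as they sit in memory
rebuilt = from_low_high(stored[0], stored[1])
```

Reading the two stored bytes back in the same order and recombining them returns the original address.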

There is a more advantageous way to view memory which is closer to the way the 6502 actually works. Imagine memory as a book with 256 pages and each page containing 256 words. Now the high-byte becomes the Page Address and the low-byte becomes the Byte Offset. This analogy works and the terminology makes more sense than to view memory as one big line.

A page consists of 256 bytes
$00 $01 $02 $03 $04 $05 $06 $07 $08 $09 $0A $0B $0C $0D $0E $0F
$10 $11 $12 $13 $14 $15 $16 $17 $18 $19 $1A $1B $1C $1D $1E $1F
$20 $21 $22 $23 $24 $25 $26 $27 $28 $29 $2A $2B $2C $2D $2E $2F
$30 $31 $32 $33 $34 $35 $36 $37 $38 $39 $3A $3B $3C $3D $3E $3F
$40 $41 $42 $43 $44 $45 $46 $47 $48 $49 $4A $4B $4C $4D $4E $4F
$50 $51 $52 $53 $54 $55 $56 $57 $58 $59 $5A $5B $5C $5D $5E $5F
$60 $61 $62 $63 $64 $65 $66 $67 $68 $69 $6A $6B $6C $6D $6E $6F
$70 $71 $72 $73 $74 $75 $76 $77 $78 $79 $7A $7B $7C $7D $7E $7F
$80 $81 $82 $83 $84 $85 $86 $87 $88 $89 $8A $8B $8C $8D $8E $8F
$90 $91 $92 $93 $94 $95 $96 $97 $98 $99 $9A $9B $9C $9D $9E $9F
$A0 $A1 $A2 $A3 $A4 $A5 $A6 $A7 $A8 $A9 $AA $AB $AC $AD $AE $AF
$B0 $B1 $B2 $B3 $B4 $B5 $B6 $B7 $B8 $B9 $BA $BB $BC $BD $BE $BF
$C0 $C1 $C2 $C3 $C4 $C5 $C6 $C7 $C8 $C9 $CA $CB $CC $CD $CE $CF
$D0 $D1 $D2 $D3 $D4 $D5 $D6 $D7 $D8 $D9 $DA $DB $DC $DD $DE $DF
$E0 $E1 $E2 $E3 $E4 $E5 $E6 $E7 $E8 $E9 $EA $EB $EC $ED $EE $EF
$F0 $F1 $F2 $F3 $F4 $F5 $F6 $F7 $F8 $F9 $FA $FB $FC $FD $FE $FF

6502 Memory Map (Abstract)
Page $00   Page $01   ...   Page $CF   Page $D0   ...   Page $DF   Page $E0   ...   Page $FF
RAM                                    I/O                         ROM
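
The book analogy amounts to splitting an address into a Page Address and a Byte Offset. The Python sketch below does that split and then looks the page up in the abstract memory map shown above. The function names are made up for this example, and the region boundaries (pages $00-$CF RAM, $D0-$DF I/O, $E0-$FF ROM) are taken from that abstract map only; real machines differ.

```python
def page_and_offset(address):
    """Split a 16-bit address into (page address, byte offset)."""
    return address >> 8, address & 0xFF

def region(address):
    """Classify an address using the abstract memory map in this section."""
    page, _ = page_and_offset(address)
    if page <= 0xCF:
        return "RAM"
    if page <= 0xDF:
        return "I/O"
    return "ROM"

page, offset = page_and_offset(0xC0F0)   # page $C0, byte offset $F0
zero_page = page_and_offset(0x0077)      # Zero Page is simply page $00
where = [region(a) for a in (0x0077, 0xC0F0, 0xD020, 0xFFFC)]
```

Because a page is exactly 256 bytes, the page address is the high-byte of the address and the byte offset is the low-byte; no division is needed.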


This page was last updated March 18, 2008
© Dredd Productions, Ltd
