PROCESSOR DESIGN AND I/O ORGANIZATION: General Register Organization, Stack Organization, Addressing Mode, Instruction Format, Data Transfer And Manipulation, Program Control, Reduced Instruction Set Computer, I/O Interface, Modes Of Transfer, Direct Memory Access, Interrupts And Interrupt Handling, Input Output Processor, Serial Communication



The number of registers in a processor unit may vary from just one processor register to as many as 64 registers or more.
  • One of the CPU registers is called the accumulator (AC or 'A' register). It is the main operand register of the ALU.
  • The data register (DR) acts as a buffer between the CPU and main memory. It is used as an input operand register together with the accumulator.
  • The instruction register (IR) holds the opcode of the current instruction.
  • The address register (AR) holds the address of the memory location in which the operand resides.
  • The program counter (PC) holds the address of the next instruction to be fetched for execution.
Additional addressable registers can be provided for storing operands and addresses. This can be viewed as replacing the single accumulator by a set of registers. If the registers are used for many purposes, the resulting computer is said to have a general register organization. A particular register is selected by the multiplexers that form the buses. When a large number of registers are included in the CPU, it is most efficient to connect them through a common bus system. The registers communicate with each other not only for direct data transfers, but also while performing various micro-operations. Hence it is necessary to provide a common unit that can perform all the arithmetic, logic and shift micro-operations in the processor.

A Bus organization for seven CPU registers

The output of each register is connected to two multiplexers (MUX) to form the two buses A and B. The selection lines in each multiplexer select one register or the input data for the particular bus. The A and B buses form the inputs to a common ALU. The operation selected in the ALU determines the arithmetic or logic micro-operation that is to be performed. The result of the micro-operation is available for output and also goes into the inputs of the registers. The register that receives the information from the output bus is selected by a decoder. The decoder activates one of the register load inputs, thus providing a transfer path between the data in the output bus and the inputs of the selected destination register.

The control unit that operates the CPU bus system directs the information flow through the registers and ALU by selecting the various components in the system. For example, to perform the operation R1 ← R2 + R3, the control must provide binary selection variables to the following selector inputs:
  • MUX A selection (SEC A): to place the content of R2 into bus A
  • MUX B selection (SEC B): to place the content of R3 into bus B
  • ALU operation selection (OPR): to provide the arithmetic addition (A + B)
  • Decoder destination selection (SEC D): to transfer the content of the output bus into R1
These four control selection variables are generated in the control unit and must be available at the beginning of a clock cycle. The data from the two source registers propagate through the gates in the multiplexers and the ALU to the output bus and into the inputs of the destination register, all during one clock cycle interval.
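The transfer R1 ← R2 + R3 can be traced with a small software model of the bus system. This is only a sketch under stated assumptions: the 16-bit word size, the convention that selector value 0 means "external input" (or "no destination"), and the names `clock_cycle`, `sel_a`, `sel_b`, `opr` and `sel_d` are all hypothetical and chosen for illustration.

```python
# Hypothetical model of a seven-register common bus system.
ALU_OPS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
}

WORD_MASK = 0xFFFF  # assumption: 16-bit registers


def clock_cycle(regs, sel_a, sel_b, opr, sel_d, external_input=0):
    """Apply one control word and return the value placed on the output bus.

    Assumption: selector value 0 picks the external input (or no destination
    register for sel_d); values 1..7 pick R1..R7.
    """
    def mux(sel):
        return external_input if sel == 0 else regs[sel]

    bus_a = mux(sel_a)                      # MUX A drives bus A
    bus_b = mux(sel_b)                      # MUX B drives bus B
    result = ALU_OPS[opr](bus_a, bus_b) & WORD_MASK
    if sel_d != 0:                          # decoder enables one load input
        regs[sel_d] = result
    return result


regs = {i: 0 for i in range(1, 8)}          # R1..R7
regs[2], regs[3] = 5, 7
out = clock_cycle(regs, sel_a=2, sel_b=3, opr="ADD", sel_d=1)  # R1 <- R2 + R3
print(regs[1], out)  # prints "12 12"
```

The four arguments play the role of the four selection fields listed above; a real control unit would supply them as binary fields of a control word rather than as Python values.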



  • A very useful feature for nested subroutines and nested loop control
  • Also efficient for arithmetic expression evaluation
  • Storage that is accessed in LIFO (last-in, first-out) order
  • Pointer: SP (stack pointer)
  • Only PUSH and POP operations are applicable

Register Stack

Memory Stack Organisation

Memory with Program, Data, and Stack Segments

- A portion of memory is used as a stack with a processor register as a stack pointer

- Most computers do not provide hardware to check for stack overflow (full stack) or underflow (empty stack)
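Because overflow and underflow are usually not checked in hardware, the checks are often done in software. The following is a minimal sketch of a memory stack with such checks. It assumes the stack grows toward lower addresses and that SP points at the current top item; the class name `MemoryStack` and the `base`/`limit` parameters are hypothetical.

```python
class MemoryStack:
    """Stack kept in a region of main memory, addressed through SP."""

    def __init__(self, memory, base, limit):
        # Assumption: the stack occupies addresses limit..base-1 and grows
        # toward lower addresses; SP points at the current top item.
        self.memory = memory
        self.base = base
        self.limit = limit
        self.sp = base            # empty stack: SP sits just past the region

    def push(self, value):
        if self.sp == self.limit:             # software overflow check
            raise OverflowError("stack overflow")
        self.sp -= 1
        self.memory[self.sp] = value

    def pop(self):
        if self.sp == self.base:              # software underflow check
            raise IndexError("stack underflow")
        value = self.memory[self.sp]
        self.sp += 1
        return value


memory = [0] * 16
stack = MemoryStack(memory, base=16, limit=12)   # four-word stack segment
for item in (10, 20, 30):
    stack.push(item)
popped = [stack.pop(), stack.pop(), stack.pop()]
print(popped)  # prints [30, 20, 10]
```

The items come back in the reverse of the order in which they were pushed, which is the LIFO behaviour described above.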


  • Specifies a rule for interpreting or modifying the address field of the instruction (before the operand is actually referenced)

  • Variety of addressing modes

    • to give programming flexibility to the user

    • to use the bits in the address field of the instruction efficiently


    • Immediate Mode

    • Instead of specifying the address of the operand, the operand itself is specified
      - No need to specify an address in the instruction
      - However, the operand itself needs to be specified
      - Sometimes requires more bits than the address
      - Fast to acquire an operand
    • Register Addressing Mode

    • The address specified in the instruction is a register address
      - The designated operand needs to be in a register
      - Shorter address than a memory address
      - Saves address field bits in the instruction
      - Faster to acquire an operand than with memory addressing
      - EA = IR(R) (IR(R): register field of IR)
    • Register Indirect Mode

    • The instruction specifies a register which contains the memory address of the operand
      - Saves instruction bits since a register address is shorter than a memory address
      - Slower to acquire an operand than either register addressing or memory addressing
      - EA = [IR(R)] ([x]: content of x)
    • Direct Address Mode

    • The instruction specifies the memory address, which can be used directly to access physical memory
      - Faster than the other memory addressing modes
      - Too many bits are needed to specify the address for a large physical memory space
      - EA = IR(address) (IR(address): address field of IR)
    • Indexed Addressing Mode

    • The address of the operand is obtained by adding a constant value to the contents of a general register (called the index register)
      - The number of the index register and the constant value are included in the instruction code
      - Index mode is used to access an array whose elements are in successive memory locations
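The rules above can be compared side by side with a short sketch that computes the effective address (EA) and fetches the operand for each mode. The function name `fetch_operand`, the mode strings, and the register and memory contents are all made-up values for illustration only.

```python
def fetch_operand(mode, field, regs, memory, index_reg=None):
    """Return (effective_address, operand) for one addressing mode.

    effective_address is None when the operand does not come from memory.
    """
    if mode == "immediate":            # operand is the field itself
        return None, field
    if mode == "register":             # field is a register number
        return None, regs[field]
    if mode == "register_indirect":    # register holds the memory address
        ea = regs[field]
    elif mode == "direct":             # field is the memory address
        ea = field
    elif mode == "indexed":            # index register + constant in instruction
        ea = regs[index_reg] + field
    else:
        raise ValueError(f"unknown mode: {mode}")
    return ea, memory[ea]


# Hypothetical machine state.
regs = {1: 400, 2: 3}
memory = {400: 700, 500: 800, 503: 900}

print(fetch_operand("immediate", 500, regs, memory))               # (None, 500)
print(fetch_operand("register", 1, regs, memory))                  # (None, 400)
print(fetch_operand("register_indirect", 1, regs, memory))         # (400, 700)
print(fetch_operand("direct", 500, regs, memory))                  # (500, 800)
print(fetch_operand("indexed", 500, regs, memory, index_reg=2))    # (503, 900)
```

Note how the same field value 500 yields a different operand in the immediate, direct and indexed modes, which is exactly the flexibility the modes are meant to provide.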

    Addressing Modes-Example


    When the assembler processes an instruction, it converts the instruction from its mnemonic form to a standard machine-language (binary) format called an “instruction format”. In the process of conversion, the assembler must determine the type of instruction, convert symbolic labels and explicit notation to a base/displacement format, determine lengths of certain operands, and parse any literals and constants. Consider the following “Move Characters” instruction,
    MVC COSTOUT,COSTIN
    The assembler must determine the operation code (x’D2’) for MVC, determine the length of COSTOUT, and compute base/displacement addresses for both operands. After assembly, the result, which is called “object code”, might look something like the following in hexadecimal,
    D207C008C020
    The assembler generated 6 bytes (12 hex digits) of object code in a storage to storage type one format. In order to understand the object code which an assembler will generate, we need some familiarity with 5 basic instruction formats (there are other instruction types covering privileged and semiprivileged instructions which are beyond the scope of this discussion). First we consider the Storage to Storage type one (SS1) format listed below. This is the instruction format for the MVC instruction above.

    • Byte 1 - machine operation code
    • Byte 2 - length - 1 in bytes associated with operand 1
    • Byte 3 and 4 - the base/displacement address associated with operand 1
    • Byte 5 and 6 - the base/displacement address associated with operand 2
    Each box represents one byte or 8 bits and each letter represents a single hexadecimal digit or 4 bits. The subscripts indicate the number of the operand used in determining the contents of the byte. For example, the instruction format indicates that operand 1 is used to compute L1L1, the length associated with the instruction. If we reconsider the assembled form of the MVC instruction above we see that the op-code is x’D2’, and the length, derived from COSTOUT, is listed as x’07’. Since the assembler always decrements the length by 1 when converting to machine code, we determine that COSTOUT is 8 bytes long, so 8 bytes will be moved by this instruction. Additionally we see that the base register for COSTOUT is x’C’ (register 12) and the displacement is x’008’. The base/displacement address for COSTIN is x’C020’.
    Why was register 12 chosen as the base register? How were the displacements computed? These parts of the object code could not be determined by the information given in the example above. In order to determine base/displacement addresses we must examine the “USING” and “DROP” directives that are coded in the program. These directives are discussed in the topic called BASE DISPLACEMENT ADDRESSING.
    Being able to read object code is a necessary skill for an assembler programmer. Knowledge of an instruction’s format gives several important clues about the instruction. For example, knowing that MVC is a storage to storage type one instruction informs us that both operands are fields in memory and that the first operand will determine the number of bytes that will be moved. Since the length (L1L1) occupies one byte or 8 bits, the maximum length value we can encode is 2^8 - 1 = 255. Recall that the assembler decrements the length when assembling, so the instruction is capable of moving a maximum of 256 bytes. The 256 byte limitation is shared by all storage to storage type one instructions.
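The byte layout of the SS1 format can be checked by packing the field values quoted above (op-code x’D2’, an 8-byte first operand, and base/displacement addresses x’C008’ and x’C020’) into six bytes. This is an illustrative sketch only; `encode_ss1` is a hypothetical helper, not part of any real assembler.

```python
def encode_ss1(opcode, length, base1, disp1, base2, disp2):
    """Pack SS1 fields into 6 bytes of object code, returned as hex text."""
    if not 1 <= length <= 256:
        raise ValueError("SS1 length must be 1..256 bytes")
    for base in (base1, base2):
        if not 0 <= base <= 0xF:
            raise ValueError("base register must be 0..15")
    for disp in (disp1, disp2):
        if not 0 <= disp <= 0xFFF:
            raise ValueError("displacement must fit in 12 bits")
    code = bytes([
        opcode,
        length - 1,                              # stored as length - 1
        (base1 << 4) | (disp1 >> 8), disp1 & 0xFF,
        (base2 << 4) | (disp2 >> 8), disp2 & 0xFF,
    ])
    return code.hex().upper()


# Field values taken from the MVC discussion above.
mvc = encode_ss1(0xD2, 8, 0xC, 0x008, 0xC, 0x020)
print(mvc)  # prints D207C008C020
```

The range check on `length` mirrors the 256-byte limit: the largest value that fits in the length byte is 255, which encodes a 256-byte operand.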
    Storage to Storage type two (SS2) is a variation on SS1.

    • Byte 1 - machine operation code
    • Byte 2 - L1, the length associated with operand 1 (4 bits), and L2, the length associated with operand 2 (4 bits)
    • Byte 3 and 4 - the base/displacement address associated with operand 1
    • Byte 5 and 6 - the base/displacement address associated with operand 2
    The only difference between SS1 and SS2 is the length byte. Notice that both operands contribute a length in the second byte. Since each length is 4 bits, the maximum value that could be represented is 2^4 - 1 = 15. Again, since the assembler decrements the length by 1, the instruction can process operands that are as large as 16 bytes. There are many arithmetic instructions that require the machine to use the length of both operands. Consider the example below,
    Object Code     Source Code
                    AFIELD   DS   PL4
                    BFIELD   DS   PL2
                    ...
    FA31C300C304             AP   AFIELD,BFIELD
    AP (Add Packed) is an instruction whose format is SS2. Looking at the object code that was generated, we see that x’FA’ is the op-code and that the length of the first operand is x’3’, which was computed by subtracting 1 from the length of AFIELD. Similarly, the length of BFIELD was used to generate the second length of x’1’. In executing this instruction, the machine makes use of the size of both fields. In this case, a 2 byte field is added to a 4 byte field.
    A second type of instruction format is Register to Register (RR).

    • Byte 1 - machine operation code
    • Byte 2 - R1, the register which is operand 1, and R2, the register which is operand 2
    Instructions of this type have two operands, both of which are registers. An example of an instruction of this type is LR (Load Register). The effect of the instruction is to copy the contents of the register specified by operand 2 into the register specified by operand 1. The following LR instruction,
    LR R3,R12
    would cause the contents of register 12 to be copied into register 3. The assembler would produce the object code listed below as a result of the LR instruction.
    183C
    Examining the object code we see that the op-code is x’18’, operand 1 is register 3, and operand 2 is register 12. You should note that 4 bits are enough to represent any of the registers, which are numbered 0 through 15.
    A third type of instruction format is Register to Indexed Storage (RX).

    • Byte 1 - machine operation code
    • Byte 2 - R1, the register which is operand 1, and X2, the index register associated with operand 2
    • Byte 3 and 4 - the base/displacement address associated with operand 2
    For instructions of this type, the first operand is a register and the second operand is a storage location. The storage location is designated by a base/displacement address as well as an index register. The subject of index registers is discussed in the topic BASE DISPLACEMENT ADDRESSING. L (Load) is an example of an instruction of type RX. Consider the example below.
    L R5,TABLE(R7)
    The Load instruction copies a fullword from memory into a register. The above instruction might assemble as follows,
    5857C008
    The op-code is x’58’, operand 1 is specified as x’5’, the index register is denoted x’7’, and operand 2 generates the base/displacement address x’C008’. Again, from the information given in the example above, there is no way to determine how the base/displacement address was computed.
    Related to the RX type is a similar instruction format called Register to Storage (RS). In this type the index register is replaced by a register reference or a 4-bit mask (pattern).

    One instruction which has a Register to Storage format is STM (Store Multiple). An example of how STM can be coded is as follows,
    STM R14,R12,12(R13)
    The previous instruction would generate the following object code,
    90ECD00C
    where x’90’ is the op-code, x’E’ = 14 is operand 1, x’C’ = 12 is treated as R3, and x’D00C’ is generated from the explicit base/displacement address 12(R13).
    The fifth and final instruction format that we will consider is called Storage Immediate (SI). In this format, the second operand, called the immediate constant, resides in the second byte of the instruction. This constant is usually specified as a self-defining term. The format for SI instructions is listed below.

    • Byte 1 - machine operation code
    • Byte 2 - I2I2 - the immediate constant denoted in operand 2
    • Byte 3 and 4 - the base/displacement address associated with operand 1
    An example of a storage immediate instruction is Compare Logical Immediate (CLI). This instruction will compare one byte in storage to the immediate byte which resides in the instruction itself. We see from the instruction format that operand 2 is the immediate constant. For example, consider the instruction below.
    CLI CUSTTYPE,C’A’
    When assembled, the object code might look like the following,
    95C1C100
    The op-code is x’95’, the self-defining term C’A’ is converted to the EBCDIC representation x’C1’, and the variable CUSTTYPE would generate the base/displacement address x’C100’. Again, there is not enough information provided to determine the exact base/displacement address for CUSTTYPE. The x’C100’ address is merely an example of what might be generated.
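Reading object code amounts to splitting bytes into the fields of the relevant format. The sketch below does this for the SS2, RR, RX, RS and SI layouts described above. The `decode` function and its field names are hypothetical, and apart from the AP example the hex strings are simply assembled from the field values quoted in the text.

```python
def nibbles(byte):
    """Split one byte into its high and low 4-bit halves."""
    return byte >> 4, byte & 0xF


def base_disp(high, low):
    """Split two bytes into a 4-bit base register and 12-bit displacement."""
    return high >> 4, ((high & 0xF) << 8) | low


def decode(fmt, hex_text):
    """Return the fields of an object-code string for the given format."""
    code = bytes.fromhex(hex_text)
    fields = {"opcode": code[0]}
    if fmt == "RR":
        fields["r1"], fields["r2"] = nibbles(code[1])
    elif fmt == "RX":
        fields["r1"], fields["x2"] = nibbles(code[1])
        fields["b2"], fields["d2"] = base_disp(code[2], code[3])
    elif fmt == "RS":
        fields["r1"], fields["r3"] = nibbles(code[1])
        fields["b2"], fields["d2"] = base_disp(code[2], code[3])
    elif fmt == "SI":
        fields["i2"] = code[1]
        fields["b1"], fields["d1"] = base_disp(code[2], code[3])
    elif fmt == "SS2":
        l1, l2 = nibbles(code[1])
        fields["len1"], fields["len2"] = l1 + 1, l2 + 1   # undo the "- 1"
        fields["b1"], fields["d1"] = base_disp(code[2], code[3])
        fields["b2"], fields["d2"] = base_disp(code[4], code[5])
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return fields


ap = decode("SS2", "FA31C300C304")    # AP  AFIELD,BFIELD
lr = decode("RR", "183C")             # LR  R3,R12
load = decode("RX", "5857C008")       # L   R5,TABLE(R7)
stm = decode("RS", "90ECD00C")        # STM R14,R12,12(R13)
cli = decode("SI", "95C1C100")        # CLI CUSTTYPE,C'A'
for fields in (ap, lr, load, stm, cli):
    print(fields)
```

For the AP example the decoder recovers operand lengths of 4 and 2 bytes, matching the PL4 and PL2 definitions of AFIELD and BFIELD.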


    Computers provide an extensive set of instructions to give users the flexibility to carry out various computational tasks. The symbolic names given to instructions in assembly language notation can differ between computers, even for the same instruction. There is, however, a basic set of operations that all computers perform. These basic instructions can be classified into 3 categories:
    • Data Transfer Instructions
    • Data Manipulation Instructions
    • Program Control Instructions
    Data transfer instructions transfer data from one location to another without changing the binary information content. Data manipulation instructions perform arithmetic, logic and shift operations. Program control instructions provide decision-making capability and change the path taken by the program when it is executed in the computer.

    Data Transfer Instruction

    The data transfer instructions move data from one place within the computer to another without changing the data content. The transfer can be between memory and a processor register, between a processor register and input or output, and between the processor registers themselves. The different types of data transfer instructions and their mnemonics are shown in the table below:
    Name        Mnemonic
    Load        LD
    Store       ST
    Move        MV
    Exchange    XCH
    Input       IN
    Output      OUT
    Push        PUSH
    Pop         POP
    The load instruction transfers the content from memory to a processor register, usually the accumulator. The store instruction transfers the content from a processor register (AC) into memory. The move instruction transfers the content from one register into another or from one memory location to another. The exchange instruction exchanges the content of two registers or two memory locations. The input and output instructions transfer data from input devices to CPU registers and from CPU registers to output devices, respectively. The push and pop instructions perform the stack operations of insertion and deletion, respectively.

    Data Manipulation Instruction

    The data manipulation instructions perform operations on data and provide the computational capabilities of the computer. These instructions can be divided into 3 types:
    • Arithmetic Instruction
    • Logical and bit manipulation instruction
    • Shift Instruction

    Arithmetic instruction

    The four basic arithmetic operations are addition, subtraction, multiplication and division. Most computers provide instructions to perform all of these operations, but some computers provide only addition and subtraction. In that case, multiplication is performed by repeated addition and division by repeated subtraction, both controlled by software subroutines. Some typical arithmetic instructions are shown in the table below:
    Name                        Mnemonic
    Increment                   INC
    Decrement                   DEC
    Add                         ADD
    Subtract                    SUB
    Multiply                    MUL
    Divide                      DIV
    Add with carry              ADDC
    Subtract with borrow        SUBB
    Negate (2’s complement)     NEG
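On a machine that has only add and subtract instructions, multiplication and division can be built as software subroutines. The following is a minimal sketch restricted to non-negative integers; the function names are hypothetical, and real subroutines would typically use faster shift-and-add methods.

```python
def multiply(a, b):
    """Multiply two non-negative integers using only repeated addition."""
    product = 0
    for _ in range(b):
        product += a
    return product


def divide(dividend, divisor):
    """Return (quotient, remainder) using only repeated subtraction.

    Assumes a non-negative dividend and a positive divisor.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    quotient = 0
    while dividend >= divisor:
        dividend -= divisor
        quotient += 1          # corresponds to an INC instruction
    return quotient, dividend


print(multiply(6, 7))   # prints 42
print(divide(43, 6))    # prints (7, 1)
```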

    Logical and bit manipulation instruction

    Logical instructions perform binary operations on strings of bits stored in registers. They are used to manipulate individual bits or groups of bits that represent binary-coded information. The logical instructions consider each bit of the operand separately and treat it as a Boolean variable. They are used to change bit values, to clear a group of bits, or to insert new bit values into an operand stored in a register or in memory. The AND, OR and XOR instructions perform the corresponding logical operations on the individual bits of the operands. These instructions are also used for bit manipulation: a selected bit can be cleared to 0, set to 1, or complemented. The typical logical and bit manipulation instructions are:
    Name                Mnemonic
    Clear               CLR
    Complement          COM
    AND                 AND
    OR                  OR
    Ex-OR               XOR
    Clear carry         CLRC
    Set carry           SETC
    Enable interrupt    EI
    Disable interrupt   DI
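The three bit manipulation uses mentioned above map directly onto AND, OR and XOR with a mask. The sketch below assumes an 8-bit operand; the helper names are hypothetical.

```python
BYTE = 0xFF  # assumption: 8-bit operands


def clear_bits(value, mask):
    """Clear to 0 every bit selected by mask (AND with the complemented mask)."""
    return value & ~mask & BYTE


def set_bits(value, mask):
    """Set to 1 every bit selected by mask (OR with the mask)."""
    return (value | mask) & BYTE


def complement_bits(value, mask):
    """Complement every bit selected by mask (XOR with the mask)."""
    return (value ^ mask) & BYTE


value, mask = 0b10101010, 0b00001111
print(f"{clear_bits(value, mask):08b}")        # prints 10100000
print(f"{set_bits(value, mask):08b}")          # prints 10101111
print(f"{complement_bits(value, mask):08b}")   # prints 10100101
```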

    Shift instructions

    Instructions to shift the content of an operand are quite useful and are often provided in several variations; the bit shifted in at the end of the word determines the type of shift. Shift instructions may specify 3 different kinds of shift:
    • Logical shifts
    • Arithmetic shifts
    • Rotate-type operations
    Name                          Mnemonic
    Logical shift right           SHR
    Logical shift left            SHL
    Arithmetic shift right        SHRA
    Arithmetic shift left         SHLA
    Rotate right                  ROR
    Rotate left                   ROL
    Rotate right through carry    RORC
    Rotate left through carry     ROLC
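The difference between the shift types is only in which bit enters at the vacated end of the word. The following sketch models one-bit shifts on an assumed 8-bit word; the function names follow the mnemonics in the table, but the implementation is illustrative and omits overflow detection for the arithmetic shift left.

```python
WIDTH = 8                      # assumption: 8-bit words
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)


def shr(x):
    """Logical shift right: 0 enters at the left."""
    return (x & MASK) >> 1


def shl(x):
    """Logical shift left: 0 enters at the right."""
    return (x << 1) & MASK


def shra(x):
    """Arithmetic shift right: the sign bit is kept unchanged."""
    x &= MASK
    return (x >> 1) | (x & SIGN)


def ror(x):
    """Rotate right: the bit leaving on the right re-enters on the left."""
    x &= MASK
    return (x >> 1) | ((x & 1) << (WIDTH - 1))


def rol(x):
    """Rotate left: the bit leaving on the left re-enters on the right."""
    x &= MASK
    return ((x << 1) & MASK) | (x >> (WIDTH - 1))


def rorc(x, carry):
    """Rotate right through carry: returns (new_value, new_carry)."""
    x &= MASK
    return (x >> 1) | (carry << (WIDTH - 1)), x & 1


def rolc(x, carry):
    """Rotate left through carry: returns (new_value, new_carry)."""
    x &= MASK
    return ((x << 1) & MASK) | carry, x >> (WIDTH - 1)


x = 0b10010011
print(f"{shr(x):08b} {shl(x):08b} {shra(x):08b}")   # 01001001 00100110 11001001
print(f"{ror(x):08b} {rol(x):08b}")                 # 11001001 00100111
print(rorc(x, 0), rolc(x, 0))                       # (73, 1) (38, 1)
```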

    Program Control instruction

    Instructions are always stored in successive memory locations. When processed by the CPU, they are fetched from consecutive memory locations and executed. Each time an instruction is fetched from memory, the PC is incremented so that it contains the address of the next instruction in sequence. After the execution of a data transfer or data manipulation instruction, control returns to the fetch cycle with the program counter containing the address of the next instruction in sequence. When a program control instruction is executed, however, it may change the address value in the program counter and cause the flow of control to be altered. In other words, program control instructions specify the conditions for altering the content of the program counter, while data transfer and manipulation instructions specify the conditions for data processing operations. The typical program control instructions are given in the table below.
    Name              Mnemonic
    Branch            BR
    Jump              JMP
    Skip              SKP
    Call              CALL
    Return            RET
    Compare           CMP
    Test (ANDing)     TST
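The effect of program control instructions on the PC can be seen in a toy fetch-execute loop. Everything here is a made-up illustration: the tuple-based program encoding, the `run` function, and the conditional branch mnemonic `BRNZ` (branch if the last compare was not equal) are assumptions, not part of the instruction table above.

```python
def run(program, acc=0):
    """Execute a tiny program; return the final accumulator and the PC trace."""
    pc = 0
    equal_flag = False
    trace = []
    while pc < len(program):
        op, arg = program[pc]      # fetch
        trace.append(pc)
        pc += 1                    # PC now holds the next sequential address
        if op == "DEC":            # data manipulation: PC is left alone
            acc -= 1
        elif op == "CMP":          # sets a status flag, PC is left alone
            equal_flag = (acc == arg)
        elif op == "BRNZ":         # hypothetical conditional branch
            if not equal_flag:
                pc = arg           # program control: overwrite the PC
        elif op == "JMP":          # unconditional jump
            pc = arg
        elif op == "HLT":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return acc, trace


countdown = [
    ("DEC", None),   # address 0
    ("CMP", 0),      # address 1
    ("BRNZ", 0),     # address 2: loop back until acc reaches 0
    ("HLT", None),   # address 3
]
print(run(countdown, acc=2))  # prints (0, [0, 1, 2, 0, 1, 2, 3])
```

The trace shows the PC advancing sequentially except where the branch instruction replaces its content with the target address.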


    Historical Background

    IBM System/360, 1964
    - The real beginning of modern computer architecture
    - Distinction between Architecture and Implementation
    - Architecture: The abstract structure of a computer seen by an assembly-language programmer

    Continuing growth in semiconductor memory and microprogramming
    - Much richer and more complicated instruction sets
    - CISC (Complex Instruction Set Computer)

    - Arguments advanced at that time
      - Richer instruction sets would simplify compilers
      - Richer instruction sets would alleviate the software crisis
        - move as many functions to the hardware as possible
        - close the semantic gap between machine language and the high-level language
      - Richer instruction sets would improve the architecture quality

    Complex Instruction Set Computer (CISC)

    High Performance General Purpose Instructions

    Characteristics of CISC:

    • A large number of instructions (from 100-250 usually)
    • Some instructions that perform specialized tasks and are used infrequently.
    • Many addressing modes are used (5 to 20)
    • Variable length instruction format.
    • Instructions that manipulate operands in memory.

    Reduced Instruction Set Computer (RISC)

    * 1-cycle instructions
    - Most of the instructions complete their execution in 1 CPU clock cycle, like a microoperation
    * Functions of the instructions (in contrast to CISC)
    - Very simple functions
    - Very simple instruction format
    - Similar to microinstructions
    => No need for microprogrammed control
    * Register-register instructions
    - Avoid memory reference instructions except Load and Store instructions
    - Most of the operands can be found in the registers instead of main memory
    => Shorter instructions
    => Uniform instruction cycle
    => Requirement of a large number of registers
    * Employ an instruction pipeline


    Common RISC Characteristics

    - Operations are register-to-register, with only LOAD and STORE accessing memory
    - The operations and addressing modes are reduced
    - Instruction formats are simple

    RISC Characteristics

    - Relatively few instructions
    - Relatively few addressing modes
    - Memory access limited to load and store instructions
    - All operations done within the registers of the CPU
    - Fixed-length, easily decoded instruction format
    - Single-cycle instruction execution
    - Hardwired rather than microprogrammed control

    More RISC Characteristics

    - A relatively large number of registers in the processor unit
    - Efficient instruction pipeline
    - Compiler support: provides efficient translation of high-level language programs into machine language programs

    Advantages of RISC

    - VLSI Realization
    - Design Costs and Reliability
    - High Level Language Support

    Input/Output Interfaces

    * Provides a method for transferring information between internal storage (such as memory and CPU registers) and external I/O devices
    * Resolves the differences between the computer and peripheral devices

    I/O Bus And Interface Modules

    Each peripheral has an interface module associated with it. The interface:
    - Decodes the device address (device code)
    - Decodes the commands (operation)
    - Provides signals for the peripheral controller
    - Synchronizes the data flow and supervises the transfer rate between peripheral and CPU or Memory
    Typical I/O instruction

    Connection Of I/O Bus

    Connection of I/O Bus to One Interface

    I/O Bus And Memory Bus

    Functions of Buses
    * The memory bus is for information transfers between the CPU and main memory (MM)
    * The I/O bus is for information transfers between the CPU and I/O devices through their I/O interfaces
    Physical Organizations
    * Many computers use a common single bus system for both memory and I/O interface units
    - Use one common bus but separate control lines for each function
    - Use one common bus with common control lines for both functions
    * Some computer systems use two separate buses, one to communicate with memory and the other with I/O interfaces

    I/O Bus

    - Communication between CPU and all interface units is via a common I/O Bus
    - An interface connected to a peripheral device may have a number of data registers, a control register, and a status register
    - A command is passed to the peripheral by sending it to the appropriate interface register
    - Function code and sense lines are not needed (Transfer of data, control, and status information is always via the common I/O Bus)

    Isolated vs Memory Mapped I/O

    Isolated I/O

    - Separate I/O read/write control lines in addition to memory read/write control lines
    - Separate (isolated) memory and I/O address spaces
    - Distinct input and output instructions

    Memory-mapped I/O

    - A single set of read/write control lines (no distinction between memory and I/O transfer)
    - Memory and I/O addresses share the common address space
    -> reduces memory address range available
    - No specific input or output instruction
    -> The same memory reference instructions can be used for I/O transfers
    - Considerable flexibility in handling I/O operations
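The practical difference between the two schemes shows up in how a program reaches a device register. The following sketch contrasts them in software; the class names, the one-register `StatusDevice`, and the chosen port and address numbers are all hypothetical.

```python
class StatusDevice:
    """Hypothetical peripheral interface with a single 8-bit register."""

    def __init__(self):
        self.register = 0

    def read(self):
        return self.register

    def write(self, value):
        self.register = value & 0xFF


class IsolatedIOSystem:
    """Separate address spaces: memory uses load/store, devices use in/out."""

    def __init__(self, memory_size):
        self.memory = [0] * memory_size
        self.ports = {}                      # port number -> device

    def load(self, address):
        return self.memory[address]

    def store(self, address, value):
        self.memory[address] = value

    def io_in(self, port):
        return self.ports[port].read()

    def io_out(self, port, value):
        self.ports[port].write(value)


class MemoryMappedSystem:
    """One address space: load/store reach either memory or a device."""

    def __init__(self, memory_size):
        self.memory = [0] * memory_size
        self.devices = {}                    # address -> device

    def load(self, address):
        device = self.devices.get(address)
        return self.memory[address] if device is None else device.read()

    def store(self, address, value):
        device = self.devices.get(address)
        if device is None:
            self.memory[address] = value
        else:
            device.write(value)


isolated = IsolatedIOSystem(256)
isolated.ports[5] = StatusDevice()
isolated.store(5, 11)        # memory address 5
isolated.io_out(5, 22)       # I/O port 5: a different address space
print(isolated.load(5), isolated.io_in(5))   # prints "11 22"

mapped = MemoryMappedSystem(256)
mapped.devices[0xF0] = StatusDevice()
mapped.store(0xF0, 33)       # an ordinary store reaches the device
print(mapped.load(0xF0), mapped.memory[0xF0])  # prints "33 0"
```

In the memory-mapped case address 0xF0 is no longer usable as ordinary memory, which is the reduction in memory address range noted above.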

    Programmable Interface

    - Information in each port can be assigned a meaning depending on the mode of operation of the I/O device
    -> Port A = Data; Port B = Command; Port C = Status
    - CPU initializes (loads) each port by transferring a byte to the Control Register
    -> Allows the CPU to define the mode of operation of each port
    -> Programmable Port: By changing the bits in the control register, it is possible to change the interface characteristics


    Serial Data Transmission

    Transfers one bit at a time on one data line

    Parallel Data Transmission

    • N bits transmitted at a time over N data lines

    • Synchronization among all N bits

    Note: each group of N bits is called a word
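The two transmission styles can be compared with a short sketch that turns a word into the bit sequence sent over a single line and counts the transfer steps each style needs. The least-significant-bit-first order and the idealized "one step per bit or per word" timing are assumptions made for illustration.

```python
def serialize(word, n_bits=8):
    """Return the bits of word in the order sent on one line (LSB first)."""
    return [(word >> i) & 1 for i in range(n_bits)]


def deserialize(bits):
    """Rebuild the word from bits received LSB first."""
    return sum(bit << i for i, bit in enumerate(bits))


def transfer_steps(n_words, n_bits, parallel):
    """Idealized step count: parallel moves one word per step, serial one bit."""
    return n_words if parallel else n_words * n_bits


bits = serialize(0x41)
print(bits)                                   # prints [1, 0, 0, 0, 0, 0, 1, 0]
print(hex(deserialize(bits)))                 # prints 0x41
print(transfer_steps(4, 8, parallel=False))   # prints 32
print(transfer_steps(4, 8, parallel=True))    # prints 4
```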

    Bit Rate Comparison


    This chapter looks at how interrupts are handled by the Linux kernel. Whilst the kernel has generic mechanisms and interfaces for handling interrupts, most of the interrupt handling details are architecture specific.
    Figure 7.1: A Logical Diagram of Interrupt Routing

    Linux uses a lot of different pieces of hardware to perform many different tasks. The video device drives the monitor, the IDE device drives the disks and so on. You could drive these devices synchronously, that is you could send a request for some operation (say writing a block of memory out to disk) and then wait for the operation to complete. That method, although it would work, is very inefficient and the operating system would spend a lot of time “busy doing nothing” as it waited for each operation to complete. A better, more efficient, way is to make the request and then do other, more useful work and later be interrupted by the device when it has finished the request. With this scheme, there may be many outstanding requests to the devices in the system all happening at the same time.
    There has to be some hardware support for the devices to interrupt whatever the CPU is doing. Most, if not all, general purpose processors such as the Alpha AXP use a similar method. Some of the physical pins of the CPU are wired such that changing the voltage (for example changing it from +5v to -5v) causes the CPU to stop what it is doing and to start executing special code to handle the interruption: the interrupt handling code. One of these pins might be connected to an interval timer and receive an interrupt every 1000th of a second; others may be connected to the other devices in the system, such as the SCSI controller.
    Systems often use an interrupt controller to group the device interrupts together before passing on the signal to a single interrupt pin on the CPU. This saves interrupt pins on the CPU and also gives flexibility when designing systems. The interrupt controller has mask and status registers that control the interrupts. Setting the bits in the mask register enables and disables interrupts and the status register returns the currently active interrupts in the system.
    Some of the interrupts in the system may be hard-wired, for example, the real time clock's interval timer may be permanently connected to pin 3 on the interrupt controller. However, what some of the pins are connected to may be determined by what controller card is plugged into a particular ISA or PCI slot. For example, pin 4 on the interrupt controller may be connected to PCI slot number 0 which might one day have an ethernet card in it but the next have a SCSI controller in it. The bottom line is that each system has its own interrupt routing mechanisms and the operating system must be flexible enough to cope.
    Most modern general purpose microprocessors handle the interrupts the same way. When a hardware interrupt occurs the CPU stops executing the instructions that it was executing and jumps to a location in memory that either contains the interrupt handling code or an instruction branching to the interrupt handling code. This code usually operates in a special mode for the CPU, interrupt mode, and, normally, no other interrupts can happen in this mode. There are exceptions though; some CPUs rank the interrupts in priority and higher level interrupts may happen. This means that the first level interrupt handling code must be very carefully written and it often has its own stack, which it uses to store the CPU's execution state (all of the CPU's normal registers and context) before it goes off and handles the interrupt. Some CPUs have a special set of registers that only exist in interrupt mode, and the interrupt code can use these registers to do most of the context saving it needs to do.
    When the interrupt has been handled, the CPU's state is restored and the interrupt is dismissed. The CPU will then continue doing whatever it was doing before being interrupted. It is important that the interrupt processing code is as efficient as possible and that the operating system does not block interrupts too often or for too long.

    Programmable Interrupt Controllers

    Systems designers are free to use whatever interrupt architecture they wish but IBM PCs use the Intel 82C59A-2 CMOS Programmable Interrupt Controller or its derivatives. This controller has been around since the dawn of the PC and it is programmable with its registers being at well known locations in the ISA address space. Even very modern support logic chip sets keep equivalent registers in the same place in ISA memory. Non-Intel based systems such as Alpha AXP based PCs are free from these architectural constraints and so often use different interrupt controllers.
    Figure 7.1 shows that there are two 8 bit controllers chained together, PIC1 and PIC2, each having a mask and an interrupt status register. The mask registers are at addresses 0x21 and 0xA1 and the status registers are at 0x20 and 0xA0. Writing a one to a particular bit of the mask register enables an interrupt, writing a zero disables it. So, writing one to bit 3 would enable interrupt 3, writing zero would disable it. Unfortunately (and irritatingly), the interrupt mask registers are write only; you cannot read back the value that you wrote. This means that Linux must keep a local copy of what it has set the mask registers to. It modifies these saved masks in the interrupt enable and disable routines and writes the full masks to the registers every time.
    When an interrupt is signalled, the interrupt handling code reads the two interrupt status registers (ISRs). It treats the ISR at 0x20 as the bottom eight bits of a sixteen bit interrupt register and the ISR at 0xA0 as the top eight bits. So, an interrupt on bit 1 of the ISR at 0xA0 would be treated as system interrupt 9. Bit 2 of PIC1 is not available as this is used to chain interrupts from PIC2; any interrupt on PIC2 results in bit 2 of PIC1 being set.
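The cached-mask technique and the sixteen bit view of the two status registers can be sketched as follows. This is not kernel code: `FakePorts` is a hypothetical stand-in for port I/O, and `InterruptMasks` and `pending_irqs` are illustrative names. The mask polarity follows the description above (1 enables); on actual 8259A parts a 1 in the mask register masks (disables) the line, so a real driver would invert it.

```python
PIC1_MASK, PIC2_MASK = 0x21, 0xA1       # addresses quoted in the text
PIC1_STATUS, PIC2_STATUS = 0x20, 0xA0


class FakePorts:
    """Hypothetical stand-in for the I/O ports (mask registers are write-only)."""

    def __init__(self):
        self.written = {}
        self.status = {PIC1_STATUS: 0, PIC2_STATUS: 0}

    def outb(self, value, port):
        self.written[port] = value

    def inb(self, port):
        return self.status[port]


class InterruptMasks:
    """Keeps a local 16-bit copy of both masks and rewrites them in full."""

    def __init__(self, ports):
        self.ports = ports
        self.cached = 0x0000

    def _flush(self):
        self.ports.outb(self.cached & 0xFF, PIC1_MASK)
        self.ports.outb(self.cached >> 8, PIC2_MASK)

    def enable(self, irq):
        self.cached |= 1 << irq          # polarity as described in the text
        self._flush()

    def disable(self, irq):
        self.cached &= ~(1 << irq)
        self._flush()


def pending_irqs(ports):
    """Combine both status registers; PIC2 supplies the top eight bits."""
    status = ports.inb(PIC1_STATUS) | (ports.inb(PIC2_STATUS) << 8)
    # Bit 2 of PIC1 only signals "something on PIC2", so it is skipped.
    return [irq for irq in range(16) if status & (1 << irq) and irq != 2]


ports = FakePorts()
masks = InterruptMasks(ports)
masks.enable(3)
masks.enable(9)
print(hex(ports.written[PIC1_MASK]), hex(ports.written[PIC2_MASK]))  # 0x8 0x2

ports.status[PIC2_STATUS] = 0b00000010   # bit 1 of the ISR at 0xA0
ports.status[PIC1_STATUS] = 0b00000100   # cascade bit set by PIC2
print(pending_irqs(ports))               # prints [9]
```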

    Initializing the Interrupt Handling Data Structures

    The kernel's interrupt handling data structures are set up by the device drivers as they request control of the system's interrupts. To do this the device driver uses a set of Linux kernel services that are used to request an interrupt, enable it and to disable it. The individual device drivers call these routines to register their interrupt handling routine addresses.
    Some interrupts are fixed by convention for the PC architecture and so the driver simply requests its interrupt when it is initialized. This is what the floppy disk device driver does; it always requests IRQ 6. There may be occasions when a device driver does not know which interrupt the device will use. This is not a problem for PCI device drivers as they always know what their interrupt number is. Unfortunately there is no easy way for ISA device drivers to find their interrupt number. Linux solves this problem by allowing device drivers to probe for their interrupts. First, the device driver does something to the device that causes it to interrupt. Then all of the unassigned interrupts in the system are enabled. This means that the device's pending interrupt will now be delivered via the programmable interrupt controller. Linux reads the interrupt status register and returns its contents to the device driver. A non-zero result means that one or more interrupts occurred during the probe. The driver now turns probing off and the unassigned interrupts are all disabled. If the ISA device driver has successfully found its IRQ number then it can now request control of it as normal.
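    The probing sequence can be sketched as follows. This is a simulation under stated assumptions: FakeController and probe_irq are hypothetical names invented for the example and are not the kernel's own interfaces.

```python
class FakeController:
    """Stand-in for the programmable interrupt controller (simulation only)."""

    def __init__(self):
        self.enabled = 0   # one bit per enabled interrupt
        self.pending = 0   # one bit per interrupt raised by a device

    def enable(self, irq):
        self.enabled |= 1 << irq

    def disable(self, irq):
        self.enabled &= ~(1 << irq)

    def raise_irq(self, irq):
        self.pending |= 1 << irq

    def read_status(self):
        # Only enabled interrupts are delivered through the controller.
        return self.pending & self.enabled


def probe_irq(trigger_device, unassigned_irqs, controller):
    """Return the set of unassigned interrupts that fired during the probe."""
    trigger_device()                  # make the device raise its interrupt
    for irq in unassigned_irqs:       # enable every interrupt nobody has claimed
        controller.enable(irq)
    status = controller.read_status()
    for irq in unassigned_irqs:       # probing is over: disable them again
        controller.disable(irq)
    return {irq for irq in unassigned_irqs if status & (1 << irq)}
```

    An empty result corresponds to the zero status described above: no interrupt occurred during the probe, so the driver has not found its IRQ number.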
    PCI based systems are much more dynamic than ISA based systems. The interrupt pin that an ISA device uses is often set using jumpers on the hardware device and fixed in the device driver. On the other hand, PCI devices have their interrupts allocated by the PCI BIOS or the PCI subsystem as PCI is initialized when the system boots. Each PCI device may use one of four interrupt pins, A, B, C or D. This was fixed when the device was built and most devices default to interrupt on pin A. The PCI interrupt lines A, B, C and D for each PCI slot are routed to the interrupt controller. So, pin A from PCI slot 4 might be routed to pin 6 of the interrupt controller, pin B of PCI slot 4 to pin 7 of the interrupt controller and so on. How the PCI interrupts are routed is entirely system specific and there must be some set up code which understands this PCI interrupt routing topology. On Intel based PCs this is the system BIOS code that runs at boot time, but for systems without a BIOS (for example Alpha AXP based systems) the Linux kernel does this setup.
    The PCI set up code writes the pin number of the interrupt controller into the PCI configuration header for each device. It determines the interrupt pin (or IRQ) number using its knowledge of the PCI interrupt routing topology together with the device's PCI slot number and the PCI interrupt pin that the device is using. The interrupt pin that a device uses is fixed and is kept in a field in the PCI configuration header for this device. The set up code writes the interrupt controller pin number into the interrupt line field that is reserved for this purpose. When the device driver runs, it reads this information and uses it to request control of the interrupt from the Linux kernel. There may be many PCI interrupt sources in the system, for example when PCI-PCI bridges are used. The number of interrupt sources may exceed the number of pins on the system's programmable interrupt controllers. In this case, PCI devices may share interrupts, with one pin on the interrupt controller taking interrupts from more than one PCI device. Linux supports this by allowing the first requestor of an interrupt source to declare whether it may be shared. Sharing interrupts results in several irqaction data structures being pointed at by one entry in the irq_action vector. When a shared interrupt happens, Linux will call all of the interrupt handlers for that source. Any device driver that can share interrupts (which should be all PCI device drivers) must be prepared to have its interrupt handler called when there is no interrupt to be serviced.
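    A minimal sketch of the set up step, assuming the routing topology can be represented as a lookup table from (slot, interrupt pin) to interrupt controller pin. The table contents reuse the slot 4 example from the text; the class, field and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PciConfigHeader:
    """Only the fields needed for this example."""
    slot: int
    interrupt_pin: str                    # "A".."D", fixed when the device was built
    interrupt_line: Optional[int] = None  # filled in by the set up code


# System-specific routing topology: (slot, PCI pin) -> interrupt controller pin.
ROUTING = {
    (4, "A"): 6,
    (4, "B"): 7,
}


def assign_interrupt_lines(devices, routing):
    """Write the interrupt controller pin into each device's interrupt line field."""
    for dev in devices:
        dev.interrupt_line = routing[(dev.slot, dev.interrupt_pin)]
```

    A driver would later read interrupt_line from its device's header and request that interrupt, without needing to know anything about the routing table itself.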

    Interrupt Handling

    Figure 7.2: Linux Interrupt Handling Data Structures

    One of the principal tasks of Linux's interrupt handling subsystem is to route the interrupts to the right pieces of interrupt handling code. This code must understand the interrupt topology of the system. If, for example, the floppy controller interrupts on pin 6 of the interrupt controller then it must recognize the interrupt as from the floppy and route it to the floppy device driver's interrupt handling code. Linux uses a set of pointers to data structures containing the addresses of the routines that handle the system's interrupts. These routines belong to the device drivers for the devices in the system and it is the responsibility of each device driver to request the interrupt that it wants when the driver is initialized. Figure 7.2 shows that irq_action is a vector of pointers to the irqaction data structure. Each irqaction data structure contains information about the handler for this interrupt, including the address of the interrupt handling routine. As the number of interrupts and how they are handled varies between architectures and, sometimes, between systems, the Linux interrupt handling code is architecture specific. This means that the size of the irq_action vector varies depending on the number of interrupt sources that there are. When the interrupt happens, Linux must first determine its source by reading the interrupt status register of the system's programmable interrupt controllers. It then translates that source into an offset into the irq_action vector. So, for example, an interrupt on pin 6 of the interrupt controller from the floppy controller would be translated into the seventh pointer in the vector of interrupt handlers. If there is no interrupt handler for the interrupt that occurred then the Linux kernel will log an error; otherwise it will call into the interrupt handling routines for all of the irqaction data structures for this interrupt source.
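    The irq_action vector and its dispatch logic can be modelled roughly as below. This is a simplified sketch and not the kernel's code: the vector size, the request_irq and do_irq signatures, and the shared flag are assumptions chosen to mirror the description.

```python
NR_IRQS = 16  # assumed size; the real size is architecture specific


class IrqAction:
    """One registered handler; shared interrupts form a chain via `next`."""

    def __init__(self, handler, name, shared):
        self.handler = handler
        self.name = name
        self.shared = shared
        self.next = None


irq_action = [None] * NR_IRQS  # vector of pointers to IrqAction chains


def request_irq(irq, handler, name, shared=False):
    """Register a handler; return False if the interrupt cannot be shared."""
    new = IrqAction(handler, name, shared)
    head = irq_action[irq]
    if head is None:
        irq_action[irq] = new
        return True
    if not (head.shared and shared):
        return False  # the first requestor did not allow sharing
    tail = head
    while tail.next is not None:
        tail = tail.next
    tail.next = new
    return True


def do_irq(irq, log):
    """Call every handler for this source; log an error if there is none."""
    action = irq_action[irq]
    if action is None:
        log.append(f"unexpected interrupt {irq}")
        return 0
    called = 0
    while action is not None:
        action.handler(irq)  # every handler on a shared line is called
        called += 1
        action = action.next
    return called
```

    Since every handler in a chain is called, a handler on a shared line has to check its own device and return quietly when that device did not interrupt.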
    When the device driver's interrupt handling routine is called by the Linux kernel it must efficiently work out why it was interrupted and respond. To find the cause of the interrupt the device driver would read the status register of the device that interrupted. The device may be reporting an error or that a requested operation has completed. For example, the floppy controller may be reporting that it has completed the positioning of the floppy's read head over the correct sector on the floppy disk. Once the reason for the interrupt has been determined, the device driver may need to do more work. If it does, the Linux kernel has mechanisms that allow it to postpone that work until later. This avoids the CPU spending too much time in interrupt mode. See the Device Driver chapter for more details.
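    The idea of doing only the minimum in interrupt mode and postponing the rest can be sketched with a simple work queue. The status bit layout, the queue and both function names are hypothetical; the kernel's actual deferral mechanisms are not shown here.

```python
from collections import deque

STATUS_ERROR = 0x01      # hypothetical device status bits
STATUS_SEEK_DONE = 0x02

deferred_work = deque()  # work postponed until after interrupt mode


def device_interrupt(read_status):
    """Interrupt-mode part: find the cause quickly and queue any longer work."""
    status = read_status()
    if status & STATUS_ERROR:
        deferred_work.append("report error")
    elif status & STATUS_SEEK_DONE:
        deferred_work.append("start data transfer")
    # Otherwise the device has nothing to report (for example on a shared line).


def run_deferred_work():
    """Later, outside interrupt mode: carry out the postponed work in order."""
    done = []
    while deferred_work:
        done.append(deferred_work.popleft())
    return done
```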


    The code in the OS for programmed I/O would look more like:

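    keyboard_wait:                           ; for get_ch
        test Keyboard_Status, 80000000h      ; is the ready bit set?
        jz   keyboard_wait                   ; no: keep polling
        mov  eax, Keyboard_Data              ; yes: read the character

    display_wait:                            ; for put_ch
        test Display_Status, 80000000h       ; is the ready bit set?
        jz   display_wait                    ; no: keep polling
        mov  Display_Data, eax               ; yes: write the character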

    This scheme is known as BUSY WAITING, or SPIN WAITING. The little loop is called a SPIN WAIT LOOP.
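    The same spin wait loop can be mimicked in a small simulation to show where the CPU time goes. FakeKeyboard is a hypothetical stand-in for the device, and the spin counter exists only to make the wasted iterations visible.

```python
READY = 0x80000000  # the same bit that the assembly listing tests


class FakeKeyboard:
    """Becomes ready after a fixed number of status reads (simulation only)."""

    def __init__(self, char, polls_until_ready):
        self.data = ord(char)
        self._remaining = polls_until_ready

    def status(self):
        if self._remaining > 0:
            self._remaining -= 1
            return 0
        return READY


def get_ch(keyboard):
    """Spin until the ready bit is set, then read the data register."""
    spins = 0
    while (keyboard.status() & READY) == 0:  # test / jz keyboard_wait
        spins += 1                           # the CPU does no useful work here
    return chr(keyboard.data), spins         # mov eax, Keyboard_Data
```

    Every iteration counted in spins is processor time spent doing nothing but rereading the status register, which is the cost that interrupt-driven I/O avoids.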



    An important aspect governing computer system performance is the transfer of data between memory and I/O devices. The operation involves loading programs or data files from disk into memory, saving files to disk, and accessing virtual memory pages on any secondary storage medium.

    Computer System with DMA

    Consider a typical system consisting of a CPU, memory and one or more input/output devices, as shown in the figure. Assume one of the I/O devices is a disk drive and that the computer must load a program from this drive into memory. The CPU would read the first byte of the program and then write that byte to memory. Then it would do the same for the second byte, and so on, until it had loaded the entire program into memory. This process proves to be inefficient. Loading data into, and then writing data out of, the CPU significantly slows down the transfer. The CPU does not modify the data at all, so it only serves as an additional stop for data on the way to its final destination. The process would be much quicker if we could bypass the CPU and transfer data directly from the I/O device to memory. Direct Memory Access does exactly that.
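    The difference between the two paths can be illustrated by counting data movements. This is a toy model under a loud assumption: each movement of one byte counts as one transfer, and set up costs are ignored. The function names are hypothetical.

```python
def cpu_copy(device_bytes, memory, start):
    """Programmed transfer: every byte passes through a CPU register."""
    transfers = 0
    for offset, byte in enumerate(device_bytes):
        register = byte                     # device -> CPU
        transfers += 1
        memory[start + offset] = register   # CPU -> memory
        transfers += 1
    return transfers


def dma_copy(device_bytes, memory, start):
    """Direct transfer: every byte goes straight from the device to memory."""
    transfers = 0
    for offset, byte in enumerate(device_bytes):
        memory[start + offset] = byte       # device -> memory
        transfers += 1
    return transfers
```

    Under this model the programmed transfer needs twice as many movements for the same data, because the CPU acts as an extra stop on the way to memory.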

    Implementing DMA in a Computer System

    A DMA controller implements direct memory access in a computer system. It connects directly to the I/O device at one end and to the system buses at the other end. It also interacts with the CPU, both via the system buses and two new direct connections. It is sometimes referred to as a channel. In an alternate configuration, the DMA controller may be incorporated directly into the I/O device.


    Two forms of communication
    1. Parallel communication

      • Transfers more than one bit of data at a given time
      • N bits are transmitted at the same time over n wires
      • Faster, but requires many wires and is used over short distances
      • EX: Input/output devices, DMA controllers, and I/O processors

    2. Serial Communication

      • Serial communication transfers one bit of data at a time; serial devices cannot handle more than one bit at any given time by design.
      • Requires one wire and is slower.
      • The CPU usually uses parallel communication; if the device is serial, the data is converted between serial and parallel form
      • EX: Modems

      Two types of Serial Communication

      1. Asynchronous Serial Communication

        • Interacts with devices outside of the computer
        • Ex: modem connecting to another computer
        • Transmit individual bytes instead of large blocks
        • Do not share a common clock.

      2. Synchronous Serial transmission

        • Transmits blocks of data in frames.
        • Each frame has a head in front of the data and a tail at the end of the data.
        • The head and tail contain information that allows the two computers to synchronize their clocks.
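    The conversion between parallel and serial form for asynchronous transmission of individual bytes can be sketched as below. The framing used here (one start bit, eight data bits sent least significant bit first, one stop bit) is a common convention assumed for this example and is not specified in the notes above; the function names are hypothetical.

```python
def to_serial(byte):
    """Parallel to serial: frame one byte as start bit, 8 data bits, stop bit."""
    data_bits = [(byte >> i) & 1 for i in range(8)]  # least significant bit first
    return [0] + data_bits + [1]                     # start bit = 0, stop bit = 1


def from_serial(bits):
    """Serial to parallel: check the framing and rebuild the byte."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))
```

    Since the two ends do not share a common clock, the start bit is what tells the receiver that a new byte is beginning on the wire.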