# COMPUTER ORGANIZATION AND ARCHITECTURE

DESIGNING FOR PERFORMANCE

TENTH EDITION

GLOBAL EDITION

# William Stallings

With contribution by Peter Zeno, University of Bridgeport

With Foreword by Chris Jesshope, Professor (emeritus), University of Amsterdam

**PEARSON**

# CONTENTS

- Foreword
- Preface
- About the Author

## PART ONE INTRODUCTION

### Chapter 1 Basic Concepts and Computer Evolution

- 1.1 Organization and Architecture
- 1.2 Structure and Function
- 1.3 A Brief History of Computers
- 1.4 The Evolution of the Intel x86 Architecture
- 1.5 Embedded Systems
- 1.6 ARM Architecture
- 1.7 Cloud Computing
- 1.8 Key Terms, Review Questions, and Problems

### Chapter 2 Performance Issues

- 2.1 Designing for Performance
- 2.2 Multicore, MICs, and GPGPUs
- 2.3 Two Laws that Provide Insight: Amdahl's Law and Little's Law
- 2.4 Basic Measures of Computer Performance
- 2.5 Calculating the Mean
- 2.6 Benchmarks and SPEC
- 2.7 Key Terms, Review Questions, and Problems

## PART TWO THE COMPUTER SYSTEM

### Chapter 3 A Top-Level View of Computer Function and Interconnection

- 3.1 Computer Components
- 3.2 Computer Function
- 3.3 Interconnection Structures
- 3.4 Bus Interconnection
- 3.5 Point-to-Point Interconnect
- 3.6 PCI Express
- 3.7 Key Terms, Review Questions, and Problems

### Chapter 4 Cache Memory

- 4.1 Computer Memory System Overview
- 4.2 Cache Memory Principles
- 4.3 Elements of Cache Design
- 4.4 Pentium 4 Cache Organization
- 4.5 Key Terms, Review Questions, and Problems
- Appendix 4A Performance Characteristics of Two-Level Memories

### Chapter 5 Internal Memory

- 5.1 Semiconductor Main Memory
- 5.2 Error Correction
- 5.3 DDR DRAM
- 5.4 Flash Memory
- 5.5 Newer Nonvolatile Solid-State Memory Technologies
- 5.6 Key Terms, Review Questions, and Problems

### Chapter 6 External Memory

- 6.1 Magnetic Disk
- 6.2 RAID
- 6.3 Solid State Drives
- 6.4 Optical Memory
- 6.5 Magnetic Tape
- 6.6 Key Terms, Review Questions, and Problems

### Chapter 7 Input/Output

- 7.1 External Devices
- 7.2 I/O Modules
- 7.3 Programmed I/O
- 7.4 Interrupt-Driven I/O
- 7.5 Direct Memory Access
- 7.6 Direct Cache Access
- 7.7 I/O Channels and Processors
- 7.8 External Interconnection Standards
- 7.9 IBM zEnterprise EC12 I/O Structure
- 7.10 Key Terms, Review Questions, and Problems

### Chapter 8 Operating System Support

- 8.1 Operating System Overview
- 8.2 Scheduling
- 8.3 Memory Management
- 8.4 Intel x86 Memory Management
- 8.5 ARM Memory Management
- 8.6 Key Terms, Review Questions, and Problems

## PART THREE ARITHMETIC AND LOGIC

### Chapter 9 Number Systems

- 9.1 The Decimal System
- 9.2 Positional Number Systems
- 9.3 The Binary System
- 9.4 Converting Between Binary and Decimal
- 9.5 Hexadecimal Notation
- 9.6 Key Terms and Problems

### Chapter 10 Computer Arithmetic

- 10.1 The Arithmetic and Logic Unit
- 10.2 Integer Representation
- 10.3 Integer Arithmetic
- 10.4 Floating-Point Representation
- 10.5 Floating-Point Arithmetic
- 10.6 Key Terms, Review Questions, and Problems

### Chapter 11 Digital Logic

- 11.1 Boolean Algebra
- 11.2 Gates
- 11.3 Combinational Circuits
- 11.4 Sequential Circuits
- 11.5 Programmable Logic Devices
- 11.6 Key Terms and Problems

## PART FOUR THE CENTRAL PROCESSING UNIT

### Chapter 12 Instruction Sets: Characteristics and Functions

- 12.1 Machine Instruction Characteristics
- 12.2 Types of Operands
- 12.3 Intel x86 and ARM Data Types
- 12.4 Types of Operations
- 12.5 Intel x86 and ARM Operation Types
- 12.6 Key Terms, Review Questions, and Problems
- Appendix 12A Little-, Big-, and Bi-Endian

### Chapter 13 Instruction Sets: Addressing Modes and Formats

- 13.1 Addressing Modes
- 13.2 x86 and ARM Addressing Modes
- 13.3 Instruction Formats
- 13.4 x86 and ARM Instruction Formats
- 13.5 Assembly Language
- 13.6 Key Terms, Review Questions, and Problems

### Chapter 14 Processor Structure and Function

- 14.1 Processor Organization
- 14.2 Register Organization
- 14.3 Instruction Cycle
- 14.4 Instruction Pipelining
- 14.5 The x86 Processor Family
- 14.6 The ARM Processor
- 14.7 Key Terms, Review Questions, and Problems

### Chapter 15 Reduced Instruction Set Computers

- 15.1 Instruction Execution Characteristics
- 15.2 The Use of a Large Register File
- 15.3 Compiler-Based Register Optimization
- 15.4 Reduced Instruction Set Architecture
- 15.5 RISC Pipelining
- 15.6 MIPS R4000
- 15.7 SPARC
- 15.8 RISC versus CISC Controversy
- 15.9 Key Terms, Review Questions, and Problems

### Chapter 16 Instruction-Level Parallelism and Superscalar Processors

- 16.1 Overview
- 16.2 Design Issues
- 16.3 Intel Core Microarchitecture
- 16.4 ARM Cortex-A8
- 16.5 ARM Cortex-M3
- 16.6 Key Terms, Review Questions, and Problems

## PART FIVE PARALLEL ORGANIZATION

### Chapter 17 Parallel Processing

- 17.1 Multiple Processor Organizations
- 17.2 Symmetric Multiprocessors
- 17.3 Cache Coherence and the MESI Protocol
- 17.4 Multithreading and Chip Multiprocessors
- 17.5 Clusters
- 17.6 Nonuniform Memory Access
- 17.7 Cloud Computing
- 17.8 Key Terms, Review Questions, and Problems

### Chapter 18 Multicore Computers

- 18.1 Hardware Performance Issues
- 18.2 Software Performance Issues
- 18.3 Multicore Organization
- 18.4 Heterogeneous Multicore Organization
- 18.5 Intel Core i7-990X
- 18.6 ARM Cortex-A15 MPCore
- 18.7 IBM zEnterprise EC12 Mainframe
- 18.8 Key Terms, Review Questions, and Problems

### Chapter 19 General-Purpose Graphic Processing Units

- 19.1 CUDA Basics
- 19.2 GPU versus CPU
- 19.3 GPU Architecture Overview
- 19.4 Intel's Gen8 GPU
- 19.5 When to Use a GPU as a Coprocessor
- 19.6 Key Terms and Review Questions

## PART SIX THE CONTROL UNIT

### Chapter 20 Control Unit Operation

- 20.1 Micro-Operations
- 20.2 Control of the Processor
- 20.3 Hardwired Implementation
- 20.4 Key Terms, Review Questions, and Problems

### Chapter 21 Microprogrammed Control

- 21.1 Basic Concepts
- 21.2 Microinstruction Sequencing
- 21.3 Microinstruction Execution
- 21.4 TI 8800
- 21.5 Key Terms, Review Questions, and Problems

### Appendix A Projects for Teaching Computer Organization and Architecture

- A.1 Interactive Simulations
- A.2 Research Projects
- A.3 Simulation Projects
- A.4 Assembly Language Projects
- A.5 Reading/Report Assignments
- A.6 Writing Assignments
- A.7 Test Bank

### Appendix B Assembly Language and Related Topics

- B.1 Assembly Language
- B.2 Assemblers
- B.3 Loading and Linking
- B.4 Key Terms, Review Questions, and Problems

# References

# Index

# Credits

# ONLINE APPENDICES<sup>1</sup>

- Appendix C System Buses
- Appendix D Protocols and Protocol Architectures
- Appendix E Scrambling
- Appendix F Victim Cache Strategies
- Appendix G Interleaved Memory
- Appendix H International Reference Alphabet
- Appendix I Stacks
- Appendix J Thunderbolt and InfiniBand
- Appendix K Virtual Memory Page Replacement Algorithms
- Appendix L Hash Tables
- Appendix M Recursive Procedures
- Appendix N Additional Instruction Pipeline Topics
- Appendix O Timing Diagrams
- Glossary

<sup>1</sup>Online chapters, appendices, and other documents are Premium Content, available via the access card at the front of this book.