History of the evolution of the x86 platform, from the IBM PC to the modern era
================================================================================

Version 01/01/2021
by Zir Blazer


## Table of Contents

* [0 - PROLOGUE](#chapter-0)
* [1 - INTRODUCTION](#chapter-1)
* [2 - The original IBM PC 5150, Intel x86 ISA and IBM PC platform overviews, and the first clones](#chapter-2)
* [2.1 - Intel 8088 CPU Overview, Memory and I/O Address Spaces, Bus protocol](#chapter-2.1)
* [2.2 - Intel MCS-86 and MCS-85 chip set families, support chips overview](#chapter-2.2)
* [2.3 - Intel 8259A PIC IRQs, Intel 8237A DMAC DMA Channels, Intel 8253 PIT Counters, and Intel 8255 PPI Ports](#chapter-2.3)
* [2.4 - IBM PC Memory Map (Conventional Memory and UMA)](#chapter-2.4)
* [2.5 - IBM PC Motherboard physical characteristics overview, Computer Case and Keyboard](#chapter-2.5)
* [2.6 - IBM PC Motherboard Buses (Local Bus, I/O Channel Bus, Memory Bus)](#chapter-2.6)
* [2.7 - IBM PC Motherboard Clock Generation and Wait States](#chapter-2.7)
* [2.8 - IBM PC Expansion Cards](#chapter-2.8)
* [2.9 - IBM PC BIOS Firmware (BIOS Services, Option ROMs)](#chapter-2.9)
* [2.10 - IBM PC Operating Systems (PC DOS, PC Booters) and user experience](#chapter-2.10)
* [2.11 - Definition of platform](#chapter-2.11)
* [2.12 - The first IBM PC clones](#chapter-2.12)
* [3 - The IBM PC/XT 5160 and the Turbo XT clones](#chapter-3)
* [3.1 - IBM PC/XT Motherboard changes](#chapter-3.1)
* [3.2 - IBM PC/XT HDC card and HD, PC DOS 2.0, MBR and partitioning](#chapter-3.2)
* [3.3 - The first IBM PC/XT clones and the NEC V20 CPU](#chapter-3.3)
* [3.4 - The first Turbo XTs, Hardware requirements and changes to Clock Generation](#chapter-3.4)
* [3.5 - Turbo XTs compatibility issues with IBM PC Software, Turbo function, honorable PC platform related mentions](#chapter-3.5)
* [4 - The IBM PC/AT 5170, the last common ancestor](#chapter-4)
* [4.1 - Intel 80286 CPU Overview, the MMU, Segmented Virtual Memory and Protected Mode, support chips](#chapter-4.1)
* [4.2 - Processor IPC and performance, compiler optimizations](#chapter-4.2)
* [4.3 - Screwing up x86 forward compatibility: 286 reset hacks, A20 Gate (HMA), 286 LOADALL, the Intel 80186 CPU missing link](#chapter-4.3)
* [4.4 - Cascading two Intel 8259A PICs and two Intel 8237A DMACs, the Intel 8254 PIT, the Motorola MC146818 RTC, and the Intel 8042 Microcontroller](#chapter-4.4)
* [4.5 - IBM PC/AT Memory Map (Extended Memory and HMA)](#chapter-4.5)
* [4.6 - IBM PC/AT Motherboard physical characteristics overview, Computer Case and Keyboard](#chapter-4.6)
* [4.7 - IBM PC/AT Motherboard Buses (Local Bus, I/O Channel Bus, Memory Bus)](#chapter-4.7)
* [4.8 - IBM PC/AT Motherboard Clock Generation and Wait States](#chapter-4.8)
* [4.9 - IBM PC/AT Expansion Cards](#chapter-4.9)
* [4.10 - IBM PC/AT BIOS Firmware, BIOS Setup and RTC SRAM, early overclocking, PC DOS 3.0](#chapter-4.10)
* [4.11 - The IBM PC/XT 5162 Model 286](#chapter-4.11)
* [4.12 - 286 era DOS Memory Management: Extended Memory, Expanded Memory (EMS API), and the XMS API](#chapter-4.12)
* [5 - The Turbo ATs and the first Chipsets](#chapter-5)
* [5.1 - First generation PC/AT clones and Turbo ATs, integrated BIOS Setup](#chapter-5.1)
* [5.2 - The first PC/AT Chipset, C&T CS8220. Feature overview and Clock Generation](#chapter-5.2)
* [5.3 - The second C&T PC/AT Chipset generation, C&T CS8221. Feature overview and Clock Generation](#chapter-5.3)
* [5.4 - The third C&T PC/AT Chipset generation, C&T 82C235. Feature overview](#chapter-5.4)
* [6 - The Compaq DeskPro 386, the Intel 80386 CPU, 386 era DOS Memory Management, DOS Extenders, and the beginning of the x86-IBM PC marriage](#chapter-6)
* [6.1 - The Intel 80386 CPU main features, Virtual 8086 Mode, Flat Memory Model, Paged Virtual Memory](#chapter-6.1)
* [6.2 - The side stories of the 386: The 32 Bits bug recall, removed instructions, poorly known support chips, and the other 386 variants](#chapter-6.2)


0 - PROLOGUE
------------

Back during 2013, I became interested in both the Hardware and Hypervisor support to do PCI/VGA Passthrough, as I saw a lot of potential in the idea of passing through a real Video Card to a Virtual Machine with a Windows guest just for gaming purposes (Note that this was a few years before this type of setup became popular, so I consider myself a pioneer in this area). At that point in time, the only Hypervisor that supported doing so was Xen, which required a Linux host. Since I had no previous Linux experience, I had to learn everything from scratch, thus I began to take a large amount of notes so that I could perfectly reproduce, step by step, everything that I did to get my setup in working order. Eventually, those notes evolved into a full fledged guide so that other people could use my knowledge to install an Arch Linux host together with the Xen Hypervisor.

By 2016, I had replaced Xen with the standalone version of QEMU, since it gained important features at a much faster pace than Xen (Mainly the fact that QEMU could work around nVidia's artificial limitation that kept the GeForce Drivers from initializing a GeForce card if they detected that it was inside a VM). In the meantime, I updated my guide, but by that point, everyone and their moms had written guides about how to do passthrough. During the 3 years that I spent keeping my guide updated, only two people that I'm aware of used it, thus I lost interest in maintaining it.

Somewhere along the way, I noticed that QEMU is so ridiculously complex that, in order to fully explain all of its capabilities, you need a lot of in-depth knowledge about a multitude of Hardware topics related to modern x86 platforms. There is no other way to know what QEMU is actually doing if you don't know about the Hardware platform. Fine tuning a VM goes far beyond the simple walkthrough installation procedures that nearly everyone else was writing about. So I thought that I had finally found my niche...

Soon after, my obsolete virtualization guide was morphing to provide heavy coverage of the idiosyncrasies of recent x86 platforms. This grew out of proportion when I decided that the easiest way to explain why modern stuff works in a certain way is simply by starting at the very beginning, then introducing topics one at a time. At that point, I decided to spin off all that into its own Wall of Text, as instead of virtualization, it had become a major course of computer history that goes all the way back to the 1981 IBM PC.

If you want a recap about why I began to write all this for a virtualization guide, consider that a Virtual Machine is an extremely complex beast that attempts to implement, mostly in software, the functionality and behavior of an entire computer.
The mix and match of emulation, virtualization and passthrough only tells you the method that was used to reproduce the functionality of a specific component of the VM, but not why it has to be reproduced at all. If you want to understand the complexity of all that a VMM like QEMU has to do in order to create a usable VM, then you have to understand the underlying Hardware whose functionality it is attempting to reproduce. This includes the Processor, the Chipset, the firmware, Buses like PCI Express and anything that you can plug into them, and the means by which everything is interconnected or related. As you should expect, explaining all that requires major amounts of text.

What you're going to read next covers, step by step, the almost four decades of evolution of the core aspects of the modern x86 platform, most of which QEMU can or has to reproduce. If you think that all that seems like ridiculous overkill, chances are that this text isn't for you. As the majority of users have simple computers, there is no real need to know all this to be able to use QEMU, since most people can rely on generic instructions to get things running smoothly, which is what nearly everyone actually does. However, as you get into bigger and wider systems like Dual Processors (Which also includes first and second generation AMD ThreadRipper, for reasons you will learn when I get there), it becomes critical to have at the very least a basic understanding of why the platform topology is directly related to the way that the system resources must be allocated, else you will be left pondering why you are getting sub-par performance.

Eventually, you will notice that all this knowledge becomes immensely valuable for optimizing any type of setup, as you will have a decent idea about what SMP and SMT are, their relationship with the configurable Sockets-Cores-Threads Topology of the VM Processor and why it is recommended to directly map each vCPU Thread to a CPU in the host, what a NUMA Node is and why it matters if you are using Dual Processors, and how to create a correct virtual PCI Bus Topology that solves some obscure issues with passed-through Video Cards, something that most users seem to ignore because Windows for the most part doesn't care.


1 - INTRODUCTION
----------------

Our computing ecosystem as you know it today is something that has been built directly on top of a pile of design decisions made in the early 80's. How did we get here? Backwards compatibility. It was a great idea that a newer computer was an improved, faster version of your previous one, so that it could run all the software that you already owned, better than it did before. However, maintaining that sacred backwards compatibility eventually became a burden. It is the reason why a lot of design mistakes made early on were dragged along for ages, even influencing later Chipset and Processor designs just to remain compatible with features only relevant in the earliest generations.

If you are using a computer that has an Intel or AMD Processor, your platform has an IBM PC compatible heritage.
IBM PC compatibility is not a minor thing: it is a Hardware legacy left from Day One of the IBM PC that is still present in pretty much every modern x86 based platform, and is the reason why you can run MS-DOS on bare metal in a 2020 computer if you want to (Albeit with a severe amount of limitations that don't make it useful at all). The historical importance of the early IBM computers is high enough that there are several specialized Websites that host a lot of extremely detailed information about them, like [MinusZeroDegrees][MinusZeroDegrees], which is where I got huge amounts of the data that I organized for exposition here.

While the whole history is rather long, a lot of things will be easier to digest and make much more sense if you start from the very beginning. After all, I'm sure that neither the original IBM PC designers, nor those that built upon it, could have imagined that almost four decades later, many implementation details would still be relevant. It also explains why, from our current point of view, some things that look like ugly hacks have some kind of ingenuity applied to them.

[MinusZeroDegrees]: http://www.minuszerodegrees.net/index.htm


2 - The original IBM PC 5150, Intel x86 ISA and IBM PC platform overviews, and the first clones
-----------------------------------------------------------------------------------------------

The beginning of this history starts with the release of the IBM PC 5150 in August 1981. While at the time there already were other moderately successful personal computer manufacturers, IBM, which back then was the dominant mainframe and enterprise player, decided to enter that emerging market by making a personal computer of its own aimed at corporate business users. Its new computer became extremely popular, and thanks to its successors and the compatible cheaper clone computers, its user base grew large enough to make it the most important computer platform of the 80's. The platform itself kept continuously evolving through the years, and by the late 90's, the descendants of the IBM PC had killed almost all the other competing platforms.

IBM was used to making fully proprietary systems that provided a complete Hardware and software solution, yet it was aware that the personal computer market had a totally different set of requirements. Thus, for the IBM PC, IBM took many unusual design decisions. In order to reduce the time that it would take to design the new computer, IBM used mostly standard chips and other components that were already available in the open market (Usually known as off-the-shelf) and that anyone else could buy. IBM also delegated the responsibility of creating the main Operating System for the new computer to a third party, Microsoft, which kept the right to license it to other third parties. Finally, IBM deemed it necessary for the success of the IBM PC that there had to be a healthy amount of third party support in the form of expansion cards and user software, so IBM opted to make the IBM PC architecture an open platform, to make developing Hardware and software for it as easy as possible. All these decisions made the IBM PC rather easy to clone...

The impact of making the IBM PC an open platform is something that can't be overstated.
If you want to see how serious IBM was about that, you can peek for yourself at the [IBM PC 5150 Technical Reference Manual (August 1981)][5150ref]. Besides detailed computer specifications, that document included the schematics of all the parts of the computer, and even the firmware source code in printed form. I doubt that currently there are more than two or three commercial computers (Or even single components, like a Motherboard) that could come close to that level of openness. If anything, I'm actually absolutely impressed by how extremely detailed and methodical IBM was in all its documentation, making most of it a pleasure to read even if you don't fully understand it. I suppose that a lot of IBM's former reputation was thanks to that level of attention to detail.

The basic specifications of the IBM PC were rather good when introduced, albeit due to the hectic pace of the industry, it got outdated surprisingly quickly. It was one of the first mass market 16 Bits desktop computers, as it had an Intel 8088 CPU running @ 4.77 MHz when the contemporary home computers used 8 Bits Processors like the MOS Technology 6502 running @ no more than 2 MHz, making the difference, at least on paper, appear to be rather brutal. It had a Socket ready for an optional Intel 8087 FPU, a massive upgrade for the rare software that could make use of floating point math. It supported up to 256 KiB system RAM (Albeit with later expansion cards it was possible to install more than that, this apparently was the official limit at launch), when others maxed out at just 64 KiB. It also included a long forgotten, built-in Microsoft BASIC interpreter in ROM known as IBM Cassette BASIC, which seems to have been a rather common feature at that point in time in other computers, with the idea being that even with no additional software, they could be usable. For more add-ons it had 5 expansion slots, of which you needed to use at least one for the mandatory Video Card.

The video and sound capabilities of the IBM PC seem to have been rather mediocre, which makes sense since it was aimed at business users. You could use it with either the text specialized MDA Video Card, which was what most business users preferred, or the 16 colors CGA Video Card, which got somewhat more popular for home users as gaming began to grow, but was inferior for pure console text. Sound was provided by a beeper, the PC Speaker, mounted inside the Computer Case. For removable storage the IBM PC was a hybrid of sorts, as IBM intended to tackle both the low end and high end markets at the same time, thus it supported both Cassette and Diskette interfaces. Cassettes were intended to be the cheap storage option and were already ubiquitous in other home computers, so it was decided to build support for interfacing with them into the IBM PC Motherboard, though it relied on an external Cassette Deck. Diskettes required an extra expansion card with a FDC (Floppy Disk Controller), yet the Case had front bays for either one or two 5.25'' Diskette Drives, so they were essentially internal and integrated to the computer unit.

During the commercial life of the IBM PC, there were two markedly different models: One colloquially known as PC-1, which was the original model released in 1981, and a refresh released around 1983 known as PC-2.
The difference between the two is the Motherboard: the PC-1 Motherboard supported up to 64 KiB of system RAM installed in the Motherboard itself, while the PC-2 could have 256 KiB. In both cases, more system RAM could be installed via expansion cards.

If there is a trend that the IBM PC truly set, it is establishing that a desktop computer is made of three main components: Monitor and Keyboard as totally independent units, and a Case housing the Motherboard, with its internal expansion slots closer to the back of the Case and front bays to house the Diskette Drives. Other contemporary home computers like the Apple II had the Keyboard integrated into the same unit that housed the Motherboard, but had no front bays, so Diskette Drives had to be external units. Some were even like modern AiOs (All-in-Ones), with a unit housing both Monitor and Motherboard. I recall seeing photos of at least one that had Monitor, Keyboard, and Motherboard as part of a single unit, yet not in a Notebook form factor; it instead resembled a portable desktop computer.

[5150ref]: http://www.minuszerodegrees.net/manuals/IBM_5150_Technical_Reference_6025005_AUG81.pdf


##### 2.1 - Intel 8088 CPU Overview, Memory and I/O Address Spaces, Bus protocol

From the IBM PC chip choices, perhaps the most important one was the main Processor. The chosen one was the Intel 8088 CPU (Central Processor Unit), which was based on the same ISA (Instruction Set Architecture) as the more expensive and better performing Intel 8086 CPU (Which is where the x86 moniker is derived from). What makes the Processor ISA important is Binary Compatibility: when a compiler creates an executable file by translating source code into machine code (opcodes), it does so by targeting a specific ISA. If you were to change the Processor to one based on a different ISA, it would mean that, at minimum, you would have to recompile (If not port) all the executable files, because Binary Compatibility would not be preserved. This early Processor choice means that any direct IBM PC successor was forced to use an x86 compatible Processor.

Both the 8088 and 8086 CPUs were 16 Bits Processors, where the usual definition of the "Processor Bits" refers to the size of its GPRs (General Purpose Registers). By today's standards, they were Single Core Processors that could execute only a Single Thread at a given moment. As they could not execute multiple Threads concurrently, any form of primitive Multitasking was achieved by an OS that did quick context switches at small time intervals, giving the illusion that multiple applications could run simultaneously (This would hold true for 20 years, until the appearance of SMT in the Pentium 4 around 2001, and Multi-core around 2005 with the Athlon 64 X2). The 8088 had an external 8 Bits Data Bus, while the 8086 had a 16 Bits one, which means that the latter could move twice the data through the external Data Bus at once. Both the 8088 and 8086 had a 20 Bits Address Bus, allowing them to have a Physical Memory Address Space of 1 MiB (2^20). For comparison, other contemporary 8 Bits Processors had a 16 Bits Address Bus, so they could directly address only 64 KiB (2^16).

Since the size of the 16 Bits GPRs was smaller than the 20 Bits of the Address Bus, it was impossible for either CPU to access the whole Memory Address Space with just the value of a single GPR. Intel's solution was to use a Segmented Memory Model, where accessing a Memory Address required two values, which in x86 Assembler are known as the Segment and Offset pair (Note that you will see some documentation that uses the term Segmented Memory for the way the 8086/8088 1024 KiB Address Space was partitioned into 16 named Segments of 64 KiB each, but that is not what I'm describing). This Segmented Memory Model was a major characteristic of the early x86 based Processors, though not a positive one, as programming in x86 Assembler was far messier than the Assembler of other Processors based on a Flat Memory Model, like the Motorola 68000, which IBM considered for the IBM PC at some point.

Another significant characteristic of the x86 Processors is that they actually had two completely independent Physical Address Spaces: The already mentioned Memory Address Space, and another one known as the I/O Address Space, where an individual I/O Address is known as an I/O Port. Currently, this stands out as an oddity of the x86 architecture, since most other Processor ISAs just have a single Address Space, but back then this was somewhat common. The design intention of having two separate Address Spaces was to differentiate simple RAM or ROM memory from another chip's internal Registers.

In addition to the 20 lines required for the 20 Bits Address Bus, the CPUs also had an extra line, IO/M, that signalled whether an Address Bus access was intended for a Memory Address or for an I/O Port. However, the I/O Address Space only used 16 Bits of the Address Bus instead of the full 20, so it was limited to 64 KiB (2^16) worth of I/O Ports, which ironically means that when dealing with these, you didn't need cumbersome tricks like the Segment and Offset pair, as a single GPR sufficed for the full address space range. Accessing an I/O Port required the use of special x86 instructions, IN and OUT, which triggered the IO/M line.

It may be convenient to explain right now how an Address Space works. As a simplified explanation, an Address Space can be thought of as a fixed amount of slots, where each slot gets its own individual address that can be used to interact with it. In the case of the 8088 and 8086 CPUs, their 20 Bits Address Bus allowed for 1048576 (2^20) individual Memory Addresses in the Memory Address Space, and 65536 (2^16) individual I/O Ports in the I/O Address Space. The reason why they are worth 1 MiB and 64 KiB of memory, respectively, is because the x86 architecture is Byte-level addressable, which means that each individual address location is one Byte in size. As such, each Byte of memory gets its own address, and this also conveniently means that the amount of addressable slots matches the maximum amount of addressable Bytes. An example of an Address Space format that is not Byte-level addressable is the LBA (Logical Block Addressing) used for modern storage drives like HDs (Hard Disks) and SSDs (Solid State Disks), where an individual address points to an entire block of either 512 Bytes or 4 KiB of memory.

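Since the Segment and Offset pair will come up again and again, a minimal C sketch of the arithmetic may help (the function and sample values are mine, purely for illustration): the 8086/8088 formed the 20 Bits Physical Address as Segment × 16 + Offset, discarding any carry above the 20th Bit.

```c
#include <stdio.h>
#include <stdint.h>

/* 8086/8088 real mode address translation: the 16 Bits Segment value is
   shifted left by 4 (multiplied by 16) and the 16 Bits Offset is added.
   The result is truncated to 20 Bits, matching the 20 Address lines. */
static uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFF;
}

int main(void)
{
    /* Different Segment:Offset pairs can point to the same location: */
    printf("F000:FFF0 -> %05X\n", (unsigned)physical_address(0xF000, 0xFFF0));
    printf("FFFF:0000 -> %05X\n", (unsigned)physical_address(0xFFFF, 0x0000));
    return 0;
}
```

Both pairs print FFFF0, near the very end of the addressable 1024 KiB. This many-to-one behavior is the source of the Segment and Offset aliases discussed a bit later in this chapter.
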
Regardless of the amount or sizes of the Physical Address Spaces that a Processor may have, there has to be something that is listening to the CPU Address Bus looking for a specific address or address range, so that when the CPU sends an address through the Bus, someone actually bothers to take the call and reply back to the CPU. In this context, getting memory to respond to an address is known as Memory Mapping. However, in these ancient platforms, mapping memory faced two important implementation challenges. First, the vast majority of chips were rather dumb: it was not possible to dynamically configure the mapping so that each chip with addressable memory knew in advance which address ranges it was supposed to respond to. This was a major problem in Buses that had multiple chips connected in parallel, as without something that arbitrated the Bus, all the chips could potentially answer simultaneously to the same addresses, a rather chaotic situation. Second, any chip that had something addressable, with ROM memory chips being the best examples, only had as many Address lines as it needed for its own internal addressable memory size, which means that if you tried to wire these directly to the wider CPU Address Bus, its upper Bits would be left unconnected, thus the chips would be unable to fully understand addresses above their own sizes. Instead, they would reply to anything that matched just the lower Bits of whatever was being sent via the Address Bus, an issue known as Partial Address Decoding. In order to make things work by making each chip's mapping unique, with no addresses overlapping or other conflicts across the whole address space, it was necessary to solve these two issues.

The basis for Bus arbitration in parallel Buses was the Chip Select line, which was used to make sure that only the active chip would be listening to the Bus at a given moment, making it possible to wire several chips in parallel without Bus conflicts caused by unintended concurrent usage. The Chip Select line could be built into the chips themselves as an extra Pin, or could be implemented in those that didn't have it with the help of some external circuitry. Still, for the whole scheme to work, you needed something that managed the Chip Select line according to the expected location in the address space of the memory of these chips. That job was done by a bunch of extra discrete components that were collectively known as Glue Logic. The glue logic acted as an intermediary that took care of hooking the narrower chips' external Address Buses to the wider CPU Address Bus, assisting them to externally decode the missing upper Bits and activate the Chip Select line when appropriate. Thus, from the CPU's point of view, the narrower chips were correctly mapped with no overlapping occurring, as if they effectively understood the full address. You can get an excellent explanation about how the additional supporting decoding logic that helps to listen to the CPU Address Bus works [here][quora_mmap].

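As a conceptual model of what that decoding glue logic computes (a sketch under my own simplifying assumptions, not a description of any specific circuit), consider a hypothetical 2 KiB ROM, which only has 11 Address lines of its own, mapped into a 20 Bits Address Space:

```c
#include <stdint.h>
#include <stdbool.h>

#define ROM_SIZE 0x00800u  /* 2 KiB chip: 11 Address lines of its own */
#define ROM_BASE 0xF8000u  /* hypothetical, 2 KiB aligned base address */

/* Full address decoding: the glue logic compares all 9 upper Bits of
   the 20 Bits CPU address, so Chip Select is asserted for exactly one
   2 KiB range and the ROM is mapped with no overlap. */
bool chip_select(uint32_t cpu_address)
{
    return (cpu_address & ~(uint32_t)(ROM_SIZE - 1)) == ROM_BASE;
}

/* Only the lower 11 lines are wired to the chip itself, which is all
   the address that it can understand. */
uint16_t chip_internal_address(uint32_t cpu_address)
{
    return (uint16_t)(cpu_address & (ROM_SIZE - 1));
}
```
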
As the actual address range where the memory from a particular chip or bank of chips would be mapped to depended exclusively on how the arbitrating decoding logic was wired to the Address Bus, the mapping layout of a specific platform was pretty much static by design (In some cases, the decoding logic of certain Devices could be configurable, but that was not the case for those that were part of the base platform). The complete platform mapping layout eventually becomes a valuable data construct known as the Memory Map, which should include information about which address ranges have something present in them, and what their uses are going to be.

It is also important to mention that an individual address may not actually be unique. For example, to reduce costs, some platform designers could implement on purpose an incomplete address decoding logic that only does partial decoding of the CPU Address Bus. Such decoding logic typically leaves some upper Address lines not connected (The same scenario as directly wiring a memory chip to a wider Address Bus), so whenever the CPU sent an address through the Bus, the incomplete decoding logic would just consider the lower Bits of the address and ignore the value of the upper Bits. This caused the chips behind such incomplete decoders to respond to address ranges not really meant for them, for as long as the lower Bits of the full address matched what the decoder logic could understand. In other words, the same Byte of memory in a chip could respond to multiple individual addresses, as if they were aliases of the intended mapped one. This was bad for forward compatibility, since sometimes programmers could decide to use an alias instead of the real address, so if in a later platform revision the address decoding logic was improved, the individual address that used to be an alias would now point to a completely different location, causing the software piece that used that trick to not work as intended on the new platform.

In addition to that, the Segmented Memory Model used by x86 Processors already allowed something similar to address aliases on its own, as it was possible to use multiple different combinations of Segment and Offset pairs that effectively interacted with the same Physical Memory Address. As programmers of the era often had to use x86 Assembler to get the most performance out of the CPU, it was a necessity to have a very clear idea about both the x86 Segmented Memory Model and the specific platform address decoding scheme idiosyncrasies to know which physical address a piece of code could really be pointing to, more so if a developer was intentionally trying to obfuscate their code as much as possible for things like making software copy protection harder to crack.

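Reusing the hypothetical 2 KiB ROM of the previous sketch, this is what an incomplete decoder looks like: comparing fewer upper Bits takes less logic, but the match then repeats across the Address Space.

```c
#include <stdint.h>
#include <stdbool.h>

/* Incomplete address decoding: only Bits 11-15 of the CPU address are
   compared, while the top 4 lines (Bits 16-19) are left unconnected.
   The comparison therefore repeats every 64 KiB: the same 2 KiB of ROM
   answers at 0x08000, 0x18000, 0x28000... up to 0xF8000, which makes
   for 16 aliases of the one intended mapping. */
bool partial_chip_select(uint32_t cpu_address)
{
    return (cpu_address & 0x0F800u) == 0x08000u;
}
```
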
There is some degree of flexibility regarding what can be mapped into the Address Space, which in every case is some form of memory. As explained before, the x86 Processors had two separate Physical Address Spaces, the Memory Address Space and the I/O Address Space. The Memory Address Space was intended for addressing both RAM and ROM memory, but there was a very important detail: The memory could be local, as in the case of the system RAM memory that served as the personal workspace of the CPU, or it could be remote, like when the memory is the workspace of some other Processor or Device. This second case is known as MMIO (Memory Mapped I/O), as the Memory Addresses used by the main CPU to directly address remote memory can be effectively used to transfer data between the CPU and another chip (How the commands and data get routed to and from the remote memory, and how the other chip notices the externally initiated operation, are another matter). In the case of the I/O Address Space, since it was intended to be exclusively used to address other chips' internal Registers, anything in that address space was considered PMIO (Port Mapped I/O). A single Device could use both MMIO and PMIO addressing, which is the typical case for Devices that have both internal Registers and their own local RAM or ROM memory.

The advantage of MMIO over PMIO is that the CPU can use standard generalist instructions like MOV to either read and write from local memory or do I/O to remote memory, while PMIO requires the use of the already mentioned special instructions, IN and OUT. The drawback of MMIO is that since it takes addresses from the Memory Address Space, you have to sacrifice how much RAM or ROM you can address, which may not be important at first, but becomes a problem as the Address Space gets crowded. Nowadays MMIO is universally preferred, using a single unified Address Space (At least from the point of view of the Processor).

Curiously, I never understood why Intel decided that x86 had to have two Address Spaces, since the 8086/8088 used 20 Address lines plus the IO/M line, for a total of 21 lines and a combined 1088 KiB of addressable memory (1024 KiB Memory and 64 KiB I/O). If Intel had instead decided to use all 21 lines for the Address Bus, it would have yielded 2048 KiB (2^21) of addressable memory, and that shouldn't have been much harder to implement given that x86 already relied on a Segmented Memory Model anyways. I think that it is related to the fact that the x86 ISA is a descendant of the Intel 8085 CPU, which also had two Address Spaces. While the 8086 is not Binary Compatible with the 8085, Intel intended for it to be Source Compatible, as the idea was that the primitive automatic translation tools available at the era could easily port the software between both ISAs. A single address space would have required complete Ports of 8085 software. Had Intel started the x86 ISA from scratch with no backwards compatibility of any sort, chances are that it would be different in this regard.

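In code, the contrast between the two Address Spaces looks like this (a minimal sketch: the AT&T-style inline Assembler assumes an x86 target with GCC or Clang, and actually touching an I/O Port requires I/O privilege, so this only runs inside a Kernel, Driver or DOS-like environment). Later sketches in this chapter reuse these inb()/outb() helpers:

```c
#include <stdint.h>

/* MMIO: remote memory is reached with the same ordinary instructions
   used for local memory; a plain pointer dereference compiles to MOV. */
uint8_t mmio_read(volatile uint8_t *mapped_register)
{
    return *mapped_register;
}

/* PMIO: the I/O Address Space is only reachable through the dedicated
   IN/OUT instructions, which is why C code has to drop to Assembler. */
uint8_t inb(uint16_t port)
{
    uint8_t value;
    __asm__ volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
    return value;
}

void outb(uint16_t port, uint8_t value)
{
    __asm__ volatile ("outb %0, %1" : : "a"(value), "Nd"(port));
}
```
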
There are two more details worth mentioning now related to the Address Space. The first is that when the Processor is turned on, the very first thing that it does is attempt to read code beginning from a hard-coded Physical Memory Address (In the 8086/8088, the location was near the end of the addressable 1024 KiB), so it is mandatory that there is some memory mapped into that location containing executable code that can initialize the rest of the platform up to a usable state, a procedure known as Bootstrapping. For obvious reasons, such memory can't be volatile like RAM, so you're pretty much limited to keeping the bootstrapping executable code in a ROM type memory chip. The contents of that critical ROM chip are known as the firmware. The second detail is that not all the memory in a system has to be directly addressable by the CPU; sometimes there is memory that is addressable by another Device or Co-processor but that isn't directly visible from the main CPU's point of view at all.

When it comes to interfacing with other chips, there is yet another important characteristic of the 8088 and 8086 CPUs: They had a multiplexed external Bus. Basically, a Bus is a composite of three specialized Buses called Data, Address and Control, which perform different slices of the same task. In an ideal scenario, these three Buses would be fully independent entities where each signal has its own Pin in the chip package and wire line in the Motherboard, but that wasn't the case with the 8088 and 8086. In order to reduce the Pin count of the CPUs so that they could fit in a standard and cost effective 40 Pin package, some signals had to share the same Pins. In the case of the 8088 CPU, its 8 Bits Data Bus would require 8 lines, but instead of getting its own 8 Pins, the CPU had an internal Muxer unit that multiplexed these 8 lines onto the same Pins that were used by 8 of the 20 lines of the 20 Bits Address Bus. For the 8086 CPU and its 16 Bits Data Bus, the 16 lines got multiplexed onto 16 of the 20 Pins used by the Address Bus.

Due to the multiplexed signals, what the 8086/8088 CPUs could be sending through the external Bus at a given moment may be either an address or data, according to which step of the Bus Cycle the CPUs were currently executing. In an identical style to the IO/M line that signalled whether an operation targeted a Memory or I/O address location, the Processors had a line known as ALE, which was used to differentiate whether what was currently on the Bus was address or data. However, whatever was at the receiving end of the Bus had to be aware of this line so that the Bus contents could be interpreted correctly. This pretty much means that both CPUs could only be directly interfaced with other chips that explicitly supported their particular Bus Cycle protocol and multiplexing scheme; for anything else, you needed at least intermediate chips to demultiplex the Address and Data signals back into separate lines. Also, as the Processor uses some of its transistor budget on an extra Muxer unit, plus the Bus Cycle has to be longer than it could actually be just to accommodate two different signals on the same lines, multiplexing the external Bus incurs a measurable performance penalty compared to a similar design that didn't have limitations due to the amount of available Pins.

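One way to picture the demultiplexing work is to model the external address latch (an Intel 8282 or 74LS373 class part in real designs) that sits next to the CPU. This is only a conceptual sketch of the ALE handshake, not of any specific circuit:

```c
#include <stdint.h>
#include <stdbool.h>

/* Models one 8 Bits latch on the 8088's multiplexed AD0-AD7 Pins.
   While ALE is high the latch is transparent and follows the Bus,
   capturing the address phase of the Bus Cycle; once ALE drops it
   holds that value, freeing the same Pins to carry data. */
typedef struct {
    uint8_t held_address;  /* the demultiplexed low address Byte */
} address_latch;

void latch_tick(address_latch *latch, bool ale, uint8_t ad_pins)
{
    if (ale)
        latch->held_address = ad_pins;  /* address phase: follow the Bus */
    /* ALE low: keep holding; ad_pins now carry the data phase instead */
}
```
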
Both the 8088 and 8086 CPUs could be potentially used as single main Processors in a mode known as Minimum Mode, where they generated by themselves all the Control Bus signals required to drive other chips. The Control Bus was also multiplexed, but demultiplexing it with an intermediary chip acting as glue logic was quite easy. An example of a simple homebrew design based around an 8088 CPU in Minimum Mode that implements the required Address, Data and Control Bus demultiplexers is [here][hb8088]. However, this is not what IBM used for the PC...

[quora_mmap]: https://www.quora.com/What-is-Memory-Mapping-in-Microprocessor-based-systems/answer/Balajee-Seshadri
[hb8088]: http://www.homebrew8088.com/home


##### 2.2 - Intel MCS-86 and MCS-85 chip set families, support chips overview

Chip vendors like Intel didn't just make CPUs; typically they also provided a multitude of complementary support chips that dramatically enhanced their main Processor's capabilities, enough to make an extensible computer platform based around them. Both the 8088 and 8086 CPUs belonged to the [Intel MCS-86][mcs86] family of chip sets (Now you know where the Chipset word is derived from), which were intended to interface with them (Note that the previous link includes 80286 and 80386 era parts that aren't really a good match for the first generation MCS-86, but for some reason they are included). They were also rather compatible with the previous chip set generation, the [Intel MCS-85][mcs85], albeit with some caveats, as those were intended for the older Intel 8080 and 8085 CPUs, which had only an 8 Bits Data Bus and a 16 Bits Address Bus.

A chip set not only included support chips for the main Processor, there were also major chips considered as a Co-processor class, like the Intel 8087 FPU and the Intel 8089 IOP. The 8088 and 8086 CPUs supported Multiprocessing (It had a different meaning back then than what you would think about now), allowing them to coordinate with these chips. However, Multiprocessing required a more complex signalling scheme than what the previously mentioned Minimum Mode could do. More signals would require more Pins, but, as you already know, that was impossible due to the 8086/8088 packaging constraints. The solution was to heavily multiplex the Control Bus signals onto only 3 Pins, known as S0, S1 and S2, then add a specialized discrete demultiplexer, the Intel 8288 Bus Controller, that could interpret that set of 3 lines as 8 different commands (2^3 makes for a total of 8 different binary combinations). This Multiprocessing mode was known as Maximum Mode, and is how IBM set up the 8088 CPU in the PC.

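For reference, the 8 commands that the 8288 derives from the S2, S1 and S0 lines can be written down as a small lookup table. The line values follow Intel's 8086 documentation; the function itself is just an illustration:

```c
/* Decode of the 8086/8088 Maximum Mode status lines, as interpreted by
   the 8288 Bus Controller (each input is a single Bit, 0 or 1). */
const char *bus_status_decode(unsigned s2, unsigned s1, unsigned s0)
{
    static const char *const commands[8] = {
        "Interrupt Acknowledge",    /* 000 */
        "Read I/O Port",            /* 001 */
        "Write I/O Port",           /* 010 */
        "Halt",                     /* 011 */
        "Instruction Fetch",        /* 100 */
        "Read Memory",              /* 101 */
        "Write Memory",             /* 110 */
        "Passive (no Bus Cycle)"    /* 111 */
    };
    return commands[((s2 & 1) << 2) | ((s1 & 1) << 1) | (s0 & 1)];
}
```
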
The advantage of using parts of the same chip set family is that you could expect them to easily interface with each other. Members of the MCS-86 family that interacted with the CPU Address and Data Buses typically supported out of the box either or both of the Bus Cycle protocol and the multiplexed Bus protocol of the 8088 and 8086 CPUs. For example, the 8087 FPU Address and Data lines could be pretty much directly wired to those of the 8086 CPU without demultiplexers or additional circuitry other than the 8288 Bus Controller. Moreover, the 8087 FPU was capable of autodetecting whether the CPU was an 8088 or 8086, to automatically adjust its Data Bus width. Other chips usually required some external glue logic, so using chips belonging to the same family was far more convenient than attempting to mix and match functionally equivalent chips from other families.

The chips from the older MCS-85 family partially supported being interfaced with the MCS-86 family, as they shared the same Bus protocol and multiplexing scheme, but for the most part, they required some glue logic due to the Bus width differences. As they were intended for the 8080/8085 CPUs' multiplexed 8 Bits Data and 16 Bits Address Bus, they were usually easier to interface with the 8088, which also had an 8 Bits Data Bus, than with the 16 Bits one of the 8086 (This is one of the reasons, along with the cheaper cost and higher availability of the older support chips, that made IBM pick the 8088 CPU instead of the 8086). The Address Bus always required glue logic, as the 8086/8088 CPUs supported a 20 Bits Address Bus whereas the older support chips intended for the 8080/8085 CPUs had just 16 Bits. However, even with the extra glue logic to make the older chips functional enough, they had diminished capabilities. For example, the IBM PC used an Intel 8237A DMAC that, due to its native 16 Bits Address Bus, was limited to a 64 KiB data transfer at a time, which was the entirety of the Physical Memory Address Space of the 8080/8085 CPUs that it was intended for, yet only 1/16 of the address space of an 8088.

The major chips from the Intel MCS-86 family, besides the 8088 and 8086 CPUs (Central Processor Unit), were the 8087 FPU (Floating Point Unit. Also known as NPU (Numeric Processor Unit), but FPU should be the modern term), which focused on floating point math (It was possible to emulate floating point operations using standard CPU integers via software, but the FPU was ridiculously faster), and the 8089 IOP (I/O Processor), which could be considered a glorified DMAC with some processing capabilities added in. Both of those chips had their own ISAs, so besides the Instruction Set of the x86 CPUs, you had those of the x87 FPUs and x89 IOPs, too. The less extravagant support chips included the already mentioned 8288 Bus Controller, a specialized Control Bus demultiplexer required when using the 8086/8088 in Maximum Mode; the 8289 Bus Arbitrator, which allowed making Bus topologies with multiple Bus Masters (It seems to have been used mostly for Intel Multibus based systems; no idea if they were ever used in IBM PC expansion cards or IBM PC compatible computers); and the 8284A Clock Generator, which could generate clock signals as electrically required by all the mentioned Intel chips.

The Intel MCS-85 family was based around the Intel 8080 or 8085 CPUs as the main Processor, which is not interesting for the purposes of this story. The support chips, however, are extremely important. The four relevant ones are the 8259A PIC (Programmable Interrupt Controller), the previously mentioned 8237A DMAC (Direct Memory Access Controller), the 8253 PIT (Programmable Interval Timer), and the 8255 PPI (Programmable Peripheral Interface), all of which would be used by the IBM PC. With the exception of the 8255 PPI, the other three would manage to become the staple support chips that defined the IBM PC platform and all its successors.

For the IBM PC platform, IBM mixed parts from both the MCS-86 and MCS-85 families. From the MCS-86 family, IBM picked the Intel 8088 CPU itself, the Intel 8087 FPU as an optional socketable component, the Intel 8288 Bus Controller required for Maximum Mode, and the Intel 8284A Clock Generator. From the MCS-85 family, IBM picked all the mentioned ones, namely the Intel 8259A PIC, the Intel 8237A DMAC, the Intel 8253 PIT and the Intel 8255 PPI.
Surprisingly, IBM fully omitted the Intel 8089 IOP from the IBM PC design, which was one of the advanced support chips from the MCS-86 family that Intel suggested pairing their 8088 or 8086 CPUs with. That is why the 8089 has been completely forgotten.

[mcs86]: http://www.intel-vintage.info/intelmcs.htm#710771341
[mcs85]: http://www.intel-vintage.info/intelmcs.htm#710766706


##### 2.3 - Intel 8259A PIC IRQs, Intel 8237A DMAC DMA Channels, Intel 8253 PIT Counters, and Intel 8255 PPI Ports

The support chips are part of the core of the IBM PC. Thanks to backwards compatibility, with the exception of the 8255 PPI, their functionality should be at least partially present even in modern computers. All of them were mapped to the CPU I/O Address Space and accessed via PMIO, so the CPU could directly interact with them.

**Intel 8259A PIC (Programmable Interrupt Controller):** The x86 Processor architecture is Interrupt-driven (Also known as Event-driven). In an Interrupt-driven architecture, there is a framework for external Devices to send the Processor a request for immediate attention, which is done by signalling an Interrupt Request. When the Processor receives an interrupt request, it stops whatever it was doing, saves its state as if the current task was put on hold, and switches control to an ISR (Interrupt Service Routine, also known as Interrupt Handler) to check the status of the Device that made the interrupt request. After the ISR services the Device, the Processor resumes the previous task by restoring the saved state. An Interrupt-driven architecture requires additional Hardware to work, as there is an out-of-band path between the Processor and the Devices so that they can trigger the interrupts while skipping the Bus. A problem with interrupts is that they make application latency highly inconsistent.

The alternative to Interrupt-driven architectures are Polling-based ones, where the Processor polls the Devices' status continuously, as if it was executing an infinite loop. While technically Polling-based architectures are simpler than Interrupt-driven ones, since you don't require the dedicated Hardware to implement interrupts, they were considered inefficient, since they wasted too much time needlessly polling the Devices' status. However, there is nothing stopping an Interrupt-driven architecture from ignoring interrupts and instead doing polling only. Actually, someone tested that in x86 and found that [it may be better in some scenarios][polling] (TODO: Broken link, found no replacement). I suppose that polling could be preferred for Real Time OSes, as it should provide a very constant or predictable latency, assuming you don't mind the extra power consumption due to the Processor being always awake and working.

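A polling loop is trivial to express in code, which also makes it obvious where the wasted cycles go. The status Port address and ready Bit below are hypothetical placeholders for some Device, and inb() is the PMIO helper sketched earlier:

```c
#include <stdint.h>

extern uint8_t inb(uint16_t port);  /* from the earlier PMIO sketch */

#define DEV_STATUS_PORT 0x300  /* hypothetical Device status Register */
#define DEV_READY_BIT   0x01   /* hypothetical "data ready" Bit */

/* Polling: the CPU spins on the Device status Register until the ready
   Bit appears. Latency is very predictable, but every pass through the
   loop is CPU time spent doing nothing useful. */
void wait_until_ready(void)
{
    while ((inb(DEV_STATUS_PORT) & DEV_READY_BIT) == 0)
        ;  /* busy-wait */
}
```
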
The 8088 CPU supported three different types of Interrupts: Exceptions, Hardware Interrupts, and Software Interrupts. All of them relied on the same IVT (Interrupt Vector Table), which held 256 entries, with each entry containing a pointer to the actual ISR. The Interrupts relevant here are the external Hardware Interrupts, which were of two types: The standard Maskable Interrupts, and NMIs (Non-Maskable Interrupts). The difference between them is that Maskable Interrupts can be fully ignored if the software being run configures the Processor to do so, while NMIs can't be ignored at all and will always be serviced. To manage the standard interrupts, the 8088 had a pair of dedicated Pins, INTR (Interrupt Request) and INTA (Interrupt Acknowledged), which were used to receive interrupts and acknowledge them, respectively. NMIs had their own dedicated Pin to receive them, NMI, but there was no matching Pin for acknowledgement. Note that NMIs are highly specialized and pretty much reserved for niche purposes (Usually to signal Hardware errors), so they don't get a lot of attention. Also, while NMIs can't be ignored by the Processor itself, they can be externally disabled if you have a way to block the NMI line, like via a Jumper.

The 8088, by itself, was capable of receiving standard Maskable Interrupts from just a single Device, since it had only one interrupt request line, the already mentioned INTR. This is precisely where the Intel 8259A PIC comes in handy. The purpose of the PIC was to act as an interrupt multiplexer: since it was connected to the Processor's INTR and INTA lines, it could use them to fan out to 8 interrupt lines, so up to 8 Devices could use interrupts instead of only a single one directly wired to the CPU. The IRQ (Interrupt Request) lines were known as IRQs 0-7 and had priority levels, with 0 being the highest priority interrupt and 7 the lowest one. Each IRQ was just a single line, not two, since there was no interrupt acknowledge line between the PIC and the Devices. There could be more than one 8259A PIC in a system, with one master and up to 8 slaves, for a total of 64 IRQs. The slave PICs were cascaded by each taking an IRQ line of the master PIC. Note that the 8259A PIC Datasheet mentions cascading, but not daisy chaining, so it seems that you can't have three or more PICs where a slave PIC is hooked to an IRQ of the master, and that slave then has its own slave hooked to one of its IRQs in a three-level arrangement.

When using the 8259A PIC, whenever a Device signalled an interrupt, the PIC received it via an IRQ line, then the PIC signalled its own interrupt to the Processor, which received it via INTR. The Processor, in turn, used an ISR that activated the INTA line to acknowledge the PIC request. Then, the PIC told the Processor which IRQ the Device interrupt was coming from, so it could switch to the proper ISR for that IRQ. For more information about how Hardware Interrupts work in x86 based platforms, [read here][interrupts].

In the IBM PC, 6 of the 8 IRQs provided by the PIC (2-7, or all except 0 and 1) were free and made directly available to discrete Devices via the expansion slots. IRQs 0 and 1 were used by internal Motherboard Devices: The Intel 8253 PIT was wired to IRQ 0, and the Keyboard interface logic to IRQ 1. IRQ 6, while exposed in the expansion slots, was used exclusively by FDC (Floppy Disk Controller) type cards, and I'm not aware of other cards that allowed you to set a Device to that IRQ. Meanwhile, the 8088 NMI line was used for error reporting by the Memory subsystem, but for some reason, the 8087 FPU was also wired to it instead of to a standard PIC IRQ, as Intel recommended.

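To make the PIC's role more concrete, here is a sketch of the two most common software interactions with it, using the I/O Ports where the IBM PC mapped the 8259A (0x20 for commands, 0x21 for the mask Register) and the inb()/outb() helpers from the earlier PMIO sketch:

```c
#include <stdint.h>

extern uint8_t inb(uint16_t port);   /* from the earlier PMIO sketch */
extern void outb(uint16_t port, uint8_t value);

#define PIC_CMD  0x20  /* 8259A command Port in the IBM PC */
#define PIC_DATA 0x21  /* 8259A data/mask Port in the IBM PC */

/* Masking in practice: setting Bit n of the Interrupt Mask Register
   makes the PIC ignore IRQ n until the Bit is cleared again. */
void mask_irq(uint8_t irq)
{
    outb(PIC_DATA, inb(PIC_DATA) | (uint8_t)(1u << irq));
}

/* An ISR has to send an End Of Interrupt command when done, otherwise
   the PIC keeps holding back equal and lower priority IRQs. */
void send_eoi(void)
{
    outb(PIC_CMD, 0x20);  /* OCW2: non-specific EOI */
}
```
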
**Intel 8237A DMAC (Direct Memory Access Controller):** On a bare Intel 8088 CPU based platform, any data movement between Devices and a memory address has to be performed by the CPU itself. This means that instead of doing the actual compute work, the CPU spends a lot of its time moving data around, as if it was the delivery boy of the Devices. Supporting DMA means that there is a way for data movement to happen without the direct intervention of the CPU, offloading such memory transactions from it.

In this primitive era, most chips were rather dull and simple; they just limited themselves to doing very specific functions. Having Devices like a Floppy Disk Controller or a Sound Card able to independently initiate a memory transaction was theoretically possible, but expensive in both cost and physical size due to the amount of extra logic required to add such a complex function. The solution was to add a shared controller that did so on behalf of multiple Devices, providing a means to perform DMA without needing such capabilities integrated into each Device itself. This is known as Third-party DMA, where you have a DMAC that provides DMA services to Devices. Note that a parallel Bus would still be shared between the CPU and the DMAC, so only one of the two could use it at a given moment. The general idea was that the DMAC would help to move data in the background while the CPU was busy executing instructions that didn't require it to access the Bus, but sometimes it could be a hindrance if the DMAC didn't want to relinquish Bus control to the CPU while it was waiting for data.

Between the DMAC and DMA supporting Devices there was an out-of-band path similar to IRQs. This path was known as a DMA Channel, and included two lines, DREQ (DMA Request) and DACK (DMA Acknowledge), which were used by the Device to signal a DMA request and by the DMAC to acknowledge that it was ready to service it, respectively. When a Device wanted to perform DMA, it triggered its DMA Channel to tell the DMAC to do a memory transaction. However, because the Devices themselves were rather dumb, the only thing that they could do was to make a request; they couldn't tell the DMAC what to do when receiving it. There had to be a previous setup period, as a Device Driver had to program the DMAC ahead of time so that it knew what operation it had to do when it received a DMA request from a particular Device's DMA Channel, something similar in nature to an ISR.

The 8237A DMAC offered 4 DMA Channels, so up to 4 Devices could be serviced by it. The DMA Channels are numbered 0 to 3 and, like the IRQs, they had different priority levels, with 0 being the highest priority channel and 3 the lowest. An important detail of the 8237A is that it was a flyby type DMAC, where flyby means that it could simultaneously set a Device to read or write to the Bus and the memory to accept such a transaction in the same clock cycle, so it didn't actually have to buffer or handle the data itself to perform data movement. Like the 8259A PIC, there could be multiple 8237A DMACs in a system.
The supported configurations were cascading up to 4 8237A DMACs onto the 4 DMA Channels of a master 8237A in a two-level arrangement, or daisy chaining, where a master 8237A had another 8237A on a DMA Channel, and this second 8237A had a third one behind it, forming a three-level arrangement (Or more if required) with 3 Devices and 1 slave DMAC per level. Cascading and daisy chaining could also be mixed, with no limit mentioned in the 8237A Datasheet.

In the IBM PC, the 8237A DMA Channel 0 was used as a means to initiate a dummy transfer that refreshed the contents of the DRAM (Dynamic RAM) chips, as the IBM PC Motherboard lacked a proper dedicated DRAM Memory Controller. The DMA Channel 0 signals were used to refresh both the DRAM located on the Motherboard and on Memory expansion cards too, since there was a Pin on the expansion slots, B19, that made the DACK0 line available to expansion cards, albeit there was no DREQ0 line exposed for cards to initiate requests. DMA Channels 1-3 were free, and both their DREQ and DACK lines were exposed in the IBM PC expansion slots for any Device on an expansion card to use, albeit DMA Channel 2 was considered to be exclusive to FDC cards. As with IRQ 6, which was also for FDC type cards, I'm not aware of any other expansion card type that allowed you to set it to use that DMA Channel.

It is important to mention that the 8237A, as used in the IBM PC, was severely handicapped due to being intended for the previous Processor generation. While IBM added additional glue logic, known as the Page Registers, to manage the upper 4 address lines of the 8237A's external Address Bus so that it could understand the full 8088 20 Bits width, its lack of native support for such width gave it many limitations. The 8237A was limited to transferring at maximum 64 KiB in a single transaction, and the transactions could operate only within aligned 64 KiB Segments (0-63 KiB, 64-127 KiB, etc.) because the DMAC couldn't modify the Page Registers by itself. Some of the 8237A issues were [specific to the IBM PC implementation][datasheet_danger]. For example, IBM decided to wire the EOP (End of Process) line as Output only instead of Input/Output, so it is not possible for an external source to tell the 8237A to abort a DMA operation; the line just signals when it has finished. The 8237A was also able to do memory-to-memory transfers, but that required DMA Channels 0 and 1 to be available, which was not possible in the IBM PC due to DMA Channel 0 being used for the DRAM refresh procedure.

While having the 8237A DMAC was better than nothing (And not that bad by 1981 levels), in a few years it would become a burden due to its performance being nearly impossible to scale up. Several people have already wondered how different the stories about DMA on IBM PC platforms would have been had IBM decided to go with the Intel 8089 IOP instead of the previous generation 8237A. Besides the facts that the 8089 IOP had only 2 DMA Channels compared to the 8237A's 4, and that it was much more expensive, it may eventually have saved the entire industry a lot of headaches.

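That setup period looks roughly like this in code. The sketch programs DMA Channel 1 for a Device-to-memory transfer using the I/O Ports where the IBM PC mapped the 8237A (0x00-0x0F) and the Channel 1 Page Register (0x83); the Register values follow the 8237A Datasheet, while the helper itself is only illustrative:

```c
#include <stdint.h>

extern void outb(uint16_t port, uint8_t value);  /* earlier PMIO sketch */

#define DMA_CH1_ADDR  0x02  /* Channel 1 start address (16 Bits) */
#define DMA_CH1_COUNT 0x03  /* Channel 1 transfer count */
#define DMA_MASK      0x0A  /* single channel mask Register */
#define DMA_MODE      0x0B  /* mode Register */
#define DMA_FLIPFLOP  0x0C  /* clear low/high Byte flip-flop */
#define PAGE_CH1      0x83  /* IBM's Page Register for Channel 1 */

/* The block must fit inside one aligned 64 KiB Segment: the 8237A only
   counts within its own 16 Bits and cannot touch the Page Register. */
void dma_setup_ch1(uint32_t address, uint16_t count)
{
    outb(DMA_MASK, 0x05);        /* mask Channel 1 while programming it */
    outb(DMA_FLIPFLOP, 0x00);    /* any written value resets the flip-flop */
    outb(DMA_MODE, 0x45);        /* single transfer, write to memory, ch 1 */
    outb(DMA_CH1_ADDR, address & 0xFF);          /* address, low Byte */
    outb(DMA_CH1_ADDR, (address >> 8) & 0xFF);   /* address, high Byte */
    outb(PAGE_CH1, (address >> 16) & 0x0F);      /* the 4 Bits it lacks */
    outb(DMA_CH1_COUNT, (count - 1) & 0xFF);     /* programmed as N-1 */
    outb(DMA_CH1_COUNT, ((count - 1) >> 8) & 0xFF);
    outb(DMA_MASK, 0x01);        /* unmask: ready to service DREQ1 */
}
```

Note how the Page Register is written on the side: the 8237A itself never sees those upper 4 Bits, which is exactly why a transfer can't cross an aligned 64 KiB boundary.
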
**Intel 8253 PIT (Programmable Interval Timer):** A computer by itself has no concept of time. At most, what it can do is simply count elapsed clock cycles. Because the clock speed that a given part is running at is a known value, it is possible to externally infer the real time elapsed based on a clock cycle counter, at which point you have the first building block to count seconds (or fractions of them). While theoretically the CPU can count clock cycles, it would be a waste for it to do so, since it means that it wouldn't be able to do anything else without completely desynchronizing. For this reason, there were dedicated timers whose only purpose was to count cycles without interruptions, as they were required to be as consistent as possible for any timing measurements to be accurate.

The 8253 PIT was an interesting chip due to the amount of functions it could do that exceeded those of a simple timer. Actually, it had three independent timers, known as Counters 0, 1 and 2. All three timers could be directly used by software, as you could program how many clock cycles had to pass between ticks, then just read back the current values. Each Counter also had both an input GATE line and an output OUT line, the latter of which could be independently triggered by that Counter to allow the 8253 to directly interface with other Hardware.

The IBM PC not only used all three 8253 Counters, it also used all their external OUT lines, too. Counter 0 was used as the System Timer to keep track of elapsed time, with its OUT line hooked directly to IRQ 0 of the 8259A PIC so that it interrupted the CPU to update the clock at regular intervals with the highest Interrupt priority. Counter 1 was used as a DRAM refresh timer, with its OUT line wired directly to the DREQ0 Pin of the 8237A DMAC to request, also at regular intervals, a dummy DMA transfer that refreshed the DRAM memory. Last but not least, Counter 2's OUT line passed through some glue logic to reach the PC Speaker. In addition to that, Counter 2 had its input GATE line wired to the Intel 8255 PPI, whereas the other two Counters didn't have theirs connected. Both the 8253 PIT and 8255 PPI could be used either individually or in tandem to produce noises via the PC Speaker.
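
As a concrete example of how software programmed a Counter, here is a minimal sketch, again in DOS-era C with Turbo C style `outportb()`, that reprograms Counter 0 through the PIT Control Port (43h) and the Counter 0 Data Port (40h). The divisor works against the PIT's 1.19 MHz input clock (covered in the Clock Generation chapter); the function name is illustrative:

```c
#include <dos.h>   /* outportb() from DOS-era compilers like Turbo C */

/* Reprograms 8253 Counter 0 (the System Timer). The Counter decrements at
   1.19 MHz and pulses IRQ 0 each time it expires, so the tick rate is
   1193182 / divisor Hz. A divisor of 0 is treated as 65536, which is the
   BIOS default and yields the famous 18.2 Hz System Timer tick. */
void set_timer_divisor(unsigned int divisor)
{
    outportb(0x43, 0x36);                  /* Control Word: Counter 0, LSB then MSB, Mode 3 */
    outportb(0x40, divisor & 0xFF);        /* Divisor LSB */
    outportb(0x40, (divisor >> 8) & 0xFF); /* Divisor MSB */
}
```
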
**Intel 8255 PPI (Programmable Peripheral Interface):** As mentioned before, the CPU external Bus had a protocol of sorts, which means that anything that directly interfaced with it had to understand that protocol. Yet, when it came to interfacing with external peripherals, instead of adapting them to understand the CPU Bus protocol, designers seemed to always opt for some form of GPIO (General Purpose I/O). GPIO can be used to make protocol agnostic interfaces where you can bit-bang raw data in and out, and leave software like a Device Driver to interpret the meaning of the raw Bits. It could be considered a raw Bus with a user-defined protocol. Obviously, there were specialized intermediary chips that interfaced with the CPU Bus protocol to provide such GPIO.

Due to my lack of electronics knowledge, I don't actually understand what GPIO truly did to be useful and earn its place in a platform design. There is technically nothing that forbids you from providing direct access to the CPU Bus via an external connector if you wanted to do so (albeit stretching a circuit that much may hurt signal integrity if the cables are too long or of poor quality, causing a severely limited maximum possible stable clock speed, but the point is that it would still work). IBM itself did that with the IBM 5161 Expansion Chassis add-on for the IBM PC 5150 and IBM PC/XT 5160, which was pretty much a separate Computer Case that was cabled to the main computer and provided more room for expansion slots. I assume that bit-banging through GPIO is just far easier to implement in simple peripherals than a direct CPU Bus interface, and it would also be completely neutral in nature, thus easier to interface with other computer platforms. Nowadays, a lot of what previously used to be done via GPIO interfaces is done via USB, which is a well-defined protocol.

The Intel 8255 PPI provided a total of 24 GPIO Pins arranged as 3 Ports of 8 Pins each, named Ports A, B and C. Port C could also be halved into two 4 Pin sets to complement Ports A and B, with the extra lines having predefined roles like generating interrupts on behalf of each Port (which is not very GPIO-like...). Like the 8253 PIT, the 8255 PPI had several possible operating modes to cover a variety of use cases.

In the IBM PC, the 8255 PPI was used for a myriad of things, which makes explaining all its roles not very straightforward. Actually, MinusZeroDegrees has [plenty of info][gpio_pinout] about the details of each 8255 GPIO Pin role in both the IBM PC and its successor, the IBM PC/XT.

The easiest one to describe is Port A, whose main use was as the input of the Keyboard interface, doing part of the job of a Keyboard Controller with the help of additional logic between it and the Keyboard Port (albeit the 8255 PPI is not a Keyboard Controller in the proper sense of the word, as it just provides a generic GPIO interface. It could be said that it is as much of a Keyboard Controller as the 8253 PIT is a Sound Card; they were not designed for those roles, they were just parts of those subsystems' circuitry). While inbound data from the Keyboard generated interrupts on IRQ 1, the 8255 itself didn't signal those with its built-in interrupt logic, because it was not wired to the 8259A PIC at all. Instead, the glue logic that did the serial-to-parallel conversion of the incoming Keyboard data so that it could be fed to the 8255 was also wired to the PIC and signalled the interrupts. Port A also had some auxiliary glue logic that wired it to a set of DIP Switches on the Motherboard known as SW1, and whether Port A was getting input from the DIP Switches or the Keyboard depended on the status of a configuration Pin on Port B.

Port B and Port C are where the fun stuff happens. Their jobs included interacting with the 8253 PIT, the PC Speaker (both directly, and indirectly through the PIT), the Cassette interface, another set of DIP Switches known as SW2, and part of the system RAM error detection logic, and Port B even had a Pin used to control whether Port A was reading the Keyboard or the SW1 DIP Switches. The Pin roles were completely individualized yet mixed within the same Port, unlike Port A. Here comes a major detail involving such an implementation: the CPU could only do 8 Bits Data Bus transactions, which means that you couldn't interact with just a single Bit in a Port. Any operation would have it either reading or writing all 8 Bits at once (I think the 8255 PPI supported doing so in BSR Mode, but that mode was only available for Port C, and I'm unsure whether the IBM PC had Port C configured that way anyway). As you couldn't arbitrarily change the value of just a single Bit without breaking something, it was critical that whenever you wanted to do things like sending data to the PC Speaker, you first loaded the current value of the entire Port into a CPU GPR, modified the Bit you wanted without altering the others, then wrote it back to the Port.
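
This read-modify-write sequence is easy to show in code. The following is a minimal sketch, under the same DOS-era C assumptions as the previous examples, of the classic way to produce a tone: program 8253 Counter 2 for a square wave, then flip only Bits 0 and 1 of Port B (Bit 0 gates Counter 2, Bit 1 feeds the Speaker), preserving the other six Bits. The helper names are illustrative:

```c
#include <dos.h>   /* inportb()/outportb() from DOS-era compilers like Turbo C */

/* Emits a tone on the PC Speaker: program 8253 Counter 2 with the desired
   divisor, then set Bits 0 and 1 of 8255 Port B (I/O Port 61h) using the
   read-modify-write sequence described above, so that the other Port B
   Bits keep their current values. */
void speaker_on(unsigned int freq_hz)
{
    unsigned int divisor = (unsigned int)(1193182UL / freq_hz);
    unsigned char port_b;

    outportb(0x43, 0xB6);                  /* Control Word: Counter 2, LSB then MSB, Mode 3 */
    outportb(0x42, divisor & 0xFF);        /* Counter 2 divisor LSB */
    outportb(0x42, (divisor >> 8) & 0xFF); /* Counter 2 divisor MSB */

    port_b = inportb(0x61);                /* Read the whole Port B...                    */
    outportb(0x61, port_b | 0x03);         /* ...write it back with only Bits 0-1 changed */
}

void speaker_off(void)
{
    outportb(0x61, inportb(0x61) & ~0x03); /* Clear only Bits 0-1, preserving the rest */
}
```
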
[polling]: http://www.cs.tau.ac.il/~orenkish/When_polling.pdf
[interrupts]: http://wiki.osdev.org/Interrupts
[datasheet_danger]: http://www.os2museum.com/wp/the-danger-of-datasheets/
[gpio_pinout]: http://www.minuszerodegrees.net/5160/diff/5160_to_5150_8255.htm


##### 2.4 - IBM PC Memory Map (Conventional Memory and UMA)

A major design decision of the IBM PC was the organization of the 8088 CPU Address Spaces. As previously mentioned, getting something properly mapped into the Address Space required the help of auxiliary address decoding glue logic, so everything present on a Bus had to be behind one. This applied not only to the Motherboard built-in memory and Devices, but also to the expansion cards, as each required its own decoding logic. It was a rather complex situation when you consider that every piece of address decoding logic in the system was listening to the same unified parallel Bus (the reason why is in the next chapter), so it was extremely important that there were no address conflicts in anything plugged into the IBM PC. As the platform designer, IBM had to take on the task of explicitly defining the addresses or address ranges that the platform built-in memory and Devices would be mapped to, and which ranges were free for expansion cards to use, so that there was no overlap between any of them. This is the basic concept behind a Memory Map.

The IBM PC 5150 Technical Reference Manual includes the IBM defined Memory Map in Pages 2-25, 2-26 and 2-27, while the I/O Ports Map is in Pages 2-23 and 2-24. These are unique to the IBM PC, or, in other words, different from the Memory and I/O Ports Maps of any other 8086/8088 based platform. As the address decoding logic that took care of the support chips in the IBM PC Motherboard was hardwired to use the IBM defined addresses, the mapping for them was absolutely fixed. The expansion cards, depending on the manufacturer's intentions, could have either fixed address decoding logic, or logic configurable via Jumpers or DIP Switches.

The IBM PC Memory Map, as can be seen in the previously mentioned tables, was almost empty. However, even by that early point, IBM had already taken a critical decision that would have an everlasting impact: it partitioned the 1 MiB Memory Address Space into two segments, one that occupied the lower 640 KiB (0 to 639 KiB), which was intended to be used solely for the system RAM (be it located either on the Motherboard or in expansion cards), and another segment that occupied the upper 384 KiB (640 KiB to 1023 KiB), which was intended for everything else, like the Motherboard ROMs, and the expansion cards' RAMs and ROMs as MMIO. These segments would become known in DOS jargon as Conventional Memory and the UMA (Upper Memory Area), respectively. This is where the famous 640 KiB Conventional Memory limit for DOS applications comes from.

The contents of the Conventional Memory are pretty much software defined, with a single exception: the very first KiB of Conventional Memory is used by the CPU IVT (Interrupt Vector Table), which has 256 Interrupt entries 4 Bytes in size each. Each entry (Vector) was a pointer to an ISR (Interrupt Service Routine). From these 256 entries, Intel used only the first 8 (INT 0-7) as 8086/8088 internal Exceptions, marked the next 24 (INT 8-31) as reserved for future expansion, then left the rest available for either Hardware or software Interrupts. IBM, in its infinite wisdom, decided to start mapping its ISR pointers beginning at INT 8, which was reserved. This would obviously cause some issues when Intel later expanded the possible Exception Interrupts, causing some overlap.
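
Since each Vector is just a 4 Bytes pointer stored at linear address INT number x 4, locating an ISR is trivial. A minimal sketch in DOS-era C, where `MK_FP()` is the far pointer helper that compilers like Turbo C provide in `<dos.h>` and the function name is illustrative:

```c
#include <dos.h>   /* MK_FP() far pointer helper from DOS-era compilers */

/* Locates the ISR that services a given Interrupt number. Each IVT entry
   lives at linear address vector * 4 and is 4 Bytes long: a 16 Bits offset
   followed by a 16 Bits segment. Vector 8 is the System Timer in the IBM
   PC, the overlap with the Intel reserved range described above. */
void far *get_isr(unsigned char vector)
{
    unsigned int far *entry = (unsigned int far *)MK_FP(0x0000, vector * 4);
    unsigned int offset  = entry[0];   /* Bytes 0-1: ISR offset  */
    unsigned int segment = entry[1];   /* Bytes 2-3: ISR segment */

    return MK_FP(segment, offset);
}
```
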
In the case of the UMA segment, the IBM defined Memory Map had several ranges that were either used, marked as reserved, or explicitly free. The sum of the actually used ones was 68 KiB, consisting of all the Motherboard ROM plus two video framebuffers:

* 8 KiB (1016 KiB to 1023 KiB) for the firmware ROM, the only thing that had to be located at a mandatory address as required by the 8088 CPU.
* 32 KiB (984 KiB to 1015 KiB) for the IBM Cassette BASIC, made out of four 8 KiB ROM chips.
* 8 KiB (976 KiB to 983 KiB) for an optional ROM chip on the Motherboard that went mostly unused.
* Two independent RAM based video framebuffers that were MMIO: 4 KiB (704 KiB to 707 KiB) for the MDA Video Card, and 16 KiB (736 KiB to 751 KiB) for the CGA Video Card.

124 KiB worth of addresses were marked as reserved but unused: two 16 KiB chunks, one right above the end of Conventional Memory (640 KiB to 655 KiB) and the other just below the Motherboard ROM (960 KiB to 975 KiB), plus 100 KiB intended for not yet defined video framebuffers, which were themselves part of a 112 KiB contiguous chunk that already had MDA and CGA on it (656 KiB to 767 KiB). Finally, there were 192 KiB explicitly marked free (768 KiB to 959 KiB). Note that the Video Cards were not built into the IBM PC Motherboard; these were always independent expansion cards, yet IBM defined a fixed, non overlapping mapping for both of them as part of the base platform.

As early on IBM had complete reins over the IBM PC platform, IBM itself decided when an address range stopped being free and was assigned or reserved for something else. Whenever IBM released a new type of Device in expansion card format, it typically also defined a fixed set of resources (Memory Addresses, I/O Ports, DMA Channel and IRQ line) that it would use, then enforced the usage of these resources by making its expansion cards fully hardwired to them, so any mapping was always fixed. As the IBM PC platform matured, the number of newly defined or otherwise reserved address ranges in the Memory Map grew as IBM released more types of expansion cards (for example, the Technical Reference Manual previously linked is dated August 1981; there is a later one from April 1984 that added a fixed address range for the ROM of the new Hard Disk Controller card). Sometimes, IBM defined two or more sets of resources if it intended for more than one card of the same Device type to be usable in a system; those cards had Jumpers or DIP Switches to select between the multiple fixed sets of resources. An example is [this IBM card][5150ac] that had a Serial Port with a DIP Switch to select between two sets of resources, which would become much better known as COM1 and COM2.

The IBM resource hard-wiring practices would become a problem for third party developers, as they had to make expansion cards with a wide range of configurable resources to guarantee compatibility in the foreseeable scenario where IBM released a new card that used the same resources, or another third party card did. As such, third party expansion cards typically had a lot of Jumpers or DIP Switches that allowed you to select which address ranges you wanted to map whatever memory they had to, which, depending on the needs of the card, could include mapping RAM or ROM memory in different ranges of the UMA, and different I/O Ports. Same with IRQs and DMA Channels. You also had to configure the software to make sure that it knew where to look for that card, as third party Devices had common defaults, but these were not guaranteed, unlike IBM's fixed resource definitions for its own cards.

While the reserved addressing range for system RAM allowed for a maximum of 640 KiB, the IBM PC Motherboard itself couldn't have that much installed onboard. During the life cycle of the IBM PC, two different Motherboards were used for it, whose main difference was the amount of onboard RAM that they supported. The first PC Motherboard version, known as 16KB - 64KB, supported up to 64 KiB RAM when maxed out with 4 Banks of 16 KiB, and the second Motherboard version, known as 64KB - 256KB, supported up to 256 KiB RAM when maxed out with 4 Banks of 64 KiB (the second Motherboard was included in a new revision of the IBM PC 5150, known as PC-2, released in 1983, around the same time as its successor, the IBM PC/XT 5160). In order to reach the 640 KiB limit (albeit at launch IBM officially supported only 256 KiB, as that was the practical limit back then), you had to use Memory expansion cards. What made these unique is that they were intended to be mapped into the Conventional Memory range instead of the UMA, as you would expect from any other type of expansion card, which is why the memory on these cards could be used as system RAM.

An important consideration was that all system RAM had to be mapped as a single contiguous chunk; gaps were not accepted in the Conventional Memory mapping (this seems to NOT have been true in the original IBM PC 16KB - 64KB Motherboard with the first firmware version, as I have actually read about it supporting non-contiguous memory. Chances are that this feature was dropped because it also required support from user software, which would have made Memory Management far more complex). As such, all Memory expansion cards supported a rather broad selection of address ranges to make sure that you could always map their RAM right after the Motherboard RAM, or right after the RAM from another Memory expansion card. A weird limitation of the IBM PC is that it required all 4 Banks on the Motherboard to be populated before being able to use the RAM on Memory expansion cards, yet the later IBM PC/XT 5160 can use them even with only a single Bank populated.

A little known quirk is that, while the IBM PC designers did not intend for there to be more than 640 KiB of system RAM, nothing stopped you from mapping RAM into the UMA for as long as the address decoding logic of the Memory expansion card supported doing so. With some hacks to the BIOS firmware, it was possible to get it to recognize more than 640 KiB of Conventional Memory and pass down this data to the Operating System. PC DOS/MS-DOS supported this hack out of the box, since they relied on the BIOS to report the amount of system RAM that the computer had installed; they didn't check that by themselves. The problem was that the hack still maintained the limitation that Conventional Memory had to be a single contiguous segment, which means that how much you could extend the system RAM depended on which was the first UMA range in use.

Even in the best case scenarios, the beginning of the Video Card framebuffer was effectively the highest limit that the hack could reach: with CGA, it could go up to 736 KiB; with MDA, up to 704 KiB; and with later Video Cards like EGA and VGA, you couldn't extend it at all, since their UMA ranges began exactly at the 640 KiB boundary. Mapping the video framebuffer higher was possible, but it required a specialized Video Card that allowed you to do so, further hacking of the firmware, and even patching the user software so that it would use the new address range instead of blindly assuming that the video framebuffer was at the standard IBM defined range. Thus, while theoretically possible, it should have been even rarer to see someone trying to move the video framebuffers to get more RAM as standard Conventional Memory, due to the excessive complexity and specialized Hardware involved. These days it should be rather easy to demonstrate those hacks in emulated environments before attempting them on real Hardware.

I don't know if later systems could work with any of these hacks (albeit someone recently showed that IBM's own OS/2 [had built-in support][cgamemlimit] for 736 KiB Conventional Memory if using CGA), but as the newer types of Video Cards with their fixed mappings beginning at 640 KiB became popular, the hacks ceased to be useful considering how much effort they required. As such, hacks that allowed a computer to have more than 640 KiB Conventional Memory were rather short lived. Overall, if you could map RAM into the UMA, it was just easier to use it for a small RAMDisk or something like that.

[5150ac]: http://www.minuszerodegrees.net/5150_5160/cards/5150_5160_cards.htm#ac_adapter
[cgamemlimit]: https://www.vogons.org/viewtopic.php?p=796928#p796928


##### 2.5 - IBM PC Motherboard physical characteristics overview, Computer Case and Keyboard

The physical dimensions of the IBM PC Motherboard depend on the Motherboard version. The IBM PC 5150 Technical Reference from August 1981 mentions that the Motherboard dimensions are around 8.5'' x 11'', which in centimeters would be around 21.5 cm x 28 cm. These should be correct for the first 16KB - 64KB Motherboard found in the PC-1. The April 1984 Technical Reference instead mentions 8.5'' x 12'', and also gives an accurate value in millimeters, rounding them to 21.5 cm x 30.5 cm.
Given that the second Technical Reference corresponds to the 64KB - 256KB Motherboard of the PC-2, both sizes should be true; IBM just didn't specifically mention which Motherboard it was talking about in the later Technical Reference.

For internal expansion, not counting the 5 expansion slots, the IBM PC Motherboards had two major empty sockets: one for the optional Intel 8087 FPU, and another for an optional 8 KiB ROM chip (known as U28) that had no defined purpose but was mapped. For system RAM, both Motherboards had 27 sockets organized as 3 Banks of 9 DRAM chips each, with each Bank requiring to be fully populated to be usable. The 16KB - 64KB Motherboard used DRAM chips with a puny 2 KiB RAM each, whereas the 64KB - 256KB Motherboard used 8 KiB DRAM chips, which is how it gets to quadruple the RAM capacity using the same amount of chips. While in total there were 4 RAM Banks (did you notice that there was more installed RAM than usable RAM already?), the 9 DRAM chips of Bank 0 always came soldered, a bad design choice that eventually caused some maintenance issues, as it was much harder to replace one of these chips if it were to fail. Some components like the main Intel 8088 CPU, the ROM chips with the BIOS firmware and the IBM Cassette BASIC came socketed, yet these had only an extremely limited amount of useful replacements available, which appeared later in the IBM PC life cycle, and, ironically, they seem not to have failed as often as the soldered RAM.

While the IBM PC 5150 Motherboards were loaded with chips (on a fully populated Motherboard, around 1/3 of them were DRAM chips), the only external I/O connectors they had were just two Ports, one to connect the Keyboard and the other for an optional Cassette Deck. As mentioned before, both external Ports were wired to the Intel 8255 PPI (Programmable Peripheral Interface) GPIO with some additional circuitry between the external Ports and the 8255 input, so it can't be considered a full fledged Keyboard Controller or Cassette Controller. There was also a defined Keyboard protocol so that the Keyboard Encoder, located inside the Keyboard itself, could communicate with the Keyboard Controller circuitry of the IBM PC.

The IBM PC Motherboards also had an internal header for the PC Speaker mounted on the Computer Case, which was wired through some glue logic to both the Intel 8253 PIT and Intel 8255 PPI. Talking about the Computer Case, the IBM PC one had no activity LEDs (like Power or HDD) at all; it only exposed the Power Switch at the side, near the back. It was really that dull. Also, as the computer's On/Off toggle type Power Switch was not connected to the Motherboard in any way, since it was part of the Power Supply Unit itself, the PC Speaker had the privilege of being the first member of the Front Panel Headers found in modern Motherboards.

The IBM PC Keyboard, known as the Model F, deserves a mention. The Keyboard looks rather interesting the moment you notice that inside it, there is an Intel 8048 Microcontroller working as a Keyboard Encoder, making the Keyboard itself look as if it were a microcomputer. The 8048 is part of the Intel MCS-48 family of microcontrollers, and had its own integrated 8 Bits CPU, Clock Generator, Timer, 64 Bytes RAM, and 1 KiB ROM.
Regarding the ROM contents, manufacturers of microcontrollers had to write the customer code during chip manufacturing (some variants of the 8048 came with an empty ROM that could be programmed once, some could even be reprogrammed), which means that the 8048 that the IBM Model F Keyboard used had a built-in firmware specifically made for it, so it was impossible to replace the Model F's 8048 with another 8048 that didn't come from the same type of unit. Some people recently managed to dump the 8048 firmware used by the IBM Model F Keyboard, for either emulation purposes or to replace faulty 8048s with working reprogrammable chips loaded with the original Model F firmware.


##### 2.6 - IBM PC Motherboard Buses (Local Bus, I/O Channel Bus, Memory Bus)

It shouldn't be a surprise to say that all the mentioned chips, including the main Processor, the support chips, and the RAM and ROM memory chips, plus all the glue logic to make them capable of interfacing together and getting mapped to the expected locations in the address spaces, had to be physically placed somewhere, and somehow interconnected together. The Motherboard, which IBM used to call the Planar in its literature, served as the physical foundation to host the core chips that defined the base platform.

Regarding the chip interconnects, as can be seen in the System Board Data Flow diagram on Page 2-6 of the IBM PC 5150 Technical Reference Manual, the IBM PC had two well-defined Buses: the first one was the parallel Local Bus, which interconnected the Intel 8088 CPU, the optional 8087 FPU, the 8288 Bus Controller and the 8259A PIC, and the second one was called the I/O Channel Bus, which transparently extended the Local Bus and interfaced with almost everything else in the system. Additionally, while the mentioned Block Diagram barely highlights it, the I/O Channel Bus was further subdivided into two segments: one that connected the Local Bus with the Memory Controller and also extended to the expansion slots, exposing all the I/O Channel Bus signal lines for expansion cards to use, and the other a subset of I/O Channel for the support chips located on the Motherboard itself, which just had limited hardwired resources. There is also a Memory Bus between the Memory Controller and the RAM memory chips that serve as the system RAM.

**Local Bus:** The core of the Local Bus is the 8088 CPU. As previously mentioned, the 8088 CPU had a multiplexed external Bus, so all the chips that sat on the Local Bus had to be explicitly compatible with its specific multiplexing scheme, limiting those to a few parts from the MCS-86 and MCS-85 chip sets. The Local Bus was separated from the I/O Channel Bus by some intermediate buffer chips that served to demultiplex the output of the 8088 Data and Address Buses, making them independent lines so that it was easier to interface them with third party chips, while the 8288 Bus Controller did the same with the 8088 Control Bus lines. What this means is that the I/O Channel Bus is pretty much a demultiplexed transparent extension of the Local Bus that is behind some glue logic but has full continuity with it; it is not a separate entity. As such, in the IBM PC, everything effectively sits on a single, unified, system wide parallel Bus.
That means that all the chips were directly visible and accessible to each other, which made the IBM PC Motherboard look like a giant backplane that just happened to have some built-in Devices.

**I/O Channel Bus:** It can't be argued that the I/O Channel Bus at the Motherboard level is just a demultiplexed version of the 8088 Local Bus, but things get a bit more complex when you consider I/O Channel at the expansion slot level. The expansion slots exposed a composite version of the I/O Channel Bus, since besides being connected to the 8088 CPU demultiplexed Local Bus Address, Data and Control lines, they also had Pins that were directly wired to the 8259A PIC and 8237A DMAC, so that an expansion card could use one or more IRQ and DMA Channel lines at will (some Devices on the Motherboard were hardwired to them, too. The DMA and IRQ lines that were used by those Devices were not exposed in the I/O Channel expansion slots).

A detail that I find rather curious is that IBM didn't really have a need to create its own, custom expansion slot and its composite Bus. Before IBM began designing the IBM PC, Intel had already used, for at least some of its reference 8086/8088 platforms, its own external Bus and connector standard, Multibus. Another option would have been the older, industry standard S-100 Bus. While it is easy to find discussions about why IBM chose the Intel 8088 CPU over other contemporary Processor alternatives, I failed to find anyone asking about why IBM decided to create a new expansion slot type. Whether IBM wanted to roll a new expansion slot standard as a vendor lock-in mechanism or just to sell new cards for the IBM PC, or whether the existing options like Multibus weren't good enough for IBM (Multibus had Pins intended for 8 IRQ lines, but no DMA Channels), or whether IBM would have had to pay royalties to Intel to use it and didn't want to, is something that I simply couldn't find answers about.

**Memory Bus:** From the Processor's point of view, the most important component of the platform should be the system RAM, since it is its personal workspace. However, CPUs usually didn't interface with the RAM chips directly; instead, there was a Memory Controller acting as the intermediary, with the Memory Bus being what linked it with the RAM chips themselves. Note that there could be multiple Memory Buses on the same system, as in the case of the IBM PC, where you always had the Memory Controller and RAM chips that were part of the Motherboard itself plus those on the optional Memory expansion cards, as they had their own Memory Controllers and RAM chips, too.

The Memory Controller's main function was to multiplex the input Address Bus, as RAM chips had a multiplexed external Address Bus in order to reduce package pin count. Unlike the Intel 8088 CPU, which multiplexed part of the Address Bus lines with the Data Bus ones, RAM chips had a dedicated Data Bus; pin reduction was achieved by multiplexing the Address Bus upon itself. The multiplexing was implemented simply by halving the addresses themselves into two pieces, Rows and Columns, so a RAM chip effectively received an address in two parts per operation. Moreover, since the RAM chips were of the DRAM (Dynamic RAM) type, they had to be refreshed at periodic intervals to maintain the integrity of their contents. A proper DRAM Memory Controller would have done so itself, but the IBM PC didn't have one; instead, it relied on the 8253 PIT and 8237A DMAC as auxiliary parts of the Memory Controller circuitry to time and perform the DRAM refreshes, respectively, while some discrete logic performed the Address Bus multiplexing. Also, the expansion slots exposed the DACK0 signal (from DMA Channel 0) that could be used by Memory expansion cards to refresh their DRAM chips in sync with the Motherboard ones.
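
As a toy illustration of that Row and Column halving, here is a conceptual sketch in plain C for a 64K x 1 DRAM chip, whose 16 Bits cell address is presented over 8 address pins in two strobes, RAS (Row Address Strobe) and then CAS (Column Address Strobe). Which half goes to the Rows is an arbitrary choice here; the real Hardware does this with latches and control lines, not code:

```c
#include <stdio.h>

/* Conceptual sketch: how the discrete multiplexing logic splits a 16 Bits
   address for a 64K x 1 DRAM chip with only 8 address pins. The Row half
   is latched by the RAS strobe, then the Column half by the CAS strobe,
   so the chip receives the address as two pieces per operation. */
int main(void)
{
    unsigned int addr = 0xABCD;               /* Arbitrary cell address     */
    unsigned char row = (addr >> 8) & 0xFF;   /* Upper half, latched on RAS */
    unsigned char col = addr & 0xFF;          /* Lower half, latched on CAS */

    printf("Address %04X -> Row %02X, Column %02X\n", addr, row, col);
    return 0;
}
```
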
The IBM PC Motherboard Memory Controller circuitry could manage up to 4 Banks of DRAM chips through its own Memory Bus, being able to access only a single Bank at a time. Obviously, you want to know what a Bank is. A Bank (it can also be a Rank; the Rank and Bank definitions seem to overlap a lot, depending on context and era) is made out of multiple RAM chips that are simultaneously accessed in parallel, where the sum of their external Data Buses has to match the width of the Memory Data Bus, which, as you can guess, was highly tied to the 8088 CPU's 8 Bits Data Bus. There were multiple ways to achieve that sum, including using either a single RAM chip with an 8 Bits external Data Bus, 2 with 4 Bits, or 8 with 1 Bit. The IBM PC took the 1 Bit RAM chips route, as these were the standard parts of the era. While this should mean that the IBM PC required 8 1-Bit RAM chips per Bank, it actually had 9...

As primitive as the Memory Controller subsystem was, it implemented Parity. Supporting Parity means that the Memory Controller had 1 Bit Error Detection for the system RAM, so that a Hardware failure that caused RAM corruption would not go unnoticed. This required the ability to store an extra 1 Bit per 8 Bits of memory, which is the reason why there are 9 1-Bit RAM chips per Bank instead of the expected 8. It also means that the Data Bus between the Memory Controller and a Bank was actually 9 Bits wide, and that per KiB of usable system RAM, you actually had installed 1.125 KiB (1 + 1/8) of raw RAM. I suppose that IBM picked 1-Bit RAM chips instead of the wider ones because Parity doesn't seem to be as straightforward to implement with the other alternatives. The Parity circuitry was wired to both some GPIO Pins of the 8255 PPI and to the NMI Pin of the 8088 CPU, and was also exposed in the expansion slots. When a Memory Controller signalled an NMI, an ISR fired that checked the involved 8255 PPI Port Bits as a means to identify whether the Parity error happened in the RAM on the Motherboard or on a Memory expansion card. Note that not all Memory expansion cards fully implemented Parity; some cheap models may have worked without Parity to save on DRAM chip costs.
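
The arithmetic behind the 9th chip is simply one extra Bit computed from the other 8. A minimal sketch in plain C; odd polarity is my assumption for illustration (it is the usual choice, since it also catches an all-zeroes chip failure), the exact polarity used by the PC's checker being a Hardware detail:

```c
#include <stdio.h>

/* Sketch of 1 Bit Error Detection: one parity Bit is stored per 8 Bits of
   data, which is why each Bank has 9 1-Bit DRAM chips. On every read the
   checker recomputes the parity and raises NMI if the stored Bit doesn't
   match. Odd polarity assumed: the 9 Bits always hold an odd number of ones. */
unsigned char odd_parity_bit(unsigned char data)
{
    unsigned char ones = 0;
    int i;

    for (i = 0; i < 8; i++)
        ones += (data >> i) & 1;  /* Count the set Bits of the Byte */

    return (ones & 1) ? 0 : 1;    /* The 9th Bit makes the total odd */
}

int main(void)
{
    unsigned char byte = 0x5A;    /* 4 ones -> the parity Bit must be 1 */

    printf("Data %02X -> parity Bit %u\n", byte, odd_parity_bit(byte));
    return 0;
}
```
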
At the time of the IBM PC, Parity was a rare feature considered unworthy of personal computers (ever heard about famous supercomputer designer Seymour Cray saying "Parity is for farmers", then comically retracting that statement a few years later by saying "I learned that farmers used computers"?), yet IBM decided to include it anyway. This is the type of attitude that made IBM highly renowned for the reliability of its systems. In comparison, modern Memory Module standards like DDR3/4 also support 1 Bit Error Detection, but without the need for extra raw RAM, because they work on whole 64 Bits Ranks, which allows for a special type of error detection algorithm that doesn't require wasting extra memory. The DDR3/DDR4 Memory Modules that have ECC (Error Checking Correction) capabilities do use an extra RAM chip with both a 1/8 wider 72 Bits Bus and 1/8 more raw RAM than the usable amount, exactly like the IBM PC Parity implementation, but ECC instead allows 2 Bits Error Detection with 1 Bit Error Correction, assuming that the platform memory subsystem supports using ECC.

**External I/O Channel Bus:** The fourth and final Bus was a subset of I/O Channel that, according to the Address Bus and Data Bus [diagrams][addrbus] of MinusZeroDegrees, has the unofficial name of External I/O Channel Bus (the Technical Reference Block Diagram just calls them the external Address Bus and external Data Bus). The External I/O Channel Bus was separated from the main I/O Channel Bus by some intermediate glue chips. Connected to this Bus, you had the ROM chips for the Motherboard firmware and the IBM Cassette BASIC, and the Intel 8237A DMAC, Intel 8253 PIT, and Intel 8255 PPI support chips. An important detail is that the External I/O Channel Bus didn't have all 20 Address lines available to it, but only 13, which was just enough to address the internal memory of the 8 KiB sized ROMs. As the ROMs were still effectively behind a 20 Bits address decoding logic, they were mapped correctly into the Memory Address Space, so things just worked, but the support chips weren't as lucky...

The support chips were mapped into the I/O Address Space behind an address decoding logic that was capable of decoding [only 10 Bits][10bits] instead of the full 16, which means that they did just partial address decoding. This not only happened with the Motherboard built-in support chips; early expansion cards that used PMIO also decoded only 10 Bits of the CPU Address Bus. As such, in the IBM PC platform as a whole, there was effectively only 1 KiB (2^10) worth of unique I/O Ports, which, due to the partial address decoding, repeated itself 63 times to fill the 64 KiB I/O Ports Map. In other words, every mapped I/O Port had 63 aliases. At that moment this wasn't a problem, but as the need for more PMIO addressing capabilities became evident, dealing with aliases would cause some compatibility issues in later platforms that implemented the full 16 Bits address decoding for the I/O Address Space in both the Motherboard and the expansion cards.

It was possible to mix an IBM PC with cards that did 16 Bits decoding, but you had to be aware of all of your Hardware's address decoding capabilities and the possible configuration options, as there were high chances that the mappings overlapped. For example, in the case of an IBM PC with a card that decoded 16 Bits, while the card should theoretically be mappable anywhere in the 64 KiB I/O Address Space, if the chosen I/O Port range was above 1 KiB, you had to check whether it overlapped with any of the aliased addresses of the existing 10 Bits Devices, which I suppose should have been simple to figure out if you made a cheatsheet of the current 1 KiB I/O Port Map, then repeated it for every KiB to get the other 63 aliases, so that you knew which ranges were really unused. Likewise, it was possible to plug a 10 Bits expansion card into a platform that did 16 Bits decoding for its built-in support chips and other expansion cards, but the 10 Bits card would create aliases all over the entire I/O Address Space, so there were chances that its presence created a conflict with the mapping of an existing 16 Bits Device, which you had to resolve by moving things around.
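
That cheatsheet is easy to generate programmatically: under 10 Bits decoding, Address Bits 10-15 are simply ignored, so a Device at port P also responds at P plus every multiple of 400h. A minimal sketch in plain C, using COM1's well-known 3F8h base port as the example:

```c
#include <stdio.h>

/* Sketch of the aliasing created by 10 Bits I/O address decoding: a decoder
   that ignores Address Bits 10-15 responds to its port at every 1 KiB step
   of the 64 KiB I/O Address Space, i.e. 63 aliases besides the real port. */
int main(void)
{
    unsigned int port = 0x3F8;    /* The well-known COM1 base port */
    unsigned int n;

    for (n = 1; n < 64; n++)
        printf("Alias %2u of port %04X: %04X\n", n, port, port | (n << 10));
    return 0;
}
```
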
[addrbus]: http://www.minuszerodegrees.net/5150/misc/5150_motherboard_diagrams.htm
[10bits]: http://www.vcfed.org/forum/showthread.php?34938-PC-ISA-I-O-Address-Space-10-bits-or-16-bits


##### 2.7 - IBM PC Motherboard Clock Generation and Wait States

Another important aspect of the IBM PC is the clock speed that all the chips and Buses ran at. A common misconception people believe in is that when you buy a chip advertised to run at a determined Frequency, it automatically runs at that clock speed. The truth is that in most scenarios, a chip doesn't decide which clock speed it will run at; the manufacturer is simply saying that it rated the chip to run reliably up to that Frequency. What actually defines the clock speed that something will run at is the reference clock signal that it is provided with. The origin of a clock signal can be traced all the way back to a Reference Clock that typically comes from a Crystal Oscillator.

The reference clock can be manipulated with Clock Dividers or Clock Multipliers so that different parts of the platform can be made to run at different clock speeds, all of which are synchronous between them, since they are derived from the same source. It is also possible for different parts of the same platform to have their own Crystal Oscillators providing different reference clocks. A part of the platform that runs at the same clock speed derived from the same reference clock is known as a Clock Domain. A signal that has to transition from one Clock Domain to a different one must go through a Clock Domain Crossing. The complexity of a Clock Domain Crossing varies depending on whether a signal has to go from one Clock Domain to another one that is synchronous with it, because it is derived from the same reference clock, or to an asynchronous Clock Domain that is derived from a different reference clock source.

In the IBM PC, as can be seen in the [Clock Generation][clockgen] diagram of MinusZeroDegrees, everything was derived from the 14.31 MHz reference clock provided by a single Crystal Oscillator. This Crystal Oscillator was connected to the Intel 8284A Clock Generator, which used its 14.31 MHz reference clock as input to derive from it three different clock signals, OSC, CLK and PCLK, each with its own output Pin.
While this arrangement was functional in the IBM PC, it would cause many headaches later on, as things would eventually have to get decoupled...

**8284A OSC:** The first clock output was the OSC (Oscillator) line, which just passed the 14.31 MHz reference clock through intact. The OSC line wasn't used by any built-in Device on the IBM PC Motherboard; instead, it was exposed as the OSC Pin in the I/O Channel expansion slots. Pretty much the only card that used the OSC line was the CGA Video Card, which included its own clock divisor that divided the 14.31 MHz OSC line by 4 to get a 3.57 MHz TV compatible NTSC signal, so that the CGA Video Card could be directly connected to a TV instead of a computer Monitor. Actually, it is said that IBM chose the 14.31 MHz Crystal Oscillator precisely because it made it easy and cheap to derive an NTSC signal from it.

**8284A CLK:** The second clock output, and the most important one, was the CLK (System Clock) line, which was derived by dividing the 14.31 MHz reference clock input by 3, giving 4.77 MHz. Almost everything in the IBM PC used this clock: the Intel 8088 CPU, the Intel 8087 FPU, the Intel 8237A DMAC, the Buses, and even the expansion cards, as the I/O Channel expansion slots also exposed this line as the CLK Pin. Even if an expansion card had its own Crystal Oscillator, there would be a Clock Domain Crossing somewhere between the 4.77 MHz I/O Channel CLK and whatever the card used internally. Note that Intel didn't sell a 4.77 MHz 8088 CPU, 8087 FPU or 8237A DMAC; the IBM PC actually used 5 MHz rated models of those chips underclocked to 4.77 MHz, simply because that was the clock signal that they were getting as input.

**8284A PCLK:** Finally, the third clock output was the PCLK (Peripheral Clock) line, which was derived by dividing the previous CLK line by 2, giving 2.38 MHz. The Intel 8253 PIT used it, but not directly, since it first passed through a discrete clock divisor that halved it again, giving 1.19 MHz, which was the effective clock input of the PIT. Note that each Counter of the 8253 PIT had its own clock input Pin, but all of them were wired to the same 1.19 MHz clock line in parallel.

The Keyboard circuitry also used the PCLK line, but I never looked into that part, so I don't know its details. The Keyboard Port had a clock Pin, but I'm not sure whether it exposes the 2.38 MHz PCLK line or not. I'm not sure what uses this line on the Keyboard side, either. For reference, the 8048 Microcontroller inside the Model F Keyboard has an integrated Clock Generator that can use as clock input either a Crystal Oscillator, or a line coming straight from an external Clock Generator. A schematic for the Model F Keyboard claims that the 8048 uses a 5.247 MHz reference clock as input, yet I failed to identify a Crystal Oscillator in the photos of a disassembled Keyboard. I'm still not sure whether the 8048 in the Keyboard makes direct use of the Keyboard Port clock Pin or not, as both options are viable to use as a reference clock.
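
Summing up the whole divider chain in code form (plain C, just the arithmetic; the exact crystal Frequency is 14.31818 MHz):

```c
#include <stdio.h>

/* The whole IBM PC clock tree derived from the single 14.31818 MHz crystal:
   OSC passes it through, CLK divides it by 3, PCLK divides CLK by 2, and a
   discrete divisor halves PCLK again for the 8253 PIT. The CGA card divides
   OSC by 4 on its own to get the NTSC color carrier. */
int main(void)
{
    double osc  = 14318180.0;  /* Crystal reference, in Hz        */
    double clk  = osc / 3.0;   /* 4.77 MHz: CPU, FPU, DMAC, Buses */
    double pclk = clk / 2.0;   /* 2.38 MHz: peripheral clock      */
    double pit  = pclk / 2.0;  /* 1.19 MHz: the 8253 PIT input    */
    double ntsc = osc / 4.0;   /* 3.57 MHz: NTSC color carrier    */

    printf("OSC %.2f  CLK %.2f  PCLK %.2f  PIT %.2f  NTSC %.2f (MHz)\n",
           osc / 1e6, clk / 1e6, pclk / 1e6, pit / 1e6, ntsc / 1e6);
    return 0;
}
```
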
An important exception in the IBM PC clocking scheme was the RAM chips used as system RAM, as their functionality was not directly bound to any clock signal at all. The RAM chips of that era were of the asynchronous DRAM type. Asynchronous DRAM chips had a fixed access time measured in nanoseconds, which, in the case of those used in the IBM PC, was rated at 250 ns. There is a relationship between Frequency and time, as the faster the clock speed is, the less each individual clock cycle lasts. Sadly, I don't understand the in-depth details of how the entire DRAM Memory subsystem worked, like how many clock cycles a full memory operation took, or the important MHz-to-ns breakpoints, nor its relationship with the 8088 Bus Cycle, to know how fast the DRAM chips had to be at minimum for a target clock speed (for reference, the IBM PC Technical Reference Manual claims that the Memory access time was 250 ns with a Cycle time of 410 ns, while an 8088 @ 4.77 MHz had a 210 ns clock cycle with a fixed Bus Cycle of four clocks, for a total of 840 ns. On paper it seems that the 410 ns Cycle time of the Memory subsystem would have been good enough to keep up with the Bus Cycle of an 8088 running at up to 9.75 MHz, but I know that faster 8088 platforms had to use DRAM chips with a lower access time, so there is something wrong somewhere...). Basically, the important thing about asynchronous DRAM is that for as long as it was fast enough to complete an operation between the moment it was requested and the moment it was assumed to be finished, everything was good. Faster platforms would require faster DRAM chips (rated for lower ns, like 200 ns), while faster DRAM chips would work in the IBM PC but, from a performance standpoint, be equal to the existing 250 ns ones.

Albeit not directly related to the operating Frequency, a Hardware function that would later become prevalent is the insertion of Wait States. A Wait State is pretty much an extra do-nothing clock cycle, similar in purpose to the Processor NOP Instruction, that is inserted at the end of a CPU Bus Cycle. Adding Wait States was the typical solution when you had a slower component that wouldn't work at a higher Frequency. Wait States can be injected in varying amounts, and the rates can be configured depending on the type of operation, like for a Memory access or an I/O access (this is what makes a Wait State more useful than outright downclocking everything).

In the case of the IBM PC, it had no Wait States (also called Zero Wait States) for Memory accesses, but it had 1 Wait State for I/O accesses (the Technical Reference Manual explicitly mentions that an I/O cycle takes one additional 210 ns clock cycle beyond the standard 8088 840 ns Bus Cycle used for Memory accesses, for a total of 1050 ns in 5 clock cycles. For some reason, I can't find information that refers to this extra clock cycle as a Wait State, yet it seems that that is exactly what this extra cycle is). The 1 I/O WS would allow support chips and expansion cards a bit more time to recover before the next access. Later platforms that ran at higher Frequencies had user selectable Memory WS and I/O WS to allow some degree of compatibility with both slower DRAM chips and expansion cards, but at a significant performance cost.
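
Spelling out that cycle arithmetic (plain C; the numbers are the ones quoted from the Technical Reference):

```c
#include <stdio.h>

/* The Bus Cycle arithmetic: at 4.77 MHz each clock cycle lasts about
   1 / 4772727 Hz = ~210 ns, a Memory access takes the fixed 4 clocks Bus
   Cycle (~840 ns), and an I/O access adds 1 Wait State for 5 clocks
   (~1050 ns), matching the Technical Reference figures. */
int main(void)
{
    double clk_hz   = 14318180.0 / 3.0;  /* The 4.77 MHz CLK line */
    double cycle_ns = 1e9 / clk_hz;      /* ~210 ns per clock     */

    printf("Clock cycle                    : %6.1f ns\n", cycle_ns);
    printf("Memory access (4 clocks, 0 WS) : %6.1f ns\n", 4 * cycle_ns);
    printf("I/O access    (4 clocks + 1 WS): %6.1f ns\n", 5 * cycle_ns);
    return 0;
}
```
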
[clockgen]: http://www.minuszerodegrees.net/5150/misc/5150_motherboard_diagrams.htm


##### 2.8 - IBM PC Expansion Cards

Since there was pretty much no built-in I/O in the IBM PC Motherboard, to get the IBM PC 5150 to do anything useful at all, you had to insert specialized expansion cards into any of the 5 I/O Channel Slots. All the slots had the same capabilities, so it didn't matter in which slot you inserted a card. In other computers of the era, slots could have predefined purposes, since the address decoding logic for them was found in the Motherboards themselves, whereas in the IBM PC that logic was in the expansion cards. Initially, the card variety was extremely limited, but as the IBM PC ecosystem grew, so did the amount and types of cards.

As mentioned previously, the I/O Channel Slots exposed the I/O Channel Bus, which is itself a demultiplexed yet transparent extension of the CPU Local Bus, and also had Pins with lines directly wired to the Intel 8259A PIC IRQs and Intel 8237A DMAC DMA Channels. Because some Devices on the Motherboard were hardwired to a few of those, only the lines that were actually free were exposed in the slots: 6 Interrupt Request lines, namely IRQs 2-7, and 3 DMA Channels, namely 1-3. None of the 8253 PIT Counters' input or output lines were exposed.

The most critical discrete card was the Video Card, be it either MDA or CGA, both because you needed one to connect the IBM PC to a Monitor or TV so you could see the screen, and because without a Video Card, the IBM PC would fail during POST (there was nothing resembling a headless mode). Besides the RAM used as the video framebuffer, both Video Cards had an 8 KiB ROM with character fonts, but these were not mapped into the CPU Address Space, as they were for the internal use of the Video Cards. Each Video Card also had a goodie of sorts: MDA had a built-in Parallel Port, being perhaps the first example of a multifunction card, while CGA had a rather unknown header that was used to plug in an optional Light Pen, a very old peripheral that was used by pressing its tip on some location of the screen of a CRT Monitor, as if it were a touch screen. Finally, it was possible for an MDA and a CGA Video Card to coexist simultaneously in the same system, and it was even possible to make a Dual Monitor setup with them. Software support to actually use both screens simultaneously was very, very rare; typically, software defaulted to using only the active card and Monitor combo. Two Video Cards of the same type were never supported at all.

The second most important expansion card was the FDC (Floppy Disk Controller), so that you could attach 5.25'' Diskette Drives to the IBM PC. The original IBM FDC had an internal header for cabling either one or two drives inside the Computer Case, and an external Port for two more drives, for a total of 4. On the original IBM PC models at launch, the FDC and the internal 5.25'' Diskette Drives were optional, as IBM intended Diskettes to be the high end storage option while Cassettes took care of the low end, which is the reason why IBM included a built-in port on the Motherboard to connect to a Cassette Deck.
This market segmentation strategy seems to have failed very early in the life of the IBM PC, as Cassette-only models soon disappeared from the market. At the time of the IBM PC's release, the only compatible Diskettes had a 160 KB usable size after formatting. I have no idea how much usable capacity a Cassette had.

Other important expansion cards included those that added Serial and Parallel Ports (unless you used an MDA Video Card, which had a Parallel Port integrated), so that you could connect external peripherals like a Modem or a Printer. Memory expansion cards seem to have been rather common in the early days, as RAM costs plummeted while densities skyrocketed. Consider that at launch, the IBM PC officially supported only 256 KiB RAM using the 16KB - 64KB Motherboard fitted with all 64 KiB plus 3 64 KiB Memory expansion cards, which back then was a ton, yet in a few years getting the full 640 KiB that the computer Hardware supported became viable.

Besides the mentioned cards, the IBM PC expansion possibilities at launch day weren't that many. Assuming that you were using a typical setup with a CGA Video Card, a Diskette Drive and a Printer, you had just two free expansion slots left, and I suppose that more RAM would have been a popular filler. Later in the life of the platform, many multifunction cards appeared that allowed you to save a few expansion slots by packing multiple common Device types into the same card, thus making it possible to fit more total Devices into the slot starved IBM PC. A very well known multifunction card is the AST SixPack, which could have installed in it up to 384 KiB RAM, a Serial Port, a Parallel Port, and a battery backed RTC (Real Time Clock) that required custom Drivers to use (an RTC would later be included in the IBM PC/AT).

Worth mentioning is that the Mouse wasn't a core part of the PC platform until the IBM PS/2. Actually, the Mouse took a few years before it made its first appearance on the PC platform, plus a few more before becoming ubiquitous. A Mouse for the IBM PC would have required its own expansion card, too, be it a Serial Port card for a Serial Mouse, or a Bus Mouse Controller card for a Bus Mouse (the Bus Mouse isn't an example of a Device supporting the CPU Bus protocol that I mentioned when talking about the 8255 PPI, because protocol translation still happens on the controller card, as it would happen in a Serial Port card if it were a Serial Mouse, too. They don't seem different from the GPIO-to-CPU Bus interface of the 8255 PPI).

One of the most curious types of expansion cards that could be plugged into the IBM PC were the Processor upgrade cards. These upgrade cards had a newer x86 CPU and sometimes a Socket for an optional FPU, intended to fully replace the 8088 and 8087 of the IBM PC Motherboard. Some cards also had RAM in them (with their own Memory Controllers) because the newer Processors had wider Data Buses, so using the IBM PC I/O Channel Bus to get to the system RAM on the Motherboard or another Memory expansion card would limit the CPU to 8 Bits of Data and absolutely waste a lot of the extra performance. At some point they were supposed to be cost effective compared to a full upgrade, but you will eventually learn a thing or two about why they were of limited usefulness...

An example of a fully featured upgrade card that could be used in the IBM PC is the [Intel InBoard 386/PC][inboard], perhaps one of the most advanced ones. This card had a 16 MHz 80386 CPU with its own Crystal Oscillator, a Socket for an optional 80387 FPU, a local 1 MiB RAM using a 32 Bits Data Bus, and it also supported an optional Daughterboard for even more RAM. In recent times we have had conceptually similar upgrade cards like the [AsRock am2cpu][asrock], but these work only on specific Motherboards that were designed to accommodate them.

Upgrade cards had several limitations. One of them was that the cards that had RAM on them were intended to use it as a Conventional Memory replacement with a wider Data Bus for better performance, but, since the RAM on the Motherboard had inflexible address decoding logic, it was impossible to repurpose it by mapping that RAM somewhere else, with the end result that the Motherboard RAM had to be effectively unmapped and remain unused. Upgrade cards without RAM could use the Motherboard RAM as Conventional Memory, but that meant missing any performance increase from the newer CPU's wider Data Bus. Another issue is that the I/O Channel Slots didn't provide all the signal lines that were wired to the 8088 CPU on the Motherboard, like the INTR and INTA interrupt lines connected to the 8259A PIC, thus the upgrade cards had to use a cable that plugged into the 8088 CPU Socket (you obviously had to remove the CPU from the Motherboard, and the FPU, too) to route these to the new main Processor, making installing an upgrade card less straightforward than it initially appears to be.

As I failed to find enough detailed info about how the upgrade cards actually worked, I don't know whether the card used the cable connected to the 8088 Socket for I/O, or whether it instead used the expansion slot itself except for the unavailable lines that had to be routed through the cable, like the previously mentioned INTR and INTA. Regardless of these details, at least the RAM on the upgrade card should have been directly visible on the I/O Channel Bus, otherwise, any DMA to system RAM involving the 8237A DMAC would have been broken.

There were other cards that were similar in nature to the previously described upgrade cards, but instead of replacing some of the Motherboard core components with those sitting on the card itself, they added independent Processors and RAM. Examples of such cards include the [Microlog BabyBlue II][babyblue], which had a Zilog Z80 CPU along with some RAM. Cards like these could be used to run software compiled for other ISAs on these add-on Processors instead of the x86, and that is basically how emulation looked during the 80's: you couldn't really do it purely in software, it actually required dedicated Hardware to do the heavy lifting.

[inboard]: http://www.minuszerodegrees.net/manuals/Intel%20Inboard%20386_PC%20Manual.pdf
[asrock]: https://www.asrock.com/mb/spec/upgrade.asp?Model=am2cpu%20board
[babyblue]: http://retrocmp.de/hardware/babyblue2/babyblue2.htm


##### 2.9 - IBM PC BIOS Firmware (BIOS Services, Option ROMs)

Yet another major piece of the IBM PC was a software one: the firmware, which was stored in one of the Motherboard ROM chips. The IBM PC firmware was formally known as the BIOS.
The BIOS firmware was the very first code executed, and for that reason, the ROM had to be mapped at a specific location of the Memory Address Space, so that it could satisfy the 8088 CPU hard-coded startup behavior. The BIOS was responsible for initializing and testing most of the computer components during POST before handing control of the computer to user software. You can [read here][postbreakdown] a detailed breakdown of all the things that the BIOS did during POST before getting the computer into a usable state. Some parts of the BIOS remained always vigilant for user input, like [the legendary routine][ctrlaltdel] that intercepted the Ctrl + Alt + Del Key Combination.

The BIOS also provided a crucial component of the IBM PC that is usually underappreciated: The BIOS Services. The BIOS Services were a sort of API (Application Programming Interface) that the OS and user software could call via software Interrupts, as a middleware that interfaced with the platform Hardware Devices. As such, the BIOS Services could be considered like built-in Drivers for the computer. IBM actually expected that the BIOS Services could eventually be used as a HAL (Hardware Abstraction Layer), so that if the support chips ever changed, software that relied on the BIOS Services would be forward compatible. Although IBM strongly recommended that software developers use the BIOS Services, it was possible for applications to include their own Drivers to bypass them and interface with the Hardware Devices directly. Many performance hungry applications did exactly that, as the BIOS Services were very slow. Regardless of these details, the BIOS Services were a staple feature of the IBM PC.

Compared to later systems, there was no "BIOS Setup" that you could enter by pressing a Key like Del during POST, nor was there any non-volatile writable memory where the BIOS could store its settings. Instead, the Motherboard was outfitted with several DIP Switches, the most notable ones being SW1 and SW2, whose positions had hard-coded meanings for the BIOS, which checked them on every POST. This made BIOS configuration quite rudimentary in nature, as any change required physical access to the Motherboard.

The BIOS pretty much did no Hardware discovery on its own; it just limited itself to checking during POST for the presence of the basic Hardware Devices that the DIP Switches told it the computer had installed, thus it was very important that the DIP Switches were in the correct position, since there were many failure conditions during POST that involved the BIOS being unable to find a Hardware Device that it expected to be present. For example, the BIOS didn't scan the entire 640 KiB range of Conventional Memory to figure out how much system RAM it could find; it simply checked the position of the DIP Switches that indicated how much Conventional Memory the computer had installed, then just limited itself to testing whether that amount was physically present (You could use the Motherboard DIP Switches to tell the BIOS that the computer had less system RAM than it actually had and it would work; it failed only if the amount set was higher than what was physically installed).
The type of Video Card was also configurable via DIP Switches: you could use them to tell the BIOS whether you had an MDA or a CGA Video Card, and it would check that it was present and use it as the primary video output.

There were a few BIOS Services that allowed an OS or any other application to ask the BIOS what Hardware it thought the computer had installed. For example, INT 11h was the Equipment Check, which could be used by an application to determine if the computer was using an MDA or a CGA Video Card, among other things. There was also INT 12h, which returned the amount of Conventional Memory. A rather interesting detail of INT 12h is that it was the closest thing to a software visible Memory Map available during that era. Neither the BIOS, the OS nor the user applications knew what the Memory Map of the computer truly looked like; they just blindly used what they were hard-coded to know about, based on both the IBM defined Memory Map and any user configurable Drivers that could point to where in the Memory and I/O Address Spaces an expansion card was mapped.
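
To give an idea of how these BIOS Services were consumed, here is a minimal sketch of both calls, assuming a 16 Bits real mode DOS compiler like Turbo C, where int86() and union REGS come from <dos.h>. The bit layout of the returned equipment word is the documented INT 11h one, which mirrors the SW1 DIP Switch settings:

```c
#include <stdio.h>
#include <dos.h>    /* int86() and union REGS, as in Turbo C or MSC */

int main(void)
{
    union REGS r;
    unsigned equip, kib;

    /* INT 11h - Equipment Check: returns the equipment word in AX,
       whose bits mirror the Motherboard DIP Switch settings */
    int86(0x11, &r, &r);
    equip = r.x.ax;

    /* INT 12h - Memory Size: returns the Conventional Memory in KiB in AX */
    int86(0x12, &r, &r);
    kib = r.x.ax;

    printf("Equipment word: %04Xh\n", equip);
    if (equip & 1)  /* Bit 0 set means Diskette Drives are installed */
        printf("Diskette Drives: %u\n", ((equip >> 6) & 3) + 1);
    printf("Initial video mode bits: %u (3 = 80x25 MDA)\n", (equip >> 4) & 3);
    printf("Conventional Memory: %u KiB\n", kib);
    return 0;
}
```

Note how little the caller learns: a handful of bits and a KiB count, which is about everything an application could formally "discover" about an IBM PC.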
It was possible to upgrade the BIOS of the IBM PC, but not by flashing it. You were actually required to purchase a new ROM chip with the latest BIOS version preprogrammed into it, as the standard ROMs of that era weren't of the rewritable variety (And chances are that those that were would have needed an external reprogrammer, so at the bare minimum, you would have had to remove the BIOS ROM from the computer to rewrite it with a new image. It was not like modern in-situ flashing). There are three known BIOS versions for the IBM PC, and the last one (Dated 10/27/82) is rather important since it introduced a major feature: Support for loadable Option ROMs. This greatly enhanced what an expansion card could do.

When the IBM PC was first released, the BIOS had built-in support for pretty much all the existing Device types designed by IBM. The BIOS could check for the presence of, initialize, test and provide an interface to use most of IBM's own expansion cards or 100% compatible third party ones, but it could do nothing about Devices that it didn't know about. This usually was not an issue, since a Device that was not supported by the BIOS would still work when using either an OS Driver or an application that included a built-in Driver for that Device, which was the case with Sound Cards (Although these came much, much later). However, there were scenarios where a Device had to be initialized very early for it to be useful. For example, the earliest HDs (Hard Disks) and HDC (Hard Disk Controller) cards that could be used on the IBM PC were from third parties that provided Drivers for the OS, so that it could initialize the HDC and use the HD as a storage drive. As the BIOS had absolutely no knowledge about what these things were, it was impossible for the BIOS to boot directly from the HD itself, thus if you wanted to use a HD, you unavoidably had to first boot from a Diskette to load the HDC Driver. IBM should have noticed that the BIOS had this huge limitation rather quickly, and decided to do something about it.

The solution that IBM decided to implement was to make the BIOS extensible by allowing it to run executable code from ROMs located in the expansion cards themselves, effectively making the contents of these ROMs a sort of BIOS Driver. During POST, the BIOS would scan a predefined range of the UMA at 2 KiB intervals (The IBM PC BIOS first scans the 768 KiB to 800 KiB range expecting a VBIOS, then a bit later it scans 800 KiB to 982 KiB for any other type of ROM), looking for mapped memory with data in it that had an IBM defined header indicating that it was a valid executable Option ROM. Option ROMs are what allowed an original IBM PC with the last BIOS version to initialize a HDC card so that it could boot from a HD, or to use later Video Cards like EGA and VGA, as these had their initialization code (The famous VBIOS) in an Option ROM instead of expecting built-in BIOS support like the MDA and CGA Video Cards. While IBM could have kept releasing newer BIOS versions in preprogrammed ROMs that added built-in support for more Device types, it would have been a logistical nightmare if every new expansion card also required getting a new BIOS ROM. Moreover, such an implementation would have hit a hard limit rather soon due to the fixed Motherboard address decoding logic for the 8 KiB BIOS ROM chip, while an expansion card was free to use most of the UMA to map its Option ROM. Considering that we are still using Option ROMs, the solution that IBM chose was a very good one.
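
To make that scanning routine more concrete, here is a minimal sketch of the idea (this is not IBM's actual code), again assuming a 16 Bits DOS compiler where MK_FP() comes from <dos.h>. A valid header starts with the 55h AAh signature, the third Byte is the ROM length in 512 Byte blocks, and all the ROM Bytes must add up to 0 modulo 256 for the checksum to pass:

```c
#include <dos.h>    /* MK_FP() builds a Segment:Offset far pointer */

/* Walk a UMA range checking every 2 KiB boundary (0x80 paragraphs)
   for a valid Option ROM header */
static void scan_option_roms(unsigned startseg, unsigned endseg)
{
    unsigned seg;

    for (seg = startseg; seg < endseg; seg += 0x80) {
        const unsigned char far *rom = (const unsigned char far *) MK_FP(seg, 0);
        unsigned long size, i;
        unsigned char sum;

        if (rom[0] != 0x55 || rom[1] != 0xAA)
            continue;               /* No signature here, keep scanning */
        size = (unsigned long) rom[2] * 512;  /* Length Byte, 512 Byte units */
        for (sum = 0, i = 0; i < size; i++)
            sum += rom[i];
        if (sum != 0)
            continue;               /* Bad checksum, ignore this ROM */
        /* A real BIOS would now do a far call to the entry point at
           offset 3 of the header, letting the card initialize itself */
    }
}
```

For instance, scan_option_roms(0xC000, 0xC800) would cover the 768 KiB to 800 KiB range where the IBM PC BIOS expects to find a VBIOS.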
Related to Option ROMs, an extremely obscure feature of the IBM PC 5150 is that the previously mentioned empty ROM socket, known as U28, was already mapped into the 976 KiB to 983 KiB address range (Just below the IBM Cassette BASIC), and thus ready to use if you plugged a compatible 8 KiB ROM chip there. With the last BIOS version, a ROM chip fitted in U28 worked exactly as if it were an Option ROM built into the Motherboard itself instead of an expansion card, as the BIOS routine that scans for valid Option ROMs also scans that 8 KiB address range. So far, the only commercial product that I'm aware of that shipped a ROM chip that you could plug into the U28 Socket was a [Maynard SCSI Controller][maynardscsi], albeit I don't know what advantages it had compared to having the Option ROM in the expansion card itself. Some hobbyists also managed to make a [custom Option ROM][optionrom] for debugging purposes. Since this empty ROM socket is present in both IBM PC Motherboard versions and predates the loadable Option ROMs feature introduced by the last BIOS version, I don't know what its original intended use was supposed to be, or whether the previous BIOS versions supported U28 as an Option ROM and the last BIOS simply extended the scheme to around half of the UMA range.

Last but not least, after the BIOS finished the POST stage, it had to boot something that allowed the computer to be useful. Without Option ROMs, the IBM PC BIOS knew how to boot from only two types of storage media: Diskettes, via the FDC and the first Diskette Drive, or the built-in IBM Cassette BASIC, also stored in ROM chips on the Motherboard like the BIOS firmware itself.

If your IBM PC couldn't use Diskettes because you had no FDC or Diskette Drive, or there was no Diskette inserted in the Diskette Drive, the BIOS would boot the built-in IBM Cassette BASIC. This piece of software isn't widely known because only genuine IBM branded computers included it. Perhaps the most severe limitation of the IBM Cassette BASIC was that it could only read and write to Cassettes, not Diskettes, something that should have played a big role in how quickly it was forgotten. To begin with, Cassettes were never a popular type of storage media for the IBM PC, so I doubt that most people had a Cassette Deck and blank Cassettes ready to casually use the IBM Cassette BASIC. With no way to load or save code, everything was lost when you rebooted the computer, so its usefulness was very limited.

The other method to boot the IBM PC was with Diskettes. Not all Diskettes could be used to boot the computer, only those that had a valid VBR (Volume Boot Record) were bootable. The VBR was located in the first Sector (512 Bytes) of the Diskette, and stored executable code that could bootstrap another stage of the boot process. Besides the bootable Diskettes with the OSes themselves, there were self contained applications and games known as PC Booters that had no reliance on an OS at all; these could be booted and used directly from a Diskette, too.
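
The mechanics behind booting were minimal: the BIOS bootstrap routine (INT 19h) read that first Sector into memory at the fixed address 0000:7C00h and jumped to it, blindly trusting that whatever was loaded was executable code. As a rough sketch of the read itself, done through the BIOS Disk Service that the bootstrap relied on, and again assuming a 16 Bits DOS compiler with int86x() in <dos.h> (a real program would read into its own buffer, not into 7C00h):

```c
#include <dos.h>    /* int86x(), union REGS, struct SREGS */

/* Read the first Sector of the first Diskette Drive using the BIOS
   Disk Service (INT 13h, AH = 02h). The INT 19h bootstrap did just
   this with 0000:7C00h as the destination, then jumped there. */
static int read_boot_sector(void far *buffer)
{
    union REGS r;
    struct SREGS s;

    r.h.ah = 0x02;              /* Function 02h: Read Sectors */
    r.h.al = 1;                 /* Read one Sector (512 Bytes) */
    r.h.ch = 0;                 /* Cylinder 0 */
    r.h.cl = 1;                 /* Sector 1 (Sector numbers start at 1) */
    r.h.dh = 0;                 /* Head 0 */
    r.h.dl = 0x00;              /* Drive 00h: first Diskette Drive */
    s.es   = FP_SEG(buffer);    /* ES:BX points to the destination */
    r.x.bx = FP_OFF(buffer);
    int86x(0x13, &r, &r, &s);
    return r.x.cflag ? -1 : 0;  /* The Carry Flag signals an error */
}
```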
[postbreakdown]: http://www.minuszerodegrees.net/5150/post/5150%20-%20POST%20-%20Detailed%20breakdown.htm
[ctrlaltdel]: http://www.os2museum.com/wp/ctrl-alt-del-myths/
[maynardscsi]: http://www.vcfed.org/forum/showthread.php?50195-Trying-to-get-old-IBM-PC-running
[optionrom]: http://www.vcfed.org/forum/showthread.php?38377-quot-Fun-with-BIOS-quot-Programming-Thread&p=291834#post291834


##### 2.10 - IBM PC Operating Systems (PC DOS, PC Booters) and user experience

The last piece of the IBM PC was the Operating System. While the OS is very relevant to the platform as a whole functional unit, it is not part of the Hardware itself (The firmware is software, yet it is considered part of the Motherboard, as it comes in a built-in ROM with code that is highly customized for the Hardware initialization of that specific Motherboard model. However, the IBM Cassette BASIC was also in the Motherboard ROM, but it can actually be considered some sort of built-in PC Booter). The Hardware platform is not bound to a single OS, nor is it guaranteed that a user with a specific Hardware platform uses or even has an OS, either.

What made OSes important were their System Calls. System Calls are software Interrupts similar in style and purpose to the BIOS Services, as both are APIs used to abstract system functions from user applications. The major difference between them is that the BIOS Services rely on firmware support, so they are pretty much bound to the Hardware platform, while System Calls rely on the OS itself. Since at the time of the IBM PC launch it was very common for the same OS to be ported to many Hardware platforms, an application that relied mainly on the System Calls of a specific OS was easier to port to another platform that already had a port of that OS running on it. I suppose that there was a lot of overlap between BIOS Services and System Calls that did very similar or even identical things. I imagine that it is also possible for a System Call to have a nested BIOS Service call, so the overhead of some functions could have been rather high (the sketch at the end of this section shows such an overlapping pair).

There were two OSes available at the launch date of the IBM PC, and some more appeared later. Regardless of the alternatives, the most emblematic OS for the IBM PC was PC DOS. PC DOS was developed mainly by Microsoft, who kept the right to license it to third parties. Eventually, Microsoft would start to port PC DOS to other non-IBM x86 based platforms under the name of MS-DOS.

The first version of PC DOS, 1.0, was an extremely dull, simple and rudimentary OS. PC DOS had no multitasking capabilities at all; it could only run a single application at a time. Besides having its own System Calls, known as the DOS API, it also implemented the System Calls of CP/M, a very popular OS from the previous generation of platforms with 8 Bits Processors, making it sort of compatible. The intent was to make it easier to port CP/M applications to the IBM PC by giving the developer a familiar OS interface to work with, so that he could instead focus on the new Hardware (Mainly the CPU, as it was a different ISA than the previous 8 Bits ones like the Intel 8080/8085 CPUs). However, as far as I know, the CP/M System Calls of PC DOS were barely used, and pretty much entirely forgotten after PC DOS took over the vast majority of the OS market share.

Perhaps the most relevant and exclusive feature of PC DOS was that it had its own File System for Diskettes, FAT12 (File Allocation Table. Originally it was merely named FAT, but it has been retconned). A File System is a data organization format that, with the aid of metadata, defines how the actual user data is stored onto a storage media. The DOS API provided an interface that software could use to easily read and write to Diskettes formatted in FAT12, greatly simplifying the development of user applications that had to store files. I have no idea if PC DOS was able to effectively use Cassettes on its own as a Diskette alternative, it is a quite unexplored topic.

In the early days of the IBM PC, it seems that only office and productivity software that had to deal with data storage on Diskettes used PC DOS, as they could rely on the DOS API to use FAT12. If the developers didn't need sophisticated storage services, or could do their own implementation of them (Albeit such an implementation may not necessarily be FAT12 compatible, thus not directly readable by PC DOS) along with any other required PC DOS functionality, it was more convenient to do so and make the application a self contained PC Booter that relied only on the BIOS Services, since using the DOS API meant that you had to own and boot PC DOS first, and lose some valuable RAM that would be used by it. Basically, as ridiculous as it sounds now, developers actively tried to avoid using the IBM PC main OS if storing data was not a required feature. And don't even get started with the amount of things that you had to be careful with to not break basic DOS functionality itself, like separately [keeping track of the System Time][dostimer].
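
As a concrete example of the overlap mentioned at the beginning of this section, here are the two layers doing the same job, putting text on the screen, side by side. This is a minimal sketch assuming a 16 Bits DOS compiler where int86(), int86x(), FP_SEG() and FP_OFF() come from <dos.h>:

```c
#include <dos.h>    /* int86(), int86x(), FP_SEG(), FP_OFF() */

/* Writing text through the BIOS Video Service */
static void putc_bios(char c)
{
    union REGS r;
    r.h.ah = 0x0E;      /* INT 10h, AH = 0Eh: Teletype Output */
    r.h.al = c;
    r.h.bh = 0;         /* Display page 0 */
    int86(0x10, &r, &r);
}

/* Writing text through the DOS API instead */
static void print_dos(const char far *s)    /* '$' terminated, DOS style */
{
    union REGS r;
    struct SREGS sr;
    r.h.ah = 0x09;      /* INT 21h, AH = 09h: Print String at DS:DX */
    sr.ds  = FP_SEG(s);
    r.x.dx = FP_OFF(s);
    int86x(0x21, &r, &r, &sr);
}
```

When PC DOS prints the string, it typically ends up going through the same BIOS video path that the first function calls directly, which hints at both the redundancy between the two APIs and why performance hungry software skipped them altogether.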
Other than the importance of the FAT12 File System, there isn't much else to say about the humble beginnings of PC DOS, it was really that plain. This impression only gets worse the moment you discover that before the history of PC DOS even began, Microsoft was already selling a full fledged UNIX based OS, Xenix, for other non-x86 platforms.

One of the things that I was always curious about is what the daily usage of an IBM PC was like from the perspective of an early adopter. Information about the IBM PC 5150 user experience in 1981 is scarce, but based on what I could muster from digital archivists' blog posts and such, the limited amount of expansion cards and user software means that there weren't a lot of things that you could do with one right at the launch date. That makes it somewhat easy to hypothesize about how the IBM PC could have been used by an average user...

The lowest end model of the IBM PC had only 16 KiB RAM, the MDA Video Card, and no FDC nor Diskette Drive. The only thing that this setup could do was to boot the built-in IBM Cassette BASIC. As previously mentioned, the IBM PC Motherboard supported connecting a Cassette Deck directly to it, so it was possible to use Cassettes as the storage media for the IBM Cassette BASIC environment with just an external Cassette Deck and the base IBM PC unit. You could also use the IBM Cassette BASIC with no Cassette Deck at all, but that would make it an entirely volatile environment, as it would have been impossible to save your work, losing whatever you had done when the computer was shut down or rebooted. IBM intended this particular model to compete against other contemporary personal computers that also used Cassettes, aimed at people that were willing to pay a much higher premium for the IBM branding. Supposedly, the Cassette-only IBM PC 5150 was so popular that factory boxed units lasted only a few months in the market before vanishing, never to be seen again.

A mainstream IBM PC would have 64 KiB RAM (Maxing out the first 16KB - 64KB Motherboard) with an FDC and a single 5.25'' Diskette Drive. In addition to the IBM Cassette BASIC, this setup would also have been able to boot from Diskettes with a valid VBR, like standalone PC Booter applications and games, or proper OSes. While PC DOS was the most popular IBM PC OS, it didn't come by default with the system, it was an optional purchase, yet most people got it as it opened the door to using any application that had to be executed from within the DOS environment.

Booting PC DOS from a Diskette should have been an experience similar to booting a modern day Live CD on a computer with no Hard Disk, with the major difference being that every application that you may have wanted to use was on its own Diskette, so you had to disk swap often. Due to the fact that RAM was a very scarce and valuable resource, the whole of PC DOS didn't stay resident in RAM, only the code relevant for the DOS API did, leaving an application able to freely overwrite the rest. Because of this, if you executed then exited a DOS application, you typically also had to disk swap back to the PC DOS Diskette to reload it, then disk swap once more to whatever other application you wanted to use.

While it is arguable that people didn't multitask between applications as often as we do now, the entire disk swapping procedure made using the system rather clumsy. Actually, it could have been worse than that, since many applications could require another Diskette to save data, so if you were ready to quit an application to use another one, you may have had to first disk swap to a blank Diskette to save the current data, probably disk swap to the application Diskette again so that it could reload itself (In the same way that the whole of PC DOS didn't stay in RAM all the time, an application Diskette had to be accessed often to load different parts of it, as code that was already in RAM got continuously overwritten), exit, then disk swap to the PC DOS Diskette. It is not surprising that PC Booters were the favoured format for anything not really requiring the DOS API services until fixed storage became mainstream, as the amount of disk swapping must have been quite painful.

A high end setup of the IBM PC would include an FDC with two 5.25'' Diskette Drives. The advantage of this setup was that it massively reduced the daily disk swapping, as applications that couldn't save data to the same Diskette that they were loaded from should have been able to write the data to the other Diskette Drive. You could also do a full disk copy at once, something that was impossible to do with only a single Diskette Drive before the advent of fixed storage media, unless you had enough RAM to make a RAMDisk to hold the data (Keep in mind that early Diskettes were only 160 KB worth in size, so using three 64 KiB Memory expansion cards could have been a viable, if expensive, alternative to another Diskette Drive).

[dostimer]: http://expiredpopsicle.com/2017/04/13/DOS_Timer_Stuff.html


##### 2.11 - Definition of platform

After fully describing the IBM PC 5150, we can get to the point where it is possible to explain what a computer platform is. Since by now you know about nearly all the components present in the IBM PC and their reasons to be there, you should have already figured out by yourself all the things that the definition of platform involves. A computer platform can be defined, roughly, as all of the previously explained things considered as part of a whole functional unit. There are subdivisions like the Hardware platform, which focuses only on a minimum fixed set of physical components, and the system software, which includes the firmware and the OS; the first can be upgraded or extended, while the latter is not even a required component at all for the existence of standalone user software, albeit it makes such software easier to develop.

How much the components of a computer platform can vary while still being considered the same platform is something quite constrained by the requirements of the software that will run on it. The user software can make a lot of fixed assumptions about both the Hardware platform and the system software, so if you want these applications to work as intended, the computer must provide an environment that fully satisfies all those assumptions. On the Hardware side, the most important part is the Processor due to Binary Compatibility, so that it can natively run executable code compiled for its specific ISA.
Following the Processor, you have the support chips and basic Devices like the Video Card, all of which can be directly interfaced with by applications (This statement was true for 80's and early 90's platforms at most. In our modern days you always go through a middleware API, like DirectX, OpenGL or Vulkan in the case of Video Cards, or some OS System Call for everything else), as they can be found at a fixed, or at least predictable, place in the Memory Map. On the system software side, you have the firmware and the OS, both of them providing software Interrupts with a well-defined behavior that are intended to ease how applications interface with the Hardware. An application could assume that most of these things were present and behaved in the exact same way in all the systems it would be run on, so any unexpected change could break some function of the application, or cause it to not work at all.

When you consider the IBM PC as a whole unit, you can think of it as a reference platform. As the user software could potentially use all of the IBM PC platform features, it set the minimum of things that had to be present in any system that expected to be able to run applications intended for an IBM PC. There could be optional Devices that could enhance an application's functionality, but these weren't a problem, because user software didn't assume that an optional Device was always available. For example, during the early 90's, not everyone had a Sound Card, but everyone had a PC Speaker. Games typically supported both, and albeit the PC Speaker capabilities aren't even remotely comparable to a Sound Card, at least you had some low quality sound effects instead of being completely mute.

There were other platforms based on x86 Processors that were quite similar to the IBM PC from a component standpoint, but barely compatible with its user software. For example, in Japan, NEC released its PC-98 platform a bit more than a year after the IBM PC, and from the core Hardware perspective, they have a lot in common. The first PC-98 based computer, the NEC PC-9801, had an Intel 8086 CPU that required a wider Bus and a different type of expansion slots to accommodate its 16 Bits Data Bus, but it was otherwise functionally equivalent to the 8088. The support chips included an Intel 8253 PIT, an 8237A DMAC and two cascaded 8259A PICs, so it can be considered that the core components of the platform were around halfway between the IBM PC and the IBM PC/AT. Microsoft even ported MS-DOS to the PC-98 platform, so the same System Calls were available on both platforms.

However, the NEC PC-98 platform had substantial differences with the IBM PC: The Devices weren't wired to the 8237A DMAC and the 8259A PIC support chips in the same way as on the IBM PC, so the standard DMA Channel and IRQ assignments for them were different. The Video Card was completely different, since it was designed with Japanese characters in mind, thus anything from the IBM PC that wanted to use it directly, typically games, would not work. The PC-98 also had a firmware that provided its own software Interrupts, but they were not the same as those of the IBM PC BIOS, so anything that relied on the IBM PC BIOS Services would fail to work, too.
The Memory Map of the PC-98 was similar in that the Memory Address Space was also partitioned in two, with the lower 640 KiB reserved for system RAM and the upper 384 KiB for everything else, but the upper section, which on the IBM PC was known as the UMA, was almost completely different. In practice, the only IBM PC user software that had any chance to work on the PC-98 were PC DOS applications executed under PC-98 MS-DOS that relied only on the DOS API, were console based and otherwise very light on assumptions; anything else needed to be ported.

From the platforms that were partially compatible with the IBM PC, a very notable one came from IBM itself: The IBM PCjr 4860, released in March 1984. The IBM PCjr was a somewhat cut down version of the IBM PC that was targeted at the home user instead of the business user. While the PCjr firmware was fully BIOS compatible, it had some Hardware differences with the PC. The IBM PCjr Video Card [wasn't fully CGA compatible][pcjrcga], as it had just a partial implementation of the CGA registers, so any game that tried to directly manipulate the registers that weren't implemented would not function as expected. Games that instead used the BIOS Services for graphics mode changes worked, making them a good example of how the BIOS Services were doing the job of a HAL when the underlying Hardware was different. The IBM PCjr was also missing the Intel 8237A DMAC, which means that all memory accesses had to go through the Processor, as they couldn't be offloaded to a DMA Controller support chip like on the IBM PC. This caused some applications to not work as expected, since they couldn't perform some operations simultaneously. There were many more differences, but the important point is that due to them, around half of the IBM PC software library didn't run on the IBM PCjr, which is perhaps the main reason why it failed in the market.

Yet another example of a platform is an IBM PC that had the Intel InBoard 386/PC upgrade card mentioned a while ago installed. The upgrade card allowed an IBM PC to enjoy the new Intel 80386 CPU performance and features; however, not all software requiring a 386 would work with it. For example, a 386 allowed Windows 3.0 to use the 386 Enhanced Mode, but Windows 3.0 assumed that if you had a 386 Processor, you also had an IBM PC/AT compatible platform. An IBM PC with an InBoard 386/PC is a PC class 386, which is a rather unique combination. In order to make the upgrade card useful, Intel and Microsoft took the extra effort of collaborating to develop [a port][win3enhanced] of Windows 3.0 386 Enhanced Mode for the IBM PC platform. This shortcoming essentially made the whole Processor upgrade card concept a rather incomplete idea, it was simply too much effort.

Among the most amazing engineering feats, I should include all the efforts made to attempt to achieve cross compatibility between completely different computer platforms. Because in those days full software emulation was not practical due to a lack of processing power, the only viable way to do so was to throw in dedicated Hardware on an expansion card, which was still cheaper than a full secondary computer.
With the help of software similar in purpose to a Virtual Machine Monitor, which arbitrated the shared host resources and emulated what was missing from the guest platform, you could execute software intended for a completely different platform on your IBM PC. Examples include the previously mentioned Microlog BabyBlue II, which had a Zilog Z80 CPU so that it could natively run code from CP/M applications, or the [DCS Trackstar][trackstar] and Quadram Quadlink, which included a MOS 6502 CPU, their own RAM, and a ROM with an Apple II compatible firmware so that they could run Apple II applications.

Some computers went dramatically further and attempted to fuse two platforms into one, including a very high degree of resource sharing, as both were considered part of the same base platform instead of a simple add-on expansion card. These unique computers include the DEC Rainbow 100, which was both IBM PC compatible and could run CP/M applications thanks to having both an Intel 8088 CPU and a Zilog Z80 CPU, but in a more tightly integrated relationship than an IBM PC with a BabyBlue II installed due to some specific resource sharing, or the Sega TeraDrive, an IBM PC/AT and Sega Mega Drive (Genesis) hybrid with an Intel 80286 CPU, a Zilog Z80 CPU, and a Motorola 68000 CPU. I'm even curious whether someone ever attempted to emulate them due to their unique features...

[pcjrcga]: http://nerdlypleasures.blogspot.com/2013/10/the-pcjr-and-cga-compatibility.html
[win3enhanced]: http://www.vcfed.org/forum/showthread.php?48312-Intel-Inboard-386-PC-Extremely-Rare-copy-of-Windows
[trackstar]: https://www.diskman.com/presents/trackstar/


##### 2.12 - The first IBM PC clones

Some of the earliest rivals that the IBM PC had to face in the personal computer market were other computers extremely similar to the IBM PC itself, but based on different platforms. Most of these PC-like platforms were x86 based (Some even used the Intel 8086 CPU instead of the 8088) and also had their own ported versions of MS-DOS and other user software, making them similar to the previously mentioned NEC PC-98, but closer to the IBM PC, and thus far more compatible.

However, as the IBM PC popularity grew, its software ecosystem did, too. Soon enough, it became obvious to any computer manufacturer that it would be impossible to break into the personal computer market with yet another different computer platform that would require its own software ports of almost everything; it was already hard enough for the partially compatible platforms that were already in the market to stay relevant, let alone to keep introducing more. Thus, the idea of non-IBM computers that could run, out of the box, the same user software as the IBM PC with no modifications at all, and even use the same expansion cards, became highly popular...

The open architecture approach of the IBM PC made cloning the Hardware side of the computer ridiculously easy, as all the chips could be picked off the shelf, while the IBM PC 5150 Technical Reference Manual had extensive diagrams documenting how they were interconnected at the Motherboard level. Microsoft would happily license MS-DOS and a Diskette version of Microsoft BASIC to other computer manufacturers, too. There was a single showstopper: The IBM PC BIOS.
While IBM openly provided the source code for it, it was proprietary, making it the only thing that kept the other computer manufacturers from being able to fully clone the IBM PC. In fact, some of the earliest makers of IBM PC clones got sued by IBM since they outright used the IBM PC BIOS source code that IBM published.

Eventually, a few vendors with good lawyers (Compaq being the most famous and successful) figured out that it was possible to make a legal BIOS replacement that was functionally identical to the IBM PC one, as long as they used a clean room design procedure (Basically, someone had to reverse engineer the BIOS to document what it did, then another developer, who had absolutely no contact with anything related to it, had to reimplement the same functionality, so as to not accidentally use any of the IBM code). This opened the door for the first legal IBM PC clone computers...

The first wave of fully legal IBM PC clone computers was spearheaded by the launch of the Columbia Data Products MPC 1600 in June 1982, less than a year after the original IBM PC. It got eclipsed by the Compaq Portable launched in March 1983, which is far better known. Soon after, there was a great breakthrough when, in May 1984, Phoenix Technologies made its own legal BIOS replacement available to other computer manufacturers at an affordable license cost (You can [read about the original Phoenix BIOS author's experiences][phoenixbios]). This caused a flood of competition from new manufacturers, as by that point, anyone with some capital could set up a workshop to build computers with the same (Or compatible) Hardware as that of the IBM PC, the PC DOS compatible MS-DOS from Microsoft, and a firmware functionally equivalent to the IBM PC BIOS from Phoenix Technologies. All these clones were able to run almost all the software designed for the IBM PC, at a much lower price than what IBM charged for its systems.

The only thing missing from clones was the IBM Cassette BASIC, as IBM had an exclusive licensing deal with Microsoft for a built-in ROM version of BASIC (Amusingly, there were a few rare versions of MS-DOS intended to be loaded from a built-in ROM that were used by some clone manufacturers). This wasn't critical for general IBM PC compatibility, except for some specific BASIC applications that expected to read code from the fixed addresses reserved for the ROMs of the IBM Cassette BASIC. Still, for some reason, many clone Motherboards had empty spaces to plug in the 4 ROM chips that made up the IBM Cassette BASIC, but these were only rarely used. You could either remove the original ROM chips from an IBM PC, or just make a pirate copy in programmable ROMs.

Clone computers are the very reason why the IBM PC platform was so successful. IBM prices were always very high, since it was aiming mostly at profitable business and enterprise customers that paid for the brand (Remember the famous "No one got fired for buying IBM" motto?); it was the clone manufacturers who were directly fighting against the other partially compatible PC-like platforms (And themselves) based on price and specifications.
Clones dramatically helped to increase the size of the installed user base of IBM PC compatible computers, snowballing the growth of its software ecosystem at the cost of the other platforms, and consequently, driving even more sales of IBM PC compatible computers. It is possible to say that the clones made all the PC-like platforms go extinct.

[phoenixbios]: https://www.quora.com/What-is-the-coolest-thing-you-have-ever-created-alone-as-a-programmer/answer/Ira-J-Perlow?share=1


3 - The IBM PC/XT 5160 and the Turbo XT clones
----------------------------------------------

IBM, eager to continue the success of the IBM PC, released the IBM PC/XT 5160 in March 1983. The PC/XT tackled the main weakness of the IBM PC, which was that its 5 expansion slots were far from enough as the add-on ecosystem began to grow. On top of that, it added an internal Hard Disk plus a HDC (Hard Disk Controller) card as out-of-the-box components, albeit at the cost of a Diskette Drive. In later models the Hard Disk would become entirely optional, being able to run two internal Diskette Drives like the previous IBM PC did.

From the platform perspective, the PC/XT is barely interesting, as it had no core changes over the IBM PC 5150. It still had the same Processor, the same support chips, the same Clock Generation, and the exact same performance. It had a slightly different Memory Map, but nothing major. In essence, the IBM PC/XT was pretty much a wider version of the original IBM PC with some minor enhancements, but otherwise functionally almost identical to it. In fact, most of the things that could be added to it would also work in the previous PC. There is a single thing that the IBM PC/XT actually removed: The Cassette Port and its associated circuitry, perhaps to the grief of no one.

There were two different Motherboards used in the IBM PC/XT: The original one, known as the 64KB - 256KB, and a later one released around 1986, known as the 256KB - 640KB. The main difference is obviously the amount of installed RAM that they supported. There were also a few minor revisions that tried to solve some quirks involving the Intel 8237A DMAC, a premonition of things to come.


##### 3.1 - IBM PC/XT Motherboard changes

Most of the big changes of the IBM PC/XT were found in its Motherboard. The easiest thing to notice is that the new Motherboard had 8 I/O Channel Slots instead of the 5 available in the original PC, which was a major upgrade considering how slot starved the original IBM PC was. To accommodate more slots, the spacing between them was reduced compared to the IBM PC. This is one of the greatest legacies of the PC/XT, as its slot spacing became a de facto standard, and would eventually be formalized for Motherboard and Computer Case Form Factors like ATX, so we are still using the PC/XT slot spacing today.

A curious detail regarding the expansion slots is that one of them, Slot 8, behaved differently from all the other ones. As can be seen in the [Address Bus and Data Bus diagrams][motherboarddiag] at MinusZeroDegrees, for some reason, Slot 8 was wired to the External I/O Channel Bus instead of the main segment.
Moreover, Slot 8 repurposed Pin B8 of the I/O Channel Slot, which in the IBM PC 5150 was marked as Reserved, as CARD SLCTD (Card Selected), and it was expected that any card inserted into Slot 8 made use of that line. Because of this, an expansion card had to explicitly support Slot 8 behavior in order to work if installed there. Many IBM PC/XT era expansion cards typically had a Jumper to select either standard or Slot 8 specific behavior, so they could work in either. There has been some speculation about what IBM's idea behind Slot 8 was. So far, there seems to exist at least one card that ONLY works in Slot 8, which IBM used in a PC/XT variant known as the 3270 PC, which had some expansion cards bundled with it so that it could emulate an IBM 3270 Terminal (Similar in nature to the Microlog BabyBlue II, DCS Trackstar and Quadram Quadlink mentioned previously) to interface with IBM mainframes.

While at the core the support chips remained the same, the Intel 8255 PPI had several GPIO rearrangement changes. It was still being used as part of the Keyboard and PC Speaker subsystems, and their GPIO Pins remained the same for backwards compatibility reasons, but many others were rearranged. Port A, being fully used by the Keyboard interface, remained completely intact; the changes were among the Port B and Port C GPIO Pins. The most notable thing related to those changes is the removal of anything related to the Cassette interface, as the Cassette Deck Port and its support circuitry completely vanished, so it was now impossible to plug a Cassette Deck into the PC/XT. Given how popular the Cassette-only IBM PC models were known to be, I wonder if someone actually missed it...

The Motherboard still included the IBM Cassette BASIC in ROM and you could still boot into it. However, it was the exact same version as the previous IBM PC one, which means that it still lacked Diskette support. Since Cassettes were now impossible to use, the BASIC environment was absolutely volatile. Keeping the IBM Cassette BASIC in the PC/XT wasn't entirely useless, because the Diskette versions of BASIC that Microsoft provided for genuine IBM PC computers were not standalone; instead, they relied on reading code from the Motherboard ROM. I suppose that this method could save some RAM, as the BASIC environment could get part of its executable code directly from the always available ROM instead of having to waste RAM loading that data from the Diskette. This gimmick became pretty much useless as the amount of RAM that systems had installed continued to grow.

The Memory Map of the PC/XT saw a few changes in the UMA that bridge the gap between the PC and the next platform, the PC/AT. Whereas the original IBM PC had a 16 KiB chunk just above the Conventional Memory (640 KiB to 655 KiB) marked as reserved, the PC/XT unifies it with the next segment for video framebuffers, which includes MDA and CGA, to make a full 128 KiB block (640 KiB to 767 KiB. Note that the IBM PC/XT Technical Reference from April 1983 seems to merely mark it as reserved instead of for video framebuffer purposes as in the IBM PC Technical Reference).
At the end of it, there is a 192 KiB block intended for Option ROMs (768 KiB to 959 KiB), which is free except for the predetermined allocation of an 8 KiB chunk for the Option ROM of the new HDC (Hard Disk Controller) card (800 KiB to 807 KiB. Note that this allocation was retconned for the original IBM PC, as it also appears in the April 1984 version of its Technical Reference). Finally, the second 16 KiB chunk that was marked as reserved in the IBM PC is now unified with the range reserved for the Motherboard ROMs, making it a 64 KiB block (960 KiB to 1024 KiB). In summary, the UMA consists of 128 KiB for video framebuffers, 192 KiB for Option ROMs, and 64 KiB for Motherboard ROMs.

About physical characteristics, as previously mentioned, there were two major PC/XT Motherboard versions: The first one was the 64KB - 256KB, and the second one was the 256KB - 640KB, which was released around 1986. The IBM PC/XT 5160 Technical Reference mentions that the Motherboard physical dimensions are 8.5'' x 12'', which is approximately 21.5 cm x 30.5 cm, and seems to apply to both Motherboards. That should make them identical in size to the second version of the IBM PC 5150 Motherboard. However, I have seen at least one source ([Upgrading and Repairing PCs][upgradingpcs] by Scott Mueller) claiming that a PC/XT Motherboard measures 8.5'' x 13''. Yet, as the same sentence says that the IBM PC Motherboards are the same size as the PC/XT one, I suppose that that measurement is inherently wrong, since if the PC/XT Motherboard were longer, it would directly contradict both versions of the IBM PC 5150 Technical Reference...

The two PC/XT Motherboards had a few differences. The first, and rather obvious, difference is how much system RAM could be installed into the Motherboard itself, as the first one maxed out at 256 KiB RAM while the second one could fit the full 640 KiB of Conventional Memory. The latter is rather curious, since it had asymmetric RAM Banks (Two Banks had 256 KiB RAM each and the other two had 64 KiB each, so they had to use different DRAM chip types), whereas the first had the same arrangement as the 64KB - 256KB IBM PC Motherboard.

Something worth mentioning is that on the PC/XT Motherboards, all the RAM chips are socketed, making troubleshooting a dead Motherboard easier than it was on the IBM PC, since you can swap the Bank 0 DRAM chips for known good ones without having to desolder. Yet another miscellaneous but still interesting change related to RAM is that the PC/XT, in order to determine how much system RAM was installed in the computer, had a Conventional Memory scanning routine in the BIOS that was executed during POST, so it pretty much autodetected it. This is in contrast to the IBM PC, whose firmware just checked the position of a secondary block of DIP Switches and stopped testing the Conventional Memory when it reached the set amount.
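
I couldn't find the exact algorithm that the PC/XT BIOS uses, but the general idea behind this kind of memory sizing routine is easy to sketch. The following is an illustration of the concept rather than IBM's actual code, assuming a 16 Bits DOS compiler where MK_FP() comes from <dos.h>:

```c
#include <dos.h>    /* MK_FP() */

/* A hypothetical sizing probe in the spirit of the PC/XT POST routine:
   walk Conventional Memory in 16 KiB steps, write a test pattern,
   read it back, and stop at the first block that doesn't answer. */
static unsigned probe_conventional_kib(void)
{
    unsigned kib;

    for (kib = 0; kib < 640; kib += 16) {
        volatile unsigned char far *p =
            (volatile unsigned char far *) MK_FP(kib * 64, 0); /* KiB to Segment */
        unsigned char old = *p;

        *p = 0x55;          /* Write a test pattern... */
        if (*p != 0x55)     /* ...and read it back */
            break;          /* Nothing answered: we ran out of RAM */
        *p = old;           /* Restore the original Byte */
    }
    return kib;             /* Amount of working RAM, in KiB */
}
```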
The second difference between the PC/XT Motherboards is the amount of ROM memory on the Motherboard itself. The 64KB - 256KB Motherboard had just two ROM chips, a 32 KiB one and an 8 KiB one, that were mapped matching the expected position of their contents in the IBM PC 5150. The 32 KiB ROM chip included the full contents of the 8 KiB BIOS, while the 32 KiB of the IBM Cassette BASIC were split between the remaining 24 KiB of the 32 KiB ROM chip and the entirety of the 8 KiB ROM chip. In total that makes for 40 KiB worth of ROM, which remains unchanged from the IBM PC 5150 (If you don't count the IBM PC obscure optional 8 KiB ROM of the U28 socket). A quirk of this arrangement was that, as can be seen in the Memory Layout of BIOS and BASIC Motherboard Diagram at MinusZeroDegrees, the 8 KiB ROM was affected by partial address decoding, and thus its contents were repeated four times in the Address Space, so it actually occupied 32 KiB in the Memory Map, wasting 24 KiB (960 KiB to 983 KiB. This overlapped with the address range reserved for the U28 socket in the IBM PC, but this was not a problem since no PC/XT version had such an empty ROM socket at all).
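
Partial address decoding sounds more esoteric than it is: the address decoder selected the 8 KiB chip for the whole 32 KiB window, but only the lower 13 address lines (A0 to A12) reached the chip, so any two addresses that differ only in A13 or A14 hit the same ROM cell. In code form, the effect boils down to a single mask:

```c
/* With only A0-A12 decoded, a 32 KiB window (15 address Bits) folds
   onto an 8 KiB ROM (13 address Bits): offsets 0x0000, 0x2000, 0x4000
   and 0x6000 all select the very same ROM cell, so the ROM contents
   appear mirrored four times within the window. */
unsigned rom_offset(unsigned window_offset)  /* window_offset: 0 to 0x7FFF */
{
    return window_offset & 0x1FFF;           /* Keep A0-A12, drop A13-A14 */
}
```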
The later 256KB - 640KB Motherboard instead used two 32 KiB ROM chips with the same basic arrangement, but it had a new BIOS version that was extended using the extra 24 KiB of ROM space in the second ROM chip (For a total of 32 KiB for the BIOS, which was now non-contiguously split among both ROM chips, and a total of 64 KiB of ROM memory, fully making use of the extended 64 KiB for Motherboard ROMs in the PC/XT Memory Map), so those 24 KiB of the address range weren't wasted mapping nothing anymore. The bigger BIOS included support for a few new Devices, including Keyboards, FDCs and Diskette Drives. This BIOS version is also known to execute the previously mentioned Conventional Memory scanning and testing routine faster than the original 1983 one, due to code optimizations. It was possible to upgrade the original 1983 64KB - 256KB PC/XT Motherboard with the two 32 KiB ROMs of the 1986 BIOS, as the second socket was compatible with both 8 KiB and certain 32 KiB ROM chips. As even more trivial info, both Motherboards had a Jumper that allowed the two ROM Sockets to be disabled, in case you wanted to have an expansion card with ROMs mapped there (As you couldn't bypass the 8088 CPU hard-coded requirements for bootstrapping, this means that the firmware was external, since it would be located in an expansion card).

[motherboarddiag]: http://www.minuszerodegrees.net/5160/misc/5160_motherboard_diagrams.htm
[upgradingpcs]: https://books.google.com/books?id=E1p2FDL7P5QC&pg=PA197&lpg=PA197#v=onepage&q&f=false


##### 3.2 - IBM PC/XT HDC card and HD, PC DOS 2.0, MBR and partitioning

A major new feature of the IBM PC/XT was that it had models that came from the factory with a HDC (Hard Disk Controller) card and a 10 MB HD (Hard Disk). The PC/XT BIOS had no built-in support at all for these (Not even the 1986 one); instead, the HDC card had an 8 KiB ROM chip that the BIOS could use as a loadable BIOS Driver thanks to the Option ROM feature first introduced in the third and last version of the IBM PC 5150 BIOS (Aside from the BIOS timestamp, I don't know if that IBM PC BIOS version was available in the market before or after either the IBM PC/XT or the 64KB - 256KB IBM PC Motherboard, so it is debatable whether the loadable Option ROM feature was first available in the market with the IBM PC/XT). As already mentioned, IBM reserved a fixed address range for the HDC card Option ROM, so that newer expansion cards could avoid that address range. The HDC and HD used the ST-506 interface, which became the standard for IBM PCs and compatibles.

The PC/XT also introduced a new version of PC DOS, 2.0. The new OS included a lot of new features, like native support for Hard Disks (The FAT12 File System received some minor modifications to make it usable on them. Besides reading and writing to a HD, PC DOS 2.0 could also be installed on and booted from one), the introduction of directories (PC DOS 1.0/1.1 had no such concept, all the files were found in the root of the Drive. This also required changes to the way that FAT12 stored metadata on the disk media), a standardized interface for loadable Device Drivers (PC DOS 1.0/1.1 had no concept of loadable Drivers, either; the closest thing to that was to hack the OS itself to add the required functionality, which was the case with the pre-PC/XT third party Hard Disks, as they directly modified PC DOS 1.0/1.1), and support for TSR (Terminate and Stay Resident) applications. PC DOS 2.0 also had several major improvements to DOS subsystems that began to make it look like a minimalistic but real OS, compared to the ultra primitive PC DOS 1.0.

The enormous amount of space that HDs were capable of (10 MB was a lot in 1983) required new forms of data organization. While it should have been possible to have a simple bootable VBR (Volume Boot Record) in the first sector of the HD, as was done with Diskettes, that would have been quite suboptimal (Ironically, today that is often done with USB Flash Drives and external HDs. Formatting them that way is known as Superfloppy). The problem was that at the time, each OS had its own exclusive File System, and there was almost zero cross compatibility between different OSes and File Systems. Any type of exchange of data stored on media formatted with a specific File System to another one had to be done with special tools. Thus, if you formatted a HD with FAT12 to use it with PC DOS, you would have been unable to natively use it with another OS like CP/M-86, unless you were willing to format it again at the cost of erasing all the data already stored in it.

As HDs were expensive and considered Workstation class Hardware, it was expected that they would be used by power users that needed the versatility of having more than a single OS installed on them. To resolve this issue, IBM decided to define a partitioning scheme known as MBR (Master Boot Record). By partitioning a HD, then formatting each partition with its own File System, it was possible to use a single HD to install and boot multiple OSes within their own native environments.

When a HD is partitioned using the MBR scheme, the first sector (512 Bytes) of the HD contains the MBR itself. The MBR has executable code similar in nature to a bootable VBR, but it also contains a Partition Table that defines the start and end of up to 4 partitions (Later known as Primary Partitions). Each partition could be formatted with a different File System, and it could also have its own bootable VBR in the first sector of the partition.
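
The on-disk format is compact enough to sketch in full: 446 Bytes of boot code, four 16 Byte Partition Table entries, and a 2 Byte signature, adding up to exactly 512 Bytes. Below is a minimal C sketch of the layout, plus the essence of what the MBR boot code does with the table (the field names are mine, and #pragma pack(1) for exact on-disk packing is compiler specific):

```c
#include <stdint.h>

#pragma pack(1)
struct mbr_partition_entry {
    uint8_t  status;        /* 80h = Active (bootable), 00h = inactive */
    uint8_t  chs_first[3];  /* CHS address of the first Sector */
    uint8_t  type;          /* Partition type / File System ID */
    uint8_t  chs_last[3];   /* CHS address of the last Sector */
    uint32_t lba_first;     /* First Sector, counted from the disk start */
    uint32_t sector_count;  /* Partition size, in Sectors */
};

struct mbr_sector {
    uint8_t  boot_code[446];            /* Executable code */
    struct mbr_partition_entry part[4]; /* The up to 4 Primary Partitions */
    uint16_t signature;                 /* Reads as AA55h in Little Endian */
};
#pragma pack()

/* The essence of what the MBR boot code does: find the one entry
   flagged as Active, so that its VBR can be loaded and jumped into */
static int find_active_partition(const struct mbr_sector *m)
{
    int i;
    for (i = 0; i < 4; i++)
        if (m->part[i].status == 0x80)
            return i;
    return -1;              /* Nothing flagged as bootable on this disk */
}
```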
A major limitation of the MBR scheme is that only one of those partitions could be flagged as bootable (Also known as the Active Partition) at a given time, and the MBR would always load only that one. In order to boot an OS installed in another partition, you had to use disk tools that allowed you to move that bootable flag to another partition. Luckily, it was also possible to implement, right after loading the Active Partition VBR, a Boot Loader with chain loading capabilities (Known as a boot manager), so that it could load the VBR of another partition to boot whatever OS was on it.

Booting an OS from a HD wasn't complicated; instead, it was actually absurdly linear in nature. The BIOS had absolutely no knowledge of Partitions or File Systems; the only thing it knew how to do was to read the first sector of the HD, which is where the MBR is found. Thus, after the BIOS POSTed, it would instinctively read the MBR, then the MBR in turn would load the VBR of the partition flagged as bootable, which could either bootstrap the rest of the OS, or have a boot manager that could chain load another VBR so that you could boot into another OS. When there is more than one HD in the same computer, it is possible to tell the BIOS which HD you want to read the MBR from first. This is essentially how booting from a Hard Disk (Including Dual Boot) worked for the next 30 years.

As some anecdotal experiences: perhaps you remember that during the long Windows XP era, there were some unwritten rules about the order in which you had to do things whenever you wanted to prepare a system for Dual Booting, like installing the older Windows version first and the newer one second, or, if you wanted to Dual Boot Windows and Linux, it was recommended that you always installed Windows first, then Linux second. At the time, I knew that the rules were true, as doing it in the wrong order pretty much wrecked the OS that was already installed, but it always puzzled me what black magic the OS installers used, since it didn't seem that any particular file managed that.

For example, at some point I wanted to have a Windows 95 and Windows XP Dual Boot. The first time I tried to do it, I installed W95 to an empty, unformatted partition on a HD that already had a NTFS partition with a bootable WXP. The W95 installer formatted the empty partition with FAT16 and installed W95 there, but after that, I couldn't get WXP to boot again. I found no way to make the computer remember that it previously had WXP installed; it felt like its existence went entirely ignored. I moved the HD to another computer, and found that the WXP partition data was still there and intact (The data was not visible to W95, since the WXP partition was formatted using the NTFS File System, which W95 does not understand, but I was expecting that). As back then I didn't know any way of fixing things other than reinstalling Windows, I decided to start over by installing W95 first, then WXP, and that worked (At the cost of some hours of work and a lot of lost settings). I figured out that later Windows versions were intelligent enough to scan the other partitions looking for previous Windows installations, and automatically built a boot manager so that you could choose which Windows installation you wanted to boot.
However, Windows ignored any other unsupported OSes (Including newer Windows
versions) when building the list of OSes for the boot manager, which is also
why it was suggested to always install Linux after Windows. Not much later, I
also had issues with computers with two HDs, where a Windows installation in
the second HD became unbootable if the first HD was removed, something that
made absolutely no sense since Windows had no reason to use the other HD at
all. Just in case, I decided to always install Windows with only one HD
plugged into the computer, then plug in the second one after finishing the
Windows install process. Some time later I finally learned why that happened:
Windows could write into the MBR of one HD code that tried to load a VBR found
in another HD, forcing you to keep both HDs unless you manually fixed this.

All these things began to make sense after I finally understood the root
cause: There is a high amount of manipulation that the MBR and VBRs silently
suffer at the hands of the Windows installers. The MBR and VBRs usually go
unnoticed since you can't see them as standard files, even though their data
exists on the physical disk, just outside of any File System boundaries. The
Windows installers always ask you in which partition you want to install
Windows, but probably in the name of user friendliness, they always hide the
low level details of the modifications that they make to the MBR and VBRs of
any connected HDs, causing unnecessary chaos with no justifiable reason. With
proper knowledge and specialized tools to repair the MBR and VBRs, fixing my
issues should have been rather easy, but sadly, I didn't have them back then.
I just learned both how simple booting is and how much of a mess the Windows
installers can make by reading the installation instructions of Linux
distributions. Something similar happens when people talk about their "C" and
"D" Drives, when actually the letter is just Windows nomenclature for a mapped
partition that it can access (Partitions that Windows can't understand don't
get a Drive letter), but that says nothing about how many physical HDs there
are, nor how they are actually partitioned. The devil is always in the
details...


##### 3.3 - The first IBM PC/XT clones and the NEC V20 CPU

The clone manufacturers didn't stand still after being able to legally clone
the original IBM PC. As you could expect, they continued to chase IBM and its
newest computer from behind. While I'm not exactly sure about specific dates
(Someone that wants to crawl old computer magazines looking for era
appropriate announcements or advertisements may be able to make such a list),
the time frame for the debut of the first PC/XT clones (NOT the PC-likes) was
probably around one year after the IBM PC/XT launch, which means that they
should have been available in the market just in time to meet the new IBM
PC/AT. However, clone manufacturers weren't satisfied with just doing the same
thing that IBM did at a cheaper price; many of them actually attempted to
create genuinely superior products by introducing newer features ahead of IBM
while preserving full IBM PC compatibility, something that the other early x86
based PC-like platforms failed at.
One of the most surprising features first seen in clone computers was the
Reset Button, which neither the IBM PC, the PC/XT, nor even the later PC/AT
had at all. The usefulness of the Reset Button relied on the fact that
sometimes a running application could become unresponsive in such a way that
the Ctrl + Alt + Del Key Combination to reboot the computer was completely
ignored (An example would be an application that masked IRQ 1, the Interrupt
Line used by the Keyboard, then failed to restore it for some reason). If that
happened in an IBM PC, PC/XT or PC/AT, your only option was to power cycle the
computer ([Unless you added a Reset Button via an expansion
card...][expansionreset]).

The Reset Button was implemented as a momentary type switch in the front of
the Computer Case, with a cable that plugged into a header in the Motherboard
that was wired to the RES line of the Intel 8284A Clock Generator. When the
Reset Button was pressed, the 8284A CG would receive the RES signal, then in
turn generate a signal through the RESET line that was directly wired to the
Intel 8088 CPU. Basically, resetting the CPU was an existing Hardware feature
that IBM didn't expose in its computers for some reason. The Reset Button was
the second member of the Front Panel Headers to be introduced, the first being
the PC Speaker.

The most daring clones used as their Processor the non-Intel but x86
compatible NEC V20 CPU, released in 1984. IBM wasn't the only major
manufacturer that had products reverse engineered, cloned, and improved upon;
the Intel 8088 CPU had become popular enough to warrant such treatment, too.
The NEC V20 was an impressive chip: It was around 5% faster than the Intel
8088 CPU at the same clock speed (Theoretically, with code optimized
specifically for it, it could be much faster, but in practice, V20 optimized
code was extremely rare), supported the new x86 instructions introduced by the
Intel 80186/80188 CPUs along with some custom NEC ones, had a mode that
emulated the old Intel 8080 CPU that could be used by software emulators to
run 8080 based 8 Bits CP/M applications without needing a dedicated card (The
NEC V20 did not emulate the popular Zilog Z80 enhancements made to the Intel
8080 ISA, so applications that used them didn't work), and, best of all,
remained pin compatible with the 8088. There were some scenarios where
specific applications would not work with it due to low level details, but
otherwise, it was the closest thing to an 8088 without being an 8088. The NEC
V20 was even used as an upgrade for the original IBM PC and PC/XT, as it was
the only possible drop in replacement for the Intel 8088 CPU that was faster
than it when running at the same fixed 4.77 MHz clock speed (The other
alternatives were upgrade cards). There was also a V30, which gave the 8086
the same treatment.

Finally, something minor but still noteworthy is that the PC/XT clone
Motherboards didn't bother to reproduce the IBM PC/XT functionality up to the
last detail. A notable omission is the special behavior of the original IBM
PC/XT expansion slot known as Slot 8, as the PC/XT clone Motherboards
implemented it as if it was just another standard slot.
This was perhaps for the better, as it meant that you could fit any expansion
card in it instead of only those that had explicit Slot 8 support. I'm not
even sure if there was a PC/XT clone Motherboard that actually implemented the
IBM PC/XT Slot 8 behavior.

[expansionreset]: https://books.google.com/books?id=25_xnJJJmvgC&lpg=PP1&pg=PA419#v=onepage&q&f=false


##### 3.4 - The first Turbo XTs, Hardware requirements and changes to Clock Generation

By 1985, the PC/XT clone computers were reaching maturity. While by that point
IBM had already greatly extended the PC platform with their latest computer,
the IBM PC/AT, it was a much more complex design than the IBM PC/XT and thus
much more expensive, something that also applied to the early PC/AT clones. As
such, it took several years for PC/AT compatible platforms to become
affordable enough for mainstream users, and a few more to get into the low end
of the market. In the meantime, systems based around the original PC platform
continued to be designed, manufactured and sold, giving PC class systems as a
whole a surprisingly long life before being phased out of the new computer
market altogether.

Besides price, something that made the PC platform's useful life last as long
as it did is that it became both possible and cheap to make a faster version
of it. So far, the IBM PC/XT and its direct clones were still stuck at 1981
performance levels, since they had the exact same chips running at the same
clock speeds as the original IBM PC (The only reasonable upgrade was the
already mentioned NEC V20 CPU), but in the meantime, Intel and the other chip
manufacturers kept improving their manufacturing processes, allowing them to
release new models of the same chips used by the PC/XT that were capable of
running at higher Frequencies. Eventually, clone manufacturers decided to use
the faster chip models along with a revised Clock Generation scheme in
otherwise basic PC/XT designs. This new class of PC/XT clones became known as
Turbo XTs. If you were to compare the Turbo XTs with the earlier PC-like
computers that were faster than the IBM PC, the main difference is that a
PC-like was at best just partially compatible with the PC platform, whereas a
Turbo XT used it as a base, so compatibility was far better with both its
software and its expansion cards.

As the Turbo XTs were going beyond the capabilities of the system that they
were copying, I think that at this point, they actually deserve to be
considered their own class of IBM PC compatibles, which is a much more fitting
and respectable term than being considered mere clones. Actually, clone is
usually considered a pejorative term, as it dismisses the fact that most of
the early vendors aspired to become serious competitors, and some eventually
managed to become recognized brand names with their own custom designs, even
beating IBM in its own market, like Compaq. If anything, there were a lot of
nameless, rather generic Turbo XT Motherboard designs in the late 80's that
had no redeeming original feature, nor a manufacturer to direct support
inquiries to, and were aimed at the low end budget market. For those, being
called clones could be more fitting than for the early vendors that managed to
provide full fledged computer systems.
The foundation of a Turbo XT Motherboard was the use of higher binned chips.
Whereas the IBM PC and PC/XT had an Intel 8088 CPU rated for 5 MHz paired with
support chips that matched that, Turbo XTs generally used the newer 8 or 10
MHz 8088 or NEC V20 models, and some even pushed for 12 MHz. As you should
already know, the effective Frequency that a chip runs at depends on its
reference clock input, so assuming the same Clock Generation scheme of the IBM
PC and PC/XT, any faster 8088 CPU would still run at 4.77 MHz for as long as
you used the same 14.31 MHz Crystal Oscillator. To get the CPU to run at 8 or
10 MHz, you had to use a 24 or 30 MHz Crystal Oscillator, respectively, so
that you could derive the required 8 or 10 MHz CLK line that the CPU used as
input. However, if you actually remember the entire IBM PC 5150 Clock
Generation diagram, you should quickly notice that there is a major problem
with this approach: If you change the 14.31 MHz Crystal Oscillator for any
other one, absolutely everything in the system would run at a different
Frequency. Most things could run at a higher Frequency, but some had issues if
run at anything other than the default clock speed, for very specific reasons.
The first platform designer that should have noticed this issue was not any of
the early Turbo XT designers but IBM itself, since the IBM PC/AT design must
predate the earliest Turbo XTs by at least a year, yet IBM had to face many of
the same problems with its new system.

For example, suppose that you want to run an 8088 or V20 @ 8 MHz, be it by
overclocking the 5 MHz models, using a properly rated 8 MHz one, or even
underclocking a 10 MHz version. If you were to simply change the 14.31 MHz
Crystal Oscillator for a 24 MHz one so that the derived CLK line runs at 8 MHz
instead of 4.77 MHz, you would also be deriving 24 MHz OSC and 4 MHz PCLK
lines instead of the default 14.31 MHz and 2.38 MHz (Which was further halved
to 1.19 MHz), respectively. These clocks would absolutely wreck both the CGA
Video Card and the Intel 8253 PIT. The CGA Video Card would be unable to
output a NTSC video signal as intended, making it unusable with TVs, albeit it
should still work with CGA Monitors. This was not a major issue since, in a
worst case scenario, you could replace the CGA Video Card with a MDA one and
avoid the OSC line issue altogether (I don't know if a third party
manufactured a CGA Video Card that included its own Crystal Oscillator and
Clock Generator to remove the dependency on the OSC line, albeit that would
mean that the card would have to deal with a more complex asynchronous Clock
Domain Crossing).
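To put numbers on that, here is a minimal C sketch of the 8284A divider chain
just described, assuming the standard arrangement where the OSC line runs at
the crystal frequency, CLK at a third of it, and PCLK at a sixth, with the PIT
fed a further halved PCLK as in the IBM PC:

```c
#include <stdio.h>

/* Clock lines derived by an Intel 8284A Clock Generator from its crystal:
   OSC = crystal, CLK = crystal / 3, PCLK = crystal / 6. In the IBM PC,
   the PIT input is PCLK halved once more. */
static void derive_clocks(double crystal_mhz)
{
    double osc  = crystal_mhz;        /* OSC line, used by the CGA card    */
    double clk  = crystal_mhz / 3.0;  /* CLK line, feeds the CPU and chips */
    double pclk = crystal_mhz / 6.0;  /* PCLK line                         */
    printf("Crystal %8.5f MHz -> OSC %8.5f, CLK %7.5f, PCLK %7.5f, PIT %7.5f\n",
           crystal_mhz, osc, clk, pclk, pclk / 2.0);
}

int main(void)
{
    derive_clocks(14.31818); /* IBM PC: CLK 4.77 MHz, PIT 1.19 MHz, all fine */
    derive_clocks(24.0);     /* Naive swap: CLK 8 MHz, but OSC and PIT off   */
    return 0;
}
```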
The 8253 PIT was not as lucky: Its concept of time was directly related to the
duration of a clock cycle. Basically, the PIT has no idea of what real time
is, it is just a clock cycle counter, so it had to be externally calibrated
against a real time second based on its 1.19 MHz clock input. If the PIT runs
at a faster clock speed, each individual clock cycle lasts less time, which
means that the PIT would count the same amount of clock cycles at a faster
rate, or, in other words, complete a count in a shorter period of real time.
The direct consequence is that anything that used the PIT to track time would,
without knowing it, be going faster than real time. While it may have been
possible to work around this issue at the firmware level by enhancing the BIOS
Services to be aware that a real time second could now be worth a varying
amount of clock cycles, software that directly programmed the PIT would assume
the standard IBM timings and misbehave as a result. As such, the PIT input
clock had to stay fixed at 1.19 MHz for compatibility reasons, and compared to
the CGA issue, this one was completely unavoidable if you were making an IBM
PC compatible platform, so it had to be solved somehow.

In order to resolve the dependency issue on the 14.31 MHz reference clock, one
solution was to decouple the single system wide reference clock by using two
Clock Generators and two Crystal Oscillators. Instead of a single Intel 8284A
Clock Generator with a single 14.31 MHz Crystal Oscillator, now you had two
8284A CGs, one paired with the 14.31 MHz crystal to derive the 14.31 MHz OSC
and 2.38 MHz PCLK lines for the CGA Video Card and the PIT, just like in the
IBM PC, and another paired with a 24 or 30 MHz crystal that was used
exclusively to derive an 8 or 10 MHz CLK line for almost all the rest of the
platform. This was the common arrangement of a Turbo XT platform, but a bit
complex, since it required an asynchronous Clock Domain Crossing. Another
solution was to use a single crystal and CG, but pass the 4.77 MHz CLK line
through a clock doubling circuit to turn it into a 9.54 MHz one, with the
advantages that it remained synchronous and was probably cheaper to do, as it
required fewer parts. IBM had to deal with the same issue in the PC/AT and
decided to go for the former solution with two CGs, as did all PC/AT clones
and compatibles, so the single crystal solution lasted only as long as the
Turbo XT platform.
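Coming back to the PIT for a moment, its calibration arithmetic shows why that
1.19 MHz input is untouchable. A program loads the PIT with a reload count and
converts counts into real time by assuming the IBM input clock; the sketch
below uses the 65536 count of the standard ~18.2 Hz BIOS timer tick as an
example:

```c
#include <stdio.h>

/* The PIT is just a clock cycle counter: it completes a count once every
   'reload' input cycles. Software assumes the IBM standard input clock
   when converting counts into real time. */
#define IBM_PIT_HZ 1193182.0

int main(void)
{
    double reload = 65536.0; /* Largest count, used by the BIOS timer tick */

    double expected = IBM_PIT_HZ / reload;  /* ~18.2 ticks per real second */
    printf("Expected tick rate: %.2f Hz\n", expected);

    /* Same count with a 2 MHz input (24 MHz crystal / 12): each cycle is
       shorter, so the count completes sooner and "time" runs fast. */
    double actual = 2000000.0 / reload;
    printf("Actual tick rate:   %.2f Hz (%.0f%% too fast)\n",
           actual, (actual / expected - 1.0) * 100.0);
    return 0;
}
```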
2660 2661 In the case of the asynchronous DRAM, while it wasn't directly affected by the 2662 higher clock speeds, it still had to be quick enough so that the data was ready 2663 when the Processor was asking for it. Regarding the DRAM chips themselves, the 2664 Manual of a Turbo XT Motherboard mentions that the model intended for 8 MHz 2665 operation needed 150 ns or faster DRAM chips, while the 10 MHz model, 120 ns. 2666 Some Turbo XT Motherboards also had configurable Memory Wait States, so that you 2667 could add a do-nothing cycle for memory accesses to let slower RAM to catch up. 2668 Typical choices were 0 or 1 Memory WS. As I don't know about the MHz-to-ns 2669 ratios, I'm not sure what was the maximum access time tolerable for a given 2670 Frequency, or how much Memory WS you had to add. I suppose that if it was 2671 possible to add enough Memory WS, you could be able to use even the IBM PC 2672 original 250 ns DRAM chips, albeit the performance would have been horrible. 2673 2674 Expansion cards deserve their own paragraphs, since they could be fairly 2675 temperamental. As far that I know, I/O Channel Cards were never rated to be 2676 guaranteed to work to up to a determined Bus clock speed as individual chips 2677 like the CPU, FPU or DMAC were, nor they had a rated access time like the 2678 asynchronous DRAM chips. The only thing that seemed to be set in stone, is that 2679 at the bare minimum, an I/O Channel Card had to work in the original IBM PC or 2680 PC/XT with its 4.77 MHz I/O Channel Bus and 1 I/O WS (This statement should have 2681 been true in 1985 when the IBM PC was still common, but after it became 2682 obsolete, many newer cards that worked in faster platforms and were physically 2683 compatible with the IBM PC I/O Channel Slots would not work in it at all). Some 2684 early IBM PC cards were unable to work reliably at higher clock speeds, yet it 2685 should be safe to assume that era appropriate cards should have consistently 2686 worked @ 8 MHz at a minimum, as otherwise, without a good source of usable 2687 expansion cards, the Turbo XT platforms would have been completely useless. 2688 Regardless, if you were purchasing an entire Turbo XT computer, the 2689 responsibility to test the bundled expansion cards to make sure that they were 2690 capable of reliably working @ 8 MHz or higher was of the system builder. 2691 2692 In order to have a higher degree of compatibility with expansion cards, some 2693 Turbo XT Motherboards had firmwares whose BIOS Services did their expected 2694 function, but also executed some redundant code that served to add extra delay 2695 after I/O accesses, as a sort of artificial software based Wait State 2696 (Basically, it was like a NOP Instruction, but done with a sort of dummy code 2697 instead of a real NOP, apparently because the NOP added more delay than needed). 2698 The problem with this approach was that these software Wait States were not 2699 universal, as applications that accessed the Hardware directly instead of using 2700 the BIOS Services would completely bypass this workaround. An alternative that 2701 was technically possible but that I failed to find examples of, is a Turbo XT 2702 Motherboard that could inject I/O Wait States other than 1 (0 could also be a 2703 valid value), as that would have been an universal solution. You can read more 2704 in-depth info about this topic in [this Thread][8088iodelays], so you can have a 2705 grasp of how complex it was to deal with this. 
I'm rather clueless regarding how expansion card compatibility issues were
dealt with back in the day. It seems that it was impossible to know ahead of
time if a particular expansion card would work in a specific Turbo XT computer
without testing it first, given the high amount of platform variety and that
there was nothing standardized above the IBM PC capabilities. Moreover, 8, 10
and 12 MHz platforms had a vastly different difficulty scale (8 was standard,
10 may have required stability testing, and 12 should have been on the
borderline of the possible and thus required handpicking the cards). As even
back in the day there were Hardware enthusiast communities, I would expect
that there was a lot of informal knowledge regarding whether a particular card
model could consistently work at higher speeds, as if we were talking about
the modern day "average overclock" (The clock speed that almost all chips of a
given type can consistently get to). 12 MHz ended up being a clock speed wall
for the I/O Channel Bus, which eventually forced the decoupling of the
Processor from the Bus so that the CPU clock speed could be increased
independently, without being burdened by the slower Bus and everything in it.
As far as I know, no Turbo XT Motherboard tried to decouple them; it took
until the Turbo AT platforms from IBM PC compatible manufacturers to begin the
decoupling, and they finally became fully asynchronous when the I/O Channel
Bus was standardized as ISA in 1988.

[8088iodelays]: http://www.vcfed.org/forum/showthread.php?45197-IO-Delays-on-8088-class-computers


##### 3.5 - Turbo XTs compatibility issues with IBM PC Software, Turbo function, honorable PC platform related mentions

While everything mentioned previously was pretty much the bare minimum
required from the Hardware perspective for a functional Turbo XT computer,
there were still several compatibility issues left to deal with. A lot of
applications, mainly games, didn't follow IBM's best programming practice of
using the PIT for timing purposes; instead, they used loops. The faster the
Processor was, the faster the loop would complete, and the faster the game
would run. This was an almost identical situation to what happened with the
PIT, with the major difference being that increasing the PIT clock speed
didn't increase system performance, so there was no real reason to clock it
higher, but that wasn't the case with the CPU. It seems like developers didn't
expect that their software could run on any other IBM PC compatible system
with a CPU clock speed other than 4.77 MHz. Dealing with this compatibility
issue required yet another solution: Selectable Frequency. This feature would
be known as Turbo. Turbo is pretty much what differentiates the Turbo XTs from
the earlier PC-likes, which failed to be fully IBM PC compatible precisely
because they were faster.

Turbo was typically implemented as an On/Off toggle type switch in the front
of the Computer Case known as the Turbo Button, which was plugged into a
header in the Motherboard (Technically it is the third member of the
Motherboard Front Panel Headers, albeit by the year 2000 it was already
extinct).
Its function was to select between two Frequencies: The clock speed that the
platform nominally supported, and a 4.77 MHz one that provided compatibility
with software tuned for the speed of the IBM PC. The switch worked by
selecting which reference clock source would be used for the derived CLK line.
In platforms with two Crystal Oscillator and Clock Generator pairs, it was
rather easy for the 14.31 MHz crystal already in place for the PIT and CGA to
also happily derive the 4.77 MHz CLK line in the same way that it was done in
the IBM PC, while in the simpler design with one crystal and CG pair, the
switch could be used to bypass the clock doubling circuit. Essentially, this
means that a Turbo XT platform had two operating modes: An IBM PC clone mode
that should in theory behave identically to an IBM PC or PC/XT and thus work
with applications bound to the 4.77 MHz 8088 CPU speed (And even allow older
expansion cards to work), and a faster IBM PC compatible mode for everything
else.

Other Turbo implementations allowed for software control of the clock speed.
It was possible to run an executable file with appropriate parameters to
change the clock speed, or to load a DOS TSR (Terminate and Stay Resident)
Driver that could do so when pressing a Key Combination, similar to Ctrl + Alt
+ Del, to dynamically switch clock speeds on the fly, even while an
application was running. I'm not sure if they called a custom BIOS Service of
the Motherboard firmware, or if they interfaced with the clock source
switching circuitry directly; both could be possible. I wouldn't be surprised
either if there was a pure firmware side implementation that could do on the
fly clock switching with a Key Combination in the same way as Ctrl + Alt +
Del, without wasting Conventional Memory on a TSR Driver. I'm also aware that
some platforms had higher granularity and could select between multiple
speeds, but I'm not sure what Clock Generation scheme they used to achieve
that (It may be possible that they were entirely software Wait States, too).

The switchable clock scheme uncovered yet another issue: If an application was
running when you switched the clock source, chances were that it crashed. This
typically didn't happen if you were at the DOS Command Prompt when you
switched the Turbo status, but applications like games were not happy, though
each piece of software had its own degree of sensitivity. As the Turbo XT
platforms matured, newer designs allowed for better clock switching circuitry
that could make dynamic switching as seamless as possible, so that you could
turn Turbo on or off while applications were running without getting them to
implode. These things that look pathetically simple and stupid are the staple
of our modern era: Just imagine that currently, the Core of a Processor is
constantly jumping from its nominal Frequency to several Turbo States, and
soon after can go into a low Power State for power saving purposes, then be
clock gated entirely, then jump again into action. If things don't implode as
they used to 30 years ago, it is because they learned how to do seamless
dynamic clock switching right.

And yet another compatibility issue that was uncovered by running at high
Frequencies is that the 8237A DMAC tended to be very problematic.
This is not surprising, since they usually were 5 MHz models with a severe
overclock. Some Turbo XT platforms had firmwares that could dynamically switch
the CLK line to 4.77 MHz every time there was a DMA access going on. Assuming
that the platform had mastered seamless dynamic clock speed switching, this
would have been entirely possible to do, yet it doesn't explain how it was
supposed to work with applications that programmed the DMAC directly instead
of going through the BIOS Services (An identical situation to anything else
that directly interfaced with the Hardware. This includes bypassing the
software I/O Wait States added by the BIOS Services for expansion cards, and
is the reason why the PIT has to run at exactly 1.19 MHz). I don't really know
the details or specific platform implementations, just some comments that seem
to point to some Turbo XTs running the DMAC above 4.77 MHz all the time
(Probably the 8 MHz ones, since 10 and 12 MHz are simply too much overclock
for that chip), while others automatically slowed down to 4.77 MHz when the
DMAC was in use, then switched back to normal (This would also have been very
useful for all I/O operations, considering that expansion card compatibility
was all over the place and that I/O Wait States were not configurable). It is
possible that the systems that automatically downclocked to 4.77 MHz were all
based on Chipset designs instead of discrete chips like the early Turbo XT
platforms. However, it is unfair to directly compare those late Chipset based
Turbo XT platforms against the early ones made out of discrete support chips,
since they are not at the same maturity level, and chipsets allowed a platform
to do many tricks that would require a lot of custom circuitry to implement
with individual support chips.

Very late Turbo XTs were made using chipsets instead of discrete support
chips. I prefer to cover chipsets after the IBM PC/AT generation, since the
first ones seem to have targeted PC/AT compatibility, not PC, so
chronologically speaking, Chipset based Turbo XT platforms appeared after the
PC/AT ones. Chipset based platforms had slightly different Clock Generation
schemes, since the Chipset could fulfill the role of a Clock Generator: It
took as input the reference clocks of two Crystal Oscillators instead of
needing two independent CGs, and derived all the required clock lines on its
own. Chipsets could also offer a few more features not found in
implementations made out of discrete chips, like a RAM Memory Controller with
support for more than 640 KiB RAM installed in the Motherboard itself, which
was able to remap the excess RAM as Expanded Memory (EMS) or as Shadow RAM.
All these things are covered after the IBM PC/AT section, too.

I should make a few honorable mentions of other contemporary PC-like platforms
that were partially IBM PC compatible yet weren't of the Turbo XT variety. A
few of those were based on the Intel 8086 CPU instead of the 8088, like the
Olivetti M24, which was released during 1983 and used an 8086 @ 8 MHz, making
it among the first IBM PC compatible computers that could run IBM PC
applications faster than the IBM PC itself, before the IBM PC/AT arrived.
Using the 8086 required a more complex design in order to interface its 16
Bits Data Bus with 8 Bits support chips and expansion cards, but at that point
you were close to the PC/AT design, since IBM had to do the same thing to
interface the 16 Bits Data Bus of the Intel 80286 CPU with the same 8 Bits
support chips. Moreover, the 8086 had a massive disadvantage: It was not an
8088. You could pick an 8088, make it run @ 4.77 MHz, and get a CPU with
identical compatibility and accuracy to the 8088 in the IBM PC, while an 8086
had no real chance of performing identically to an 8088 (The same applies to
the NEC V20, but you could swap it for an 8088 without needing another
Motherboard). As such, only the 8088 based Turbo XT platforms were
theoretically capable of matching the original IBM PC compatibility level
while also having the capability to offer more performance in Turbo mode. I
don't know of specific edge cases where the 8086 had worse compatibility than
the 8088, but chances are that, as always, games and copy protection schemes
were involved. Still, it didn't make a lot of sense to use more expensive
parts to make a less compatible computer. Only after mainstream software
learned to properly behave regardless of the speed of the Processor did x86
Processor variety increase dramatically, but by then, the PC based platforms
were almost extinct.

The last fresh breath of the PC platform would be in April 1987, when IBM
revived it for the entry level Models 25 and 30 of the PS/2 line, but I prefer
to talk about them in their own section, since it will be easier to digest
after going through the PC/AT and chipsets first. Ironically, the mentioned
PS/2 models are 8086 based, not 8088. As amusing as it sounds, IBM's own PS/2
line was less IBM PC compatible than many of the Turbo XT ones...


4 - The IBM PC/AT 5170, the last common ancestor
------------------------------------------------

The IBM PC platform's true successor was the IBM PC/AT 5170 series, released
in August 1984. By then, IBM had already learned from the recent IBM PCjr
fiasco that IBM PC compatibility was a serious thing, so it had to make sure
that the new computer was as backwards compatible as possible with its
previous hit. As chronologically speaking IBM must have been among the first
to design a PC compatible platform that was faster than the original PC (I'm
aware only of the Olivetti M24 being commercially available before the IBM
PC/AT and compatible enough, the rest were just far less compatible PC-likes),
IBM should also have noticed early on that doing so was not an easy feat.
Meanwhile, Intel also had to go through a failure of its own with the
unsuccessful iAPX 432 Processor architecture, which forced it to continue
designing x86 based products as a stopgap measure for the time being.
Amusingly, for marketing purposes Intel had rebranded all its x86 Processor
series with an iAPX prefix, like iAPX 88 for the 8088 or iAPX 286 for the
80286, but eventually Intel reverted them back to the old nomenclature so as
not to taint the now successful x86 line with the failure of the iAPX 432.

The PC/AT is historically significant because it is the last common ancestor
that all modern x86 platforms have.
In the original IBM PC era, after the overwhelming difference that the IBM
PC's huge software ecosystem made became evident, everyone ended up following
IBM's de facto platform leadership, which served as a sort of unifying force.
This can be clearly seen in how the first generation of non-IBM x86 based
computers, the PC-likes, which were similar to the IBM PC but only partially
compatible with it, was followed by a second generation of systems, known as
clones, that strove to be fully compatible with the IBM PC platform. The PC/AT
era was similar, but in reverse: At first, everyone followed IBM's leadership
with a first generation of computers that were mere PC/AT clones, followed by
a second generation where the top IBM PC compatible manufacturers decided to
create new things of their own, and a divergence began when those
manufacturers implemented exclusive platform features ahead of IBM with no
standardization nor cross compatibility whatsoever.

Though many of these features or their specific implementations faded away
after failing to achieve mainstream adoption and are now very obscure, some
were popular enough to be adopted by everyone else. Yet, the base PC/AT
compatibility was never compromised, as the new platform features were always
added as a sort of superset of the existing PC/AT platform (And, by extension,
the PC), requiring them to be explicitly enabled by the OS or a Driver.
Basically, until UEFI based Motherboard firmwares became mainstream around
2012, x86 based computers always booted as PC/AT compatible computers with the
Processor running in Real Mode (8086/8088 level ISA behavior and features).
Everything else was built on top of that.

The IBM PC/AT is a major extension of the PC platform. It introduced a newer
x86 Processor, the Intel 80286 CPU, widened the I/O Channel Bus to accommodate
it, changed a few support chips and added some more. It also introduced a
longer version of the IBM PC expansion slot that exposed the wider I/O Channel
Bus in a second section, so that a new type of cards could make full use of
the PC/AT platform features (Including more IRQs and DMA Channels) while
remaining physically backwards compatible with the IBM PC cards. Like the IBM
PC, the IBM PC/AT was an open platform, as IBM also released the
[IBM 5170 Technical Reference (March 1984)][5170techref] with all the goodies.

There were three notable variants of the IBM PC/AT. The original one used the
Type 1 Motherboard, with an Intel 80286 CPU @ 6 MHz. It eventually got
replaced by the Type 2 Motherboard, which had the same specifications and was
functionally identical, but was physically smaller. In April 1986 IBM released
the IBM PC/AT Model 319 and Model 339, which used the Type 3 Motherboard,
which was the same as the Type 2 but with higher binned parts, so that it
could run the 286 @ 8 MHz. Finally, there was a little known model, the IBM
PC/XT 5162 (Model 286), which is usually considered part of the PC/AT family
since, despite being branded as a PC/XT, it was fully based on the PC/AT
platform. The PC/XT 5162 used a different Motherboard that was smaller than
the PC/AT 5170 Type 2/3 and functionally slightly different, but otherwise
PC/AT compatible.
[5170techref]: http://www.minuszerodegrees.net/manuals/IBM_5170_Technical_Reference_1502243_MAR84.pdf


##### 4.1 - Intel 80286 CPU Overview, the MMU, Segmented Virtual Memory and Protected Mode, support chips

The first major component to be changed in the PC/AT was the Intel 8088 CPU,
which got replaced by the Intel 80286. The most visible external change caused
by this is that instead of the 8088's 8 Bits Data Bus and 20 Bits Address Bus,
which gave it a 1 MiB (2^20) Physical Memory Address Space, the 80286 had a 16
Bits Data Bus and a 24 Bits Address Bus, which gave it a 16 MiB (2^24)
Physical Memory Address Space. Moreover, the Buses were not multiplexed into
the same Pins anymore, each Bit had its own dedicated line. As such, the new
Processor required a major rework of the buffer chips section that separated
the Local Bus from the I/O Channel Bus, as it had to be extended to support
both its wider Bus and the dedicated lines.

Internally, the 80286 had a plethora of new features, the major ones being the
introduction of an integrated MMU (Memory Management Unit), and a new
operating mode, protected mode, to make use of it. Lesser features included
Hardware Task Switching and some new instructions. Ironically, all of these
features would be barely used during most of the useful lifetime of the PC/AT.
The saving grace of the 80286 was simply that it had a faster execution unit
than any of the previous x86 Processors, as it took fewer clock cycles than
the 8088 to process the same machine code, making it perform much better even
if it were running at the same clock speed. This ended up being the 286's
biggest, and almost sole, selling point for the average user.

The MMU is a dedicated Co-processor that is used by OSes that implement
Virtual Memory to offload the dynamic translation of Virtual Addresses to
Physical Addresses, dramatically reducing the overhead compared to doing so
purely in software (Something that was done experimentally in other platforms
before MMUs came into existence. I'm not entirely sure if someone tried a
software only implementation of Virtual Memory for the original IBM PC
platform, but good candidates would be the Microsoft Xenix ports for it, which
are not to be confused with the ports for the PC/AT). I suppose that you also
want to know what Virtual Memory and Virtual Addresses are, and why they are a
major feature. Basically, Virtual Memory is the rather old concept of Address
Space virtualization, which provides each running application its own private
and exclusive Virtual Address Space.

In a platform without support for Virtual Memory or any other type of Memory
Protection, like the original IBM PC, the OS and all applications being run
see and share the same Physical Address Space. This essentially means that any
code being executed can potentially read or write directly to any address. In
that environment, creating an OS with Multitasking capabilities was very hard,
since it required that all user applications were aware that not all the
visible memory was theirs to use.
At most, it would have been possible to create an OS API that cooperating
applications could call to limit themselves to using only the memory address
ranges that the OS allocated for each of them, yet this would not stop a badly
coded application from accidentally (Or intentionally...) thrashing the memory
contents of another one, or even the memory of Devices that used MMIO. This is
the case of PC DOS/MS-DOS: any application could do whatever it wanted, so it
was not practical to transform such a Single Tasking OS into a full fledged
Multitasking one (Many attempts were made, like IBM TopView or Digital
Research Concurrent DOS 286, but with mediocre results. Yet, in a few more
years, everything would change with a feature introduced by the Intel 80386
CPU...).

In an OS that implements Virtual Memory, each application sees only its own,
exclusive Virtual Address Space, while the OS, with the help of the MMU,
manages where the Virtual Addresses that are in use are really mapped to in
the Physical Address Space. As Virtual Memory is absolutely transparent to the
applications themselves, the OS has full control of the Memory Management,
allowing for easier implementations of Memory Protection and thus
Multitasking. It also simplifies the applications themselves, as they can all
assume that all usable memory is contiguous, completely forgetting about gaps
or holes (Remember the first IBM PC BIOS version supporting non-contiguous
memory?). Finally, Virtual Memory also allows for Swap File or Page File
implementations, which is when you use a file or partition in standard storage
media to hold data that an application believes to be in the system RAM, a
technique that was widely used in the old days since RAM was quite expensive,
and it was assumed that Multitasking with very low performance was better than
no Multitasking. However, this feature was a byproduct of Virtual Memory, and
is NOT the reason why it exists in the first place (If you think otherwise,
blame Windows 9x for calling the Page File Virtual Memory).

Implementing Virtual Memory in the 286 required a rework of the 8086/8088
Segmented Memory Model scheme, which I like to call Segmented Virtual Memory.
Like the 8086/8088, the 286 was a 16 Bits Processor with a Physical Address
Space larger than what the value of a single 16 Bits GPR could address, so it
still had to use the Segment and Offset pairs to access an individual address.
However, in order to accommodate both the larger 24 Bits Physical Address
Space and the Virtual Memory Address Space that was layered on top of it, the
Segment and Offset pair had a different meaning in the 286 compared to the way
that they worked in the 8086/8088.

The 286 made full use of the 16 Bits of the Segment and Offset pairs to
effectively create a compounded 32 Bits value for an individual Virtual
Address. From these 32 Bits, 2 were used for the Privilege Level (Part of the
Memory Protection scheme, colloquially known as Rings 0, 1, 2 and 3), leaving
30 Bits for addressing, which effectively resulted in a maximum of 1 GiB
(2^30) of Virtual Memory Address Space for each running application, assuming
that it was organized as 16384 (2^14) Segments of 64 KiB (2^16) each.
From those 30 Bits, one Bit was used to partition the Virtual Address Space
between Global and Local, leaving a 512 MiB space that had the same contents
for all applications (Used for addressing of the OS API, common libraries,
etc), and a 512 MiB private space for each individual application. Note that
all this applies only to the Memory Address Space; the I/O Address Space
stayed with the same 64 KiB (2^16) worth of physical I/O Ports, and actually,
it was never virtualized nor extended.

Virtual Memory incurs a RAM overhead, since there has to be some physical
place to store the data about which Virtual Addresses are mapped to which
Physical Addresses. It is impractical for the mapping data to have individual
address level granularity, since it would mean that for each Virtual Address
that is mapped to a Physical Address, you would be spending no less than 7
Bytes (32 Bits from the Virtual Address itself and its Privilege Level plus 24
Bits of the Physical Address) to hold mapping information about something that
is actually worth a single Byte. The only way to implement Virtual Memory in
an efficient manner is by mapping chunks of address ranges, like the Segments
themselves. There were data structures known as Segment Descriptor Tables,
located somewhere in the system RAM, that did exactly that, effectively being
used as the workspace of the MMU. Each Segment Descriptor was 8 Bytes in size
and contained the mapping of an entire Segment, which could vary in size from
1 Byte to 64 KiB, so the Virtual Memory overhead wasn't that much in terms of
raw RAM if using adequate granularity. The whole Virtual-to-Physical
translation added extra complexity that incurred some overhead even with an
MMU to offload it, however.

The other major feature of the 80286 was the introduction of protected mode.
As the new meaning of the Segment and Offset pair changed a very basic
behavior of how addressing worked in the x86 ISA, all preexisting code would
be pointing to incorrect addresses. As such, in order to remain Binary
Compatible with executable code intended for the addressing style of the
8086/8088 and 80186/80188 CPUs, most of the 286's new functionality required
to be explicitly enabled by setting the CPU to operate in the new protected
mode. The 286 itself started in a mode now known as Real Mode, where the
Segment and Offset pair (And several more things, but not all) behaved like in
the previously mentioned older Processors. Switching the 286 to protected mode
was a prerequisite to use both the 16 MiB Physical Address Space and the MMU;
otherwise, the 80286 would behave just like a faster 8086 and barely anything
else. When in protected mode, all addressing was processed by the MMU
Segmentation Unit even if not using Virtual Memory, Memory Protection or such,
causing some performance overhead when compared to Real Mode.
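To tie the addressing math together, here is a minimal C sketch that decodes a
protected mode Segment Selector, the 16 Bits "Segment" half of the pair. The
bit layout is the documented x86 one: 2 Bits of Requested Privilege Level, 1
Bit choosing between the Global and Local tables, and a 13 Bits index into up
to 8192 possible 8 Byte Segment Descriptors per table:

```c
#include <stdint.h>
#include <stdio.h>

/* 286 protected mode Segment Selector layout:
   bits 0-1:  RPL, the Requested Privilege Level (Rings 0-3)
   bit  2:    TI, Table Indicator (0 = Global table, 1 = Local table)
   bits 3-15: index into the Descriptor Table (up to 8192 entries) */
int main(void)
{
    uint16_t selector = 0x1234; /* Arbitrary example value */

    unsigned rpl   = selector & 0x3;
    unsigned ti    = (selector >> 2) & 0x1;
    unsigned index = selector >> 3;

    printf("Selector %04X -> RPL %u, %s table, Descriptor index %u\n",
           selector, rpl, ti ? "Local" : "Global", index);

    /* The math from the text: 2 tables * 8192 Segments each * 64 KiB
       maximum per Segment = 1 GiB of Virtual Address Space */
    printf("Maximum virtual space: %llu Bytes\n", 2ULL * 8192ULL * 65536ULL);
    return 0;
}
```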
Though PC DOS/MS-DOS itself didn't use protected mode, DOS applications had
full control of the computer and could do so if they wanted, something that
made sense for any application that was constrained by the 640 KiB
Conventional Memory barrier. There was a problem with this approach, which is
that the BIOS Services (Including those added by expansion card Option ROMs)
and the DOS System Calls couldn't be used from within protected mode, as all
existing code assumed Real Mode addressing style. That is where a major
limitation of the 286 protected mode became very noticeable: There was no way
for software to return to Real Mode after enabling protected mode. Intel's
idea was for a new generation of OSes and applications to rely exclusively on
the new feature, not for it to be used as a mere extension that a lone
application could enter and exit at will.

As the primitive PC DOS was the most common OS in use and there was no
mainstream protected mode successor on the horizon, not being able to return
to Real Mode on demand to use the DOS API was a rather harsh limitation.
Without proper OS support, any standalone application that wanted to use
protected mode would have had to re-implement everything on its own, massively
bloating the application and making the development process far more complex.
As such, a commercial DOS application that used the pure form of the 286
protected mode must have been very rare; I'm not even aware of one existing at
all. The only example I know of that did use the pure 286 protected mode in
the IBM PC/AT as Intel envisioned is an OS and all its native applications:
The UNIX-based [IBM PC XENIX][ibmpcxenix], which IBM licensed from Microsoft
and released less than a year after the PC/AT. However, Xenix was always aimed
at the high end OS market; neither it nor any other Xenix variant ever
competed directly against PC DOS (Albeit one was supposedly planned at some
point), which reigned supreme among average users. One can wonder how
different history would be if IBM had decided to push Xenix instead of PC DOS
back when it had undisputed control of the PC platform...

Among the minor features of the 80286, a rather interesting yet mostly unknown
one is Hardware Task Switching, something that Intel presumably included
expecting it to be useful for Multitasking OSes. Using this feature had a few
disadvantages, the most surprising one being that doing Task Switching using
the built-in Hardware support was usually slower than doing Context Switching
fully in software. As such, the Hardware Task Switching function of x86
Processors ended up being a completely unused feature. Actually, while it is
still supported by modern x86 Processors for backwards compatibility purposes,
it can't be used while running in Long Mode (64 Bits), effectively making this
feature obsolete.

The other minor feature is that the 80286 added some new instructions to the
x86 ISA, albeit most of them deal with protected mode Segmentation, so they
were not very useful. Also, as the introduction of new instructions works
mostly in a cumulative way, since a new Processor typically has all the
instructions of the previous ones based on the same ISA, the 80286 included
the instructions previously introduced by the 80186, too. Intel completely
ignored the NEC V20/V30 custom instructions, making them exclusive to NEC's
x86 ISA extension and thus short lived.

Deploying the Intel 80286 CPU required a few new support chips, superseding
versions of the 8086 ones.
The 286 was intended to use the Intel 82288 Bus Controller instead of the 8288
(Furthermore, there was no Minimum Mode that I'm aware of like with the
8086/8088, so you always required the Bus Controller), and the Intel 82284
Clock Generator instead of the 8284A. On the Co-processor front, the 80287 FPU
succeeded the 8087. However, at its core, the 80287 was an 8087 and performed
like it; Intel just upgraded the external Bus interface to make it easy to
wire to the new 286 CPU, but otherwise didn't improve the actual FPU.
Interestingly, the 80287 had pretty much no formal external Address Bus, just
a Data Bus. A successor to the 8089 IOP was never developed, making that
Co-processor ISA a dead end.

[ibmpcxenix]: http://www.os2museum.com/wp/ibm-pc-xenix/


##### 4.2 - Processor IPC and performance, compiler optimizations

If there is a single thing that the 80286 did over the 8088 that actually
mattered when the IBM PC/AT was launched, it was simply to be much faster than
it, even when executing code that only used the features of the basic 8086
ISA, instantaneously bringing tangible performance benefits to all the
preexisting IBM PC applications (Keep in mind that when the IBM PC/AT was
released, there were already a few PC-likes that were faster than the IBM PC
and could run many applications, but compatibility was usually mediocre. And
the PC/AT's 80286 CPU @ 6 MHz was quite a bit faster anyway). What made the
80286 perform better is that it took significantly fewer clock cycles to
process the same machine code (opcodes) than the 8088, making the 80286 able
to execute more instructions in the same time frame and thus perform faster,
even if it was running at the same clock speed as the 8088. This is also the
case of the NEC V20: it was slightly faster than the 8088 because it also took
fewer clock cycles to process the same machine code, but the difference was
rather small and pales in comparison to how much faster the 80286 was.

In general, there are only two viable ways to increase compute performance for
existing software: The first is to run the same Processor at faster clock
speeds, which is what the Turbo XT platforms did with their 8088s running at 8
or 10 MHz. The other is to design a Processor that can execute the same
machine code in a more efficient manner, thus making it faster, like the Intel
80286 and the NEC V20 did. If you have ever heard the term IPC (Instructions
Per Cycle) before, what it describes is precisely these performance
differences at the same clock speed. There is a sort of balance between the
concepts of clock speed and efficiency, since the more efficient the Processor
is (Where more efficient usually means more complex), the harder it is to make
that design clock higher, so the idea is to get to a sweet spot where the
complexity and possible attainable clock speed give the best overall
performance. Of the multitude of things that can increase a Processor's
efficiency, one that early on was done very often was to optimize the
Processor execution unit so that processing opcodes took fewer clock cycles.
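As a toy illustration of that balance, think of throughput as clock speed
divided by the average amount of clock cycles spent per instruction. The CPI
figures below are invented round numbers chosen purely to show the
relationship, not measurements of any real chip:

```c
#include <stdio.h>

/* Toy model: MIPS = clock (MHz) / average cycles per instruction (CPI).
   The CPI values are made up for illustration, not real measurements. */
static double mips(double clock_mhz, double avg_cpi)
{
    return clock_mhz / avg_cpi;
}

int main(void)
{
    /* Same clock speed, more efficient execution unit: IPC wins */
    printf("4.77 MHz at 12 CPI -> %.2f MIPS\n", mips(4.77, 12.0));
    printf("4.77 MHz at  6 CPI -> %.2f MIPS\n", mips(4.77, 6.0));
    /* Same execution unit, higher clock speed: Frequency wins */
    printf("8.00 MHz at 12 CPI -> %.2f MIPS\n", mips(8.00, 12.0));
    return 0;
}
```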
The amount of clock cycles that it takes for the Processor to process an
instruction isn't a uniform fixed value; instead, it depends on the machine
code that it produces (An instruction can produce one or multiple opcodes) and
the context of the operation itself. In the old days, the execution latency of
an instruction in a specific context was pretty much
[a known fixed constant value][pentiuminstrset], as the Processors were rather
simple and the Memory subsystem predictable. In the case of modern Processors,
it takes a lot of work to get the exact execution latency of an instruction
because there is a ridiculous number of variables at play (Multiple Cache
memory levels, Branch Prediction Units, multiple parallel execution units,
decoding instructions into MicroOps, OoOE (Out of Order Execution), MacroOp
Fusion, resource sharing due to Simultaneous MultiThreading, variable Memory
subsystem latency, etc) that makes for an almost infinite number of contexts,
so at most, you may get an average value that speaks more about the efficiency
of the Processor as a whole rather than of the execution unit as an isolated
entity. As such, theorizing what the maximum possible performance of a modern
Processor is can be much harder, and even more so attempting to get close to
it.

Since the execution latency of each instruction and context is different,
optimizations to the execution unit do not produce uniform IPC increases;
instead, the apparent IPC would greatly depend on the instructions most used
by the code being processed. For example, the NEC V20 could be several times
faster in some operations compared to the 8088, and maybe it is possible that
if running some specialized code that exploited its strengths (Without using
the V20 custom instructions, so that the same code could also run on the 8088
and 80286), it could come close to 80286 level performance, but that would
never happen in real world usage scenarios, where it just averaged around 5%
more performance. Precisely for this reason, IPC is a rather ambiguous and
overgeneralized term that can only be meaningful as a sort of average value
when running code from real applications, as some handcrafted tech demos may
give rather misleading results. With all factors taken into account, on the
average IPC scale, the Intel 8088 comes last, followed by the NEC V20, the
Intel 8086 (Due to its wider Data Bus), the NEC V30 (Like the V20, but for the
8086 instead of the 8088), then finally the 80286 on top of them all, with a
significant lead margin.

Perhaps by now, you should have already figured out that, due to the fact that
IPC increases are not uniform, it is impossible for the 80286 to be cycle
accurate with the 8088, no matter what clock speed you run it at. Whereas in a
Turbo XT you could underclock a faster binned 8088 (Not a V20) to 4.77 MHz and
get identical results to an IBM PC or PC/XT, at no clock speed would the 80286
perform exactly like the 8088. This means that the IBM PC/AT, by design, was
unable to be 100% IBM PC compatible in the most strict sense of that
definition.
When designing a new Processor, the designers have to face decisions about performance trade-offs, since they can prioritize improving the execution unit to process the opcodes produced by certain types of instructions faster, but maybe at the cost of not being able to do so for other types of instructions, or worse, they could be forced to make those slower compared to its predecessor. In the case of the 80286, even if its IPC increases were not uniform, it was still faster than the 8088 in pretty much any context. Obviously, an application could be optimized for the 80286 so that it did the same thing but using the most favourable instructions, albeit that code would be suboptimal if run on an 8088. However, the execution units of later x86 Processors like the 80386 and 80486 evolved in a radically different way, as they focused on processing the most commonly used instructions as fast as possible, whereas the less common ones actually tended to perform worse with each successive generation. This ended up causing some issues for the software developers of the era, since they had to decide [which Processor they wanted to optimize for][80x86optimizations], almost always to the detriment of another one.

Back in the day, optimizing software was much harder than today. The compilers of the era were quite dumb and couldn't exploit all the potential of a CISC ISA like that of the x86 Processors, so for anything that required speed, optimization relied on the use of handcrafted x86 Assembler code. At a time when Hardware resources were quite limited, the difference between compiled C code and optimized ASM could be rather brutal, so it was important to get a skilled x86 Assembler programmer to do at least the most compute intensive parts of the application. ASM optimizations could not only make the difference between a real time application like a game being fluid or a slideshow, they could also be massively useful in productivity applications like Lotus 1-2-3, which was much faster than its rivals thanks to being written entirely in ASM (Among other tricks, like using the Video Card directly, skipping the BIOS Services).

As these code optimizations had to be done manually, optimizing for multiple Processors meant that the Assembler programmer had to know the idiosyncrasies of each of them, and would also have to do the job two or more times, once per target, something that was expensive in both cost and time. Moreover, including multiple code paths (And a way to detect which Processor the computer had, at a time when figuring that out was not standardized) or providing different executables could increase the application size, something not desirable as Diskette capacities were very small. The end result is that software developers typically chose a Processor to serve as the lowest common denominator that the application was expected to run on, then optimized only for it, letting clock speed increases handle the potentially lower relative IPC of later Processors.
For example, a programmer during the late 80's could have had two choices: To optimize the application targeting an IBM PC/AT with an 80286 @ 6 MHz, guaranteeing that on a newer 80386 @ 16 MHz it would be much faster anyway due to the much higher clock speed, or to optimize for the 80386 so that it performed even better on it, thus giving some headroom to add more features, but maybe at the cost of being just too slow to be usable on the older 80286.

During the next three decades, both Processors and compilers evolved a lot. The evolution of the x86 execution units, with two notable exceptions, ended up being a sort of feedback loop: software developers had to optimize their code to use whatever the mainstream Processors available at a given moment did better, then the next generation of Processors focused on doing the same thing even better than the previous one, since that accounted for the majority of code already found in the commercial applications relevant during that era. As such, modern x86 Processors, in general, benefit roughly from the same optimizations. This also caused an implied chicken-and-egg scenario: Designing a vastly different execution unit, without being able to push the entire software ecosystem to optimize for what the new Processor did better, would just give counterproductive performance results that made such Processors look bad. That was the case with the two exceptions mentioned above, as they were Processors that had extremely different execution units compared to other contemporaneous ones, and only shined in very specific circumstances. These two were the relatively recent AMD Bulldozer (FX) architecture, which was quite different when compared to the previous AMD K10/K10.5 (Phenom/Phenom II), Intel Conroe (Core 2 Duo) and the contemporaneous Intel Nehalem (Core i7 1st generation), and a much more famous one, the Intel NetBurst (Pentium 4) architecture, which was radically different when compared to the Intel P6 (Pentium Pro/2/3/M) or AMD K7 (Athlon/Athlon XP).

Meanwhile, compilers matured enough to almost nullify the gap with handcrafted Assembler, and also learned how to compile code optimized for each Processor architecture. Currently, a programmer only has to focus on writing code that follows the general optimization guidelines, then simply offload the Processor architecture optimization job to the compiler, which can produce many code paths or different executable files optimized for each Processor. Actually, modern compilers are so good that they can produce faster executables than what an average ASM programmer can write, since the compiler can implement a whole bunch of optimizations relevant to modern Processors that are quite complex for humans to do manually. It takes an expert Assembler programmer to attempt to beat a modern compiler, and only in very specific niches will there be substantial performance differences that make the effort worth it.
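
As a concrete illustration of that last point, a modern compiler can emit several code paths from a single source function and pick between them at runtime. The sketch below uses GCC's Function Multi-Versioning attribute (a GCC-specific feature; the function itself is made up for the example, and other compilers have their own equivalents):

```c
/* GCC emits one version of this function per listed target plus a
 * baseline, and a resolver that picks the best one for the CPU the
 * program actually runs on. No hand-written Assembler involved. */
__attribute__((target_clones("avx2", "sse2", "default")))
long sum_array(const int *values, long count)
{
    long total = 0;
    for (long i = 0; i < count; i++)
        total += values[i];
    return total;
}
```
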
[pentiuminstrset]: https://web.archive.org/web/20140313200414/http://www.cs.dartmouth.edu/~mckeeman/cs48/mxcom/doc/x86.html
[80x86optimizations]: http://archive.gamedev.net/archive/reference/articles/article369.html


##### 4.3 - Screwing up x86 forward compatibility: 286 reset hacks, A20 Gate (HMA), 286 LOADALL, the Intel 80186 CPU missing link

The previous section covered the basic features of the Intel 80286 CPU, yet that pales in comparison to the amount of custom circuitry that IBM implemented in the PC/AT to work around some of the 286 shortcomings. All of these would forever influence how the x86 architecture evolved, and not in a good way...

Maybe the biggest issue with 286 protected mode is that you were not supposed to be able to return to Real Mode after enabling it, being thus barred from using the BIOS Services or the DOS API. However, someone discovered a workaround that made doing so possible. The workaround consisted of sending a reset signal to the 80286 while preserving the computer state and RAM contents by keeping everything powered on. When the Processor was reset, it would restart in Real Mode, then, as you already know, it would begin to execute code loaded from a known fixed address location, which is where the firmware is mapped. With a specialized firmware, it was possible to create an interface where an application that wanted to use Real Mode could save, to a predefined RAM address, data about the last Processor state and the next address location where it wanted to continue executing code, right before triggering the CPU reset. After the reset, the Processor would begin code execution by loading the firmware, which, in turn, would check very early the contents of the predefined RAM address to figure out whether it was a normal boot or an intentional CPU reset. If there was valid data at that predefined RAM address, it could load the previous state and directly resume code execution at the next specified location, completely bypassing the standard POST and Hardware initialization procedures. This way, it was possible for an application running in 286 protected mode to return to Real Mode. It was a massive hack, but it worked. The method required a lot of preparation to use since, as you also already know, in Real Mode you're limited to a 1 MiB Physical Address Space, which means that the code to be executed in Real Mode had to reside within that addressable MiB. The most fun part about this 286 reset hack is that we have a rather similar interface in our modern days, in the form of the [ACPI S3 Low Power State][acpis3lowpower], which can shut down then restart most of the computer after saving its state, while keeping the system RAM powered to resume where it left off.

IBM doesn't seem to have agreed with Intel's strict vision of protected mode, since it implemented the described 286 reset hack in the IBM PC/AT. The way that it worked was by wiring a GPIO Pin from the new Intel 8042 microcontroller (Whose main purpose was to replace the IBM PC Intel 8255 PPI as Keyboard Controller), known as P20, to the RESET line of the Processor. Software could then operate this line via an I/O Port. An application that entered protected mode could use the 286 reset hack to return to Real Mode, do whatever it wanted to do with the BIOS Services and the DOS API, then enter protected mode again. The PC/AT BIOS itself used the 286 reset hack during POST so that it could use protected mode to test all the physical RAM, then return to Real Mode to boot legacy OSes like PC DOS as usual. Sadly, while the PC/AT 286 reset hack was functional, it was rather slow. An application that had to use it often to invoke the BIOS Services or DOS System Calls would incur a massive performance penalty, something that made reliance on the 286 reset hack rather undesirable. Moreover, the amount of free Conventional Memory was still important for applications that used both Real Mode and protected mode since, as mentioned before, the code and data that relied on Real Mode had to fit in the first addressable MiB (Or, more specifically, the 640 KiB of Conventional Memory). Basically, the reset hack made it possible for DOS applications to use more RAM, but via a dramatically complex method.
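
For illustration purposes, this is more or less what the I/O choreography of the PC/AT 286 reset hack looked like from the software side. The Ports, the 8042 Command FEh (pulse the reset line), the CMOS Shutdown Status Byte (Register 0Fh) and the resume vector at 0040:0067 are the documented PC/AT ones; the function and `resume_point` are hypothetical, and the sketch assumes a 16 Bits DOS compiler like Borland Turbo C (outportb, inportb, poke and FP_SEG/FP_OFF come from its dos.h), compiled in a far code model. A real user would of course be in protected mode by the time it triggers the reset:

```c
#include <dos.h>

void resume_point(void);   /* Real Mode code to continue at (not shown) */

void reset_286_to_real_mode(void)
{
    void far *p = (void far *)resume_point;

    /* Tell the firmware this is an intentional reset: Shutdown Status
     * Byte (CMOS Register 0Fh) = 0Ah means "far jump via 0040:0067
     * without EOI" instead of running the full POST. */
    outportb(0x70, 0x0F);
    outportb(0x71, 0x0A);

    /* Store the resume Segment:Offset in the BIOS Data Area. */
    poke(0x0040, 0x0067, FP_OFF(p));
    poke(0x0040, 0x0069, FP_SEG(p));

    /* Ask the 8042 to pulse the CPU RESET line (Command FEh), waiting
     * first until its input buffer is empty (Status Port 64h, Bit 1). */
    while (inportb(0x64) & 0x02)
        ;
    outportb(0x64, 0xFE);

    for (;;)
        ;   /* The reset takes effect a few cycles later */
}
```
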
Less than a year after the IBM PC/AT was released, a Microsoft developer [discovered][fasterSyscallTrapRedux] and [patented][msos2patents] an alternative way to reset the 80286 CPU that was faster than using the Intel 8042 microcontroller to initiate the reset hack. The new method consisted of basically the same steps that had to be done to preserve and resume the computer state with the previously described reset hack, but instead of resetting the Processor via external means, the reset was achieved by performing a controlled CPU Triple Fault, which was much faster and didn't require additional Hardware at all. This method would eventually be prominently used in the OS/2 Operating System that was jointly developed by IBM and Microsoft, and released at the end of 1987. It was also used in some Kernel versions of Windows intended for the 286, like Windows 3.1 running in Standard Mode.

The CPU Triple Fault is one of the reasons why it was quite hard to [emulate or virtualize OS/2][os2virtualization], as not all emulators or VMM/Hypervisors knew how to deal properly with such an event. Due to the fact that a patent was involved, I doubt that there was any other major commercial software that triggered a CPU Triple Fault to return to Real Mode, since using it might have opened the door to a lawsuit. Moreover, any software targeting the Intel 80386 or later CPUs could switch back and forth between Modes by simply changing a Bit, as Intel ended up making the return to Real Mode from protected mode an official feature of the x86 ISA. It can be said that both the reset hack via the 8042 microcontroller and the CPU Triple Fault are pretty much 80286 specific, and thus very obscure, but they should have been supported for compatibility purposes for a long time, and perhaps they still are.
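
Conceptually, the Triple Fault method is remarkably simple. The sketch below is my own illustration of the general idea (not Microsoft's actual patented code), written with GCC-style inline Assembly purely for readability: load an empty IDT (Interrupt Descriptor Table), then cause any Exception. The CPU fails to invoke the Exception Handler, fails again while handling that failure (a Double Fault), gives up on the third attempt, and shuts down, which the Motherboard logic turns into a reset:

```c
/* Sketch only: assumes the 286 is already in protected mode and that
 * the state-saving preparation (Shutdown Status Byte, resume vector)
 * was done beforehand, exactly as with the 8042 reset method. */
static void triple_fault_reset(void)
{
    /* 6 Bytes pseudo-descriptor with limit = 0 and base = 0: every
     * interrupt vector lookup is now out of bounds. */
    static struct __attribute__((packed)) {
        unsigned short limit;
        unsigned long  base;
    } empty_idt = { 0, 0 };

    __asm__ volatile ("lidt %0" : : "m" (empty_idt));
    __asm__ volatile ("int3");   /* Fault -> Double Fault -> reset */
}
```
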
The other major issue of the 80286 CPU involves backwards compatibility. Though in theory the 80286 was fully backwards compatible with the previous x86 Processors thanks to Real Mode, its behavior wasn't 100% identical to any of them. Both backward and forward compatibility are usually tested and guaranteed only for properly defined and documented behavior, yet sometimes programmers decide to exploit quirks, bugs (Formally known as errata), or undocumented instructions (More on that later...) that are not part of the standardized ISA of a particular Processor series. As such, applications that rely on any of those may fail to work when executed on later Processors that don't behave as expected, as the Processor designer is under no obligation to maintain support for nonstandard or unintentional behavior.

In this case, the 8086/8088 CPUs had a very particular quirk: While thanks to their 20 Bits Address Bus they had a 1 MiB Physical Address Space, the way that their Segmented Memory Model worked made it possible for these Processors to internally address a range of almost 64 KiB above the 1 MiB limit, from 1024 to 1087 KiB. Because the 8086/8088 didn't have the 21st Address line that would be required to externally express that in binary, the missing upper Bit caused any address that the Processor thought was in the 1024 to 1087 KiB range to appear on the external Address Bus as if it were in the 0 to 63 KiB range. This quirk would become known as Address Wraparound. The 286 in Real Mode, instead, considered that range entirely normal and used 21 Address lines. Basically, the 8088 and the 80286 would send via the Address Bus two different addresses for any operation within the mentioned address range, obviously getting two different results. For some reason, a few IBM PC applications relied on the Address Wraparound quirk instead of addressing the 0 to 63 KiB range directly. As the 286 didn't reproduce that behavior even when running in Real Mode, those applications would run fine on the IBM PC with an 8088, but fail if using the newer 80286.

IBM once again stepped in with a workaround for this 286 shortcoming. The PC/AT added some discrete logic infamously known as the A20 Gate, which managed the 80286 A20 Address line (Corresponding to Bit 21 of the CPU Address Bus). The A20 Gate was wired to another spare Pin of the Intel 8042 microcontroller, known as P21, so that software could control it via an I/O Port, like the previously mentioned 286 reset hack to exit protected mode. The default state of the A20 Gate was to force the 80286 A20 Address line to always return 0, so that it could reproduce the 8086/8088 Address Wraparound quirk while in Real Mode, and it could also be configured to let the line function normally for when in Protected Mode. Thanks to this external help, the 80286 achieved better backwards compatibility with the previous x86 Processors than if it had been used as a standalone CPU. Something worth mentioning is that this hack wasn't specific to the IBM PC/AT and clones or compatibles: the NEC PC-98 based computers that used the 80286 CPU also implemented their own version of the A20 Gate hack (Albeit managed from a different I/O Port, so like everything in the NEC PC-98, it was IBM PC like but not IBM PC compatible), and it isn't hard to guess that it was for the exact same reason.
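
Software managed the A20 Gate through the 8042, with the same kind of I/O Port choreography as the reset hack. Here is a minimal sketch, again assuming a 16 Bits DOS compiler like Turbo C; the 8042 Command D1h ("Write Output Port") and the data values are the documented PC/AT ones, though real A20 Handlers were far more paranoid about timeouts and verification:

```c
#include <dos.h>

/* Wait until the 8042 input buffer is empty (Status Port 64h, Bit 1),
 * so it is safe to send the next Command or data Byte. */
static void kbc_wait(void)
{
    while (inportb(0x64) & 0x02)
        ;
}

/* Bit 1 of the 8042 Output Port is the P21 Pin that drives the A20
 * Gate: DFh = A20 line working normally, DDh = A20 forced to 0
 * (Address Wraparound). Both values keep Bit 0, the CPU RESET line,
 * in its normal state. */
void a20_set(int enable)
{
    kbc_wait();
    outportb(0x64, 0xD1);            /* Command: Write Output Port */
    kbc_wait();
    outportb(0x60, enable ? 0xDF : 0xDD);
}
```
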
The OS/2 Museum site did a long series of articles hunting for IBM PC applications that relied on the 8088 Address Wraparound quirk, identifying the [CP/M compatible CALL 5 System Call][addressWraparound] implemented in PC DOS/MS-DOS and any application that used it, like [MicroPro WordStar][witnessAgainstWordstar], and [anything compiled with IBM Pascal 1.0][a20WasntWordstar]. There were also post-IBM PC/AT offenders found, like [the spellchecker tool][call5Spell] of early versions of Microsoft Word, as it used the CP/M CALL 5 interface (This one was [actually discovered by PCjs][aSpellOnMe]), and any executable compressed with [Microsoft EXEPACK][exepackAndA20]. However, being released after the IBM PC/AT, these didn't influence its design at all and thus are irrelevant here, yet after users and developers forgot the intricate details of the A20 Gate, it was possible for some applications to fail for no apparent reason, because not everyone was taking the A20 Gate state into account.

A major omission of the A20 Gate hack is that IBM didn't implement a BIOS Service to standardize how to manage the A20 Gate for things like enabling it, disabling it, or querying its current status (It was possible for this to be a classic BIOS Service, since it could be called from Real Mode both before entering and after exiting protected mode). This would become a problem several years later, as all PC/AT compatibles had something that could do what the A20 Gate did, but not everyone managed it in a PC/AT compatible way; thus, with the lack of a BIOS Service, there was no last resort HAL to manage the A20 Gate. Later software Drivers whose purpose was to manage the A20 Gate, known as A20 Handlers, had to become unnecessarily bloated and complex because they had to support all the different [custom implementations][a20handlers]. Further adding to the complexity is that back then there was no standardized function to ask the system which computer model it was, so the only way for an A20 Handler to automatically get things working was by running a sequence of tests that tried to find the idiosyncrasies of a specific computer model, before it could decide which code path could be used to manage the A20 Gate in that particular system (Alternatively, an advanced user could manually configure the A20 Handler, if it allowed the user to do so).

While the A20 Gate was intended to be used to maintain full backwards compatibility with the IBM PC 8088 behavior when the 80286 was in Real Mode, it could be managed completely independently of the Processor mode. Several years after the IBM PC/AT debut, OSes like Windows/286 and PC DOS 5.0/MS-DOS 5.0 would include A20 Handlers that intentionally made use of the standard behavior of the 80286 A20 line while in Real Mode to get access to the extra 64 KiB memory range, assuming that there was RAM mapped there for it to be useful. The 1024 to 1087 KiB Address Range would become known as the HMA (High Memory Area). The HMA was useful because the memory there could be addressed directly from within Real Mode, without needing to go through protected mode at all. In the particular case of PC DOS/MS-DOS, if the HMA was available, they could move into it data from the OS itself that traditionally used valuable Conventional Memory, and since this was done in a completely transparent way, as the OS took care of enabling and disabling the A20 Gate every time that it had to access its own data located in the HMA (I suppose at a performance penalty due to the extra overhead), any regular Real Mode application could benefit from the extra free Conventional Memory without needing to directly support the HMA or anything else.
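
The classic auto-detection test that A20 Handlers used is worth showing, as it demonstrates the quirk itself: the Segment:Offset pair FFFF:0510 points at Physical Address 100500h, which is exactly where 0000:0500 (Physical Address 000500h) reappears when the A20 line is forced to 0. A sketch of mine, assuming a 16 Bits DOS compiler like Turbo C (peekb/pokeb from its dos.h read and write a Byte at an arbitrary Segment:Offset):

```c
#include <dos.h>

/* Returns 1 if the A20 line is enabled (no Address Wraparound).
 * Writes different Bytes at 0000:0500 and FFFF:0510; if the two
 * locations alias each other, the second write clobbers the first. */
int a20_is_enabled(void)
{
    unsigned char low  = peekb(0x0000, 0x0500);  /* Save old contents */
    unsigned char high = peekb(0xFFFF, 0x0510);
    int enabled;

    pokeb(0x0000, 0x0500, 0x00);
    pokeb(0xFFFF, 0x0510, 0xFF);
    enabled = (peekb(0x0000, 0x0500) == 0x00);

    pokeb(0x0000, 0x0500, low);                  /* Restore them */
    pokeb(0xFFFF, 0x0510, high);
    return enabled;
}
```
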
A thing that still puzzles me is that I don't recall ever hearing about how upgrade cards for the IBM PC were supposed to deal with the Address Wraparound quirk, as any 80286 or 80386 based card (Like the Intel InBoard 386/PC) should have faced the exact same Processor related problems as the PC/AT. It was certainly possible to include the A20 Gate functionality in the upgrade card itself, and even to map the A20 Gate controls to the same I/O Port as in the IBM PC/AT, for compatibility with any software that managed it. Sadly, I don't know details about specific implementations. I would not be surprised if in some cases, even when a manageable A20 Gate was present, the implementation was not compatible with that of the IBM PC/AT and thus failed to run with some software. In a worst case scenario, the A20 Gate could either use a Jumper or be completely hardwired, which would make dynamic switching via software impossible. The only thing I'm certain of is that I doubt that the upgrade cards used an Intel 8042 microcontroller to manage the A20 Gate, since it would have been expensive and pointlessly overkill to include it just for that, so if upgrade cards did include the A20 Gate hack, they likely used something else to control it. I'm not sure how they were supposed to deal with the 286 reset hack either, albeit the Triple Fault method should still have been possible.

By this point, you get the idea that the story of the 80286 CPU is about an incomplete Processor whose shortcomings had to be worked around with hacks. Keep in mind that the 80286 was released in 1982, one year after the IBM PC and two before the PC/AT, so it should have been designed well before Intel had any chance to realize what its long term impact could be. The sad truth is that Intel didn't have a solid plan for the evolution of the x86 architecture; the whole product line was supposed to be an afterthought Plan B of sorts that acted as a filler while Intel finished what it believed to be its next flagship product, the iAPX 432 Processor architecture. However, when the iAPX 432 finally became available, it catastrophically failed in the market. As such, Intel's vision of the world would completely change when it became aware that it relied on the newfound success of its x86 Processor line to stay relevant. Basically, the main reason why Intel began to take x86 seriously was because at that moment it had no other viable alternative, and it had the success of the IBM PC and PC/AT platforms to thank for that opportunity.
When the 80286 was deployed in new computer platforms like the IBM PC/AT, Intel should have quickly noticed that some of its big customers of x86 Processors (At least IBM and NEC, not sure how many others did the same) were going out of their way to make the new 286 do things that it wasn't supposed to do, be it either to improve backwards compatibility with the previous x86 Processors (The A20 Gate hack), or to add features that Intel didn't care about (The 286 reset hack to return to Real Mode from protected mode). Around that time, Intel must have realized that the success of the x86 architecture depended on it being able to meet the future needs of any successors of the current platforms, and most importantly, of the software that would run on them, which by definition included all the preexisting software. Having learned from its mistakes, Intel's efforts culminated in it doing a much better job when it designed the 80386, which fixed most of the 286 weak points and synchronized the evolution of the x86 architecture with the needs of the software. The 386 would not only be an absolute success as a Processor, it would also be good enough to solidify the basics of the x86 ISA for almost two decades, thanks to the much better long term planning. However, by then it was already too late, as the dark ages of the 80286 had already done a lot of irreparable damage to the x86 ISA as a whole. The legacy of the 80286 CPU is the cause of both broken backwards compatibility and broken forward compatibility, all of which had to be taken into account in future generations of x86 Processors.

The issues of the 80286 were caused not only by Intel's own short-sightedness when designing it; the idiosyncrasies of the software developers of the era had a lot to do with the prevalence of those issues, too. Back in the early days, software developers weren't shy about using, even in commercial software, undocumented instructions (LOADALL, more about that one soon), quirks (The already covered 8086/8088 Address Wraparound quirk, the reason why the A20 Gate exists), Bits marked as reserved in the Processor Registers or other types of data structures (The IBM PC XENIX 1.0 made by Microsoft is a [good example][fwdCompatLandmines] of this one), or any other function of a Processor that was not part of its standard ISA. How developers found out about all these things is a rather interesting topic in itself, yet the point is that after a particular discovery had been made public, there were high chances that its use would proliferate. If another developer thought that the discovered nonstandard behavior was somehow useful, they could carelessly use it in their applications, completely disregarding that there was no guarantee that the behavior would be the same in the next generation of Processors. Thus, if a newer Processor didn't implement whatever nonstandard behavior or quirk a previous one had, there was a high risk that, as soon as someone discovered that a popular application didn't work with the new Processor, the selling point of backwards compatibility would go down the drain, with all the economic consequences associated with having a less marketable product. This was precisely the case with the 80286 and the Address Wraparound quirk, and why IBM made its own effort to work around it.
Many software developers were at fault regarding the compatibility issues, since abusing non-standardized behavior just because it worked fine at that point in time made a clean forward compatibility strategy outright impossible to implement for both Intel and IBM. At the time when there were only IBM branded PCs with 8088s or PC/ATs with 80286s and direct clones of them, being bound to their idiosyncrasies didn't really matter, as all the computers based on these platforms were functionally identical to each other (If one wasn't, then it was either a bad clone or a mere PC-like), but as the parts used to build IBM PC or PC/AT compatible computers began to massively diversify, applications that relied on nonstandard Hardware behavior became a major compatibility problem as the subtle differences in "100% compatible" parts were discovered (An interesting example is [IBM's own CGA Video Card][8088mph], as not all the CRTC chips that could be used for it behaved exactly the same as the Motorola MC6845). It is as if no developer thought that there could be newer x86 Processors that didn't include the undocumented functions or quirks of the older ones, thus not behaving in the same way as expected of an 8088, 80286, or whatever other Processor the unintended behavior was first seen on. If this situation sounds familiar, it's because it is: it's pretty much identical to what happened with applications (Mainly games) that were tuned for the exact performance of the 8088 @ 4.77 MHz of the IBM PC, as described in the Turbo XT section.

All these issues were not exclusive to the software ecosystem of the IBM PC and PC/AT platforms; they also happened on other contemporaneous platforms. For example, the Apple Macintosh platform had a major transition period when the Motorola 68000 CPU line extended its addressing capabilities from 24 to 32 Bits. The Motorola 68000 CPU that the early Macintoshes used had Address Registers 32 Bits in size, but only 24 of those Bits were computed, as the Processor ignored the upper 8. Early Macintosh software stored flags in these unused 8 Bits (Quite similar to what IBM PC XENIX 1.0 did with the reserved Bits of the Descriptor Tables of the 80286 MMU). The newer Motorola 68020 extended addressing to 32 Bits by actually computing the upper 8 Bits, causing a lot of applications that used the Address Registers the wrong way to implode when executed on later Macintoshes equipped with the new Processor (Note that Apple implemented a translation layer in its OSes to maintain backwards compatibility). At least on the x86 side, from the P5 Pentium onwards, Intel decided that instead of simply discouraging the use of reserved Bits, software developers would be entirely sandboxed, as any attempt to write to something that they were not supposed to access would cause a GPF (General Protection Fault), forever fixing that sort of forward compatibility issue.

While making patches for the affected software was technically possible (After all, it was the developer's fault for using Hardware functions that it shouldn't have), it would have been a logistical nightmare to distribute them among affected users, since the only mainstream way to do so was by mailing Diskettes, incurring material and shipping costs.
It's not like today, when a software developer can release a half broken piece of software, then fix it via patches that anyone can download via the Internet at a negligible cost. A decade later, it would become common for computer magazines to include CDs with tons of utilities, patches for games, applications and such, but when the PC/AT was released, that wasn't an option either. The only viable way to mitigate the compatibility issues was to deliberately design new Hardware that was fully backwards compatible with the old software. Thus, what Intel learned the hard way, and a bit later than IBM (Thanks to the IBM PCjr fiasco, and perhaps also because IBM could notice first hand how 100% IBM PC compatibility was the supreme goal of clone manufacturers), is that backwards compatibility was much more than just reproducing the documented and standardized behavior of something; it had to reproduce the undocumented and unintentional behavior, too. What was broken in previous Processors had to remain broken in the newer ones. As soon as Intel took backwards compatibility as seriously as its main customers did, it had to make sure that the unintended behavior of any new Processor matched that of the old ones, just to support software that blatantly ignored the recommended programming practices. But this alone was not enough, either. The workarounds and hacks that Intel's customers implemented for its Processors' quirks, like those that IBM included in the PC/AT due to the halfway attempt of the 80286 CPU at backwards compatibility, produced their own quirks, so Intel had to support them, too.

For example, the A20 Gate is an excellent case of a customer introduced workaround that Intel would end up having to accept as part of its x86 architecture. Thanks to the previously mentioned HMA, Intel couldn't simply go back and fix Real Mode in newer Processors by always using the 8086/8088 style Address Wraparound, as now it had to consider that there was mainstream software that depended on the normal behavior of the 80286 A20 Address line while in Real Mode to address the extra 64 KiB used for the HMA. The only way to maintain full backwards compatibility was to support two different modes for the A20 Address line that could be changed on the fly: Address Wraparound for 8086/8088 compatibility, and treating the A20 line as normal for 80286 HMA compatibility. Intel eventually included support for that hack directly in its Processors beginning with the 80486 CPU, which had a Pin, A20M, that was exclusively used to change the Processor's internal behavior of the A20 Address line. This is a bit amusing when you consider that the 80486 as a standalone CPU was more compatible with the 8088 than an earlier successor, the 80286, which was released back when 8088 compatibility was much more important. Hardware support for the A20 Gate hack was still present as late as 2014 with Intel Haswell, albeit no longer in physical form with a dedicated Pin.

The fact that a particular behavior or function of a Hardware part is not documented doesn't necessarily mean that the manufacturer is not aware of its existence. At times, not providing public documentation about something can be intentional, since it may be meant to be a confidential feature that gives an advantage to privileged developers.
For example, the 80286 CPU had an infamous undocumented instruction, LOADALL, that could be used to directly write values to the Processor Registers in a way that bypassed standard validity checks. If used properly, LOADALL could bend the 286's own official rules to do useful things that were otherwise impossible (Ironically, not even LOADALL could help the 80286 return to Real Mode from protected mode). While LOADALL was not officially documented, Intel did have [internal documentation][loadall] about it, which it provided to several privileged developers in a document known as Undocumented iAPX 286 Test Instruction. One of those developers was Digital Research, which used it for the development of a new OS, Concurrent DOS 286, a direct descendant of CP/M. With the help of LOADALL, Concurrent DOS 286 could use 286 protected mode to emulate an 8088 CPU, making it able to multitask some PC DOS/MS-DOS applications.

Even when there is internal documentation about publicly undisclosed features, their behavior may still not have been standardized and is thus subject to change. Sometimes, behavior changes can happen even between different Steppings (Minor or Major Revisions) of the same Processor design. That was precisely the roadblock that sealed the fate of [Concurrent DOS 286][concurrentDOS]. Digital Research developed that OS using early Steppings of the 80286, where its 8088 emulation trick worked as intended. For some reason, Intel decided to change the LOADALL behavior in the production Steppings of the 80286, breaking the emulation capabilities of the preview version of Concurrent DOS 286. Digital Research delayed the widespread release of its new OS while it argued with Intel about the LOADALL behavior change. Eventually, Intel released the E2 Stepping of the 80286 with a LOADALL version that allowed Concurrent DOS 286 to perform the emulation trick as intended again. However, the whole affair caused the final version of Concurrent DOS 286 to arrive a bit too late to the OS market to make any meaningful impact, besides having much more specific Hardware requirements, as the emulation feature relied on a particular Processor Stepping. I don't know if the E2 Stepping LOADALL behaves like the early Steppings that Digital Research said worked with Concurrent DOS 286, or if it is a different, third variation, with the second variation being the broken one. The point is, not even something that was intentionally designed and internally documented had its behavior set in stone, and reliance on it harmed a privileged developer. Such is the unpredictable nature of undocumented things...

Another privileged software developer was Microsoft, which also got documentation from Intel about the 80286 LOADALL instruction. Whereas Digital Research used LOADALL to achieve 8088 emulation, Microsoft was perhaps the first to use LOADALL to address the entire 16 MiB Physical Address Space from within Real Mode, completely skipping protected mode (Amusingly, Intel documented both use cases in its [LOADALL documentation][loadallRealMode]). This trick would informally become known as Unreal Mode.
As LOADALL could access the memory above the 1088 KiB boundary without having to enter protected mode, nor needing to exit it via the 286 reset hacks (The A20 Gate still had to be enabled and disabled on demand), it was the fastest way for a Real Mode application to use more memory. During the time period when 286 CPUs were mainstream (Late 80's to early 90's), Microsoft often used Unreal Mode in some of its PC DOS/MS-DOS Drivers. By that point in time, Unreal Mode had already evolved further with the newer Intel 80386 CPU, as there was no need to use an undocumented instruction to enable it, and Unreal Mode itself was extended to access from within Real Mode the 4 GiB (2^32) Physical Memory Address Space of the 386 instead of only the 16 MiB (2^24) of the 286. Chances are that Intel had to keep supporting Unreal Mode going forward because its usage became widespread enough that, as with the A20 Gate, it was forced to either adopt it as part of the x86 ISA or risk imploding backwards compatibility. Decades later, Unreal Mode is still missing from the official x86 ISA documentation, yet Intel acknowledges it as "Big Real Mode" in other specifications that it helped to develop, like the PMMS (POST Memory Manager Specification), somewhat solidifying its status as a permanent component of the x86 ISA. This left Unreal Mode as yet another legacy of the 80286 days that still remains even in modern x86 Processors.

While typically a newer Processor has all the instructions of a previous one, the undocumented LOADALL would remain specific to the 80286 ISA. As a lot of important software used it, platforms based on later Processors had to deal with the removal of LOADALL by having the firmware trap and emulate it to maintain backwards compatibility. Since LOADALL was extremely unique in nature, fully emulating it was pretty much impossible, yet the emulation techniques seem to have been good enough for mainstream use cases. An interesting detail is that the Intel 80386 CPU included a different version of LOADALL that is similar in purpose, but not in behavior, to the 286 LOADALL. The 386 LOADALL wasn't as interesting as the 286 one because there weren't a lot of useful things that it could do that weren't achievable via more standard means, including enabling Unreal Mode. Still, something that made the 386 LOADALL useful is that it could actually be used to [fully emulate the 286 LOADALL][loadall386], a method that at least Compaq used for the firmware of its 386 based PC/AT compatible platforms. Regardless, it's possible that there is software using the 286 LOADALL that fails to work on any platform that either isn't a 286 or doesn't use the emulation method based on the 386 LOADALL.

Finally, to end this extremely long section, I'm sure that you may be wondering why there are almost no mentions of the Intel 80186 and 80188 CPUs, as if everyone had simply opted to skip these Processors even when they were the direct successors of the 8086 and 8088. A little known detail about them is that they were not standalone CPUs (Nor was the 80286, for that matter, since it had an integrated MMU. In other chip set families of the era, the MMU was a discrete Co-processor); instead, they were closer to a modern SoC (System-on-Chip).
Besides the CPU, they also included an integrated Clock Generator, PIC, DMAC and PIT. At first glance, this seems wonderful, but only until you notice that the integrated functions were not 100% compatible with the discrete parts that were used in the IBM PC, like the 8259A PIC and the 8237A DMAC. Thus, designing an IBM PC compatible computer based around an 80186/80188 was harder than a standard clone based on either the PC 8088 or the PC/AT 80286. Given the almost complete lack of IBM PC compatible computers that used the 80186/80188 Processors, I'm not sure if it was possible to somehow disable or bypass the integrated functions and wire the standard discrete support chips to them as usual, or if that was a lost cause. I recall hearing about at least one 80188 computer that claimed to be IBM PC compatible, but I have absolutely no idea how good it was in practice. There was also an upgrade card with an 80186. Regardless, the reason why almost everyone skipped the 80186/80188 Processors is that they were at a horrible middle point, as you couldn't make a 100% IBM PC compatible with them, nor could you make an IBM PC/AT clone.

Regarding the capabilities of the 80186/80188 CPUs themselves, they were rather decent. The difference between the 80186 and the 80188 is the same as between the 8086 and the 8088: the former had a 16 Bits external Data Bus while the latter had an 8 Bits one. Both had a 20 Bits Address Bus, so they were limited to a 1 MiB Physical Address Space like the 8086/8088. The execution unit was vastly improved over the 8086/8088, with an average IPC more than halfway between them and the 80286 CPU. Actually, had 80186/80188 based IBM PC compatible computers been viable, the average IPC of the 80286 wouldn't appear as ridiculously high as it does, since people wouldn't be directly comparing it to the 8088. The 80186/80188 introduced new x86 instructions that were included in all later x86 Processors like the 80286, but also in the NEC V20/V30. They had no support for protected mode or anything related to it, like the MMU and Virtual Memory, albeit at least Siemens, in its PC-like PC-X and PC-D platforms, coupled an 80186 with an external MMU precisely in order to use Virtual Memory (I failed to find info about which MMU chip these used and how it was interfaced with the 80186, as it stands out as a major oddity). While pretty much unused in the world of the IBM PC platform, these SoCs were supposedly rather successful in the embedded market.

The 80186/80188 generation's biggest legacy in the x86 ISA (Instruction Set Architecture) was the introduction of a proper method to handle invalid opcodes. An opcode is a single instruction of machine code, which is what the Processor actually executes and what tells it what to do. Opcodes can be displayed in a human readable format as Assembler programming language instructions, which are highly specific to a particular ISA and have a direct 1:1 translation to machine code. As an opcode encodes both the instruction and the registers involved, a single Assembler instruction can be translated into several different opcodes depending on the registers involved in the operation. Also, a single complex instruction can produce multiple opcodes as part of a single operation.
While any Assembler code to machine code translation should cover all the opcodes available in the proper, valid ISA, as otherwise the Assembler tool would reject the invalid input, it is still possible to generate opcodes that don't correspond to any Assembler instruction by manually inputting hexadecimal values. An invalid opcode can be considered worse than an undocumented instruction (Or single opcode) because the latter is supposed to have a defined function that isn't acknowledged by the manufacturer, whereas an invalid opcode is undefined by nature, albeit its behavior should be reproducible on the same Processor Stepping. Depending on the Processor, invalid opcodes can produce a crash, do nothing, be aliases of other documented opcodes, or do something actually useful that can't be done with documented opcodes (In that case, they may be better described as undocumented opcodes rather than merely invalid ones). You can read some interesting information pertaining to invalid and [undocumented opcodes][undocOpCodes] (Note that it mentions other Processors besides x86).

Whenever the 8086/8088 CPUs encountered an invalid opcode, they tried to interpret it anyway, since there was nothing forbidding the execution unit from doing so. Because they're undefined by nature, the only way to know what an invalid opcode does is via trial-and-error while watching the Processor state before and after its execution, something that someone already [bothered to do][undocOpCodes8086]. Some of these opcodes were reused in later x86 Processor generations, so software relying on them would, as you'd expect, not work on later Processors, once again breaking forward compatibility. In the same scenario, the 80186/80188 instead triggered an Invalid Opcode Exception, allowing an Exception Handler to take over execution and do things like emulating the invalid opcode. This also forced programmers not to do things that they weren't supposed to do in the first place, which ironically is pretty much the basis of forward compatibility. Basically, if you don't want programmers using some obscure Processor function that could cause trouble later on, don't even give them the chance to use it at all.
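
To illustrate how much nicer that behavior is for software, on anything from the 80186/80188 generation onwards an invalid opcode raises Exception 6, whose vector can be hooked like any other interrupt. A minimal sketch for a 16 Bits DOS compiler like Turbo C (getvect/setvect and the interrupt keyword are Turbo C specific); a serious handler would decode the faulting opcode and emulate it, or fix up the saved CS:IP, which is well beyond a short example:

```c
#include <dos.h>
#include <stdio.h>
#include <stdlib.h>

static void interrupt (*old_vector)(void);

/* Exception 6 pushes the CS:IP of the offending instruction itself,
 * so plainly returning would re-execute it and fault forever. This
 * sketch just reports and aborts (calling printf/exit from here is
 * not strictly safe, but acceptable for a demonstration). */
static void interrupt invalid_opcode_handler(void)
{
    printf("Invalid opcode trapped, aborting\n");
    exit(1);
}

int main(void)
{
    old_vector = getvect(0x06);
    setvect(0x06, invalid_opcode_handler);

    /* ... run code that may hit an invalid opcode ... */

    setvect(0x06, old_vector);   /* Restore on the way out */
    return 0;
}
```
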
[acpis3lowpower]: https://github.com/coreboot/coreboot/blob/master/Documentation/AMD-S3.txt
[fasterSyscallTrapRedux]: https://blogs.msdn.microsoft.com/larryosterman/2005/02/08/faster-syscall-trap-redux/
[msos2patents]: http://www.os2museum.com/wp/ms-os2-patents/
[os2virtualization]: http://www.os2museum.com/wp/why-os2-is-hard-to-virtualize/
[addressWraparound]: http://www.os2museum.com/wp/who-needs-the-address-wraparound-anyway/
[witnessAgainstWordstar]: http://www.os2museum.com/wp/another-witness-against-wordstar/
[a20WasntWordstar]: http://www.os2museum.com/wp/the-a20-gate-it-wasnt-wordstar/
[call5Spell]: http://www.os2museum.com/wp/a-word-on-the-call-5-spell/
[aSpellOnMe]: https://www.pcjs.org/blog/2018/05/27/
[exepackAndA20]: http://www.os2museum.com/wp/exepack-and-the-a20-gate/
[a20handlers]: https://jeffpar.github.io/kbarchive/kb/072/Q72302/
[fwdCompatLandmines]: http://www.os2museum.com/wp/forward-compatibility-landmines/
[8088mph]: https://scalibq.wordpress.com/2015/08/02/8088-mph-the-final-version/
[loadall]: https://www.pcjs.org/documents/manuals/intel/80286/loadall/
[concurrentDOS]: https://books.google.com/books?id=_y4EAAAAMBAJ&lpg=PP1&pg=PA21#v=onepage&q&f=false
[loadallRealMode]: https://www.pcjs.org/documents/manuals/intel/80286/real_mode/#discrepancies-from-an-iapx-86-88-using-emulation
[loadall386]: http://www.rcollins.org/articles/loadall/tspec_a3_doc.html
[undocOpCodes]: https://retrocomputing.stackexchange.com/questions/1591/use-of-undocumented-opcodes
[undocOpCodes8086]: http://www.os2museum.com/wp/undocumented-8086-opcodes-part-i/


##### 4.4 - Cascading two Intel 8259A PICs and two Intel 8237A DMACs, the Intel 8254 PIT, the Motorola MC146818 RTC, and the Intel 8042 Microcontroller

The IBM PC/AT cast of support chips received a massive upgrade. It added a second set of 8259A PIC and 8237A DMAC chips that provided additional IRQs and DMA Channels, thus now having two of each. The 8253 PIT was upgraded to an 8254 PIT, which has a few more functions but is otherwise backwards compatible. A Motorola MC146818 RTC was added as a core piece of the platform, so that with proper OS support, you didn't need to input the Date and Time at OS startup ever again. Finally, the 8255 PPI was replaced by a full blown microcontroller, the Intel 8042, whose main job was that of a Keyboard Controller, but which actually performed a multitude of roles.

While none of these chips are present in their discrete physical forms any longer, most of their functionality is still found in all modern x86 based computers. The Chipset absorbed the functionality of all the support chips except the 8042 microcontroller, which stuck around as an independent chip for many more years until being replaced by a newer class of Embedded Controllers, which were later integrated into the Super I/O chips (I think that a few Chipsets existed that included 8042 functionality, too, but that was not usual). Everything else would change, yet compatibility with the PC/AT support chips is a given even now, making them the IBM PC/AT's greatest legacy.

**2x Intel 8259A PICs (Programmable Interrupt Controller):** The 8259A PIC could easily be cascaded with one or more slave 8259As, each wired to an IRQ line of the primary one.
While with two 8259A PICs like in the IBM PC/AT you had 16 total IRQs, due to the cascading, the usable amount was 15. Interrupt Priority would depend on how the PICs were wired.

In the IBM PC/AT, the IRQ 2 line of the primary 8259A PIC was repurposed to cascade the slave PIC, which provided 8 IRQs of its own. These IRQs would become known as IRQs 8-15. If you remember correctly, IRQ 2 was directly exposed in the PC and PC/XT I/O Channel Slots, but that wasn't possible anymore in the PC/AT due to it having been repurposed. Instead, the IRQ2 Pin of the I/O Channel Slots was wired to the second PIC, at IRQ 9. For backwards compatibility purposes, the PC/AT BIOS firmware transparently rerouted Interrupt Requests from IRQ 9 so that they appeared to software as if they were made from IRQ 2. Being cascaded from IRQ 2 also impacted Interrupt Priority, as from the main PIC's point of view, any incoming interrupt from the slave PIC at IRQ 2 would have higher priority than those of IRQs 3-7. The slave PIC itself had the standard 8259A Interrupt Priority, with IRQ 8 being the highest and IRQ 15 the lowest. Thus, Interrupt Priority in the IBM PC/AT would be non-contiguous, with IRQs 0-1 being followed by 8-15, then 3-7. I suppose that this was considered a more balanced approach than wiring the slave PIC to IRQ 7 of the primary one.

Of the new 8 IRQs, two were for internal Motherboard use only: IRQs 8 and 13. IRQ 8 was used by the new RTC, whereas IRQ 13 was used by the Intel 80287 FPU. The FPU case is interesting, as in the IBM PC, the 8087 FPU was wired to the 8088 NMI line (Along with the error reporting from the Memory Parity subsystem) instead of a standard IRQ as Intel recommended. For backwards compatibility reasons, IBM decided to give IRQ 13 the same treatment that it gave IRQ 9, as the firmware rerouted Interrupt Requests from IRQ 13 as NMI. Something that I found quite weird is that the 80287 FPU does not generate Interrupts directly like the 8087 FPU did; instead, the 82288 Bus Controller seems to do so on its behalf with its INTA line, which passes through some glue logic before getting into the slave 8259A PIC at IRQ 13. The remaining 6 IRQs, namely IRQs 10-12, 14 and 15, were exposed in the new section of the longer expansion slots, so only new cards could use them (This does not apply to IRQ 9, as it is used as a direct IRQ2 Pin replacement, and is thus found in the first section of the slots). IRQ 14 was pretty much reserved for HDC (Hard Disk Controller) class cards only.
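
The cascading had a small but permanent consequence for software: using any of the new IRQs means dealing with both PICs. A sketch of mine of what that looks like for IRQ 10, again in Turbo C style; the I/O Ports are the documented PC/AT ones (master PIC mask Register at Port 21h, slave at Port A1h, command Ports at 20h/A0h):

```c
#include <dos.h>

/* IRQ 10 is the third input of the slave PIC (10 - 8 = Bit 2), but
 * it can only reach the CPU if the cascade input, IRQ 2 (Bit 2 of
 * the master mask), is unmasked as well. A mask Bit set to 1 means
 * the IRQ is blocked, so both Bits must be cleared. */
void unmask_irq10(void)
{
    outportb(0xA1, inportb(0xA1) & ~0x04);  /* Slave: unmask IRQ 10 */
    outportb(0x21, inportb(0x21) & ~0x04);  /* Master: unmask IRQ 2 */
}

/* At the end of an IRQ 10 Interrupt Handler, an EOI (End Of
 * Interrupt) has to be sent to BOTH chips, slave first. */
void eoi_irq10(void)
{
    outportb(0xA0, 0x20);
    outportb(0x20, 0x20);
}
```
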
**2x Intel 8237A DMAC (Direct Memory Access Controller):** Like the PIC, the 8237A DMAC supported using multiple units in the same system. It was possible to wire one or more slave DMACs to a primary one, each wired to a DMA Channel, either cascaded or daisy chained. This also means that for each DMAC, you added 4 DMA Channels but lost one Channel of the chip upstream of that DMAC.

A major limitation is that as DMA bypasses the CPU, it also bypasses its MMU. As such, DMA transactions involved Physical Addresses only, as it was impossible for a Device to be aware of the new Virtual Memory scheme, or for the CPU to be aware of the independent DMA transactions. This wasn't important at a time when PC DOS reigned supreme, since everything used Physical Addresses, but eventually it would become a pain for protected mode OSes, as they required Bounce Buffers to move data from the Physical Addresses that the DMA would see to its final destination (This is what the IOMMU would solve 3 decades later, not by making Devices themselves aware of Virtual Memory, but by giving an OS a means to transparently intercept and remap Device DMA directly to Virtual Addresses).

In the IBM PC/AT, the DMAC cascading was done in the reverse way to that of the PICs. Whereas the new PIC was added as a slave to the existing one, the new DMAC was inserted upstream of the preexisting DMAC, thus making the latter a slave. This master-slave relationship is in truth just a matter of point of view at a platform level since, as numbered by IBM, DMA Channels 0-3 came from the slave DMAC and 4-7 from the master one. I don't know whether it could have been possible to do it the other way around (Add the new DMA Channels with a cascaded slave controller, like the PICs), or to just renumber them, as at least that would have made a bit more sense. The discrete logic of the 4 Bits Page Register, which already extended the native 16 Bits Address Bus of the 8237A DMAC to 20 Bits, was further extended to 8 Bits to support all 24 Bits of the I/O Channel Address Bus, so it was possible to perform DMA to any address of the 80286's 16 MiB Physical Memory Address Space.

A surprising peculiarity is that the Address Bus of the master DMAC was wired differently than the slave's. Whereas the slave DMAC was wired roughly the same way as in the IBM PC, which means being directly wired to the lower 16 Bits of the Address Bus (Pins A0 to A15) and letting the Page Registers handle the rest, the master DMAC's 16 Address lines were offset by one Bit, as they weren't wired to the lowest Bit of the Address Bus and instead covered one higher Address line (Pins A1 to A16), with one Bit of the Page Registers being ignored. As the 8237A DMAC could do flyby operations that didn't require the DMAC itself to handle the data, just to set up the transfer, this gave the master DMAC the ability to do transfers of 16 Bits Words instead of standard 8 Bits Bytes, even though the DMAC was otherwise designed with 8 Bits Data Buses in mind. For the scheme to work, 16 Bits transfers required the Bytes to be aligned at even address boundaries, and a single transfer couldn't cross a 128 KiB boundary (0-128, 128-256, etc.). As it worked with Words, a 16 Bits block transfer could do a maximum of 128 KiB (65536 Words) in one go instead of 64 KiB.

Regarding the DMA Channels themselves, of the 4 DMA Channels of the master DMAC, the only one used was DMA Channel 4, to cascade the slave DMAC. Since from the 8237A's point of view DMA Channel 4 is its first channel, this also means that the slave DMAC channels had higher priority. The other 3 channels, known as DMA Channels 5-7, were exposed in the second section of the new expansion slots, and did only 16 Bits Word sized transfers. The slave DMAC still provided DMA Channels 0-3, but with a notable change: Due to the IBM PC/AT having new dedicated logic to do DRAM refresh, there was no need to waste a DMA Channel on that. This freed up DMA Channel 0, which was also exposed in the second section of the slots. Meanwhile, the B19 Pin, which previously was wired to the DACK0 of DMA Channel 0 to provide DRAM refresh for Memory expansion cards, now exposed the REFRESH signal from the Memory Controller's dedicated refresh subsystem, so from a Memory expansion card perspective, nothing changed.
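
The offset wiring of the master DMAC described above is easier to digest as arithmetic. A small helper of my own to illustrate it: the 16 Bits Channels count in Words (the base address is shifted left by one), the Page Register supplies Address lines A17-A23, and its lowest Bit does nothing:

```c
/* Physical Byte address that a 16 Bits DMA Channel (5-7) actually
 * drives on the Bus, given its Page Register and 16 Bits base
 * address Register values, per the PC/AT wiring described above. */
unsigned long dma16_physical_address(unsigned char page, unsigned int base)
{
    return ((unsigned long)(page & 0xFE) << 16)  /* A17-A23, Bit 0 ignored */
         | ((unsigned long)base << 1);           /* A1-A16: Words, even    */
}

/* Example: page = 02h, base = 8000h gives 020000h + 010000h =
 * 030000h, always an even address within a single 128 KiB block. */
```
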
Regarding the DMA Channels themselves, of the 4 DMA Channels of the master DMAC, the only one used was DMA Channel 4, to cascade the slave DMAC. Since from the 8237A point of view DMA Channel 4 is its first channel, this also means that the slave DMAC channels had higher priority. The other 3 channels, known as DMA Channels 5-7, were exposed in the second section of the new expansion slots, and did only 16 Bits Word sized transfers. The slave DMAC still provided DMA Channels 0-3, but with a notable change: because the IBM PC/AT had new dedicated logic to do DRAM refresh, there was no need to waste a DMA Channel on that. This freed up DMA Channel 0, which was also exposed in the second section of the slots. Meanwhile, the B19 Pin, which previously was wired to DACK0 of DMA Channel 0 to provide DRAM refresh for Memory expansion cards, now exposed the REFRESH signal from the Memory Controller's dedicated refresh subsystem, so from a Memory expansion card perspective, nothing changed.

Having both DMA Channels 0 and 1 available theoretically made it possible to do the famed memory-to-memory transfers supported by the 8237A, albeit they were supposed to be impractical. The master DMAC would still never be able to do them, for two reasons: first, memory-to-memory transfers had to be performed on channels 0 and 1, but as in the original IBM PC, its channel 0 was already in use, this time to cascade the other DMAC. Second, these transfers required the DMAC to actually manipulate the data by internally buffering it instead of doing a flyby operation, and since its buffer register was only 8 Bits wide, they would have been impossible anyway.

**Intel 8254 PIT (Programmable Interval Timer):** As a superset of the 8253 PIT, the 8254 does everything that the 8253 did, and adds a Status Read-Back function, which eases reading the current status of the Counters compared to having to do so with the 8253 commands.

The IBM PC/AT had the 3 Counters of the 8254 PIT assigned to the same tasks as in the IBM PC and PC/XT, but the wiring was slightly different due to changes in the other chips and discrete circuitry. Counter 0 was used as the System Timer, with its GATE line unused and its OUT line hooked to IRQ 0 of the master 8259A PIC. Counter 1 took care of the DRAM refresh timer, with its GATE line unused and its OUT line wired to the new discrete DRAM refresh circuitry (part of the Memory Controller circuitry). Counter 2 was used to drive the PC Speaker, with its GATE line coming from discrete logic now managed by the new Intel 8042 Microcontroller (as a replacement for the Intel 8255 PPI) and its OUT line wired to the PC Speaker circuitry.
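
As an illustration of the new Read-Back function, this small C sketch latches and reads the Status Byte of Counter 0, something the 8253 cannot do with a single command. The port helpers are stand-ins:

```c
#include <stdint.h>

void outb(uint16_t port, uint8_t value);  /* stand-in port helpers */
uint8_t inb(uint16_t port);

#define PIT_CH0  0x40  /* Counter 0 data port       */
#define PIT_CTRL 0x43  /* Mode/Command register port */

/* Latch and read the Status Byte of Counter 0 using the 8254-only
   Read-Back command: D7-D6 = 11 (Read-Back), D5 = 1 (don't latch the
   count), D4 = 0 (latch the status), D1 = 1 (select Counter 0). */
uint8_t pit_read_status_counter0(void)
{
    outb(PIT_CTRL, 0xE2);
    return inb(PIT_CH0);  /* bit 7 = OUT pin state, bit 6 = null count,
                             bits 5-4 = access mode, bits 3-1 = operating
                             mode, bit 0 = BCD flag */
}
```
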
**Motorola MC146818 RTC (Real Time Clock):** An RTC is a specialized Timer that is intended to track the human concept of time, instead of just counting arbitrary clock cycles like other, generalist Timers do. Because the speed at which a Timer counts clock cycles is directly related to the clock speed that it runs at, RTC chips also define in their specifications the Clock Generation scheme that should be used for them to work as intended. As such, you can consider RTCs to be factory calibrated to tick every second (or a fraction of it) if implemented correctly. RTCs usually have a calendar so that they can keep track of the full Time and Date, and it is common for them to be part of a circuit that can operate with an external battery, so that they remain powered and ticking while the rest of the system is off, as to not have to set the current time on every power on.

The Motorola MC146818 RTC should have been among the earliest RTC chips (I don't know if there were any other single chip RTCs before it), and it provided a set of features that would become standard for this class of Devices. It supported tracking Time and Date, had alarm support, and could signal Interrupts. It had built-in support for either a 32.768 KHz, 1.048576 MHz or 4.194304 MHz reference clock. It also included 64 Bytes of built-in SRAM, which is where the current Time and Date and the alarm settings were stored. Of those 64 Bytes, 14 were used for the standard chip functions, and 50 were free for user data. Curiously enough, the MC146818 RTC supports a mode where the RTC functions can be disabled to free up 9 more SRAM Bytes (the chip Registers are directly mapped onto the SRAM, thus they can't be disabled), giving a total of 59 Bytes for user data. This mode was supposed to be for systems that have two or more MC146818 RTCs, so that only one serves in the RTC role while the others are treated as mere memory. I have no idea if anyone ever used them that way, nor whether it made sense from a cost perspective compared to standard SRAM chips.

The IBM PC/AT introduced the RTC as part of the base platform. The RTC circuitry used both standard Motherboard power and an external battery, the idea being that the battery only supplied power while the computer was powered off. Thanks to the battery, the 64 Bytes of RTC SRAM became NVRAM (Non-Volatile RAM) in nature, which IBM decided to put to use to store the Motherboard firmware configuration settings. Of the 50 Bytes of SRAM available for user data, IBM used 24 to store BIOS settings (IBM even documented which address stored which setting), and the rest were marked as reserved. The RTC Interrupt line was wired to the slave PIC, at IRQ 8 (its first IRQ).
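
On the PC/AT, the RTC registers and its SRAM are reached through an index/data pair of I/O Ports, 70h and 71h. A minimal C sketch of that access protocol, with stand-in port helpers, would look like this:

```c
#include <stdint.h>

void outb(uint16_t port, uint8_t value);  /* stand-in port helpers */
uint8_t inb(uint16_t port);

#define RTC_ADDR 0x70  /* MC146818 address register on the PC/AT */
#define RTC_DATA 0x71  /* MC146818 data register */

/* Read one of the 64 RTC SRAM locations. Locations 00h-0Dh are the
   time, alarm and status registers; from 0Eh onwards lies the battery
   backed area that the PC/AT BIOS used for its settings. */
uint8_t cmos_read(uint8_t reg)
{
    outb(RTC_ADDR, reg);
    return inb(RTC_DATA);
}

/* Example: the current seconds counter lives in register 00h. */
uint8_t rtc_seconds(void) { return cmos_read(0x00); }
```
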
**Intel 8042 UPI (Universal Peripheral Interface) Microcontroller:** A microcontroller is pretty much a chip that is even more highly integrated than a SoC, as it includes Processor, RAM, ROM, GPIO and maybe other built-in peripherals, yet it is minimalistic in specifications, being a sort of Swiss army knife. Microcontrollers are used by any type of electronic or electromechanical device that requires a very small embedded computer performing tasks that are essential but that don't demand a lot of computing power.

The Intel 8042 UPI was part of the same MCS-48 family as the previously described 8048 used in the IBM Model F Keyboards. Compared to it, it had slightly better specifications, but also a slightly different Instruction Set, so it couldn't run all 8048 software (not all 8048 instructions were supported, yet 8048 machine code not using those would run on the 8042). The 8042 included an 8 Bit CPU, a Clock Generator, a Timer, 2 KiB ROM and 256 Bytes RAM, and at least 16 GPIO Pins (I'm not really sure how many of the other Pins could be repurposed for GPIO), organized as two 8 Bit Ports. The contents of the built-in ROM had to be factory programmed. It also had an 8 Bit external Data Bus, while for addressing there was a single line known as A0 to select which of its two Ports would be accessed, plus an optional mode where 12 of the GPIO Pins could be used as a 12 Bits Address Bus, providing a 4 KiB Address Space (this matches the maximum supported external ROM size, as there are other models based on the 8042 design that don't have built-in ROM and instead fully rely on an external ROM chip). The Bus protocol of the 8042 was quite compatible with that of the Intel 8080/8085 CPUs, so it could interface directly with support chips from the MCS-85 family, just like the 8086/8088 CPUs.

Besides the 8042, there were several pin and functionally compatible variants. The most important one is the 8742, whose main difference from the 8042 is that it used EPROM instead of ROM for its internal firmware, making it user programmable, assuming that the EPROM was blank. It had two major variants: one was OTP (One Time Programmable), useful for low volume products that only had to write the firmware once in the chip's lifetime (it would have been easier to program small numbers of those chips than to request a custom 8042 batch from Intel), and the other had a crystal window, so that you could erase the EPROM with ultraviolet light and write to it again, making it reusable. There was also the 8242 series, which had multiple models that came with commercial firmwares in their ROMs. An example is the 8242BB, which had an IBM firmware that was usable for PC/AT compatible platforms (I have no idea whether IBM licensed a generic firmware, or if it is the same as the one used in its own systems. Nor do I have any idea why these chips existed in the first place... Maybe IBM decided to try to get some extra cash from clone manufacturers?).

In the IBM PC/AT, the 8042 replaced the 8255 PPI used by the IBM PC. Whereas the 8255 PPI provided pretty much just dumb GPIO interfaces that had to be operated by the main Processor, the capabilities of the 8042 also allowed it to do some processing itself, making it more akin to a polyvalent auxiliary controller. While the 8042 had a multitude of roles, like providing host managed GPIO interfaces for the A20 Gate and the 286 reset hack (these could have been implemented with the 8255 as they're dumb GPIO, too), its best known role is being used as a full fledged Keyboard Controller. In that role, the 8042 entirely replaced the discrete Keyboard circuitry of the IBM PC and the associated 8255 PPI GPIO interface, as it could do the serial-to-parallel data conversion, signal Interrupts and interface with the I/O Channel Bus all by itself, so the Keyboard Port could be directly wired to it. In order to keep backwards compatibility with IBM PC software, the Keyboard function was mapped to the same I/O Port as the 8255 PPI Port A, Port 60h. However, that is pretty much where the backwards compatibility with the 8255 PPI ends.
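
As an example of the 8042 acting as a host managed GPIO provider, here is the classic A20 Gate enable sequence as a C sketch, using the documented command D1h (write the 8042 output Port). The DFh data value is the conventional one (A20 on, reset line kept high); the port helpers are stand-ins:

```c
#include <stdint.h>

void outb(uint16_t port, uint8_t value);  /* stand-in port helpers */
uint8_t inb(uint16_t port);

#define KBC_DATA   0x60  /* 8042 data port (the old 8255 Port A address) */
#define KBC_STATUS 0x64  /* 8042 status (read) / command (write) port    */

/* Wait until the 8042 input buffer is empty (status bit 1 clear),
   meaning it is ready to accept another byte from the host. */
static void kbc_wait(void)
{
    while (inb(KBC_STATUS) & 0x02)
        ;
}

/* Enable the A20 Gate through the 8042: command D1h means "write the
   next data byte to the 8042 output port". Bit 1 of that port drives
   the A20 Gate; bit 0 drives the CPU reset line, so it must stay set
   to avoid resetting the 286. */
void enable_a20_via_kbc(void)
{
    kbc_wait();
    outb(KBC_STATUS, 0xD1);  /* Write Output Port command */
    kbc_wait();
    outb(KBC_DATA, 0xDF);    /* bit 1 = A20 on, bit 0 = no reset */
}
```
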
Since the IBM PC/AT had no Cassette interface and the Motherboard had just a few Jumpers, its GPIO requirements were lower than those of the previous Motherboards, which is quite convenient, as the 8042 had 8 fewer GPIO Pins than the 8255 PPI. Because the configuration Bits exposed on the second 8042 Port were different enough, IBM seems to have decided to map it somewhere else, as to not risk any conflicts with applications that blindly manipulated the other 8255 PPI Ports. As such, the second 8 Bit Port was mapped to I/O Port 64h, jumping over I/O Ports 61h and 62h, which is where the 8255 PPI Ports B and C used to be mapped, respectively. The reason why the mapping jumped 4 addresses is that the IBM PC/AT wired the A0 Pin that the 8042 uses to select a Port to the XA2 Address line of the External I/O Address Bus; had it been wired to XA0 as usual, the mapping would have been contiguous, thus predictably found at Port 61h. While for the most part the readable Bits of Ports 61h and 62h are PC or PC/XT specific, there is a notable exception involving the PC Speaker, as the PC Speaker used two Bits from Port 61h (more precisely, the one that managed the GATE 2 input of the 8254 PIT, and the Speaker data one). The IBM PC/AT still had the PC Speaker, and it was programmed in the same way so as to remain backwards compatible with IBM PC software, but I'm not sure who responds to Port 61h requests, as the 8042 is now completely unrelated to it. I suppose that some discrete glue logic may be involved.
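
Since the programming model of the PC Speaker had to stay identical to that of the IBM PC, a tone is produced on the PC/AT with exactly the same sequence: program Counter 2 of the PIT as a square wave generator, then raise the two Port 61h Bits just mentioned. A C sketch with stand-in port helpers:

```c
#include <stdint.h>

void outb(uint16_t port, uint8_t value);  /* stand-in port helpers */
uint8_t inb(uint16_t port);

#define PIT_CH2   0x42          /* Counter 2 data port            */
#define PIT_CTRL  0x43          /* PIT Mode/Command register      */
#define PORT_B    0x61          /* GATE 2 and Speaker data bits   */
#define PIT_CLOCK 1193182UL     /* the 1.19 MHz PIT input clock   */

/* Program Counter 2 for a square wave of the given frequency, then
   set Port 61h bit 0 (GATE 2) and bit 1 (Speaker data) to let the
   OUT 2 signal reach the Speaker. */
void speaker_on(uint16_t freq_hz)
{
    uint16_t divisor = (uint16_t)(PIT_CLOCK / freq_hz);

    outb(PIT_CTRL, 0xB6);                  /* Counter 2, lo/hi byte, Mode 3 */
    outb(PIT_CH2, divisor & 0xFF);         /* low byte of the divisor  */
    outb(PIT_CH2, (divisor >> 8) & 0xFF);  /* high byte of the divisor */

    outb(PORT_B, inb(PORT_B) | 0x03);      /* GATE 2 + Speaker data on */
}

void speaker_off(void)
{
    outb(PORT_B, inb(PORT_B) & ~0x03);     /* both bits off */
}
```
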
It is easy to make a full roundup of the GPIO Pins of the 8042 as used by the IBM PC/AT. In total, there were 16 GPIO Pins organized as two 8 Bit Ports known as Port A and Port B, which managed the Pins known as P10-17 and P20-27, respectively. P10-13, P22-23 and P25 were Not Connected, for a total of 7 completely unused GPIO Pins. The Keyboard interface used only two Pins, P26 and P27, for Clock and Data, respectively, which were wired to the Keyboard Port. Related to the Keyboard, P24 was wired to the master PIC to generate Interrupts on IRQ 1, and P17 was used for the long forgotten Keyboard Inhibitor (the IBM PC/AT Computer Case had a physical Kensington Lock and a matching key that toggled this function). P21 was used to manage the infamous and already covered A20 Gate. P20 was wired to the Reset circuitry, and was used by the 286 reset hack. P16 was used to read the SW1 Jumper that chose between MDA or CGA Video Cards. P14 read the J18 Jumper, which set whether there were one or two populated RAM Banks on the Motherboard. Last but not least, P15 wasn't Not Connected, but I don't understand what it is supposed to do, since the schematic doesn't wire it anywhere, either. You can also check the MinusZeroDegrees Keyboard Controller Chip [diagram][5170motherboard] to see most of the auxiliary functions of the 8042 together, albeit it is a bit incomplete given my coverage.

[5170motherboard]: http://www.minuszerodegrees.net/5170/motherboard/5170_motherboard_diagrams.htm


##### 4.5 - IBM PC/AT Memory Map (Extended Memory and HMA)

As the 80286 CPU extended the Physical Address Space from 1 MiB to 16 MiB, there had to be a new Memory Map to organize it. As can be seen on Page 1-8 of the IBM PC/AT 5170 Technical Reference (March 1984), the PC/AT Memory Map is, for obvious backwards compatibility reasons, built upon that of the IBM PC. As such, you will instantly recognize it.

As you would expect, the organization of the first MiB was mostly the same as before. While the Conventional Memory covering the 0 to 639 KiB range with system RAM would remain the same forever, the UMA saw a few changes. The first 128 KiB chunk (640 KiB to 767 KiB) remained intended for video framebuffers, just like in the PC/XT, and is perhaps the only range that stayed consistent since the original PC. The range reserved for Option ROMs in expansion cards was reduced from 192 KiB in the PC/XT to just 128 KiB (768 KiB to 895 KiB). Meanwhile, the range for Motherboard ROMs was extended from the 64 KiB of the PC/XT to a whopping 128 KiB (896 KiB to 1023 KiB). This range was the one that grew the most, as the original IBM PC had just 48 KiB (40 KiB installed plus the empty U28 Socket for an extra 8 KiB ROM) for Motherboard ROMs, which got extended to 64 KiB in the IBM PC/XT (either only 40 KiB or the full 64 KiB were used, depending on the firmware version), then doubled in the IBM PC/AT, both times at the expense of Option ROMs. Note that the PC/AT actually had only 64 KiB of ROM memory installed, but it had two empty sockets for a second pair of 32 KiB ROM chips that were mapped and ready to use, just like the original IBM PC with its U28 Socket.

The things that are completely new are obviously those above the 1024 KiB range, courtesy of the 80286 CPU. IBM defined the range between 1 MiB and 15 MiB as Extended Memory, which would be used to map system RAM from a new type of Memory expansion card (the PC generation cards didn't have the Address Bus lines to map themselves above 1 MiB). At the very end of the Physical Address Space, there was a 128 KiB chunk (16256 KiB to 16383 KiB) that mirrored the 128 KiB of Motherboard ROMs located at the end of the UMA. I don't really know what purpose this mirroring would serve, other than a possible attempt to remove the Motherboard ROMs from the UMA in a future platform. Finally, the range between 15 MiB and 16255 KiB (16 MiB - 128 KiB) is not actually mentioned in the Technical Reference, not even marked as reserved, so I prefer to call it undefined. So far, so good, as this covers the basic IBM PC/AT Memory Map at the time of release.

There is a whole bunch of things related to the PC/AT Memory Map that belong in their own sections, since they were later additions from the DOS ecosystem and not part of the original IBM platform definition, which is what I'm covering here. The only one worth mentioning now is the HMA (High Memory Area). Some years after the IBM PC/AT release, the first 64 KiB of the Extended Memory (1024 KiB to 1087 KiB) would become known as the HMA, as it differed from the rest of the Extended Memory in that it could be accessed from within Real Mode (related to the 80286 not behaving like the 8088 with its Address Wraparound quirk, which the A20 Gate hack worked around), but otherwise the HMA was simply a subset of it.
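
The Real Mode arithmetic behind the HMA is simple enough to show in a few lines of C. Segment FFFFh plus a 16 Bits offset can produce a 21 Bits result, which an 8088 silently truncates but which a 286 with the A20 Gate open does not:

```c
#include <stdint.h>
#include <stdio.h>

/* Real Mode address arithmetic: segment FFFFh with offsets 0010h-FFFFh
   produces linear addresses 100000h-10FFEFh. On an 8088 these wrap
   around to 00000h-0FFEFh; on a 286 with the A20 Gate open they reach
   the first 64 KiB minus 16 Bytes of Extended Memory, a.k.a. the HMA. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;  /* up to 21 bits */
}

int main(void)
{
    uint32_t top = real_mode_linear(0xFFFF, 0xFFFF);
    printf("FFFF:FFFF -> %05lXh\n", (unsigned long)top);             /* 10FFEF */
    printf("8088 sees -> %05lXh\n", (unsigned long)(top & 0xFFFFF)); /* 0FFEF  */
    return 0;
}
```
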
One thing that became noticeable with the IBM PC/AT Memory Map is the concept of a Memory Hole. The Conventional Memory and the Extended Memory address ranges are both used to map system RAM, yet they are not contiguous, because the UMA sits between them. Because it was not possible to relocate the UMA elsewhere without breaking IBM PC compatibility, any application that could use the full 16 MiB Physical Address Space had to be aware that the system RAM was composed of two separate RAM pools, which complicates things compared to a single continuous, unified address range. Sadly, I don't have specific examples to quantify how much messier it was, but consider that the first version of the IBM PC firmware was supposed to support non-contiguous Conventional Memory, yet that feature was removed. Eventually, Virtual Memory would solve Memory Hole issues, since it abstracts the Memory Map details from user software and simply presents it with a completely uniform environment, so that how ugly the physical Memory Map of the platform truly was mattered to no one but the OS developer.


##### 4.6 - IBM PC/AT Motherboard physical characteristics overview, Computer Case and Keyboard

The physical dimensions of the IBM PC/AT Motherboard depend on the Motherboard version. As Type 2 and 3 Motherboards were identical except for the use of a higher Frequency Crystal Oscillator and higher binned chips, there are only two physically different versions to consider, Type 1 and Type 2/3. The dimensions are a bit of an issue, since the IBM PC/AT 5170 Technical Reference appears to contradict itself: the March 1984 version, which covers only the Type 1 PC/AT Motherboard, says that its dimensions were 12" x 13", or 30.5 cm x 33 cm, yet the March 1986 version specifically says that the Type 1 Motherboard size was 12" x 13.8", or 30.5 cm x 35 cm, making it 2 cm longer. This difference seems to be because the PC/AT Motherboards are not standard rectangles but have a slightly different shape, so it may be a minor correction, as if the initial measurement had been made from the shortest side. Either way, it was gargantuan in size compared to those of the PC and PC/XT. For the Type 2/3 Motherboards, the March 1986 Technical Reference says that they measured 9.3" x 13.8", or 23.8 cm x 35 cm, making them 6.7 cm narrower than Type 1. As always, MinusZeroDegrees has [info and photos about this][5170boardRevs].

For internal expansion, not counting the expansion slots, the PC/AT Motherboards had an empty socket for the optional Intel 80287 FPU, and the previously mentioned two empty sockets for a pair of optional 32 KiB ROM chips. Type 1 Motherboards had 36 sockets for RAM chips, of which at least half (a RAM Bank) had to be populated, while Type 2/3 had only 18 sockets, since they used RAM chips with twice the capacity. The Intel 80286 CPU and the two 32 KiB ROM chips for the BIOS firmware and IBM Cassette BASIC came socketed, albeit there were limited replacements for them. The Motherboards also had an internal header for the Case mounted PC Speaker, now wired to the newer Intel 8254 PIT.

The PC/AT Motherboards, like those of the PC/XT, only had a single external I/O connector, the Keyboard Port. However, there is a major difference between it and the one of the PC and PC/XT: while the physical size and pinout of the Keyboard Port were the same, the Keyboard protocol was different. You could plug an older Keyboard into the PC/AT, but it would not work. The main difference between the old Keyboard protocol and the new one was that the former was unidirectional, as the Keyboard could send data to the IBM PC but not receive anything back, whereas the new one was bidirectional, so the Keyboard Controller could send commands to the Keyboard if it wanted to do so.
If you ever saw in a firmware an option to toggle the default NumPad status as On or Off, that option exists precisely thanks to this bidirectional protocol.

Ironically, the 8042 of the PC/AT supported an alternate mode to use the [PC Keyboard protocol][ibmatKBC], but due to the fact that, for the BIOS, not detecting a PC/AT compatible Keyboard during POST was a fatal error, the only way to use this mode was to boot with a PC/AT Keyboard, use a command to switch the 8042 to PC Keyboard protocol mode, disconnect the PC/AT Keyboard, then plug in the PC Keyboard (actually, hot-plugging the Keyboard was not supported at all. You were not supposed to disconnect the Keyboard while the computer was on, and there was a real risk of Hardware damage, since back then the Keyboard Port was not designed in a way that gracefully handled hot-plugging). I have no idea why IBM left this feature half done, as selecting the Keyboard protocol could have been done via a Jumper on the Motherboard, given that the 8042 had many unused GPIO Pins. Later third party Keyboards had switchable modes to use either the PC or the PC/AT Keyboard protocol (typically known as "XT mode" and "AT mode"), and could work on either platform. The PC/AT Keyboard protocol is still being used today, just that the Keyboard Port connector mutated into the smaller yet pin compatible Mini-DIN-6 format with the IBM PS/2 (this is the reason why AT-to-PS2 and PS2-to-AT passive adapters always worked so well).
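
A simple consequence of the bidirectional protocol is being able to set the Keyboard status LEDs from the host, something the unidirectional PC protocol could never do. A C sketch of the standard EDh command sequence, with stand-in port helpers and error handling reduced to busy waiting:

```c
#include <stdint.h>

void outb(uint16_t port, uint8_t value);  /* stand-in port helpers */
uint8_t inb(uint16_t port);

#define KBC_DATA   0x60
#define KBC_STATUS 0x64

static void kbc_wait_write(void)        /* input buffer empty   */
{
    while (inb(KBC_STATUS) & 0x02) ;
}

static uint8_t kbd_read_reply(void)     /* output buffer full   */
{
    while (!(inb(KBC_STATUS) & 0x01)) ;
    return inb(KBC_DATA);
}

/* Set the Keyboard status LEDs with command EDh. The Keyboard
   acknowledges each byte with FAh. LED bits: 0 = Scroll Lock,
   1 = Num Lock, 2 = Caps Lock. */
void kbd_set_leds(uint8_t leds)
{
    kbc_wait_write();
    outb(KBC_DATA, 0xED);        /* Set/Reset LEDs command */
    kbd_read_reply();            /* expect FAh (ACK)       */
    kbc_wait_write();
    outb(KBC_DATA, leds & 0x07); /* the LED bitmask        */
    kbd_read_reply();            /* expect FAh (ACK)       */
}
```
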
The IBM PC/AT Computer Case is relevant enough to deserve its own paragraph, compared to those of the PC and PC/XT, which had pretty much nothing worthy of mention. The new Case introduced what IBM called the Control Panel, nowadays known as the Front Panel. There were two activity LEDs, one for computer Power and the other for HDD activity, and it also had a physical Kensington Keylock with a matching key. The Power LED and the Keylock were plugged via a cable into the same internal header array on the Motherboard. The Power LED doesn't seem to be connected to any circuitry that controlled it; it just received dull power to point out that the Motherboard was receiving it from the Power Supply. The Keylock was wired to Pin P17 of the 8042 Microcontroller, which toggled the Keyboard Inhibitor function. When the Keyboard Inhibitor was turned on, the 8042 didn't listen to the Keyboard (and some OSes extended that to the [Mouse][shellMouseMystery]), something that served as a primitive form of security against unauthorized users with physical access to the computer. Finally, the HDD LED wasn't plugged into the Motherboard itself; instead, its cable was plugged into a header on a new combo FDC/HDC expansion card that came with the PC/AT (I literally spent HOURS looking around for info about this one, and so far, none of the IBM documentation I checked that deals with disassembly and assembly of the PC/AT or of that particular expansion card seems to explicitly point out that the PC/AT Case HDD LED is plugged into the J6 header of that card. I do not know if I didn't check well enough, or if IBM actually omitted to explicitly mention it). Also, IBM still didn't bother to implement a Reset Button.

As the previous Keyboards were unusable on the IBM PC/AT, IBM had to release a revised version of the Model F Keyboard that used the new Keyboard protocol. These revised Model F Keyboards were the first to implement the 3 status LEDs for Num Lock, Caps Lock and Scroll Lock that nowadays you see on almost all Keyboards. The PC/AT compatible Model F Keyboards used the same Intel 8048 Microcontroller as Keyboard Encoder as the previous Model F based units for the PC, but the contents of its internal ROM were different, as they had to implement the PC/AT Keyboard protocol and the control of the new status LEDs.

Beginning in 1985, IBM released the famous Model M Keyboards. These Keyboards used a Motorola 6805 Microcontroller, which had its own embedded CPU, RAM, ROM, Clock Generator and GPIO like its Intel counterparts, but was based on a different ISA. Model M Keyboards had a Ceramic Resonator that provided a 4 MHz reference clock for the 6805, which could run at either 1/2 or 1/4 of the clock input (either 2 or 1 MHz), albeit I'm not sure how it was configured in the Model M. Due to the extreme number of Keyboard submodels, even after some googling it isn't clear to me whether all Model F Keyboards are Intel 8048 based and whether all Model M Keyboards are Motorola 6805 based. Regardless, what matters is that the Keyboard protocol could be implemented with either Microcontroller. Learning the implementation details is relevant just because you learn to respect Keyboards more when you notice that they are far more complex than they initially appear to be. As a bonus, you can read [here][pckbd5years] about the evolution of the Keyboard Layout, covering the original PC Model F, the PC/AT Model F, and the Model M.

[5170boardRevs]: http://www.minuszerodegrees.net/5170/motherboard/5170_motherboard_revisions.htm
[ibmatKBC]: https://www.seasip.info/VintagePC/ibmat_kbc.html
[shellMouseMystery]: http://www.os2museum.com/wp/the-dos-4-0-shell-mouse-mystery/
[pckbd5years]: http://www.os2museum.com/wp/pc-keyboard-the-first-five-years/


##### 4.7 - IBM PC/AT Motherboard Buses (Local Bus, I/O Channel Bus, Memory Bus)

As can be seen in the System Board Block Diagram on Page 1-6 of the IBM PC/AT Technical Reference, the Buses of the IBM PC/AT had an almost identical layout to those of the original IBM PC, being immediately recognizable as wider versions of them. These Buses were still the Local Bus, the I/O Channel Bus, the Memory Bus and the External I/O Channel Bus.

**Local Bus:** As always, the Local Bus interconnected the main Processor with the support chips that could be directly interfaced with it. As the new Intel 80286 CPU had a demultiplexed Bus with fully dedicated lines instead of multiplexed ones like the 8086/8088 CPUs, only the new 286 specific support chips, namely the 82288 Bus Controller and the optional Intel 80287 FPU, could be directly wired to it. The Intel 8259A PIC that was present on the Local Bus of the IBM PC got kicked out, because it only supported the multiplexed Bus of the previous Processors, leaving only three chips on the Local Bus instead of four. As the main Processor dictates the Bus width, the Local Bus was obviously extended to match the 16 Bits Data Bus and 24 Bits Address Bus of the 286.

**I/O Channel Bus:** Being a direct extension of the Local Bus, the I/O Channel Bus was widened along with it. It was still separated from the Local Bus by some buffer chips, albeit compared to the IBM PC, their task should have been simpler, since demultiplexing the output and multiplexing the input was no longer required. The I/O Channel Bus did mostly the same things that it did before: it interconnected the Local Bus with the DRAM Memory Controller and the External I/O Channel Bus, and extended up to the expansion slots, where it was exposed for the expansion cards to use. Exposing the wider I/O Channel Bus required a new type of slot with more Pins for the new Bus lines, and to use them you obviously needed a new type of card.

The IBM PC/AT also moved the ROM memory from the External I/O Channel Bus to the main I/O Channel section. Since the ROM chips had just an 8 Bits wide Data Bus, IBM decided to do the same thing that it did for the RAM memory and organized the ROM chips as Banks (a Bank uses multiple chips accessed simultaneously in parallel to match the width of the Data Bus), making a 16 Bits ROM Bank out of two 8 Bits ROM chips. You can check the MinusZeroDegrees BIOS ROM chips [Diagram 2][5170motherboard] to see how it was supposed to work.

A rarely mentioned detail is that the PC/AT actually had two ROM Banks: one had the two 32 KiB ROM chips with the firmware and the IBM Cassette BASIC, occupying the standard 64 KiB mapping for them (960 KiB to 1023 KiB) as defined by the PC/XT. The other Bank consisted of two ROM sockets, known as U17 and U37, that were mapped (896 KiB to 959 KiB) and ready to use but otherwise unpopulated. Their usefulness was roughly the same as that of the empty U28 Socket in the IBM PC 5150: it was only an extremely obscure and niche feature that was rarely, if ever, used. As far as I know, it could host an optional Option ROM, and it was possible to insert some types of writable ROMs and program them without the need of an external programmer (albeit you had to open the Computer Case). I'm not sure if it was possible to install and use only a single ROM chip, but assuming it was possible, doing so would have required some type of parser software to write or read it correctly. Either way, good luck trying to find someone that actually used these ROM sockets.
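
As a quick illustration of how a 16 Bits ROM Bank is built from two 8 Bits chips, this small C program splits a merged 64 KiB firmware image into the even and odd Byte images that would be burned into each chip of the pair: even addresses service the low Byte lane, odd addresses the high one. The file names are made up for the example:

```c
#include <stdio.h>
#include <stdlib.h>

/* Split a merged firmware image into the two chip images of a 16 Bit
   ROM Bank: Byte 0 goes to the D0-D7 chip, Byte 1 to the D8-D15 chip,
   and so on, since both chips are read in parallel on each Word access. */
int main(void)
{
    FILE *in   = fopen("firmware64k.bin", "rb"); /* hypothetical input  */
    FILE *low  = fopen("rom_even.bin", "wb");    /* D0-D7 lane chip     */
    FILE *high = fopen("rom_odd.bin", "wb");     /* D8-D15 lane chip    */
    long i = 0;
    int c;

    if (!in || !low || !high)
        return EXIT_FAILURE;

    while ((c = fgetc(in)) != EOF)
        fputc(c, (i++ & 1) ? high : low);

    fclose(in); fclose(low); fclose(high);
    return EXIT_SUCCESS;
}
```
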
**Memory Bus:** The Memory Bus, which interconnected the RAM Banks with the DRAM Memory Controller present on the I/O Channel Bus, was also extended to match the 80286 CPU external Data Bus capabilities. IBM kept using DRAM chips with a 1 Bit external Data Bus, so in order to match the 16 Bits Data Bus of the 80286, the PC/AT had to use 16 DRAM chips per Bank. However, like in the PC and PC/XT, IBM implemented Parity for RAM memory error detection, which took two more DRAM chips. Thus, there were 18 1 Bit DRAM chips per RAM Bank (twice the 9 DRAM chips per Bank of the PC and PC/XT), effectively making the Data Bus part of the Memory Bus 18 Bits wide. Additionally, while the Memory Controller was still made out of discrete logic, it was more complex than the PC and PC/XT one, as it could refresh the DRAM by itself without needing to waste a DMA Channel, albeit it still used an 8254 PIT Counter and its OUT line.

Depending on the PC/AT Motherboard version (Type 1 or Type 2/3, respectively), there were either two 256 KiB RAM Banks (18 DRAM chips of 16 KiB per Bank), for a total of 512 KiB of usable RAM, or a single Bank that used DRAM chips of twice the capacity (32 KiB) to get the same total RAM size (note that this wasn't a downgrade from the IBM PC/XT, since the 256KB - 640KB Motherboard was released around two years after the PC/AT, so both the PC and PC/XT Motherboards of the era maxed out at 256 KiB RAM). The Motherboard RAM was mapped as Conventional Memory, as usual. Additionally, the Type 1 Motherboard had a Jumper, J18, that could be used to disable the mapping of the second RAM Bank, so that the Memory Controller only responded to addresses for the first 256 KiB of RAM. This allowed you to use Memory expansion cards with RAM mapped into the 256 to 512 KiB address range, should for some reason using such a card be more convenient than populating the 18 DRAM chips of the second RAM Bank.

If you wanted more RAM, you had to use Memory expansion cards. All the PC/AT Motherboards could use cards that mapped 128 KiB of RAM into the 512 to 640 KiB range to max out the Conventional Memory. Also, something that I'm not entirely sure about is whether 8 Bits Memory expansion cards for the IBM PC worked in the PC/AT. I would suppose that they should, but accessing their RAM would come with a high performance penalty, so even if they worked, it would be highly undesirable to use them for Conventional Memory.

**External I/O Channel Bus:** The External I/O Channel Bus had quite major changes. To begin with, it was harder to interface the old support chips of the MCS-85 family, intended for the 8 Bits Data Bus of the 8085 CPU, with the 80286. Whereas in the IBM PC the External I/O Channel Bus had the same Data Bus width as everything else, the 80286 had a wider 16 Bits Data Bus, which means the same issues that the 8086 had. As such, the glue logic that separated the External I/O Channel Bus from the main I/O Channel Bus also castrated its Data Bus width to 8 Bits. Additionally, the address decoding logic was vastly improved, as it now fully decoded the 16 Bits of the I/O Address Space instead of only 10 Bits.

As the IBM PC/AT made significant changes to the support chips, the External I/O Channel Bus also had a rearrangement of denizens. Now you've got the two 8237A DMACs, the two 8259A PICs, the upgraded 8254 PIT, the MC146818 RTC, the 8042 Microcontroller, and some discrete logic that managed the now orphaned I/O Port of the PC Speaker. The ROM chips with the firmware and IBM Cassette BASIC were moved to the main I/O Channel Bus.
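
Before moving on, the Parity scheme of the Memory Bus described above is worth a tiny sketch: the extra DRAM chips of a Bank store one Parity Bit per Byte, which the Memory Controller recomputes on every read (a mismatch raises an NMI, as mentioned back in the FPU/NMI discussion). Shown here as even parity purely for illustration; I'm not asserting which convention IBM actually used:

```c
#include <stdint.h>

/* Compute the extra bit stored alongside each Byte in the 18 Bit wide
   Bank: here, chosen so that the total count of set bits (data plus
   parity) comes out even. The Memory Controller recomputes this on
   reads and flags a mismatch as a RAM error. */
uint8_t parity_bit(uint8_t byte)
{
    uint8_t ones = 0;
    for (int i = 0; i < 8; i++)
        ones += (byte >> i) & 1;
    return ones & 1;  /* 1 only if the data Byte has an odd bit count */
}
```
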

##### 4.8 - IBM PC/AT Motherboard Clock Generation and Wait States

The IBM PC/AT, as can be seen in the Clock Generation diagram at [MinusZeroDegrees][5170motherboard], had a much more elaborate Clock Generation scheme than that of the IBM PC and PC/XT. Whereas they derived all their clocks from a single reference clock source, the PC/AT had three Crystal Oscillators and three Clock Generators. A major difference when comparing the PC/AT with the Turbo XT platforms is that the clock speed of everything was fixed, as the PC/AT didn't have two or more selectable speed modes like the Turbo XTs did.

The first Clock Domain was used by almost everything in the system, and involved the Intel 82284 Clock Generator and a Crystal Oscillator whose Frequency depended on the PC/AT Motherboard version. Type 1/2 Motherboards had a 12 MHz Crystal Oscillator, whereas the later Type 3 Motherboards had a 16 MHz one. As such, the derived Frequencies varied accordingly. The 82284 provided only two output clock lines, CLK and PCLK, instead of the three of the previous generation Intel 8284A CG.

**82284 CLK:** The CLK (System Clock) line was the most important one. Its clock speed was equal to that of the reference clock (like the 8284A OSC line), which means either 12 or 16 MHz depending on the Motherboard version. It directly provided the input clock for the 80286 CPU, the 80287 FPU and the 82288 Bus Controller. However, whereas the 8088 CPU ran at the Frequency of its input clock, the 80286 CPU and the 82288 Bus Controller internally halved it, so their effective clock speed was either 6 or 8 MHz. The 80287 FPU was a bit special, since it could run either at the input clock line Frequency or at 1/3 of it. In the case of the PC/AT, the 80287 FPU was hardwired to run at 1/3 of the input clock, so with a 12 or 16 MHz CLK line, it effectively ran at either 4 or 5.33 MHz, respectively.

The 82284 CLK line also passed through two discrete clock dividers that halved it, providing two separate 6 or 8 MHz lines: one served as input for the Intel 8042 Microcontroller, which ran at the same clock speed as its input clock (its built-in CG went unused), and the other was the SYSCLK line, used for almost everything else, including the I/O Channel Bus and the expansion cards, as it was exposed in the I/O Channel Slots as the CLK Pin. Moreover, while not shown in the MinusZeroDegrees diagram, the SYSCLK line was used to derive yet another line for the two DMACs. That line passed through another discrete clock divider that halved it, providing either a 3 or 4 MHz input for the DMACs and thus making the I/O Channel DMA subsystem slower than that of the IBM PC (this should have been the easiest way to deal with the fact that the Intel 8237A DMAC chip never had factory binned models rated higher than 5 MHz, a problem that the Turbo XTs had to deal with by either binning the DMAC chips, or underclocking the system to 4.77 MHz whenever a BIOS Service wanted to use DMA).

**82284 PCLK:** The PCLK (Peripheral Clock) line was simply CLK/2. For reasons that I don't understand, it went completely unused. Why IBM decided to use discrete clock dividers to derive SYSCLK instead of directly using the 82284 PCLK line is not something I know.
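
Summing up this first Clock Domain, the derived Frequencies can be reproduced with trivial arithmetic. A C sketch using the figures from the text (pass 12000000.0 for Type 1/2 Motherboards or 16000000.0 for Type 3):

```c
#include <stdio.h>

/* Derived clocks of the PC/AT first Clock Domain, per the ratios
   described in the text. */
void print_clock_domain(double crystal_hz)
{
    printf("82284 CLK  : %.2f MHz\n", crystal_hz / 1e6);     /* = crystal  */
    printf("286, 82288 : %.2f MHz\n", crystal_hz / 2 / 1e6); /* CLK / 2    */
    printf("287 FPU    : %.2f MHz\n", crystal_hz / 3 / 1e6); /* CLK / 3    */
    printf("SYSCLK     : %.2f MHz\n", crystal_hz / 2 / 1e6); /* CLK / 2    */
    printf("DMACs      : %.2f MHz\n", crystal_hz / 4 / 1e6); /* SYSCLK / 2 */
}
```
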
The second Clock Domain involved a secondary Intel 8284A Clock Generator and a 14.31 MHz Crystal Oscillator. As explained in the Turbo XT sections, some chips like the 8253/8254 PIT had to run at a specific Frequency so as to not screw up the timing of applications that relied on it. The PC/AT was among the first IBM PC compatible computers that had to deal with this issue, and IBM's solution, later adopted by everyone else, was to decouple the system wide clock into two clock domains. Since the 8284A CG is the same one used in the IBM PC, it also works in the same way, providing three derived clock lines as output: OSC, CLK and PCLK.

**8284A OSC:** The OSC (Oscillator) line passed through the reference clock input, so it ran at 14.31 MHz. As in the IBM PC platform, it wasn't used internally by the Motherboard, but instead was exposed as the OSC Pin in the expansion slots, pretty much just for CGA Video Cards.

**8284A CLK:** The CLK (System Clock) line was OSC/3, thus 4.77 MHz like in the IBM PC. In the IBM PC/AT, this line went unused.

**8284A PCLK:** The PCLK (Peripheral Clock) line was CLK/2, thus 2.38 MHz. It was halved with the help of a discrete clock divider to provide the 1.19 MHz required by the 8254 PIT, as usual.

Finally, the third Clock Domain was extremely simple: a 32.768 KHz Crystal Oscillator served as input for a Motorola MC14069 Hex Inverter (it is not explicitly a Clock Generator. Since I have near zero electronics knowledge, I don't know what it is supposed to be or do). Its output is a 32.768 KHz line, used by the Motorola MC146818 RTC. This clock domain would also become a very common sight, as the 32.768 KHz clock for the RTC would become as ubiquitous as the 14.31 MHz OSC and the 1.19 MHz clock line for the PIT. Note that the MC146818 RTC could also work with 1.048576 MHz and 4.194304 MHz reference clock inputs, but I have no idea why 32.768 KHz was chosen, nor whether there was any advantage in using the other ones.

Regarding Wait States, the PC/AT still used asynchronous RAM, but the DRAM chips for it had to be rated for a 150 ns access time or faster. This included even the Type 3 Motherboard with the 286 @ 8 MHz, so it seems that the initial 150 ns DRAM chips were overrated for 6 MHz operation. However, IBM also decided to introduce a Memory Wait State. I'm not sure why IBM did that, as the fact that the same 150 ns DRAM chips could keep up in the 8 MHz version of the PC/AT seems to point out that IBM was perhaps a bit too conservative with the RAM subsystem of the Type 1/2 Motherboards. Regardless of whether it was necessary or not, all the PC/AT Motherboards had a fixed 1 Memory WS. Measured in time, according to the IBM 5170 Technical Reference, the 286 @ 6 MHz had a clock cycle time of 167 ns with a Bus cycle of 500 ns, and the 286 @ 8 MHz had a clock cycle time of 125 ns with a Bus cycle of 375 ns. These Bus cycle values for Memory operations comprised 2 clock cycles, which were the fixed Bus cycle length of the 80286, plus 1 Memory WS (for reference, the 8088 had a Bus cycle of 4 clock cycles. This also helped with the lower instruction execution latency of the 80286).

Meanwhile, the I/O subsystem of the IBM PC/AT got a bit more complex, since it had to deal simultaneously with both the new 16 Bits expansion cards and the old 8 Bits ones. To maintain compatibility with the old cards, a lot of Wait States were added to operations that involved 8 Bits Devices. Moreover, the IBM 5170 Technical Reference on Page 1-7 mentions that the PC/AT further separates I/O operations to these 8 Bits Devices into two different types: 8 Bits operations to 8 Bits Devices, which had 4 I/O WS, and 16 Bits operations to 8 Bits Devices, which had 10 I/O WS (somehow, the page that mentions those values is missing 16 Bits operations to 16 Bits Devices...).
This brutal amount of Wait States should have been added to try to be as compatible as possible with the older I/O Channel Cards for the original IBM PC, as those were designed expecting only a 1050 ns I/O Bus cycle time (8088 @ 4.77 MHz plus 1 I/O WS). Compatibility between these cards and the PC/ATs with the 286 @ 6 MHz should have been rather high, since for 8 Bits operations, the two 167 ns clock cycles of the 80286 Bus cycle plus 4 I/O WS equal a 1000 ns Bus cycle, which seems to be a small enough difference compared to the expected 1050 ns. For 16 Bits operations, 2 clock cycles plus 10 I/O WS equal 2000 ns, exactly twice that of 8 Bits operations. Something that I'm not sure about is how 16 Bits operations to 8 Bits Devices were supposed to work; perhaps there was a discrete buffer chip to split a single 16 Bits operation into two 8 Bits ones, which makes sense considering that the Bus cycle time of 16 Bits operations is twice that of 8 Bits ones.

A notable problem is that the amount of I/O Wait States remained the same in the 8 MHz PC/AT version, which means that with a 125 ns clock cycle time, you get 750 ns and 1500 ns Bus cycle times for 8 Bits and 16 Bits operations, respectively. Basically, with the first version of the PC/AT, IBM tried to compensate with high I/O Wait States to maintain compatibility with the PC and PC/XT expansion cards, but by the time of the 8 MHz PC/AT models, it seems that IBM didn't care enough about these cards to readjust the amount of I/O Wait States. It was very likely that a first generation I/O Channel Card for the IBM PC worked in the 6 MHz PC/AT, but not as likely that it worked in the later 8 MHz one. The cause of this was exactly the same one that caused card compatibility issues across different speeds of Turbo XT based systems: the closest thing to a standard was compatibility with the original IBM PC (or now, with one of the two versions of the PC/AT), but there was nothing above that, nor a rating system that could tell an end user the lowest Bus cycle latency that a card was designed to work reliably at.

In the case of the new 16 Bits expansion cards, I'm not sure what the default amount of I/O WS was supposed to be, since the IBM 5170 Technical Reference doesn't explicitly mention it (it should be a 16 Bits operation to a 16 Bits Device). However, completely related to that, the PC/AT repurposed one Pin of the I/O Channel Slots, B8, for a new line known as 0WS (Zero Wait States). Expansion cards intended for the PC/AT could use the 0WS line to completely bypass the I/O Wait States, albeit I have no idea if all the 16 Bits cards did so.
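
The Bus cycle arithmetic used in the last few paragraphs is simple enough to condense into a small C program, which reproduces the 500, 1000, 2000 and 750 ns figures mentioned above:

```c
#include <stdio.h>

/* Bus cycle time = (2 fixed 80286 clock cycles + Wait States) * clock
   cycle time, per the figures discussed in the text. */
double bus_cycle_ns(double cpu_mhz, int wait_states)
{
    double cycle_ns = 1000.0 / cpu_mhz;
    return (2 + wait_states) * cycle_ns;
}

int main(void)
{
    printf("6 MHz, Memory (1 WS)     : %.0f ns\n", bus_cycle_ns(6.0, 1));  /*  500 */
    printf("6 MHz, 8 Bit I/O (4 WS)  : %.0f ns\n", bus_cycle_ns(6.0, 4));  /* 1000 */
    printf("6 MHz, 16 Bit I/O (10 WS): %.0f ns\n", bus_cycle_ns(6.0, 10)); /* 2000 */
    printf("8 MHz, 8 Bit I/O (4 WS)  : %.0f ns\n", bus_cycle_ns(8.0, 4));  /*  750 */
    return 0;
}
```
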
If you have good memory, you may remember that I mentioned that the PC/XT uses the B8 Pin for the CARD SLCTD (Card Selected) line used by the infamous Slot 8. As such, there was a risk that older 8 Bits cards hardwired for Slot 8 could make contact with the repurposed line if plugged into the PC/AT, and misbehave as a result. Due to the fact that the B8 Pin was marked as reserved in the original IBM PC 5150, and that the Slot 8 weird behavior was never popular to begin with (the PC/AT was released just one year after the PC/XT, and whatever Slot 8's special purpose was supposed to be, it had already been dropped), I'm not sure if there actually exist cards that could have issues when plugged into the PC/AT due to the different behavior of that Pin (the cards that supported Slot 8 typically did so as an alternate mode selectable via a Jumper). It could also be possible that there were 8 Bits cards intended for the PC/AT that misbehaved in the PC/XT because they were expecting the 0WS line, but this seems a rare occurrence, since anyone making a card that could fit in either the original IBM PC or PC/XT should have thought about the possibility that the card would be used in those computers, too. I don't really know if there were demonstrable compatibility issues or edge cases caused by the repurposed Pin.


##### 4.9 - IBM PC/AT Expansion Cards

To expose the wider I/O Channel Bus (16 Bits Data Bus and 24 Bits Address Bus) plus the additional IRQs and DMA Channels, the IBM PC/AT required a new expansion slot type with more Pins for them, and obviously new cards that used those. The new 16 Bits I/O Channel Slot was physically backwards compatible with the previous 8 Bits slot from the IBM PC, since it merely added a second section with the new Pins to the existing slot instead of redesigning the entire slot, so you could physically plug an IBM PC expansion card into the PC/AT. Doing it the other way around, by plugging a new 16 Bits card into an 8 Bits slot while leaving the second section connector hanging, was also possible assuming that there was no physical obstacle, albeit it only worked if the card supported an 8 Bits mode. Note that not all PC era cards worked properly in the PC/AT, for the same reason that they didn't in Turbo XTs: some simply couldn't handle the higher Bus clock speeds. For IRQs and DMA Channels, the new 16 Bits I/O Channel Slot exposed all the ones available in the old 8 Bits slots, namely IRQs 2-7 (with the IRQ2 Pin actually wired to IRQ 9) and 8 Bits DMA Channels 1-3, plus the new additions of IRQs 10-12 and 14-15, 8 Bits DMA Channel 0, and 16 Bits DMA Channels 5-7. In total, the new slots exposed 11 IRQs, four 8 Bits DMA Channels, and three 16 Bits DMA Channels.

All the PC/AT Motherboards had 8 I/O Channel Slots for expansion cards, but they were of two different types: 6 were of the new 16 Bits type, and two were of the older 8 Bits type, physically the same as those of the PC and PC/XT. I find it very intriguing that, if you look at photos of the PC/AT Motherboards, the two shorter 8 Bits Slots have extra solder pads, as if at some point they were intended to be full fledged 16 Bits Slots. As far as I know, 8 Bits and 16 Bits Devices could be freely mixed as long as their MMIO address ranges were not mapped into the same 128 KiB address block (in the UMA, this means three 128 KiB blocks: 640 to 767 KiB, 768 to 895 KiB, and 896 to 1023 KiB. Each block had to be either all 8 Bits or all 16 Bits). I'm not entirely sure if this also means that it was possible to use older 8 Bits Conventional Memory expansion cards for the IBM PC in the 512 to 639 KiB range.

There was a huge variety of expansion cards for the IBM PC/AT, some of them [made by IBM itself][5170cards] and eventually adopted by others. The most prominent one was the new 16 Bits multi-function FDC + HDC card, which used the ST-506 interface for HDs like the previous PC/XT HDC. I'm not entirely sure if this card came with all PC/ATs or only with those that included a HD, but the latter seems improbable, since I couldn't find an FDC only card for the PC/AT. The FDC + HDC card supported up to two internal Diskette Drives and one internal Hard Drive, or one Diskette Drive and two Hard Drives. It didn't have a Port for external Diskette Drives like the IBM PC FDC, and since you couldn't use two FDC cards, nor did it make sense to downgrade to the IBM PC FDC, this effectively capped the PC/AT platform at just two Diskette Drives.

Talking about PC/ATs with no HDs, those models include one of the most amusing hacks I ever saw. Since the Power Supply required a certain load to turn on, and a bare PC/AT without a HD wasn't enough for its expected minimum load, IBM decided to use a [50W resistor][5170loadResistor] to plug into the computer. As far as I know, this should make the HD-less PC/AT models the first true space heater-computer hybrids, even before the Intel Pentium 4 Prescott and the AMD Piledriver-based FX 9000 series!

The other major type of expansion cards were obviously the Video Cards. Initially, the Video Cards were only MDA and CGA, as they were the only ones available at the release date of the PC/AT. Some months later, IBM released the EGA (Enhanced Graphics Adapter) and PGC (Professional Graphics Controller) Video Cards. EGA prevailed for a while before VGA superseded it 4 years later, yet it left an important legacy due to its allocation in the Memory Map and the specialized type of Option ROM that it introduced, the VBIOS. To pitch EGA, IBM even developed an obscure tech demo known as [Fantasy Land][fantasyLand]. The PGC, as impressive as it was as a piece of Hardware at the time, left no legacy, and thus is pretty much irrelevant.

The original IBM EGA Video Card came with 64 KiB of RAM to use as framebuffer, which could be expanded to 256 KiB using an optional daughterboard populated with a ton of DRAM chips (amusingly, Fantasy Land required a fully geared EGA card). However, there was a major problem: a framebuffer of that size was far bigger than the 128 KiB range that IBM had reserved in the UMA for such purposes. Extending the range to 256 KiB was impossible, as it would leave pretty much no room for Option ROMs, and relying on the 286 16 MiB Physical Address Space would make EGA totally incompatible with Real Mode software and the still relevant PC, so that wasn't a viable option, either. To access all of the EGA framebuffer, IBM had to resort to mapping it indirectly via Bank Switching, as getting those 256 KiB as straightforward MMIO was not possible.

Bank Switching is a means to indirectly map more memory than the Physical Address Space would allow if done directly, at the cost of not being able to access all of the memory at the same time. It works by partitioning the full pool of memory that you want to map into chunks known as Banks, then reserving an address range to use as a Window, which is where the Processor will see them. With the help of a specialized Memory Controller, you can tell it to map, at a given moment, either a single Bank of the same size as the Window or multiple smaller Banks (depending on the implementation details, obviously), then tell it to switch which Banks are mapped when you want to access more memory. In the case of EGA, its 256 KiB framebuffer was partitioned into 16 Banks of 16 KiB each, while the entire 128 KiB block of the UMA reserved for video framebuffers (640 KiB to 767 KiB) was used as a MMIO Window, so the Processor was able to see 8 Banks simultaneously. By switching which of the 16 EGA Banks you wanted to see in the mapped Window, you could indirectly access all of its 256 KiB framebuffer.
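
The addressing arithmetic of such a scheme is easy to sketch. Assuming the EGA figures from the text (16 KiB Banks, a 128 KiB Window starting at the 640 KiB line) and one plausible Bank-to-slot arrangement (the real card had its own register level interface, which I'm not reproducing here), locating a framebuffer Byte looks like this in C:

```c
#include <stdint.h>

#define BANK_SIZE   (16UL * 1024)   /* 16 KiB Banks                  */
#define WINDOW_BASE 0xA0000UL       /* 640 KiB, start of the Window  */
#define WINDOW_SIZE (128UL * 1024)  /* 128 KiB video Window          */

typedef struct {
    unsigned bank;      /* which of the 16 Banks must be mapped in    */
    uint32_t cpu_addr;  /* where the Byte then appears to the CPU     */
} bank_view_t;

/* Given an offset into the 256 KiB framebuffer, compute the Bank to
   switch in and the physical address the CPU would then use. With 8
   Banks visible at once, Bank n is assumed here to land in slot n % 8. */
bank_view_t locate(uint32_t fb_offset)
{
    bank_view_t v;
    v.bank = (unsigned)(fb_offset / BANK_SIZE);
    v.cpu_addr = WINDOW_BASE
               + (v.bank % (WINDOW_SIZE / BANK_SIZE)) * BANK_SIZE
               + (fb_offset % BANK_SIZE);
    return v;
}
```
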
If you have good memory, maybe you already figured out that the EGA framebuffer completely overlapped both the MDA and CGA ones. IBM considered that, as the original EGA Video Card also supported an alternate mode where it used just a 64 KiB Window (640 KiB to 703 KiB) for 4 Banks, so that it didn't overlap with the fixed locations of the previous Video Cards' framebuffers, a mode that should have been useful in case you wanted to use dual displays with EGA + MDA or EGA + CGA.

It is notable that the PC/AT had no built-in firmware support for the EGA Video Card like it had for MDA and CGA; instead, the EGA card had its own 16 KiB Option ROM that the BIOS could load to initialize it. This Option ROM with the Video Card firmware would become known as the VBIOS. It is quite important to mention that the VBIOS is a special type of Option ROM, as it is mapped to a specific location (the EGA VBIOS was mapped to the 768 KiB to 783 KiB range) that the BIOS checks very early in the POST process, since it was extremely useful to get the computer's main video output working as soon as possible, so that the BIOS could display error codes on screen if something went wrong (for reference, the last BIOS version of the IBM PC 5150 checks if there is a VBIOS available several steps before any other Option ROM; actually, even before it fully tests the Conventional Memory). Pretty much every Video Card that had to be initialized as primary output during POST would have a VBIOS.

Regarding the PGC, the card itself was [really, really complex][pgcCard], and priced accordingly. The PGC was made out of three sandwiched PCBs that took two adjacent slots with PC/XT spacing (actually, it couldn't fit in the original IBM PC because the slot spacing was different), had an Intel 8088 CPU all for itself, a ROM with firmware, 320 KiB of DRAM to use as framebuffer, and 2 KiB of SRAM that was simultaneously mapped into the host UMA (792 KiB to 793 KiB) and into the PGC's 8088, so that it could be used as a buffer to pass data between the host CPU and the card CPU. It also had CGA emulation capabilities. To use the PGC's full capabilities, you were expected to use a specialized IBM Monitor. Due to the extremely scarce info about the PGC, it is hard to say anything else, albeit the few comments from users and programmers that had to deal with it seem to point out that it was extremely powerful for the time.

The final type of major expansion card was the Memory expansion card. Whereas for the PC and PC/XT the Memory expansion cards were rather simple, since they just had RAM that was going to be mapped into some address range inside the 640 KiB of Conventional Memory, in the PC/AT, thanks to the 286 16 MiB Physical Address Space, RAM memory could also be mapped above 1024 KiB. The RAM mapped above the old 8088 1 MiB Physical Address Space boundary became known as Extended Memory, as would the cards that supported mapping RAM to ranges above 1024 KiB. Technically, both Conventional Memory and Extended Memory are system RAM, just that the former is mapped to a range that can be accessed in Real Mode and works exactly as expected by applications intended for an IBM PC with an 8088, while the latter requires dealing with all the issues described in the section explaining the 80286 protected mode (or any of the alternatives, like using the 64 KiB HMA, or LOADALL. All of them are reliant on proper management of the A20 Gate, too), so they are treated in two completely different ways.

Some Extended Memory cards had address decoding logic flexible enough to allow mapping part of the card's RAM into the Conventional Memory range. This was known as backfilling. For example, a 2 MiB Extended Memory card could be inserted into a PC/AT Type 1 Motherboard with only 256 KiB of RAM installed, then configured to backfill 384 KiB into the 256 to 640 KiB range so that you could max out the 640 KiB of Conventional Memory, with the remaining 1664 KiB mapped into the 1024 KiB to 2687 KiB range as Extended Memory. As always, the user had to be careful to make sure that there was no overlap when using multiple cards and that the Extended Memory mapping was contiguous, something that required dealing with Jumpers or DIP Switches on the cards themselves. It was not necessary to fully fill the 640 KiB of Conventional Memory in order to use the Extended Memory. Albeit it may be redundant to mention, due to the fact that addressing above 1024 KiB required more than the 20 Address Bits exposed by an 8 Bits I/O Channel Slot, only 16 Bits I/O Channel Cards could provide Extended Memory, since only these implemented the full 24 Address Bits of the 80286.

Another type of Memory expansion card that appeared on the market around 1986 added what was known as Expanded Memory. Expanded Memory worked conceptually the same way as the already explained EGA Video Card: it reserved a 64 KiB Window in the UMA, then used Bank Switching to indirectly map either one 64 KiB Bank or four 16 KiB Banks into it. Expanded Memory cards became somewhat popular, since they didn't rely on any of the PC/AT exclusive features (including those of the 80286 CPU) and thus could work in an IBM PC or PC/XT, as long as the application supported them. The Expanded Memory cards required a specialized API to be used, and eventually the Extended Memory would get its own API, too, beginning the nightmare that was DOS Memory Management...
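
For a taste of what that specialized API looked like, here is a hedged C sketch of the basic LIM EMS call sequence (get the Page Frame, allocate a logical page, map it into the Window), using the `int86()` helper found in period DOS compilers. It assumes an EMS driver is already installed, and reduces error handling to checking AH for zero:

```c
#include <dos.h>    /* int86() and union REGS, as in period DOS compilers */
#include <stdio.h>

int main(void)
{
    union REGS r;
    unsigned frame_seg, handle;

    r.h.ah = 0x41;              /* EMS: Get Page Frame Segment */
    int86(0x67, &r, &r);
    if (r.h.ah) return 1;
    frame_seg = r.x.bx;         /* segment of the 64 KiB UMA Window */

    r.h.ah = 0x43;              /* EMS: Allocate Pages */
    r.x.bx = 1;                 /* one 16 KiB logical page */
    int86(0x67, &r, &r);
    if (r.h.ah) return 1;
    handle = r.x.dx;

    r.h.ah = 0x44;              /* EMS: Map Handle Page */
    r.h.al = 0;                 /* into physical page 0 of the Window */
    r.x.bx = 0;                 /* logical page 0 of our handle */
    r.x.dx = handle;
    int86(0x67, &r, &r);
    if (r.h.ah) return 1;

    printf("EMS page mapped at segment %04Xh\n", frame_seg);
    /* ...the RAM is now reachable via far pointers into frame_seg... */

    r.h.ah = 0x45;              /* EMS: Deallocate Pages */
    r.x.dx = handle;
    int86(0x67, &r, &r);
    return 0;
}
```
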
Another type of Memory expansion card that appeared on the market around 1986
was the one that added Expanded Memory. Expanded Memory worked conceptually the
same way as the already explained EGA Video Card: it reserved a 64 KiB window
in the UMA, then used Bank Switching to indirectly map either one 64 KiB or four
16 KiB Banks into it. Expanded Memory cards became somewhat popular since they
didn't rely on any of the PC/AT exclusive features (including those of the 80286
CPU) and thus could work in an IBM PC or PC/XT for as long as the application
supported it. The Expanded Memory cards required a specialized API to use them,
and eventually the Extended Memory would get its own API, too, beginning the
nightmare that was DOS Memory Management...

[5170cards]: http://minuszerodegrees.net/5170/cards/5170_cards.htm
[5170loadResistor]: http://www.minuszerodegrees.net/5170/misc/5170_load_resistor.htm
[fantasyLand]: https://www.pcjs.org/blog/2018/04/23/
[pgcCard]: http://www.seasip.info/VintagePC/pgc.html


##### 4.10 - IBM PC/AT BIOS Firmware, BIOS Setup and RTC SRAM, early overclocking, PC DOS 3.0

There isn't much to say about the functionality of the PC/AT firmware itself:
the BIOS did the same basic things that it used to do, and added a few more BIOS
Services on top of the previous ones to support the new PC/AT platform Hardware
changes. The most important change was in how the BIOS was configured. Compared
to the PC and PC/XT, the PC/AT Motherboards had a dramatically reduced amount of
configurable stuff that required physical interaction, with just a Jumper to set
the main Video Card type and, in Type 1 Motherboards only, a Jumper to select
between 256 or 512 KiB RAM installed on the Motherboard. Nearly everything else
became a software configurable setting that the BIOS would read during POST. The
PC/AT took advantage of the fact that the Motorola MC146818 RTC had 50 Bytes of
free SRAM that, thanks to being battery backed, could be used as an NVRAM (Non
Volatile RAM) to store the BIOS Settings. This also gave birth to the Clear CMOS
procedure: if you wanted to force a reset of the BIOS Settings, you had to cut
the battery power to the RTC, so that the SRAM would lose its contents
(including the Date and Time, which was the primary purpose of the RTC).
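The RTC registers and that battery backed SRAM are reached through an index/data
port pair, a scheme that survives to this day. As a minimal sketch, assuming a
DOS-era compiler in the Turbo C style where `<dos.h>` provides `outportb()` and
`inportb()`, reading a byte would look roughly like this (the layout of the
Settings bytes beyond the clock itself was firmware-defined, so the 0x10 offset
here is purely illustrative):

```c
#include <dos.h>    /* outportb(), inportb() in Turbo C style compilers */
#include <stdio.h>

#define RTC_INDEX 0x70  /* write the register number here */
#define RTC_DATA  0x71  /* then read/write the value here  */

/* Read one byte from the MC146818 register/SRAM space. */
unsigned char cmos_read(unsigned char reg) {
    outportb(RTC_INDEX, reg);
    return inportb(RTC_DATA);
}

int main(void) {
    /* Register 0x00 is the Seconds counter of the clock itself. */
    printf("RTC seconds byte: %02X\n", cmos_read(0x00));
    /* Bytes past the clock/status registers are the battery backed
       SRAM where the PC/AT BIOS kept its Settings; 0x10 is just an
       illustrative offset. */
    printf("A BIOS Settings byte: %02X\n", cmos_read(0x10));
    return 0;
}
```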
The PC/AT also introduced the BIOS Setup so that you could change the BIOS
Settings stored in the RTC SRAM, but it was not accessed in the way that you
know it. Instead of being stored in the Motherboard ROM itself, the BIOS Setup
was a traditional application that came in a bootable Diskette known as
Diagnostics for the IBM Personal Computer AT, so you had to boot with it in
order to change the BIOS Settings. As a last resort, it was possible to use the
built-in IBM Cassette BASIC to edit the BIOS Settings, but you had to know
exactly what you were doing. Some years later, third party firmware vendors
released [custom BIOSes][5170bios] for the PC/AT delivered as a pair of ROM
chips, so that you could replace the IBM ones holding the standard firmware.
These had a built-in BIOS Setup accessed via a Key Combination during POST,
something that may sound familiar to you.

One of these amusing trivial details is that even 35 years ago, there used to be
Hardware enthusiasts that liked to tinker with their Hardware. A popular mod for
the PC/AT was to replace its 12 MHz Crystal Oscillator with a 16 MHz one (like
the one used by the Type 3 Motherboards), which typically worked well because
the early PC/AT models were very conservatively clocked and most chips could
take the overclock. IBM didn't like that power users were toying with its
business Hardware, so to make sure that their PC/ATs were not getting
overclocked, in a later PC/AT BIOS version IBM added a speed loop test during
POST that failed if the 286 was running above 6 MHz. The Type 3 Motherboard had
another BIOS version that revised the speed loop test to support 8 MHz
operation. If you were overclocking, avoiding these speed loops was a good
reason to use a third party firmware.

The PC/AT also included a new version of its main OS, PC DOS 3.0, which could
use some of the new computer capabilities (mainly the RTC for time keeping) and
introduced the FAT16 File System. The convenience of having a built-in battery
backed RTC was enormous: in the PC and PC/XT, you had to type in the Date and
Time every time you turned on the computer (unless you had an expansion card
with an RTC and OS Drivers for it. RTC cards were a popular add-on for the
original IBM PC), whereas in a PC/AT with PC DOS 3.0, the RTC worked out of the
box. You could also use older versions of PC DOS, assuming you didn't mind their
limitations.

[5170bios]: http://www.minuszerodegrees.net/bios/bios.htm#5170


##### 4.11 - The IBM PC/XT 5162 Model 286

Besides the three PC/AT Motherboard types and the multiple PC/AT computer
models based on them, there was another computer that is usually considered part
of the IBM PC/AT series. In 1986, IBM launched the IBM PC/XT 5162 Model 286,
which has a misleading name since it is actually based on the PC/AT platform.
While the computer was fully PC/AT compatible, it had both minor and major
differences that are relevant to the PC/AT section.

The PC/XT 5162 Motherboard was completely new. It had 8 I/O Channel Slots, 5
for 16 Bits cards and 3 for 8 Bits ones, a slight downgrade from the PC/AT
(one of the 8 Bits slots had extra solder pads, as if it could have been a 16
Bits slot). It measured 8.5" x 13.2", or 22 cm x 33.8 cm, making it slightly
smaller than the PC/AT Type 2/3 Motherboard. This size is highly important
because it would eventually become the basis for the Baby AT Motherboard Form
Factor, making the PC/XT 5162 quite relevant in the history of the PC platform
evolution.

The clock generation scheme of the PC/XT 5162 was very similar to that of the
Type 1/2 PC/AT Motherboards with the 286 @ 6 MHz, but there were two major
differences. First, it had 0 Memory Wait States instead of 1, making it perform
faster than those PC/ATs, albeit still slower than the Type 3 with the 286 @ 8
MHz. It used 150 ns DRAM chips like all the PC/AT models, pretty much confirming
that IBM was a bit too conservative with the 1 Memory WS in the original 6 MHz
PC/AT.

The other difference involves the entire 80287 FPU clocking subsystem. The 287
FPU could be configured to run at either the reference clock input or 1/3 of it.
In the PC/AT, it was hardwired to run at 1/3, whereas in the PC/XT 5162 it
instead was hardwired to run at the clock input speed. However, it is not wired
to the 82284 CG 12 MHz CLK line like it was in the PC/AT but to the secondary
8284A CG 4.77 MHz CLK line that in the PC/AT went unused, so 4.77 MHz is its
effective speed. This also made the FPU run fully asynchronous from the CPU. Not
sure why IBM made these changes at all.

The PC/XT 5162 Motherboard could max out the 640 KiB of Conventional Memory
without requiring Memory expansion cards. However, it had a completely
asymmetrical arrangement: it had two Banks, the first 512 KiB in size and
the second a smaller 128 KiB. Filling the first Bank was mandatory, but the
latter could be disabled via a Jumper.
The 512 KiB RAM Bank wasn't made out of discrete DRAM chips on the Motherboard
like in all the previous computers; instead, it had two 30 Pin SIMM Slots, each
fitted with a 9 Bits 256 KiB SIMM Memory Module. These SIMMs were made out of
nine 1-Bit 32 KiB DRAM chips each and included Parity. The second 128 KiB RAM
Bank was made out of DRAM chips socketed on the Motherboard, as usual. However,
this Bank had an even more asymmetrical arrangement, as it used four 4-Bit 32
KiB DRAM chips plus two 1-Bit 32 KiB DRAM chips for Parity (the sum of that was
18 Bits for the Data Bus, as expected, but technically most of the RAM in the
Parity chips should have gone unused, since they were much bigger than what was
required).

I'm not sure if the PC/XT 5162 was the first PC based computer that had its RAM
installed in SIMM Slots. At first, SIMMs were usually seen in very high density
Memory expansion cards only; they took a year or two before being used in the
Motherboards themselves. In the early days, SIMMs used the same DRAM chips that
used to be socketed on the Motherboard itself but in a much more convenient
format, as they took far less Motherboard space than a ton of individual DRAM
chip sockets, yet there wasn't any actual functional difference.

The two empty ROM sockets for the optional ROM Bank that the PC/AT Motherboards
had aren't present on the PC/XT 5162 Motherboard. I'm not sure if the 64 KiB
range in the PC/AT Memory Map reserved for those is free in the PC/XT 5162
Memory Map or not. At least one source I recall said that the firmware and IBM
Cassette BASIC are mirrored there (perhaps unintentionally, caused by partial
address decoding...), so that range may not be free at all.

The IBM PC/XT 5162 used an actual PC/XT 5160 Case, so it had no Control Panel
(the two LEDs and the Keylock). I'm not sure if the Motherboard had a leftover
header for the Power LED or if it was removed since it wasn't going to be used
anyway. However, the PC/XT 5162 came by default with the same HDC + FDC
multi-function card that was used by the IBM PC/ATs, so the header for the HDD
LED should still be there, but unused.


##### 4.12 - 286 era DOS Memory Management: Extended Memory, Expanded Memory (EMS API), and the XMS API

During the lifetime of the IBM PC/AT, a major issue arose that multiple
generations of users had to learn how to deal with: everything related to DOS
Memory Management. From the launch of the IBM PC/AT onwards, this topic would
slowly become a convoluted mess, particularly during the 386 era, where its
complexity skyrocketed due to new Processor features that added even more
workarounds to deal with this problem. Knowledge of DOS Memory Management
techniques would remain necessary even by the late 90's, as people still used
DOS for applications and games that weren't Windows 9x friendly (or simply due
to better performance, since executing them from within a Windows environment
added a not insignificant overhead). It took until Windows XP became the most
common mainstream OS for this topic to stop being relevant, after which it would
be almost completely forgotten.
The issues with 286 DOS Memory Management are directly related to the unusual
requirements of the 80286 CPU to be able to use its entire Physical Address
Space, combined with the PC/AT platform idiosyncrasies, like its Memory Map. As
you already know, the 80286 CPU used in the IBM PC/AT had a 16 MiB Physical
Memory Address Space, a significant upgrade over the 1 MiB one of the 8088 CPU
used in the IBM PC. While the extra 15 MiB that the IBM PC/AT could address
should have been good enough for several years, in order to use them as Intel
intended when it designed the 80286, software had to be running within Protected
Mode, whose usage had a multitude of cons that were already detailed.

To recap the protected mode cons: first of all, software that relied on it
couldn't run on the IBM PC 8088 at all, significantly reducing the amount of
potential customers of such products unless the developer also provided a
dedicated IBM PC port of its application, something that would require more
development resources than just making a single Real Mode version that could
easily run on both platforms. Moreover, the mainstream software ecosystem made
use of the DOS API from PC DOS to read and write to the FAT File System, and of
the BIOS Services from the BIOS firmware to do the role of a Hardware Driver,
both of which were usable only from within Real Mode. A developer that wanted to
make a protected mode application without getting support from a protected mode
environment would have had to reimplement absolutely everything from scratch,
similar to a PC Booter for the IBM PC but worse, since those could at least rely
on the BIOS Services to deal with the Hardware. Albeit it was still possible to
make a protected mode DOS application that could use the DOS API and the BIOS
Services by using the 286 reset hack to return to Real Mode, it was slow and
cumbersome to do so (keep in mind that while Hardware support for both Unreal
Mode and resetting the 80286 CPU via a Triple Fault was present in the IBM PC/AT
since Day One, the techniques to use them were not discovered or mastered until
much later. Unreal Mode required knowledge about how to use LOADALL, something
that only privileged developers had access to during the 80's, making it
effectively unavailable. The 286 Triple Fault technique was discovered and
patented early on by Microsoft, so even if public, it was risky to use, and
chances are that most developers didn't know about it back then, either).

On top of the Physical Memory Address Space you have the Memory Map, which
defines how that address space is intended to be assigned to try to cover all
the stuff that has addressing needs, like RAM and ROM memories, the memory of
other Devices to use as MMIO, etc. For backwards compatibility reasons, the IBM
PC/AT Memory Map had to be built on top of the original IBM PC one.
To recap the basic IBM PC Memory Map: IBM defined it by subdividing the 8088 1
MiB Physical Address Space into two segments: a 640 KiB range between 0 and 639
KiB known as Conventional Memory, intended to be used exclusively for system
RAM, and a 384 KiB range between 640 KiB and 1023 KiB known as the UMA (Upper
Memory Area), intended for everything else, including ROMs like the Motherboard
BIOS firmware, Option ROMs in expansion cards, and MMIO, like the Video Card
framebuffer. For the IBM PC/AT Memory Map, in addition to keeping what was in
the IBM PC one, IBM defined that the new address space above 1024 KiB and up to
15 MiB would be known as Extended Memory, intended to be used for more system
RAM (the last 128 KiB were used to mirror the Motherboard ROM, and the remaining
896 KiB between the end of the Extended Memory and the beginning of the ROM
mirror were left undefined, but these are not important). An issue caused by
this arrangement is that the system RAM was no longer a single continuous chunk,
since the presence of the UMA between the Conventional Memory and the Extended
Memory left a memory hole, so software that directly interacted with the
physical memory had to be aware of that (this is something that Virtual Memory
greatly simplifies, as it abstracts these details from user applications).

Some years after the IBM PC/AT release, the first 64 KiB of the Extended Memory
(1024 KiB to 1087 KiB) would become known as the HMA (High Memory Area), as it
was different from the rest of it in that the HMA could be accessed from within
Real Mode (related to the 80286 not behaving like the 8088 with its Address
Wraparound quirk, which the A20 Gate hack worked around), but otherwise it was
simply a subset of the Extended Memory. So far, so good, as this covers the
basic IBM PC/AT Memory Map.
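Summarizing that layout as code, here is a small C sketch of the basic IBM
PC/AT Memory Map (the constant names are mine, for illustration only; end
values are exclusive):

```c
#include <stdio.h>

#define KIB 1024UL
#define MIB (1024UL * KIB)

#define CONV_START   0UL              /* Conventional Memory: system RAM    */
#define CONV_END     (640UL * KIB)
#define UMA_START    CONV_END         /* UMA: BIOS ROM, Option ROMs, MMIO   */
#define UMA_END      (1UL * MIB)
#define HMA_START    UMA_END          /* HMA: Real Mode reachable with A20  */
#define HMA_END      (HMA_START + 64UL * KIB)
#define EXT_START    UMA_END          /* Extended Memory: more system RAM   */
#define EXT_END      (15UL * MIB)
#define ROMMIR_START (16UL * MIB - 128UL * KIB) /* Motherboard ROM mirror   */
#define ROMMIR_END   (16UL * MIB)

int main(void) {
    printf("Conventional: %5lu - %5lu KiB\n", CONV_START / KIB, CONV_END / KIB);
    printf("UMA:          %5lu - %5lu KiB\n", UMA_START / KIB, UMA_END / KIB);
    printf("HMA:          %5lu - %5lu KiB\n", HMA_START / KIB, HMA_END / KIB);
    printf("Extended:     %5lu - %5lu KiB\n", EXT_START / KIB, EXT_END / KIB);
    printf("ROM mirror:   %5lu - %5lu KiB\n", ROMMIR_START / KIB, ROMMIR_END / KIB);
    return 0;
}
```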
4991 4992 Perhaps one of IBM most notorious shortsightedness events was precisely that it 4993 didn't took seriously enough the importance of pushing for a protected mode OS 4994 for the IBM PC/AT as soon as possible, as transitioning to a protected mode 4995 software ecosystem early on could have saved us from the chaos that was DOS 4996 Memory Management during the 90's. The big irony is that just a few months after 4997 the IBM PC/AT release, a viable protected mode OS appeared in the form of IBM PC 4998 XENIX, a Xenix port for the IBM PC/AT that had FAT File System support, fully 4999 replacing any need for the DOS API, the BIOS Services, or Real Mode itself for 5000 any new applications that targeted it. However, since Xenix was aimed at serious 5001 enterprise users, it was priced far above PC DOS, which means that applications 5002 for Xenix would have an even lower potential customer base than an IBM PC/AT 5003 running PC DOS. In addition to that, Xenix itself also required more resources 5004 than PC DOS, so users that didn't made use of any of Xenix advanced features 5005 would certainly have their applications running slower than on PC DOS due to the 5006 higher OS overhead for no tangible benefit. 5007 5008 I believe that such transition could have been done anyways since during the 5009 middle of the 80's, IBM was powerful enough that it should have been possible 5010 for it to force the entire PC/AT software ecosystem to adopt either Xenix or 5011 another new protected mode OS even if it was at the cost of compatibility, like 5012 Microsoft did when it pushed Windows XP, which during the early days had a lot 5013 of compatibility issues with Windows 9x and DOS software. As even at the 1984 5014 release date of the IBM PC/AT you could partition a HD to install multiple OSes, 5015 power users could have survived the transition perhaps with no compatibility 5016 loss, as they could have been able to Dual Boot both PC DOS when requiring an 5017 IBM PC compatible environment and a protected mode OS for PC/AT applications. 5018 Sadly, IBM didn't made any serious attempt to transition the software ecosystem 5019 to protected mode until 1987 with OS/2, but by then, it was already too late. 5020 With application memory requirements increasing yet no mainstream protected mode 5021 OS in the horizon to use the enhanced 80286 addressing capabilities, it seems 5022 logical to expect that such situation created the need for stopgap measures that 5023 allowed applications to use more memory from within PC DOS. The problem relies 5024 on the fact that those stopgap measures lasted for far longer than expected, and 5025 directly increased PC DOS longevity at the detriment of better thought 5026 alternatives that fixed the issue directly from its root... 5027 5028 In 1985, Lotus, Intel and Microsoft teamed up to create a hack that allowed for 5029 more memory to be used from within Real Mode, also done in a way that 5030 conveniently made it usable in the original IBM PC. This hack was known as 5031 Expanded Memory (Not to be confused with Extended Memory). It also had an 5032 associated API, EMS (Expanded Memory Specification). 5033 5034 Expanded Memory was physically implemented as a new type of Memory expansion 5035 cards that had a Memory Controller capable of doing Bank Switching, similar in 5036 nature to the one in EGA Video Cards. Initially, these cards had 256 KiB, 512 5037 KiB or 1 MiB capacities. 
The way that Expanded Memory worked was by reserving an unused 64 KiB address
range in the UMA to use as a window, then mapping to it a specific 64 KiB block
of RAM, known as a Page Frame, from the Expanded Memory card. For example, a
card with 512 KiB Expanded Memory would be partitioned into 8 Page Frames, each
with 64 KiB RAM. By switching which Page Frame was visible in the UMA window at
a given time, it was effectively possible to use more RAM, albeit at a notorious
complexity and overhead cost, since the application had to keep track of which
Page Frame had which contents, then switch on demand between them. A later
generation of Expanded Memory cards that came with a new version of the EMS
specification allowed a compliant card RAM to be subdivided into 16 KiB blocks,
so that four of these smaller Pages could be mapped at a given time into the 64
KiB UMA window instead of only a single 64 KiB one. The reserved 64 KiB window
for Expanded Memory located somewhere in the UMA would become a common sight in
future PC and PC/AT Memory Maps, as it became ubiquitous enough.

Since in order to properly do Bank Switching the Memory Controller had to be
managed, the Expanded Memory cards always required Drivers. To hide this from
the application developers, an API was defined, the previously mentioned EMS,
which allowed the software developer to rely on it to access the Expanded Memory
instead of having to manually program the Memory Controllers themselves. This
was quite important, as there were multiple Expanded Memory card manufacturers
whose Hardware implementations were different, so using the EMS API provided a
very convenient Hardware Abstraction Layer so that applications didn't have to
include Drivers for each card.
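As a sketch of how an application consumed that API, the following C fragment
allocates and maps Expanded Memory through the EMS software interrupt (INT 67h).
It assumes a DOS-era compiler in the Turbo C / Microsoft C style, where
`<dos.h>` provides `int86()` and `MK_FP()`; error handling is minimal, a real
program would first verify that the EMM Driver is actually installed, and the
page frame usage is only illustrative:

```c
#include <dos.h>   /* int86(), union REGS, MK_FP() */
#include <stdio.h>

#define EMS_INT 0x67

int main(void) {
    union REGS r;
    unsigned frame_seg, handle;

    /* Function 41h: get the segment of the 64 KiB Page Frame window. */
    r.h.ah = 0x41;
    int86(EMS_INT, &r, &r);
    if (r.h.ah != 0) return 1;      /* nonzero AH means an EMS error */
    frame_seg = r.x.bx;

    /* Function 43h: allocate four 16 KiB logical pages (64 KiB). */
    r.h.ah = 0x43;
    r.x.bx = 4;
    int86(EMS_INT, &r, &r);
    if (r.h.ah != 0) return 1;
    handle = r.x.dx;                /* EMS handle for this allocation */

    /* Function 44h: map logical page 0 into physical page 0 of the
       window. After this, that RAM is visible at frame_seg:0000. */
    r.h.ah = 0x44;
    r.h.al = 0;                     /* physical page inside the window */
    r.x.bx = 0;                     /* logical page of our handle      */
    r.x.dx = handle;
    int86(EMS_INT, &r, &r);
    if (r.h.ah != 0) return 1;

    /* Touch the mapped RAM through a far pointer to the window. */
    *(unsigned char far *)MK_FP(frame_seg, 0) = 0xAA;
    printf("EMS page mapped at segment %04X\n", frame_seg);

    /* Function 45h: release the handle before exiting. */
    r.h.ah = 0x45;
    r.x.dx = handle;
    int86(EMS_INT, &r, &r);
    return 0;
}
```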
Sometime around early 1988, Lotus, Intel and Microsoft teamed up again, now also
along with AST, to develop another API: XMS (Extended Memory Specification). XMS
was conceptually different from EMS since it didn't require being paired with
new Hardware; instead, it focused on managing the existing Extended Memory via a
new Driver known as the XMM (Extended Memory Manager).

What the XMS API did was merely to standardize how a Real Mode DOS application
could use the Extended Memory, by delegating to the XMM Driver the
responsibility of switching Processor modes, exchanging data between the
Conventional Memory and the Extended Memory, then restoring the Processor state.
Basically, an application that supported XMS just had to use its API to ask the
XMM Driver to move data to or from the Extended Memory, then let it take care of
everything else, significantly easing application development since the
developer had no need to meddle with all the mandatory hassles required to
access the Extended Memory from within a mostly Real Mode environment.
Executable code was expected to remain in the Conventional Memory since the XMS
API just moved RAM contents around, leaving the Extended Memory as a sort of
secondary storage just for data, so the 640 KiB Conventional Memory limit was
still important.
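Unlike EMS, XMS is not called through a fixed software interrupt: the
application asks INT 2Fh for the XMM entry point, then far-calls it with a
function number in AH. A rough sketch, again assuming a Turbo C style compiler
(`int86x()`, `segread()`, `MK_FP()` and register pseudo-variables like `_AH`
and `_AX` are that compiler family's conventions, so treat this as
illustrative):

```c
#include <dos.h>   /* int86(), int86x(), segread(), MK_FP(), _AH/_AX/_DX */
#include <stdio.h>

/* Far pointer to the XMM entry point, filled in by xms_init(). */
static void (far *xms_entry)(void);

/* Detect the XMM Driver and fetch its entry point via INT 2Fh. */
static int xms_init(void) {
    union REGS r;
    struct SREGS s;

    r.x.ax = 0x4300;                /* XMS installation check        */
    int86(0x2F, &r, &r);
    if (r.h.al != 0x80) return 0;   /* no XMM Driver present         */

    segread(&s);
    r.x.ax = 0x4310;                /* get XMM entry point in ES:BX  */
    int86x(0x2F, &r, &r, &s);
    xms_entry = (void (far *)(void))MK_FP(s.es, r.x.bx);
    return 1;
}

int main(void) {
    unsigned largest, total;
    if (!xms_init()) {
        printf("No XMM Driver loaded\n");
        return 1;
    }
    /* Function 08h: query free Extended Memory. Returns the largest
       free block in AX and the total free amount in DX, in KiB. */
    _AH = 0x08;
    (*xms_entry)();
    largest = _AX;
    total   = _DX;
    printf("Largest free block: %u KiB, total free: %u KiB\n",
           largest, total);
    return 0;
}
```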
The first XMM Driver was probably Microsoft HIMEM.SYS, which also doubled as an
A20 Handler. The first version of HIMEM.SYS that was distributed should have
been the one included in Windows 2.1 (Windows/286), released around the middle
of 1988. In 1991, a newer version of HIMEM.SYS was included with PC DOS/MS-DOS
5.0, so that XMS was usable in a pure DOS environment (keep in mind that while
it worked on the original 1984 IBM PC/AT, it isn't really era appropriate, as by
then these early 286s were rather ancient). A very interesting detail is that
the Microsoft XMM Driver was much faster than it should have been under normal
circumstances, and that is because Microsoft used all its knowledge about the
Intel Processors undocumented functions to cheat. Many versions of HIMEM.SYS
relied on [LOADALL and its Unreal Mode][unrealMode] to access the Extended
Memory, completely bypassing the standard way of doing things that included
entering protected mode then resetting the Processor. Basically, any application
that used XMS under DOS in a 286 based computer transparently benefited from the
performance hacks involving Unreal Mode.

[unrealMode]: http://www.os2museum.com/wp/himem-sys-unreal-mode-and-loadall/


5 - The Turbo ATs and the first Chipsets
----------------------------------------

To the surprise of no one, clone manufacturers didn't rest on their laurels.
Right after mastering how to clone the PC, they began to work on how to clone
the PC/AT. The early PC/AT clones should either slightly predate or be
contemporaries of the earliest Turbo XTs (not PC-likes), and be both faster and
more expensive than them. The PC/AT clones would almost immediately evolve into
Turbo ATs (a term that is rarely used), which were faster than the system that
they were supposed to be compatible with.

Turbo ATs would eventually evolve further with the introduction of the first
chipsets, as those would dramatically alter the platform topology. This also
highly reduced the variety of original Motherboard designs, as Chipset based
Motherboards tended to consolidate their feature set around a specific Chipset
characteristics, thus a vendor had to really go out of its way to do something
different enough. Turbo XTs would eventually get chipsets, too, but apparently
slightly later.


##### 5.1 - First generation PC/AT clones and Turbo ATs, integrated BIOS Setup

The first PC/AT clone was the [Kaypro 286i][kaypro286i], announced at the end of
February 1985 and available in March, a mere 7 months after the first IBM PC/AT
release. It had some redeeming features over the IBM PC/AT, like supporting 640
KiB RAM mapped as Conventional Memory installed on the Motherboard itself, as in
the later IBM PC/XT 5162, instead of just 512 KiB as in the PC/AT. Note that the
80286 @ 6 MHz was of a cheaper soldered variant instead of a socketed one, but
otherwise performed the same. Other than that, it didn't do anything different
enough to be remarkable.

In July 1985, 4 months after the Kaypro 286i and almost a year after the
original IBM PC/AT, Compaq released its [DeskPro 286][deskpro286]. What made it
interesting is that it had an Intel 80286 CPU that could run at two operating
modes: one known as the Slow Mode that worked at 6 MHz for full compatibility
with the original IBM PC/AT, and a Fast Mode that ran at 8 MHz (note that it
predates IBM own 8 MHz PC/AT model). Unlike most Turbo XT systems, the DeskPro
286 didn't have a Turbo Button; instead, it relied on either a Key Combination,
Ctrl + Alt + \\, or the MODE SPEED command in MS-DOS. Since it explicitly
mentions two different clock speeds, I suppose that it has two different Crystal
Oscillators like Turbo XTs and switches between them on demand. As I'm not aware
of an earlier system with such a feature, the DeskPro 286 may be the first PC/AT
compatible that can be classified as a Turbo AT.
A major difference between Turbo XTs and Turbo ATs is what they were intended to
be compatible with. As you already know, a lot of early PC software was tuned
expecting only an IBM PC with its 8088 CPU @ 4.77 MHz, thus it didn't work as
intended on faster computers like Turbo XTs or PC/AT based ones. This exact
issue also happened with some PC/AT software, as at least a few applications
were tuned specifically for the IBM PC/AT (usually related to those pesky copy
protection schemes...), failing to work properly on anything faster than it.
However, when it comes to PC/AT software, there was an additional hurdle: which
IBM PC/AT version was an application tuned for? Whereas both the PC and PC/XT
had identical performance, there was an early 6 MHz PC/AT and a later 8 MHz
model, plus the PC/XT 5162 that ran at 6 MHz but with 0 Memory WS. The
performance of these three PC/AT platform variants was different. As all PC/AT
software released before 1986 would only be expecting the original 6 MHz IBM
PC/AT model, it was possible that some of these very early applications would
not work as intended even on IBM own PC/AT variants. Later PC/AT compatible
manufacturers noticed this and implemented user configurable clock speeds and
Memory WS, so that their PC/AT clone computers could perfectly match any of the
three IBM PC/AT variants performance levels, resulting in the Turbo ATs being at
times more compatible with older PC/AT software and expansion cards than the
later IBM PC/AT 8 MHz models themselves, as IBM never bothered to implement a
feature similar to Turbo.

An example of a 1987 Turbo AT that covered the full PC/AT compatibility spectrum
is the [AST Premium/286][astPremium286], which had an 80286 CPU that could run
at either 6 MHz @ 1 Memory WS, 8 MHz @ 2 Memory WS (albeit based on a page that
lists the Motherboard Jumpers, it seems that it is actually 8 MHz @ 1 Memory WS,
which makes sense, as otherwise it wouldn't match the 8 MHz IBM PC/AT), or 10
MHz @ 0 Memory WS. Getting a 286 to run at 10 MHz @ 0 Memory WS was quite an
accomplishment, and it made it slightly faster than another contemporary 286
PC/AT compatible system running at 12 MHz @ 1 Memory WS. However, it required
new and expensive 100 ns DRAM chips mounted in a custom, very long Memory
expansion card that used a proprietary connector, known as FASTram, that was
pretty much a standard 16 Bits I/O Channel Slot followed by another separate
section at the end of the card (this is roughly the same thing that VESA Local
Bus Slots would do years later). The Motherboard had two slots with the FASTram
extension, so you could use two of these custom Memory expansion cards. Finally,
like the Compaq DeskPro 286, the AST Premium/286 didn't have a Turbo Button; it
changed clock speed via a Key Combination, yet for convenience, the Case Front
Panel had 3 LEDs to indicate at which clock speed the Processor was currently
running (changing Memory Wait States required dealing with Motherboard
Jumpers). You can read the experiences and bragging rights of the AST
Premium/286 Motherboard designer [here][turboButton], in The AST 286 Premium
section (worthy of mention is that he is the same guy that wrote the Assembly
code of the original Phoenix BIOS).

Something that I don't remember hearing about is any PC/AT compatible that
could go lower than 6 MHz to try to have better compatibility with early PC
software, as a 286 @ 3 MHz may have been not much above the PC 8088 @ 4.77 MHz
performance level and thus could have been far more usable for things like
games. Considering that, as explained elsewhere, perfect PC compatibility was
impossible due to the 286 not being cycle accurate with the 8088, it makes sense
that PC/AT compatible manufacturers didn't bother adding a way to slow down
their newer flagship systems, since it was a lost cause anyway (ironically, at
least one 80386 based system, the [Dell System 310][dellSystem310], could be
underclocked to 4.77 MHz. Perhaps Dell added such an option because, as you
already know, the 4.77 MHz Frequency is easy to derive). Actually, compatibility
with early PC software would crumble rather fast after the PC/AT generation, as
no one going for the high end segment of the PC/AT compatible market would even
attempt to bother supporting it any longer.

The good thing is that by the end of the PC/AT generation (more likely after the
release of the 8 MHz IBM PC/AT), there was a paradigm shift in how software
developers took care of implementing any type of timers or timings in their
applications. Instead of outright relying on speed loops or any other bad
practices that worked at that moment but killed forward compatibility, software
developers began to be aware that they needed to be more careful so as to not
make things that were hardly usable on faster computers, as by that point it had
already become quite obvious that the PC/AT platform was going to have faster
versions of the same base platform, or successors that were going to be mostly
backwards compatible. This paradigm shift is why late 80's software is much less
problematic on faster computers from the late 90's compared to much of the early
software that pretty much forced you to use a PC class computer. There are some
notorious exceptions like the 1990 Origin Systems game Wing Commander, which
could play faster than intended on some era accurate 80386 systems, depending on
the Processor clock speed and external Cache configuration.
In general, timing issues and bad programming practices would still be present
for a long time, but they took many years to manifest instead of doing so on the
very next computer model, as happened during the early days. Perhaps the most
famous timing issue of the 90's was the Windows Protection Error when trying to
boot Windows 95 on [AMD Processors over 350 MHz][amd350mhz], as it was a pest
for mainstream users that in many cases forced the upgrade to Windows 98. This
bug was recently [researched by OS/2 Museum][fastMachineWin9xCrashes], which
noticed both divide by zero and division overflow bugs in a speed loop code that
relied on the LOOP instruction, then came to the conclusion that the reason why
AMD Processors were affected but Intel ones were not is that on an era accurate
AMD K6-II, the LOOP instruction executed with a mere 1 clock cycle of latency,
whereas on an Intel Pentium Pro/II/III it took 6 clock cycles, thus the former
would trigger the bug at a much lower clock speed than otherwise (on much later
Intel Processors that can finally match the K6-II speed on that particular
routine, the bug is reproducible, too). Every now and then other ancient timing
issues pop up when trying to run older pieces of software on newer computers,
like [this early Linux SCSI Controller Driver][timingIsHard], but none of these
are as notorious as the Windows 95 one, nor is their cause as esoteric in
nature.
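The essence of that class of bug is a calibration routine that counts how many
delay-loop iterations fit into a fixed time slice, then divides by the result.
Here is a minimal C sketch of the failure mode (an illustration of the general
pattern only, not the actual Windows 95 code):

```c
#include <stdio.h>

/* Pretend calibration: count how many delay-loop iterations run during
   one fixed timer tick. On a faster CPU (or one where the loop body
   costs fewer cycles, like LOOP on a K6-II), the count grows. */
static unsigned long calibrate(unsigned long cpu_speed_factor) {
    return 1000UL * cpu_speed_factor;   /* stand-in for real counting */
}

int main(void) {
    unsigned long speeds[] = { 1UL, 100UL, 100000UL };
    int i;
    for (i = 0; i < 3; i++) {
        unsigned long count = calibrate(speeds[i]);
        /* The fatal step: derive a divisor that shrinks as the CPU gets
           faster. Past some speed it rounds down to zero, and the later
           division by it becomes a divide error at runtime. */
        unsigned short divisor = (unsigned short)(1000000UL / count);
        if (divisor == 0) {
            printf("speed %6lux: divisor is 0 -> divide error\n", speeds[i]);
            continue;
        }
        printf("speed %6lux: divisor %u, delay unit %lu\n",
               speeds[i], divisor, 1000000UL / divisor);
    }
    return 0;
}
```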
Finally, while I'm not sure about the exact time frame (it could be as late as
1988), an extremely important feature that everyone considers a given nowadays
began to appear in either PC/AT clones or Turbo ATs. Third party BIOS firmware
vendors like Phoenix Technologies and AMI decided to integrate the BIOS
configuration application into the firmware ROM image itself, giving birth to
the modern BIOS Setup. By integrating this critical tool, if you ever wanted to
configure the BIOS Settings, you could use a Key Combination during POST to
enter the BIOS Setup, without having to worry about preserving the Diskette that
came with the system, which could be damaged or become lost. Some firmware
vendors even offered third party BIOSes for the original IBM PC/AT that included
an integrated BIOS Setup, which were delivered as a pair of ROM chips to replace
the standard IBM ones (these firmware images should also work in 100% compatible
PC/AT clones, too). At the time this integration was a minor convenience, but as
the PC/AT compatible platforms began to drastically differ between themselves
and thus required a BIOS configuration tool appropriately customized for that
computer or Motherboard revision, having the BIOS Setup integrated was becoming
a necessity. Some time later, during the IBM PS/2 generation time frame, IBM
would learn the hard way that having a multitude of different configuration
Diskettes was a logistical nightmare...

[kaypro286i]: https://books.google.com/books?id=vVNXxWE6BX0C&pg=PA35&lpg=PA35#v=onepage&q&f=false
[deskpro286]: https://books.google.com/books?id=Cy8EAAAAMBAJ&lpg=PA40&pg=PA39#v=onepage&q&f=false
[astPremium286]: https://books.google.com/books?id=1TAEAAAAMBAJ&pg=PA53&lpg=PA53#v=onepage&q&f=false
[turboButton]: https://www.quora.com/History-of-Computing-What-did-the-turbo-button-on-early-90s-PCs-do/answer/Ira-J-Perlow?share=1
[dellSystem310]: https://books.google.com/books?id=Eq0wALnyM_MC&lpg=PA36&pg=PA35#v=onepage&q&f=false
[amd350mhz]: https://jeffpar.github.io/kbarchive/kb/192/Q192841/
[fastMachineWin9xCrashes]: http://www.os2museum.com/wp/those-win9x-crashes-on-fast-machines/
[timingIsHard]: http://www.os2museum.com/wp/timing-in-software-is-too-hard/


##### 5.2 - The first PC/AT Chipset, C&T CS8220. Feature overview and Clock Generation

A major breakthrough that revolutionized the computer industry was the
introduction of the Chipset. Up to this point, it seems that the only one that
had to deal with fierce competition was IBM. Whereas clone computer
manufacturers had been trying to differentiate themselves from IBM by providing
computers that were faster or had more features than IBM ones while remaining as
close as possible to 100% compatible with them, most of the semiconductor
industry had been happy enough merely with being used as second sources that
manufactured Intel designed chips (having second sources was a requirement that
IBM imposed on Intel. At least until Intel got big enough that it didn't need
those second sources any longer, at which point Intel tried to get rid of them,
but that is part of another story...). So far, NEC was the only one that I'm
aware of that early on did successfully improve an Intel design, with its
V20/V30 CPUs.

Eventually, some smaller semiconductor designers wanted to create their own
chips that could compete with Intel ones yet remain compatible with them. One of
their ideas was to integrate as much as possible of the PC/AT platform
functionality into a reduced set of specialized chips intended to be always used
together, which is what in our modern days we know as a Chipset (not to be
confused with the original chip set meaning, which is very similar, just that
the individual chips were far more generalist). The Chipset effectively began
the trend of integration that culminated in much more affordable computers, as
providing the same functionality of a multitude of discrete generalist chips
with far fewer parts allowed for smaller, simpler, and thus cheaper
Motherboards.

The Chipset era also began the consolidation of the PC/AT platform. Instead of
having a bunch of general purpose support chips that in the IBM PC/AT behaved in
a certain way due to how they were wired, you had a smaller set of chips
intended to reproduce the generalist chips behavior exactly as implemented in
the IBM PC/AT itself, at the cost of leaving no room for further additions. For
example, the Intel 8259A PIC supported being cascaded with up to seven other
8259As, for a total of eight of them interconnected together. The PC used only
one PIC, and its successor, the PC/AT, used two, but there was nothing stopping
a third party computer manufacturer from making its own superset of the PC/AT
platform by using three or more and still be mostly backwards compatible. A
Chipset that was intended to reproduce the IBM PC/AT functionality would behave
in the same way as its two cascaded 8259As, but it was impossible to wire a
third standalone 8259A to a Chipset, because Chipsets weren't intended to be
interfaced with more support chips. As chipsets invaded the market, they pretty
much cemented the limits of the PC/AT platform, since it was not possible to
extend a highly integrated Chipset by adding individual support chips. Thus, the
evolution of the PC platform as a whole became dominated by which features got
into chipsets and which did not.
The very first Chipset was the C&T (Chips & Technologies) CS8220, released in
1986, and whose datasheet you can see [here][cs8220]. The CS8220 Chipset was
composed of 5 parts: the 82C201 System Controller (which also had a better
binned variant for 10 MHz Bus operation, the 82C201-10), the 82C202 RAM/ROM
Decoder, and the 82C203, 82C204 and 82C205 specialized Buffer chips. These chips
integrated most of the IBM PC/AT generalist glue chip functionality (including
the parts that made up the primitive Memory Controller, its Refresh Logic, and
the A20 Gate) but few of the Intel support chips, as from those, the CS8220 only
replaced the two Clock Generators, namely the Intel 82284 and 8284A, and the
Intel 82288 Bus Controller. Nonetheless, that was enough to allow the C&T CS8220
Chipset to provide almost all of the platform's different reference clocks, to
interface directly with the Intel 80286 CPU and the Intel 80287 FPU, and to do
the role of Memory Controller for the system RAM and ROM memories. While by
itself the CS8220 didn't provide the full PC/AT platform functionality, it could
do so if coupled with the standard set of Intel discrete support chips. So far,
the CS8220 didn't add anything revolutionary from a feature or performance
standpoint, yet the way that it simplified Motherboard designs allowed for
easier implementations of the IBM PC/AT platform, making it the first step in
bringing PC/AT compatible computers to the masses.

On the first page of the CS8220 datasheet there is a Block Diagram of a
reference 286 platform based on it. The Block Diagram showcases four Buses: the
Local Bus for the Processor, the Memory Bus for the Motherboard RAM and ROM
memory, the System Bus for expansion cards, and the Peripheral Bus for support
chips. The Block Diagram also makes clear that each chip had a very specific
role, as the Data, Address and Control components of each Bus were each dealt
with by a specialized chip (actually, the Address Bus was divided into the lower
16 Bits and the upper 8 Bits, requiring two chips). Since the System Bus and the
Peripheral Bus of the CS8220 effectively fulfilled almost the same roles as the
I/O Channel Bus and the External I/O Channel Bus of the original IBM PC/AT,
respectively, the Bus topologies of both platforms are directly comparable, as
the IBM PC/AT also had 4 Buses. When comparing both sets of Buses, it is rather
easy to notice that the main difference is that in the CS8220, the Buses are
much more clearly defined, due to the Chipset acting as a formal separator that
makes them look like individual entities instead of mere transparent extensions
of the Local Bus that go through a multitude of glue chips.

**Local Bus:** In the Local Bus of a CS8220 based platform, you have the 286 CPU
and the 287 FPU, along with the Chipset 5 components. The Processor no longer
interfaces directly with any other system device; for everything but the FPU it
now always has to go through the Chipset, which centralizes everything.

**Memory Bus:** The Memory Bus still pretty much works in the same way as in the
original IBM PC/AT, including being wider than the Processor Data Bus to include
Parity support, just that now the Chipset includes specialized logic that can be
formally called a Memory Controller, instead of merely using generalist chips
that were wired together to serve that role.
However, a major difference is that now the Memory Controller is closer to the
Processor, since there are far fewer glue chips that it has to go through to get
to it, which is quite convenient, as shorter physical distances potentially
allow for lower latencies and higher operating Frequencies. For comparison, if
the Processor wanted to read or write to the system RAM located on the
Motherboard, in the original IBM PC/AT the commands and data had to travel from
CPU -> Local Bus -> Buffer Chips -> I/O Channel Bus -> Memory Controller ->
Memory Bus -> DRAM Chips, whereas in a CS8220 based platform they had to do one
hop less: CPU -> Local Bus -> Chipset/Memory Controller -> Memory Bus -> DRAM
Chips.

**System Bus:** Albeit it is not obvious at first glance, the new System Bus
saw perhaps the most significant changes when compared to its predecessor. The
most visible difference is that whereas in the IBM PC/AT almost everything had
to go through the I/O Channel Bus, in Chipset based platforms, the Chipset is
the one that centralizes the Buses and takes care of interfacing them together,
leaving the System Bus relegated to being just a specialized Bus for expansion
cards. However, as a consequence of that, the System Bus is no longer a direct
extension of the Local Bus; it is now a completely separate entity that can have
its own protocol. What this means is that if a new x86 Processor type changed
the Local Bus protocol (which eventually happened a multitude of times), an
appropriate Chipset could easily take care of interfacing both Buses by
translating between the new and old protocols, so that it would still be
possible to use I/O Channel Cards that were designed for the 8088 or 80286
Local Bus protocol in a much newer platform. This effectively began to pave the
way for fully decoupling whatever Bus the expansion cards used from the one that
the Processor used.

**Peripheral Bus:** Finally, the Peripheral Bus had the same duties as the
external I/O Channel Bus, as both were intended for support chips, but like the
previously described Memory Bus, the Peripheral Bus and the support chips on it
are closer to the Processor than the external I/O Channel Bus was, due to a
lesser amount of glue chips. Basically, whereas in the original IBM PC/AT
communications from the Processor to a support chip like the PIC had to go from
CPU -> Local Bus -> Buffer Chips -> I/O Channel Bus -> Buffer Chips -> External
I/O Channel Bus -> PIC, in a CS8220 based platform they had one hop less: CPU ->
Local Bus -> Chipset -> Peripheral Bus -> PIC.

What I'm not sure about is how the Chipset address decoding logic worked, as it
is possible that a Chipset was hardwired to always map some address ranges to a
specific Bus. For example, it could be possible that Memory Addresses under 640
KiB and above 1024 KiB had a hardwired mapping to the RAM attached to the
Chipset Memory Bus, conflicting with older Memory expansion cards now located in
the System Bus that wanted to map their own RAM into the Conventional Memory or
Extended Memory ranges.
It may explain why the Memory expansion cards vanished so quickly, as they may
not have been compatible with some Chipset based platforms (I'm aware that at
least the [Dell System 220][dellSystem220] computer, equipped with the C&T
CS8220 Chipset successor, the CS8221, claimed to have upgradable RAM via an
"AT-style memory card", so maybe Memory expansion cards did work. However,
around the same time the SIMM format became popular, so it could simply have
been that the convenience of populating the SIMM Slots on the Motherboard itself
demolished DRAM sockets and Memory expansion cards in less than a
generation...).

The CS8220 Chipset also simplified clock generation. As mentioned before,
instead of requiring the Intel 82284 and 8284A Clock Generators as used in the
original IBM PC/AT, the CS8220 had the 82C201 System Controller fulfilling the
role of Clock Generator. The 82C201 had two reference clock inputs and six
derived clock outputs. As inputs it used two Crystal Oscillators, one that
provided a 14.31 MHz reference clock and another one that could be either 12, 16
or 20 MHz (20 MHz would require the better binned variant of the chip, the
82C201-10). The 14.31 MHz input was used as usual to derive two clock lines, the
14.31 MHz OSC line for the expansion slots and the 1.19 MHz OSC/12 line for the
external 8254 PIT. The other crystal was used to derive four clock lines that
would supply the reference clocks for everything else in the system. These lines
were PROCCLK (Processor Clock), SYSCLK (System Clock), PCLK (Peripheral Clock)
and DMACLK (DMA Clock).

The most important reference clock was the PROCCLK line, which ran at the same
Frequency as the Crystal Oscillator and supplied the reference clock of both
the 286 CPU and the 287 FPU. As you already know, the 286 CPU internally halves
the input clock, and the 287 FPU typically runs at one third of the input (it
can also run at the input clock speed, but as far as I know, only the IBM
PC/XT 5162 used it that way), so assuming a 20 MHz crystal, the effective
operating clock speeds would be 286 @ 10 MHz and 287 @ 6.66 MHz. The
SYSCLK and PCLK lines ran at half the PROCCLK speed, with the difference between
them being that SYSCLK was synchronized to the Processor clock cycles while PCLK
seems to be asynchronous (I'm not sure about the practical difference). SYSCLK
was used for the CLK line of the expansion slots, and PCLK was used as the input
line for the support chips, except the DMACs and the RTC. Finally, the DMACLK
line ran at half the SYSCLK speed (effectively 1/4 PROCCLK) and was used solely
by the pair of Intel 8237A DMACs. Assuming a 20 MHz crystal, the DMACs would be
running @ 5 MHz, which is the highest clock speed that the best rated Intel
8237A DMACs could officially run at.

Missing from the CS8220 Chipset clock generation scheme is everything related to
the RTC, which includes the 32.768 KHz Crystal Oscillator and its Clock
Generator. These should still have been discrete parts.
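The whole divider chain can be condensed into a few lines of C. This sketch just
reproduces the ratios described above for a given crystal (the variable names
are mine, beyond the clock line names themselves):

```c
#include <stdio.h>

int main(void) {
    double crystal = 20.0;            /* MHz; the 12/16/20 MHz input   */

    double procclk = crystal;         /* PROCCLK = crystal frequency   */
    double sysclk  = procclk / 2.0;   /* SYSCLK  = PROCCLK / 2         */
    double pclk    = procclk / 2.0;   /* PCLK    = PROCCLK / 2         */
    double dmaclk  = sysclk  / 2.0;   /* DMACLK  = SYSCLK / 2          */

    printf("286 CPU: %5.2f MHz (PROCCLK / 2)\n", procclk / 2.0);
    printf("287 FPU: %5.2f MHz (PROCCLK / 3)\n", procclk / 3.0);
    printf("Slots:   %5.2f MHz (SYSCLK)\n", sysclk);
    printf("Support: %5.2f MHz (PCLK)\n", pclk);
    printf("DMACs:   %5.2f MHz (DMACLK)\n", dmaclk);
    return 0;
}
```

With the 20 MHz crystal this prints the figures from the text: 286 @ 10 MHz,
287 @ 6.67 MHz, a 10 MHz Bus (hence the need for the 82C201-10 bin) and the
DMACs @ 5 MHz.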
Soon after the CS8220 Chipset, C&T launched the [82C206 IPC][82C206IPC]
(Integrated Peripherals Controller). The 82C206 IPC combined the functions of
the two Intel 8237A DMACs, the two 8259A PICs, the 8254 PIT, and the Motorola
MC146818 RTC (including the spare RTC SRAM) into a single chip. All these
account for almost the entire heart of the PC/AT platform; the missing part is
just the Intel 8042 Microcontroller. The 82C206 could be used either as a
standalone chip or paired on top of a CS8220 platform, in both cases as a
replacement for the mentioned discrete support chips. A curious thing about the
82C206 is that some Internet sources (mostly Wikipedia) claim that it also
integrates the functions of the Intel 82284 and 8284A Clock Generators and of
the Intel 82288 Bus Controller, but those are nowhere to be found in the
datasheet. In the CS8220 Chipset, these functions are performed by the 82C201
and 82C202 components, respectively.

From the clock scheme perspective, the 82C206 had two reference clock inputs:
one that had to be half the Processor input clock line (if paired with the
CS8220 Chipset, it would be the PCLK line), and another clock line with a 32.768
KHz Frequency for the internal RTC (albeit the datasheet mentions that it
supported two other input values, that is the most typical one). The integrated
DMACs could run at either the input clock speed or internally halve it, removing
the need for a third clock input running at half their speed, which is what the
CS8220 DMACLK line was for.

So far, the C&T CS8220 Chipset and the 82C206 IPC fit perfectly the definition
of what would later be known as a Northbridge and a Southbridge. The Northbridge
took care of all the logic required to interface completely different chips and
Buses together, and the Southbridge provided functionality equivalent to the
PC/AT support chips. It is amusing that in the very first Chipset generation the
Northbridge and Southbridge were actually separate, independent products, but in
the next one, they would be parts of the same product.

[cs8220]: https://datasheetspdf.com/datasheet/CS8220.html
[dellSystem220]: https://books.google.com/books?id=NIG9adkUxkQC&pg=PA210&lpg=PA210&#v=onepage&f=false
[82C206IPC]: http://computercaveman.eu/datasheet/386/P82C206-IPC.pdf


##### 5.3 - The second C&T PC/AT Chipset generation, C&T CS8221. Feature overview and Clock Generation

C&T eventually released an improved version of the CS8220 Chipset, the
[CS8221][cs8221], also known as the famous NEAT Chipset. The CS8221 managed to
integrate the functions of the five chips of the previous Chipset into only
three parts: the 82C211 Bus Controller, the 82C212 Memory Controller and the
82C215 Data/Address Buffer. It also added the previously described 82C206 IPC as
an official fourth chip, merging Northbridge and Southbridge as parts of the
same Chipset product.

The CS8221 was a late 286 era Chipset. By that time period, Intel was already
selling 286s binned to run at 16 MHz, while other manufacturers put in some more
effort to get them to 20, or even 25 MHz. The previous CS8220 could run a 286 @
10 MHz if using the better binned variant, but even if 286s could be clocked
higher, 10 MHz was pretty much the upper limit of what the expansion cards
sitting in the System Bus and the support chips could tolerate (the slower,
asynchronous RAM was dealt with by using Wait States), a limit also shared by
8088 Turbo XT platforms.
Among the multiple major features of the CS8221, the most notorious one was that
the System Bus could be configured as a fully independent asynchronous clock
domain that was not bound to the clock speed of the Local Bus. This is the very
reason why it could run a 286 at clock speeds that easily broke the 10 MHz
barrier, since clocking the Processor higher didn't clock almost everything
else higher. While the CS8221 datasheet claims that it supports either 12 or 16
MHz 286s, I'm aware that some Motherboards used it with 20 MHz ones, too.

The CS8221 Chipset had two Clock Generators. The less interesting one was in the
82C212 Memory Controller, which used as input clock the already too many times
mentioned 14.31 MHz crystal to derive the OSC and OSC/12 lines from. The main
Clock Generator was in the 82C211 Bus Controller, which could use as input
either one or two Crystal Oscillators for synchronous (single CLK2IN input) or
asynchronous (CLK2IN and ATCLK inputs) operating modes. In total, the 82C211
supported five clock deriving schemes, three synchronous and two asynchronous,
giving some degree of configuration flexibility according to the crystals used.
Amusingly, the 82C211 had only two output lines, PROCCLK and SYSCLK, which
provided the reference clocks for everything else in the system. As the 82C206
IPC could internally halve the clock for the DMACs, there was no need for
another clock line at all.

At the time that the CS8221 was being used in commercial Motherboards, it seems
that there were two standard scenarios regarding how to clock the platform.
Getting a 286 @ 16 MHz could be easily achieved by relying on just a single 32
MHz Crystal Oscillator wired to the CLK2IN input line, as in one of the Clock
Generator synchronous modes it could be used to derive a 32 MHz PROCCLK (same as
CLK2IN) and an 8 MHz SYSCLK (CLK2IN/4). Basically, the Processor clock ran
synchronous with the rest of the system, but at twice its speed. The other use
case is far more interesting, as it involves a higher clocked Processor, a 286 @
20 MHz. Using a 40 MHz crystal to derive the reference clocks for the entire
system wasn't a good idea, because as in the previous setup, it would also mean
that the System Bus would be @ 10 MHz, which was borderline (albeit still within
the realm of the possible, as the previous CS8220 Chipset had a better binned
version that could do so). By running the Clock Generator in asynchronous mode
with a companion 16 MHz crystal wired to the ATCLK input, it was possible to
have a 40 MHz PROCCLK (same as CLK2IN) with an 8 MHz SYSCLK (ATCLK/2). This
seems to be the way that Turbo AT manufacturers got their high speed 286s
running, like the [Dell System 220][dellSystem220] and the [GenTech
286/20][gentech286], both of which had a 286 @ 20 MHz with an 8 MHz Bus using a
C&T Chipset. This 286 era Hardware is quite representative of how the platform
topology would be for the next 20 years.
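Those two clocking scenarios can be compared with another short sketch, again
just encoding the ratios the text describes (the mode names are mine, not
datasheet terminology):

```c
#include <stdio.h>

/* Derive CPU core and System Bus clocks for the two CS8221 setups
   described above: synchronous (one crystal) and asynchronous (two). */
static void sync_mode(double clk2in) {
    printf("sync:  286 @ %5.2f MHz, Bus @ %4.2f MHz (CLK2IN/4)\n",
           clk2in / 2.0,            /* the 286 halves PROCCLK = CLK2IN */
           clk2in / 4.0);
}

static void async_mode(double clk2in, double atclk) {
    printf("async: 286 @ %5.2f MHz, Bus @ %4.2f MHz (ATCLK/2)\n",
           clk2in / 2.0,
           atclk / 2.0);
}

int main(void) {
    sync_mode(32.0);        /* 32 MHz crystal -> 16 MHz CPU, 8 MHz Bus */
    async_mode(40.0, 16.0); /* 40 + 16 MHz    -> 20 MHz CPU, 8 MHz Bus */
    return 0;
}
```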
The other major features introduced by the CS8221 Chipset involved its advanced
Memory Controller. The 82C212 Memory Controller supported up to 4 Banks with
Parity, as usual, but it had extensive dynamic mapping capabilities that allowed
the RAM managed by it to be used to emulate Expanded Memory (EMS), and also
allowed for the introduction of a new type of memory known as Shadow RAM. These
features by themselves wouldn't be as important if it wasn't because at the same
time that Motherboards with this Chipset were being designed, the SIMM Memory
Module format became prominent. With SIMMs, it was possible to save a ton of
physical space in the Motherboard compared to the old way of having sockets for
individual DRAM chips, and that space could then be used for more SIMM Slots, so
that you could install even more RAM in the Motherboard itself. All that RAM was
directly managed by the Chipset Memory Controller as a single unified RAM pool
that could be partitioned and mapped as the user wanted, removing any need for
Memory expansion cards.

In order to understand in retrospect how amazing the capabilities of the CS8221
Memory Controller were, you first have to consider how much RAM a PC/AT of the
immediately previous generation had, and the Hardware required to actually
install that much RAM. As an example, suppose that a well geared 1988 PC/AT
computer had at the bare minimum the full 640 KiB of Conventional Memory. As not
all computers supported having that much installed on the Motherboard itself
(this was the case of the IBM PC/AT, which supported only 512 KiB RAM on the
Motherboard), chances are that to get to 640 KiB, the computer needed a Memory
expansion card. I suppose that it was preferable to pick an Extended Memory card
that could also backfill the Conventional Memory instead of wasting an entire
expansion slot for just 128 KiB RAM or so. However, since most applications were
unable to use Extended Memory due to the 80286 CPU idiosyncrasies, for general
purposes only the first 64 KiB of Extended Memory for the HMA actually mattered
(albeit before 1991, which is when PC DOS 5.0 introduced HIMEM.SYS in a pure DOS
environment, I think that only Windows/286 or later could make use of it).
Meanwhile, the applications that were memory heavy relied on EMS, which means
that you also required an Expanded Memory card (there were also pure software
emulators that could use Extended Memory to emulate Expanded Memory, but I
suppose that these were very slow and used only as a last resort. I don't know
how Expanded Memory emulators were supposed to work on a 286, as those are
functionally different from the better known 386 ones). Thus, a well geared
PC/AT would probably have two memory expansion cards: one with 512 KiB or so of
Extended Memory that also backfilled the Conventional Memory, and a 512 KiB or 1
MiB Expanded Memory one. The CS8221 Chipset along with SIMM Memory Modules would
dramatically change that...

A typical Motherboard based on the CS8221 Chipset had 8 SIMM Slots. SIMMs had to
be installed in identical pairs to fill a Bank (two 9-Bit SIMMs for an 18-Bit
Bank). With SIMM capacities being either 256 KiB or 1 MiB, a computer could have
from 512 KiB up to 8 MiB installed on the Motherboard itself, which at the time
was a massive amount. The magic of the Memory Controller relied on its mapping
flexibility, which could be conveniently managed via software.

Basically, you could install a single RAM memory pool in the Motherboard via
SIMMs without having to touch a single Jumper, then set in the BIOS Setup how
you wanted to map it. For example, with 2 MiB installed (8 256 KiB SIMMs), you
could fill the 640 KiB of Conventional Memory (I'm not sure if mapping less than
that was possible in the CS8221. The original IBM PC/AT didn't require maxing
out Conventional Memory to use an Extended Memory expansion card), then choose
how much of the remaining 1408 KiB would be mapped as Extended Memory or used
for Expanded Memory emulation. If you wanted, you could tell the BIOS Setup to
use 1024 KiB for Expanded Memory, then leave 384 KiB for Extended Memory. In
summary, the Chipset Memory Controller took care of all the remapping duties so
that your system RAM was where you wanted it to be, and all this was possible
without the need of specialized Hardware like the previous Memory expansion
cards, nor having to pay the performance overhead of software emulation. A
trivial detail is that the Memory Controller required an EMS Driver for the
Expanded Memory to work, something that perhaps makes this Chipset the first one
to require its own custom Driver to be installed instead of relying on generic
PC or PC/AT 100% compatible firmware and OS support.
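The arithmetic of that 2 MiB example, made explicit in a trivial C sketch. The
variable names and printed layout are mine, purely for illustration; on real
Hardware this split was a couple of BIOS Setup options programming the 82C212
mapping registers:

```c
#include <stdio.h>

int main(void) {
    int total_kib        = 8 * 256;  /* 8 SIMMs of 256 KiB = 2 MiB */
    int conventional_kib = 640;      /* always filled first        */
    int expanded_kib     = 1024;     /* chosen in the BIOS Setup   */
    int extended_kib     = total_kib - conventional_kib - expanded_kib;

    printf("Conventional: %4d KiB\n", conventional_kib);
    printf("Expanded:     %4d KiB\n", expanded_kib);
    printf("Extended:     %4d KiB\n", extended_kib);  /* 384 KiB left */
    return 0;
}
```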
The other Memory Controller feature was Shadow RAM. By the time of the CS8221
Chipset, RAM memory was significantly faster than ROM chips. The PC/AT platform
had several ROMs that were read very often, like the BIOS due to the BIOS
Services, and the Video Card VBIOS due to its own routines. Shadow RAM consisted
of copying the contents of these ROMs into RAM memory right after POST, then
telling the Memory Controller to map that RAM into the same fixed, known address
ranges where these ROMs were expected to be. Thanks to this procedure, the ROMs
were read only once to load them into RAM, then applications would transparently
read them from it, which was faster. This resulted in a significant performance
boost for things that called the BIOS Services or the VBIOS often enough. After
copying the ROM contents to the Shadow RAM, it was typically write protected,
both for safety reasons and to reproduce ROM behavior, as it was impossible to
directly write to a ROM anyways. However, write protecting the Shadow RAM was
not mandatory, so I suppose that either due to an oversight or maybe
intentionally, someone could have left it writable so that live patches could be
applied to things like the BIOS or VBIOS code. I wonder if someone ever had fun
doing that?
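The procedure itself is simple enough to outline. A conceptual C sketch of what
a POST routine does when shadowing, say, the VBIOS; the chipset_* helpers are
hypothetical stand-ins for CS8221 register programming, and a real BIOS does
this in assembly, but the sequence is the same:

```c
#include <string.h>
#include <stdint.h>

#define VBIOS_BASE 0xC0000u            /* 768 KiB */
#define VBIOS_SIZE 0x8000u             /* 32 KiB, two 16 KiB chunks */

extern void chipset_map_ram_at(uint32_t base, uint32_t size);     /* hypothetical */
extern void chipset_write_protect(uint32_t base, uint32_t size);  /* hypothetical */

void shadow_vbios(void) {
    static uint8_t buffer[VBIOS_SIZE];

    /* 1. Read the ROM contents once into a temporary buffer. */
    memcpy(buffer, (const void *)(uintptr_t)VBIOS_BASE, VBIOS_SIZE);
    /* 2. Map RAM over the same UMA range the ROM responded to. */
    chipset_map_ram_at(VBIOS_BASE, VBIOS_SIZE);
    /* 3. Write the contents back, now landing in the faster RAM. */
    memcpy((void *)(uintptr_t)VBIOS_BASE, buffer, VBIOS_SIZE);
    /* 4. Optionally write protect it to reproduce ROM behavior. */
    chipset_write_protect(VBIOS_BASE, VBIOS_SIZE);
}
```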
What can be shadowed is extremely dependent on the Chipset capabilities. In the
case of the CS8221 Chipset, it seems to be able to shadow the entire 384 KiB of
the UMA in individually configurable chunks of 16 KiB (this is what the
datasheet says that the Chipset supports; the BIOS developers of a particular
Motherboard could have skimped on exposing the full settings and just left the
ones to enable shadowing in the ranges that they thought mattered the most).
However, shadowing the entire UMA was rather pointless, because there are things
that shouldn't be shadowed to begin with, like the Video Card framebuffer
(depending on the Video Card type, it could be as much as 128 KiB), which is
already RAM, or the 64 KiB window for the Expanded Memory, which is also RAM.
Typically, the two most important ranges to shadow were the BIOS (896 KiB to 959
KiB) and the VBIOS (768 KiB to 799 KiB), which means that in total, 96 KiB of
RAM had to be set aside for general shadowing purposes if you enabled them both.
I suppose that Option ROMs in expansion cards were also worth shadowing, for as
long as you knew at which addresses they were located, so as to not waste RAM
shadowing nothing. Finally, shadowing nothing actually served a purpose, since
doing so still effectively mapped usable free RAM into otherwise unmapped UMA
address ranges, something that in previous generations would have required a
specialized Memory expansion card, as regular users didn't map RAM into the UMA.
That unused mapped RAM would eventually become useful for UMBs (Upper Memory
Blocks). However, UMBs pretty much belong to the 386 era Memory Management
section, since they aren't really era appropriate for a 286, and their
availability on 286 platforms was extremely dependent on the Chipset mapping or
shadowing capabilities.

[cs8221]: https://media.searchelec.com//specshee/CHIPS/P82C206.pdf
[gentech286]: https://books.google.com/books?id=h6qtgYAzqDgC&lpg=PP1&pg=RA10-PP18&redir_esc=y#v=onepage&q&f=false


##### 5.4 - The third C&T PC/AT Chipset generation, C&T 82C235. Feature overview

Later developments from C&T for the 286 include the 82C235 Chipset released
during 1990, also known as SCAT, whose reference platform Block Diagram can be
seen in Page 11 [here][scatDiagram]. The C&T 82C235 integrated almost all the
previously mentioned things into a mere single chip, the most notable exception
still being the Intel 8042 Microcontroller. It is rather ironic if you consider
how almost everything had been consolidated into a single chip, then would
eventually get bloated again in the Intel 80486 and P5 Pentium generations
before repeating the cycle of consolidation. By the time that the 82C235 Chipset
was relevant, computers based on the 80386 CPU were already available in the
mass market and next in line to become mainstream, pushing the 286 based Turbo
ATs as the new budget computers, while those based on the original PC platform,
like the Turbo XTs, were extremely close to obsolescence.

For some reason that I don't understand, the 82C235 had a lower Frequency
ceiling than the previous CS8221 Chipset, since it seems that it supported only
12.5 MHz 286s instead of up to 16 MHz, and there is no mention of Bus speeds at
all (I'm a bit curious about all the "12.5 MHz" 286s found in some datasheets
and PC magazines of the era, since the Processor bins themselves seem to always
have been the standard 12 MHz ones. It is even more weird since the CS8221
Chipset clock generation scheme was as simple as it could get if using
synchronous mode, so there was no reason to change anything. Maybe it was a
small factory overclock that computer manufacturers could get away with?).
There is also the legend of the 25 MHz 286 from Harris, for which I never
bothered to check the details about the platform required to get it running,
like which Chipset supported it and which was the preferred clock generation
method.

Even though C&T appeared to be quite important during the late 80's and early
90's, it would eventually be purchased by Intel in 1997, and its legacy faded
into obscurity...

[scatDiagram]: http://www.bitsavers.org/components/chipsAndTech/1989_Chips_And_Technologies_Short_Form_Catalog.pdf


6 - The Compaq DeskPro 386, the Intel 80386 CPU, 386 era DOS Memory Management, DOS Extenders, and the beginning of the x86-IBM PC marriage
-------------------------------------------------------------------------------------------------------------------------------------------

In September 1986, Compaq, one of the most well known manufacturers and among
the first to release a virtually 100% IBM PC compatible computer, launched the
DeskPro 386. The launch was significant enough to cause a commotion in the
industry, as it was the first time that a clone manufacturer directly challenged
IBM's leadership. Until then, IBM had been the first to use and standardize the
significant platform improvements, with clone manufacturers closely following
the trend that IBM set before eventually attempting to do it better, faster or
cheaper. That was the case with the original IBM PC and PC/AT: the clone
manufacturers would begin with the same parts as IBM, then eventually deploy
higher clocked ones in Turbo XTs and Turbo ATs. A similar thing happened with
the EGA Video Card: IBM designed the card, some semiconductor designers like C&T
integrated it in a Chipset-like fashion, then manufacturers began to use it to
make cheaper EGA compatible Video Cards. This time it was totally different, as
Compaq bested IBM by releasing a PC/AT compatible computer with the latest and
greatest Intel x86 Processor, the 80386 CPU. The consequences of this would be
catastrophic for IBM, as it would begin to lose control of its own platform.

The Compaq DeskPro 386 was the first PC/AT compatible system to make use of the
new Intel 80386 CPU, placing Compaq ahead of IBM. As IBM didn't really plan to
use the 80386 CPU in any of its PC compatible computers since it was still
milking the PC/AT, it was the DeskPro 386 launch that forced IBM to compete to
maintain its spot at the top, which it did when it launched a new platform, the
IBM PS/2, in April 1987. Sadly for IBM, the DeskPro 386's half year lead in the
market gave it an enormous momentum, since other PC/AT compatible manufacturers
began to follow Compaq and pretty much make clones of the DeskPro 386. Besides,
the IBM PS/2 was heavily proprietary in nature, whereas the PC/AT was an open
architecture, which gave PC compatible vendors even more incentive to go with
Compaq's approach, helping the DeskPro 386 to become a de facto standard. As
such, the DeskPro 386 has an enormous historical [importance][deskpro386], as
we're considered [direct descendants of it][deskpro386at30] instead of IBM's
next platform, the PS/2.
[deskpro386]: https://dfarq.homeip.net/compaq-deskpro-386/
[deskpro386at30]: http://www.os2museum.com/wp/deskpro-386-at-30/


##### 6.1 - The Intel 80386 CPU main features, Virtual 8086 Mode, Flat Memory Model, Paged Virtual Memory

The original 80386 CPU, released in October 1985, is perhaps the most important
Processor in all the 35 years of evolution of the x86 ISA, as its feature set
tackled everything that mattered at the best possible moment. The 386 introduced
almost everything of what would later become the backbone of the modern x86
architecture, with the 386 ISA remaining the baseline for late era DOS software,
and going as far as Windows 95 (even if by then the performance of a 386 was far
from enough to be usable, it could still boot it). This happened mostly thanks
to Intel finally learning that backwards compatibility was important, so many of
the 386 features were introduced to solve the shortcomings of the 286.

To begin with, the 80386 was a 32 Bits Processor, as measured by the size of its
GPRs (General Purpose Registers). A lot of things were extended to 32 Bits: the
eight GPRs themselves, which previously were 16 Bits (and for backwards
compatibility, they could still be treated as such), the Data Bus, and the
Address Bus. Extending the Address Bus to a 32 Bits width was a rather major
feature, since it gave the 80386 a massive 4 GiB (2^32) Physical Address Space.
Protected mode was upgraded to allow returning to Real Mode by just setting a
Bit, completely removing the need of resetting the Processor and all the
involved overhead when using any of the 286 reset hacks. The 386 also introduced
a new operating mode, Virtual 8086 Mode, a sort of virtualized mode that helped
it to multitask 8088/8086 applications. A lot of action happened in the
integrated MMU, too. It was upgraded to support, in addition to Segmentation,
Paging, as a new, better way to implement Virtual Memory in an OS, so both the
old Segmentation Unit and the new Paging Unit coexisted in the MMU. The MMU
Paging Unit also had its own small cache, the TLB (Translation Lookaside
Buffer), which you may have heard about a few times.
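That backwards compatible treatment of the extended GPRs is easy to picture with
a C union that mimics how EAX aliases AX, AH and AL. The union is just an
illustration of the register layout (x86 is little-endian, so AL is the lowest
Byte), not how the register file is actually implemented:

```c
#include <stdio.h>
#include <stdint.h>

union gpr {
    uint32_t eax;                  /* full 32 Bits register (386+) */
    uint16_t ax;                   /* low 16 Bits, as on the 8086  */
    struct { uint8_t al, ah; } b;  /* the two 8 Bits halves of AX  */
};

int main(void) {
    union gpr r = { .eax = 0x12345678 };
    printf("EAX=%08X AX=%04X AH=%02X AL=%02X\n",
           (unsigned)r.eax, (unsigned)r.ax,
           (unsigned)r.b.ah, (unsigned)r.b.al);
    /* Prints: EAX=12345678 AX=5678 AH=56 AL=78 */
    return 0;
}
```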
The vast majority of the 386 features were available only in protected mode,
which got enhanced to support them in a backwards compatible manner, so that the
80386 protected mode could still execute code intended to run in the 80286
protected mode. Since the immense amount of new features means that applications
targeting protected mode on an 80386 would not work on an 80286, I prefer to
treat these modes as two separate entities, 286/16 Bits protected mode and
386/32 Bits protected mode. Real Mode could be considered extended, too, since
it is possible to do 32 Bits operations within it, albeit such code would not
work on previous x86 Processors.

The Virtual 8086 Mode was a sub-mode of protected mode where the addressing
style worked like Real Mode. The idea was that a specialized application, known
as a Virtual 8086 Mode Monitor, executed from within a protected mode OS to
create a Hardware assisted Virtual Machine (we're talking about 30 years ago!)
for each 8088/8086 application that you wanted to run concurrently. The V86MM
was almost identical in role and purpose to a modern VMM like QEMU-KVM, as it
could provide each virtualized application its own Virtual Memory Map, trap and
emulate certain types of I/O accesses, and a lot of other things. A V86MM
intended to be used for DOS applications was known as a VDM (Virtual DOS
Machine), which was obviously one of the prominent use cases of the V86 Mode.
Another possible usage of the V86 Mode was to call the BIOS Services or the
VBIOS from within it, which had an alternative set of pros and cons when
compared to returning to Real Mode to do so.

Memory Management in the 80386 was incredibly complex due to the many schemes
that it supported. For memory addressing you had, as usual, the old Segmented
Memory Model that required two GPRs with a Segment and Offset pair to form a
full address, in its 8088 Real Mode addressing style variant, its 286 protected
mode addressing style variant, and a new 386 protected mode one that differed in
that it allowed Segments to be extended up to 4 GiB in size compared to the
previous maximum of 64 KiB. Moreover, since in the 80386 the size of a GPR was
equal to that of the Address Bus, it was now finally possible to add a different
mode for memory addressing: the Flat Memory Model. Using the Flat Memory Model,
a single 32 Bits GPR sufficed to reference an address, finally putting x86 on
equal footing with other competing Processor ISAs. The fun part is that the Flat
Memory Model was merely layered on top of the Segmented Memory Model: its setup
required creating a single 4 GiB Segment. Basically, if using the Flat Memory
Model, the MMU Segmentation Unit still performed its duties, but these could be
effectively hidden from the programmer after the initial setup.

When it comes to Virtual Memory, you could use the existing Segmented Virtual
Memory scheme, either at a 286 compatible level or with 386 enhancements, or the
new Paged Virtual Memory scheme. The Virtual Address Space also got extended to
4 GiB per task. It is important to mention that internally, when the 386 was in
protected mode and thus addresses were always translated by the MMU, the address
translation was done by the Segmentation Unit first, then optionally, if Paging
was being used, by the Paging Unit, before finally getting the Address Bus to
output a Physical Address. Basically, even if using Paging, the Segmentation
Unit couldn't be disabled or bypassed, albeit it could be quite neutered if
using a Flat Memory Model. And this is perhaps one of the less known tricks of
the 386 MMU: if someone wanted, they could fully use Segmentation and Paging
simultaneously, which made for an overly complicated Memory Management scheme
that still somehow worked. Surprisingly, there was at least a single specific
use case where mixing them could be useful...
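Going back to how thin the Flat Memory Model setup really is: it boils down to
packing one 8-Byte Segment Descriptor. A C sketch of that packing follows; the
field layout is the real 386 descriptor format, with Base 0 and Limit 0xFFFFF in
4 KiB granularity units spanning the full 4 GiB (0x92 describes a Ring 0
read/write Data Segment; actually loading the result into a GDT and the Segment
Registers is privileged work left out of the sketch):

```c
#include <stdio.h>
#include <stdint.h>

static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags) {
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);                /* Limit 15:0  */
    d |= (uint64_t)(base & 0xFFFFFF)     << 16;     /* Base 23:0   */
    d |= (uint64_t)access                << 40;     /* Access Byte */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;     /* Limit 19:16 */
    d |= (uint64_t)(flags & 0xF)         << 52;     /* G, D/B Bits */
    d |= (uint64_t)(base >> 24)          << 56;     /* Base 31:24  */
    return d;
}

int main(void) {
    /* Flags 0xC: G=1 (Limit in 4 KiB units) and D/B=1 (32 Bits). */
    uint64_t flat = make_descriptor(0, 0xFFFFF, 0x92, 0xC);
    printf("%016llX\n", (unsigned long long)flat);  /* 00CF92000000FFFF */
    return 0;
}
```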
The Paged Virtual Memory scheme consists of units known as Page Frames that
reference a block of addresses with a fixed 4 KiB size, giving a consistent
granularity and predictability compared to variable size Segments. Instead of
Segment Descriptor Tables (or better said, in addition to them, since you
required at least a minimal initialization of the MMU Segmentation Unit), Paging
uses a two-level tree hierarchy of Page Directories and Page Tables to reference
the Page Frames. The Page Directories and Page Tables are also 4 KiB in size,
with a Page Directory containing 1024 4-Byte entries pointing to Page Tables,
and each Page Table containing 1024 4-Byte entries pointing to Page Frames. The
386 MMU TLB could cache up to 32 Page Table entries, and considering that they
are 4 Bytes each, that should mean that the TLB size was 128 Bytes.

Compared to Segment Descriptor Tables, the Page Directory and Page Table data
structures could have a substantially higher RAM overhead, depending on how you
compare them. For example, to hold the Virtual-to-Physical mapping data of 4 MiB
worth of addresses, with Paging you could do it with a single 4 KiB Page Table
(plus an additionally required 4 KiB Page Directory, plus 8 Bytes for the 4 GiB
Segment Descriptor), as it can hold the mapping data of 1024 4 KiB Page Frames;
in other words, it can map these 4 MiB worth of addresses with a 4 KiB overhead.
In comparison, with the Segmented scheme you could have either a single 8 Byte
Segment Descriptor for a 4 MiB Segment, 64 Segment Descriptors with a Segment
size of 64 KiB each for a total of 512 Bytes of overhead, or even 1024 Segment
Descriptors with a Segment size of 4 KiB each for a total of 8 KiB of overhead,
just to make it an even comparison with Page Frames. However, keep in mind that
there was a fixed limit of 16384 Segments (it was not extended from the 80286),
so Segments would absolutely not scale with low granularity, whereas with
Paging, with just a 4 KiB Page Directory and 4 MiB in 1024 Page Tables, you are
already addressing 1048576 Page Frames of 4 KiB each for a grand total of 4 GiB
of mapped addresses with a reasonable 4100 KiB overhead.

Paging had a drawback of sorts: Memory Protection was simplified, so that each
Page could only be set with either Supervisor (equivalent to Ring 0/1/2) or User
(equivalent to Ring 3) privileges. For the vast majority of uses this was
enough; the typical arrangement was to have the OS running as Supervisor/Ring 0
and the user applications as User/Ring 3. However, in the early 2000's, a use
case appeared where this was not enough: x86 virtualization. The first attempts
at x86 virtualization were made entirely in software, as there were no
specialized Hardware features that helped with it. These early VMMs (Virtual
Machine Managers) had to run both the guest OS and the guest applications at the
same privilege level, Ring 3, which basically means that the guest OS had no
Hardware Memory Protection from its user applications. By mixing Segmentation
and Paging, it was possible to implement a technique known as Ring
Deprivileging, where the host OS could run in Ring 0, as usual, the guest OS in
Ring 1, and the guest applications in Ring 3, providing some form of Hardware
protection. Ring Deprivileging and everything associated with x86 virtualization
via software-only methods pretty much disappeared after the Intel and AMD
Hardware virtualization extensions, VT-x and AMD-V, respectively, became
mainstream (actually, a VMM that uses them is colloquially considered to be
running in Ring -1).
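The two-level scheme and the overhead figures above condense into a few lines of
arithmetic. A C sketch of how the Paging Unit dissects a 32 Bits linear address
(the example address is arbitrary; the second half just reproduces the overhead
numbers from the text):

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t linear = 0x00403123;              /* arbitrary address */
    unsigned pde    = (linear >> 22) & 0x3FF;  /* directory index   */
    unsigned pte    = (linear >> 12) & 0x3FF;  /* table index       */
    unsigned offset = linear & 0xFFF;          /* Byte in the page  */
    printf("PDE %u, PTE %u, offset 0x%03X\n", pde, pte, offset);

    /* One 4 KiB Page Table maps 1024 * 4 KiB = 4 MiB of addresses, so
       the full 4 GiB needs 1 Page Directory plus 1024 Page Tables:
       (1 + 1024) * 4 KiB = 4100 KiB of overhead. */
    printf("Mapped per table:   %u MiB\n", (1024u * 4096u) >> 20);
    printf("Overhead for 4 GiB: %u KiB\n", (1u + 1024u) * 4u);
    return 0;
}
```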
While it doesn't seem like a lot of features, the 386 had all the ones that it
needed to become extremely successful. Actually, its success dramatically
altered where Intel was heading, as it would strongly shift its focus to x86
Processors, adapting to the fact that most of them would be used for IBM PC
compatible computers. You may want to read [this interview][intel386design] with
many of the original designers involved in the 80386, which can give you a
better idea of how important the 386 was for Intel. Still, the 386 generation
had a lot of rarely told tales that happened while Intel was still experimenting
with its product lineup...

[intel386design]: http://archive.computerhistory.org/resources/text/Oral_History/Intel_386_Design_and_Dev/102702019.05.01.acc.pdf


##### 6.2 - The side stories of the 386: The 32 Bits bug recall, removed instructions, poorly known support chips, and the other 386 variants

While the 80386 CPU would quickly become an era defining Processor, it had a
very rough start in the market. This is not surprising, as the 386 was an
extremely ambitious CPU that mixed new, modern features on top of backwards
compatibility with previous Processors that in some areas operated quite
differently, so it was like combining the behaviors of three different
Processors into a single one that did them all (8088 Real Mode, 80286 16 Bits
protected mode with Segmented Virtual Memory, and the 386's own 32 Bits
protected mode with both Segmented and Paged Virtual Memory, plus the Virtual
8086 Mode). Having so many operating modes made it such a complex beast that the
early steppings were [plagued with issues][386issues].

The earliest issue with the 386 was that it couldn't hit its clock speed
targets. As far as I know, Intel was expecting that the 386s would be able to
run at 16 MHz, but it seems that yields for that bin were initially low, since
it also launched a 12 MHz part, which in modern times is an extremely rare
collectible chip (since the Compaq DeskPro 386 launched using a 16 MHz 80386, I
suppose that the 12 MHz ones were used only in the earliest development systems,
then removed from the lineup). However, not hitting the clock speed target was
perhaps one of the more minor issues...

From all the bugs and errata that the 80386 had, a major one was a
multiplication bug when running in 32 Bits protected mode. It seems that the bug
was caused by a flaw in the manufacturing process, since not all 386s were
affected. According to the scarce info that can be found about the matter, the
bug was severe enough that Intel issued a recall of the 80386 after it was found
around 1987. The 386s that were sent back and those newly produced were tested
for the bug in 32 Bits mode: the good ones were marked with a Double Sigma, and
the bad ones as "16 BIT S/W ONLY". It makes sense that not all 80386 units were
sent back, so many shouldn't be marked with either.

The recall and replacement cycle caused an industry wide shortage of 386s, yet
Intel seems to have capitalized on that, as it sold those defective 16 Bits 386s
anyways. As there was little to no 32 Bits software available (actually, the
earliest 386 systems like the Compaq DeskPro 386 were the development platforms
for it), most 386s were used just as a faster 286, so it made sense to sell
them, probably at some discount.
I have no idea to whom Intel sold these 16 Bits 386s, nor in which computers of
the era they could be found, nor whether end users knew that they could be
purchasing computers with potentially buggy chips. The Compaq DeskPro 386, being
one of the first computers to use a 386 (and the first IBM PC/AT compatible to
do so), should have been affected by all these early 386 era issues, albeit I
never looked around for info about how Compaq handled it.

A questionable thing that Intel did was playing around with the 386 ISA after
the 80386 launched. There used to be two properly documented instructions,
[IBTS and XBTS][ibtsxbts], that worked as intended in early 80386 units, but
were removed in the B1 Stepping because Intel thought that they were redundant.
The respective opcodes became invalid opcodes. However, these opcodes were
reused for a new instruction in the 80486 CPU, CMPXCHG. Eventually, it seems
that Intel noticed that it wasn't a good idea to overlap two completely
different instructions onto the same opcodes, so in later 80486 CPU steppings,
these instructions were moved to formerly unused opcodes, so as to not have any
type of conflict. All this means that there may exist software intended for
early 80386 CPUs that uses the IBTS and XBTS instructions, and thus will fail to
execute properly on later 80386 CPUs or any other Processor except the early
80486s, where it can show some undefined behavior, as these could execute the
otherwise invalid opcodes but with different results. Likewise, early 80486
software that used the CMPXCHG instruction with the old encoding may fail on
anything but the earliest 80486 steppings, and misbehave on early 80386s. I
suppose that there may still exist early Compiler or Assembler versions that can
produce such broken software. As always, details like this are what makes or
breaks backwards and forward compatibility.

One of the most surprising things is that the 80386 CPU was released with pretty
much no support chips available specifically for it; the only one that I could
find was the Intel 82384 Clock Generator. As the 386 Bus protocol was backwards
compatible with that of the previous x86 Processors, the early 386 platforms
could get away with reusing designs very similar to non-Chipset based 286
platforms, but with at least the Local Bus width extended to 32/32 (Data and
Address Buses, respectively), then letting glue chips fill the void. The most
notable example of this were early 386 platforms that had a Socket for an
optional 80287 FPU, which could partially sit in the Local Bus (the 16 Bits of
its Data Bus were directly wired to the lower 16 Bits of the CPU itself, while
the Address Bus had to be behind glue logic). Essentially, the whole thing was
handled as IBM did in its PC/AT, which used the 286 with its 16/24 Bus as a
platform extension of the IBM PC with the 8088 and its 8/20 Bus, with everything
ending up interfaced to 8/16 8085 era support chips. It is fun when you consider
how advanced the 80386 was, and how much the rest of the platform sucked.

Intel eventually released some new support chips to pair with its 80386 CPU. The
most known one is the Co-processor, the 80387 FPU, simply because it was a major
chip.
Because it arrived two years after the 80386 CPU, computer manufacturers filled
the void by adapting the 80287 FPU to run with an 80386, as previously
mentioned. The FPU would remain a very niche Co-processor, as in the DOS
ecosystem only very specific applications supported it. There was also the Intel
82385 Cache Controller, a dweller of the Local Bus that interfaced with SRAM
chips to introduce them as a new memory type, Cache. As faster 386s entered the
market, it was obvious that the asynchronous DRAM was too slow, so the solution
was to use the faster, smaller, but significantly more expensive SRAM as a small
Cache to keep the Processor busy while retrieving the main DRAM contents. The
Cache is memory, yet it is not mapped into the Processor Physical Address Space,
thus it is transparent to software. Later 386 Chipsets like the
[C&T CS8231][cs8231] for the 80386 incorporated their own Cache Controllers,
albeit the Cache SRAM itself was typically populated only in high end
Motherboards due to its expensive cost.

Maybe one of the breaking points of the entire x86 ecosystem is that Intel
actually tried to update the archaic support chips, as it introduced a chip for
the 386 that both integrated and acted as a superset of the old 8085 ones. This
chip was the Intel 82380 IPC (Integral Peripheral Controller), which was similar
in purpose to the C&T 82C206 IPC, as both could be considered early Southbridges
that integrated the platform support chips. There was an earlier version of it,
the 82370 IPC, but I didn't check the differences between the two.

The 82380 IPC, among some miscellaneous things, integrated the functions of a
DMAC, a PIC and a PIT, all of which were much better compared to the ancient
discrete parts used in the PC/AT. The integrated DMAC had 8 32 Bits channels, a
substantial improvement compared to the two cascaded 8237A DMACs of the PC/AT
platform and compatible Chipsets that provided 4 8 Bits and 3 16 Bits channels.
The 82380 integrated PIC was actually three internally cascaded 8259A compatible
PICs instead of just two like in the PC/AT. The three chained PICs provided a
total of 20 interrupts, 15 external and 5 internal (used by some IPC integrated
Devices), compared to the PC/AT total of 15 usable interrupts. The PIT had 4
timers instead of the 3 of the 8254 PIT, and also took two interrupts instead of
one, but these interrupts were internal. Finally, as it interfaced directly with
the 80386, it could sit in the Local Bus.

The question that remains unsolved about the 82380 IPC is that of backwards
compatibility. Both the new PIC and PIT were considered by Intel supersets of
the traditional parts used in the PC/AT, so in theory these two should have been
IBM PC compatible. What I'm not so sure about is the DMAC, as the datasheet
barely makes a few mentions of some software level compatibility with the 8237A.
Since I failed to find any IBM PC compatible that used the 82380 IPC, I find it
hard to assume that its integrated Devices were fully compatible supersets of
the PC/AT support chips, albeit that doesn't seem logical, since I suppose that
by that point, Intel had figured out that another 80186 CPU/SoC style spin-off
with integrated Devices incompatible with its most popular ones wouldn't have
helped it in any way.
There were some non IBM PC compatible 386 based computers that used the 82380
IPC, like the Sun386i, but in the IBM PC/AT world, everyone seems to have
ignored it, as no one used it to implement either the PC/AT feature set or the
new supersets. Even Intel itself seems to have forgotten about it, since some
years later, when Intel got into the Chipset business during the 486 heyday, its
Chipsets only implemented the standard PC/AT support chips functionality, not
the 82380 superset of them. Basically, whatever features the 82380 IPC
introduced seem to have been orphaned in a single generation like the 8089 IOP,
but it is an interesting fork that the x86 ecosystem evolution could have taken.

During the 386 generation, Intel began to experiment heavily with product
segmentation. Whereas in the 286 generation Intel consolidated everything behind
the 80286 (assuming that you ignore the many different bins, packagings and
steppings) with no castrated 80288 variant like in the previous two generations,
for the 386 generation Intel ended up introducing a lot of different versions.

The first 386 variant was not a new product but just a mere name change, yet
that name change is actually intertwined with the launch of a new product. After
the 80386 had been in the market for 2 years or so, Intel decided to introduce a
386 version with a castrated Bus width, a la 8088 or 80188. However, instead of
introducing it under the 80388 name that [was supposed to be obvious][386dx],
Intel decided first to rename the original 80386 by adding a DX suffix, becoming
the 80386DX. This name change also affected the Co-processor, as the DX suffix
was added to the 80387 FPU, effectively becoming the 80387DX. The 82385 Cache
Controller seems to have avoided the DX suffix, since there are no mentions of
an 82385DX being a valid part at all. Soon after the name change, Intel launched
the castrated 386 version as the 80386SX. The SX suffix was also used by the new
support chips that were specifically aimed at the 80386SX, namely the 80387SX
FPU and the 82385SX Cache Controller, which are what you expect them to be. I'm
not sure whether Intel could have come up with an autodetection scheme so that
the DX parts could be used interchangeably with either DX or SX CPUs, as the
8087 FPU could work with either the 8088 or the 8086. Amusingly, Intel designed
the 80386 with both 32/32 and 16/24 Bus operating modes, so the same die could
be used in either product line according to factory configuration and packaging.

Whereas the now 80386DX had a 32 Bits Data Bus and a 32 Bits Address Bus, the
new 80386SX had a 16 Bits Data Bus and a 24 Bits Address Bus (16 MiB Physical
Address Space. This was just the external Address Bus; it could still use the 4
GiB Virtual Address Space per task of the 386DX and all its features).
Technically that was a bigger difference than the one between the 8088 and 8086,
as those only had a different Data Bus width (8 Bits vs 16 Bits), yet the
Address Bus was still the same in both (20 Bits).
While the 16 Bits Data Bus and 24 Bits Address Bus of the 386SX matched those of
the 286 and the Bus protocol was the same, it couldn't be used as a drop-in
replacement, since the package pinout was different (ironically, if you read the
previously linked article, you will certainly notice that the 80386SX was both
expected to be named 80388, and to be fully pin compatible with the 80286, too.
Some old articles simply didn't age well...). As it was close enough in
compatibility, simple adapters were possible, so upgrading a socketed 286
Motherboard to use a 386SX could have been done, allowing that platform to use
the new 386 ISA features plus some IPC improvements. Sadly, such upgrades
weren't usually cost effective, since adapters could cost almost as much as a
proper 386SX Motherboard, thus very few people did that, and that is even
assuming that you had a socketed 286 and not a soldered one to begin with.

The next 386 variant, and for sure the most mysterious one, is the
[80376 CPU][376cpu], introduced in 1989. The 376 was a sort of subset of the 386
intended for embedded use, which had a few peculiarities not seen anywhere else
in the history of the x86 ISA. The most important one is that it had no Real
Mode support; instead, it directly initialized in protected mode. The second one
is that its integrated MMU didn't support Paging for some reason. While 80376
applications should run on an 80386, the opposite was not true if they used any
of the unsupported features (basically, nothing DOS related would run on a 376).
If you ever wondered why Intel never tried to purge backwards compatibility from
the modern x86 ISA, the fact that you very probably never heard before about how
Intel already tried to do so with the 376 should tell you something about how
successful that attempt was.

In 1990, Intel launched the 80386SL, targeting the nascent laptop and notebook
market. The 386SL was almost a full fledged SoC, as it integrated a 386SX
CPU/MMU Core with a Memory Controller (including Expanded Memory emulation
support), a Cache Controller and an ISA Bus Controller (the I/O Channel Bus had
already been standardized and renamed to ISA by 1990). It also had a dedicated
A20 Gate pin, which was first seen on the 80486 released in 1989. However, it
didn't integrate the core platform support chips, something that the
80186/80188 CPUs did. Instead, the 80386SL had a companion support chip, the
82360SL IOS (I/O Subsystem), which sits directly on the ISA Bus and implements
most of the PC/AT core (except that, for some reason, it had two 8254 PITs
instead of one. Albeit Compaq would have a 386 system like that...), thus making
it comparable to a Southbridge.

If you check the 386SL Block Diagrams, it is easy to notice that it didn't have
a standard Local Bus to interface with other major chips, since all of them were
integrated. As such, the 386SL was its own Northbridge, with the Memory Bus,
Cache Bus, Co-processor Bus (the 386SL could be paired with a 387SX FPU) and ISA
Bus being directly managed by it.
The Intel 82360SL also had its own Bus, the X-Bus, a little known ISA Bus
variant that is directly comparable to the external I/O Channel Bus, since it
castrated the Data Bus to 8 Bits and typically hosted an 8042 compatible
Keyboard Controller, the Floppy Disk Controller and the firmware ROM. Perhaps
the 80386SL and 82360SL are the first Intel chips where we can see how it
embraced the x86-IBM PC marriage, as Intel made clear that this pair of chips
was designed specifically to be used for PC/AT compatible computers. Also, I
think that the 82360SL was technically Intel's first pure PC/AT compatible
Southbridge.

The 386SL introduced a major feature, as it was the first Processor to implement
a new operating mode, SMM (System Management Mode), which you may have heard
about if you follow Hardware security related news. The intention of the SMM was
that the BIOS firmware could use it to execute code related to power management
purposes in a way that was fully transparent to a traditional OS, so that the
computer itself would take care of all the power saving measures without needing
software support, like a Driver for the OS. The SMM also had a lot of potential
for low level Hardware emulation, a role for which it is currently used by
modern UEFI firmwares to translate input from a USB Keyboard into a PC/AT
compatible virtual Keyboard plugged into the Intel 8042 Microcontroller. Being
in SMM is colloquially known as Ring -2, albeit this is a modern term, since by
the time that it received that nomenclature, the mode introduced by the Intel
VT-x and AMD-V Hardware virtualization extensions was already being called Ring
-1. The Processor can't freely enter this mode; instead, there is a dedicated
Pin known as SMI (System Management Interrupt) that generates a special type of
Interrupt to ask the Processor to switch to SMM. The SMI line is typically
managed by the Chipset (in the 80386SL case, by its 82360SL IOS companion chip),
so any request to enter SMM has to be done through it.

Another interesting thing is that the 386SL had a hybrid Memory Controller that
could use either DRAM or SRAM for system RAM. While everyone knows that SRAM is
theoretically faster, in [Page 26 of the 386SL Technical Overview][386slto]
Intel claims that it performed slower than DRAM, since the SRAM chips that were
big enough to be worth using as system RAM required 3 Memory WS, being
ridiculously slower than the small ones used as Cache. Thus, the purpose of
using SRAM as system RAM was not better performance, but that it was an ultra
low power alternative to DRAM. I found that chart quite appalling, since I
always thought that SRAM as system RAM should have been significantly faster,
even if at a huge capacity cost for the same money. Also, the 386SL datasheet
claims that it has a 32 MiB Physical Address Space, pointing to a 25 Bits
Address Bus, but the integrated Memory Controller supports only up to 20 MiB of
installed memory. I'm not sure why it is not a standard power of two.

The final step in the 386 evolution line was the 80386EX, released in 1994.
While the 386EX is out of era, since by that time 486s were affordable, the
original P5 Pentium had already been released, and Intel had formally entered
the PC/AT compatible Chipset business, it is still an interesting chip.
The 80386EX is somewhat similar to the 80386SL, as it had a lot of integrated
stuff SoC style, but instead of notebooks, it targeted the embedded market. The
386EX had a 386SX CPU/MMU Core plus the addition of the 386SL SMM, with an
external 16 Bits Data Bus and a 26 Bits Address Bus (64 MiB Physical Address
Space). However, compared to the 386SL, it doesn't have an integrated Memory
Controller, Cache Controller or ISA Controller; instead, it integrates the
functionality of some PC/AT support chips like those found in the 82360SL IOS.
Basically, the 80386EX had an integrated Southbridge, with the Northbridge being
made out of discrete chips, exactly the opposite of the 80386SL (not counting
its companion Southbridge). Compared to the previous chip that Intel designed to
target the embedded market, the 80376, the 80386EX was quite successful.

While the 386EX had some PC/AT compatible peripherals, they don't seem to be as
compatible as those of the 82360SL IOS. The RTC seems to be completely missing,
so it needs a discrete one. There is a single 8254 PIT, which seems to be
standard. It has two 8259A PICs that are internally cascaded, but it only
exposes 10 external interrupts, with 8 being internal. This means that if trying
to make a PC/AT compatible computer with ISA Slots, one Interrupt Pin would
remain unconnected, since an ISA Slot exposes 11 Interrupt Lines. Finally, the
DMAC is supposed to be 8237A compatible but has only 2 channels, which also
means unconnected Pins on the ISA Slots. Moreover, implementing the ISA Bus
would require external glue chips, since the 386EX exposes its Local Bus, not a
proper ISA Bus like the 386SL. It seems that it was possible to make a somewhat
PC/AT compatible computer out of a 386EX, given that some embedded computer
manufacturers sold them as such, but not fully compatible in a strict sense. As
such, I find the 386SL much more representative of where Intel would be going...

There are two almost unheard-of variants of the 386 that aren't really
important, but I want to mention them to satisfy curious minds. The first is the
80386CX, which seems to be mostly an embedded version of the 80386SX with minor
changes. The second is the [80386BX][386bx], which is a SoC similar to the 386EX
but with a totally different set of integrated peripherals, aimed to cover the
needs of early PDAs (the Smartphones of the late 90's) with stuff like an
integrated LCD Controller.

[386issues]: https://www.pcjs.org/pubs/pc/reference/intel/80386/
[ibtsxbts]: https://www.pcjs.org/documents/manuals/intel/80386/ibts_xbts/
[cs8231]: http://www.bitsavers.org/components/chipsAndTech/386AT_Chipsets_1989.pdf
[386dx]: https://books.google.com/books?id=1L7PVOhfUIoC&pg=PA73&lpg=PA73&#v=onepage&f=false
[376cpu]: http://www.pagetable.com/?p=460
[386slto]: ftp://bitsavers.informatik.uni-stuttgart.de/components/intel/80386/240852-002_386SL_Technical_Overview_1991.pdf
[386bx]: http://www.cpu-world.com/forum/viewtopic.php?t=35160