
Article information

  • Title: The HP Vectra 486 memory controller
  • Author: Marilyn J. Lang
  • Journal: Hewlett-Packard Journal
  • Print ISSN: 0018-1153
  • Year: 1991
  • Issue: Oct 1991
  • Publisher: Hewlett-Packard Co.

The HP Vectra 486 memory controller

Marilyn J. Lang

The memory subsystem architecture and the memory controller in the HP Vectra 486 personal computer provide a high-performance burst-mode capability.

During the investigation phase for the HP Vectra 486 personal computer, in-house performance tools confirmed that the memory system was a key to overall system performance (see article on page 92). Selecting an optimal memory and controller architecture for a high-performance memory subsystem was a major design consideration for the HP Vectra 486 design team.

While performance was considered important to the success of the HP Vectra 486, it was but one of many important factors to consider for the memory controller design. The PC server market (a target for the HP Vectra 486) continues to demand more memory, yet entry level systems require a small starting memory and incremental memory size. There is also an emerging need to simplify the installation and configuration of memory by both customers and dealers. We were also anticipating future Intel486 microprocessor speed upgrades, and wanted a memory architecture that could support these upgrades with minimal changes. And, of course, we were striving to deliver, at a competitive price, a system that included the EISA standard.

From these requirements, the memory controller objectives were to:

* Meet the HP Vectra 486 schedule and cost structure

* Provide competitive performance for 25-MHz systems

* Have a large and logical memory upgrade scheme

* Provide a design for supporting higher-speed Vectra 486 systems.

With these objectives, the design team began investigating relevant technologies that would help determine the optimal feature set. Three main areas were focused on: the Intel486's burst-mode capability, the 4M-bit DRAM, and the emerging 36-bit SIMM (single in-line memory module) standard for PCs.

Investigations

The Intel486, with its on-board 8K-byte cache, uses burst mode to fill a cache line from an external memory system. Burst mode, long used in larger computer systems but new to personal computers, is a more efficient method of transferring data. Rather than transferring only a single piece of data for each address generated, burst mode allows multiple pieces of data (typically four dwords of 32 bits each) to be transferred for each address. Since subsequent addresses need not be generated, fewer cycles are required to move information, and bandwidth increases. Supporting burst mode, on the other hand, requires more complexity than traditional memory or cache controllers.

Using our available performance tools, the Intel486 burst-mode capability was matched with various memory architectures, ranging from a simple, single-bank memory array to a cached, multiple-bank configuration. The single-bank memory array was quickly dropped, because it was not a competitive solution. The key finding from this analysis was that for 25-MHz systems, by using the burst-mode capability in the Intel486, a DRAM memory controller communicating directly to the Intel486 could compare quite favorably with a moderately sized external memory cache. This was particularly true for cache controllers that only supported burst mode between the Intel486 and the cache (or did not support burst mode at all). When the cost of the cache was factored in, the interleaved, bursting memory controller was the clear preference for the Vectra 486.

The 4M-bit DRAM was scheduled for production about the same time the Vectra 486 was to be released. Although the 4M-bit DRAM would provide the highest memory density available, it was considerably more costly than the 1M-bit DRAM, which had been in production for several years. Being able to support both densities would allow us to exploit both the 1M-bit and 4M-bit advantages. Standard memory configurations could be built with the cost-effective 1M-bit DRAMs, while large memory arrays could use the 4M-bit. Furthermore, as the 4M-bit DRAM progressed down the production cost curve, we could move quickly to it when prices became attractive. By working closely with some of our key memory vendors, we were able to secure prototype and production volumes of 4M-bit DRAMs for the Intel486.

Previous HP personal computers had used SIMMs, and the general feedback from our customers and dealers was very positive. A SIMM is a small printed circuit board with memory installed on it (typically surface mounted). An edge connector on the SIMM allows a customer to install it easily into an available connector. The typical SIMM organization is nine bits wide (eight data bits and a parity bit) and the edge connector has 26 pins. During Intel486 development a new SIMM organization was beginning to get attention: 36 bits wide with a 72-pin edge connector, which allows a full dword (32 bits plus parity) to be on a single SIMM. This SIMM also supports presence detect, which encodes the size and speed of the module on four of the 72 pins, and allows the module characteristics to be read directly from the SIMM. The new SIMM was already available in 1M-byte and 2M-byte densities. Both densities use 1M-bit and 256K-bit DRAMs, but at the time none used the 4M-bit DRAM. Working with our key memory vendors, we were able to establish standard 4M-byte and 8M-byte SIMMs.

From these investigations and other discussions, the Intel486 memory controller feature set was defined to include:

* Intel486 burst-mode support

* 2M-byte to 64M-byte memory array size

* Minimum memory upgrade size of 2M-bytes

* Support for 1M-byte, 2M-byte, 4M-byte, and 8M-byte SIMMs

* Support for shadowing or remapping of 16K-byte memory blocks

* Full support for EISA devices, including bus masters.

Since many of the features we wanted to include involved new technologies, no commercial memory controllers were available that supported our feature set. Furthermore, a short investigation concluded that using an existing memory controller with additional surrounding logic to support the new features would not meet our cost or performance goals. We decided that the best design approach was to develop a new controller using an ASIC to implement the memory controller.

Memory System Architecture

The memory system is completely contained on a 5.6-inch-by-13.3-inch memory board, and uses a proprietary connector on the Vectra 486 motherboard. The memory system sits directly on the 25-MHz Intel486 bus.

Allocating board space for the memory controller, the DRAM drivers, and other support logic, a maximum of eight SIMMs can be accommodated on the board. When populated with 8M-byte SIMMs, this allows a maximum memory size of 64M bytes. This is four times what previous HP personal computers had supported.

In burst-mode operations, the Intel486 is capable of accepting one dword each processor clock cycle. At 25 MHz, this means an ideal memory system would be able to deliver one dword every 40 ns. Since we were using 80-ns DRAMs, a simple 32-bit memory array was clearly not sufficient to meet our performance goals. Two possible architectures were investigated: a 128-bit-wide memory array and a 64-bit-wide memory array. With a 128-bit memory array, all four dwords would be fetched on the initial Intel486 memory access, and one dword output on each of the four clock cycles. For the 64-bit memory array, two dwords would be fetched using the Intel486-generated address, and two more dwords fetched using an address generated by the memory controller. The additional address generation requires another clock cycle, so the 64-bit memory array provides four dwords in five clocks, rather than four clocks. Although this was slower than ideal, the 64-bit-wide memory system allowed a minimum system configuration and upgrade increment of 2M bytes, rather than the 4M bytes required in the 128-bit architecture. We decided the 64-bit-wide memory array provided the best overall solution for the Vectra 486.
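The trade-off between the two array widths can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope model: it uses the clock counts quoted above (four or five clocks for four dwords, ignoring any access latency before the first dword), and the function names are illustrative rather than taken from the design.

```python
# Rough model of the burst figures quoted in the text.
# Function names are illustrative; clock counts ignore initial access latency.
CPU_CLOCK_MHZ = 25
DWORD_BYTES = 4


def clock_period_ns(mhz=CPU_CLOCK_MHZ):
    """Length of one processor clock in nanoseconds."""
    return 1000.0 / mhz


def burst_rate_mbytes_per_s(dwords, clocks, mhz=CPU_CLOCK_MHZ):
    """Transfer rate in 10**6 bytes/s while the burst data is being delivered."""
    total_ns = clocks * clock_period_ns(mhz)
    return dwords * DWORD_BYTES * 1000.0 / total_ns


def min_upgrade_mbytes(array_width_bits, smallest_simm_mbytes=1):
    """Smallest increment: one 36-bit SIMM per 32 data bits of array width."""
    return (array_width_bits // 32) * smallest_simm_mbytes


print(clock_period_ns())                 # 40.0 ns per dword in the ideal case
print(burst_rate_mbytes_per_s(4, 4))     # 128-bit array: 100.0
print(burst_rate_mbytes_per_s(4, 5))     # 64-bit array: 80.0
print(min_upgrade_mbytes(128), min_upgrade_mbytes(64))  # 4 2
```

Under these assumptions the 64-bit array gives up about a fifth of the peak burst rate in exchange for halving the minimum upgrade increment.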

Fig. 1 shows the block diagram of the Vectra 486 memory system. The 36-bit SIMMs are organized in pairs, creating the 64-bit-wide memory array. SIMMs 1, 3, 5, and 7 contain the lower-order dword, while SIMMs 2, 4, 6, and 8 contain the higher-order dword. Each SIMM pair must be of the same SIMM density, but different density pairs are allowed in the memory array. The memory array is further divided into upper and lower memory halves (UPPER_MD and LOWER_MD) to reduce the maximum capacitance on each memory data line. Although this increased part count on the board and loading on the system host bus, it improved timing margins in the most critical system timing paths.

Data transceivers are used to move data between the Intel486 and the memory array, and sit directly on the system host data bus (HOSTDATA(31:0)). Since the 64-bit memory system requires two memory accesses for each Intel486 burst access, latching data transceivers are used to output data from the first fetch while the second 64 bits are read.

The generation of memory addresses and control signals by the memory controller is complicated by the organization of the SIMMs. The 1M-byte and 4M-byte SIMMs are organized as a single block of memory (or memory bank), 256K deep by 36 bits wide and 1M deep by 36 bits wide respectively. Each memory bank has one row address strobe and four column address strobes (one for each byte). The 2M-byte and 8M-byte SIMMs, however, are organized as two banks of memory. The 2M-byte SIMM contains two 1M-byte banks, and the 8M-byte SIMM contains two 4M-byte banks. These two-bank SIMMs have two row address strobes (one per bank) and four shared column address strobes (to select one of four bytes in both banks). A SIMM socket can contain either a one-bank or a two-bank SIMM.

To correctly control the one-bank or two-bank SIMMs, the memory controller generates row address strobes and row addresses to the array based on the memory bank configuration. Each SIMM pair contains either one or two banks, depending on the SIMM installed. Eight row address strobes (RAS(7:0)) are generated directly from the memory controller, two for every SIMM pair. For a 2M-byte or 8M-byte SIMM the memory controller uses both row address strobes. For a 1M-byte or 4M-byte SIMM only one address strobe is used. The row address appears on MA(9:0) when the row address strobe goes active.
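As an illustration of the row address strobe assignment, the sketch below lists which of the eight strobes would be in use for a given set of SIMM pairs. The bank counts per SIMM size come from the text; the numbering (pair n drives RAS 2n and 2n+1) is an assumption made for the example, since the article does not give the actual pin assignment.

```python
# Banks per SIMM, keyed by SIMM size in Mbytes (from the text).
BANKS_PER_SIMM = {1: 1, 2: 2, 4: 1, 8: 2}


def active_ras_lines(pair_sizes_mbytes):
    """Return the RAS indices in use.

    pair_sizes_mbytes: up to four entries, one per SIMM pair, giving the size
    of each SIMM in the pair in Mbytes, or None for an empty pair.
    The RAS numbering (2 * pair + bank) is hypothetical.
    """
    active = []
    for pair, size in enumerate(pair_sizes_mbytes):
        if size is None:
            continue
        for bank in range(BANKS_PER_SIMM[size]):
            active.append(2 * pair + bank)
    return active


def total_mbytes(pair_sizes_mbytes):
    """Total memory: two SIMMs per populated pair."""
    return sum(2 * size for size in pair_sizes_mbytes if size is not None)


config = [1, 8, None, 2]          # pair 0: 1M SIMMs, pair 1: 8M, pair 3: 2M
print(active_ras_lines(config))   # [0, 2, 3, 6, 7]
print(total_mbytes(config))       # 22
```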

The memory controller also takes advantage of the page mode capability of the SIMMs, and keeps the row address strobe asserted in each memory bank. If a subsequent memory access falls within an active page (has the same row address as a previous access to the bank), the much faster page mode access is performed.
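The page hit test amounts to remembering the open row in each bank and comparing it with the row of the next access. A minimal software sketch of that bookkeeping follows; the class and method names are invented for the example, and the real controller does this comparison in hardware.

```python
class PageTracker:
    """Remembers which DRAM row (page) is held open in each bank."""

    def __init__(self, num_banks=8):
        # None means the bank's row address strobe is inactive.
        self.open_row = [None] * num_banks

    def access(self, bank, row):
        """Return 'hit' if `row` is already open in `bank`; otherwise open it."""
        if self.open_row[bank] == row:
            return "hit"
        self.open_row[bank] = row   # a miss needs a new row address strobe cycle
        return "miss"

    def close(self, bank):
        """Deassert RAS for the bank, for example after a timeout."""
        self.open_row[bank] = None


tracker = PageTracker()
print(tracker.access(0, 5))  # miss: nothing open yet
print(tracker.access(0, 5))  # hit: same row, same bank
print(tracker.access(0, 6))  # miss: different row in the same bank
```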

The column address strobe and column addresses to the array are generated from the four column address strobes from the memory controller (SCAS(3:0)), providing one strobe per SIMM pair. Because the Intel486 can operate on a single byte of data, each byte in the array is made individually accessible. Each SIMM has four column address strobes, so 32 strobes (CAS(31:0)) are generated for the eight SIMMs by combining SCAS(3:0) with eight byte enable signals (BE(7:0)). BE(7:0) is also used to generate the direction controls (READ_OE and WRITE_OE) and latch signal (LATCH_DATA) to the data transceivers.
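The combination of SCAS(3:0) and BE(7:0) into 32 strobes can be pictured as a simple gating operation. The sketch below models all signals as active high and assumes that CAS bit 8n+b serves byte lane b of SIMM pair n; both the polarity and the bit ordering are assumptions for illustration, not details given in the article.

```python
def cas_strobes(scas, be):
    """Combine per-pair strobes with byte enables into a 32-bit CAS mask.

    scas: 4-bit value, one bit per SIMM pair (SCAS(3:0)).
    be:   8-bit value, one bit per byte of the 64-bit array word (BE(7:0)).
    Active-high modeling and the 8 * pair + byte ordering are hypothetical.
    """
    cas = 0
    for pair in range(4):
        if (scas >> pair) & 1:
            cas |= (be & 0xFF) << (8 * pair)
    return cas


# A 64-bit read from pair 0 strobes all eight byte lanes of that pair.
print(hex(cas_strobes(0b0001, 0xFF)))   # 0xff
# A single-byte write to byte lane 0 of pair 2 strobes exactly one DRAM byte.
print(hex(cas_strobes(0b0100, 0x01)))   # 0x10000
```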

Parity is also handled on a byte basis. Because of memory controller pinout and timing, parity generation and detection are implemented using PALs and random logic. Another PAL is used as a SIMM presence detect encoder, which reads four presence detect (PD) bits from the first SIMM of each pair and encodes them into six SIMM_CONFIGURATION bits. This encoding specifies several different possible memory configurations, including combinations of 1M-byte and 4M-byte SIMMs, or 2M-byte and 8M-byte SIMMs. When used with the EISA configuration utility, the presence detect capability allows the user to configure memory from the screen.
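Byte-wise parity can be sketched in a few lines. The article does not say whether the Vectra 486 uses even or odd parity, so the example assumes even parity, and the function names are illustrative only.

```python
def parity_bit(byte):
    """Even-parity bit for one byte (assumed convention): 1 if the byte has an odd number of ones."""
    return bin(byte & 0xFF).count("1") & 1


def generate_parity(dword):
    """Return four parity bits, one per byte lane, lane 0 being the low byte."""
    return [parity_bit(dword >> (8 * lane)) for lane in range(4)]


def failing_lanes(dword, stored_parity):
    """Return the byte lanes whose recomputed parity disagrees with the stored bits."""
    return [lane for lane, bit in enumerate(generate_parity(dword))
            if bit != stored_parity[lane]]


stored = generate_parity(0x01FF0003)
print(stored)                              # [0, 0, 0, 1]
print(failing_lanes(0x01FF0003, stored))   # []
print(failing_lanes(0x01FF0002, stored))   # [0]: one bit flipped in the low byte
```

Handling parity per byte means a partial write only has to regenerate the parity bits of the bytes it touches.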

To accommodate the Intel486's 33-MHz timing (which was not available during the design phase of the project), the READ_OE signals to the data transceivers are generated one clock early and pipelined through an external registered PAL. This scheme ensured that the read path was as fast as possible. It also gave us some flexibility in host bus timing, in case of changes in CPU timing.

Memory Controller Architecture

Fig. 2 shows a block diagram of the Vectra 486 memory controller. There are seven major blocks in the memory controller. The configuration registers contain address range, remap and shadow regions, and other memory configuration information typically set by the BIOS at power-on (see the article on page 83). The 8-bit XD bus, a data bus available on all PCs, is used to access all memory controller registers because fast access is not a high priority at power-on time.

The memory configuration information, along with the SIMM configuration information from the presence detect pins on each pair of SIMMs, is used by the address block to determine if the current memory cycle on the host address bus is in the memory controller's address range. If it is, the address block will also determine which memory bank is selected, whether it is a page hit or miss (whether the current row address is the same as an active page), and the appropriate DRAM row and column addresses (MA(9:0)).
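The address block's job can be sketched as a lookup from host address to bank, row, and column. The sketch below is a simplified model under several assumptions that are not in the article: banks are packed contiguously from address 0, remap and shadow regions are ignored, the low-order address bits form the column, and RAS lines are numbered 2 x pair + bank. Only the bank depths (256K or 1M) and the 64-bit array width come from the text.

```python
# SIMM size in Mbytes -> (banks per SIMM, row/column address bits).
# 1M and 2M SIMMs use 256K-deep banks (9 + 9 bits); 4M and 8M use 1M-deep banks (10 + 10).
SIMM_GEOMETRY = {1: (1, 9), 2: (2, 9), 4: (1, 10), 8: (2, 10)}


def build_bank_map(pair_sizes_mbytes):
    """Return a list of (base, limit, ras_index, addr_bits), packed from address 0."""
    banks, base = [], 0
    for pair, size in enumerate(pair_sizes_mbytes):
        if size is None:
            continue
        n_banks, bits = SIMM_GEOMETRY[size]
        bank_bytes = (1 << (2 * bits)) * 8      # depth x 8 bytes (64-bit-wide pair)
        for bank in range(n_banks):
            banks.append((base, base + bank_bytes, 2 * pair + bank, bits))
            base += bank_bytes
    return banks


def decode(addr, banks):
    """Return (ras_index, row, column) or None if addr is outside the array."""
    for base, limit, ras, bits in banks:
        if base <= addr < limit:
            word = (addr - base) >> 3           # index of the 64-bit word in the bank
            return ras, word >> bits, word & ((1 << bits) - 1)
    return None


banks = build_bank_map([1, 4, None, None])      # 2M bytes + 8M bytes = 10M bytes
print(decode(0x000008, banks))   # (0, 0, 1)
print(decode(0x001000, banks))   # (0, 1, 0)
print(decode(0x200000, banks))   # (2, 0, 0)
print(decode(0xA00000, banks))   # None: just past the 10M-byte array
```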

Memory cycles that appear on the host bus are generated either from the CPU or from a backplane device such as an EISA bus master. Two independent state machines, the CPU state machine and the EISA/ISA/Refresh state machine, monitor the state of each device. The CPU state machine is actually two interlocked state machines. One machine monitors the host bus and when it sees a memory request, it starts a second state machine. The second machine generates the appropriate CPU_CYCLE_CNTL signals (page hit or miss, dword write, or one, two, or four dword read). The CPU state machine is fully synchronous with the Intel486 processor clock.

The EISA/ISA/Refresh state machine generates control signals for all other cycles. This machine supports EISA burst read or write cycles, EISA- and ISA-compatible DRAM refresh, and all ISA cycles. Because ISA is an asynchronous bus, the EISA/ISA/Refresh state machine is a semi-synchronous state machine, and uses BCLK (the backplane clock) and external delay lines to generate the BACKPLANE_CYCLE_CNTL signals.

The CPU_CYCLE_CNTL and BACKPLANE_CYCLE_CNTL signals are generated on every memory cycle. Each set of signals includes the DRAM timing relationships that optimize the respective CPU or backplane device bus cycle. HLDA (hold acknowledge) is used as the select signal to a multiplexer to determine the correct set of signals. Once the correct CYCLE_CNTL is selected, the corresponding DRAM control signals RAS, CAS, and WE are generated for each bank via the DRAM interface block. The byte, word, and dword addressability of the memory array is also handled by the DRAM interface block, which generates the appropriate data transceiver control signals (READ_OE and WRITE_OE). For the Vectra 486, all memory reads are 64 bits while memory writes can be one byte, one word (two bytes), or one dword (four bytes).

The row address strobe timeout clock is used for DRAM timing. The maximum time a page can be open (RAS active) is 10 µs. Since it is possible to exceed this limit during an EISA burst cycle, continuous page hits, or a long Intel486 idle time, it is necessary to monitor the time each bank is active. Eight timeout counters, one for each bank, monitor the active page time. Counters are enabled when the row address strobe is active, reset when the row address strobe goes inactive, and clocked by an external 1.16-MHz oscillator. When the timeout limit is reached, RAS_TIMEOUT is generated. The CPU state machine and the EISA/ISA/Refresh state machine will then finish the current memory cycle and allow the DRAM interface block to disable the timed-out DRAM page. In some instances it is possible to disable a page without incurring any clock penalties because a page hit on one bank can be done while turning off a timed-out bank.
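A rough model of one timeout counter is sketched below. The 10 µs limit and the 1.16-MHz clock are from the text; the terminal count of 11 is derived here from those two numbers and is an assumption, since the article does not state the count the chip actually uses.

```python
OSC_MHZ = 1.16               # external timeout oscillator (from the text)
MAX_RAS_ACTIVE_US = 10.0     # maximum time a page may stay open (from the text)

# Whole oscillator ticks that fit inside the limit: int(11.6) = 11.
# This terminal count is an assumption derived from the two figures above.
TIMEOUT_TICKS = int(MAX_RAS_ACTIVE_US * OSC_MHZ)


class RasTimeoutCounter:
    """Per-bank counter: counts while RAS is active, resets when it goes inactive."""

    def __init__(self, limit=TIMEOUT_TICKS):
        self.limit = limit
        self.count = 0

    def tick(self, ras_active):
        """Advance one oscillator tick; return True when the timeout is reached."""
        if not ras_active:
            self.count = 0
            return False
        self.count += 1
        return self.count >= self.limit


counter = RasTimeoutCounter()
results = [counter.tick(True) for _ in range(TIMEOUT_TICKS)]
print(TIMEOUT_TICKS)     # 11
print(results[-1])       # True: timeout raised on the 11th tick
print(counter.tick(False), counter.count)   # False 0: closing the page resets the counter
```

With a 0.86 µs tick, eleven ticks correspond to at most about 9.5 µs, which leaves some margin for the state machines to finish the current cycle before the page is closed.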

The test block is used to debug and test the memory controller chip. An external test pin puts the memory controller into the test mode. In test mode, external address lines are used to select which signals and state machine states are put on the internal test bus. The internal test bus contents are available via the XD bus.

Burst Mode Read

All Intel486 memory requests are initiated by placing the memory address on the host address bus, setting appropriate control lines (i.e. memory read or write) and strobing ADS#. Fig. 3 shows some of the key timing for a burst-mode read cycle for four dwords. One of the control lines, BLAST# (burst last) is asserted if the Intel486 requests a burst-mode cycle. If the memory system is incapable of supporting burst mode, it will return a single dword and assert RDY# (ready). If the memory system can support burst mode, it will assert BRDY# (burst ready) and return two or four dwords depending on the type of Intel486 request. The Intel486-generated memory address is used to fetch the first two dwords, and a second address (incremented by two dwords) is generated by the memory controller to complete the four-dword burst read.

Returning a burst-mode request entails several operations within the memory system. For simplicity, we assume a DRAM page hit (for a page miss, additional cycles are required to generate a row address strobe and row address). When the Intel486 requests a burst cycle, it will output an address for each of the four dwords in the burst. These addresses (and respective data) follow a particular sequence, depending on the initial address supplied by the Intel486. The memory controller uses only the initial address because the subsequent addresses from the Intel486 would not meet our system timing. The memory controller will latch the initial address and generate the identical sequence earlier in the burst cycle.

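There are four possible address sequences, determined by the state of HOST_ADDRESS(3:2):

  HOST_ADDRESS(3:2)    Dword address sequence (low address bits, hex)
  00                   0, 4, 8, C
  01                   4, 0, C, 8
  10                   8, C, 0, 4
  11                   C, 8, 4, 0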

The memory controller will generate the correct address sequence by toggling A2 on each dword. The third and fourth dwords differ in A3, so the second memory read has a column address that differs from the first only in one bit.
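The Intel486 burst order can be reproduced by XORing the starting dword index with a two-bit counter, which toggles A2 on every dword and A3 between the second and third. The sketch below illustrates that rule in software; the function name is invented for the example, and the controller itself implements the equivalent in hardware.

```python
def burst_sequence(initial_addr):
    """Return the four dword addresses of an Intel486 burst starting at initial_addr."""
    line_base = initial_addr & ~0xF          # 16-byte-aligned start of the cache line
    start = (initial_addr >> 2) & 0x3        # HOST_ADDRESS(3:2)
    # XOR with 0, 1, 2, 3: bit 0 of the count toggles A2, bit 1 toggles A3.
    return [line_base | ((start ^ count) << 2) for count in range(4)]


for start in (0x1000, 0x1004, 0x1008, 0x100C):
    print([hex(a) for a in burst_sequence(start)])
# ['0x1000', '0x1004', '0x1008', '0x100c']
# ['0x1004', '0x1000', '0x100c', '0x1008']
# ['0x1008', '0x100c', '0x1000', '0x1004']
# ['0x100c', '0x1008', '0x1004', '0x1000']
```

In every sequence the first two dwords share one 64-bit array word and the last two share the other, which is why the second memory read needs a column address that differs in only one bit.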

To improve burst-mode timing, rather than waiting for BLAST# to be asserted (which may come relatively late in the cycle), the memory controller assumes every memory read is a burst-mode read, and begins generating CAS, READ_OE and BRDY# signals. The memory controller will return RDY# with the first dword of every read cycle. The memory controller will then use BLAST# (now valid) to determine if the request was for a burst read. If it was not, a RDY# will be generated, the second dword read ignored, and the cycle terminated. If it is a burst read, then CAS is precharged in preparation for a second memory read, the first and second dwords are latched in the data transceivers, and the second dword is output. BRDY# is returned for the second dword on the next clock cycle, at which time the second memory read begins and the first data latch is opened to receive data for the third dword.

One clock later, both data latches are open, and the third and fourth dwords are put on the host data bus in consecutive clock cycles. The memory controller completes the burst-mode read by generating a SERDYO# (shared early ready) signal. This signal is input to a logic block in the Vectra 486 memory subsystem which forms the RDY# signal to the Intel486 (see Fig. 1). In the Intel486 a burst mode read cannot be prematurely terminated, so once a burst sequence has started, all four dwords must be read.

Conclusion

The memory controller design began at the same time as the HP Vectra 486 SPU (system processing unit), and remained the critical path component for most of the development schedule. The project team successfully met the HP Vectra 486 schedule objective by delivering a fully functional first-pass memory controller chip. This chip revision was used for the HP Vectra 486/25T production until introduction of the HP Vectra 486/33T memory controller version. Fig. 4 shows one of the memory benchmarks run on the HP Vectra 486 and other cached 25-MHz Intel486-based machines.

Acknowledgments

Key to the success of the HP Vectra 486 memory controller were the other members of the design team: Sridhar Begur, Stuart Siu and Deepti Menon. Wes Stelter was responsible for the memory board design, and provided much assistance during initial chip debug. Carol Bassett led the vendor selection investigation and the writing of the data sheet. Bob Campbell contributed to the initial architecture, while Mark Brown provided project management during the initial definition and architecture phase. Wang Li and the HP Circuit Technology Group deserve special recognition for their execution and delivery of prototype and production chips.

Bibliography

1. i486 Microprocessor Data Book, Intel Corporation, 1991.

2. 82350 EISA Chipset Data Sheet, Intel Corporation, 1989.

COPYRIGHT 1991 Hewlett Packard Company
COPYRIGHT 2004 Gale Group
