Monday, April 1, 2019

Pentium Memory Management Unit Computer Science Essay

The main aim of this research paper is to analyze the Pentium memory management unit. It discusses key features of a memory management unit such as segmentation, paging, their protection, the cache associated with the MMU in the form of the translation look-aside buffer, and how these features improve the microprocessor's performance. Some problems related to the Pentium memory management unit and their respective solutions are also covered, as is current and future research work in the field of memory management. The main challenge is to become acquainted with the Pentium memory management unit and analyze the crucial factors related to it.

Introduction

A hardware component responsible for handling accesses to memory requested by the CPU is known as a memory management unit (MMU), also termed a paged memory management unit (PMMU). The main functions of an MMU can be categorized as follows [1]:

Translation of virtual addresses to physical addresses, also known as virtual memory management (VMM)
Memory protection
Cache control
Bus arbitration
Bank switching

The memory system for the Pentium microprocessor is 4G bytes in size, just as in the 80386DX and 80486 microprocessors. The Pentium uses a 64-bit data bus to address memory organized in eight banks that each contain 512M bytes of data.

Most microprocessors, including the Pentium, support the virtual memory concept with the help of the memory management unit. Virtual memory is used to manage the resource of physical memory. It gives an application the illusion of a very large amount of memory, typically much larger than what is actually available. It supports the execution of processes that are only partially resident in memory.
Only the most recently used portions of a process's address space actually occupy physical memory; the rest of the address space is stored on disk until needed. The Intel Pentium microprocessor supports both segmentation and segmentation with paging.

Another important feature supported by Pentium processors is memory protection. This mechanism limits access to certain segments or pages based on privilege levels, and thus protects critical data from attack when that data is kept at the most privileged level.

Intel's Pentium processor also supports caches, translation look-aside buffers (TLBs), and a store buffer for temporary on-chip (and external) storage of instructions and data.

Another major issue resolved by the MMU is the fragmentation of memory. Sometimes the largest contiguous block of free memory is much smaller than the total available memory because of fragmentation. With virtual memory, a contiguous range of virtual addresses can be mapped to several non-contiguous blocks of physical memory. [1]

This research paper revolves around the different functions of the memory management unit of Pentium processors, including virtual memory management, memory protection, and cache control. The Pentium's memory management unit has some problems as well as some benefits, which are covered in detail later. The features mentioned above help solve major performance issues and have given a boost to the microprocessor world.

History

In some early microprocessor designs, memory management was performed by a separate integrated circuit such as the VLSI VI475, the Motorola 68851 used with the Motorola 68020 CPU in the Macintosh II, or the Z8015 used with the Zilog Z8000 family of processors.
Later microprocessors such as the Motorola 68030 and the Zilog Z280 placed the MMU together with the CPU on the same integrated circuit, as did the Intel 80286 and later x86 microprocessors.

In Intel's x86 line, on-chip memory management first appeared with the release of the 80286 microprocessor in 1982, which made that chip suitable for multitasking operation. On many machines, cache access time limits the clock cycle rate, and so it affects more than just the average memory access time. To achieve fast access times, fitting the cache on chip was therefore very important, and on-chip memory management paved the way.

The major functions associated with memory management are segmentation and paging. The segmentation unit first appeared on the 8086 processor, where its only purpose was to serve as a gateway to the 1MB physical address space. To allow easy porting of old applications to the new environment, Intel decided to keep the segmentation unit alive in protected mode. Protected mode does not have fixed-size memory blocks; instead, the size and location of each segment are set in an associated data structure called a segment descriptor. All memory references are made relative to the base address of their corresponding segment, which makes relocation of program modules fairly easy and spares the operating system from performing code fix-ups when it loads applications into memory. [2]

With paging enabled, the processor adds an extra level of indirection to the memory translation process. Instead of serving as a physical address, an application-generated address is used by the processor to index one of its look-up tables. The corresponding entry in the table contains the actual physical address, which is sent to the processor address bus.
Through the use of paging, operating systems can create distinct address spaces for each running application, thus simplifying memory access and preventing potential conflicts. Virtual memory allows applications to allocate more memory than is physically available. This is done by keeping memory pages partially in RAM and partially on disk. When a program tries to access an on-disk page, an exception is generated and the operating system reloads the page so that the faulting application can resume execution. [2]

The Pentium 4 was Intel's final endeavor in the realm of single-core CPUs. The Pentium 4 had an on-die cache memory of 8 to 16 KB. This cache is a memory location on the CPU used to store instructions to be processed. It is an extremely fast memory that stored decoded instructions, known as micro-operations, that were about to be executed by the CPU. [3]

By today's standards, the Pentium 4 cache is very small. This lack of cache memory means the CPU must make more calls to RAM for operating instructions. These calls to RAM reduce performance, because the latency of transferring data from RAM is much higher than from the on-die cache. Although often overlooked, the cache size of any CPU is of great importance in predicting the performance of a computer processor. While the Pentium 4's level-one cache is very limited by today's standards, at the time of its release it was more than adequate for the majority of computer applications. [4]

Probably the Pentium Pro's most noticeable addition was its on-package L2 cache, which ranged from 256 KB at introduction to 1 MB in 1997. Intel placed the L2 die(s) separately in the package, which still allowed the cache to run at the same clock speed as the CPU core.
Additionally, unlike most motherboard-based cache schemes that shared the main system bus with the CPU, the Pentium Pro's cache had its own back-side bus. Because of this, the CPU could read main memory and cache concurrently, greatly reducing a traditional bottleneck. The cache was also non-blocking, meaning that the processor could issue more than one cache request at a time (up to 4), reducing cache-miss penalties. These properties combined to produce an L2 cache that was far faster than the motherboard-based caches of older processors. This cache alone gave the CPU an advantage in input/output performance over older x86 CPUs. In multiprocessor configurations, the Pentium Pro's integrated cache greatly improved performance in comparison to architectures in which each CPU shared a central cache. [4]

However, this far faster L2 cache did come with some complications. The processor and the cache were on separate dies in the same package and connected closely by a full-speed bus. The two or three dies had to be bonded together early in the production process, before testing was possible. This meant that a single, tiny flaw in either die made it necessary to discard the whole assembly. [5]

Technical Aspects of Pentium's Memory Management Unit

Virtual Memory Management in Pentium

The memory management unit in the Pentium is upward compatible with the 80386 and 80486 microprocessors. The linear address space of the Pentium microprocessor is 4G bytes, that is, from 0 to 2^32 - 1. The MMU translates a virtual address to a physical address in less than a single clock cycle on a hit, and it also minimizes the cache fetch time on a miss. The CPU generates a logical address, which is given to the segmentation unit. The segmentation unit produces a linear address, which is given to the paging unit, and the paging unit generates the physical address in main memory. Hence, the paging and segmentation units are sub-units of the MMU.
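The two-stage flow described above (logical address to linear address to physical address) can be sketched in a few lines of Python. This is only an illustrative model under simplifying assumptions: the names Segment, segmentation_unit, paging_unit and translate are hypothetical, the segment is reduced to a base and a limit, and the paging stage uses a flat page-to-frame map rather than the Pentium's real two-level tables.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # 4 KB pages


@dataclass
class Segment:
    """Hypothetical, stripped-down stand-in for a segment descriptor."""
    base: int   # linear address at which the segment starts
    limit: int  # highest valid offset inside the segment


def segmentation_unit(segment: Segment, offset: int) -> int:
    """Stage 1: logical (segment, offset) -> 32-bit linear address."""
    if offset > segment.limit:
        raise MemoryError("offset exceeds segment limit")
    return (segment.base + offset) & 0xFFFFFFFF


def paging_unit(page_map: dict, linear: int) -> int:
    """Stage 2: linear address -> physical address.

    page_map is a flat {page number: frame number} dictionary, a
    simplification of the real page directory and page tables.
    """
    page, page_offset = divmod(linear, PAGE_SIZE)
    if page not in page_map:
        raise LookupError("page fault: no frame assigned to this page")
    return page_map[page] * PAGE_SIZE + page_offset


def translate(segment: Segment, offset: int, page_map: dict) -> int:
    """Chain both stages, as the MMU does in protected mode with paging."""
    return paging_unit(page_map, segmentation_unit(segment, offset))
```

With a segment based at 0x10000 and page 0x10 mapped to frame 0x80, the logical offset 0x234 first becomes linear address 0x10234 and then physical address 0x80234.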
Figure 3.1 Logical to Physical Address Translation in Pentium

The Pentium can run in two modes, real or protected. Real mode does not allow multitasking because nothing protects one process from interference by another, whereas in protected mode each process runs in a separate code segment. Segments have different privilege levels, which prevents a lower-privilege process (such as an application) from running higher-privilege code (e.g. the operating system). A Pentium running in protected mode supports both segmentation and segmentation with paging.

Segmentation in Pentium

Segmentation divides programs into logical blocks and places them in different memory areas. This makes it possible to regulate access to critical sections of the application and helps identify bugs during development. It includes features such as defining the exact location and size of each segment in memory and assigning a specific privilege level to a segment, which protects its content from unauthorized access. [6]

Segment registers are now called segment selectors because they do not map directly to a physical address but point to an entry of the descriptor table. The Pentium CPU has six 16-bit segment registers called selectors. A logical address consists of a 16-bit segment selector and a 32-bit offset. The figure below shows a multi-segment model, which uses the full capabilities of the segmentation mechanism to provide hardware-enforced protection of code, data structures, programs and tasks. This model is supported by the IA-32 architecture. Here, each program is given its own table of segment descriptors and its own segments.
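To make the idea of a selector "pointing to an entry of the descriptor table" concrete, here is a small sketch that splits a 16-bit selector into its fields. The field layout used here (a 13-bit index in bits 15 to 3, a table-indicator bit in bit 2, and a 2-bit requested privilege level in bits 1 to 0) and the 8-byte descriptor size are the standard IA-32 layout rather than something spelled out in the text above, and the function name decode_selector is hypothetical.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its three fields.

    Assumed layout (standard IA-32): bits 15..3 index, bit 2 table
    indicator (0 = GDT, 1 = LDT), bits 1..0 requested privilege level.
    """
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    index = selector >> 3
    return {
        "index": index,
        "table": "LDT" if selector & 0b100 else "GDT",
        "rpl": selector & 0b11,
        # Descriptors are 8 bytes each, so this is the byte offset of
        # the chosen descriptor from the start of its table.
        "descriptor_offset": index * 8,
    }
```

For example, the selector 0x001B selects entry 3 of the global table with a requested privilege level of 3, while 0x000F selects entry 1 of the local table.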
Figure 3.1.1.1 Multi-Segment Model

When the processor needs to translate a memory location SEGMENT:OFFSET to its corresponding physical address, it takes the following steps [7]:

Step 1: Find the start of the descriptor table (GDTR register).

The figure below shows how CPU selectors provide an index (pointer) to segment descriptors, which are stored in RAM in memory structures called descriptor tables. The base address found there is then combined with the offset to locate a specific linear address.

Figure 3.1.1.2 Selector to Descriptor and then finally to linear address in Pentium MMU

Step 2: Find the SEGMENT-th entry of the table; this is the segment descriptor corresponding to the segment.

There are two types of descriptor table, the Global Descriptor Table and the Local Descriptor Table.

Global Descriptor Table: It consists of segment definitions that apply to all programs, such as the code belonging to operating system segments created by the OS before the CPU switched to protected mode.
Local Descriptor Table: These tables are unique to an application.

The figure below shows how the entry of the segment table is found and the segment descriptor corresponding to the segment is chosen. [7]

Figure 3.1.1.3 Global and Local Descriptor Table

The Pentium has a 32-bit base address, which allows segments to begin at any location in its 4G bytes of memory. The figure below shows the format of a Pentium descriptor. [7]

Figure 3.1.1.4 Pentium Descriptor Format

Step 3: Find the base address of the segment.

Step 4: Compute address = segment base + OFFSET. With paging enabled, the result is a linear address that the paging unit translates further. [7]

Paging Unit

Paging is an address translation from linear to physical address. The linear address space is divided into fixed-length pages, and likewise the physical address space is divided into frames of the same fixed length. Within their respective address spaces, pages and frames are numbered sequentially. The pages that have no frames assigned to them are stored on disk.
When the CPU needs to run code on a page that has no frame assigned, it generates a page fault exception. The operating system then reassigns a currently unused frame to that page and copies the code for that page from disk to the newly assigned RAM frame. [9]

The Pentium MMU uses a two-level page table to translate a virtual address to a physical address. The page directory contains 1024 32-bit page directory entries (PDEs), each of which points to one of 1024 level-2 page tables. Each page table contains 1024 32-bit page table entries (PTEs), each of which points to a page in physical memory or on disk. The page directory base register (PDBR) points to the beginning of the page directory.

Figure 3.1.2.1 Pentium multi-level page table [8]

For 4KB pages, the Pentium uses a two-level paging scheme in which the 32-bit linear address is divided as shown below.

Figure 3.1.2.2 Division of 32 bit linear address

The figure below shows the complete address translation process in the Pentium, from the CPU's virtual address to the physical address in main memory.

Figure 3.1.2.3 Summary of Pentium address translation [8]

The size of the paging tables is dynamic and can become large in a system that contains a large amount of memory. With the Pentium's 4M byte paging feature, there is just a single page directory and no page tables. Basically, this mechanism helps the operating system create a virtual (faked) address space by swapping code between disk and RAM. This procedure is known as virtual memory support. [9] The paging mechanism in the Pentium works with 4K byte memory pages or, with a new extension available on the Pentium, with 4M byte memory pages.

The 20-bit virtual page number (VPN) is partitioned into two 10-bit chunks. VPN1 indexes a PDE in the page directory pointed at by the PDBR. The address in the PDE points to the base of a page table that is indexed by VPN2. The physical page number (PPN) in the PTE indexed by VPN2 is concatenated with the virtual page offset (VPO) to form the physical address.
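The two-level lookup just described can be sketched as follows, assuming 4 KB pages. Nested Python dictionaries stand in for the page directory and the page tables, so the names split_linear, walk and PageFault are hypothetical, and present bits, permission bits and the PDBR register itself are left out.

```python
class PageFault(Exception):
    """Raised when no translation exists for the requested address."""


def split_linear(addr: int) -> tuple:
    """Split a 32-bit linear address into (VPN1, VPN2, VPO)."""
    vpn1 = (addr >> 22) & 0x3FF  # top 10 bits: index into the page directory
    vpn2 = (addr >> 12) & 0x3FF  # next 10 bits: index into the page table
    vpo = addr & 0xFFF           # low 12 bits: offset within the 4 KB page
    return vpn1, vpn2, vpo


def walk(page_directory: dict, addr: int) -> int:
    """Translate a linear address through a two-level table.

    page_directory maps VPN1 -> page table, and each page table maps
    VPN2 -> physical page number (PPN). A missing entry models a page
    that is not resident in physical memory.
    """
    vpn1, vpn2, vpo = split_linear(addr)
    page_table = page_directory.get(vpn1)
    if page_table is None:
        raise PageFault(f"no page table for directory entry {vpn1}")
    ppn = page_table.get(vpn2)
    if ppn is None:
        raise PageFault(f"no frame for page table entry {vpn2}")
    # Concatenate the PPN with the unchanged page offset.
    return (ppn << 12) | vpo
```

For the linear address 0x00403025, VPN1 is 1, VPN2 is 3 and the offset is 0x025; if that PTE holds PPN 0x1234, the physical address is 0x1234025.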
[8]

Figure 3.1.2.4 Pentium Page Table Translation [8]

Segmentation with Paging in Pentium

The Pentium supports both pure segmentation and segmentation with paging. To select a segment, a program loads a selector for that segment into one of the six segment registers. For example, the CS register holds the selector for the code segment and the DS register holds the selector for the data segment. A selector can specify whether the segment table is local to the process or global to the machine. The format of a selector used in the Pentium is as follows.

Figure 3.1.3.1 Selector Format

The steps required to achieve this are as follows:

Step 1: Use the selector to convert the 32-bit virtual offset address to a 32-bit linear address.
Step 2: Convert the 32-bit linear address to a physical address using a two-stage page table.

Figure 3.1.3.2 Mapping of a linear address onto a physical address [9]

The figure below shows the complete process of segmentation along with paging, which is one of the important functions of the Pentium's memory management unit. [9]

Figure 3.1.3.3 Segmentation with paging

Some modern processors (Motorola 68030 and later, Intel 80386, 80486, and Pentium) allow segmentation and paging to be used alone or in combination, so OS designers have a choice, which is given in the table below. [9]

Segmentation   Paging   Characteristics and examples
No             No       Small (embedded) systems; low overhead, high performance
No             Yes      Linear address space; BSD UNIX, Windows NT
Yes            No       Better controlled protection and sharing; the segment table (ST) can be kept on chip, giving predictable access times (Intel 8086)
Yes            Yes      Controlled protection and sharing, better memory management; UNIX Sys. V, OS/2

Figure 3.1.3.4 Usage of segmentation and paging in different processors

The Intel 80386, 486 and Pentium support the following memory management scheme, which is used in IBM OS/2.
The diagram is shown below.

Figure 3.1.3.5 Intel's Memory Management scheme implemented in IBM OS/2

3.1.4 Optimizing Address Translation in Pentium processors

The main goal of memory management for address translation is to complete every translation in less than a single clock cycle on a hit and to minimize cache fetch time on a miss. On a page fault, the page must be fetched from disk, which takes millions of clock cycles and is handled by OS code. Two methods are used to minimize the cost of translation and the page fault rate:

1. Smart replacement algorithms: To reduce the page fault rate, the most preferred replacement algorithm is least-recently used (LRU). In this scheme, a reference bit in each page's page table entry is set to 1 when the page is accessed and is periodically cleared to 0 by the OS. A page whose reference bit is 0 has not been used recently. [10]

2. Fast translation using a translation look-aside buffer: Address translation would appear to require extra memory references, one to access the page table entry and another for the actual memory access. But accesses to page tables have good locality, so the CPU uses a fast internal cache of PTEs called a translation look-aside buffer (TLB). Typical TLB figures are 16 to 512 PTEs, 0.5 to 1 cycle for a hit, 10 to 100 cycles for a miss, and a 0.01% to 1% miss rate. [11]

Parameter                    Typical value
Page size                    4 KB to 64 KB
Hit time                     50 to 100 CPU clock cycles
Miss penalty                 10^6 to 10^7 clock cycles
  Access time                0.8 x 10^6 to 0.8 x 10^7 clock cycles
  Transfer time              0.2 x 10^6 to 0.2 x 10^7 clock cycles
Miss rate                    0.00001% to 0.001%
Virtual address space size   GB to 16 x 10^18 bytes

Figure 3.1.4.1 TLB rates

TLB misses are handled, in hardware or in software, in one of the following two ways:

The page is in memory, but its physical address is missing from the TLB. A new TLB entry must be created.
The page is not in memory. Control is transferred to the operating system to deal with a page fault, which is handled by raising an exception (interrupt) using the EPC and Cause registers.
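The hit and miss behaviour of a TLB, together with LRU replacement, can be illustrated with a small simulation. This is a sketch under stated assumptions, not a description of the Pentium's actual TLB hardware: the class name TLB is hypothetical, a dictionary stands in for the in-memory page table, and the replacement policy is exact LRU, whereas real hardware usually only approximates it (for instance with reference bits, as described above).

```python
from collections import OrderedDict


class TLB:
    """Tiny fully associative TLB model with exact LRU replacement."""

    def __init__(self, capacity: int, page_table: dict):
        self.capacity = capacity
        self.page_table = page_table  # {virtual page: physical page}
        self.entries = OrderedDict()  # ordered from least to most recent
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn: int) -> int:
        """Return the physical page number for a virtual page number."""
        if vpn in self.entries:
            # Hit: no page-table access needed, just refresh recency.
            self.hits += 1
            self.entries.move_to_end(vpn)
            return self.entries[vpn]

        self.misses += 1
        if vpn not in self.page_table:
            # Second miss case: the page is not in memory at all.
            raise LookupError("page fault: the OS must load the page")

        # First miss case: the page is resident, so create a TLB entry,
        # evicting the least recently used one if the TLB is full.
        ppn = self.page_table[vpn]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)
        self.entries[vpn] = ppn
        return ppn
```

With a two-entry TLB, the access sequence 0, 1, 0, 2, 1 produces one hit and four misses: the repeated access to page 0 hits, page 2 then evicts page 1 (the least recently used entry), and the final access to page 1 misses again.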
There are two kinds of page fault to handle:

Instruction page fault:
Store the state of the process
Look up the page table to find the disk address of the referenced page
Choose a physical page to replace
Start a read from disk for the referenced page
Execute another process until the read completes
Restart the instruction that caused the fault [12]

Data access page fault:
Occurs in the middle of an instruction.
MIPS instructions are restartable: prevent the instruction from completing and restart it from the beginning.
More complex machines: interrupt the instruction (saving the state of the CPU).

3. The other method used to reduce the hit time is to avoid address translation during indexing. The CPU uses virtual addresses that must be mapped to physical addresses. A cache that is indexed by virtual addresses is called a virtual cache, as opposed to a physical cache. A virtual cache reduces hit time because a translation from a virtual address to a physical address is not necessary on hits. Also, address translation can be done in parallel with cache access, so penalties for misses are reduced as well.

Some difficulties are associated with the virtual cache technique, however: process switches require cache purging. In virtual caches, different processes share the same virtual addresses even though these map to different physical addresses. When a process is swapped out, the cache must be purged of all entries to make sure that the new process gets the correct data. [13]

Different solutions to overcome this problem are:

PID tags: Increase the width of the cache address tags to include a process ID (instead of purging the cache). The current process PID is specified by a register. If the PID does not match, it is not a hit even if the address matches.
Anti-aliasing hardware: A hardware solution called anti-aliasing guarantees every cache block a unique physical address.
Every virtual address maps to the same location in the cache.
Page coloring: This software technique forces aliases to share some address bits. Therefore, the virtual address and physical address match over these bits.
Using the page offset: An alternative that gets the best of both virtual and physical caches. If the page offset is used to index the cache, the virtual address translation can be overlapped with the time required to read the tags, because the page offset is unaffected by address translation. However, this restriction limits the size of the cache, since the index bits must fit within the page offset.
Pipelined cache access: Another method to improve the cache is to divide cache access into stages. This leads to the following results:
Pentium: 1 clock cycle per hit
Pentium II and III: 2 clock cycles per hit
Pentium 4: 4 clock cycles per hit
Pipelining allows a faster clock while still producing one cache hit per clock. The drawback is a higher branch penalty and a higher load delay. [13]
Trace caches: A trace cache is a specialized instruction cache containing instruction traces, that is, sequences of instructions that are likely to be executed. It is found on the Pentium 4 (NetBurst microarchitecture), where it is used instead of a conventional instruction cache. Cache blocks contain micro-operations rather than raw memory, and they contain branches and continue at the branch target, thus incorporating branch prediction. A cache hit requires correct branch prediction. The major advantage is that instructions are kept available to supply the pipeline by avoiding cache misses that result from branches. The disadvantages are that the cache may hold the same instruction several times and that it has more complex control. [13]

System Memory Management Mode

The system memory management mode (SMM) is on the same level as protected mode, real mode and virtual mode, but it is provided to function as a manager.
The SMM is not intended to be used as an application-level or system-level feature. It is intended for high-level system functions such as power management and security, which most Pentiums use during operation but which are controlled by the operating system.

Access to the SMM is accomplished via a new external hardware interrupt applied to the SMI pin on the Pentium. When the SMM interrupt is activated, the processor begins executing system-level software in an area of memory called the system management RAM, or SMMRAM, which also holds the SMM state dump record. The SMI interrupt disables all other interrupts that are normally handled by user applications and the operating system. A return from the SMM interrupt is accomplished with a new instruction called RSM. RSM returns from the memory management mode interrupt to the interrupted program at the point of the interruption.

SMM allows the Pentium to treat the memory system as a flat 4G byte system, instead of being able to address only the first 1M of memory. In SMM, the Pentium begins by executing the software stored at memory location 38000H. SMM also stores the state of the Pentium in what is called a dump record. The dump record is stored at memory locations 3FFA8H through 3FFFFH. The dump record allows a Pentium-based system to enter a sleep mode and reactivate at the point of program interruption. This requires that the SMMRAM remain powered during the sleep period. The halt auto restart and I/O trap restart fields are used when the SMM mode is exited by the RSM instruction. These data allow the RSM instruction to return to the halt state or return to the interrupted I/O instruction. If neither a halt nor an I/O operation is in effect upon entering the SMM mode, the RSM instruction reloads the state of the machine from the state dump and returns to the point of interruption.
[14]

Memory protection in Pentium

In protected mode, the Intel 64 and IA-32 architectures provide a protection mechanism that operates at both the segment level and the page level. This mechanism provides the ability to limit access to certain segments or pages based on privilege levels. The Pentium 4 also supports four protection levels, with level 0 being the most privileged and level 3 the least.

Segment and page protection is useful in localizing and detecting design problems and bugs. It can also be built into end products to offer added robustness to operating systems, utility software, and application software. The protection mechanism performs certain checks before the actual memory cycle starts, such as limit checks, type checks, privilege level checks, and restriction of addressable domains.

The figure shows how these levels of privilege are interpreted as rings of protection. Here, the center (reserved for the most privileged code, data, and stacks) is used for the segments containing the critical software, usually the kernel of an operating system. Outer rings are used for less critical software. At each instant, a running program is at a certain level, indicated by a 2-bit field in its PSW (Program Status Word). Each segment also belongs to a certain level.

Figure 3.3.1 Protection on Pentium II

Memory protection is implemented by associating a protection bit with each frame. A valid-invalid bit is attached to each entry in the page table:

Valid indicates that the associated page is in the process's logical address space, and is thus a legal page.
Invalid indicates that the page is not in the process's logical address space.

As long as a program restricts itself to using segments at its own level, everything works fine. Attempts to access data at a higher level (a larger level number, that is, less privileged) are permitted.
Attempts to access data at a lower level (a smaller level number, that is, more privileged) are illegal and cause traps.

3.4 Cache in Pentium Processors

One of the most common techniques for improving performance in computer systems (both hardware and software) is caching of frequently accessed information. This lowers the average cost of accessing the information, providing greater performance for the overall system. The same applies in processor design, and in the Intel Pentium 4 processor architecture caching is a critical component of the system's performance.

The Pentium 4 processor architecture includes multiple types and levels of caching:

Level 3 cache: This type of caching is only available on some versions of the Pentium 4 processor (notably the Pentium 4 Xeon processors). It provides a large on-processor tertiary memory storage area that the processor uses for keeping information nearby. Thus, the contents of the Level 3 cache are faster to access than main memory.
Level 2 cache: This type of cache is available in all versions of the Pentium 4 processor. It is normally smaller than the Level 3 cache and is used for caching both data and code that is being used by the processor.
Level 1 cache: This type of cache is used only for caching data. It is smaller than the Level 2 cache and generally holds the information most frequently accessed by the processor.
Trace cache: This type of cache is used only for caching decoded instructions. Specifically, the processor has already broken down the normal processor instructions into micro-operations, and it is these micro-ops that the Pentium 4 caches in the trace cache.
Translation look-aside buffer (TLB): This type of cache is used for storing virtual-to-physical memory translation information. It is an associative cache and consists of an instruction TLB and a data TLB.
Store buffer: This type of cache takes arbitrary write operations and caches them so that they can be written back to memory without blocking the current processor operations.
This decreases contention between the processor and other parts of the system that are accessing main memory. There are 24 entries in the Pentium 4.
Write combining buffer: This is similar to the store buffer, except that it is specifically optimized for burst write operations to a memory region. Thus, multiple write operations can be combined into a single write-back operation. There are 6 entries in the Pentium 4.

The disadvantage of caching is having to handle the situation in which the original copy is modified, making the cached information incorrect (or stale). A significant amount of the work done within the processor goes into ensuring the consistency of the cache, both for physical memory and for the TLBs. In the Pentium 4, physical memory caching remains coherent because the processor uses the MESI protocol. MESI defines the state of each unique cached piece of memory, called a cache line. In the Pentium 4, a cache line is 64 bytes. With the MESI protocol, each cache line is in one of four states:

Modified: The cache line is owned by this processor and there are modifications to that cache line stored within the processor cache. No other part of the system may access the main memory for that cache line, as this would obtain stale information.
Exclusive: The cache line is owned by this processor. No other part of the system may access the main memory for that cache line.
Shared: The cache line is owned by this processor. Other parts of the system may obtain shared access to the cache line and may read that particular cache line. None of the shared owners may modify the cache line.
Invalid: The cache line is in an indeterminate state for this processor. Other parts of the system may own this cache line, or it is possible that no other part of the system owns it. This processor may not access the memory and it is not cached.
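The four MESI states can be viewed as a small state machine for a single cache line in a single processor's cache. The sketch below follows the usual textbook transitions and is only an approximation under stated assumptions: the event names and the next_state function are hypothetical, and real details such as bus signalling, write-back of modified data, and the exact behaviour of the Pentium 4 are left out.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def next_state(state: State, event: str, others_have_copy: bool = False) -> State:
    """Return the next MESI state of one cache line.

    event is one of (hypothetical names):
      "local_read"   - this processor reads the line
      "local_write"  - this processor writes the line
      "remote_read"  - another processor reads the line
      "remote_write" - another processor writes the line
    others_have_copy only matters for a local read of an invalid line.
    """
    if event == "local_read":
        if state is State.INVALID:
            # Read miss: the line is loaded as Shared if another cache
            # also holds it, otherwise as Exclusive.
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state  # M, E and S can all satisfy a read as they are
    if event == "local_write":
        # Writing always ends in Modified (other copies get invalidated).
        return State.MODIFIED
    if event == "remote_read":
        # A line we hold becomes Shared; an invalid line stays invalid.
        return State.INVALID if state is State.INVALID else State.SHARED
    if event == "remote_write":
        # Another processor now owns the only valid copy.
        return State.INVALID
    raise ValueError(f"unknown event: {event}")
```

For instance, a line that is read for the first time with no other holders becomes Exclusive, a subsequent local write makes it Modified, a read by another processor demotes it to Shared, and a write by another processor makes it Invalid.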
[15]

Current Problems and Solutions associated with them

When you run multiple programs (especially MS-DOS-based programs) on a Windows-based computer that has insufficient system memory (RAM) and contains an Intel Pentium Pro or Pentium II processor, information in memory may become unavailable or damaged, leading to unpredictable results. For example, copy and compare operations may not work consistently.

This behavior is an indirect result of certain performance optimizations in the Intel Pentium Pro and Pentium II processors. These optimizations affect how the Windows 95 Virtual Machine Manager (VMM) performs certain memory operations, such as determining which sections of memory are not in use and can be safely freed. As a result, the Virtual Machine Manager may free the wrong pages in memory, leading to the symptoms described earlier. This problem no longer occurs in Windows 98. To resolve this problem, install the current version of Windows. [16]

There is a little problem with sharing in
