what does it mean to be byte addressable
In computer architecture, word addressing means that addresses of memory on a computer uniquely identify words of memory. It is usually used in contrast with byte addressing, where addresses uniquely identify bytes. Almost all modern computer architectures use byte addressing, and word addressing is largely only of historical interest. A computer that uses word addressing is sometimes called a word machine.
Tables showing the same data organized under byte and word addressing
Basics
Consider a computer which provides 524,288 (2^19) bits of memory. If that memory is arranged in a byte-addressable flat address space using 8-bit bytes, then there are 65,536 (2^16) valid addresses, from 0 to 65,535, each denoting an independent 8 bits of memory. If instead it is arranged in a word-addressable flat address space using 32-bit words, then there are 16,384 (2^14) valid addresses, from 0 to 16,383, each denoting an independent 32 bits.
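The arithmetic in this example can be checked with a short Python sketch. The function name `address_count` and the constant `TOTAL_BITS` are illustrative choices, not part of any real API.

```python
def address_count(total_bits: int, mau_bits: int) -> int:
    """Number of valid addresses in a flat address space with the given MAU size."""
    if total_bits % mau_bits != 0:
        raise ValueError("memory size must be a whole number of MAUs")
    return total_bits // mau_bits


TOTAL_BITS = 524_288  # 2**19 bits, as in the example above

byte_addresses = address_count(TOTAL_BITS, 8)   # 8-bit bytes
word_addresses = address_count(TOTAL_BITS, 32)  # 32-bit words

# Print the number of addresses and the highest valid address for each layout.
print(byte_addresses, byte_addresses - 1)
print(word_addresses, word_addresses - 1)
```

Quadrupling the MAU removes two bits from the address: 16 address bits suffice for bytes, 14 for 32-bit words.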
More generally, the minimum addressable unit (MAU) is a property of a specific memory abstraction. Different abstractions within a computer may use different MAUs, even when they are representing the same underlying memory. For example, a computer might use 32-bit addresses with byte addressing in its instruction set, but the CPU's cache coherence system might work with memory only at a granularity of 64-byte cache lines, allowing any particular cache line to be identified with only a 26-bit address and decreasing the overhead of the cache.
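The 26-bit figure follows from dropping the 6 bits that select a byte within a 64-byte line. A minimal sketch, assuming power-of-two line sizes; the helper names `line_address_bits` and `split_address` are hypothetical:

```python
def line_address_bits(address_bits: int, line_bytes: int) -> int:
    """Bits needed to identify a cache line, given byte addresses of address_bits."""
    offset_bits = line_bytes.bit_length() - 1  # log2 for a power of two
    if (1 << offset_bits) != line_bytes:
        raise ValueError("line size must be a power of two")
    return address_bits - offset_bits


def split_address(byte_address: int, line_bytes: int) -> tuple[int, int]:
    """Split a byte address into (cache line number, byte offset within the line)."""
    return byte_address // line_bytes, byte_address % line_bytes


print(line_address_bits(32, 64))
print(split_address(0xFFFFFFFF, 64))
```

The same byte address is therefore seen at two granularities: the instruction set uses all 32 bits, while the cache only needs the line number.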
The address translation done by virtual memory often affects the structure and width of the address space, but it does not change the MAU.
Trade-offs of different minimum addressable units
The size of the minimum addressable unit of memory can have complex trade-offs. Using a larger MAU allows the same amount of memory to be covered with a smaller address, which can substantially decrease the memory requirements of a program. However, using a smaller MAU makes it easier to work efficiently with small items of data.
Suppose a program wishes to store one of the 12 traditional signs of Western astrology. A single sign can be stored in 4 bits. If a sign is stored in its own MAU, then 4 bits will be wasted with byte addressing (50% efficiency), while 28 bits will be wasted with 32-bit word addressing (12.5% efficiency). If a sign is "packed" into a MAU with other data, then it may be relatively more expensive to read and write. For example, to write a new sign into a MAU that other data has been packed into, the computer must read the current value of the MAU, overwrite only the appropriate bits, and then store the new value back. This will be especially expensive if it is necessary for the program to allow other threads to concurrently modify the other data in the MAU.
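The efficiency figures and the read-modify-write step can be sketched in Python as follows. The slot layout (4-bit slots numbered from the least significant end) and the function names are assumptions made for illustration only.

```python
def storage_efficiency(payload_bits: int, mau_bits: int) -> float:
    """Fraction of a MAU that holds useful data when one item occupies it alone."""
    return payload_bits / mau_bits


def read_sign(mau_value: int, slot: int) -> int:
    """Read the 4-bit sign stored in the given slot of a packed MAU."""
    return (mau_value >> (4 * slot)) & 0xF


def write_sign(mau_value: int, slot: int, sign: int) -> int:
    """Return the MAU value with one 4-bit slot overwritten and the rest untouched."""
    shift = 4 * slot
    cleared = mau_value & ~(0xF << shift)    # overwrite only the appropriate bits
    return cleared | ((sign & 0xF) << shift)


print(storage_efficiency(4, 8), storage_efficiency(4, 32))
print(hex(write_sign(0xAB, 1, 0x3)))
```

Note that `write_sign` needs the current value of the MAU as input, which is exactly the extra read that makes packed stores more expensive than a plain store.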
A more common example is a string of text. Common string formats such as UTF-8 and ASCII store strings as a sequence of 8-bit code points. With byte addressing, each code point can be placed in its own independently-addressable MAU with no overhead. With 32-bit word addressing, placing each code point in a separate MAU would increase the memory usage by 300%, which is not viable for programs that work with large amounts of text. Packing adjacent code points into a single word avoids this cost. However, many algorithms for working with text prefer to be able to independently address code points; to do this with packed code points, the algorithm must use a "wide" address which also stores the offset of the character within the word. If this wide address needs to be stored elsewhere within the program's memory, it may require more memory than an ordinary address.
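A rough sketch of both points, assuming 32-bit words holding four 8-bit code points each. The function names and the `(word address, byte offset)` tuple used for the wide address are hypothetical.

```python
WORD_BYTES = 4  # 32-bit words


def unpacked_bytes(code_points: int) -> int:
    """Memory used when every 8-bit code point gets its own 32-bit word."""
    return code_points * WORD_BYTES


def packed_bytes(code_points: int) -> int:
    """Memory used when four code points share each word (rounded up to whole words)."""
    words = -(-code_points // WORD_BYTES)  # ceiling division
    return words * WORD_BYTES


def wide_address(base_word: int, index: int) -> tuple[int, int]:
    """Wide address of the index-th code point of a packed string starting at base_word."""
    word, byte_offset = divmod(index, WORD_BYTES)
    return base_word + word, byte_offset


n = 1000
overhead = (unpacked_bytes(n) - n) / n  # extra memory relative to one byte per code point
print(overhead)
print(wide_address(100, 6))
```

The wide address carries two components where a byte address would carry one, which is why storing it can cost more than storing an ordinary address.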
To evaluate these effects on a complete program, consider a web browser displaying a large and complex page. Some of the browser's memory will be used to store simple data such as images and text; the browser will likely choose to store this data as efficiently as possible, and it will occupy about the same amount of memory regardless of the size of the MAU. Other memory will represent the browser's model of various objects on the page, and these objects will include many references: to each other, to the image and text data, and so on. The amount of memory needed to store these objects will depend greatly on the address width of the computer.
Suppose that, if all the addresses in the program were 32-bit, this web page would occupy about 10 Gigabytes of memory.
- If the web browser is running on a computer with 32-bit addresses and byte-addressable memory, the address space will cover 4 Gigabytes of memory, which is insufficient. The browser will either be unable to display this page, or it will need to be able to opportunistically move some of the data to slower storage, which will substantially hurt its performance.
- If the web browser is running on a computer with 64-bit addresses and byte-addressable memory, it will require substantially more memory in order to store the larger addresses. The exact overhead will depend on how much of the 10 Gigabytes is simple data and how much is object-like and dense with references, but a figure of 40% is not implausible, for a total of 14 Gigabytes required. This is, of course, well within the capabilities of a 64-bit address space. However, the browser will generally exhibit worse locality and make worse use of the computer's memory caches, assuming equal resources with the alternatives.
- If the web browser is running on a computer with 32-bit addresses and 32-bit-word-addressable memory, it will likely require extra memory because of suboptimal packing and the need for a few wide addresses. This impact is likely to be relatively small, as the browser will use packing and non-wide addresses for most important purposes, and the browser will fit comfortably within the maximum addressable range of 16 Gigabytes. However, there may be a significant runtime overhead due to the widespread use of packed data for images and text. More importantly, 16 Gigabytes is a relatively low limit, and if the web page grows significantly, this computer will exhaust its address space and begin to have some of the same difficulties as the byte-addressed computer.
- If the web browser is running on a computer with 64-bit addresses and 32-bit-word-addressable memory, it will suffer from both of the above runtime overheads: it will require substantially more memory to accommodate the larger 64-bit addresses, hurting locality, while also incurring the runtime overhead of working with extensive packing of text and image data. Word addressing means that the program can theoretically address up to 64 Exabytes of memory instead of only 16 Exabytes, but since the program is nowhere near needing this much memory (and in practice no real computer is capable of providing it), this provides no benefit.
Thus, word addressing allows a computer to address substantially more memory without increasing its address width and incurring the corresponding large increase in memory usage. However, this is valuable only within a relatively narrow range of working set sizes, and it can introduce substantial runtime overheads depending on the application. Programs which do relatively little work with byte-oriented data like images, text, files, and network traffic may be able to benefit most.
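The address-space limits quoted for the four configurations can be reproduced with a few lines of Python. This sketch treats "Gigabyte" and "Exabyte" as the binary units 2^30 and 2^60 bytes, which is an assumption about the figures above; `addressable_bytes` is an illustrative name.

```python
GIB = 1 << 30  # binary gigabyte
EIB = 1 << 60  # binary exabyte


def addressable_bytes(address_bits: int, mau_bytes: int) -> int:
    """Total memory reachable with the given address width and MAU size."""
    return (1 << address_bits) * mau_bytes


page_bytes = 10 * GIB  # the hypothetical web page from the example

for bits, mau in [(32, 1), (32, 4), (64, 1), (64, 4)]:
    limit = addressable_bytes(bits, mau)
    fits = page_bytes <= limit
    print(f"{bits}-bit addresses, {mau}-byte MAU: limit {limit} bytes, page fits: {fits}")
```

Only the first configuration fails to hold the page outright; the comparison between the other three comes down to the memory and runtime overheads discussed above rather than to raw capacity.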
Sub-word accesses and wide addresses
A program running on a computer that uses word addressing can still work with smaller units of memory by emulating an access to the smaller unit. For a load, this requires loading the enclosing word and then extracting the desired bits. For a store, this requires loading the enclosing word, shifting the new value into place, overwriting the desired bits, and then storing the enclosing word.
Suppose that four consecutive code points from a UTF-8 string need to be packed into a 32-bit word. The first code point might occupy bits 0–7, the second 8–15, the third 16–23, and the fourth 24–31. (If the memory were byte-addressable, this would be a little endian byte order.)
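This layout can be illustrated with a small Python sketch. `pack_word` is a hypothetical helper; the comparison with `int.from_bytes(..., "little")` shows the correspondence with little-endian byte order.

```python
def pack_word(code_points: bytes) -> int:
    """Pack four 8-bit code points into one 32-bit word, first code point in bits 0-7."""
    if len(code_points) != 4:
        raise ValueError("exactly four code points are required")
    word = 0
    for index, code_point in enumerate(code_points):
        word |= code_point << (8 * index)
    return word


word = pack_word(b"\x01\x02\x03\x04")
print(hex(word))
print(word == int.from_bytes(b"\x01\x02\x03\x04", "little"))
```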
In order to clearly elucidate the code necessary for sub-word accesses without tying the example too closely to any particular word-addressed architecture, the following examples use MIPS assembly. In reality, MIPS is a byte-addressed architecture with direct support for loading and storing 8-bit and 16-bit values, but the example will pretend that it only provides 32-bit loads and stores and that offsets within a 32-bit word must be stored separately from an address. MIPS has been chosen because it is a simple assembly language with no specialized facilities that would make these operations more convenient.
Suppose that a program wishes to read the third code point into register r1 from the word at an address in register r2. In the absence of any other support from the instruction set, the program must load the full word, right-shift by 16 to drop the first two code points, and then mask off the fourth code point:
    ldw  $r1, 0($r2)      # Load the full word
    srl  $r1, $r1, 16     # Shift right by 16
    andi $r1, $r1, 0xFF   # Mask off other code points
If the offset is not known statically, but instead a bit-offset is stored in the register r3, a slightly more complex approach is required:
    ldw  $r1, 0($r2)      # Load the full word
    srlv $r1, $r1, $r3    # Shift right by the bit offset
    andi $r1, $r1, 0xFF   # Mask off other code points
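The same load sequence can be mirrored in Python for experimentation. This is only a model: word-addressed memory is represented as a list of 32-bit integers, and `load_code_point` is a hypothetical name.

```python
def load_code_point(memory: list[int], word_address: int, bit_offset: int) -> int:
    """Emulate an 8-bit load on word-addressed memory."""
    word = memory[word_address]          # load the full word
    return (word >> bit_offset) & 0xFF   # shift right, then mask off other code points


memory = [0x44434241]  # code points 0x41, 0x42, 0x43, 0x44 packed as described above

print(hex(load_code_point(memory, 0, 16)))  # third code point, bit offset 16
```

Passing a constant 16 corresponds to the statically known offset; passing a variable corresponds to the version that takes the offset from a register.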
Suppose instead that the program wishes to assign the code point in register r1 to the third code point in the word at the address in r2. In the absence of any other support from the instruction set, the program must load the full word, mask off the old value of that code point, shift the new value into place, merge the values, and store the full word back:
    sll  $r1, $r1, 16      # Shift the new value left by 16
    lhi  $r5, 0x00FF       # Construct a constant mask to select the third byte
    nor  $r5, $r5, $zero   # Flip the mask so that it clears the third byte
    ldw  $r4, 0($r2)       # Load the full word
    and  $r4, $r5, $r4     # Clear the third byte from the word
    or   $r4, $r4, $r1     # Merge the new value into the word
    stw  $r4, 0($r2)       # Store the result as the full word
Again, if the offset is instead stored in r3, a more complex approach is required:
    sllv $r1, $r1, $r3     # Shift the new value left by the bit offset
    llo  $r5, 0x00FF       # Construct a constant mask to select a byte
    sllv $r5, $r5, $r3     # Shift the mask left by the bit offset
    nor  $r5, $r5, $zero   # Flip the mask so that it clears the selected byte
    ldw  $r4, 0($r2)       # Load the full word
    and  $r4, $r5, $r4     # Clear the selected byte from the word
    or   $r4, $r4, $r1     # Merge the new value into the word
    stw  $r4, 0($r2)       # Store the result as the full word
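A Python model of the store sequence, again treating memory as a list of 32-bit integers. `store_code_point` is a hypothetical name, and unlike the assembly it truncates the new value to 8 bits first, which is an added safety assumption.

```python
WORD_MASK = 0xFFFFFFFF  # keep results within 32 bits


def store_code_point(memory: list[int], word_address: int,
                     bit_offset: int, value: int) -> None:
    """Emulate an 8-bit store on word-addressed memory (not safe under concurrency)."""
    shifted = (value & 0xFF) << bit_offset          # shift the new value into place
    clear_mask = ~(0xFF << bit_offset) & WORD_MASK  # mask that clears the selected byte
    word = memory[word_address]                     # load the full word
    word = (word & clear_mask) | shifted            # clear the old byte, merge the new one
    memory[word_address] = word                     # store the full word back


memory = [0x44434241]
store_code_point(memory, 0, 16, 0x5A)  # overwrite the third code point
print(hex(memory[0]))
```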
This code sequence assumes that another thread cannot modify other bytes in the word concurrently. If concurrent modification is possible, then one of the modifications might be lost. To solve this problem, the last few instructions must be turned into an atomic compare-exchange loop so that a concurrent modification will simply cause it to repeat the operation with the new value. No memory barriers are required in this case.
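The shape of such a compare-exchange loop can be sketched in Python. Python exposes no hardware compare-exchange, so the `AtomicWord` class below is a hypothetical stand-in that simulates one with a lock; only the retry loop in `store_code_point_atomic` is the point of the example.

```python
import threading


class AtomicWord:
    """Hypothetical 32-bit memory word offering load and compare-exchange."""

    def __init__(self, value: int = 0) -> None:
        self._value = value & 0xFFFFFFFF
        self._lock = threading.Lock()  # simulates the atomicity of the hardware operation

    def load(self) -> int:
        with self._lock:
            return self._value

    def compare_exchange(self, expected: int, desired: int) -> bool:
        """Store desired only if the word still holds expected; report success."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = desired & 0xFFFFFFFF
            return True


def store_code_point_atomic(word: AtomicWord, bit_offset: int, value: int) -> None:
    """Write one byte of the word, retrying if another thread changed the word meanwhile."""
    shifted = (value & 0xFF) << bit_offset
    clear_mask = ~(0xFF << bit_offset) & 0xFFFFFFFF
    while True:
        old = word.load()
        new = (old & clear_mask) | shifted
        if word.compare_exchange(old, new):
            return


def run_demo() -> int:
    """Four threads each update their own byte of one shared word."""
    word = AtomicWord(0)

    def writer(index: int) -> None:
        for value in range(200):
            store_code_point_atomic(word, 8 * index, value)
        store_code_point_atomic(word, 8 * index, 0x41 + index)

    threads = [threading.Thread(target=writer, args=(i,)) for i in range(4)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return word.load()


print(hex(run_demo()))
```

Because a failed exchange recomputes the new word from a fresh load, an update to a neighbouring byte is never overwritten with stale data.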
A pair of a word address and an offset within the word is called a wide address (also known as a fat address or fat pointer). (This should not be confused with other uses of wide addresses for storing other kinds of supplemental data, such as the bounds of an array.) The stored offset may be either a bit offset or a byte offset. The code sequences above benefit from the offset being denominated in bits because they use it as a shift count; an architecture with direct support for selecting bytes might prefer to just store a byte offset.
In these code sequences, the additional offset would have to be stored alongside the base address, effectively doubling the overall storage requirements of an address. This is not always true on word machines, primarily because addresses themselves are often not packed with other data to make accesses more efficient. For example, the Cray X1 uses 64-bit words, but addresses are only 32 bits; when an address is stored in memory, it is stored in its own word, and so the byte offset can be placed in the upper 32 bits of the word. The inefficiency of using wide addresses on that system is just all the extra logic to manipulate this offset and extract and insert bytes within words; it has no memory-use impact.
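A minimal sketch of that idea: a wide address held in a single 64-bit word, with a 32-bit word address in the low half and the byte offset in the upper 32 bits. This only illustrates the layout described above and is not a claim about the exact format any Cray compiler uses; all names are hypothetical.

```python
WORD_BYTES = 8  # 64-bit words


def pack_wide(word_address: int, byte_offset: int) -> int:
    """Combine a 32-bit word address and a byte offset into one 64-bit value."""
    if not 0 <= word_address < (1 << 32):
        raise ValueError("word address must fit in 32 bits")
    if not 0 <= byte_offset < WORD_BYTES:
        raise ValueError("byte offset must lie within a word")
    return (byte_offset << 32) | word_address


def unpack_wide(wide: int) -> tuple[int, int]:
    """Split a packed wide address into (word address, byte offset)."""
    return wide & 0xFFFFFFFF, wide >> 32


def advance(wide: int, byte_count: int) -> int:
    """Move a wide address forward by a number of bytes, carrying into the word address."""
    word_address, byte_offset = unpack_wide(wide)
    total = word_address * WORD_BYTES + byte_offset + byte_count
    return pack_wide(*divmod(total, WORD_BYTES))


pointer = pack_wide(0x1000, 5)
print(hex(pointer))
print(unpack_wide(advance(pointer, 4)))
```

The `advance` helper shows the "extra logic" cost: even simple pointer arithmetic has to split, add, and recombine the two parts.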
Related concepts
The minimum addressable unit of a computer isn't necessarily the same as the minimum memory access size of the computer's instruction set. For example, a computer might use byte addressing without providing any instructions to directly read or write a single byte. Programs would be expected to emulate those operations in software with bit-manipulations, just like the example code sequences above do. This is relatively common in 64-bit computer architectures designed as successors to 32-bit supercomputers or minicomputers, such as the DEC Alpha and the Cray X1.
The C standard states that a pointer is expected to have the usual representation of an address. C also allows a pointer to be formed to any object except a bit-field; this includes each individual element of an array of bytes. C compilers for computers that use word addressing often use different representations for pointers to different types depending on their size. A pointer to a type that's large enough to fill a word will be a simple address, while a pointer such as char* or void* will be a wide pointer: a pair of the address of a word and the offset of a byte within that word. Converting between pointer types is therefore not necessarily a trivial operation and can lose information if done incorrectly.
Because the size of a C struct is not always known when deciding the representation of a pointer to that struct, it is not possible to reliably apply the rule above. Compilers may need to align the start of a struct so that it can use a more efficient pointer representation.
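The information loss in pointer conversions can be modelled in Python. The `WidePointer` type and the two conversion functions are hypothetical models of how a compiler for a word machine might represent char* and word-sized pointers; they do not describe any specific compiler.

```python
from typing import NamedTuple


class WidePointer(NamedTuple):
    """Model of a char*-style pointer on a word machine."""
    word_address: int
    byte_offset: int


def to_word_pointer(pointer: WidePointer) -> int:
    """Convert to a plain word address; the byte offset has nowhere to go and is dropped."""
    return pointer.word_address


def to_wide_pointer(word_address: int) -> WidePointer:
    """Convert a plain word address to a wide pointer at the start of that word."""
    return WidePointer(word_address, 0)


original = WidePointer(0x2000, 3)
round_trip = to_wide_pointer(to_word_pointer(original))
print(original, round_trip, original == round_trip)
```

A round trip through the narrower representation only preserves pointers that were word-aligned to begin with, which is why aligning the start of a struct lets a compiler use the simple representation safely.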
Examples
- The ERA 1103 uses word addressing with 36-bit words. Only addresses 0–1023 refer to random-access memory; others are either unmapped or refer to drum memory.
- The PDP-10 uses word addressing with 36-bit words and 18-bit addresses.
- Most Cray supercomputers from the 1980s and 1990s use word addressing with 64-bit words. The Cray-1 and Cray X-MP use 24-bit addresses, while most others use 32-bit addresses.
- The Cray X1 uses byte addressing with 64-bit addresses. It does not directly support memory accesses smaller than 64 bits, and such accesses must be emulated in software. The C compiler for the X1 was the first Cray compiler to support emulating 16-bit accesses.[1]
- The DEC Alpha uses byte addressing with 64-bit addresses. Early Alpha processors do not provide any direct support for 8-bit and 16-bit memory accesses, and programs are required to, for example, load a byte by loading the containing 64-bit word and then separately extracting the byte. Because the Alpha uses byte addressing, this offset is still represented in the least significant bits of the address (rather than separately as a wide address), and the Alpha conveniently provides load and store unaligned instructions (ldq_u and stq_u) which ignore those bits and simply load and store the containing aligned word.[2] The later byte-word extensions to the architecture (BWX) added 8-bit and 16-bit loads and stores, starting with the Alpha 21164a.[3] Again, this extension was possible without serious software incompatibilities because the Alpha had always used byte addressing.
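The Alpha-style approach, where the byte offset lives in the low bits of an ordinary byte address, can be modelled in Python. The function `ldq_u` below only imitates the behaviour described above (ignore the low three address bits and load the containing aligned 64-bit word); little-endian byte order and the function names are assumptions of this sketch.

```python
def ldq_u(memory: bytes, address: int) -> int:
    """Model of an unaligned-tolerant 64-bit load: ignores the low three address bits."""
    aligned = address & ~0x7
    return int.from_bytes(memory[aligned:aligned + 8], "little")


def load_byte(memory: bytes, address: int) -> int:
    """Emulate a byte load: fetch the containing word, then extract the selected byte."""
    quadword = ldq_u(memory, address)
    shift = 8 * (address & 0x7)  # the byte offset comes from the low bits of the address
    return (quadword >> shift) & 0xFF


memory = bytes(range(16))  # byte at address n holds the value n

print(load_byte(memory, 11))
print(hex(ldq_u(memory, 11)))
```

Unlike the wide-address case, the caller passes a single ordinary address, which is why adding real byte loads later did not require changing pointer representations.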
See also
- Byte addressing
References
- [1] Terry Greyzck, Cray Inc. Cray X1 Compiler Challenges (And How We Solved Them)
- [2] "The Alpha AXP, part 8: Memory access, storing bytes and words and unaligned data". 16 August 2017.
- [3] "Alpha: The History in Facts and Comments - Alpha 21164 (EV5, EV56) and 21164PC (PCA56, PCA57)".
Source: https://en.wikipedia.org/wiki/Word_addressing