How are segment registers involved in memory address translation?

问题

In what I've learned so far about segmentation:

A virtual address contains a segment selector and an offset
The segment selector is used in conjunction with the GDTR to find the linear address of the segment descriptor
The segment descriptor houses information regarding the chosen segment, including its linear address

So, my questions are:

Based on what I've read, the virtual address is loaded into the segment register, and then somehow the translation is continued from there. What happens to the segment register after the virtual address has been loaded into it to obtain the descriptor?
As I understand it, the segment register also holds a cached value of the descriptor. How does this come into play during the translation process?
How does the system determine which segment register to load in, given that a segment selector can have up to 2^13 different values and there are only six primary registers?

回答1:

The usual translation goes as follow:

 Logical address   -->   GDT -->  Linear address          --> Page tables --> Physical Address
(segment:offset)                 (segment base + offset)         

\______________________________________________________/ 
                  Virtual address                                     
             (can be either logical or linear)

If running in VMX non-root mode (i.e. in a VM) and EPT is enabled then:

 Logical address   -->   GDT -->  Linear address          --> Page tables --> Guest Physical Address --> EPT --> (System) Physical Address
(segment:offset)                 (segment base + offset)         

\______________________________________________________/                      \__________________________________________________________/
                  Virtual address                                                        Physical address
             (can be either logical or linear)

If an IOMMU is present (like the umbrella technology VT-d):

Logical address   -->   GDT -->  Linear address          --> Page tables --> Guest Physical Address --> EPT --> (System) Physical Address  -->  1 or 2 level translation --> (IO) Physical address
(segment:offset)                 (segment base + offset)         

\______________________________________________________/                     \___________________________________________________________________________________________________________________/
                  Virtual address                                                        Physical address
             (can be either logical or linear)

The MMIO can even perform the translation of the Guest Virtual Address or the Guest Physical Address (one of it's purposes is to reify the Virtual address of an application to the hardware and simplify the management of the plethora of address spaces encountered during the translation).

Note As Hadi Brais pointed out, the term "Virtual address" only designates a Linear address in the Intel and AMD manuals.
I find it more useful to label both the logical and the linear addresses as virtual because they are before the page translation step.

The segment register holds a segment selector that index a segment descriptor that is used to performs the security checks and get the segment base that is summed with the offset part of the logical address.
After that, it's done.

Every address specified at the instruction level is a logical address - requiring the lookup of the segment descriptor.
To avoid reading it from memory each time the memory is accessed by an instruction, the CPU caches it - otherwise that would be a performance killer.

The OS setup the segment registers based on what it need to do but it rarely need more that four segments anyway.

The primary intent for segmentation (in PM) was to fulfil process isolation by defining non overlapping segments for each program.
A program usually need only a stack segment, a data segment and a code segment - the other three are there to avoid saving/restoring the data segment back then when a segment max size was 64KiB (read: Real mode. fs and gs were added later though).

Today OSes use a flat model where there are only two segments (code and data/stack - this is a simplification, other segments are required) encompassing the whole address space, plus OS specifics segments for things like TLS or PEB/TEB.
So six segment registers are even more than it's needed, the 8192 entries of the GDT are there in case they are (if even) needed.

来源：https://stackoverflow.com/questions/52222133/how-are-segment-registers-involved-in-memory-address-translation

标签

x86

x86-64

hardware

intel

cpu-registers