You will find here an explanation of terms and concepts common to R300
hardware and drivers in general.
There are several ways to access the hardware:
Register writes - a part of the physical address space is mapped to memory (called registers) inside the card. These registers describe the internal state of the card, and thus writing values to them (or reading values from them) will do something.
This is often abbreviated MMIO - memory mapped input-output.
Note that because the registers describe the internal state of the card, their values can change without explicit writes. Also, some registers
will trigger an internal function when a certain bit is set, so it is perfectly normal to see a sequence of writes like OUTREG(0x4f18, 1) that appears to write the same value into the register repeatedly.
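As a rough illustration of register access, here is a minimal sketch that runs without hardware. OUTREG is the name used above. INREG, mmio_base, fake_aperture and the aperture size are assumptions made for this example, and an ordinary array stands in for the mapped register region. Nothing here says what register 0x4f18 actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated register aperture. In a real driver mmio_base would point at
 * the mapped register region of the card; here it points at a plain array
 * so the sketch can run anywhere. The size is an assumption. */
#define MMIO_SIZE 0x10000u
static uint32_t fake_aperture[MMIO_SIZE / 4];
static volatile uint32_t *mmio_base = fake_aperture;

/* The volatile qualifier keeps the compiler from merging or dropping
 * accesses, so two identical writes in a row both reach the register. */
static void OUTREG(uint32_t offset, uint32_t value)
{
    assert(offset % 4 == 0 && offset < MMIO_SIZE);
    mmio_base[offset / 4] = value;
}

/* Hypothetical read counterpart to OUTREG. */
static uint32_t INREG(uint32_t offset)
{
    assert(offset % 4 == 0 && offset < MMIO_SIZE);
    return mmio_base[offset / 4];
}
```

With real hardware the second of two identical OUTREG calls may trigger the internal function again, which is why such sequences must not be "optimized" away.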
Memory writes: the video memory is mapped to a certain portion of the physical address space just like the registers. The framebuffer is usually located at offset 0x0, so writing there will cause something to be displayed.
Memory writes are slow! Reads are even slower! Remember, the processor has to reach all the way from its inner core past the cache, through the north bridge, and across the PCI (or AGP) bus to the device. In the case of reads, it also has to wait for the answer.
DMA upload of commands to the command processor (abbreviated CP).
Instead of writing data to registers or memory directly, one can instruct a microcontroller onboard the card to do it.
My understanding is that this is somewhat slower than direct writes in terms of response time (a DMA command, once scheduled for upload, will take longer to take effect). However, there is a big plus: while the command processor pulls packets with register values, your main (and fast) CPU can do something else.
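The division of labour between the CPU and the command processor can be sketched as a producer/consumer queue. The code below is only an illustration: it assumes commands are queued in a ring buffer with a write pointer advanced by the CPU and a read pointer advanced by the CP. The struct layout, the names ring_emit and ring_consume, and the ring size are all hypothetical, and no real packet format is shown.

```c
#include <assert.h>
#include <stdint.h>

#define RING_DWORDS 16u   /* hypothetical size, must be a power of two */

struct ring {
    uint32_t buf[RING_DWORDS];
    uint32_t wptr;   /* next slot the CPU will fill */
    uint32_t rptr;   /* next slot the command processor will read */
};

/* Free dwords. One slot is kept empty so wptr == rptr means "empty". */
static uint32_t ring_free(const struct ring *r)
{
    return (r->rptr - r->wptr - 1u) & (RING_DWORDS - 1u);
}

/* CPU side: queue one dword. Returns 1 on success, 0 if the ring is full. */
static int ring_emit(struct ring *r, uint32_t dword)
{
    /* Wrapping with a mask only works for power-of-two sizes. */
    assert((RING_DWORDS & (RING_DWORDS - 1u)) == 0);
    if (ring_free(r) == 0)
        return 0;
    r->buf[r->wptr] = dword;
    r->wptr = (r->wptr + 1u) & (RING_DWORDS - 1u);
    return 1;
}

/* Stand-in for the CP: fetch one pending dword. Returns 0 if none. */
static int ring_consume(struct ring *r, uint32_t *out)
{
    if (r->rptr == r->wptr)
        return 0;
    *out = r->buf[r->rptr];
    r->rptr = (r->rptr + 1u) & (RING_DWORDS - 1u);
    return 1;
}
```

The CPU only has to fill slots and move its pointer. It never waits for an individual command to take effect, which is where the extra latency and the extra parallelism both come from.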
AGP memory. This is a portion of system RAM dedicated to holding data such as textures or vertices. It was intended to augment the video memory local to the graphics card.
AGP memory is good for one thing - fast transfers of data to the video card.
My understanding is that CPU writes to AGP memory are slightly slower than writes to ordinary system RAM, as they need to bypass the cache. So this is another way to partition work between the video card and the main processor.
PCIGART - this is similar to AGP memory, but for PCI devices.
Transfers to the video card go at the regular speed of the PCI bus. It is mainly used
as a convenience (see GART below).
GART stands for graphics address remapping table. It allows a bunch of pages to be represented as a single contiguous region. This way both your program and the video card can view AGP memory as an array and agree on offsets, even though the actual memory used can be scattered around.
GART is so convenient that a video card can make use of multiple GARTs - one for PCIGART, one for the ring buffer, one for upload/download of video, etc.
A GART can be as simple as a few (physically contiguous) pages of system RAM filled with addresses of pages to download, or as complex as AGPGART, which is part of the Northbridge chip.
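The remapping idea can be sketched in a few lines. This is an illustration only: the page size, the table length, the physical addresses in gart_table and the function name gart_translate are all made up for the example and do not describe any particular hardware table format.

```c
#include <assert.h>
#include <stdint.h>

#define GART_PAGE_SIZE 4096u   /* assumed page size */
#define GART_PAGES     4u      /* assumed table length */

/* One entry per page of the contiguous view. Each entry holds the physical
 * address of the page that backs it. The values are invented and are
 * deliberately not in ascending order, to show scattered pages. */
static const uint64_t gart_table[GART_PAGES] = {
    0x1f000000u, 0x03a45000u, 0x7c210000u, 0x00812000u
};

/* Translate an offset in the contiguous view to a physical address. */
static uint64_t gart_translate(uint32_t offset)
{
    uint32_t page = offset / GART_PAGE_SIZE;
    assert(page < GART_PAGES);
    return gart_table[page] + (offset % GART_PAGE_SIZE);
}
```

Both the program and the card only ever deal with the offset. Where each page really lives is hidden behind the table lookup.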
Synchronization and timing
Unlike microprocessors, which execute complicated programs that contain loops and branches,
graphics accelerators have a pipeline structure, where the data is fetched, processed, and the results are written to the framebuffer.
Even newer cards that can execute vertex and pixel shader programs contain them within a single pipeline stage.
As the data flow is less complicated, conflicts (for example, data not being ready for fetching) are dealt with by explicit scheduling commands rather than by complicated hardware within the chip.
Note that a conflict can often result in a lockup and sometimes bring down the entire system by making the PCI bus inoperative.
Thus the two common lockup states:
- card lockup - X is frozen (the mouse may still move), but the system is pingable and can be shut down cleanly.
- bus lockup - the card locked up and took the PCI bus down with it. The system is not responsive at all and a power-off is necessary.
Note: it is possible to cause a lockup by writing to video memory while the 3D engine is active and wants to write to it at the same time.
There are several ways to tell the graphics card to stall or wait for something:
- A status register will tell the status of the hardware and, in particular, whether a particular subunit is busy. There are several status registers.
- A WAIT_UNTIL register will cause PCI writes to the card to stall until a condition is met.
- The command processor also has a similar WAIT_UNTIL command.
Register definitions can be found in the following places:
- A radeon_reg.h file in a recent Xserver driver, Mesa driver or DRM driver.
Note: sometimes each copy only has the subset relevant to that driver, so be sure to look at all of them.
- The r300_reg.h file in r300_driver/docs contains guesses as to what R300 registers do. As such, the information within it is best tested before use.
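Returning to the status registers mentioned under synchronization and timing, waiting on one usually amounts to polling a busy bit with a bound on the number of attempts, so that a locked-up card does not also hang the driver. The sketch below simulates this without hardware. STATUS_BUSY, read_status, busy_reads_left and wait_for_idle are hypothetical names, and the bit position is an assumption rather than a documented register layout.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_BUSY (1u << 31)   /* hypothetical busy bit */

/* Simulated hardware: the status register reports "busy" for this many
 * further reads, then reports idle. */
static unsigned busy_reads_left;

/* Stand-in for reading a status register over MMIO. */
static uint32_t read_status(void)
{
    if (busy_reads_left > 0) {
        busy_reads_left--;
        return STATUS_BUSY;
    }
    return 0;
}

/* Poll until the busy bit clears. Returns 0 once idle, or -1 if the bit
 * is still set after max_polls reads (a possible lockup). */
static int wait_for_idle(unsigned max_polls)
{
    assert(max_polls > 0);
    for (unsigned i = 0; i < max_polls; i++) {
        if ((read_status() & STATUS_BUSY) == 0)
            return 0;
    }
    return -1;
}
```

Each poll is a register read across the bus, which is slow, so a real driver would likely prefer a WAIT_UNTIL style mechanism where one is available and fall back to polling only when it needs the answer on the CPU side.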