diff --git a/Docs/HAL/ARMop_API b/Docs/HAL/ARMop_API new file mode 100644 index 0000000000000000000000000000000000000000..dc1a7d897fd4ed1ab0280236795523b6a45160b7 --- /dev/null +++ b/Docs/HAL/ARMop_API @@ -0,0 +1,440 @@ +12345678901234567890123456789012345678901234567890123456789012345678901234567890 + +mjs 12 Jan 2001 Early Draft + + +RISC OS Kernel ARM core support +=============================== + +This document is concerned with the design of open ended support for +multiple ARM cores within the RISC OS kernel, as part of the work loosely +termed hardware abstraction. Note that the ARM core support is part of the +OS kernel, and so is not part of the hardware abstraction layer (HAL) +itself. + +Background +---------- + +ARM core support (including caches and MMU) has historically been coded in a +tailored way for one or two specific variants. Since version 3.7 this has +meant just two variants; ARM 6/7 and StrongARM SA110. A more generic +approach is required for the next generation. This aims both to support +several cores in a more structured way, and to cover minor variants (eg. +cache size) with the same support code. The natural approach is to set up +run-time vectors to a set of ARM support routines. + +Note that it is currently assumed that the ARM MMU architecture will not +change radically in future ARM cores. Hence, the kernel memory management +algorithms remain largely unchanged. This is believed to be a reasonable +assumption, since the last major memory management change was with Risc PC +and ARM 610 (when the on-chip MMU was introduced). + +Note that all ARM support code must be 32-bit clean, as part of the 32-bit +clean kernel. + +Survey of ARM core requirements +------------------------------- + +At present, five broad ARM core types can be considered to be of interest; +ARM7 (and ARM6), ARM9, ARM10, StrongARM (SA1) and XScale. These divide +primarily in terms of cache types, and cache and TLB maintenance +requirements. 
They also span a range of defined ARM architecture variants, +which introduced variants for system operations (primarily coprocessor 15 +instructions). + +The current ARM architecture is version 5. This (and version 4) has some +open ended definitions to allow code to determine cache size and types from +CP15 registers. Hence, the design of the support code can hope to be at +least tolerant of near future variations that are introduced. + +ARM7 +---- + +ARM7 cores may be architecture 3 or 4. They differ in required coprocessor +15 operations for the same cache and TLB control. ARM6 cores are much the +same as architecture 3 ARM7. The general character of all these cores is of +unified write-through caches that can only be invalidated on a global basis. +The TLBs are also unified, and can be invalidated per entry or globally. + +ARM9 +---- + +ARM9 cores are architecture 4. We ignore ARM9 variants without an MMU. The +kernel can read cache size and features. The ARM 920 or 922 have harvard +caches, with writeback and writethrough capable data caches (on a page or +section granularity). Data and instruction caches can be invalidated by +individual lines or globally. The data cache can be cleaned by virtual +address or cache segment/index, allowing for efficient cache maintenance. +Data and instruction TLBs can be invalidated by entry or globally. + +ARM10 +----- + +ARM 10 is architecture 5. Few details available at present. Likely to be +similar to ARM9 in terms of cache features and available operations. + +StrongARM +--------- + +StrongARM is architecture 4. StrongARMs have harvard caches, the data cache +being writeback only (no writethrough option). The data cache can only be +globally cleaned in an indirect manner, by reading from otherwise unused +address space. This is inefficient because it requires external (to the +core) reads on the bus. In particular, the minimum cost of a clean, for a +nearly clean cache, is high. 
The data cache supports clean and invalidate by +individual virtual lines, so this is reasonably efficient for small ranges +of address. The data TLB can be invalidated by entry or globally. + +The instruction cache can only be invalidated globally. This is inefficient +for cases such as IMBs over a small range (dynamic code). The instruction +TLB can only be invalidated globally. + +Some StrongARM variants have a mini data cache. This is selected over the +main cache on a section or page by using the cachable/bufferable bits set to +C=1,B=0 in the MMU (this is not standard ARM architecture). The mini data +cache is writeback and must be cleaned in the same manner as the main data +cache. + +XScale +------ + +XScale is architecture 5. It implements harvard caches, the data cache being +writeback or writethrough (on a page or section granularity). Data and +instruction caches can be invalidated by individual lines or globally. The +data cache can be fully cleaned by allocating lines from otherwise unused +address space. Unlike StrongARM, no external reads are needed for the clean +operation, so that cache maintenance is efficient. + +XScale has a mini data cache. This is only available by using extension bits +in the MMU. This extension is not documented in the current manual for +architecture 5, but will presumably be properly recognised by ARM. It should +be a reasonably straightforward extension for RISC OS. The mini data cache +can only be cleaned by inefficient indirect reads as on StrongARM. However, +for XScale, the whole mini data cache can be configured as writethrough to +obviate this problem. The most likely use for RISC OS is to map screen +memory as mini cacheable, when writethrough caching will also be highly +desirable to prevent delayed screen update. + +The instruction and data TLBs can each be invalidated by entry or globally. 
+
+
+Kernel ARM operations
+---------------------
+
+This section lists the definitions and API of the set of ARM operations
+required by the kernel for each major ARM type that is to be supported. Some
+operations may be very simple on some ARMs. Others may need support from the
+kernel environment - for example, readable parameters that have been
+determined at boot, or address space available for cache clean operations.
+
+The general rules for register usage and preservation in calling these
+operations are:
+
+   - any parameters are passed in r0,r1 etc. as required
+   - r0 may be used as a scratch register
+   - the routines see a valid stack via sp, at least 16 words are available
+   - lr is the return link as required
+   - on exit, all registers except r0 and lr must be preserved
+
+Note that where register values are given as logical addresses, these are
+RISC OS logical addresses. The equivalent ARM terminology is virtual address
+(VA), or modified virtual address (MVA) for architectures with the fast
+context switch extension.
+
+Note also that where cache invalidation is required, it is implicit that any
+associated operations for a particular ARM should be performed also. The
+most obvious example is for an ARM with branch prediction, where it may be
+necessary to invalidate a branch cache wherever instruction cache
+invalidation is to be performed.
+
+Any operation that is a null operation on the given ARM should be
+implemented as a single return instruction:
+
+   MOV pc, lr
+
+
+-- Cache_CleanInvalidateAll
+
+The cache or caches are to be globally invalidated, with cleaning of any
+writeback data being properly performed.
+
+   entry: -
+   exit:  -
+
+   IRQs are enabled
+   call is not reentrant
+
+Note that any write buffer draining should also be performed by this
+operation, so that memory is fully updated with respect to any writeback
+data.
+
+The OS only expects the invalidation to be with respect to instructions/data
+that are not involved in any currently active interrupts. In other words, it
+is expected and desirable that interrupts remain enabled during any extended
+clean operation, in order to avoid impact on interrupt latency.
+
+-- Cache_CleanAll
+
+The unified cache or data cache is to be globally cleaned (any writeback data
+updated to memory). Invalidation is not required.
+
+   entry: -
+   exit:  -
+
+   IRQs are enabled
+   call is not reentrant
+
+Note that any write buffer draining should also be performed by this
+operation, so that memory is fully updated with respect to any writeback
+data.
+
+The OS only expects the cleaning to be with respect to data that are not
+involved in any currently active interrupts. In other words, it is expected
+and desirable that interrupts remain enabled during any extended clean
+operation, in order to avoid impact on interrupt latency.
+
+-- Cache_InvalidateAll
+
+The cache or caches are to be globally invalidated. Cleaning of any writeback
+data is not to be performed.
+
+   entry: -
+   exit:  -
+
+   IRQs are enabled
+   call is not reentrant
+
+This call is only required for special restart use, since it implies that
+any writeback data are either irrelevant or not valid. It should be a very
+simple operation on all ARMs.
+
+-- Cache_RangeThreshold
+
+Return a threshold value for an address range, above which it is advisable
+to globally clean and/or invalidate caches, for performance reasons. For a
+range less than or equal to the threshold, a ranged cache operation is
+recommended.
+
+   entry: -
+   exit:  r0 = threshold value (bytes)
+
+   IRQs are enabled
+   call is not reentrant
+
+This call returns a value that the kernel may use to select between strategies
+in some cache operations. This threshold may also be of use to some of the
+ARM operations themselves (although they should typically be able to read
+the parameter more directly).
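
The threshold rule described for Cache_RangeThreshold can be pictured with a short C sketch. This is illustrative only: the type and function names are hypothetical and are not part of the kernel, and the real operations are expected to be written in assembler. It shows only the comparison the text describes, where a range less than or equal to the threshold selects the ranged operation.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    CACHE_OP_RANGED,   /* clean/invalidate just the given address range */
    CACHE_OP_GLOBAL    /* clean/invalidate the whole cache instead      */
} cache_strategy;

/*
 * Choose a strategy for a range of 'length' bytes, given the value that
 * Cache_RangeThreshold returned in r0. A range less than or equal to the
 * threshold uses the ranged operation; anything larger goes global.
 */
cache_strategy choose_cache_strategy(uint32_t length, uint32_t threshold)
{
    return (length <= threshold) ? CACHE_OP_RANGED : CACHE_OP_GLOBAL;
}
```

The same comparison is what IMB_Range and the multiple-entry MMU operations would be expected to apply when deciding whether to fall back to their global equivalents.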
+ +The exact value is unlikely to be critical, but a sensible value may depend +on both the ARM and external factors such as memory bus speed. + + +-- TLB_InvalidateAll + +The TLB or TLBs are to be globally invalidated. + + entry: - + exit: - + + IRQs are enabled + call is not reentrant + +-- TLB_InvalidateEntry + +The TLB or TLBs are to be invalidated for the entry at the given logical +address. + + entry: r0 = logical address of entry to invalidate (page aligned) + exit: - + + IRQs are enabled + call is not reentrant + +The address will always be page aligned (4k). + +-- WriteBuffer_Drain + +Any writebuffers are to be drained so that any pending writes are guaranteed +completed to memory. + + entry: - + exit: - + + IRQs are enabled + call is not reentrant + +-- IMB_Full + +A global instruction memory barrier (IMB) is to be performed. + + entry: - + exit: - + + IRQs are enabled + call is not reentrant + +An IMB is an operation that should be performed after new instructions have +been stored and before they are executed. It guarantees correct operation +for code modification (eg. something as simple as loading code to be +executed). + +On some ARMs, this operation may be null. On ARMs with harvard architecture +this typically consists of: + + 1) clean data cache + 2) drain write buffer + 3) invalidate instruction cache + +There may be other considerations such as invalidating branch prediction +caches. + +-- IMB_Range + +An instruction memory barrier (IMB) is to be performed over a logical +address range. + + entry: r0 = logical address of start of range + r1 = logical address of end of range (exclusive) + Note that r0 and r1 are aligned on cache line boundaries + exit: - + + IRQs are enabled + call is not reentrant + +An IMB is an operation that should be performed after new instructions have +been stored and before they are executed. It guarantees correct operation +for code modification (eg. something as simple as loading code to be +executed). 
+ +On some ARMs, this operation may be null. On ARMs with harvard architecture +this typically consists of: + + 1) clean data cache over the range + 2) drain write buffer + 3) invalidate instruction cache over the range + +There may be other considerations such as invalidating branch prediction +caches. + +Note that the range may be very large. The implementation of this call is +typically expected to use a threshold (related to Cache_RangeThreshold) to +decide when to perform IMB_Full instead, being faster for large ranges. + +-- MMU_Changing + +The global MMU mapping is about to be changed. + + entry: - + exit: - + + IRQs are enabled + call is not reentrant + +The operation must typically perform the following: + + 1) globally clean and invalidate all caches + 2) drain write buffer + 3) globally invalidate TLB or TLBs + +Note that it should not be necessary to disable IRQs. The OS ensures that +remappings do not affect currently active interrupts. + +-- MMU_ChangingEntry + +The MMU mapping is about to be changed for a single page entry (4k). + + entry: r0 = logical address of entry (page aligned) + exit: - + + IRQs are enabled + call is not reentrant + +The operation must typically perform the following: + + 1) clean and invalidate all caches over the 4k range of the page + 2) drain write buffer + 3) invalidate TLB or TLBs for the entry + +Note that it should not be necessary to disable IRQs. The OS ensures that +remappings do not affect currently active interrupts. + +-- MMU_ChangingUncached + +The MMU mapping is about to be changed in a way that globally affects +uncacheable space. + + entry: - + exit: - + + IRQs are enabled + call is not reentrant + +The operation must typically globally invalidate the TLB or TLBs. The OS +guarantees that cacheable space is not affected, so cache operations are not +required. However, there may still be considerations such as fill buffers +that operate in uncacheable space on some ARMs. 
+ +-- MMU_ChangingUncachedEntry + +The MMU mapping is about to be changed for a single uncacheable page entry +(4k). + + entry: r0 = logical address of entry (page aligned) + exit: - + + IRQs are enabled + call is not reentrant + +The operation must typically invalidate the TLB or TLBs for the entry. The +OS guarantees that cacheable space is not affected, so cache operations are +not required. However, there may still be considerations such as fill +buffers that operate in uncacheable space on some ARMs. + + +-- MMU_ChangingEntries + +The MMU mapping is about to be changed for a contiguous range of page +entries (multiple of 4k). + + entry: r0 = logical address of first page entry (page aligned) + r1 = number of page entries ( >= 1) + exit: - + + IRQs are enabled + call is not reentrant + +The operation must typically perform the following: + + 1) clean and invalidate all caches over the range of the pages + 2) drain write buffer + 3) invalidate TLB or TLBs over the range of the entries + +Note that it should not be necessary to disable IRQs. The OS ensures that +remappings do not affect currently active interrupts. + +Note that the number of entries may be large. The operation is typically +expected to use a reasonable threshold, above which it performs a global +operation instead for speed reasons. + +-- MMU_ChangingUncachedEntries + +The MMU mapping is about to be changed for a contiguous range of uncacheable +page entries (multiple of 4k). + + entry: r0 = logical address of first page entry (page aligned) + r1 = number of page entries ( >= 1) + exit: - + + IRQs are enabled + call is not reentrant + +The operation must typically invalidate the TLB or TLBs over the range of +the entries. The OS guarantees that cacheable space is not affected, so +cache operations are not required. However, there may still be +considerations such as fill buffers that operate in uncacheable space on +some ARMs. + +Note that the number of entries may be large. 
The operation is typically
+expected to use a reasonable threshold, above which it performs a global
+operation instead for speed reasons.
diff --git a/Docs/HAL/HAL_API b/Docs/HAL/HAL_API
new file mode 100644
index 0000000000000000000000000000000000000000..45526eb194e85756a7daf3e035bf91dc58e0c57f
--- /dev/null
+++ b/Docs/HAL/HAL_API
@@ -0,0 +1,1054 @@
+12345678901234567890123456789012345678901234567890123456789012345678901234567890
+
+2001 - a HAL API
+----------------
+
+mjs 12 Jan 2001 Early Draft (mjs,kjb)
+
+
+RISC OS Hardware Abstraction
+============================
+
+Background
+----------
+
+This document is concerned with low level developments of RISC OS in order
+to support future ARM based platforms. Loosely, this has been considered as
+creating a hardware abstraction layer, or HAL. This term is a useful
+shorthand, but with the following caveats. Firstly, the HAL work is only
+envisaged to provide a modest level of low level abstraction (at least for
+the next OS generation). Secondly, significant non-HAL work, at all levels,
+is required to make a useful next generation RISC OS.
+
+Note that most of the hardware dependence of the OS is already confined to
+low level code (essentially, the kernel and device drivers). Here we assume
+that the OS is only expected to run on an ARM processor, and with somewhat
+restricted choices of I/O hardware (eg. friendly video pixel formats).
+
+Up to now (version 4), RISC OS has evolved while closely coupled to an ARM
+processor core and to an Acorn proprietary chip set (video, memory, I/O). It
+has remained highly hardware specific. For the purposes of further
+investment in RISC OS, three key areas of hardware dependence must be
+addressed: 32-bit clean operation, support for new ARM cores, and support
+for various video, memory and I/O configurations. Without all of these, the
+OS is essentially useless on foreseeable future hardware.
+ + +32-bit clean code +----------------- + +All RISC OS code must run 32-bit clean on future releases. This is because +all ARM cores from ARM9 onwards (and also some ARM7 variants) have entirely +removed support for RISC OS's native 26-bit modes. Note that 32-bit clean +code is not precluded from working on the older ARM cores (back to ARM 610). + +With more care, 32/26-bit agnostic code can be written to work back to ARM +2. This may be of interest to module and application code, but note that the +OS kernel itself is only expected to work back to ARM 610, since an MMU is +required. + +Much of the work required is routine and has been done for the OS itself +(though long term weeding of consequent bugs is required). A 32-bit +compatible shared C library has been released in order to encourage +conversion of application code by third parties. This work is not part of +hardware abstraction and is not considered further in this document. + +Support for newer ARM cores +--------------------------- + +ARM core support (including caches and MMU) has historically been coded in a +tailored way for one or two specific variants. Since version 3.7 this has +meant just two variants; ARM 6/7 and StrongARM SA110. A more generic +approach is required for the next generation. This aims both to support +several cores in a more structured way, and to cover minor variants (eg. +cache size) with the same support code. The natural approach is to set up +run-time vectors to a set of ARM support routines. + +Note that it is currently assumed that the ARM MMU architecture will not +change radically in future ARM cores. Hence, the kernel memory management +algorithms remain largely unchanged. This is believed to be a reasonable +assumption, since the last major memory management change was with Risc PC +and ARM 610 (when the on-chip MMU was introduced). + +ARM core support is confined almost entirely to the kernel, and is therefore +not strictly part of the HAL. 
The HAL will only be concerned with any +external factors such as clock selection. Only HAL aspects are considered +further in this document. + +Hardware abstraction layer +-------------------------- + +A simple HAL is to be inserted underneath RISC OS. This will provide two +functions. Firstly, it will be responsible for initial system bootstrap, +much like a PC BIOS, and secondly it will provide simple APIs to allow +hardware access. + +The HAL APIs are a thin veneer on top of the hardware. They are designed to +act as replacements for all the hardware knowledge and manipulation +performed by the RISC OS Kernel, together with some APIs that will allow +RISC OS driver modules to become more hardware independent. No attempt will +be made (at this stage) to perform such tasks as separating the video +drivers from the Kernel, for example. + +One tricky design decision is the amount of abstraction to aim for. Too +little, and the system is not flexible enough; too much and HAL design is +needlessly complicated for simple hardware. The present design tries to err +on the side of too little abstraction. Extra, more abstract APIs can always +be added later. So, initially, for example, the serial device API will just +provide discovery, some capability flags and the base address of the UART +register set. This will be sufficient for the vast majority of devices. If +new hardware comes along later that isn't UART compatible, a new API can be +defined. Simple hardware can continue to just report UART base addresses. + +The bulk of device driver implementation remains in RISC OS modules - the +difference is that the HAL will allow many device drivers to avoid direct +access to hardware. For example, PS2Driver can now use HAL calls to send and +receive bytes through the PS/2 ports, and thus is no longer tied to IOMD's +PS/2 hardware. Similarly, interrupt masking and unmasking, as performed by +any device vector claimant, is now a HAL call. 
Note that HAL calls are +normally performed via a Kernel SWI - alternatively the Kernel can return +the address of specific HAL routines. There is nothing to stop specific +drivers talking to hardware directly, as long as they accept that this will +tie them to specific devices. + +This dividing line between the HAL and RISC OS driver modules is crucial. If +the HAL does everything, then we have achieved nothing - we have just as +much hardware dependent code - it's just in a different place. It is +important to place the dividing line as close to the hardware as possible, +to make it easy to design a HAL and to prevent large amounts of code +duplication between HALs for different platforms. + +The Kernel remains responsible for the ARM's MMU and all other aspects of +the CPU core. The HAL requires no knowledge of details of ARM +implementations, and thus any HAL implementation should work on any +processor from the ARM610 onwards. + + +HAL/OS layout and headers +------------------------- + +The OS is linked to run at a particular base address. Pre-HAL OS's were +linked to run at <n>MB, that is on a MB alignment to allow efficient MMU +section mapping. For simplicity, the HAL/OS layout can allow a fixed maximum +size for the HAL, currently set at 64k. Then the OS base address will be +<n>MB+64k. This allows a HAL of up to 64K to be placed at the bottom of a +ROM below the OS, and the HAL/OS combination to still be section-mapped. A +ROM should be portable to hardware variants merely by replacing the 64k HAL +block. + +A more flexible system would only sacrifice MMU mapping efficiency. The HAL +and OS could be placed in any desired way, provided that each is contiguous +in physical memory. + +The OS starts with a header including a magic word - this aids probing and +location of images. 
The OS header format is defined as: + +Word 0: Magic word ("OSIm" - &6D49534F) +Word 1: Flags (currently should be 0) +Word 2: Image size (bytes) +Word 3: Offset (bytes) from OS base to table of OS routine entry points +Word 4: Number of entries in table + +The HAL itself should have whatever header is required to start the system. +For example on ARM7500 16->32 bit switch code is required, and on the +9500 parts a special ROM header and checksum must be present. A HAL +descriptor block, instead of a header, can be placed somewhere in the HAL. A +pointer to this block is passed by the HAL to the OS in the OS_Start call: + +Word 0: Flags (currently should be 0) +Word 1: Offset (bytes) from descriptor to start of HAL (will be <= 0) +Word 2: HAL size (bytes) +Word 3: Offset (bytes) from descriptor to table of HAL routine entry points +Word 4: Number of entries in table +Word 5: Size of HAL static workspace required (bytes) + +Calling standards +----------------- + +RISC OS and the HAL are two separate entities, potentially linked +separately. The OS and the HAL are each defined with a set of callable +routines for the OS/HAL interface. Each HAL entry or each OS entry is given +a unique (arbitrary) number, starting at 0. The offset to each entry is +given in an entry table. Calls can be made manually through this table, or +stubs could be created at run-time to allow high-level language calls. + +Every entry (up to the declared maximum) must exist. If not implemented, a +failure response must be returned, or the call ignored, as appropriate. Note +that the OS interface for the HAL should not be confused with standard OS +calls (SWIs) already defined for use in the OS itself. + +To permit high-level language use in the future, the procedure call standard +in both directions is the ARM-Thumb Procedure Call Standard (ATPCS) as +defined by ARM, with no use of floating point, no stack limit checking, no +frame pointers, and no Thumb interworking. 
HAL code is expected to be ROPI +and RWPI (ie. all its read-only segments and read-write segments are +position-independent). Hence the HAL is called with its static workspace +base (sb) in r9. The OS kernel is neither ROPI nor RWPI (except for the +pre-MMU calls, which are ROPI). OS calls from the HAL do not use r9 as a +static base. + +The HAL will always be called in a privileged mode - if called in an +interrupt mode, the corresponding interrupts will be disabled. The HAL +should not change mode. HAL code should work in both 26-bit and 32-bit modes +(but should assume 32-bit configuration). + +Routines can be conveniently specified in C language syntax. Typically they +will be written in assembler. In detail, the ATPCS register usage for HAL +calls is as follows: + + ATPCS ARM use at exit + a1 r0 argument 1/return value undefined or return value + a2 r1 argument 2/return value undefined or return value + a3 r2 argument 3/return value undefined or return value + a4 r3 argument 4/return value undefined or return value + v1 r4 var 1 preserved + v2 r5 var 2 preserved + v3 r6 var 3 preserved + v4 r7 var 4 preserved + v5 r8 var 5 preserved + sb r9 static workspace base preserved + v7 r10 var 7 preserved + v8 r11 var 8 preserved + ip r12 scratch undefined + sp r13 stack pointer preserved + lr r14 return link undefined + +The static workspace base points to the HAL workspace. + +Note that HAL calls must be assumed to corrupt all of r0-r3,r12,r14. A +function return value may be in r0, or (less commonly) multiple return +words in two or more of r0-r3. + +If there are more than 4 arguments to a HAL call, arguments 5 onwards must +be pushed onto the stack before the call, and discarded after return. (The +order of arguments is with argument 5 at top of stack, ie. first to be +pulled.) + +The register usage for the OS entry points is the same, except that r9 is +not used as a static base (it is preserved). 
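
As a rough illustration of the OS image header given under 'HAL/OS layout and headers', the following C sketch checks the magic word and locates the entry table through which calls are made. The struct, macro and function names are hypothetical. The sketch also assumes that the table offset is word aligned and that the table lies inside the image; the text does not state either, so these are sanity checks rather than documented rules. The meaning of the individual table entries is not interpreted here.

```c
#include <stddef.h>
#include <stdint.h>

/* "OSIm" read as a little-endian word, as given in the header format. */
#define OS_IMAGE_MAGIC 0x6D49534Fu

/* Hypothetical C view of the five header words (Word 0 to Word 4). */
typedef struct {
    uint32_t magic;          /* Word 0: magic word                     */
    uint32_t flags;          /* Word 1: flags (currently should be 0)  */
    uint32_t image_size;     /* Word 2: image size in bytes            */
    uint32_t table_offset;   /* Word 3: offset from OS base to table   */
    uint32_t entry_count;    /* Word 4: number of entries in table     */
} os_image_header;

/*
 * Return a pointer to the OS entry table, or NULL if no plausible header
 * is found at os_base. On success *count_out receives the entry count.
 * The alignment and bounds checks are assumptions made for this sketch.
 */
const uint32_t *os_entry_table(const void *os_base, uint32_t *count_out)
{
    const os_image_header *hdr = (const os_image_header *)os_base;
    uint64_t table_end;

    if (hdr->magic != OS_IMAGE_MAGIC)
        return NULL;
    if (hdr->table_offset % 4 != 0)
        return NULL;

    /* 64-bit arithmetic so a corrupt header cannot overflow the sum. */
    table_end = (uint64_t)hdr->table_offset + 4u * (uint64_t)hdr->entry_count;
    if (table_end > hdr->image_size)
        return NULL;

    *count_out = hdr->entry_count;
    return (const uint32_t *)((const uint8_t *)os_base + hdr->table_offset);
}
```

A HAL probing for an OS image could call os_entry_table() on each candidate base address and treat a NULL result as "no image here".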
+ +When using assembler, the register usage may seem somewhat restricted, and +cumbersome for more than 4 arguments. However, it is typically a reasonable +balance for function calls (as a PCS would aim to be), and does not preclude +implementation in C for example. Old kernel code may require register +preserving overhead to insert HAL calls easily, but for most calls this is +insignificant, compared to hardware access costs. + +Initialisation sequence +----------------------- + +After system reset, bootstrap code in the HAL will do minimal hardware +set-up ... blah blah + +HAL entry points +---------------- + +These routines are expected to be called from the OS (Kernel). See the +'Calling standards' section for general information on register usage and so +forth. + +Interrupts +---------- + +The HAL must provide the ability to identify, prioritise and mask IRQs, and the ability +to mask FIQs. RISC OS supplies the ARM's processor vectors, and on an IRQ calls the HAL +to request the identity of the highest priority interrupt. + +IRQ and FIQ device numbers are arbitrary, varying from system to system. They should be +arranged to allow quick mappings to and from hardware registers, and should ideally +be packed, starting at 0. + +Timers +------ + +The HAL must supply at least one timer capable of generating periodic +interrupts. Each timer should generate a separate logical interrupt, and the +interrupt must be latched. The timers must either be variable rate (period is +a multiple of a basic granularity), or be fixed rate (period = 1*granularity). +Optionally, the timer should be capable of reporting the time until the +next interrupt, in units of the granularity. + +Counter +------- + +The HAL must supply a counter that varies rapidly, appropriate for use for +sub-millisecond timing. On many systems, this counter will form part of +timer 0 - as such it is not required to operate when timer 0 is not running. 
+On other systems, the periodic timers may have no readable latch, and a +separate unit will be required. + +The counter should count down from (period-1) to 0 continuously. + +Non-volatile memory +------------------- + +The HAL should provide at least 240 bytes of non-volatile memory. If no +non-volatile memory is available, the HAL may provide fake NVRAM contents +suitable for RISC OS - however, it is preferable that the HAL just state +that NVRAM is not available, and RISC OS will act as though a CMOS reset has +been performed every reset. + +NVRAM is typically implemented as an IIC device, so the calls are permitted +to be slow, and to enable interrupts. The HAL is not expected to cache +contents. + +If the HAL has no particular knowledge of NVMemory, then it may just say +that "NVMemory is on IIC", and the OS will probe for CMOS/EEPROM devices on +the IIC bus. + +IIC bus +------- + +Many hardware designs have an IIC bus. Often, it is used only to support +non-volatile memory, but in other systems TV tuners, TV modulators, +microcontrollers, and arbitrary expansion cards may be fitted. + +Low-level and high level APIs are defined. An arbitrary number of buses is +supported, and each can be controlled by either the low or high level API. +The OS should normally only use one fixed API on each bus - mixing APIs is +unpredictable. + +The low-level API requires the OS to control the two lines of the bus +directly. The high-level API currently covers version 2.1 of the IIC +protocol, and allows high-level transactions to be performed. + +It is expected that a HAL will always provide the low-level API on each bus, +where possible in hardware. Using this, the OS can provide Fast mode single +or multi-master operation. The HAL may wish to provide the high-level API +where a dedicated IIC port with hardware assistance is available; this will +further permit High-speed and slave operation. 
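
To give a feel for what direct control of the two bus lines involves, here is a minimal C sketch of generating IIC START and STOP conditions. It is not the HAL interface: the callback type, the function names and the line encoding (1 = released/high, 0 = driven low) are all assumptions made for illustration, and bus timing delays are omitted. The structure mirrors the IICLines type given later in this section. A small trace recorder is included so the sequence can be inspected without hardware.

```c
/* Mirrors the document's IICLines structure. The encoding used here
   (1 = line released/high, 0 = line driven low) is an assumption. */
typedef struct { int SDA, SCL; } IICLines;

/* Hypothetical callback standing in for the low-level set-lines call. */
typedef void (*iic_set_lines_fn)(IICLines lines, void *ctx);

/* START: SDA falls while SCL is high, then SCL is taken low. */
void iic_send_start(iic_set_lines_fn set_lines, void *ctx)
{
    set_lines((IICLines){ 1, 1 }, ctx);   /* bus idle: both lines high  */
    set_lines((IICLines){ 0, 1 }, ctx);   /* SDA low while SCL high     */
    set_lines((IICLines){ 0, 0 }, ctx);   /* SCL low, ready for data    */
}

/* STOP: SDA rises while SCL is high. */
void iic_send_stop(iic_set_lines_fn set_lines, void *ctx)
{
    set_lines((IICLines){ 0, 0 }, ctx);   /* both low                   */
    set_lines((IICLines){ 0, 1 }, ctx);   /* SCL high, SDA still low    */
    set_lines((IICLines){ 1, 1 }, ctx);   /* SDA high while SCL high    */
}

/* Trace recorder: stores each requested line state instead of driving
   real hardware. Purely a test aid for this sketch. */
#define IIC_TRACE_MAX 8

typedef struct {
    IICLines seen[IIC_TRACE_MAX];
    int count;
} iic_trace;

void iic_trace_set_lines(IICLines lines, void *ctx)
{
    iic_trace *trace = (iic_trace *)ctx;
    if (trace->count < IIC_TRACE_MAX)
        trace->seen[trace->count++] = lines;
}
```

On real hardware the callback would write the line states to the bus and the caller would insert the delays required by the bus speed in use.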
+
+As it is possible that some HAL APIs (eg NVMemory), although abstracted at
+this API layer, are still actually implemented by an IIC device, a matching
+set of high-level IIC calls is provided in the OS. These give the HAL access
+to the OS IIC engine, which will make low-level HAL calls. This saves the HAL
+from implementing the full IIC protocol. To illustrate this diagrammatically:
+
+   +----------+  NVMem_Read  +------------+  NVMemoryRead   +------------+
+   |          |  --------->  |            |  ------------>  |            |
+   |   App    |              |     OS     |  IICTransmit    |    HAL     |
+   |          |              |            |  <------------  |            |
+   |          |              |            |  IICSetLines    |            |
+   |          |              |            |  ------------>  |            |
+   +----------+              +------------+                 +------------+
+
+The low-level calls should be fast. Interrupt status may not be altered.
+
+The following structure is used:
+
+   typedef struct { int SDA, SCL; } IICLines;
+
+High level API to be defined ...
+
+Video
+-----
+
+The HAL only attempts to abstract the hardware controller aspects of the OS
+video. It does not (yet) consider pixel formats, framestore layout, or
+hardware graphics acceleration. All these would affect a great deal of RISC
+OS graphics code that forms much of the value of the OS. This means that the
+envisaged HAL/RISC OS combination makes some specific assumptions about
+graphics framestore layout as follows:
+
+   - memory mapped framestore
+   - expected to be contiguous physical memory, can be specific memory (eg.
VRAM) + - mapped as contiguous logical memory + - progressive raster scan in logical memory from top left pixel to bottom right + - start of each raster row must be word aligned + - number of pixels in a row should be such that row is a whole number of words + - spacing between start of each row is a constant number of words, possibly + greater than row length (via mode variable, LineLength) + - 1,2,4,8,16 or 32 bits per pixel (bpp) + - little endian pixel packing for 1,2,4 bpp (least significant bits are + leftmost pixels) + - presence of palette assumed for 1,2,4,8 bpp (8-bits per r,g,b component in + each entry) + - 16 bpp format: + bits 0-4 Red + 5-9 Green + 10-14 Blue + 15 Supremacy (0=solid, 1=transparent) + - 32 bpp format: + bits 0-7 Red + 8-15 Green + 16-23 Blue + 24-31 Supremacy (0=solid, 255=transparent) + - palette words are 32 bits: + bits 0-7 Reserved (0), or Supremacy (0=solid, 255=transparent) + 8-15 Red + 16-23 Green + 24-31 Blue + - pointer/cursor is assumed supported in hardware, 32x32 pixels, + each pixel either transparent or one of 3 paletted colours + - support for physically interlaced, logically progressive framestore via + MMU tricks and use of LineLength mode variable, currently not fully + integrated into kernel + +Note that it is possible to support hardware where only some pixel depths +are available, or only some fit the RISC OS assumptions. Also some hardware +has some configurability for 'arbitrary' choices like RGB versus BGR +ordering. Hence, the restrictions are typically much less severe than might +first be thought. + +Supporting a software only pointer/cursor is feasible (much less work than +new pixel formats) but not yet considered. + +Aside: RISC OS video interlace trick +------------------------------------ + +Has been used in NC/STB variants. 
Makes a physically interlaced framestore +(two distinct field stores) appear as logically progressive framestore, +using MMU to map many logical copies, and using freedom to choose a constant +logical increment between rows in RO mode definition. For 576 rows say, uses +576M of logical space. Each 1M (section mapped) supports a row and allows +logical address to increment monotonically, as physical address alternates +between (increasing rows of) physical field stores. Currently not integrated +into kernel, so fudges address space allocation and poking of video +variables. Also has drawback of thrashing data TLBs (one entry per row). + +The trick requires the physical field stores to be separated by 1M plus half +a row. The logical spacing between rows is also set to 1M plus half a row. +The 1M logical sections are set to map alternately to the even and odd +physical fields (the second field being offset by half a row relative to 1M +alignment). Then the logical incrementing of rows maps alternately between +fields, incrementing physically by 1 row between visits to the same field. +Note that the multiple logical mapping implies uncached screen to avoid +coherency worries, but RO uses uncached screen anyway (with exception of +Ursula/Phoebe, now defunct). + + +Routines in detail +------------------ + +[Note, plonking all routines here possibly only temporarily. May want +routines listed in relevant sections with overview. eg. video routines +with video section, etc.] + +-- HAL_Init(unsigned int *riscos_header) + +The OS will call HAL_Init after enabling the MMU, and initialising the HAL +workspace (filled with 0). At this point any initialisation for the main HAL +routines (rather than the early bootstrap code in the HAL) can be done. + +-- HAL_IRQEnable + +???? + +-- HAL_IRQDisable + +???? + +-- HAL_IRQClear + +???? + +-- HAL_IRQSource + +???? + +-- int HAL_Timers(void) + +Returns number of timers. Timers are numbered from 0 upwards. Timer 0 must +exist. 
+ +-- int HAL_TimerDevice(int timer) + +Returns device number of timer n. A device number refers to the IRQ device +number for interrupt calls. + +-- unsigned int HAL_TimerGranularity(int timer) + +Returns basic granularity of timer n in ticks per second. + +-- unsigned int HAL_TimerMaxPeriod(int timer) + +Returns maximum period of the timer, in units of Granularity. Will be 1 for +a fixed rate timer. + +-- void HAL_TimerSetPeriod(int timer, unsigned int period) + +Sets period of timer n. If period > 0, the timer will generate interrupts +every (period / granularity) seconds. If period = 0, the timer may be +stopped. This may not be possible on some hardware, so the corresponding +interrupt should be masked in addition to calling this function with period +0. If period > maxperiod, behaviour is undefined. + +-- unsigned int HAL_TimerPeriod(int timer) + +Reads period of timer n. This should be the actual period in use by the +hardware, so if for example period 0 was requested and impossible, the +actual current period should be reported. + +-- unsigned int HAL_TimerReadCountdown(int timer) + +Returns the time until the next interrupt in units of granularity, rounded +down. If not available, 0 is returned. + +-- unsigned int HAL_CounterRate(void) + +Returns the rate of the counter in ticks per second. Typically will equal +HAL_TimerGranularity(0). + +-- unsigned int HAL_CounterPeriod(void) + +Returns the period of the counter, in ticks. Typically will equal +HAL_TimerPeriod(0). + +-- unsigned int HAL_CounterRead(void) + +Reads the current counter value. Typically will equal +HAL_TimerReadCountdown(0). + +-- void HAL_CounterDelay(unsigned int microseconds) + +Delay for at least the specified number of microseconds.
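[Illustration only, not part of the API.] The timer and counter calls above imply two small pieces of arithmetic on the OS side: choosing a period for HAL_TimerSetPeriod from the reported granularity and maximum period, and measuring elapsed ticks on a counter that counts down from (period-1) to 0 and wraps. A minimal sketch follows. The helper names timer_period_for_rate and counter_elapsed are hypothetical, and the 2MHz granularity in the usage note is an assumed example value, not something this document specifies.

```c
#include <assert.h>

/* Hypothetical helper: period (in ticks) to pass to HAL_TimerSetPeriod
 * for a wanted interrupt rate. granularity and max_period are the values
 * reported by HAL_TimerGranularity and HAL_TimerMaxPeriod. The result is
 * clamped to 1..max_period, since period 0 means "stop" and a period
 * above the maximum is undefined. */
unsigned int timer_period_for_rate(unsigned int granularity,
                                   unsigned int max_period,
                                   unsigned int rate_hz)
{
    unsigned int period;

    assert(rate_hz > 0 && max_period > 0);
    period = granularity / rate_hz;
    if (period < 1)
        period = 1;
    if (period > max_period)
        period = max_period;
    return period;
}

/* Hypothetical helper: ticks elapsed between two HAL_CounterRead values,
 * for a counter that counts down from (period-1) to 0 and then wraps.
 * Only valid if less than one full period passed between the reads. */
unsigned int counter_elapsed(unsigned int earlier, unsigned int later,
                             unsigned int period)
{
    assert(period > 0 && earlier < period && later < period);
    if (earlier >= later)
        return earlier - later;            /* no wrap between reads */
    return earlier + (period - later);     /* counter wrapped once  */
}
```

For example, with an assumed granularity of 2000000 ticks per second and a maximum period of 65536, a 100Hz tick needs period 20000, while a 10Hz request is clamped to 65536. A fixed rate timer (maximum period 1) always yields period 1.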
+ +-- unsigned int HAL_NVMemoryType(void) + +Returns a flags word describing the NVMemory + bits 0-7: 0 => no NVMemory available + 1 => NVMemory may be available on the IIC bus + 2 => NVMemory is available on the IIC bus, and the + device characteristics are known + 3 => the HAL provides NVMemory access calls. + bit 8: NVMemory has a protected region at the end + bit 9: Protected region is software deprotectable + bit 10: Memory locations 0-15 are readable + bit 11: Memory locations 0-15 are writeable + +If bits 0-7 are 0 or 1 no other NVMemory calls need be available, and bits +8-31 should be zero. + +If bits 0-7 are 2, Size, ProtectedSize, Protection and IICAddress calls must +be available. + +If bits 0-7 are 3, all calls except IICAddress must be available. + +-- unsigned int HAL_NVMemorySize(void) + +Returns the number of bytes of non-volatile memory available. Bytes 0-15 +should be included in the count, so for example a Philips PCF8583 CMOS/RTC +device (as used in the Archimedes and Risc PC) would be described as a +256-byte device, with locations 0-15 not readable. More complex arrangements +would have to be abstracted out by the HAL providing its own NVMemory access +calls. + +This is to suit the current RISC OS Kernel, which does not use bytes 0-15. + +-- unsigned int HAL_NVMemoryProtectedSize(void) + +Returns the number of bytes of NVMemory that are protected. These should be +at the top of the address space. The OS will not attempt to write to those +locations without first requesting deprotection (if available). Returns 0 if +bit 8 of the flags is clear. + +-- void HAL_NVMemoryProtection(bool) + +Enables (if true) or disables (if false) the protection of the software +protectable region. Does nothing if bits 8 and 9 not both set. + +-- unsigned int HAL_NVMemoryIICAddress(void) + +Returns a word describing the addressing scheme of the NVRAM. + bits 0-7: IIC address + +This will always be on bus zero.
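[Illustration only, not part of the API.] The HAL_NVMemoryType flags word described above might be decoded by the OS along the following lines. This is a sketch under assumptions: the enum, macro and function names are hypothetical, and only the bit positions are taken from the description of the call.

```c
#include <assert.h>

/* Bits 0-7 of the HAL_NVMemoryType flags word (names are hypothetical). */
enum nvmem_kind {
    NVMEM_NONE      = 0,  /* no NVMemory available                     */
    NVMEM_MAYBE_IIC = 1,  /* may be on the IIC bus, OS should probe    */
    NVMEM_IIC_KNOWN = 2,  /* on the IIC bus, characteristics known     */
    NVMEM_HAL_CALLS = 3   /* HAL provides NVMemory read/write calls    */
};

#define NVMEM_PROTECTED      (1u << 8)   /* protected region at the end  */
#define NVMEM_DEPROTECTABLE  (1u << 9)   /* region can be deprotected    */
#define NVMEM_LOW16_READABLE (1u << 10)  /* locations 0-15 are readable  */
#define NVMEM_LOW16_WRITABLE (1u << 11)  /* locations 0-15 are writeable */

/* Extract the NVMemory kind; values above 3 are not allocated. */
enum nvmem_kind nvmem_kind_of(unsigned int flags)
{
    unsigned int kind = flags & 0xFFu;

    assert(kind <= 3u);
    return (enum nvmem_kind)kind;
}

/* HAL_NVMemoryProtection only has an effect if bits 8 and 9 are both set. */
int nvmem_can_deprotect(unsigned int flags)
{
    unsigned int both = NVMEM_PROTECTED | NVMEM_DEPROTECTABLE;

    return (flags & both) == both;
}

/* HAL_NVMemoryIICAddress is only required when bits 0-7 are 2. */
int nvmem_needs_iic_address(unsigned int flags)
{
    return nvmem_kind_of(flags) == NVMEM_IIC_KNOWN;
}
```

Under this reading, a flags word of 2 would describe a known IIC device with no protected region and with locations 0-15 neither readable nor writeable.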
+ +-- int HAL_NVMemoryRead(unsigned int addr, void *buffer, unsigned int n) + +Reads n bytes of memory from address addr onwards into the buffer supplied. +Returns the number of bytes successfully read. Under all normal +circumstances the return value will be n - if it is not, a hardware failure +is implied. Behaviour is undefined if the address range specified is outside +the NVMemory, or inside bytes 0-15, if declared unavailable. + +-- int HAL_NVMemoryWrite(unsigned int addr, void *buffer, unsigned int n) + +Write n bytes of memory into address addr onwards from the buffer supplied. +Returns the number of bytes successfully written. Under all normal +circumstances the return value will be n - if it is not, a hardware failure +is implied. Behaviour is undefined if the address range specified is outside +the NVMemory. Writes inside a protected region should be ignored. + +-- int HAL_IICBuses(void) + +Returns the number of IIC buses on the system. + +-- unsigned int HAL_IICType(int bus) + +Returns a flag word describing the specified IIC bus. + bit 0: Bus supplies the low-level API + bit 1: Bus supplies the high-level API + bit 2: High-level API supports multi-master operation + bit 3: High-level API supports slave operation + bit 16: Bus supports Fast (400kbps) operation + bit 17: Bus supports High-speed (3.4Mbps) operation + bits 20-31: Version number of IIC supported by high-level API, * 100. + + +-- __value_in_regs IICLines HAL_IICSetLines(int bus, IICLines lines) + +Sets the SDA and SCL lines on the specified bus. A 0 value represents logic +LOW, 1 logic HIGH. The function then reads back and returns the values +present on the bus, to permit arbitration. + +Note the "__value_in_regs" keyword, which signifies that the binary ABI +expects SDA and SCL to be returned in registers a1 and a2. + +-- __value_in_regs IICLines HAL_IICReadLines(int bus) + +Reads the state of the IIC lines on the specified bus, without changing +their state.
+ +Note the "__value_in_regs" keyword, which signifies that the binary ABI +expects SDA and SCL to be returned in registers a1 and a2. + +-- int HAL_VideoFlybackDevice(void) + +Returns the device number of the video flyback interrupt. [Note: HAL +interrupt API possibly subject to change, may affect this call.] + +-- void HAL_Video_SetMode(const void *VIDCList3) + +Programs the video controller to initialise a display mode. RISC OS passes a +standard VIDC List Type 3 as specified in PRM 5a-125. Note that this is a +generic video controller list, and so VIDC in this context does not refer to +any specific devices such as Acorn VIDC20. + +The HAL is expected to set the video controller timings on this call. Any +palette, pixel DMA and hardware cursor settings are controlled via other +calls. + +-- void HAL_Video_WritePaletteEntry(uint type, uint pcolour, uint index) + +Writes a single palette entry to the video controller. + + type = 0 for normal palette entry + 1 for border colour + 2 for pointer colour + >= 3 reserved + + pcolour = palette entry colour in BBGGRRSS format (Blue,Green,Red,Supremacy) + + index = index of entry + +Indices are in the range 0..255 for normal, 0 for border, 0..3 for pointer +colours. Note that RISC OS only makes calls using 1..3 for the pointer, and +pointer colour 0 is assumed to be transparent. + +-- void HAL_Video_WritePaletteEntries(uint type, const uint *pcolours, + uint index, uint Nentries) + +Writes a block of palette entries to the video controller. + + type = 0 for normal palette entry + 1 for border colour + 2 for pointer colour + >= 3 reserved + + pcolours = pointer to block of palette entry colours in BBGGRRSS format + (Blue,Green,Red,Supremacy) + + index = start index in palette (for first entry in block) + + Nentries = number of entries in block (must be >= 1) + +Indices are in the range 0..255 for normal, 0 for border, 0..3 for pointer +colours. 
Note that RISC OS only makes calls using 1..3 for the pointer, and +pointer colour 0 is assumed to be transparent. + +-- uint HAL_Video_ReadPaletteEntry(uint type, uint pcolour, uint index) + +Returns the effective palette entry after taking into account any hardware +restrictions in the video controller, assuming it was originally programmed +with the value pcolour. + + type = 0 for normal palette entry + 1 for border colour + 2 for pointer colour + >= 3 reserved + + pcolour = palette entry colour in BBGGRRSS format (Blue,Green,Red,Supremacy) + + index = index of entry + + returns : effective BBGGRRSS + +Indices are in the range 0..255 for normal, 0 for border, 0..3 for pointer +colours. Note that RISC OS only makes calls using 1..3 for the pointer, and +pointer colour 0 is assumed to be transparent. + +Depending on hardware capabilities, HALs may have to remember current +settings (eg. bits per pixel) or keep soft copies of entries. Because this +call supplies the original pcolour, this need is minimised (some HALs can +just return pcolour or a directly modified pcolour). + +-- void HAL_Video_SetInterlace(uint interlace) + +Sets the video interlaced sync. + + interlace = 0 or 1 for interlace off or on + (all other values reserved) + +-- void HAL_Video_SetBlank(uint blank, uint DPMS) + + blank = 0 or 1 for unblank or blank + (all other values reserved) + + DPMS = 0..3 as specified by monitor DPMSState (from mode file) + 0 for no DPMS power saving + +The HAL is expected to attempt to turn syncs off according to DPMS, and to +turn video DMA off for blank (and therefore on for unblank) if possible. The +HAL is not expected to do anything else, eg. blank all palette entries. Such +things are the responsibility of the OS, and also this call is expected to +be fast. May be called with interrupts off.
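[Illustration only, not part of the API.] The BBGGRRSS palette word used by the palette calls above, and the idea of a "directly modified pcolour" returned by HAL_Video_ReadPaletteEntry, can be sketched as follows. The function names are hypothetical. The controller with only 4 bits per colour component and no supremacy support is an assumed example, not a device described in this document. A 32-bit unsigned int is assumed.

```c
#include <assert.h>

/* Pack 8-bit components into a BBGGRRSS palette word:
 * bits 24-31 Blue, 16-23 Green, 8-15 Red, 0-7 Supremacy. */
unsigned int palette_pack(unsigned int red, unsigned int green,
                          unsigned int blue, unsigned int supremacy)
{
    assert(red <= 255u && green <= 255u && blue <= 255u && supremacy <= 255u);
    return (blue << 24) | (green << 16) | (red << 8) | supremacy;
}

/* Effective entry for an assumed controller that stores only the top
 * 4 bits of each colour component and has no supremacy bits. The top
 * nibble is replicated into the bottom nibble so that full intensity
 * still reads back as 0xFF, and the supremacy byte reads back as 0.
 * No soft copy is needed because pcolour is supplied by the caller. */
unsigned int palette_effective_4bit(unsigned int pcolour)
{
    unsigned int top = pcolour & 0xF0F0F000u;  /* top nibble of B, G, R */

    return top | (top >> 4);
}
```

With these definitions, palette_pack(0xFF, 0x80, 0x00, 0) gives &0080FF00, and palette_effective_4bit(&12345678) gives &11335500.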
+ +-- void HAL_Video_SetPowerSave(uint powersave) + + powersave = 0 or 1 for power save off or on + (all other values reserved) + +The HAL is expected to perform any reasonable measures on the video +controller to save power (eg. turn off DACs), when the display is assumed +not to be required. Blanking is handled by a separate call. + +[What does this really mean. What is acceptable and safe for displays? ] + +-- void HAL_Video_UpdatePointer(uint flags, int x, int y, const shape_t *shape) + +Update the displayed position of the current pointer shape (or turn shape +off). This call is made by the OS at a time to allow smoothly displayed +changes (on a VSync). + + flags: + bit 0 = pointer display enable (0=off, 1=on) + bit 1 = pointer shape update (0=no change, 1=updated) + bits 2..31 reserved (0) + + x = x position of top left of pointer (x = 0 for left of display) + + y = y position of top left of pointer (y = 0 for top of display) + + shape points to shape_t descriptor block: + typedef struct shape_t + { + uint8 width; /* unpadded width in bytes (see notes) */ + uint8 height; /* in pixels */ + uint8 padding[2]; /* 2 bytes of padding for field alignment */ + void *buffLA; /* logical address of buffer holding pixel data */ + void *buffPA; /* corresponding physical address of buffer */ + } shape_t; + +Notes: +1) if flags bit 0 is 0 (pointer off), x, y, shape are undefined +2) the shape data from RISC OS is always padded with transparent pixels + on the rhs, to a width of 32 pixels (8 bytes) +3) pointer clipping is the responsibility of the HAL (eg. may be able to + allow display of pointer in border region on some h/w) +4) buffer for pixel data is aligned to a multiple of 256 bytes or better + +The HAL may need to take note of the shape updated flag, and make its own +new copies if true. This is to handle cases like dual scan LCD pointer, +which typically needs two or more shape buffers for the hardware, or +possibly to handle clipping properly.
This work should only be done when the +updated flag is true. + +A simple HAL, where hardware permits, can use the shape data in the buffer +directly, ignoring the updated flag. The OS guarantees that the buffer data +is valid for the whole time it is to be displayed. + +-- void HAL_Video_SetDAG(uint DAG, uint paddr) + +Set the video DMA address generator value to the given physical address. + + DAG = 0 set start address of current video display + 1 set start address of total video buffer + 2 set end address (exclusive) of total video buffer + all other values reserved + + paddr = physical address for given DAG + +The OS has a video buffer which is >= total display size, and may be using +bank switching (several display buffers) or hardware scroll within the total +video buffer. + + DAG=1 will be start address of current total video buffer + DAG=2 will be end address (exclusive) of current total video buffer + DAG=0 will be start address in buffer for current display + +HALs should respond differently depending on whether hardware scroll is +supported or not. (The OS will already know this from HAL_Video_Features). + +No hardware scroll: +Only DAG=0 is significant, and the end address of the current display is +implied by the size of the current mode. Calls with DAG=1,2 should be +ignored. + +Hardware scroll: +DAG=0 again defines display start. DAG=2 defines the last address +(exclusive) that should be displayed before wrapping back (if reached within +display size), and DAG=1 defines the address to which accesses should wrap +back. + +-- int HAL_Video_VetMode(const void *VIDClist, const void *workspace) + +Allows HAL to vet a proposed mode. + +[What does this really do, and what can HAL do. Are we going to allow +changes to VIDCList by HAL, ie. not const. Is mode workspace really ok to +pass to HAL ???] 
+ + VIDClist -> generic video controller list (VIDC list type 3) + + workspace -> mode workspace (if mode number), or 0 + + returns 0 if OK (may be minor adjusts to VIDClist and/or workspace values) + non-zero if not OK + + +-- uint HAL_Video_Features(void) + +Determine key features supported by the video hardware. + + returns a flags word: + bit 0 hardware scroll is supported + bit 1 hardware pointer/cursor is supported + bit 2 interlace is supported with progressive framestore + other bits reserved (returned as 0) + +Bits are set for true. If bit 2 is true, then the OS assumes that a simple +progressive framestore layout is sufficient for an interlaced display (ie. +that the hardware implements the interlaced scan). + +-- uint HAL_Video_PixelFormats(void) + +Determine the pixel formats that are supported by the hardware. + + returns flags word: + bit 0 1 bpp is supported + bit 1 2 bpp is supported + bit 2 4 bpp is supported + bit 3 8 bpp is supported + bit 4 16 bpp is supported + bit 5 32 bpp is supported + other bits reserved (returned as 0) + +Bits are set for true. Bits 0-5 refer to support with standard RISC OS pixel +layout. (such as little endian packing for 1,2,4 bpp, 5-5-5 RGB for 16 bpp, +etc). See the section discussing Video for more information. Other formats +may be introduced when/if RO supports them. + +-- uint HAL_Video_BufferAlignment(void) + +Determine the framestore buffer alignment required by the hardware. + + returns an unsigned integer: + the required alignment for the framestore buffer, in bytes + (expected to be a power of 2) + + +-- HAL_MatrixColumns + +??? + +-- HAL_MatrixScan + +??? + +-- HAL_TouchscreenType + +??? + +-- HAL_TouchscreenRead + +??? + +-- unsigned int64 HAL_MachineID(void) + +Returns a 64-bit unique machine identifier. What does it mean? ... + +-- void *HAL_ControllerAddress(unsigned flags, unsigned controller) + +Maps to RISC OS' OS_Memory 9 call - provides a way for people who must poke +the hardware to find it. 
Bits 0-7 of controller are the sequence number +(starting at zero), and bits 8-31 are the controller type. Currently +allocated types are: + + 0 = EASI card access speed control register (sequence no = card) + 1 = EASI space (sequence no = card) + 2 = VIDC1 + 3 = VIDC20 + 4 = IOMD + + HALEntry HAL_HardwareInfo + HALEntry HAL_SuperIOInfo + + +RISC OS entry points from HAL init +---------------------------------- + +These are entry points into the OS, called from the HAL. + +-- void RISCOS_InitARM(unsigned int flags) + + flags: reserved - sbz + +On entry: + SVC mode + MMU and caches off + IRQs and FIQs disabled + No RAM or stack used + +On exit: + Instruction cache may be on + +This routine must be called once very early on in the HAL start-up, to +accelerate the CPU for the rest of HAL initialisation. Typically, it will +just enable the instruction cache (if possible on the ARM in use), and +ensure that the processor is in 32-bit configuration and mode. + +Some architecture 4 (and later) ARMs have bits in the control register that +affect the hardware layer - eg the iA and nF bits in the ARM920T. These are +the HAL's responsibility - the OS will not touch them. Conversely, the HAL +should not touch the cache, MMU and core configuration bits (currently bits +0-14). + +On architecture 3, the control register is write only - the OS will set bits +11-31 to zero. + +Likewise, such things as the StrongARM 110's register 15 (Test, Clock and +Idle Control) are the HAL's responsibility. The OS does not know about the +configuration of the system, so cannot program such registers. + +This entry must not be called after RISCOS_Start. 
+ +-- void *RISCOS_AddRAM(unsigned int flags, void *start, void *end, + uintptr_t sigbits, void *ref) + flags + bit 0: video memory (only first contiguous range will be used) + bits 8-11: speed indicator (arbitrary, higher => faster) + other bits reserved (SBZ) + start + start address of RAM (inclusive) (no alignment requirements) + end + end address of RAM (exclusive) (no alignment requirements, but must be >= start) + sigbits + significant address bit mask (1 => this bit of addr decoded, 0 => this bit ignored) + ref + reference handle (NULL for first call) + +Returns ref for next call + +On entry: + SVC32 mode + MMU and data cache off + IRQs and FIQs disabled + +This entry point must be the first call from the HAL to RISC OS following a hardware +reset. It may be called as many times as necessary to enumerate all RAM that +is available for general purpose use. It should only be called to declare video +memory if the video memory may be used as normal RAM when in small video modes. + +To permit software resets: + The HAL must be non-destructive of any declared RAM outside the first 4K of the first + block. + The stack pointer should be initialised 4K into the first block, or in some non- + declared RAM. + Must present memory in a fixed order on any given system. + +The first block must be at least 256K and 16K aligned. +Block coalescing only works well if RAM banks are added in ascending address order. + +RISC OS will use RAM at the start of the first block as initial workspace. +Max usage is 16 bytes per block + 32 (currently 8 per block + 4). This +limits the number of discontiguous blocks (although RISC OS will concatenate +contiguous blocks where possible). + +This call must not be made after RISCOS_Start.
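[Illustration only, not part of the API.] Before calling RISCOS_AddRAM, a HAL might build the flags word and check the first block against the constraints above (at least 256K, 16K aligned). A minimal sketch, assuming hypothetical helper names addram_flags and first_block_ok. The addresses in the usage note are made-up examples.

```c
#include <assert.h>
#include <stdint.h>

/* Build the RISCOS_AddRAM flags word: bit 0 marks video memory,
 * bits 8-11 hold the arbitrary speed indicator (higher => faster),
 * all other bits are left zero (SBZ). */
unsigned int addram_flags(int is_video, unsigned int speed)
{
    assert(speed <= 15u);              /* only 4 bits are available */
    return (is_video ? 1u : 0u) | (speed << 8);
}

/* Check the extra constraints on the first block declared:
 * end is exclusive and >= start, the block is at least 256K long,
 * and its start is 16K aligned. */
int first_block_ok(uintptr_t start, uintptr_t end)
{
    const uintptr_t min_size = 256u * 1024u;
    const uintptr_t align    = 16u * 1024u;

    if (end < start)
        return 0;
    if (end - start < min_size)
        return 0;
    return (start & (align - 1u)) == 0;
}
```

For example, a block from &10000000 to &10040000 (exactly 256K) passes the check, while one starting at &10001000 fails it because the start is only 4K aligned. The ref value returned by each RISCOS_AddRAM call would then be passed to the next call, with NULL on the first.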
+ + +-- void RISCOS_Start(unsigned int flags, int *riscos_header, + int *hal_entry_table, void *ref) + + flags + bit 0: power on reset + bit 1: CMOS reset inhibited (eg protection link on Risc PC) + bit 2: perform a CMOS reset (if bit 1 clear and bit 0 set - eg front panel + button held down on an NC) + +On entry: + SVC32 mode + MMU and data cache off + IRQs and FIQs disabled + +This routine must be called after all calls to RISCOS_AddRAM have been +completed. It does not return. Future calls back to the HAL are via the HAL +entry table, after the MMU has been enabled. + + +-- void *RISCOS_MapInIO(unsigned int flags, void *phys, unsigned int size) + + flags: bit 2 => make memory bufferable + phys: physical address to map in + size: number of bytes of memory to map in + +This routine is used to map in IO memory for the HAL's usage. Normally it +would only be called during HAL_Init(). Once mapped in the IO space cannot +be released. + +It returns the resultant virtual address corresponding to phys, or 0 for +failure. Failure can only occur if no RAM is available for page tables, or +if the virtual address space is exhausted. + +-- void *RISCOS_AccessPhysicalAddress(unsigned int flags, void *phys, void **oldp) + + flags: bit 2 => make memory bufferable + other bits must be zero + phys: physical address to access + oldp: pointer to location to store old state (or NULL) + +On entry: + Privileged mode + MMU on + FIQs on + Re-entrant + +On exit: + Returns logical address corresponding to phys + +Arranges for the physical address phys to be mapped in to logical memory. In +fact, the whole megabyte containing "phys" is mapped in (ie if phys = +&12345678, then &12300000 to &123FFFFF become available). The memory is +supervisor access only, non-cacheable, non-bufferable by default, and will +remain available until the next call to RISCOS_Release/AccessPhysicalAddress +(although interrupt routines or subroutines may temporarily map in something +else). 
+ +When finished, the user should call RISCOS_ReleasePhysicalAddress. + +-- void RISCOS_ReleasePhysicalAddress(void *old) + + old: state returned from a previous call to RISCOS_AccessPhysicalAddress + +On entry: + MMU on + FIQs on + Re-entrant + +Usage: + Call with the value output from a previous RISCOS_AccessPhysicalAddress. + +Example: + + void *old; + unsigned int *addr = (unsigned int *) 0x80005000; + unsigned int *addr2 = (unsigned int *) 0x90005000; + + addr = (unsigned int *) RISCOS_AccessPhysicalAddress(0, addr, &old); + addr[0] = 3; addr[1] = 5; + + addr2 = (unsigned int *) RISCOS_AccessPhysicalAddress(0, addr2, NULL); + *addr2 = 7; + + RISCOS_ReleasePhysicalAddress(old); diff --git a/VersionASM b/VersionASM index 1e37868398fd9379e4150d4584b0c3b6a0391423..095b5cdf02a123d842cf3364706b7c3098f4dc6a 100644 --- a/VersionASM +++ b/VersionASM @@ -13,12 +13,12 @@ GBLS Module_ComponentPath Module_MajorVersion SETS "5.35" Module_Version SETA 535 -Module_MinorVersion SETS "4.79.2.14" -Module_Date SETS "09 Jan 2001" -Module_ApplicationDate2 SETS "09-Jan-01" -Module_ApplicationDate4 SETS "09-Jan-2001" +Module_MinorVersion SETS "4.79.2.15" +Module_Date SETS "12 Jan 2001" +Module_ApplicationDate2 SETS "12-Jan-01" +Module_ApplicationDate4 SETS "12-Jan-2001" Module_ComponentName SETS "Kernel" Module_ComponentPath SETS "RiscOS/Sources/Kernel" -Module_FullVersion SETS "5.35 (4.79.2.14)" -Module_HelpVersion SETS "5.35 (09 Jan 2001) 4.79.2.14" +Module_FullVersion SETS "5.35 (4.79.2.15)" +Module_HelpVersion SETS "5.35 (12 Jan 2001) 4.79.2.15" END diff --git a/VersionNum b/VersionNum index ffc1a7eadb0301611a8380f6c5f039ddd1a29ab3..686bfb8b6ff1db59f969b92524231593da1dc8d9 100644 --- a/VersionNum +++ b/VersionNum @@ -4,19 +4,19 @@ * */ #define Module_MajorVersion_CMHG 5.35 -#define Module_MinorVersion_CMHG 4.79.2.14 -#define Module_Date_CMHG 09 Jan 2001 +#define Module_MinorVersion_CMHG 4.79.2.15 +#define Module_Date_CMHG 12 Jan 2001 #define Module_MajorVersion "5.35" #define
Module_Version 535 -#define Module_MinorVersion "4.79.2.14" -#define Module_Date "09 Jan 2001" +#define Module_MinorVersion "4.79.2.15" +#define Module_Date "12 Jan 2001" -#define Module_ApplicationDate2 "09-Jan-01" -#define Module_ApplicationDate4 "09-Jan-2001" +#define Module_ApplicationDate2 "12-Jan-01" +#define Module_ApplicationDate4 "12-Jan-2001" #define Module_ComponentName "Kernel" #define Module_ComponentPath "RiscOS/Sources/Kernel" -#define Module_FullVersion "5.35 (4.79.2.14)" -#define Module_HelpVersion "5.35 (09 Jan 2001) (4.79.2.14)" +#define Module_FullVersion "5.35 (4.79.2.15)" +#define Module_HelpVersion "5.35 (12 Jan 2001) (4.79.2.15)" diff --git a/s/ARM600 b/s/ARM600 index fc0c0ddd67fb48ed855f6803e7e8338dd8aa4db2..b4e1e1750cc8f6e92ec65489a4fd89e98cd40070 100644 --- a/s/ARM600 +++ b/s/ARM600 @@ -2392,6 +2392,7 @@ MMUControl_Flush TST r0,#&80000000 BEQ MMUC_flush_flushT ARMop Cache_CleanInvalidateAll,,,r1 + LDR r0, [sp] MMUC_flush_flushT TST r0,#&40000000 BEQ MMUC_flush_done diff --git a/s/vdu/vdudriver b/s/vdu/vdudriver index 4300064a96ad45c3e0655a9742f83a27b8a3b54a..30639cf7f8850dc3ec412fbd42321ccb5468099a 100644 --- a/s/vdu/vdudriver +++ b/s/vdu/vdudriver @@ -168,15 +168,12 @@ VduInit ROUT STR r0, [r4, #HWPixelFormats] mjsCallHAL HAL_Video_Features STR r0, [r4, #HWVideoFeatures] - mjsCallHAL HAL_Video_Features - STR r0, [r4, #HWPixelFormats] mjsCallHAL HAL_Video_BufferAlignment STR r0, [r4, #HWBufferAlign] Pull "r4, r9, r12" ;;; sort this out! - ! 0, "mjsHAL not doing anything useful with HAL_Video_PixelFormats" - ! 0, "mjsHAL not doing anything useful with HAL_Video_bufferAlign" + ! 0, "mjsHAL not doing anything useful with HAL_Video_BufferAlignment" ! 
0, "mjsHAL not dealing with lack of h/w pointer" LDR R0, =RangeC+SpriteReason_SwitchOutputToSprite @@ -607,6 +604,75 @@ CursorNbitTab & Cursor16bit-CursorNbitTab & Cursor32bit-CursorNbitTab +; table of susbstitute mode numbers to cater for hardware that might +; not support all of 1,2,4,8 bpp (bits per pixel) modes +; +; indexed by mode number (0..49), pairs of byte values: +; bpp = bits per pixel of this mode number +; promo = promoted mode number (0..49), or &FF if none +; +; promoted number is: +; 1) same resolution at next higher bpp (up to 8), if available, or +; 2) similar resolution at 8 bpp (8 bpp should be available on most h/w) +; +ModePromoTable +; +; bpp promo mode no. +; + DCB 1, 8 ; 0 + DCB 2, 9 ; 1 + DCB 4, 10 ; 2 + DCB 1, 15 ; 3 + DCB 1, 1 ; 4 + DCB 2, 2 ; 5 + DCB 1, 13 ; 6 + DCB 4, 13 ; 7 + DCB 2, 12 ; 8 + DCB 4, 13 ; 9 + DCB 8, &FF ; 10 + DCB 2, 14 ; 11 + DCB 4, 15 ; 12 + DCB 8, &FF ; 13 + DCB 4, 15 ; 14 + DCB 8, &FF ; 15 + DCB 4, 24 ; 16 + DCB 4, 24 ; 17 + DCB 1, 19 ; 18 + DCB 2, 20 ; 19 + DCB 4, 21 ; 20 + DCB 8, &FF ; 21 + DCB 4, 36 ; 22 + DCB 1, 28 ; 23 + DCB 8, &FF ; 24 + DCB 1, 26 ; 25 + DCB 2, 27 ; 26 + DCB 4, 28 ; 27 + DCB 8, &FF ; 28 + DCB 1, 30 ; 29 + DCB 2, 31 ; 30 + DCB 4, 32 ; 31 + DCB 8, &FF ; 32 + DCB 1, 34 ; 33 + DCB 2, 35 ; 34 + DCB 4, 36 ; 35 + DCB 8, &FF ; 36 + DCB 1, 38 ; 37 + DCB 2, 39 ; 38 + DCB 4, 40 ; 39 + DCB 8, &FF ; 40 + DCB 1, 42 ; 41 + DCB 2, 43 ; 42 + DCB 4, 28 ; 43 + DCB 1, 45 ; 44 + DCB 2, 46 ; 45 + DCB 4, 15 ; 46 + DCB 8, &FF ; 47 + DCB 4, 49 ; 48 + DCB 8, &FF ; 49 +; + ALIGN + + ; ***************************************************************************** ; ; SYN - Perform MODE change @@ -634,6 +700,39 @@ VduBadExit ; jumped to if an error in VDU code ModeChangeSub ROUT Push lr + ;If its a common mode number (0..49) consider a possible mode number + ;substitution, if hardware does not support given bits per pixel. 
+ ;We are vaguely assuming h/w supports at least 8 bpp, otherwise we may + ;not be able to find a usable mode number, and later code may not handle + ;that well. This is probably ok, 8 bpp is almost universal. + ; + CMP r2, #256 + BHS mchsub_3 + AND r1, r2, #&7F + CMP r1, #50 ; mode number + BHS mchsub_3 + Push "r3, r4" + ADR lr, ModePromoTable ; table of mode promotions + LDR r4, [WsPtr, #HWPixelFormats] ; bits 0 to 3 set for 1,2,4,8 bpp supported +mchsub_1 + MOV r1, r1, LSL #1 + LDRB r3, [lr, r1] ; bpp for this mode number (1,2,4,8) + TST r3, r4 ; supported in h/w? + ANDNE r2, r2, #&80 ; if yes, take mode number that passed + ORRNE r2, r2, r1, LSR #1 + BNE mchsub_2 + ADD r1, r1, #1 ; else look for promotion + LDRB r1, [lr, r1] ; new mode number + CMP r1, #&FF ; &FF if none + BNE mchsub_1 + ;alright, dont panic, just try to get a VGA-like mode of any bpp, if not tried already + CMP r1, #28 ; VGA 8 bpp + MOVNE r1, #25 ; VGA 1 bpp + BNE mchsub_1 +mchsub_2 + Pull "r3, r4" +; +mchsub_3 MOV R1, #Service_PreModeChange IssueService TEQ R1, #0 ; was service claimed ? diff --git a/s/vdu/vduswis b/s/vdu/vduswis index 9362f04552588e3b6ed349f8e9855a051c23a1bb..cfe1f8411e63d2a52933cc8cf4496c6286f38655 100644 --- a/s/vdu/vduswis +++ b/s/vdu/vduswis @@ -783,23 +783,25 @@ FindOKMode ROUT BNE %FT05 ; service claimed -; mjs Kernel/HAL split -; call HAL vetting routine to possibly adjust parameters (or if desperate, to disallow mode) - -;;;mjsHAL - is the mode workspace suitably generic to be passed to HAL? 
- ; int HAL_VetMode(void *VIDClist, void *workspace) - ; - ; VIDClist -> generic video controller list (VIDC list type 3) - ; workspace -> mode workspace (if mode number), or 0 - ; returns 0 if OK (may be minor adjusts to VIDClist and/or workspace values) - ; non-zero if not OK - ; +; mjs Kernel/HAL split +; call HAL vetting routine to possibly disallow mode +; Push "r0-r3, r9, r12" MOV r0,r3 MOV r1,r4 + ;we'll do the vet on whether h/w supports the pixel depth ourselves + LDR r2,[r0,#VIDCList3_PixelDepth] + MOV r3,#1 + MOV r3,r3,LSL r2 ; bits per pixel + LDR r2,[WsPtr,#HWPixelFormats] + TST r3,r2 + MOVEQ r0,#1 + BEQ %FT04 ; not supported + ;now any vet the HAL might want to do mjsAddressHAL mjsCallHAL HAL_Video_VetMode +04 CMP r0,#0 Pull "r0-r3,r9,r12" BNE %FT05 ; HAL says "Oi, Kernel, No!" @@ -921,6 +923,13 @@ FindSubstitute Entry ADD r13, r13, #PushedInfoSize CMP r11, #4 MOVCS r11, #0 + Push "r2, r3" + LDR r2, [WsPtr, #HWPixelFormats] ; see if h/w supports this BPP + MOV r3, #1 + MOV r3, r3, LSL r11 + TST r2, r3 + MOVEQ r11, #3 ; if not, use 8 BPP (assumed best chance for a mode number) + Pull "r2, r3" LDRB r1, [r1, r11] CLRV EXIT