Commit 2736fc5f authored by Jeffrey Lee

Merge SMP branch to trunk

Detail:
  Since the current SMP changes are fairly minor, and the trunk is seeing most development, from a maintenance perspective it makes sense to merge the changes to trunk. This will also make sure they get some wider testing ready for when the next round of SMP development takes place.
  Changes:
  - Docs/SMP - New docs folder describing SMP-related changes to the HAL and interrupt handling. Some of the IRQ changes can also be taken advantage of by single-core devices, since they introduce a way to describe which interrupt sources can be routed to IRQ & FIQ
  - Makefile, hdr/DBellDevice, hdr/HALDevice - New HAL device for an inter-processor software-generated interrupt source ("doorbell")
  - hdr/HALEntries - Reuse the unused matrix keyboard & touchscreen HAL entry points for the new IRQ handling & SMP-related HAL calls
  - hdr/KernelWS - Bump up MaxInterrupts
  - hdr/OSMem, s/MemInfo - Introduce OS_Memory 19, to allow for DMA to/from cacheable memory without actually altering the cacheability of the pages (which can be even more tricky in SMP systems than it is in uniprocessor systems)
  - hdr/Options - Introduce SMP build switch. Currently this controls whether the ARMops will operate in "SMP-friendly" mode or not (when running on MP processors)
  - s/ARMops, s/MemMap2 - Introduce the ARMv7MP ARMop implementation. Simplify DCache_LineLen / ICache_LineLen handling for WB_CR7_Lx so that it's the plain value rather than log2(n)-2
  - s/ExtraSWIs - If ARMops are in SMP-friendly mode, global OS_SynchroniseCodeAreas now only syncs application space and the RMA. This is because there is no trivial MP-safe global IMB operation available. This will also make global OS_SynchroniseCodeAreas significantly slower, but the documentation has always warned against performing a global IMB for just that reason, so code that suffers performance penalties should really try and switch to a ranged IMB.
  - s/NewIRQs - Update some comments regarding IRQ handler entry/exit conditions
Admin:
  Untested


Version 6.09. Tagged as 'Kernel-6_09'
parents 526764e1 fa39779f
HAL amendments for SMP
======================
What this document covers
-------------------------
This document describes any additions or revisions to the HAL specification, in particular any new or changed HAL entry points. Other documents may cover specific areas of the HAL/OS (e.g. IRQ handling) which are not covered here.
Overview
--------
Startup of an SMP system will be along the following lines:
1. The HAL performs the standard OS boot process (RISCOS_InitARM,
RISCOS_AddRAM, RISCOS_Start), for a single core. The HAL should ensure the
other cores are held in some form of reset state (e.g. in a software
spin/sleep loop, if necessary)
2. Once the kernel is sufficiently initialised, it will make use of the new
HAL_CPUCount and HAL_CPUNumber entry points to determine that it is running
on a multi-core system and the ID of the boot core.
3. For each additional core that the OS wants to make use of, the OS will call
HAL_SMPStartup to instruct the HAL to boot the core. The HAL is required to
initialise the core (e.g. basic CP15 register settings) and then jump to the
physical address that was provided in the HAL_SMPStartup call.
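The startup sequence above can be sketched from the OS side roughly as follows. This is a minimal simulation, not kernel code: the three HAL_ functions are stand-ins with the signatures given later in this document, and core_arrived, SIM_CORES and start_secondary_cores are hypothetical names introduced only for illustration.

```c
#define SIM_CORES 4                          /* assumption: a quad-core system */

/* Hypothetical flag set by each core once it reaches the OS startup code */
static volatile int core_arrived[SIM_CORES];

/* Simulated stand-ins for HAL entries #56-#58; a real HAL supplies these */
static int HAL_CPUCount(void)  { return SIM_CORES; }
static int HAL_CPUNumber(void) { return 0; }  /* pretend core 0 is the boot core */
static void HAL_SMPStartup(int core, unsigned int addr)
{
    (void)addr;              /* a real HAL would make the core jump to 'addr' */
    core_arrived[core] = 1;  /* simulation only: the core "arrives" at once */
}

/* Start every core except the boot core, one at a time (the OS does not
   start cores in parallel). Returns the number of cores started. */
static int start_secondary_cores(unsigned int entry_phys_addr)
{
    int count = HAL_CPUCount();
    int boot  = HAL_CPUNumber();
    int started = 0;

    for (int core = 0; core < count; core++) {
        if (core == boot)
            continue;
        HAL_SMPStartup(core, entry_phys_addr);
        while (!core_arrived[core]) {
            /* OS-side wait loop: the HAL need not wait for the core itself */
        }
        started++;
    }
    return started;
}
```

With the simulated quad-core HAL above, start_secondary_cores starts cores 1 to 3 and leaves the boot core alone.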
SMP safety of HAL and OS entry points is described in terms of four levels:
* PRIMARY: The entry point or group of entry points must only be called from the primary core (or whichever core issues RISCOS_Start)
* UNSAFE: The entry point or group of entry points cannot be called concurrently from multiple cores.
* PER-RESOURCE: The entry point or group of entry points can be called concurrently from multiple cores, as long as no two concurrent calls access the same resource (e.g. making concurrent calls to two different timers)
* SAFE: Concurrent calls which target the same resource are expected to be handled in a fully SMP-safe manner.
For UNSAFE and PER-RESOURCE, if calls are to be made from different cores, it is the caller's responsibility to use the necessary memory barriers to make sure that each core which makes a call sees a consistent view of memory, as if all the calls had been made from a single core. For example, the DualSerial module will want to use a per-UART spinlock to ensure that HAL calls for a specific UART can only occur on one core at a time. The spinlock implementation will internally perform barrier operations when locking and unlocking, automatically fulfilling this requirement.
For SAFE calls, the HAL must contain its own barriers and/or spinlocks as necessary, to allow it to cope with any kind of concurrent behaviour.
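As an illustration of the PER-RESOURCE rule, the following is a minimal sketch of a per-UART spinlock built on C11 atomics. It is not the DualSerial implementation; uart_lock_t, MAX_UARTS, uart_call_locked and fake_hal_uart_op are hypothetical names. The acquire/release orderings stand in for the barrier operations that the text says a spinlock performs when locking and unlocking.

```c
#include <stdatomic.h>

#define MAX_UARTS 4                      /* assumption for this sketch */

typedef struct { atomic_flag held; } uart_lock_t;
static uart_lock_t uart_locks[MAX_UARTS];

static void uart_locks_init(void)
{
    for (int i = 0; i < MAX_UARTS; i++)
        atomic_flag_clear(&uart_locks[i].held);
}

/* Returns 1 if the lock was taken, 0 if another caller already holds it */
static int uart_try_lock(int uart)
{
    return !atomic_flag_test_and_set_explicit(&uart_locks[uart].held,
                                              memory_order_acquire);
}

static void uart_lock(int uart)
{
    while (!uart_try_lock(uart)) {
        /* spin until the current holder releases this UART */
    }
}

static void uart_unlock(int uart)
{
    atomic_flag_clear_explicit(&uart_locks[uart].held, memory_order_release);
}

/* Hypothetical stand-in for a PER-RESOURCE HAL call on one UART */
static int fake_hal_uart_op(int uart) { return uart + 100; }

/* Wrap a HAL call so that calls for one UART never overlap, while calls
   for different UARTs can still run concurrently on different cores */
static int uart_call_locked(int uart, int (*hal_call)(int))
{
    uart_lock(uart);
    int result = hal_call(uart);
    uart_unlock(uart);
    return result;
}
```

One lock per UART, rather than one global lock, matches the PER-RESOURCE level: only calls that target the same resource are serialised.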
SMP safety of OS entry points
-----------------------------
Currently, all OS entry points are PRIMARY.
SMP safety of HAL entry points
------------------------------
PRIMARY:
* HAL_Init
* HAL_InitDevices
* HAL_KbdScan
* HAL_Reset
* HAL_SMPStartup
UNSAFE:
* HAL_NVMemory
* HAL_PCI
* HAL_USB
* HAL_Video
* HAL_DebugRX, HAL_DebugTX
* HAL_Watchdog
PER-RESOURCE:
* HAL_Timer & HAL_Counter (Note that HAL_Counter counts as timer zero)
* HAL_IIC
* HAL_UART
SAFE:
* HAL_CleanerSpace
* HAL_ExtMachineID
* HAL_HardwareInfo
* HAL_MachineID
* HAL_PhysInfo
* HAL_PlatformInfo
* HAL_PlatformName
* HAL_SuperIOInfo
* HAL_CPUCount
* HAL_CPUNumber
Safety of the HAL_IRQ & HAL_FIQ entry points is described in a separate document.
The HAL_Matrix and HAL_Touchscreen entry points have been retired and replaced with new SMP/IRQ-related entry points, as described elsewhere.
SMP safety of HAL devices
-------------------------
In general, individual HAL devices are UNSAFE. Concurrent calls to different HAL devices should be safe, unless there are device-specific restrictions in place (e.g. DMA and audio typically make use of linked devices, and so count as a single unsafe resource). For more details, consult revised device documentation where available.
New entry points
----------------
For SMP systems, all of these entry points must be implemented. For non-SMP systems they can be left unimplemented (standard MOV pc,lr stub).
#56: int HAL_CPUCount(void)
Returns the number of CPU cores which are present in the system and can be controlled by the HAL. E.g. 4 for a quad-core system.
#57: int HAL_CPUNumber(void)
Returns a number in the range [0, HAL_CPUCount) which identifies the current core. Typically this will just involve extracting the lower bits from the CP15 MPIDR register, however since core numbering is platform-specific this action must be performed by the HAL.
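As a rough sketch of the "lower bits of MPIDR" approach, the helper below takes an already-read MPIDR value and returns its Aff0 field (bits 7:0). It assumes a platform where Aff0 numbers the cores contiguously from zero, which is platform-specific and is exactly why the HAL owns this mapping. The name cpu_number_from_mpidr and the range check are illustrative and not part of the HAL API; reading MPIDR itself requires a CP15 access, which is omitted here.

```c
#include <stdint.h>

/* Map an MPIDR value to a core number in [0, cpu_count).
   Assumption: cores are numbered by the Aff0 field (MPIDR bits 7:0).
   Returns -1 if the value does not fit the expected range. */
static int cpu_number_from_mpidr(uint32_t mpidr, int cpu_count)
{
    int core = (int)(mpidr & 0xFFu);
    return (core < cpu_count) ? core : -1;
}
```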
#58: void HAL_SMPStartup(int core, unsigned int addr)
Starts the indicated core (core number in the range [0, HAL_CPUCount)). Behaviour is undefined if the call is made for a core which has already started.
'addr' provides the physical address of the code that the core should jump to once it starts. Depending on the platform and bootloader, the HAL may be required to perform basic initialisation beforehand (e.g. enabling the SMP bit in CP15, enabling the snoop control unit, GIC CPU interface initialisation, etc.). Essentially the core should be in the same state as the primary core was when RISCOS_Start was called.
The HAL can assume that all required code/data located at 'addr' has been fully flushed to main memory prior to the call.
The HAL need not wait for the core to fully initialise; the OS will contain its own wait loop that will be exited once the desired core executes the startup code located at 'addr'.
The OS will not attempt to initialise multiple cores in parallel. E.g. it is safe for the HAL to reuse the same portion of NCNB workspace to store any dynamic bootstrap code/data that is required.
SMP IRQ handling
================
What this document covers
-------------------------
This document specifies a new revision of the HAL interrupt API which allows the HAL/OS to implement useful multi-core IRQ handling on typical multi-core systems such as the one described below.
The new specification is backwards-compatible with the current specification, so multi-core device drivers running on older, single-core OS versions should function correctly. However old single-core drivers running on a multi-core OS may require updates to ensure correct behaviour.
Typical multi-core IRQ hardware
-------------------------------
Consider the following example of a typical multi-core IRQ setup:
                       +---------------+
+--------+     +--+  /-| Private timer |
|        |     |#0|-/  +---------------+
|        | IRQ |#1|-------\
| Core 0 |-----|#2|------\ \+--------+
|        |     |#3|-----\ \/         |-------< peripheral 1
|        |     |#4|----\ \/          |
+--------+     +--+     \/           |-------< peripheral 2
                        + IRQ router |
+--------+     +--+     /\           |-------< peripheral 3
|        |     |#0|-\  / /\          |
|        | IRQ |#1|--{- / /\         |-------< peripheral 4
| Core 1 |-----|#2|--{-- / /+--------+
|        |     |#3|--{--- /
|        |     |#4|--{----
+--------+     +--+  | +---------------+
                     \-| Private timer |
                       +---------------+
* Core 0 has a private timer IRQ, and four peripheral IRQs (#1-#4, corresponding to peripherals 1-4)
* Core 1 has the same setup, but IRQ#0 corresponds to a different private timer device
* An interrupt router is present which controls which core(s) receive each of the four peripheral interrupts.
  * On some systems, the interrupt router will allow multiple cores to receive the same interrupt (N-N in ARM GIC terminology). A software or hardware interlock will be used to ensure only one CPU services each interrupt at a time.
  * On other systems, the interrupt router will only allow each peripheral interrupt to be routed to a single core at a time.
  * For simplicity we'll only consider the case where the routing can be controlled on a per-interrupt basis; if routing can only be performed on groups of interrupts then we assume the HAL will use a fixed mapping scheme and not allow the routing to be modified at runtime.
* It's not shown on the diagram, but typically some of the interrupt controller registers will be banked, with each CPU seeing its own view of the register.
  * For example, with the ARM GIC, private interrupts such as the private timer IRQ can only have their enable/disable state modified by the core that owns the interrupt
Overview of the revised API
---------------------------
The new specification aims to produce a system where device drivers shouldn't need to know or care what core their interrupt handlers are running on. Ideally, the OS should be able to route interrupts to cores as it pleases, in order to balance the overall IRQ latency of the system.
To support this, the following is necessary:
* The HAL will ensure that IRQ device numbers (as used by the HAL IRQ/FIQ APIs) are assigned to interrupts such that each unique source device/peripheral is assigned a unique device number.
  * For example, the above example would require six device numbers: Two for the two private timers and four for the four peripherals.
  * This may require the HAL to internally remap device numbers to/from hardware IRQ numbers
* A new HAL entry point, HAL_IRQProperties, will be introduced to allow the HAL to describe the properties of a given device number
  * e.g. which core(s) it can be routed to
* Two new HAL entry points, HAL_IRQSetCores and HAL_IRQGetCores, will be introduced to allow the IRQ routing to be modified
  * Currently there are no new calls defined to allow the routing of FIQs, however these can easily be added at a later date if required
* In general, operations which mask/unmask interrupts (HAL_IRQEnable / HAL_IRQDisable & the FIQ equivalents) are expected to operate correctly no matter what core they are called from.
  * However, to simplify HAL implementation, the HAL is not required to support the use of those calls for interrupts where the state is only accessible from the owning core (such as the private timer interrupts). Usually this restriction only applies to kernel-level interrupts like timers and doorbells, so the restriction is not expected to impact general device drivers.
  * If this becomes a problem in the future, it's expected that the HAL or kernel could be extended to use a message passing system to allow the state of private interrupts to be transparently controlled by other cores.
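The device-number remapping mentioned above can be sketched for the two-core example system as follows. The numbering scheme (private timers first, then the shared peripherals) is an assumption made for illustration; a real HAL is free to choose any scheme that gives each source a unique device number. All names here are hypothetical.

```c
#define NUM_CORES       2   /* example system from the diagram */
#define NUM_PERIPHERALS 4   /* hardware IRQs #1-#4, shared via the router */

/* Total device numbers: one per private timer plus one per peripheral */
static int device_count(void) { return NUM_CORES + NUM_PERIPHERALS; }

/* Hardware IRQ number as seen by 'core' -> unique device number.
   Hardware IRQ #0 is the per-core private timer, so it maps to a
   different device number on each core. */
static int device_from_hw(int core, int hw_irq)
{
    if (hw_irq == 0)
        return core;                    /* devices 0..NUM_CORES-1 */
    return NUM_CORES + (hw_irq - 1);    /* devices NUM_CORES.. */
}

/* Device number -> hardware IRQ number. '*owner' receives the owning
   core for a private interrupt, or -1 for a routable peripheral. */
static int hw_from_device(int device, int *owner)
{
    if (device < NUM_CORES) {
        *owner = device;
        return 0;
    }
    *owner = -1;
    return device - NUM_CORES + 1;
}
```

Under this scheme the example system uses device numbers 0 to 5, matching the six device numbers the text calls for.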
The dangers of HAL_IRQEnable / HAL_IRQDisable
---------------------------------------------
HAL_IRQEnable/HAL_IRQDisable do not perform reference counting, which makes them dangerous to use in threaded or multi-core environments. To ensure the calls are used in a safe manner, drivers must obey the following rules:
* After calling OS_ClaimDeviceVector, a driver must call HAL_IRQEnable to enable receipt of the interrupt (ideally the kernel would do this automatically, but to ensure drivers are compatible with prior releases of RISC OS 5 there is little to be gained by having the kernel do this)
* If the IRQ line is shared, the driver must make no other calls to HAL_IRQEnable / HAL_IRQDisable (otherwise there is the danger that it could conflict with calls made by other drivers)
  * This means that nested interrupts (as described below) must be handled by masking the interrupts in the peripheral. If the peripheral does not have a dedicated interrupt mask register then nested interrupts must not be used.
* For non-shared IRQ lines, the driver is free to call HAL_IRQEnable / HAL_IRQDisable at its leisure. However it must be aware of the dangers of this (e.g. a foreground thread may conflict with the IRQ handler if they are on different cores)
* The kernel will only call HAL_IRQDisable for unhandled interrupts, so there should be no danger of it conflicting with usage by device drivers
  * There is the danger of a race condition with OS_ClaimDeviceVector which could result in an interrupt being erroneously disabled by the kernel. To avoid this the kernel must treat OS_ClaimDeviceVector as a barrier operation, which ensures there is no running interrupt handler for the device (including the unhandled interrupt handler) while the handler list is being updated.
HAL_FIQEnable/HAL_FIQDisable are also dangerous in threaded or multi-core environments, however since RISC OS only allows one driver to claim the FIQ vector at a time, the driver which owns the FIQ vector should not have to worry about any calls from other drivers which cause problems.
IRQ handling state machine
--------------------------
At the HAL layer, IRQ handling is expected to match the behaviour described by the following state machine. Each CPU core is expected to implement its own instance of the state machine.
     Start        (1): An unmasked IRQ is received by the interrupt controller,
       |               and it has decided to route it to the current CPU. The
       V               CPU's IRQ line will be asserted and at some point the CPU
    /------\           will start executing the IRQ vector.
/-->| idle |<-\
|   \------/  |   (2): The CPU calls HAL_IRQSource to determine the source of
|      |      |        the pending interrupt. While in the 'active' state:
|(4)   | (1)  |        * The state of the CPU's IRQ line is indeterminate
|      V      |        * The result of further calls to HAL_IRQSource is
| /---------\ |          unpredictable
| | pending | |
| \---------/ |   (3): The CPU calls HAL_IRQClear OR HAL_IRQDisable, specifying
|   |  |      |        the device number that was returned by HAL_IRQSource.
\---/  | (2)  |        * After the call, the CPU's IRQ line will be de-asserted
       V      |          and the interrupt controller will resume looking for
   /--------\ |          new interrupts
   | active | |        * Calling HAL_IRQClear for a different device number to
   \--------/ |          that returned by HAL_IRQSource is unpredictable
       | (3)  |        * Calling HAL_IRQDisable for a different device number
       \------/          will not cause a state transition
                  (4): The CPU calls HAL_IRQSource, and -1 is returned,
                       indicating a spurious IRQ.
                       * No further action by the CPU is necessary, the IRQ
                         line will be de-asserted and the interrupt controller
                         will resume looking for new interrupts.
State    Allowed operations
-----    ------------------
idle     All, except HAL_IRQSource and HAL_IRQClear.
pending  All, except HAL_IRQClear. HAL_IRQSource will either trigger
         transition (2) or (4), depending on the return value.
active   All, except HAL_IRQSource. HAL_IRQClear is only valid if it
         specifies the device number that was returned by HAL_IRQSource, in
         which case transition (3) will be triggered. (3) will also be
         triggered if HAL_IRQDisable is called with the device number that was
         returned by HAL_IRQSource. Note that in all cases, (3) will only be
         triggered if it is the core that received the interrupt that makes
         the call.
This state machine is identical to that used by the single-core IRQ handling. However it is only now that it has been formally specified.
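The state machine can also be written down as a small software model, which may be handy for checking a HAL implementation against the allowed-operations table. This is an illustrative sketch only: the type and function names are hypothetical, each core would hold its own core_irq_t instance, and a return value of -1 marks an operation the table does not allow in the current state.

```c
typedef enum { IRQ_IDLE, IRQ_PENDING, IRQ_ACTIVE } irq_state_t;

typedef struct {
    irq_state_t state;
    int active_device;   /* device returned by HAL_IRQSource while active */
} core_irq_t;

/* Transition (1): the controller asserts this core's IRQ line */
static int irq_on_asserted(core_irq_t *c)
{
    if (c->state != IRQ_IDLE) return -1;
    c->state = IRQ_PENDING;
    return 0;
}

/* Transition (2) or (4): HAL_IRQSource returned 'device' (-1 = spurious) */
static int irq_on_source(core_irq_t *c, int device)
{
    if (c->state != IRQ_PENDING) return -1;
    if (device < 0) {
        c->state = IRQ_IDLE;             /* (4) spurious IRQ */
    } else {
        c->state = IRQ_ACTIVE;           /* (2) */
        c->active_device = device;
    }
    return 0;
}

/* Transition (3) via HAL_IRQClear: only valid for the active device */
static int irq_on_clear(core_irq_t *c, int device)
{
    if (c->state != IRQ_ACTIVE || device != c->active_device) return -1;
    c->state = IRQ_IDLE;
    return 0;
}

/* HAL_IRQDisable is allowed in every state; it triggers (3) only when
   it names the active device, otherwise the state is unchanged */
static int irq_on_disable(core_irq_t *c, int device)
{
    if (c->state == IRQ_ACTIVE && device == c->active_device)
        c->state = IRQ_IDLE;
    return 0;
}
```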
Nested interrupts
-----------------
Most modern interrupt controllers have builtin support for handling of nested interrupts, typically using a prioritisation-based scheme. The HAL/OS currently has no way of taking advantage of this functionality, and this document does not aim to tackle that.
Therefore, nested interrupts are expected to be handled using the following scheme:
1. The kernel calls HAL_IRQSource to determine the interrupt source, and then calls the appropriate IRQ handler
2. The interrupt handler silences the interrupt, either by using HAL_IRQDisable (non-shared IRQ line) or by masking the interrupt in the peripheral and calling HAL_IRQClear (shared IRQ line). This will also allow the IRQ controller to start generating new IRQ requests to the processor.
3. The interrupt handler clears the PSR.I bit to allow the CPU to handle any incoming IRQs.
4. At end of execution, the interrupt handler sets PSR.I (to prevent any stack overflows if the disabled device is already interrupting again), and reverses the action that was performed in step 2 (i.e. call HAL_IRQEnable, or un-mask the interrupts in the peripheral)
5. The interrupt handler returns to the kernel
Again, this is the scheme that is currently used for single-core IRQ handling, and provides (unprioritised) nested IRQ support for any type of interrupt controller.
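For a non-shared IRQ line, steps 2 to 4 of the scheme can be sketched as below. The sketch only records the order of operations: HAL_IRQDisable and HAL_IRQEnable are simulated stand-ins, and cpu_irqs_on / cpu_irqs_off are hypothetical placeholders for clearing and setting PSR.I, which needs processor-specific code that is not shown.

```c
#include <string.h>

/* Trace of the operations performed, one letter per step (simulation only) */
static char trace[16];
static void note(const char *s)
{
    strncat(trace, s, sizeof trace - strlen(trace) - 1);
}

/* Simulated stand-ins; a real driver calls the HAL and alters PSR.I */
static int  HAL_IRQDisable(int device) { (void)device; note("D"); return 1; }
static int  HAL_IRQEnable(int device)  { (void)device; note("E"); return 0; }
static void cpu_irqs_on(void)    { note("u"); }   /* clear PSR.I */
static void cpu_irqs_off(void)   { note("m"); }   /* set PSR.I */
static void service_device(void) { note("W"); }   /* the driver's real work */

/* Handler for a non-shared IRQ line, called by the kernel after
   HAL_IRQSource has identified 'device' (step 1) */
static void nonshared_irq_handler(int device)
{
    HAL_IRQDisable(device);   /* step 2: silence the source; also acts as clear */
    cpu_irqs_on();            /* step 3: allow other IRQs to nest */
    service_device();
    cpu_irqs_off();           /* step 4: block IRQs before re-enabling */
    HAL_IRQEnable(device);    /* step 4: reverse the action from step 2 */
}                             /* step 5: return to the kernel */

static const char *run_nested_example(void)
{
    trace[0] = '\0';
    nonshared_irq_handler(5);
    return trace;
}
```

For a shared IRQ line the same shape applies, but the mask and unmask are done in the peripheral and HAL_IRQClear takes the place of HAL_IRQDisable in step 2.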
API details
-----------
Note that all calls which accept a device number expect the number to be in the range [0, HAL_IRQMax). E.g. the 'shared' flag which some APIs place in bit 31 must be clear.
#107: int HAL_IRQMax(void)
Returns the highest (+1) interrupt device number in the system. In addition to the uniqueness constraint described in the overview above, the HAL should also aim to keep the HAL_IRQMax number as low as reasonably possible, so that the OS can use a simple lookup table to map device numbers to handlers.
#1: int HAL_IRQEnable(int device)
#6: int HAL_FIQEnable(int device)
Unmasks the specified interrupt source within the interrupt controller.
Returns non-zero if the interrupt was previously enabled (for IRQ/FIQ as appropriate), zero if it was previously disabled.
Behaviour is unpredictable if a private interrupt is specified, but the calling core does not own the interrupt.
#2: int HAL_IRQDisable(int device)
#7: int HAL_FIQDisable(int device)
Masks the specified interrupt source within the interrupt controller, so that it will no longer trigger IRQ/FIQ generation.
Returns non-zero if the interrupt was previously enabled (for IRQ/FIQ as appropriate), zero if it was previously disabled.
Behaviour is unpredictable if a private interrupt is specified, but the calling core does not own the interrupt.
If the interrupt is currently firing (i.e. HAL_IRQSource has returned with that number) then this call will also act as a call to HAL_IRQClear / HAL_FIQClear.
#4: int HAL_IRQSource(void)
#10: int HAL_FIQSource(void)
The kernel must call this on entry to the IRQ/FIQ vector in order to determine the source of the current interrupt.
The return value will be an IRQ/FIQ device number, or -1 if the interrupt was spurious.
If a valid IRQ/FIQ device number is returned, it's expected that the OS will handle the interrupt; calling HAL_IRQSource and then doing nothing with the result is forbidden (e.g. if you had a routine which polls interrupt state with IRQs disabled).
#3: void HAL_IRQClear(int device)
#9: void HAL_FIQClear(int device)
This should be called at the end of each interrupt handler, to signal to the interrupt controller that the given IRQ/FIQ interrupt has been dealt with.
Generally when the HAL receives this call it will signal to the interrupt controller that the interrupt has been dealt with, e.g. by writing to the EOIR register in a GIC. Failing to mark an interrupt as complete may mean the interrupt gets (spuriously) triggered again, or it may prevent lower priority interrupts from being received.
For HAL timer IRQs and VSync IRQs (when using the HAL video API) the HAL may also use this to update the IRQ state within the timer or video controller. Implementing this behaviour within a new HAL is discouraged (use HAL_TimerIRQClear and GraphicsV video devices).
For spurious interrupts, no call to HAL_IRQClear/HAL_FIQClear should be made.
For SMP, these calls are only expected to work correctly if the call is being made on the core on which the interrupt was received.
#5: int HAL_IRQStatus(int device)
#11: int HAL_FIQStatus(int device)
Returns non-zero if the indicated device is currently requesting an interrupt, ignoring its current enable/disable state.
#8: void HAL_FIQDisableAll(void)
Disable all FIQ sources for the current core.
#53: __value_in_regs struct { int irq, fiq; } HAL_IRQProperties(int device)
Returns information about the behaviour of the given interrupt in SMP systems.
The low 16 bits of irq and fiq are bit masks, indicating which core(s) the interrupt can be routed to (and whether it can be routed as an IRQ or an FIQ).
The high 16 bits of irq and fiq provide additional information:
Bit 31:     =1 if the interrupt can be assigned to multiple cores at once
            =0 if it can only be assigned to (a maximum of) one core at a time
Bit 30:     =1 if HAL_IRQEnable/HAL_IRQDisable (or FIQ equivalent) will only operate correctly if they are called from a core which the interrupt is currently routed to
            =0 if HAL_IRQEnable/HAL_IRQDisable (or FIQ equivalent) will work from any core, and will affect all cores to which the interrupt is routed (i.e. there is a global enable flag for each interrupt)
Bits 29-16: Reserved
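A caller might decode one of the two returned words along these lines. The flag values match the HALIRQProperty_Multicore and HALIRQProperty_Private definitions added to hdr/HALEntries in this commit; the helper names are hypothetical, and unsigned 32-bit values are used here for the bit operations even though the prototype declares the fields as int.

```c
#include <stdint.h>

#define HALIRQProperty_Multicore 0x80000000u  /* bit 31 */
#define HALIRQProperty_Private   0x40000000u  /* bit 30 */

/* Low 16 bits: mask of cores the interrupt can be routed to */
static uint32_t irq_routable_mask(uint32_t props) { return props & 0xFFFFu; }

static int irq_can_route_to(uint32_t props, int core)
{
    return core >= 0 && core < 16 && ((props >> core) & 1u) != 0;
}

static int irq_is_multicore(uint32_t props)
{
    return (props & HALIRQProperty_Multicore) != 0;
}

/* Non-zero if enable/disable must be called from a core the interrupt
   is currently routed to */
static int irq_needs_owning_core(uint32_t props)
{
    return (props & HALIRQProperty_Private) != 0;
}

/* Reduce a wanted routing mask to something the properties permit: drop
   unsupported cores, and keep only the lowest-numbered core if the
   interrupt cannot be assigned to several cores at once */
static uint32_t irq_clamp_mask(uint32_t props, uint32_t wanted)
{
    uint32_t ok = wanted & irq_routable_mask(props);
    if (!irq_is_multicore(props))
        ok &= (~ok + 1u);    /* isolate the lowest set bit */
    return ok;
}
```

Picking the lowest-numbered core in irq_clamp_mask is an arbitrary choice for the sketch; the HAL itself reports what it actually applied through the return value of HAL_IRQSetCores.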
#54: int HAL_IRQSetCores(int device, int mask)
Set the IRQ routing for the given interrupt; bit N of mask should be set if the interrupt is to be routed to core N.
Returns the new mask, which may be different to what was requested.
Currently there is no equivalent call allocated for FIQ routing (it's expected FIQs will have fixed routing)
To avoid race conditions with active interrupt handlers, this call is for kernel use only. Other components which need to manually manage IRQ routing must do so via the SWI interface (TBD)
#55: int HAL_IRQGetCores(int device)
Returns the current IRQ routing for the given device
Currently there is no equivalent call allocated for FIQ routing (it's expected FIQs will have fixed routing)
Kernel changes
--------------
To ensure safe operation of multi-core IRQs, the kernel will require at least the following changes:
* A barrier placed in OS_ClaimDeviceVector, as described above
* A barrier placed in OS_ReleaseDeviceVector, to ensure the handler being removed has finished executing by the time the SWI returns
General advice
--------------
Traditionally device drivers have just disabled interrupts as a means of making sure their interrupt handler isn't running (e.g. in order to perform an atomic update of some state). In a multi-core environment this will not work; a spinlock must be used instead. This may also require your interrupt handler to be capable of dealing with spurious interrupts from its device - e.g. if the foreground masks an interrupt within the peripheral at the same time as that interrupt fires, the IRQ handler (looking at the new state) may not be able to recognise the source of the interrupt within the device.
@@ -32,7 +32,8 @@ ASFLAGS += -PD "FreezeDevRel SETL {${FREEZE_DEV_REL}}"
 CUSTOMROM = custom
 CUSTOMEXP = custom
 CUSTOMSA = custom
-EXPORTS = ${EXP_HDR}.EnvNumbers \
+EXPORTS = ${EXP_HDR}.DBellDevice \
+          ${EXP_HDR}.EnvNumbers \
           ${EXP_HDR}.HALDevice \
           ${EXP_HDR}.HALEntries \
           ${EXP_HDR}.ModHand \
@@ -100,6 +101,9 @@ export: ${EXPORTS}
 ${EXP_HDR}.EnvNumbers: hdr.EnvNumbers
 	${CP} hdr.EnvNumbers $@ ${CPFLAGS}
+${EXP_HDR}.DBellDevice: hdr.DBellDevice
+	${CP} hdr.DBellDevice $@ ${CPFLAGS}
 ${EXP_HDR}.HALDevice: hdr.HALDevice
 	${CP} hdr.HALDevice $@ ${CPFLAGS}
@@ -11,13 +11,13 @@
         GBLS Module_HelpVersion
         GBLS Module_ComponentName
         GBLS Module_ComponentPath
-Module_MajorVersion SETS "6.08"
-Module_Version SETA 608
+Module_MajorVersion SETS "6.09"
+Module_Version SETA 609
 Module_MinorVersion SETS ""
-Module_Date SETS "30 Jun 2018"
-Module_ApplicationDate SETS "30-Jun-18"
+Module_Date SETS "07 Jul 2018"
+Module_ApplicationDate SETS "07-Jul-18"
 Module_ComponentName SETS "Kernel"
 Module_ComponentPath SETS "castle/RiscOS/Sources/Kernel"
-Module_FullVersion SETS "6.08"
-Module_HelpVersion SETS "6.08 (30 Jun 2018)"
+Module_FullVersion SETS "6.09"
+Module_HelpVersion SETS "6.09 (07 Jul 2018)"
         END
-/* (6.08)
+/* (6.09)
  *
  * This file is automatically maintained by srccommit, do not edit manually.
  * Last processed by srccommit version: 1.1.
  *
  */
-#define Module_MajorVersion_CMHG 6.08
+#define Module_MajorVersion_CMHG 6.09
 #define Module_MinorVersion_CMHG
-#define Module_Date_CMHG 30 Jun 2018
+#define Module_Date_CMHG 07 Jul 2018
-#define Module_MajorVersion "6.08"
-#define Module_Version 608
+#define Module_MajorVersion "6.09"
+#define Module_Version 609
 #define Module_MinorVersion ""
-#define Module_Date "30 Jun 2018"
+#define Module_Date "07 Jul 2018"
-#define Module_ApplicationDate "30-Jun-18"
+#define Module_ApplicationDate "07-Jul-18"
 #define Module_ComponentName "Kernel"
 #define Module_ComponentPath "castle/RiscOS/Sources/Kernel"
-#define Module_FullVersion "6.08"
-#define Module_HelpVersion "6.08 (30 Jun 2018)"
-#define Module_LibraryVersionInfo "6:8"
+#define Module_FullVersion "6.09"
+#define Module_HelpVersion "6.09 (07 Jul 2018)"
+#define Module_LibraryVersionInfo "6:9"
;
; Copyright (c) 2016, RISC OS Open Ltd
; All rights reserved.
;
; Redistribution and use in source and binary forms, with or without
; modification, are permitted provided that the following conditions are met:
; * Redistributions of source code must retain the above copyright
; notice, this list of conditions and the following disclaimer.
; * Redistributions in binary form must reproduce the above copyright
; notice, this list of conditions and the following disclaimer in the
; documentation and/or other materials provided with the distribution.
; * Neither the name of RISC OS Open Ltd nor the names of its contributors
; may be used to endorse or promote products derived from this software
; without specific prior written permission.
;
; THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
; AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
; IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
; ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
; LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
; CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
; SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
; CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
; ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
; POSSIBILITY OF SUCH DAMAGE.
;
; Public interface (ie interface to the kernel) of doorbell HAL devices
GET hdr:HALDevice
OldOpt SETA {OPT}
OPT OptNoList+OptNoP1List
[ :LNOT: :DEF: Included_Hdr_DBellDevice
GBLL Included_Hdr_DBellDevice
Included_Hdr_DBellDevice SETL {TRUE}
; Device for each doorbell interrupt
^ 0
# HALDeviceSize
HALDevice_DBellGetIRQ # 4
HALDevice_DBellRing # 4
HALDevice_DBell_Size * :INDEX: @
]
OPT OldOpt
END
@@ -70,6 +70,7 @@ HALDeviceComms_EtherNIC # 1 ; Ethernet NIC
 HALDeviceComms_GPIO # 1 ; GPIO interface
 HALDeviceComms_InterProc # 1 ; Inter-processor mailboxes, etc.
 HALDeviceComms_SPI # 1 ; SPI
+HALDeviceComms_ARMDBell # 1 ; Doorbell for signaling other ARM cores
 HALDeviceType_ExpCtl * 5 :SHL: 8
         ^ 1
@@ -220,6 +221,10 @@ HALDeviceID_SPI_BCM2835_0 # 1
 HALDeviceID_SPI_BCM2835_12 # 1
 HALDeviceID_SPI_IMX6 # 1
+        ^ 0
+HALDeviceID_ARMDBell_BCM2836 # 1
+HALDeviceID_ARMDBell_GIC # 1
         ^ 0
 HALDeviceID_SDIO_SDHCI # 1
@@ -90,13 +90,12 @@ EntryNo_HAL_VideoFeatures # 1 ; 50
 EntryNo_HAL_VideoBufferAlignment # 1 ; 51
 EntryNo_HAL_VideoOutputFormat # 1 ; 52
-EntryNo_HAL_MatrixColumns # 1 ; 53
-EntryNo_HAL_MatrixScan # 1 ; 54
-EntryNo_HAL_TouchscreenType # 1 ; 55
-EntryNo_HAL_TouchscreenRead # 1 ; 56
-EntryNo_HAL_TouchscreenMode # 1 ; 57
-EntryNo_HAL_TouchscreenMeasure # 1 ; 58
+EntryNo_HAL_IRQProperties # 1 ; 53 (was HAL_MatrixColumns)
+EntryNo_HAL_IRQSetCores # 1 ; 54 (was HAL_MatrixScan)
+EntryNo_HAL_IRQGetCores # 1 ; 55 (was HAL_TouchscreenType)
+EntryNo_HAL_CPUCount # 1 ; 56 (was HAL_TouchscreenRead)
+EntryNo_HAL_CPUNumber # 1 ; 57 (was HAL_TouchscreenMode)
+EntryNo_HAL_SMPStartup # 1 ; 58 (was HAL_TouchscreenMeasure)
 EntryNo_HAL_MachineID # 1 ; 59, ReadSysInfo 2
 EntryNo_HAL_ControllerAddress # 1 ; 60, Memory 9
@@ -252,4 +251,12 @@ HALUSBControllerFlag_HAL_Over_Current * 2 ; Use HAL_USBPortDevice/IRQStatus/IRQC
 HALUSBControllerFlag_32bit_Regs * 8 ; Must use 32bit access for all registers
 HALUSBControllerFlag_EHCI_ETTF * &80000000 ; EHCI controller has embedded TT
+; IRQ
+HALIRQ_Shared * &80000000 ; Used with some APIs to indicate device/IRQ number is shared by multiple devices
+; HAL_IRQProperties flags:
+HALIRQProperty_Multicore * &80000000 ; Interrupt can be routed to multiple cores at once
+HALIRQProperty_Private * &40000000 ; Interrupt enable/disable will only have an effect if it's called on a core which the interrupt is routed to, and if it's routed to multiple cores it may not affect them all
         END
@@ -1762,7 +1762,7 @@ MouseBuff |#| MouseBuffSize
 ; IRQ despatch
-MaxInterrupts * 192 ; 192 needed by OMAP5. Increase in future if necessary.
+MaxInterrupts * 256 ; 256 needed by SMP i.MX6 quad. Increase in future if necessary.
 DefIRQ1Vspace * 12*MaxInterrupts+128
 DefaultIRQ1V |#| DefIRQ1Vspace
@@ -109,6 +109,7 @@ OSMemReason_ReleasePhysAddr * 15 ; Release the temp mapping
 OSMemReason_MemoryAreaInfo * 16 ; Return size & location of various non-DA areas
 OSMemReason_MemoryAccessPrivileges * 17 ; Decode AP numbers into permission flags
 OSMemReason_FindAccessPrivilege * 18 ; Find best AP number from given permission flags
+OSMemReason_DMAPrep * 19 ; Convert PA <-> LA, perform cache maintenance required for DMA
 OSMemReason_Compatibility * 20 ; Get/set compatibility settings
 OSMemReason_CheckMemoryAccess * 24 ; Return attributes/permissions for a logical address range
@@ -120,6 +121,12 @@ MemPermission_PrivX * 1<<3 ; Executable in privileged modes
 MemPermission_PrivW * 1<<4 ; Writable in privileged modes
 MemPermission_PrivR * 1<<5 ; Readable in privileged modes
+; OS_Memory 19 (DMAPrep) flags
+DMAPrep_PhysProvided * 1<<8 ; Input function provides physical addresses, not logical
+DMAPrep_Write * 1<<9 ; DMA is writing to RAM
+DMAPrep_End * 1<<10 ; DMA is complete, perform any post-op cache maintenance
+DMAPrep_UseBounceBuffer * 1 ; Input/output function flag: Must use bounce buffer for this block
 ; OS_Memory 24 (CheckMemoryAccess) flags
 CMA_Completely_UserR * 1<<0 ; completely readable in user mode
 CMA_Completely_UserW * 1<<1 ; completely writable in user mode
@@ -186,6 +186,11 @@ CacheablePageTables SETL {TRUE} ; Use cacheable page tables wher
         GBLL SyncPageTables
 SyncPageTables SETL (MEMM_Type = "VMSAv6") :LOR: CacheablePageTables ; Any page table modification (specifically, overwriting faulting entries) requires synchronisation
+ [ :LNOT: :DEF: SMP
+        GBLL SMP
+SMP SETL (MEMM_Type = "VMSAv6") :LAND: {TRUE} ; Enable SMP-related changes
+ ]
         GBLL UseNewFX0Error
 UseNewFX0Error SETL ((Version :AND: 1) = 1) ; Whether *FX 0 should show the ROM link date instead of the UtilityModule date
......@@ -231,12 +231,7 @@ SyncCodeAreasRange
MOV r0, r1
ADD r1, r2, #4 ;exclusive end address
LDR r2, =ZeroPage
LDRB lr, [r2, #Cache_Type]
CMP lr, #CT_ctype_WB_CR7_Lx ; DCache_LineLen lin or log?
LDRB lr, [r2, #DCache_LineLen]
MOVEQ r2, #4
MOVEQ lr, r2, LSL lr
LDREQ r2, =ZeroPage
SUB lr, lr, #1
ADD r1, r1, lr ;rounding up end address
MVN lr, lr
......@@ -246,10 +241,31 @@ SyncCodeAreasRange
Pull "r0-r2, pc"
SyncCodeAreasFull
Push "r0, lr"
[ SMP
Entry "r0-r2"
LDR r2, =ZeroPage
ARMop Cache_RangeThreshold,,,r2
CMP r0, #-1
BNE %FT90
; ARMops are in SMP-friendly mode, which means we have no (SMP-friendly) global IMB available
; Just clean application space and the RMA?
LDR r1, [r2, #AplWorkSize]
MOV r0, #32*1024
ARMop IMB_Range,,,r2
MOV r0, #ChangeDyn_RMA
SWI XOS_ReadDynamicArea
ADD r1, r1, r0
ARMop IMB_Range,,,r2
EXIT
90
ARMop IMB_Full,,,r2
EXIT
|
Entry "r0"
LDR r0, =ZeroPage
ARMop IMB_Full,,,r0
Pull "r0, pc"
EXIT
]
LTORG
......
......@@ -69,7 +69,7 @@ MemReturn
B MemoryAreaInfo ; 16
B MemoryAccessPrivileges ; 17
B FindAccessPrivilege ; 18
B %BT20 ; Reason code 19 reserved (for DMAPrep, on SMP branch)
B DMAPrep ; 19
B ChangeCompatibility ; 20
B %BT20 ; 21 |
B %BT20 ; 22 | Reserved for us
......@@ -1009,6 +1009,8 @@ ReleasePhysAddr
BL RISCOS_ReleasePhysicalAddress
Pull "r0-r3,r12,pc"
LTORG
;----------------------------------------------------------------------------------------
;
; In: r0 = flags
......@@ -1337,6 +1339,353 @@ FindAccessPrivilege ROUT
MakeErrorBlock AccessPrivilegeNotFound
;----------------------------------------------------------------------------------------
;
; In: r0 = flags
; bit meaning
; 0-7 19 (reason code)
; 8 Input function provides physical addresses
; 9 DMA is writing to RAM
; 10 DMA is complete, perform any post-op cache maintenance
; 11-31 reserved (set to 0)
; r1 = R12 value to provide to called functions
; r2 = Initial R9 value to provide to input function
; r3 -> Input function:
; in: r9 = r2 from SWI / value from previous call
; r12 = r1 from SWI
; out: r0 = start address of region
; r1 = length of region (0 if end of transfer)
; r2 = flags:
; bit 0: Bounce buffer will be used
; r9 = new r9 for next input call
; r12 corrupt
; r4 = Initial R9 value to provide to output function
; r5 -> Output function (if bit 10 of R0 clear):
; in: r0 = logical address of start of region
; r1 = physical address of start of region
; r2 = length of region
; r3 = flags:
; bit 0: Bounce buffer must be used
; r9 = r4 from SWI / value from previous call
; r12 = r1 from SWI
; out: r9 = new r9 value for next output call
; r0-r3, r12 corrupt
;
; Out: r2, r4 updated to match values returned by input/output calls
; All other regs preserved
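;
; Illustrative sketch only (not code from this change): a minimal input
; function, assuming r9 points at a hypothetical list of (address, length)
; word pairs ending with a zero length entry, and assuming the function is
; entered with its return address in lr:
;
; ExampleInput
;       LDMIA   r9, {r0, r1}    ; r0 = start address, r1 = length
;       CMP     r1, #0          ; zero length marks end of transfer
;       ADDNE   r9, r9, #8      ; otherwise step on to the next pair
;       MOV     r2, #0          ; flags: bit 0 clear
;       MOV     pc, lr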
;
; Performs address translation and cache maintenance necessary to allow for DMA
; to be performed to/from cacheable memory.
;
; To allow Service_PagesUnsafe to be dealt with in a straightforward manner, we
; have to be careful not to cache the results of any address translations over
; calls to the input/output functions. E.g. if the output function tries to
; allocate from PCI RAM, that may trigger claiming of a specific page by the
; PCI DA, potentially invalidating any existing logical -> physical translation.
; This restriction hampers the routine's ability to merge together input and
; output blocks, and to perform minimal cache maintenance. However for typical
; scatter lists of low to medium complexity it should still produce acceptable
; output.
;
; Note that if the input function provides physical addresses, the caller must
; take care to abort the entire operation if one of the physical pages involved
; in the request becomes claimed by someone else while the OS_Memory call is in
; progress. This is because we have no sensible way of dealing with this case
; ourselves (even if we didn't attempt to call the input function multiple times
; and merge together the blocks, we'd still have to buffer things internally to
; deal with when blocks need splitting for cache alignment).
;
; Internally, blocks are stored in the following format:
;
; Word 0 = Start logical address (incl.)
; Word 1 = Logical -> physical address offset (low bits) + flags (high bits)
; Word 2 = End logical address (excl.)
;
; This minimises the number of registers needed to hold a block, and simplifies
; the merge calculation (blocks can be merged if words 2 + 1 of first block
; match words 0 + 1 of second block)
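;
; Illustrative sketch only (not code from this change): assuming the first
; block is held in r0-r2 and the second in r3-r5 (a hypothetical register
; assignment), the merge test described above could look like:
;
;       CMP     r2, r3          ; end of first = start of second?
;       CMPEQ   r1, r4          ; and same offset + flags word?
;       MOVEQ   r2, r5          ; if so, extend first block to end of second
;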
; Workspace struct that's stored on the stack
^ 0
DMAPrepW_InHold # 12
DMAPrepW_InChunk # 12
DMAPrepW_PhyChunk # 12