Commit afc3b390 authored by Jeffrey Lee

Replace WriteBuffer_Drain ARMop with a suite of memory barrier ARMops

Detail:
  - Docs/HAL/ARMop_API - Updated with documentation for the new ARMops.
  - s/ARMops - Set up pointers for the new memory barrier ARMops. Add full implementations for ARMv6 & ARMv7; older architectures should be able to get by with a mix of null ops & write buffer drain ops. Update ARMopPtrTable to validate structure against the list in hdr/OSMisc
  - hdr/KernelWS - Reserve workspace for new ARMops. Free up a bit of space by limiting ourselves to 2 cache levels with ARMv7. Remove some unused definitions.
  - hdr/OSMisc - New header defining OS_PlatformFeatures & OS_MMUControl reason codes, OS_PlatformFeatures 0 flags, and OS_MMUControl 2 ARMop indices
  - Makefile - Add export rules for OSMisc header
  - hdr/ARMops, s/ARM600, s/VMSAv6 - Remove CPUFlag_* and MMUCReason_* definitions. Update OS_MMUControl write buffer drain to use DSB_ReadWrite ARMop (which is what most existing write buffer drain implementations have been renamed to).
  - s/GetAll - Get Hdr:OSMisc
  - s/Kernel - Use OS_PlatformFeatures reason code symbols
  - s/vdu/vdudecl - Remove unused definition
Admin:
  Tested on ARM11, Cortex-A8, Cortex-A9


Version 5.35, 4.79.2.279. Tagged as 'Kernel-5_35-4_79_2_279'
parent cbfc4ff1
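The fallback described in the commit message for older architectures (a mix of null ops and write buffer drain ops) can be sketched in C as a table of function pointers. This is an illustration only: `struct barrier_ops`, `setup_legacy_barriers`, `drain_write_buffer` and `null_op` are hypothetical names, and the real kernel stores assembler routine addresses in its workspace rather than using a C struct. The mapping mirrors what the s/ARMops hunks do for the pre-ARMv6 cases: the read-only barriers become null ops and everything else drains the write buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C view of the six barrier ARMop slots. */
typedef void (*armop_fn)(void);

struct barrier_ops {
    armop_fn dsb_readwrite;
    armop_fn dsb_write;
    armop_fn dsb_read;
    armop_fn dmb_readwrite;
    armop_fn dmb_write;
    armop_fn dmb_read;
};

/* Counts calls so the sketch can be exercised without real hardware. */
static unsigned drain_calls;

/* Stand-in for a "drain write buffer" routine on an older CPU. */
static void drain_write_buffer(void) { drain_calls++; }

/* Stand-in for the kernel's NullOp: returns immediately. */
static void null_op(void) { }

/* Fill the table the way the pre-ARMv6 setup code does:
 * write and read/write barriers drain the write buffer,
 * read-only barriers do nothing. */
static void setup_legacy_barriers(struct barrier_ops *ops)
{
    assert(ops != NULL);
    ops->dsb_readwrite = drain_write_buffer;
    ops->dsb_write     = drain_write_buffer;
    ops->dsb_read      = null_op;
    ops->dmb_readwrite = drain_write_buffer;
    ops->dmb_write     = drain_write_buffer;
    ops->dmb_read      = null_op;
}
```

Callers never need to know which implementation sits behind a slot; they simply call through the pointer, which is the point of splitting one drain op into a suite of barrier ops.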
......@@ -273,15 +273,85 @@ For unified caches, r1-r2 will match r3-r4. This call mainly exists for the
benefit of OS_PlatformFeatures 33.
-- WriteBuffer_Drain
Memory barrier ARMops
=====================
Any writebuffers are to be drained so that any pending writes are guaranteed
completed to memory.
-- DSB_ReadWrite (previously, WriteBuffer_Drain)
This call is roughly equivalent to the ARMv7 "DSB SY" instruction:
* Writebuffers are drained
* Full read/write barrier - no data load/store will cross the instruction
* Instructions following the barrier will only begin execution once the barrier is passed - but any prefetched instructions are not flushed
entry: -
exit: -
-- DSB_Write
This call is roughly equivalent to the ARMv7 "DSB ST" instruction:
* Writebuffers are drained
* Write barrier - reads may cross the instruction
* Instructions following the barrier will only begin execution once the barrier is passed - but any prefetched instructions are not flushed
entry: -
exit: -
-- DSB_Read
There is no direct equivalent to this in ARMv7 (barriers are either W or RW). However it's useful to define a read barrier, as (e.g.) on Cortex-A9 a RW barrier would require draining the write buffer of the external PL310 cache, while a R barrier can simply be an ordinary DSB instruction.
* Read barrier - writes may cross the instruction
* Instructions following the barrier will only begin execution once the barrier is passed - but any prefetched instructions are not flushed
entry: -
exit: -
-- DMB_ReadWrite
This call is roughly equivalent to the ARMv7 "DMB SY" instruction:
* Ensures in-order operation of data load/store instructions
* Does not stall instruction execution
* Does not guarantee that any preceding memory operations complete in a timely manner (or at all)
entry: -
exit: -
Although this call doesn't guarantee that any memory operation completes, it's usually all that's required when interacting with hardware devices which use memory-mapped IO. E.g. fill a buffer with data, issue a DMB, then write to a hardware register to start some external DMA. The writes to the buffer will have been guaranteed to complete by the time the write to the hardware register completes.
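The buffer-then-register sequence described above can be sketched in C. Everything here is an assumption made for illustration: `dma_buffer` and `dma_start_reg` are ordinary variables standing in for real hardware, and `dmb_write` uses a C11 release fence as a placeholder for calling the DMB_Write (or DMB_ReadWrite) ARMop. The fence is not claimed to be equivalent to the ARMop on real devices; the sketch only shows where the barrier sits in the sequence.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for hardware: a DMA source buffer and a
 * register whose write starts the transfer. */
static uint8_t dma_buffer[64];
static volatile uint32_t dma_start_reg;

/* Placeholder for the DMB_Write ARMop. A real caller would call the
 * ARMop routine provided by the OS instead of this fence. */
static void dmb_write(void)
{
    atomic_thread_fence(memory_order_release);
}

/* Fill the buffer, issue the barrier, then start the (simulated) DMA. */
static void start_dma(const uint8_t *src, size_t len)
{
    assert(src != NULL);
    if (len > sizeof dma_buffer)
        len = sizeof dma_buffer;        /* clamp to the buffer size */

    memcpy(dma_buffer, src, len);       /* 1. fill the buffer */
    dmb_write();                        /* 2. order the buffer stores first */
    dma_start_reg = (uint32_t)len;      /* 3. register write starts the DMA */
}
```

The barrier goes between the last buffer store and the register write; placing it after the register write would not give the ordering the text describes.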
-- DMB_Write
This call is roughly equivalent to the ARMv7 "DMB ST" instruction:
* Ensures in-order operation of data store instructions
* Does not stall instruction execution
* Does not guarantee that any preceding memory operations complete in a timely manner (or at all)
entry: -
exit: -
Although this call doesn't guarantee that any memory operation completes, it's usually all that's required when interacting with hardware devices which use memory-mapped IO. E.g. fill a buffer with data, issue a DMB, then write to a hardware register to start some external DMA. The writes to the buffer will have been guaranteed to complete by the time the write to the hardware register completes.
-- DMB_Read
There is no direct equivalent to this in ARMv7 (barriers are either W or RW). However it's useful to define a read barrier, as (e.g.) on Cortex-A9 a RW barrier would require draining the write buffer of the external PL310 cache, while a R barrier can simply be an ordinary DMB instruction.
* Ensures in-order operation of data load instructions
* Does not stall instruction execution
* Does not guarantee that any preceding memory operations complete in a timely manner (or at all)
entry: -
exit: -
Although this call doesn't guarantee that any memory operation completes, it's usually all that's required when interacting with hardware devices which use memory-mapped IO. E.g. after reading a hardware register to detect that a DMA write to RAM has completed, issue a read barrier to ensure that any reads from the data buffer see the final data.
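The read-side case above can be sketched the same way. The names are hypothetical: `dma_status_reg` stands in for a hardware status register whose bit 0 is assumed to mean "DMA complete", `rx_buffer` stands in for the RAM the device wrote to, and `dmb_read` uses a C11 acquire fence as a placeholder for the DMB_Read ARMop rather than as a claimed equivalent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: a status register (bit 0 = DMA complete)
 * and the buffer that the device is assumed to have filled. */
static volatile uint32_t dma_status_reg;
static uint8_t rx_buffer[16];

/* Placeholder for the DMB_Read ARMop. */
static void dmb_read(void)
{
    atomic_thread_fence(memory_order_acquire);
}

/* Returns 1 and stores the first received byte in *out if the DMA has
 * completed, otherwise returns 0 and leaves *out untouched. */
static int read_dma_result(uint8_t *out)
{
    assert(out != NULL);
    if ((dma_status_reg & 1u) == 0)
        return 0;               /* transfer still in progress */

    dmb_read();                 /* buffer reads must not pass the status read */
    *out = rx_buffer[0];
    return 1;
}
```

Here the barrier sits between the status read and the first buffer read, so the buffer contents are only examined after completion has been observed.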
TLB ARMops
----------
......
......@@ -37,6 +37,7 @@ EXPORTS = ${EXP_HDR}.EnvNumbers \
${EXP_HDR}.HALEntries \
${EXP_HDR}.ModHand \
${EXP_HDR}.OSEntries \
${EXP_HDR}.OSMisc \
${EXP_HDR}.OSRSI6 \
${EXP_HDR}.PL310 \
${EXP_HDR}.PublicWS \
......@@ -49,6 +50,7 @@ EXPORTS = ${EXP_HDR}.EnvNumbers \
${C_EXP_HDR}.HALEntries \
${C_EXP_HDR}.ModHand \
${C_EXP_HDR}.OSEntries \
${C_EXP_HDR}.OSMisc \
${C_EXP_HDR}.OSRSI6 \
${C_EXP_HDR}.RISCOS \
${C_EXP_HDR}.Variables \
......@@ -113,6 +115,9 @@ ${EXP_HDR}.ModHand: hdr.ModHand
${EXP_HDR}.OSEntries: hdr.OSEntries
${CP} hdr.OSEntries $@ ${CPFLAGS}
${EXP_HDR}.OSMisc: hdr.OSMisc
${CP} hdr.OSMisc $@ ${CPFLAGS}
${EXP_HDR}.OSRSI6: hdr.OSRSI6
${CP} hdr.OSRSI6 $@ ${CPFLAGS}
......@@ -151,6 +156,10 @@ ${C_EXP_HDR}.ModHand: hdr.ModHand
${C_EXP_HDR}.OSEntries: Global.h.OSEntries h.OSEntries
${FAPPEND} $@ h.OSEntries Global.h.OSEntries
${C_EXP_HDR}.OSMisc: hdr.OSMisc
${MKDIR} ${C_EXP_HDR}
${HDR2H} hdr.OSMisc $@
${C_EXP_HDR}.OSRSI6: hdr.OSRSI6
${MKDIR} ${C_EXP_HDR}
${HDR2H} hdr.OSRSI6 $@
......
......@@ -13,11 +13,11 @@
GBLS Module_ComponentPath
Module_MajorVersion SETS "5.35"
Module_Version SETA 535
Module_MinorVersion SETS "4.79.2.278"
Module_Date SETS "11 Aug 2015"
Module_ApplicationDate SETS "11-Aug-15"
Module_MinorVersion SETS "4.79.2.279"
Module_Date SETS "14 Aug 2015"
Module_ApplicationDate SETS "14-Aug-15"
Module_ComponentName SETS "Kernel"
Module_ComponentPath SETS "castle/RiscOS/Sources/Kernel"
Module_FullVersion SETS "5.35 (4.79.2.278)"
Module_HelpVersion SETS "5.35 (11 Aug 2015) 4.79.2.278"
Module_FullVersion SETS "5.35 (4.79.2.279)"
Module_HelpVersion SETS "5.35 (14 Aug 2015) 4.79.2.279"
END
......@@ -5,19 +5,19 @@
*
*/
#define Module_MajorVersion_CMHG 5.35
#define Module_MinorVersion_CMHG 4.79.2.278
#define Module_Date_CMHG 11 Aug 2015
#define Module_MinorVersion_CMHG 4.79.2.279
#define Module_Date_CMHG 14 Aug 2015
#define Module_MajorVersion "5.35"
#define Module_Version 535
#define Module_MinorVersion "4.79.2.278"
#define Module_Date "11 Aug 2015"
#define Module_MinorVersion "4.79.2.279"
#define Module_Date "14 Aug 2015"
#define Module_ApplicationDate "11-Aug-15"
#define Module_ApplicationDate "14-Aug-15"
#define Module_ComponentName "Kernel"
#define Module_ComponentPath "castle/RiscOS/Sources/Kernel"
#define Module_FullVersion "5.35 (4.79.2.278)"
#define Module_HelpVersion "5.35 (11 Aug 2015) 4.79.2.278"
#define Module_FullVersion "5.35 (4.79.2.279)"
#define Module_HelpVersion "5.35 (14 Aug 2015) 4.79.2.279"
#define Module_LibraryVersionInfo "5:35"
......@@ -52,26 +52,6 @@ Cortex_A15 # 1
Cortex_A17 # 1
ARMunk * 255
; These flags are stored in ProcessorFlags and returned by OS_PlatformFeatures 0 (Read code features)
CPUFlag_SynchroniseCodeAreas * 1:SHL:0 ; Calls to OS_SynchroniseCodeAreas required
CPUFlag_InterruptDelay * 1:SHL:1 ; Clearing then setting I bit immediately doesn't trigger IRQs
CPUFlag_VectorReadException * 1:SHL:2 ; 26-bit reads of hardware vectors abort
CPUFlag_StorePCplus8 * 1:SHL:3 ; Stores of R15 store PC+8 rather than PC+12
CPUFlag_BaseRestored * 1:SHL:4 ; Base Restored abort model rather than Base Updated
CPUFlag_SplitCache * 1:SHL:5 ; CPU has separate I and D caches
CPUFlag_32bitOS * 1:SHL:6 ; OS is 32-bit
CPUFlag_No26bitMode * 1:SHL:7 ; CPU does not support 26-bit modes
CPUFlag_LongMul * 1:SHL:8 ; Has M extensions (UMULL etc)
CPUFlag_Thumb * 1:SHL:9 ; Supports Thumb
CPUFlag_DSP * 1:SHL:10 ; Has E extensions (QADD etc)
CPUFlag_ExtendedPages * 1:SHL:15 ; Supports extended small page L2 descriptors
CPUFlag_NoWBDrain * 1:SHL:16 ; CPU does not support Drain Write Buffer instruction
CPUFlag_AbortRestartBroken * 1:SHL:17 ; Aborts do not correctly follow documented abort model
CPUFlag_XScale * 1:SHL:18 ; it's an XScale, so weird debug etc
CPUFlag_XScaleJTAGconnected * 1:SHL:19 ; JTAG has been connected
CPUFlag_HiProcVecs * 1:SHL:20 ; High processor vectors are in use
; The macro to do an ARM operation. All ARM operations are expected
; to corrupt a1 only
; This macro corrupts ip unless $zeropage reg is supplied
......
......@@ -1238,7 +1238,12 @@ Proc_Cache_RangeThreshold # 4
Proc_Cache_Examine # 4
Proc_TLB_InvalidateAll # 4
Proc_TLB_InvalidateEntry # 4
Proc_WriteBuffer_Drain # 4
Proc_DSB_ReadWrite # 4
Proc_DSB_Write # 4
Proc_DSB_Read # 4
Proc_DMB_ReadWrite # 4
Proc_DMB_Write # 4
Proc_DMB_Read # 4
Proc_IMB_Full # 4
Proc_IMB_Range # 4
Proc_IMB_List # 4
......@@ -1250,8 +1255,9 @@ Proc_MMU_ChangingEntries # 4
Proc_MMU_ChangingUncachedEntries # 4
Cache_Lx_Info # 4 ; Cache level ID register
Cache_Lx_DTable # 4*7 ; Data/unified cache layout for all 7 levels
Cache_Lx_ITable # 4*7 ; Instruction cache layout for all 7 levels
Cache_Lx_MaxLevel * 2 ; Current machines have max of 2 cache levels
Cache_Lx_DTable # 4*Cache_Lx_MaxLevel ; Data/unified cache layout for all supported levels
Cache_Lx_ITable # 4*Cache_Lx_MaxLevel ; Instruction cache layout for all supported levels
Cache_HALDevice # 4 ; Pointer to any HAL cache device we're using
]
......@@ -1767,20 +1773,6 @@ ExprStackStart * ScratchSpace + ScratchSpaceSize
]
; Tutu needs some for argument substitution + expansion for run/load types
; Only OS call during xform is XOS_SubstituteArgs and XOS_Heap(Claim,SysHeap)
^ 0 ; Offset from ScratchSpace
rav_substituted # 256
rav_arglist # 256
TopOfPageZero # 0
^ &8000 ; The actual top of Page Zero
EconetDebugSpace |#| &20 * 4 ; Thirty two words (&7F80)
ASSERT @ > TopOfPageZero ; Make sure we don't clash
; *****************************************************************************
; *** Cursor, Sound DMA, SWI, and OSCLI workspace. ***
[ :LNOT: HAL32
......@@ -1789,9 +1781,7 @@ EconetDebugSpace |#| &20 * 4 ; Thirty two words (&7F80)
]
; *****************************************************************************
TopOfDMAPhysRAM * &80000 ; OFFSET in physram
TopOfDMAWorkSpace * CursorChunkAddress + 32*1024
OffsetLogicalToPhysical * TopOfDMAPhysRAM - TopOfDMAWorkSpace
^ TopOfDMAWorkSpace ; Note we will be going down
......@@ -1825,7 +1815,6 @@ Export_SoundWorkSpace |#| SoundWorkSpaceSize + SoundEvtSize
CursorDataSize * &600 ; four defined shapes, plus 2 holding shapes
CursorData |#| CursorDataSize
CursorSoundRAM * CursorData
CursorSoundPhysRAM * CursorSoundRAM + OffsetLogicalToPhysical
SPARE_oldCursorSpace |#| &200 ; padding to avoid changing exported addresses for now
......
; Copyright 2015 Castle Technology Ltd
;
; Licensed under the Apache License, Version 2.0 (the "License");
; you may not use this file except in compliance with the License.
; You may obtain a copy of the License at
;
; http://www.apache.org/licenses/LICENSE-2.0
;
; Unless required by applicable law or agreed to in writing, software
; distributed under the License is distributed on an "AS IS" BASIS,
; WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
; See the License for the specific language governing permissions and
; limitations under the License.
;
; Miscellaneous public definitions that aren't important enough to pollute
; Hdr:RISCOS or to have their own header
; OS_PlatformFeatures reason codes
OSPlatformFeatures_ReadCodeFeatures * 0
OSPlatformFeatures_ReadProcessorVectors * 32
OSPlatformFeatures_ReadCacheInfo * 33
; These flags are returned by OS_PlatformFeatures 0 (Read code features)
CPUFlag_SynchroniseCodeAreas * 1:SHL:0 ; Calls to OS_SynchroniseCodeAreas required
CPUFlag_InterruptDelay * 1:SHL:1 ; Clearing then setting I bit immediately doesn't trigger IRQs
CPUFlag_VectorReadException * 1:SHL:2 ; 26-bit reads of hardware vectors abort
CPUFlag_StorePCplus8 * 1:SHL:3 ; Stores of R15 store PC+8 rather than PC+12
CPUFlag_BaseRestored * 1:SHL:4 ; Base Restored abort model rather than Base Updated
CPUFlag_SplitCache * 1:SHL:5 ; CPU has separate I and D caches
CPUFlag_32bitOS * 1:SHL:6 ; OS is 32-bit
CPUFlag_No26bitMode * 1:SHL:7 ; CPU does not support 26-bit modes
CPUFlag_LongMul * 1:SHL:8 ; Has M extensions (UMULL etc)
CPUFlag_Thumb * 1:SHL:9 ; Supports Thumb
CPUFlag_DSP * 1:SHL:10 ; Has E extensions (QADD etc)
CPUFlag_ExtendedPages * 1:SHL:15 ; Supports extended small page L2 descriptors
CPUFlag_NoWBDrain * 1:SHL:16 ; CPU does not support Drain Write Buffer instruction
CPUFlag_AbortRestartBroken * 1:SHL:17 ; Aborts do not correctly follow documented abort model
CPUFlag_XScale * 1:SHL:18 ; it's an XScale, so weird debug etc
CPUFlag_XScaleJTAGconnected * 1:SHL:19 ; JTAG has been connected
CPUFlag_HiProcVecs * 1:SHL:20 ; High processor vectors are in use
; OS_MMUControl reason codes
^ 0
MMUCReason_ModifyControl # 1
MMUCReason_Flush # 1
MMUCReason_GetARMop # 1
MMUCReason_Unknown # 0
; These are the ARMops exposed by OS_MMUControl 2
^ 0
ARMop_Cache_CleanInvalidateAll # 1 ; 0
ARMop_Cache_CleanAll # 1 ; 1
ARMop_Cache_InvalidateAll # 1 ; 2
ARMop_Cache_RangeThreshold # 1 ; 3
ARMop_TLB_InvalidateAll # 1 ; 4
ARMop_TLB_InvalidateEntry # 1 ; 5
ARMop_DSB_ReadWrite # 1 ; 6
ARMop_IMB_Full # 1 ; 7
ARMop_IMB_Range # 1 ; 8
ARMop_IMB_List # 1 ; 9
ARMop_MMU_Changing # 1 ; 10
ARMop_MMU_ChangingEntry # 1 ; 11
ARMop_MMU_ChangingUncached # 1 ; 12
ARMop_MMU_ChangingUncachedEntry # 1 ; 13
ARMop_MMU_ChangingEntries # 1 ; 14
ARMop_MMU_ChangingUncachedEntries # 1 ; 15
ARMop_DSB_Write # 1 ; 16
ARMop_DSB_Read # 1 ; 17
ARMop_DMB_ReadWrite # 1 ; 18
ARMop_DMB_Write # 1 ; 19
ARMop_DMB_Read # 1 ; 20
ARMop_Max # 0
END
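Since the Makefile also exports this header to C via Hdr2H, a C client would see the ARMop indices as plain constants. The enum below is a hand-written sketch of those values, taken from the numbering comments in the list above; the exact form of the generated header is an assumption and may differ. `armop_is_barrier` is a hypothetical helper, included only to show that the barrier ops are not contiguous: DSB_ReadWrite keeps the old WriteBuffer_Drain index 6, while the new ops are appended from 16.

```c
#include <assert.h>

/* ARMop indices as listed in hdr/OSMisc (values from the comments there). */
enum {
    ARMop_Cache_CleanInvalidateAll    = 0,
    ARMop_Cache_CleanAll              = 1,
    ARMop_Cache_InvalidateAll         = 2,
    ARMop_Cache_RangeThreshold        = 3,
    ARMop_TLB_InvalidateAll           = 4,
    ARMop_TLB_InvalidateEntry         = 5,
    ARMop_DSB_ReadWrite               = 6,   /* formerly WriteBuffer_Drain */
    ARMop_IMB_Full                    = 7,
    ARMop_IMB_Range                   = 8,
    ARMop_IMB_List                    = 9,
    ARMop_MMU_Changing                = 10,
    ARMop_MMU_ChangingEntry           = 11,
    ARMop_MMU_ChangingUncached        = 12,
    ARMop_MMU_ChangingUncachedEntry   = 13,
    ARMop_MMU_ChangingEntries         = 14,
    ARMop_MMU_ChangingUncachedEntries = 15,
    ARMop_DSB_Write                   = 16,
    ARMop_DSB_Read                    = 17,
    ARMop_DMB_ReadWrite               = 18,
    ARMop_DMB_Write                   = 19,
    ARMop_DMB_Read                    = 20,
    ARMop_Max                                /* one past the last valid index */
};

/* Compile-time check that the sketch matches the header's count. */
static_assert(ARMop_Max == 21, "expected 21 ARMop indices");

/* Hypothetical helper: nonzero if the index names a memory barrier op. */
static int armop_is_barrier(int index)
{
    return index == ARMop_DSB_ReadWrite ||
           (index >= ARMop_DSB_Write && index <= ARMop_DMB_Read);
}
```

Keeping index 6 for DSB_ReadWrite means existing callers that requested the write buffer drain op by number continue to get a full barrier.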
......@@ -422,12 +422,6 @@ SSETMEMC ROUT
; out: r0 = ARMop function ptr
;
^ 0
MMUCReason_ModifyControl # 1 ; reason code 0
MMUCReason_Flush # 1 ; reason code 1
MMUCReason_GetARMop # 1
MMUCReason_Unknown # 0
MMUControlSWI Entry
BL MMUControlSub
PullEnv
......@@ -529,7 +523,7 @@ MMUControl_Flush
TST r10,#&40000000
ARMop TLB_InvalidateAll,NE,,r12
TST r10,#&10000000
ARMop WriteBuffer_Drain,NE,,r12
ARMop DSB_ReadWrite,NE,,r12
ADDS r0,r10,#0
Pull "pc"
......
......@@ -230,14 +230,19 @@ WeirdARMPanic
Analyse_ARMv3
ADRL a1, NullOp
ADRL a2, Cache_Invalidate_ARMv3
ADRL a3, WriteBuffer_Drain_ARMv3
ADRL a3, DSB_ReadWrite_ARMv3
ADRL a4, TLB_Invalidate_ARMv3
ADRL ip, TLB_InvalidateEntry_ARMv3
STR a1, [v6, #Proc_Cache_CleanAll]
STR a2, [v6, #Proc_Cache_CleanInvalidateAll]
STR a2, [v6, #Proc_Cache_InvalidateAll]
STR a3, [v6, #Proc_WriteBuffer_Drain]
STR a3, [v6, #Proc_DSB_ReadWrite]
STR a3, [v6, #Proc_DSB_Write]
STR a1, [v6, #Proc_DSB_Read]
STR a3, [v6, #Proc_DMB_ReadWrite]
STR a3, [v6, #Proc_DMB_Write]
STR a1, [v6, #Proc_DMB_Read]
STR a4, [v6, #Proc_TLB_InvalidateAll]
STR ip, [v6, #Proc_TLB_InvalidateEntry]
STR a1, [v6, #Proc_IMB_Full]
......@@ -270,15 +275,20 @@ Analyse_WriteThroughUnified
ADRL a1, NullOp
ADRL a2, Cache_InvalidateUnified
TST v5, #CPUFlag_NoWBDrain
ADRNEL a3, WriteBuffer_Drain_OffOn
ADREQL a3, WriteBuffer_Drain
ADRNEL a3, DSB_ReadWrite_OffOn
ADREQL a3, DSB_ReadWrite
ADRL a4, TLB_Invalidate_Unified
ADRL ip, TLB_InvalidateEntry_Unified
STR a1, [v6, #Proc_Cache_CleanAll]
STR a2, [v6, #Proc_Cache_CleanInvalidateAll]
STR a2, [v6, #Proc_Cache_InvalidateAll]
STR a3, [v6, #Proc_WriteBuffer_Drain]
STR a3, [v6, #Proc_DSB_ReadWrite]
STR a3, [v6, #Proc_DSB_Write]
STR a1, [v6, #Proc_DSB_Read]
STR a3, [v6, #Proc_DMB_ReadWrite]
STR a3, [v6, #Proc_DMB_Write]
STR a1, [v6, #Proc_DMB_Read]
STR a4, [v6, #Proc_TLB_InvalidateAll]
STR ip, [v6, #Proc_TLB_InvalidateEntry]
STR a1, [v6, #Proc_IMB_Full]
......@@ -333,8 +343,27 @@ Analyse_WB_CR7_LDa
ADRL a1, TLB_InvalidateEntry_WB_CR7_LDa
STR a1, [v6, #Proc_TLB_InvalidateEntry]
ADRL a1, WriteBuffer_Drain_WB_CR7_LDa
STR a1, [v6, #Proc_WriteBuffer_Drain]
[ MEMM_Type = "ARM600"
; <= ARMv5, just use the drain write buffer MCR
ADRL a1, DSB_ReadWrite_WB_CR7_LDa
ADRL a2, NullOp
STR a1, [v6, #Proc_DSB_ReadWrite]
STR a1, [v6, #Proc_DSB_Write]
STR a2, [v6, #Proc_DSB_Read]
STR a1, [v6, #Proc_DMB_ReadWrite]
STR a1, [v6, #Proc_DMB_Write]
STR a2, [v6, #Proc_DMB_Read]
|
; ARMv6(+), use the ARMv6 barrier MCRs
ADRL a1, DSB_ReadWrite_ARMv6
STR a1, [v6, #Proc_DSB_ReadWrite]
STR a1, [v6, #Proc_DSB_Write]
STR a1, [v6, #Proc_DSB_Read]
ADRL a1, DMB_ReadWrite_ARMv6
STR a1, [v6, #Proc_DMB_ReadWrite]
STR a1, [v6, #Proc_DMB_Write]
STR a1, [v6, #Proc_DMB_Read]
]
ADRL a1, IMB_Full_WB_CR7_LDa
STR a1, [v6, #Proc_IMB_Full]
......@@ -422,8 +451,14 @@ Analyse_WB_Crd
ADRL a1, TLB_InvalidateEntry_WB_Crd
STR a1, [v6, #Proc_TLB_InvalidateEntry]
ADRL a1, WriteBuffer_Drain_WB_Crd
STR a1, [v6, #Proc_WriteBuffer_Drain]
ADRL a1, DSB_ReadWrite_WB_Crd
ADRL a2, NullOp
STR a1, [v6, #Proc_DSB_ReadWrite]
STR a1, [v6, #Proc_DSB_Write]
STR a2, [v6, #Proc_DSB_Read]
STR a1, [v6, #Proc_DMB_ReadWrite]
STR a1, [v6, #Proc_DMB_Write]
STR a2, [v6, #Proc_DMB_Read]
ADRL a1, IMB_Full_WB_Crd
STR a1, [v6, #Proc_IMB_Full]
......@@ -495,8 +530,14 @@ Analyse_WB_Cal_LD
ADRL a1, TLB_InvalidateEntry_WB_Cal_LD
STR a1, [v6, #Proc_TLB_InvalidateEntry]
ADRL a1, WriteBuffer_Drain_WB_Cal_LD
STR a1, [v6, #Proc_WriteBuffer_Drain]
ADRL a1, DSB_ReadWrite_WB_Cal_LD
ADRL a2, NullOp ; Assuming barriers are only used for non-cacheable memory, a read barrier routine isn't necessary on XScale because all non-cacheable reads complete in-order with read/write accesses to other NC locations
STR a1, [v6, #Proc_DSB_ReadWrite]
STR a1, [v6, #Proc_DSB_Write]
STR a2, [v6, #Proc_DSB_Read]
STR a1, [v6, #Proc_DMB_ReadWrite]
STR a1, [v6, #Proc_DMB_Write]
STR a2, [v6, #Proc_DMB_Read]
ADRL a1, IMB_Full_WB_Cal_LD
STR a1, [v6, #Proc_IMB_Full]
......@@ -601,7 +642,7 @@ Analyse_WB_CR7_Lx
TST a1, #7
ADD a3, a3, #1
MOVNE a1, a1, LSR #3
CMP a3, #14 ; Stop after level 7 (even though an 8th level might exist on some CPUs?)
CMP a3, #Cache_Lx_MaxLevel ; Stop at the last level we support
ADD a2, a2, #4
BLT %BT10
......@@ -630,8 +671,17 @@ Analyse_WB_CR7_Lx
ADRL a1, TLB_InvalidateEntry_WB_CR7_Lx
STR a1, [v6, #Proc_TLB_InvalidateEntry]
ADRL a1, WriteBuffer_Drain_WB_CR7_Lx
STR a1, [v6, #Proc_WriteBuffer_Drain]
ADRL a1, DSB_ReadWrite_ARMv7
ADRL a2, DSB_Write_ARMv7
STR a1, [v6, #Proc_DSB_ReadWrite]
STR a2, [v6, #Proc_DSB_Write]
STR a1, [v6, #Proc_DSB_Read]
ADRL a1, DMB_ReadWrite_ARMv7
ADRL a2, DMB_Write_ARMv7
STR a1, [v6, #Proc_DMB_ReadWrite]
STR a2, [v6, #Proc_DMB_Write]
STR a1, [v6, #Proc_DMB_Read]
ADRL a1, IMB_Full_WB_CR7_Lx
STR a1, [v6, #Proc_IMB_Full]
......@@ -1009,7 +1059,7 @@ Cache_Examine_Simple
LDR r2, [r4, #DCache_Size]
LDRB r3, [r4, #ICache_LineLen]
LDR r4, [r4, #ICache_Size]
MOV pc, lr
NullOp MOV pc, lr
[ MEMM_Type = "ARM600"
......@@ -1022,9 +1072,9 @@ Cache_Examine_Simple
Cache_Invalidate_ARMv3
MCR p15, 0, a1, c7, c0
NullOp MOV pc, lr
MOV pc, lr
WriteBuffer_Drain_ARMv3
DSB_ReadWrite_ARMv3
;swap always forces unbuffered write, stalling till WB empty
SUB sp, sp, #4
SWP a1, a1, [sp]
......@@ -1113,7 +1163,7 @@ Cache_InvalidateUnified
MCR p15, 0, a1, c7, c7
MOV pc, lr
WriteBuffer_Drain_OffOn
DSB_ReadWrite_OffOn
; used if ARM has no drain WBuffer MCR op
Push "a2"
ARM_read_control a1
......@@ -1123,7 +1173,7 @@ WriteBuffer_Drain_OffOn
Pull "a2"
MOV pc, lr
WriteBuffer_Drain
DSB_ReadWrite
; used if ARM has proper drain WBuffer MCR op
MOV a1, #0
MCR p15, 0, a1, c7, c10, 4
......@@ -1301,7 +1351,7 @@ MMU_ChangingUncachedEntry_WB_CR7_LDa
MOV pc, lr
WriteBuffer_Drain_WB_CR7_LDa ROUT
DSB_ReadWrite_WB_CR7_LDa ROUT
MOV a1, #0
MCR p15, 0, a1, c7, c10, 4 ; drain WBuffer
MOV pc, lr
......@@ -1553,7 +1603,7 @@ MMU_ChangingUncachedEntry_WB_Crd
MCR p15, 0, a1, c8, c5, 0 ;flush ITLB
MOV pc, lr
WriteBuffer_Drain_WB_Crd
DSB_ReadWrite_WB_Crd
MCR p15, 0, a1, c7, c10, 4 ;drain WBuffer
MOV pc, lr
......@@ -1853,7 +1903,7 @@ MMU_ChangingUncachedEntry_WB_Cal_LD
MOV pc, lr
WriteBuffer_Drain_WB_Cal_LD ROUT
DSB_ReadWrite_WB_Cal_LD ROUT
MCR p15, 0, a1, c7, c10, 4 ; drain WBuffer (waits, so no need for CPWAIT)
MOV pc, lr
......@@ -2267,12 +2317,6 @@ TLB_InvalidateEntry_WB_CR7_Lx ROUT
MOV pc, lr
WriteBuffer_Drain_WB_CR7_Lx ROUT
myDSB ,a1 ; DSB is the new name for write buffer draining
myISB ,a1,,y ; Also do ISB for extra paranoia
MOV pc, lr
IMB_Full_WB_CR7_Lx ROUT
;
; do: clean DCache; drain WBuffer, invalidate ICache/branch predictor
......@@ -2614,16 +2658,54 @@ Cache_Examine_PL310 ROUT
MOV r4, r2
MOV pc, lr
WriteBuffer_Drain_PL310 ROUT
DSB_ReadWrite_PL310 ROUT
Entry
LDR lr, =ZeroPage
LDR lr, [lr, #Cache_HALDevice]
LDR lr, [lr, #HALDevice_Address]
; Drain ARM write buffer
myDSB ,a1,"SY"
; Drain PL310 write buffer
PL310Sync lr, a1
        ; Additional barrier necessary here, to prevent reads from occurring before the PL310Sync has completed. DMB should be sufficient, rather than DSB.
myDMB ,a1,"SY"
EXIT
DSB_Write_PL310 ROUT
Entry
LDR lr, =ZeroPage
LDR lr, [lr, #Cache_HALDevice]
LDR lr, [lr, #HALDevice_Address]
; Drain ARM write buffer
myDSB ,a1,"ST"
; Drain PL310 write buffer
PL310Sync lr, a1
; Assume that no barrier needed here (we don't care about reads, and there's surely no such thing as a speculative write)
EXIT
DMB_ReadWrite_PL310 ROUT
Entry
LDR lr, =ZeroPage
LDR lr, [lr, #Cache_HALDevice]
LDR lr, [lr, #HALDevice_Address]
; Drain ARM write buffer
myDMB ,a1,"SY"
; Drain PL310 write buffer
PL310Sync lr, a1
        ; Additional barrier necessary here, to prevent reads from occurring before the PL310Sync has completed
myDMB ,a1,"SY"
EXIT
DMB_Write_PL310 ROUT
Entry
LDR lr, =ZeroPage
LDR lr, [lr, #Cache_HALDevice]
LDR lr, [lr, #HALDevice_Address]
; Drain ARM write buffer
myDSB ,a1 ; DSB is the new name for write buffer draining
myISB ,a1,,y ; Also do ISB for extra paranoia
myDMB ,a1,"ST"
; Drain PL310 write buffer
PL310Sync lr, a1
; Assume that no barrier needed here (we don't care about reads, and there's surely no such thing as a speculative write)
EXIT
MMU_Changing_PL310 ROUT
......@@ -2730,6 +2812,40 @@ MMU_ChangingEntries_PL310 ROUT
myISB ,a1,,y ; Ensure that the effects are visible