Commit b0682acb authored by Jeffrey Lee

Cache maintenance fixes

Detail:
  This set of changes tackles two main issues:
  * Before mapping out a cacheable page or making it uncacheable, the OS performs a cache clean+invalidate op. However this leaves a small window where data may be fetched back into the cache, either accidentally (dodgy interrupt handler) or via aggressive prefetch (as allowed for by the architecture). This rogue data can then result in coherency issues once the pages are mapped out or made uncacheable a short time later.
    The fix for this is to make the page uncacheable before performing the cache maintenance (although this isn't ideal, as prior to ARMv7 it's implementation defined whether address-based cache maintenance ops affect uncacheable pages or not - and on ARM11 it seems that they don't, so for that CPU we currently force a full cache clean instead). See the C sketch after the file list below.
  * Modern ARMs generally ignore unexpected cache hits, so there's an interrupt hole in the current OS_Memory 0 "make temporarily uncacheable" implementation, where the cache is flushed after the page has been made uncacheable (consider a page that's in use by an interrupt handler but is being made uncacheable so it can also be used by DMA). As well as affecting ARMv7+ devices, this was found to affect XScale; ARM11 was untested for this issue, but would presumably have suffered from the "can't clean uncacheable pages" limitation instead.
    The fix for this is to disable IRQs around the uncache sequence - however FIQs are not currently dealt with, so there's still a potential issue there.
  File changes:
  - Docs/HAL/ARMop_API, hdr/KernelWS, hdr/OSMisc - Add new Cache_CleanInvalidateRange ARMop
  - s/ARM600, s/VMSAv6 - BangCam updated to make the page uncacheable prior to flushing the cache. Add GetTempUncache macro to help with calculating the page flags required for making pages uncacheable. Fix abort in OS_MMUControl on Raspberry Pi - MCR-based ISB was resetting ZeroPage pointer to 0
  - s/ARMops - Cache_CleanInvalidateRange implementations. PL310 MMU_ChangingEntry/MMU_ChangingEntries refactored to rely on Cache_CleanInvalidateRange_PL310, which should be a more optimal implementation of the cache cleaning code that was previously in MMU_ChangingEntry_PL310.
  - s/ChangeDyn - Rename FastCDA_UpFront to FastCDA_Bulk, since the cache maintenance is no longer performed upfront. CheckCacheabilityR0ByMinusR2 now becomes RemoveCacheabilityR0ByMinusR2. PMP LogOp implementation refactored quite a bit to perform cache/TLB maintenance after making page table changes instead of before. One flaw with this new implementation is that mapping out large areas of cacheable pages will result in multiple full cache cleans while the old implementation would have (generally) only performed one - a two-pass approach over the page list would be needed to solve this.
  - s/GetAll - Change file ordering so GetTempUncache macro is available earlier
  - s/HAL - ROM decompression changed to do full MMU_Changing instead of MMU_ChangingEntries, to make sure earlier cached data is truly gone from the cache. ClearPhysRAM changed to make page uncacheable before flushing cache.
  - s/MemInfo - OS_Memory 0 interrupt hole fix
  - s/AMBControl/memmap - AMB_movepagesout_L2PT now split into cacheable+non-cacheable variants. Sparse map out operation now does two passes through the page list so that they can all be made uncacheable prior to the cache flush + map out.
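  In outline, the revised ordering used when mapping out a cacheable page (as in BangCam/BangL2PT and the AMB map-out paths above) is: make the page uncacheable, flush its TLB entry, clean+invalidate the cache for that range, then write the new mapping and flush the TLB again. The C sketch below illustrates only this ordering; the helper functions and the L2_CACHEABLE_BITS mask are hypothetical stand-ins rather than kernel APIs, and the OS_Memory 0 fix additionally runs the equivalent sequence with IRQs disabled.

    /* Illustrative C sketch of the new map-out ordering (hypothetical helpers,
       not the kernel's real API). The point is that the page is made
       uncacheable and its TLB entry flushed *before* the clean+invalidate, so
       prefetch or a stray interrupt handler can't refill the cache afterwards. */
    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t l2pt_entry;

    #define PAGE_SIZE         4096u
    #define L2_CACHEABLE_BITS 0x0Cu  /* placeholder for the X/C/B attribute bits */

    /* Stand-ins for the kernel's page table write and ARMop operations */
    static void write_l2pt(l2pt_entry *e, l2pt_entry value)            { *e = value; }
    static void tlb_flush_entry(uintptr_t log_addr)                    { (void)log_addr; }
    static void cache_clean_invalidate_range(uintptr_t s, uintptr_t e) { (void)s; (void)e; }

    static void map_out_cacheable_page(l2pt_entry *entry, uintptr_t log_addr,
                                       l2pt_entry new_mapping)
    {
        write_l2pt(entry, *entry & ~(l2pt_entry)L2_CACHEABLE_BITS);   /* 1. make uncacheable  */
        tlb_flush_entry(log_addr);                                    /* 2. flush TLB entry   */
        cache_clean_invalidate_range(log_addr, log_addr + PAGE_SIZE); /* 3. clean+invalidate  */
        write_l2pt(entry, new_mapping);                               /* 4. write new mapping */
        tlb_flush_entry(log_addr);                                    /* 5. flush TLB again   */
    }

    int main(void)
    {
        l2pt_entry entry = 0x12345000u | L2_CACHEABLE_BITS;
        map_out_cacheable_page(&entry, 0x8000u, 0u /* 0 = translation fault */);
        printf("final L2PT entry: %#x\n", (unsigned)entry);
        return 0;
    }

  Prior to ARMv7, step 3 is only guaranteed to work on pages that are still cacheable, so on ARM11 the kernel currently falls back to a full cache clean instead of the range operation.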
Admin:
  Tested on StrongARM, XScale, ARM11, Cortex-A7, Cortex-A9, Cortex-A15, Cortex-A53
  Appears to fix the major issues plaguing SATA on IGEPv5


Version 5.35, 4.79.2.306. Tagged as 'Kernel-5_35-4_79_2_306'
parent eb908a1e
......@@ -202,6 +202,25 @@ that are not involved in any currently active interrupts. In other words, it
is expected and desirable that interrupts remain enabled during any extended
clean operation, in order to avoid impact on interrupt latency.
-- Cache_CleanInvalidateRange
The cache or caches are to be invalidated for (at least) the given range, with
cleaning of any writeback data being properly performed.
entry: r0 = logical address of start of range
r1 = logical address of end of range (exclusive)
Note that r0 and r1 are aligned on cache line boundaries
exit: -
Note that any write buffer draining should also be performed by this
operation, so that memory is fully updated with respect to any writeback
data.
The OS only expects the invalidation to be with respect to instructions/data
that are not involved in any currently active interrupts. In other words, it
is expected and desirable that interrupts remain enabled during any extended
clean operation, in order to avoid impact on interrupt latency.
-- Cache_CleanAll
The unified cache or data cache are to be globally cleaned (any writeback data
......
......@@ -13,11 +13,11 @@
GBLS Module_ComponentPath
Module_MajorVersion SETS "5.35"
Module_Version SETA 535
Module_MinorVersion SETS "4.79.2.305"
Module_Date SETS "29 Feb 2016"
Module_ApplicationDate SETS "29-Feb-16"
Module_MinorVersion SETS "4.79.2.306"
Module_Date SETS "10 Mar 2016"
Module_ApplicationDate SETS "10-Mar-16"
Module_ComponentName SETS "Kernel"
Module_ComponentPath SETS "castle/RiscOS/Sources/Kernel"
Module_FullVersion SETS "5.35 (4.79.2.305)"
Module_HelpVersion SETS "5.35 (29 Feb 2016) 4.79.2.305"
Module_FullVersion SETS "5.35 (4.79.2.306)"
Module_HelpVersion SETS "5.35 (10 Mar 2016) 4.79.2.306"
END
......@@ -5,19 +5,19 @@
*
*/
#define Module_MajorVersion_CMHG 5.35
#define Module_MinorVersion_CMHG 4.79.2.305
#define Module_Date_CMHG 29 Feb 2016
#define Module_MinorVersion_CMHG 4.79.2.306
#define Module_Date_CMHG 10 Mar 2016
#define Module_MajorVersion "5.35"
#define Module_Version 535
#define Module_MinorVersion "4.79.2.305"
#define Module_Date "29 Feb 2016"
#define Module_MinorVersion "4.79.2.306"
#define Module_Date "10 Mar 2016"
#define Module_ApplicationDate "29-Feb-16"
#define Module_ApplicationDate "10-Mar-16"
#define Module_ComponentName "Kernel"
#define Module_ComponentPath "castle/RiscOS/Sources/Kernel"
#define Module_FullVersion "5.35 (4.79.2.305)"
#define Module_HelpVersion "5.35 (29 Feb 2016) 4.79.2.305"
#define Module_FullVersion "5.35 (4.79.2.306)"
#define Module_HelpVersion "5.35 (10 Mar 2016) 4.79.2.306"
#define Module_LibraryVersionInfo "5:35"
......@@ -1234,6 +1234,7 @@ ProcessorFlags # 4 ; Processor flags (IMB, Arch4 etc)
MMU_PCBTrans # 4
Proc_Cache_CleanInvalidateAll # 4
Proc_Cache_CleanInvalidateRange # 4
Proc_Cache_CleanAll # 4
Proc_Cache_InvalidateAll # 4
Proc_Cache_RangeThreshold # 4
......
......@@ -77,6 +77,7 @@ ARMop_DSB_Read # 1 ; 17
ARMop_DMB_ReadWrite # 1 ; 18
ARMop_DMB_Write # 1 ; 19
ARMop_DMB_Read # 1 ; 20
ARMop_Cache_CleanInvalidateRange # 1 ; 21
ARMop_Max # 0
END
......@@ -406,17 +406,80 @@ AMB_movepagesout_CAM ROUT
; ----------------------------------------------------------------------------------
;
;AMB_movepagesout_L2PT
;AMB_movecacheablepagesout_L2PT
;
;updates L2PT for old logical page positions, does not update CAM
;
; entry:
; r3 = old page flags
; r4 = old logical address of 1st page
; r8 = number of pages
;
AMB_movepagesout_L2PT ROUT
Push "r0-r8,lr"
AMB_movecacheablepagesout_L2PT
Entry "r0-r8"
; Calculate L2PT flags needed to make the pages uncacheable
; Assume all pages will have identical flags (or at least close enough)
LDR lr,=ZeroPage
LDR lr,[lr, #MMU_PCBTrans]
GetTempUncache r0, r3, lr, r1
LDR r1, =TempUncache_L2PTMask
LDR lr,=L2PT
ADD lr,lr,r4,LSR #(Log2PageSize-2) ;lr -> L2PT 1st entry
CMP r8,#4
BLT %FT20
10
LDMIA lr,{r2-r5}
BIC r2,r2,r1
BIC r3,r3,r1
BIC r4,r4,r1
BIC r5,r5,r1
ORR r2,r2,r0
ORR r3,r3,r0
ORR r4,r4,r0
ORR r5,r5,r0
STMIA lr!,{r2-r5}
SUB r8,r8,#4
CMP r8,#4
BGE %BT10
20
CMP r8,#0
BEQ %FT35
30
LDR r2,[lr]
BIC r2,r2,r1
ORR r2,r2,r0
STR r2,[lr],#4
SUBS r8,r8,#1
BNE %BT30
35
FRAMLDR r0,,r4 ;address of 1st page
FRAMLDR r1,,r8 ;number of pages
LDR r3,=ZeroPage
ARMop MMU_ChangingUncachedEntries,,,r3 ;flush TLB
FRAMLDR r0,,r4
FRAMLDR r1,,r8
ADD r1,r0,r1,LSL #Log2PageSize
ARMop Cache_CleanInvalidateRange,,,r3 ;flush from cache
FRAMLDR r4
FRAMLDR r8
B %FT55 ; -> moveuncacheablepagesout_L2PT (avoid pop+push of large stack frame)
; ----------------------------------------------------------------------------------
;
;AMB_moveuncacheablepagesout_L2PT
;
;updates L2PT for old logical page positions, does not update CAM
;
; entry:
; r4 = old logical address of 1st page
; r8 = number of pages
;
AMB_moveuncacheablepagesout_L2PT
ALTENTRY
55 ; Enter here from moveuncacheablepagesout
LDR lr,=L2PT
ADD lr,lr,r4,LSR #(Log2PageSize-2) ;lr -> L2PT 1st entry
......@@ -430,32 +493,25 @@ AMB_movepagesout_L2PT ROUT
MOV r7,#0
CMP r8,#8
BLT %FT20
10
BLT %FT70
60
STMIA lr!,{r0-r7} ;blam! (8 entries)
SUB r8,r8,#8
CMP r8,#8
BGE %BT10
20
BGE %BT60
70
CMP r8,#0
BEQ %FT35
30
BEQ %FT85
80
STR r0,[lr],#4
SUBS r8,r8,#1
BNE %BT30
35
[ MEMM_Type = "VMSAv6"
; In order to guarantee that the result of a page table write is
; visible, the ARMv6+ memory order model requires us to perform TLB
; maintenance (equivalent to the MMU_ChangingUncached ARMop) after we've
; performed the write. Performing the maintenance beforehand (as we've
; done traditionally) will work most of the time, but not always.
LDR r0, [sp, #4*4]
LDR r1, [sp, #8*4]
LDR r2, =ZeroPage
ARMop MMU_ChangingUncachedEntries,,,r2
]
Pull "r0-r8,pc"
BNE %BT80
85
FRAMLDR r0,,r4 ;address of 1st page
FRAMLDR r1,,r8 ;number of pages
LDR r3,=ZeroPage
ARMop MMU_ChangingUncachedEntries,,,r3 ;no cache worries, hoorah
EXIT
; ----------------------------------------------------------------------------------
;
......@@ -510,12 +566,8 @@ AMB_SetMemMapEntries ROUT
LDR r2,[r10] ;page number of 1st page
LDR r7,[r7,#CamEntriesPointer] ;r7 -> CAM
ADD r1,r7,r2,LSL #CAM_EntrySizeLog2 ;r1 -> CAM entry for 1st page
[ AMB_LimpidFreePool
LDR r4,[r1,#CAM_LogAddr] ;fetch old logical addr. of 1st page from CAM
LDR r3,[r1,#CAM_PageFlags] ;fetch old PPL of 1st page from CAM
|
LDR r4,[r1,#CAM_LogAddr] ;fetch old logical addr. of 1st page from CAM
]
CMP r5,#-1
BEQ AMB_smme_mapout
......@@ -534,24 +586,16 @@ AMB_SetMemMapEntries ROUT
;
;this should be map FreePool -> App Space then
;
MOV r0,r4 ;address of 1st page
MOV r1,r8 ;number of pages
LDR r3,=ZeroPage
ARMop MMU_ChangingUncachedEntries,,,r3 ;no cache worries, hoorah
MOV r3,r5
BL AMB_movepagesout_L2PT ;unmap 'em from where they are
BL AMB_moveuncacheablepagesout_L2PT ;unmap 'em from where they are
BL AMB_movepagesin_L2PT ;map 'em to where they now be
BL AMB_movepagesin_CAM ;keep the bloomin' soft CAM up to date
Pull "r0-r4,r7-r11, pc"
AMB_smme_mapnotlimpid
]
;
MOV r0,r4 ;address of 1st page
MOV r1,r8 ;number of pages
LDR r3,=ZeroPage
ARMop MMU_ChangingEntries,,,r3 ;
BL AMB_movecacheablepagesout_L2PT
MOV r3,r5
BL AMB_movepagesout_L2PT
BL AMB_movepagesin_L2PT
BL AMB_movepagesin_CAM
Pull "r0-r4,r7-r11, pc"
......@@ -567,14 +611,10 @@ AMB_smme_mapin
;all pages destined for same new logical page Nowhere, ie. mapping them out
;
AMB_smme_mapout
LDR r3,=DuffEntry
CMP r4,r3
LDR lr,=DuffEntry
CMP r4,lr
BEQ %FT50 ;pages already mapped out - just update CAM for new ownership
MOV r0,r4 ;address of 1st page
MOV r1,r8 ;number of pages
LDR r3,=ZeroPage
ARMop MMU_ChangingEntries,,,r3 ;
BL AMB_movepagesout_L2PT
BL AMB_movecacheablepagesout_L2PT
50
BL AMB_movepagesout_CAM
......@@ -618,24 +658,28 @@ AMB_SetMemMapEntries_SparseMapOut ROUT
CMP r3,#0
MOVEQ pc,lr
Push "r0-r11,lr"
Entry "r0-r11"
;to do this safely we need to do it in two passes
;first pass makes pages uncacheable
;second pass unmaps them
;n.b. like most of the AMB code, this assumes nothing will trigger an abort-based lazy map in operation while we're in the middle of this processing!
;
;pass one: make pages uncacheable (preserve bitmap, CAM)
;
MOV r10,r4 ;ptr to page list
LDR r2,=ZeroPage
MOV r9,#AP_Duff ;permissions for DuffEntry
LDR r7,[r2,#CamEntriesPointer] ;r7 -> CAM
MOV r4,#ApplicationStart ;log. address of first page
LDR r1,=DuffEntry ;means Nowhere, in CAM
;if the number of pages mapped in is small enough, we'll do cache/TLB coherency on
;just those pages, else global (performance decision, threshold probably not critical)
MOV r9,#-1 ;initialised to correct page flags once we find a mapped in page
;decide if we want to do TLB coherency as we go
ARMop Cache_RangeThreshold,,,r2 ;returns threshold (bytes) in r0
CMP r3,r0,LSR #Log2PageSize
MOVLO r6,#0 ;r6 := 0 if we are to do coherency as we go
BLO %FT10 ;let's do it
ARMop MMU_Changing,,,r2 ;global coherency
B %FT10
;skip next 32 pages then continue
......@@ -643,7 +687,7 @@ AMB_SetMemMapEntries_SparseMapOut ROUT
ADD r10,r10,#32*4
ADD r4,r4,#32*PageSize
;find the sparsely mapped pages, map them out, doing coherency as we go if enabled
;find the sparsely mapped pages, make them uncacheable, doing coherency as we go if enabled
10
MOV r8,#1 ;initial bitmap mask for new bitmap word
LDR r11,[r5],#4 ;next word of bitmap
......@@ -652,12 +696,82 @@ AMB_SetMemMapEntries_SparseMapOut ROUT
12
TST r11,r8 ;page is currently mapped in if bit set
BEQ %FT16
TEQ r6, #0
BNE %FT14 ;check for coherency as we go
CMP r9,#-1 ;have page flags yet?
BNE %FT14
LDR r0,[r10] ;page no.
ADD r0,r7,r0,LSL #CAM_EntrySizeLog2 ;r0 -> CAM entry for page
LDR r0,[r0,#CAM_PageFlags]
LDR r2,=ZeroPage
MOV r0,r4 ;address of page
ARMop MMU_ChangingEntry,,,r2
LDR r2,[r2,#MMU_PCBTrans]
GetTempUncache r9,r0,r2,lr
14
LDR lr,=L2PT ;lr -> L2PT
LDR r2,[lr,r4,LSR #(Log2PageSize-2)] ;L2PT entry for page
LDR r0,=TempUncache_L2PTMask
BIC r2,r2,r0
ORR r2,r2,r9
STR r2,[lr,r4,LSR #(Log2PageSize-2)] ;make uncacheable
TEQ r6, #0
BNE %FT15
LDR r2,=ZeroPage
MOV r0,r4
ARMop MMU_ChangingUncachedEntry,,,r2 ;flush TLB
MOV r0,r4
ADD r1,r0,#PageSize
ARMop Cache_CleanInvalidateRange,,,r2 ;flush from cache
15
SUBS r3,r3,#1
BEQ %FT40 ;done
16
ADD r10,r10,#4 ;next page no.
ADD r4,r4,#PageSize ;next logical address
MOVS r8,r8,LSL #1 ;if 32 bits processed...
BNE %BT12
B %BT10
40
TEQ r6, #0
BEQ %FT45
LDR r2,=ZeroPage
ARMop MMU_ChangingUncached,,,r2
ARMop Cache_CleanInvalidateAll,,,r2
45
;
;pass two: unmap pages (+ clear bitmap + update CAM)
;
FRAMLDR r3
FRAMLDR r5
FRAMLDR r10,,r4 ;ptr to page list
LDR r2,=ZeroPage
MOV r9,#AP_Duff ;permissions for DuffEntry
LDR r7,[r2,#CamEntriesPointer] ;r7 -> CAM
MOV r4,#ApplicationStart ;log. address of first page
LDR r1,=DuffEntry ;means Nowhere, in CAM
;decide if we want to do TLB coherency as we go
MOV r6, r3, LSR #5 ; r6!=0 if doing global coherency (32 entry TLB)
B %FT60
;skip next 32 pages then continue
56
ADD r10,r10,#32*4
ADD r4,r4,#32*PageSize
;find the sparsely mapped pages, map them out, doing coherency as we go if enabled
60
MOV r8,#1 ;initial bitmap mask for new bitmap word
LDR r11,[r5],#4 ;next word of bitmap
CMP r11,#0 ;if next 32 bits of bitmap clear, skip
BEQ %BT56 ;skip loop must terminate if r3 > 0
62
TST r11,r8 ;page is currently mapped in if bit set
BEQ %FT66
LDR r0,[r10] ;page no.
ADD r0,r7,r0,LSL #CAM_EntrySizeLog2 ;r0 -> CAM entry for page
ASSERT CAM_LogAddr=0
......@@ -666,14 +780,8 @@ AMB_SetMemMapEntries_SparseMapOut ROUT
LDR lr,=L2PT ;lr -> L2PT
MOV r2, #0
STR r2,[lr,r4,LSR #(Log2PageSize-2)] ;L2PT entry for page set to 0 (means translation fault)
[ MEMM_Type = "VMSAv6"
; In order to guarantee that the result of a page table write is
; visible, the ARMv6+ memory order model requires us to perform TLB
; maintenance (equivalent to the MMU_ChangingUncached ARMop) after we've
; performed the write. Performing the maintenance beforehand (as we've
; done traditionally) will work most of the time, but not always.
TEQ r6, #0
BNE %FT15
BNE %FT65
[ ZeroPage != 0
LDR r2,=ZeroPage
]
......@@ -682,36 +790,26 @@ AMB_SetMemMapEntries_SparseMapOut ROUT
[ ZeroPage != 0
MOV r2,#0
]
15
]
65
SUBS r3,r3,#1
STREQ r2,[r5,#-4] ;make sure we clear last word of bitmap, and...
BEQ %FT20 ;done
16
BEQ %FT90 ;done
66
ADD r10,r10,#4 ;next page no.
ADD r4,r4,#PageSize ;next logical address
MOVS r8,r8,LSL #1 ;if 32 bits processed...
BNE %BT12
BNE %BT62
MOV r2, #0
STR r2,[r5,#-4] ;zero word of bitmap we've just traversed
LDR r2,=ZeroPage
B %BT10
B %BT60
20
[ MEMM_Type = "VMSAv6"
; In order to guarantee that the result of a page table write is
; visible, the ARMv6+ memory order model requires us to perform TLB
; maintenance (equivalent to the MMU_ChangingUncached ARMop) after we've
; performed the write. Performing the maintenance beforehand (as we've
; done traditionally) will work most of the time, but not always.
90
TEQ r6, #0
BEQ %FT25
BEQ %FT95
LDR r2,=ZeroPage
ARMop MMU_ChangingUncached,,,r2
25
]
Pull "r0-r11,pc"
95
EXIT
; ----------------------------------------------------------------------------------
......
......@@ -26,6 +26,19 @@ DebugAborts SETL {FALSE}
GBLL UseProcessTransfer
UseProcessTransfer SETL {FALSE}
; Convert given page flags to the equivalent temp uncacheable L2PT flags
; n.b. temp not used here but included for VMSAv6 compatibility
MACRO
GetTempUncache $out, $pageflags, $pcbtrans, $temp
ASSERT DynAreaFlags_CPBits = 7*XCB_P :SHL: 10
ASSERT DynAreaFlags_NotCacheable = XCB_NC :SHL: 4
ASSERT DynAreaFlags_NotBufferable = XCB_NB :SHL: 4
AND $out, $pageflags, #DynAreaFlags_NotCacheable + DynAreaFlags_NotBufferable
ORR $out, $out, #DynAreaFlags_NotCacheable ; treat as temp uncache
LDRB $out, [$pcbtrans, $out, LSR #4] ; convert to X, C and B bits for this CPU
MEND
TempUncache_L2PTMask * L2_X+L2_C+L2_B
; MMU interface file - ARM600 version
......@@ -282,21 +295,36 @@ BangL2PT ; internal entry point used only
TST r11, #DynAreaFlags_DoublyMapped
BNE BangL2PT_sledgehammer ;if doubly mapped, don't try to be clever
;we sort out cache coherency _before_ remapping, because some ARMs might insist on
;that order (write back cache doing write backs to logical addresses)
;we need to worry about cache only if mapping out a cacheable page
;In order to safely map out a cacheable page and remove it from the
;cache, we need to perform the following process:
;* Make the page uncacheable
;* Flush TLB
;* Clean+invalidate cache
;* Write new mapping (r6)
;* Flush TLB
;For uncacheable pages we can just do the last two steps
;
TEQ r6, #0 ;EQ if mapping out
TSTEQ r11, #DynAreaFlags_NotCacheable ;EQ if also cacheable (overcautious for temp uncache+illegal PCB combos)
MOV r0, r3 ;MMU page entry address
ADR lr, %FT20
LDR r4, =ZeroPage
ARMop MMU_ChangingEntry, EQ, tailcall, r4
ARMop MMU_ChangingUncachedEntry, NE, tailcall, r4
BNE %FT20
LDR lr, [r4, #MMU_PCBTrans]
GetTempUncache r0, r11, lr
LDR lr, [r1, r3, LSR #10] ;get current L2PT entry
BIC lr, lr, #TempUncache_L2PTMask ;remove current attributes
ORR lr, lr, r0
STR lr, [r1, r3, LSR #10] ;Make uncacheable
MOV r0, r3
ARMop MMU_ChangingUncachedEntry,,, r4 ; TLB flush
MOV r0, r3
ADD r1, r3, #4096
ARMop Cache_CleanInvalidateRange,,, r4 ; Cache flush
LDR r1, =L2PT
20 STR r6, [r1, r3, LSR #10] ;update L2PT entry
Pull "pc"
Pull "lr"
MOV r0, r3
ARMop MMU_ChangingUncachedEntry,,tailcall,r4
BangL2PT_sledgehammer
......
......@@ -230,6 +230,7 @@ Analyse_ARMv3
STR a1, [v6, #Proc_Cache_CleanAll]
STR a2, [v6, #Proc_Cache_CleanInvalidateAll]
STR a2, [v6, #Proc_Cache_CleanInvalidateRange]
STR a2, [v6, #Proc_Cache_InvalidateAll]
STR a3, [v6, #Proc_DSB_ReadWrite]
STR a3, [v6, #Proc_DSB_Write]
......@@ -276,6 +277,7 @@ Analyse_WriteThroughUnified
STR a1, [v6, #Proc_Cache_CleanAll]
STR a2, [v6, #Proc_Cache_CleanInvalidateAll]
STR a2, [v6, #Proc_Cache_CleanInvalidateRange]
STR a2, [v6, #Proc_Cache_InvalidateAll]
STR a3, [v6, #Proc_DSB_ReadWrite]
STR a3, [v6, #Proc_DSB_Write]
......@@ -319,6 +321,9 @@ Analyse_WB_CR7_LDa
ADRL a1, Cache_CleanInvalidateAll_WB_CR7_LDa
STR a1, [v6, #Proc_Cache_CleanInvalidateAll]
ADRL a1, Cache_CleanInvalidateRange_WB_CR7_LDa
STR a1, [v6, #Proc_Cache_CleanInvalidateRange]
ADRL a1, Cache_CleanAll_WB_CR7_LDa
STR a1, [v6, #Proc_Cache_CleanAll]
......@@ -427,6 +432,9 @@ Analyse_WB_Crd
ADRL a1, Cache_CleanInvalidateAll_WB_Crd
STR a1, [v6, #Proc_Cache_CleanInvalidateAll]
ADRL a1, Cache_CleanInvalidateRange_WB_Crd
STR a1, [v6, #Proc_Cache_CleanInvalidateRange]
ADRL a1, Cache_CleanAll_WB_Crd
STR a1, [v6, #Proc_Cache_CleanAll]
......@@ -506,6 +514,9 @@ Analyse_WB_Cal_LD
ADRL a1, Cache_CleanInvalidateAll_WB_Cal_LD
STR a1, [v6, #Proc_Cache_CleanInvalidateAll]
ADRL a1, Cache_CleanInvalidateRange_WB_Cal_LD
STR a1, [v6, #Proc_Cache_CleanInvalidateRange]
ADRL a1, Cache_CleanAll_WB_Cal_LD
STR a1, [v6, #Proc_Cache_CleanAll]
......@@ -649,6 +660,9 @@ Analyse_WB_CR7_Lx
ADRL a1, Cache_CleanInvalidateAll_WB_CR7_Lx
STR a1, [v6, #Proc_Cache_CleanInvalidateAll]
ADRL a1, Cache_CleanInvalidateRange_WB_CR7_Lx
STR a1, [v6, #Proc_Cache_CleanInvalidateRange]
ADRL a1, Cache_CleanAll_WB_CR7_Lx
STR a1, [v6, #Proc_Cache_CleanAll]
......@@ -1337,6 +1351,45 @@ Cache_CleanInvalidateAll_WB_CR7_LDa ROUT
Pull "a2, ip"
MOV pc, lr
; a1 = start address (inclusive, cache line aligned)
; a2 = end address (exclusive, cache line aligned)
;
[ MEMM_Type = "ARM600"
Cache_CleanInvalidateRange_WB_CR7_LDa ROUT
Push "a2, a3, lr"
LDR lr, =ZeroPage
SUB a2, a2, a1
LDR a3, [lr, #DCache_RangeThreshold] ;check whether cheaper to do global clean
CMP a2, a3
BHS %FT30
ADD a2, a2, a1 ;clean end address (exclusive)
LDRB a3, [lr, #DCache_LineLen]
10
MCR p15, 0, a1, c7, c14, 1 ; clean&invalidate DCache entry
MCR p15, 0, a1, c7, c5, 1 ; invalidate ICache entry
ADD a1, a1, a3
CMP a1, a2
BLO %BT10
MOV a1, #0
MCR p15, 0, a1, c7, c10, 4 ; drain WBuffer
MCR p15, 0, a1, c7, c5, 6 ; flush branch predictors
Pull "a2, a3, pc"
;
30
Pull "a2, a3, lr"
B Cache_CleanInvalidateAll_WB_CR7_LDa
|
; Bodge for ARM11
; The OS assumes that address-based cache maintenance operations will operate
; on pages which are currently marked non-cacheable (so that we can make a page
; non-cacheable and then clean/invalidate the cache, to ensure prefetch or
; anything else doesn't pull any data for the page back into the cache once
; we've cleaned it). For ARMv7+ this is guaranteed behaviour, but prior to that
; it's implementation defined, and the ARM11 in particular seems to ignore