Commit b0682acb authored by Jeffrey Lee

Cache maintenance fixes

Detail:
  This set of changes tackles two main issues:
  * Before mapping out a cacheable page or making it uncacheable, the OS performs a cache clean+invalidate op. However, this leaves a small window where data may be fetched back into the cache, either accidentally (e.g. by a dodgy interrupt handler) or via aggressive prefetch (as allowed for by the architecture). This rogue data can then result in coherency issues once the pages are mapped out or made uncacheable a short time later.
    The fix for this is to make the page uncacheable before performing the cache maintenance (although this isn't ideal, as prior to ARMv7 it's implementation defined whether address-based cache maintenance ops affect uncacheable pages or not - and on ARM11 it seems that they don't, so for that CPU we currently force a full cache clean instead).
  * Modern ARMs generally ignore unexpected cache hits, so there's an interrupt hole in the current OS_Memory 0 "make temporarily uncacheable" implementation, where the cache is flushed after the page has been made uncacheable (consider the case of a page that's being used by an interrupt handler, but which is being made uncacheable so it can also be used by DMA). As well as affecting ARMv7+ devices, this was found to affect XScale (ARM11 is untested for this issue, but would presumably have suffered from the "can't clean uncacheable pages" limitation instead).
    The fix for this is to disable IRQs around the uncache sequence (see the sketch below) - however FIQs are currently not being dealt with, so there's still a potential issue there.
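  In outline, the sequence for safely taking a cacheable page out of use now looks like this (an illustrative sketch condensed from the ARM600 BangL2PT change in the diff below; the VMSAv6 variant differs in detail; r1 -> L2PT, r3 = logical address of the page, r11 = page flags, r4 -> ZeroPage, r6 = new L2PT entry value):
      LDR   lr, [r4, #MMU_PCBTrans]
      GetTempUncache r0, r11, lr            ; L2PT attribute bits for a temporarily uncacheable mapping
      LDR   lr, [r1, r3, LSR #10]           ; get current L2PT entry
      BIC   lr, lr, #TempUncache_L2PTMask
      ORR   lr, lr, r0
      STR   lr, [r1, r3, LSR #10]           ; 1. make the page uncacheable
      MOV   r0, r3
      ARMop MMU_ChangingUncachedEntry,,,r4  ; 2. flush the TLB so the uncacheable mapping takes effect
      MOV   r0, r3
      ADD   r1, r3, #4096
      ARMop Cache_CleanInvalidateRange,,,r4 ; 3. clean+invalidate; prefetch can no longer refill the lines
      LDR   r1, =L2PT
      STR   r6, [r1, r3, LSR #10]           ; 4. write the new mapping
      MOV   r0, r3
      ARMop MMU_ChangingUncachedEntry,,,r4  ; 5. flush the TLB again
  The OS_Memory 0 "make temporarily uncacheable" path performs steps 1-3 only (the uncacheable mapping is the end state), with IRQs disabled around them since the page may still be in use by an interrupt handler.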
  File changes:
  - Docs/HAL/ARMop_API, hdr/KernelWS, hdr/OSMisc - Add new Cache_CleanInvalidateRange ARMop
  - s/ARM600, s/VMSAv6 - BangCam updated to make the page uncacheable prior to flushing the cache. Add GetTempUncache macro to help with calculating the page flags required for making pages uncacheable. Fix abort in OS_MMUControl on Raspberry Pi - MCR-based ISB was resetting ZeroPage pointer to 0
  - s/ARMops - Cache_CleanInvalidateRange implementations. PL310 MMU_ChangingEntry/MMU_ChangingEntries refactored to rely on Cache_CleanInvalidateRange_PL310, which should be a more efficient implementation of the cache cleaning code that was previously in MMU_ChangingEntry_PL310.
  - s/ChangeDyn - Rename FastCDA_UpFront to FastCDA_Bulk, since the cache maintenance is no longer performed upfront. CheckCacheabilityR0ByMinusR2 now becomes RemoveCacheabilityR0ByMinusR2. PMP LogOp implementation refactored quite a bit to perform cache/TLB maintenance after making page table changes instead of before. One flaw with this new implementation is that mapping out large areas of cacheable pages will result in multiple full cache cleans while the old implementation would have (generally) only performed one - a two-pass approach over the page list would be needed to solve this.
  - s/GetAll - Change file ordering so GetTempUncache macro is available earlier
  - s/HAL - ROM decompression changed to do full MMU_Changing instead of MMU_ChangingEntries, to make sure earlier cached data is truly gone from the cache. ClearPhysRAM changed to make page uncacheable before flushing cache.
  - s/MemInfo - OS_Memory 0 interrupt hole fix
  - s/AMBControl/memmap - AMB_movepagesout_L2PT now split into cacheable+non-cacheable variants. Sparse map out operation now does two passes through the page list so that they can all be made uncacheable prior to the cache flush + map out.
Admin:
  Tested on StrongARM, XScale, ARM11, Cortex-A7, Cortex-A9, Cortex-A15, Cortex-A53
  Appears to fix the major issues plaguing SATA on IGEPv5


Version 5.35, 4.79.2.306. Tagged as 'Kernel-5_35-4_79_2_306'
parent eb908a1e
......@@ -202,6 +202,25 @@ that are not involved in any currently active interrupts. In other words, it
is expected and desirable that interrupts remain enabled during any extended
clean operation, in order to avoid impact on interrupt latency.
-- Cache_CleanInvalidateRange
The cache or caches are to be invalidated for (at least) the given range, with
cleaning of any writeback data being properly performed.
entry: r0 = logical address of start of range
r1 = logical address of end of range (exclusive)
Note that r0 and r1 are aligned on cache line boundaries
exit: -
Note that any write buffer draining should also be performed by this
operation, so that memory is fully updated with respect to any writeback
data.
The OS only expects the invalidation to be with respect to instructions/data
that are not involved in any currently active interrupts. In other words, it
is expected and desirable that interrupts remain enabled during any extended
clean operation, in order to avoid impact on interrupt latency.
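For illustration, a typical caller inside the kernel might look like the
following sketch (using the kernel's ARMop macro; the source register here is
arbitrary, and page-aligned addresses trivially satisfy the cache line
alignment requirement):
    LDR   r4, =ZeroPage
    MOV   r0, r5                              ; r5 = page-aligned start address
    ADD   r1, r5, #4096                       ; end address (exclusive)
    ARMop Cache_CleanInvalidateRange,,,r4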
-- Cache_CleanAll
The unified cache or data cache are to be globally cleaned (any writeback data
......
......@@ -13,11 +13,11 @@
GBLS Module_ComponentPath
Module_MajorVersion SETS "5.35"
Module_Version SETA 535
Module_MinorVersion SETS "4.79.2.305"
Module_Date SETS "29 Feb 2016"
Module_ApplicationDate SETS "29-Feb-16"
Module_MinorVersion SETS "4.79.2.306"
Module_Date SETS "10 Mar 2016"
Module_ApplicationDate SETS "10-Mar-16"
Module_ComponentName SETS "Kernel"
Module_ComponentPath SETS "castle/RiscOS/Sources/Kernel"
Module_FullVersion SETS "5.35 (4.79.2.305)"
Module_HelpVersion SETS "5.35 (29 Feb 2016) 4.79.2.305"
Module_FullVersion SETS "5.35 (4.79.2.306)"
Module_HelpVersion SETS "5.35 (10 Mar 2016) 4.79.2.306"
END
......@@ -5,19 +5,19 @@
*
*/
#define Module_MajorVersion_CMHG 5.35
#define Module_MinorVersion_CMHG 4.79.2.305
#define Module_Date_CMHG 29 Feb 2016
#define Module_MinorVersion_CMHG 4.79.2.306
#define Module_Date_CMHG 10 Mar 2016
#define Module_MajorVersion "5.35"
#define Module_Version 535
#define Module_MinorVersion "4.79.2.305"
#define Module_Date "29 Feb 2016"
#define Module_MinorVersion "4.79.2.306"
#define Module_Date "10 Mar 2016"
#define Module_ApplicationDate "29-Feb-16"
#define Module_ApplicationDate "10-Mar-16"
#define Module_ComponentName "Kernel"
#define Module_ComponentPath "castle/RiscOS/Sources/Kernel"
#define Module_FullVersion "5.35 (4.79.2.305)"
#define Module_HelpVersion "5.35 (29 Feb 2016) 4.79.2.305"
#define Module_FullVersion "5.35 (4.79.2.306)"
#define Module_HelpVersion "5.35 (10 Mar 2016) 4.79.2.306"
#define Module_LibraryVersionInfo "5:35"
......@@ -1234,6 +1234,7 @@ ProcessorFlags # 4 ; Processor flags (IMB, Arch4 etc)
MMU_PCBTrans # 4
Proc_Cache_CleanInvalidateAll # 4
Proc_Cache_CleanInvalidateRange # 4
Proc_Cache_CleanAll # 4
Proc_Cache_InvalidateAll # 4
Proc_Cache_RangeThreshold # 4
......
......@@ -77,6 +77,7 @@ ARMop_DSB_Read # 1 ; 17
ARMop_DMB_ReadWrite # 1 ; 18
ARMop_DMB_Write # 1 ; 19
ARMop_DMB_Read # 1 ; 20
ARMop_Cache_CleanInvalidateRange # 1 ; 21
ARMop_Max # 0
END
......@@ -406,17 +406,80 @@ AMB_movepagesout_CAM ROUT
; ----------------------------------------------------------------------------------
;
;AMB_movepagesout_L2PT
;AMB_movecacheablepagesout_L2PT
;
;updates L2PT for old logical page positions, does not update CAM
;
; entry:
; r3 = old page flags
; r4 = old logical address of 1st page
; r8 = number of pages
;
AMB_movepagesout_L2PT ROUT
Push "r0-r8,lr"
AMB_movecacheablepagesout_L2PT
Entry "r0-r8"
; Calculate L2PT flags needed to make the pages uncacheable
; Assume all pages will have identical flags (or at least close enough)
LDR lr,=ZeroPage
LDR lr,[lr, #MMU_PCBTrans]
GetTempUncache r0, r3, lr, r1
LDR r1, =TempUncache_L2PTMask
LDR lr,=L2PT
ADD lr,lr,r4,LSR #(Log2PageSize-2) ;lr -> L2PT 1st entry
CMP r8,#4
BLT %FT20
10
LDMIA lr,{r2-r5}
BIC r2,r2,r1
BIC r3,r3,r1
BIC r4,r4,r1
BIC r5,r5,r1
ORR r2,r2,r0
ORR r3,r3,r0
ORR r4,r4,r0
ORR r5,r5,r0
STMIA lr!,{r2-r5}
SUB r8,r8,#4
CMP r8,#4
BGE %BT10
20
CMP r8,#0
BEQ %FT35
30
LDR r2,[lr]
BIC r2,r2,r1
ORR r2,r2,r0
STR r2,[lr],#4
SUBS r8,r8,#1
BNE %BT30
35
FRAMLDR r0,,r4 ;address of 1st page
FRAMLDR r1,,r8 ;number of pages
LDR r3,=ZeroPage
ARMop MMU_ChangingUncachedEntries,,,r3 ;flush TLB
FRAMLDR r0,,r4
FRAMLDR r1,,r8
ADD r1,r0,r1,LSL #Log2PageSize
ARMop Cache_CleanInvalidateRange,,,r3 ;flush from cache
FRAMLDR r4
FRAMLDR r8
B %FT55 ; -> moveuncacheablepagesout_L2PT (avoid pop+push of large stack frame)
; ----------------------------------------------------------------------------------
;
;AMB_moveuncacheablepagesout_L2PT
;
;updates L2PT for old logical page positions, does not update CAM
;
; entry:
; r4 = old logical address of 1st page
; r8 = number of pages
;
AMB_moveuncacheablepagesout_L2PT
ALTENTRY
55 ; Enter here from moveuncacheablepagesout
LDR lr,=L2PT
ADD lr,lr,r4,LSR #(Log2PageSize-2) ;lr -> L2PT 1st entry
......@@ -430,32 +493,25 @@ AMB_movepagesout_L2PT ROUT
MOV r7,#0
CMP r8,#8
BLT %FT20
10
BLT %FT70
60
STMIA lr!,{r0-r7} ;blam! (8 entries)
SUB r8,r8,#8
CMP r8,#8
BGE %BT10
20
BGE %BT60
70
CMP r8,#0
BEQ %FT35
30
BEQ %FT85
80
STR r0,[lr],#4
SUBS r8,r8,#1
BNE %BT30
35
[ MEMM_Type = "VMSAv6"
; In order to guarantee that the result of a page table write is
; visible, the ARMv6+ memory order model requires us to perform TLB
; maintenance (equivalent to the MMU_ChangingUncached ARMop) after we've
; performed the write. Performing the maintenance beforehand (as we've
; done traditionally) will work most of the time, but not always.
LDR r0, [sp, #4*4]
LDR r1, [sp, #8*4]
LDR r2, =ZeroPage
ARMop MMU_ChangingUncachedEntries,,,r2
]
Pull "r0-r8,pc"
BNE %BT80
85
FRAMLDR r0,,r4 ;address of 1st page
FRAMLDR r1,,r8 ;number of pages
LDR r3,=ZeroPage
ARMop MMU_ChangingUncachedEntries,,,r3 ;no cache worries, hoorah
EXIT
; ----------------------------------------------------------------------------------
;
......@@ -510,12 +566,8 @@ AMB_SetMemMapEntries ROUT
LDR r2,[r10] ;page number of 1st page
LDR r7,[r7,#CamEntriesPointer] ;r7 -> CAM
ADD r1,r7,r2,LSL #CAM_EntrySizeLog2 ;r1 -> CAM entry for 1st page
[ AMB_LimpidFreePool
LDR r4,[r1,#CAM_LogAddr] ;fetch old logical addr. of 1st page from CAM
LDR r3,[r1,#CAM_PageFlags] ;fetch old PPL of 1st page from CAM
|
LDR r4,[r1,#CAM_LogAddr] ;fetch old logical addr. of 1st page from CAM
]
CMP r5,#-1
BEQ AMB_smme_mapout
......@@ -534,24 +586,16 @@ AMB_SetMemMapEntries ROUT
;
;this should be map FreePool -> App Space then
;
MOV r0,r4 ;address of 1st page
MOV r1,r8 ;number of pages
LDR r3,=ZeroPage
ARMop MMU_ChangingUncachedEntries,,,r3 ;no cache worries, hoorah
MOV r3,r5
BL AMB_movepagesout_L2PT ;unmap 'em from where they are
BL AMB_moveuncacheablepagesout_L2PT ;unmap 'em from where they are
BL AMB_movepagesin_L2PT ;map 'em to where they now be
BL AMB_movepagesin_CAM ;keep the bloomin' soft CAM up to date
Pull "r0-r4,r7-r11, pc"
AMB_smme_mapnotlimpid
]
;
MOV r0,r4 ;address of 1st page
MOV r1,r8 ;number of pages
LDR r3,=ZeroPage
ARMop MMU_ChangingEntries,,,r3 ;
BL AMB_movecacheablepagesout_L2PT
MOV r3,r5
BL AMB_movepagesout_L2PT
BL AMB_movepagesin_L2PT
BL AMB_movepagesin_CAM
Pull "r0-r4,r7-r11, pc"
......@@ -567,14 +611,10 @@ AMB_smme_mapin
;all pages destined for same new logical page Nowhere, ie. mapping them out
;
AMB_smme_mapout
LDR r3,=DuffEntry
CMP r4,r3
LDR lr,=DuffEntry
CMP r4,lr
BEQ %FT50 ;pages already mapped out - just update CAM for new ownership
MOV r0,r4 ;address of 1st page
MOV r1,r8 ;number of pages
LDR r3,=ZeroPage
ARMop MMU_ChangingEntries,,,r3 ;
BL AMB_movepagesout_L2PT
BL AMB_movecacheablepagesout_L2PT
50
BL AMB_movepagesout_CAM
......@@ -618,24 +658,28 @@ AMB_SetMemMapEntries_SparseMapOut ROUT
CMP r3,#0
MOVEQ pc,lr
Push "r0-r11,lr"
Entry "r0-r11"
;to do this safely we need to do it in two passes
;first pass makes pages uncacheable
;second pass unmaps them
;n.b. like most of the AMB code, this assumes nothing will trigger an abort-based lazy map-in operation while we're in the middle of this processing!
;
;pass one: make pages uncacheable (preserve bitmap, CAM)
;
MOV r10,r4 ;ptr to page list
LDR r2,=ZeroPage
MOV r9,#AP_Duff ;permissions for DuffEntry
LDR r7,[r2,#CamEntriesPointer] ;r7 -> CAM
MOV r4,#ApplicationStart ;log. address of first page
LDR r1,=DuffEntry ;means Nowhere, in CAM
;if the number of pages mapped in is small enough, we'll do cache/TLB coherency on
;just those pages, else global (performance decision, threshold probably not critical)
MOV r9,#-1 ;initialised to correct page flags once we find a mapped in page
;decide if we want to do TLB coherency as we go
ARMop Cache_RangeThreshold,,,r2 ;returns threshold (bytes) in r0
CMP r3,r0,LSR #Log2PageSize
MOVLO r6,#0 ;r6 := 0 if we are to do coherency as we go
BLO %FT10 ;let's do it
ARMop MMU_Changing,,,r2 ;global coherency
B %FT10
;skip next 32 pages then continue
......@@ -643,7 +687,7 @@ AMB_SetMemMapEntries_SparseMapOut ROUT
ADD r10,r10,#32*4
ADD r4,r4,#32*PageSize
;find the sparsely mapped pages, map them out, doing coherency as we go if enabled
;find the sparsely mapped pages, make them uncacheable, doing coherency as we go if enabled
10
MOV r8,#1 ;initial bitmap mask for new bitmap word
LDR r11,[r5],#4 ;next word of bitmap
......@@ -652,12 +696,82 @@ AMB_SetMemMapEntries_SparseMapOut ROUT
12
TST r11,r8 ;page is currently mapped in if bit set
BEQ %FT16
TEQ r6, #0
BNE %FT14 ;check for coherency as we go
CMP r9,#-1 ;have page flags yet?
BNE %FT14
LDR r0,[r10] ;page no.
ADD r0,r7,r0,LSL #CAM_EntrySizeLog2 ;r0 -> CAM entry for page
LDR r0,[r0,#CAM_PageFlags]
LDR r2,=ZeroPage
MOV r0,r4 ;address of page
ARMop MMU_ChangingEntry,,,r2
LDR r2,[r2,#MMU_PCBTrans]
GetTempUncache r9,r0,r2,lr
14
LDR lr,=L2PT ;lr -> L2PT
LDR r2,[lr,r4,LSR #(Log2PageSize-2)] ;L2PT entry for page
LDR r0,=TempUncache_L2PTMask
BIC r2,r2,r0
ORR r2,r2,r9
STR r2,[lr,r4,LSR #(Log2PageSize-2)] ;make uncacheable
TEQ r6, #0
BNE %FT15
LDR r2,=ZeroPage
MOV r0,r4
ARMop MMU_ChangingUncachedEntry,,,r2 ;flush TLB
MOV r0,r4
ADD r1,r0,#PageSize
ARMop Cache_CleanInvalidateRange,,,r2 ;flush from cache
15
SUBS r3,r3,#1
BEQ %FT40 ;done
16
ADD r10,r10,#4 ;next page no.
ADD r4,r4,#PageSize ;next logical address
MOVS r8,r8,LSL #1 ;if 32 bits processed...
BNE %BT12
B %BT10
40
TEQ r6, #0
BEQ %FT45
LDR r2,=ZeroPage
ARMop MMU_ChangingUncached,,,r2
ARMop Cache_CleanInvalidateAll,,,r2
45
;
;pass two: unmap pages (+ clear bitmap + update CAM)
;
FRAMLDR r3
FRAMLDR r5
FRAMLDR r10,,r4 ;ptr to page list
LDR r2,=ZeroPage
MOV r9,#AP_Duff ;permissions for DuffEntry
LDR r7,[r2,#CamEntriesPointer] ;r7 -> CAM
MOV r4,#ApplicationStart ;log. address of first page
LDR r1,=DuffEntry ;means Nowhere, in CAM
;decide if we want to do TLB coherency as we go
MOV r6, r3, LSR #5 ; r6!=0 if doing global coherency (32 entry TLB)
B %FT60
;skip next 32 pages then continue
56
ADD r10,r10,#32*4
ADD r4,r4,#32*PageSize
;find the sparsely mapped pages, map them out, doing coherency as we go if enabled
60
MOV r8,#1 ;initial bitmap mask for new bitmap word
LDR r11,[r5],#4 ;next word of bitmap
CMP r11,#0 ;if next 32 bits of bitmap clear, skip
BEQ %BT56 ;skip loop must terminate if r3 > 0
62
TST r11,r8 ;page is currently mapped in if bit set
BEQ %FT66
LDR r0,[r10] ;page no.
ADD r0,r7,r0,LSL #CAM_EntrySizeLog2 ;r0 -> CAM entry for page
ASSERT CAM_LogAddr=0
......@@ -666,14 +780,8 @@ AMB_SetMemMapEntries_SparseMapOut ROUT
LDR lr,=L2PT ;lr -> L2PT
MOV r2, #0
STR r2,[lr,r4,LSR #(Log2PageSize-2)] ;L2PT entry for page set to 0 (means translation fault)
[ MEMM_Type = "VMSAv6"
; In order to guarantee that the result of a page table write is
; visible, the ARMv6+ memory order model requires us to perform TLB
; maintenance (equivalent to the MMU_ChangingUncached ARMop) after we've
; performed the write. Performing the maintenance beforehand (as we've
; done traditionally) will work most of the time, but not always.
TEQ r6, #0
BNE %FT15
BNE %FT65
[ ZeroPage != 0
LDR r2,=ZeroPage
]
......@@ -682,36 +790,26 @@ AMB_SetMemMapEntries_SparseMapOut ROUT
[ ZeroPage != 0
MOV r2,#0
]
15
]
65
SUBS r3,r3,#1
STREQ r2,[r5,#-4] ;make sure we clear last word of bitmap, and...
BEQ %FT20 ;done
16
BEQ %FT90 ;done
66
ADD r10,r10,#4 ;next page no.
ADD r4,r4,#PageSize ;next logical address
MOVS r8,r8,LSL #1 ;if 32 bits processed...
BNE %BT12
BNE %BT62
MOV r2, #0
STR r2,[r5,#-4] ;zero word of bitmap we've just traversed
LDR r2,=ZeroPage
B %BT10
B %BT60
20
[ MEMM_Type = "VMSAv6"
; In order to guarantee that the result of a page table write is
; visible, the ARMv6+ memory order model requires us to perform TLB
; maintenance (equivalent to the MMU_ChangingUncached ARMop) after we've
; performed the write. Performing the maintenance beforehand (as we've
; done traditionally) will work most of the time, but not always.
90
TEQ r6, #0
BEQ %FT25
BEQ %FT95
LDR r2,=ZeroPage
ARMop MMU_ChangingUncached,,,r2
25
]
Pull "r0-r11,pc"
95
EXIT
; ----------------------------------------------------------------------------------
......
......@@ -26,6 +26,19 @@ DebugAborts SETL {FALSE}
GBLL UseProcessTransfer
UseProcessTransfer SETL {FALSE}
; Convert given page flags to the equivalent temp uncacheable L2PT flags
; n.b. temp not used here but included for VMSAv6 compatibility
MACRO
GetTempUncache $out, $pageflags, $pcbtrans, $temp
ASSERT DynAreaFlags_CPBits = 7*XCB_P :SHL: 10
ASSERT DynAreaFlags_NotCacheable = XCB_NC :SHL: 4
ASSERT DynAreaFlags_NotBufferable = XCB_NB :SHL: 4
AND $out, $pageflags, #DynAreaFlags_NotCacheable + DynAreaFlags_NotBufferable
ORR $out, $out, #DynAreaFlags_NotCacheable ; treat as temp uncache
LDRB $out, [$pcbtrans, $out, LSR #4] ; convert to X, C and B bits for this CPU
MEND
TempUncache_L2PTMask * L2_X+L2_C+L2_B
; MMU interface file - ARM600 version
......@@ -282,21 +295,36 @@ BangL2PT ; internal entry point used only
TST r11, #DynAreaFlags_DoublyMapped
BNE BangL2PT_sledgehammer ;if doubly mapped, don't try to be clever
;we sort out cache coherency _before_ remapping, because some ARMs might insist on
;that order (write back cache doing write backs to logical addresses)
;we need to worry about cache only if mapping out a cacheable page
;In order to safely map out a cacheable page and remove it from the
;cache, we need to perform the following process:
;* Make the page uncacheable
;* Flush TLB
;* Clean+invalidate cache
;* Write new mapping (r6)
;* Flush TLB
;For uncacheable pages we can just do the last two steps
;
TEQ r6, #0 ;EQ if mapping out
TSTEQ r11, #DynAreaFlags_NotCacheable ;EQ if also cacheable (overcautious for temp uncache+illegal PCB combos)
MOV r0, r3 ;MMU page entry address
ADR lr, %FT20
LDR r4, =ZeroPage
ARMop MMU_ChangingEntry, EQ, tailcall, r4
ARMop MMU_ChangingUncachedEntry, NE, tailcall, r4
BNE %FT20
LDR lr, [r4, #MMU_PCBTrans]
GetTempUncache r0, r11, lr
LDR lr, [r1, r3, LSR #10] ;get current L2PT entry
BIC lr, lr, #TempUncache_L2PTMask ;remove current attributes
ORR lr, lr, r0
STR lr, [r1, r3, LSR #10] ;Make uncacheable
MOV r0, r3
ARMop MMU_ChangingUncachedEntry,,, r4 ; TLB flush
MOV r0, r3
ADD r1, r3, #4096
ARMop Cache_CleanInvalidateRange,,, r4 ; Cache flush
LDR r1, =L2PT
20 STR r6, [r1, r3, LSR #10] ;update L2PT entry
Pull "pc"
Pull "lr"
MOV r0, r3
ARMop MMU_ChangingUncachedEntry,,tailcall,r4
BangL2PT_sledgehammer
......
......@@ -230,6 +230,7 @@ Analyse_ARMv3
STR a1, [v6, #Proc_Cache_CleanAll]
STR a2, [v6, #Proc_Cache_CleanInvalidateAll]
STR a2, [v6, #Proc_Cache_CleanInvalidateRange]
STR a2, [v6, #Proc_Cache_InvalidateAll]
STR a3, [v6, #Proc_DSB_ReadWrite]
STR a3, [v6, #Proc_DSB_Write]
......@@ -276,6 +277,7 @@ Analyse_WriteThroughUnified
STR a1, [v6, #Proc_Cache_CleanAll]
STR a2, [v6, #Proc_Cache_CleanInvalidateAll]
STR a2, [v6, #Proc_Cache_CleanInvalidateRange]
STR a2, [v6, #Proc_Cache_InvalidateAll]
STR a3, [v6, #Proc_DSB_ReadWrite]
STR a3, [v6, #Proc_DSB_Write]
......@@ -319,6 +321,9 @@ Analyse_WB_CR7_LDa
ADRL a1, Cache_CleanInvalidateAll_WB_CR7_LDa
STR a1, [v6, #Proc_Cache_CleanInvalidateAll]
ADRL a1, Cache_CleanInvalidateRange_WB_CR7_LDa
STR a1, [v6, #Proc_Cache_CleanInvalidateRange]
ADRL a1, Cache_CleanAll_WB_CR7_LDa
STR a1, [v6, #Proc_Cache_CleanAll]
......@@ -427,6 +432,9 @@ Analyse_WB_Crd
ADRL a1, Cache_CleanInvalidateAll_WB_Crd
STR a1, [v6, #Proc_Cache_CleanInvalidateAll]
ADRL a1, Cache_CleanInvalidateRange_WB_Crd
STR a1, [v6, #Proc_Cache_CleanInvalidateRange]
ADRL a1, Cache_CleanAll_WB_Crd
STR a1, [v6, #Proc_Cache_CleanAll]
......@@ -506,6 +514,9 @@ Analyse_WB_Cal_LD
ADRL a1, Cache_CleanInvalidateAll_WB_Cal_LD
STR a1, [v6, #Proc_Cache_CleanInvalidateAll]
ADRL a1, Cache_CleanInvalidateRange_WB_Cal_LD
STR a1, [v6, #Proc_Cache_CleanInvalidateRange]
ADRL a1, Cache_CleanAll_WB_Cal_LD
STR a1, [v6, #Proc_Cache_CleanAll]
......@@ -649,6 +660,9 @@ Analyse_WB_CR7_Lx
ADRL a1, Cache_CleanInvalidateAll_WB_CR7_Lx
STR a1, [v6, #Proc_Cache_CleanInvalidateAll]
ADRL a1, Cache_CleanInvalidateRange_WB_CR7_Lx
STR a1, [v6, #Proc_Cache_CleanInvalidateRange]
ADRL a1, Cache_CleanAll_WB_CR7_Lx
STR a1, [v6, #Proc_Cache_CleanAll]
......@@ -1337,6 +1351,45 @@ Cache_CleanInvalidateAll_WB_CR7_LDa ROUT
Pull "a2, ip"
MOV pc, lr
; a1 = start address (inclusive, cache line aligned)
; a2 = end address (exclusive, cache line aligned)
;
[ MEMM_Type = "ARM600"
Cache_CleanInvalidateRange_WB_CR7_LDa ROUT
Push "a2, a3, lr"
LDR lr, =ZeroPage
SUB a2, a2, a1
LDR a3, [lr, #DCache_RangeThreshold] ;check whether cheaper to do global clean
CMP a2, a3
BHS %FT30
ADD a2, a2, a1 ;clean end address (exclusive)
LDRB a3, [lr, #DCache_LineLen]
10
MCR p15, 0, a1, c7, c14, 1 ; clean&invalidate DCache entry
MCR p15, 0, a1, c7, c5, 1 ; invalidate ICache entry
ADD a1, a1, a3
CMP a1, a2
BLO %BT10
MOV a1, #0
MCR p15, 0, a1, c7, c10, 4 ; drain WBuffer
MCR p15, 0, a1, c7, c5, 6 ; flush branch predictors
Pull "a2, a3, pc"
;
30
Pull "a2, a3, lr"
B Cache_CleanInvalidateAll_WB_CR7_LDa
|
; Bodge for ARM11
; The OS assumes that address-based cache maintenance operations will operate
; on pages which are currently marked non-cacheable (so that we can make a page
; non-cacheable and then clean/invalidate the cache, to ensure prefetch or
; anything else doesn't pull any data for the page back into the cache once
; we've cleaned it). For ARMv7+ this is guaranteed behaviour, but prior to that
; it's implementation defined, and the ARM11 in particular seems to ignore
; address-based maintenance which targets non-cacheable addresses.
; As a workaround, perform a full clean & invalidate instead
Cache_CleanInvalidateRange_WB_CR7_LDa * Cache_CleanInvalidateAll_WB_CR7_LDa
]
Cache_InvalidateAll_WB_CR7_LDa ROUT
;
......@@ -1730,6 +1783,33 @@ MMU_ChangingEntries_WB_Crd ROUT
MCR p15, 0, a1, c8, c7, 0 ;flush ITLB and DTLB
Pull "a2, a3, pc"
Cache_CleanInvalidateRange_WB_Crd ROUT
;
;same comments as MMU_ChangingEntry_WB_Crd
;
Push "a2, a3, lr"
LDR lr, =ZeroPage
SUB a2, a2, a1
LDR a3, [lr, #DCache_RangeThreshold] ;check whether cheaper to do global clean
CMP a2, a3
BHS %FT30
ADD a2, a2, a1 ;clean end address (exclusive)
LDRB a3, [lr, #DCache_LineLen]
10
MCR p15, 0, a1, c7, c10, 1 ;clean DCache entry
MCR p15, 0, a1, c7, c6, 1 ;flush DCache entry
ADD a1, a1, a3
CMP a1, a2
BLO %BT10
MCR p15, 0, a1, c7, c10, 4 ;drain WBuffer
MCR p15, 0, a1, c7, c5, 0 ;flush ICache
Pull "a2, a3, pc"
;
30
BL Cache_CleanAll_WB_Crd ;clean DCache (wrt to non-interrupt stuff)
MCR p15, 0, a1, c7, c5, 0 ;flush ICache
Pull "a2, a3, pc"
MMU_ChangingUncachedEntries_WB_Crd ROUT
CMP a2, #32 ;arbitrary-ish threshold
BHS %FT20
......@@ -2071,6 +2151,41 @@ MMU_ChangingEntries_WB_Cal_LD ROUT
CPWAIT
Pull "a2, a3, pc"
Cache_CleanInvalidateRange_WB_Cal_LD ROUT
;
;same comments as MMU_ChangingEntry_WB_Cal_LD
;
Push "a2, a3, lr"
LDR lr, =ZeroPage
SUB a2, a2, a1
LDR a3, [lr, #DCache_RangeThreshold] ;check whether cheaper to do global clean
CMP a2, a3
BHS %FT30
ADD a2, a2, a1 ;clean end address (exclusive)
LDRB a3, [lr, #DCache_LineLen]
10
MCR p15, 0, a1, c7, c10, 1 ; clean DCache entry
MCR p15, 0, a1, c7, c6, 1 ; invalidate DCache entry
[ :LNOT:XScaleJTAGDebug
MCR p15, 0, a1, c7, c5, 1 ; invalidate ICache entry
]
ADD a1, a1, a3
CMP a1, a2
BLO %BT10
MCR p15, 0, a1, c7, c10, 4 ; drain WBuffer
[ XScaleJTAGDebug
MCR p15, 0, a1, c7, c5, 0 ; invalidate ICache and BTB
|
MCR p15, 0, a1, c7, c5, 6 ; invalidate BTB
]
CPWAIT
Pull "a2, a3, pc"
;
30
Pull "a2, a3, lr"
B Cache_CleanInvalidateAll_WB_Cal_LD
MMU_ChangingUncachedEntries_WB_Cal_LD ROUT
CMP a2, #32 ; arbitrary-ish threshold
BHS %FT20
......@@ -2517,6 +2632,46 @@ MMU_ChangingEntries_WB_CR7_Lx ROUT
myISB ,a1,,y ; Ensure that the effects are visible
Pull "a2, a3, pc"
; a1 = start address (inclusive, cache line aligned)
; a2 = end address (exclusive, cache line aligned)
;
Cache_CleanInvalidateRange_WB_CR7_Lx ROUT
Push "a2, a3, lr"
LDR lr, =ZeroPage
SUB a2, a2, a1
LDR a3, [lr, #DCache_RangeThreshold] ;check whether cheaper to do global clean
CMP a2, a3
BHS %FT30
ADD a2, a2, a1 ;clean end address (exclusive)
LDRB a3, [lr, #DCache_LineLen] ; log2(line len)-2
MOV lr, #4
MOV a3, lr, LSL a3
MOV lr, a1
10
MCR p15, 0, a1, c7, c14, 1 ; clean&invalidate DCache entry to PoC
ADD a1, a1, a3
CMP a1, a2
BNE %BT10
myDSB ,a3 ; Wait for clean to complete
LDR a3, =ZeroPage
LDRB a3, [a3, #ICache_LineLen] ; Use ICache line length, just in case D&I length differ
MOV a1, #4
MOV a3, a1, LSL a3
MOV a1, lr ; Get start address back
10
MCR p15, 0, a1, c7, c5, 1 ; invalidate ICache entry to PoC
ADD a1, a1, a3
CMP a1, a2
BNE %BT10
MCR p15, 0, a1, c7, c5, 6 ; invalidate branch predictors
myDSB ,a1
myISB ,a1,,y
Pull "a2, a3, pc"
;
30
Pull "a2, a3, lr"
B Cache_CleanInvalidateAll_WB_CR7_Lx
; a1 = first page affected (page aligned address)
; a2 = number of pages
;
......@@ -2564,6 +2719,8 @@ MMU_ChangingUncachedEntries_WB_CR7_Lx ROUT
BNE %BT10
MEND
PL310Threshold * 1024*1024 ; Arbitrary threshold for full clean
Cache_CleanInvalidateAll_PL310 ROUT
; Errata 727915 workaround - use CLEAN_INV_INDEX instead of CLEAN_INV_WAY
Entry "a2-a4"
......@@ -2647,7 +2804,7 @@ Cache_InvalidateAll_PL310 ROUT
B Cache_InvalidateAll_WB_CR7_Lx
Cache_RangeThreshold_PL310 ROUT
MOV a1, #1024*1024
MOV a1, #PL310Threshold
MOV pc, lr
Cache_Examine_PL310 ROUT
......@@ -2738,98 +2895,124 @@ MMU_Changing_PL310 ROUT
myISB ,a1,,y ; Ensure that the effects are visible
EXIT
; a1 = page affected (page aligned address)
; a1 = virtual address of page affected (page aligned address)
;
MMU_ChangingEntry_PL310 ROUT
Entry "a1-a4"
; MMU_ChangingEntry_WB_CR7_Lx performs a clean & invalidate before invalidating the TLBs.
; This means we must behave in a similar way to the PL310 clean & invalidate:
Push "a1-a3,lr"
; Keep this one simple by just calling through to MMU_ChangingEntries
MOV a3, #1
B %FT10
; a1 = virtual address of first page affected (page aligned address)
; a2 = number of pages
;
MMU_ChangingEntries_PL310
Push "a1-a3,lr"
MOV a3, a2
10 ; Arrive here from MMU_ChangingEntry_PL310
myDSB ,lr ; Ensure the page table write has actually completed
myISB ,lr,,y ; Also required
; Do PL310 clean & invalidate
ADD a2, a1, a3, LSL #Log2PageSize
BL Cache_CleanInvalidateRange_PL310
; Do the MMU op
; Now that the cache is clean this is equivalent to an uncached op
Pull "a1"
MOV a2, a3
BL MMU_ChangingUncachedEntries_WB_CR7_Lx
Pull "a2-a3,pc"
; a1 = start address (inclusive, cache line aligned)
; a2 = end address (exclusive, cache line aligned)
;
Cache_CleanInvalidateRange_PL310 ROUT
Entry "a2-a4,v1"
; For simplicity, align to page boundaries
LDR a4, =PageSize-1
ADD a2, a2, a4
BIC a1, a1, a4
BIC a3, a2, a4
SUB v1, a3, a1
CMP v1, #PL310Threshold
BHS %FT90
MOV a4, a1
; Behave in a similar way to the PL310 full clean & invalidate:
; * Clean ARM
; * Clean & invalidate PL310
; * Clean & invalidate ARM (i.e. do the MMU changing op)
; * Clean & invalidate ARM
; a4 = base virtual address
; a3 = end virtual address
; v1 = length
; Convert logical addr to physical.
; Use the ARMv7 CP15 registers for convenience.
PHPSEI
MCR p15, 0, a1, c7, c8, 0 ; ATS1CPR
myISB ,a4
MRC p15, 0, a4, c7, c4, 0 ; Get result
PLP
TST a4, #1
BNE %FT50 ; Lookup failed - assume this means that the page doesn't need cleaning from the PL310
; Mask out the memory attributes that were returned by the lookup
[ NoARMT2
BIC a4, a4, #&FF
BIC a4, a4, #&F00
|
BFC a4, #0, #12
]
; Clean ARM
myDSB ,lr
myISB ,lr,,y
LDR lr, =ZeroPage
ADD a2, a1, #PageSize ;clean end address (exclusive)
LDRB a3, [lr, #DCache_LineLen] ; log2(line len)-2
LDR a1, =ZeroPage
LDR lr, [a1, #DCache_RangeThreshold] ;check whether cheaper to do global clean
CMP lr, v1
ADRLE lr, %FT30
BLE Cache_CleanAll_WB_CR7_Lx
; Clean each page in turn
LDRB a2, [a1, #DCache_LineLen] ; log2(line len)-2
MOV lr, #4
MOV a3, lr, LSL a3
10
MCR p15, 0, a1, c7, c10, 1 ; clean DCache entry to PoC
ADD a1, a1, a3
CMP a1, a2
BNE %BT10
myDSB ,a3 ; Wait for clean to complete
; Clean & invalidate PL310
LDR a2, =ZeroPage
LDR a2, [a2, #Cache_HALDevice]
LDR a2, [a2, #HALDevice_Address]
; Ensure we haven't re-entered an in-progress op
MOV a2, lr, LSL a2
20
LDR lr, [a2, #PL310_REG7_CLEAN_INV_PA]
TST lr, #1
MCR p15, 0, a4, c7, c10, 1 ; clean DCache entry to PoC
ADD a4, a4, a2
CMP a4, a3
BNE %BT20
; Clean & invalidate each line/index of the page
ADD a1, a4, #&FE0 ; last line within the page
myDSB ,a2 ; Wait for clean to complete
SUB a4, a3, v1
30
STR a4, [a2, #PL310_REG7_CLEAN_INV_PA]
; Clean & invalidate PL310
LDR a1, =ZeroPage
LDR a2, [a1, #Cache_HALDevice]
LDR a2, [a2, #HALDevice_Address]
; Ensure we haven't re-entered an in-progress op
40
LDR lr, [a2, #PL310_REG7_CLEAN_INV_PA]
TST lr, #1
BNE %BT40
TEQ a4, a1
ADD a4, a4, #1<<5 ; next index
BNE %BT30
; Clean & invalidate each line/index of the pages
50
; Convert logical addr to physical.
; Use the ARMv7 CP15 registers for convenience.
PHPSEI
MCR p15, 0, a4, c7, c8, 0 ; ATS1CPR
myISB ,a1
MRC p15, 0, a1, c7, c4, 0 ; Get result
PLP
TST a1, #1
ADD a4, a4, #PageSize
BNE %FT75 ; Lookup failed - assume this means that the page doesn't need cleaning from the PL310
; Point to last line in page, and mask out attributes returned by the
; lookup
ORR a1, a1, #&FE0
BIC a1, a1, #&01F
60
STR a1, [a2, #PL310_REG7_CLEAN_INV_PA]
70
LDR lr, [a2, #PL310_REG7_CLEAN_INV_PA]
TST lr, #1
BNE %BT70
TST a1, #&FE0
SUB a1, a1, #1<<5 ; next index
BNE %BT60
75
CMP a4, a3
BNE %BT50
; Sync
PL310Sync a2, a1
50
; Clean & invalidate ARM (+ do MMU op)
PullEnv
B MMU_ChangingEntry_WB_CR7_Lx
; a1 = first page affected (page aligned address)
; a2 = number of pages
;
MMU_ChangingEntries_PL310 ROUT
Entry "a2-a3"
; Keep this one simple and just split it into a series of per-page operations
; This will result in some unnecessary TLB invalidate & PL310 sync thrashing, so in the future a more advanced implementation might be nice.
CMP a2, #1024*1024/PageSize ; Arbitrary threshold for full clean
BHS %FT20
MOV a3, a1
10
MOV a1, a3
BL MMU_ChangingEntry_PL310
SUBS a2, a2, #1
ADD a3, a3, #PageSize
BNE %BT10
; Clean & invalidate ARM
SUB a1, a3, v1
MOV a2, a3
BL Cache_CleanInvalidateRange_WB_CR7_Lx
EXIT
20
90
; Full clean required
BL Cache_CleanInvalidateAll_PL310
MOV a1, #0
MCR p15, 0, a1, c8, c7, 0 ; invalidate ITLB and DTLB
myDSB ,a1,,y ; Wait TLB invalidation to complete
myISB ,a1,,y ; Ensure that the effects are visible
EXIT
PullEnv
B Cache_CleanInvalidateAll_PL310
; --------------------------------------------------------------------------
; ----- Generic ARMv6 and ARMv7 barrier operations -------------------------
......@@ -2867,6 +3050,8 @@ DMB_Write_ARMv7 ROUT
] ; MEMM_Type = "VMSAv6"
LTORG
; --------------------------------------------------------------------------
LookForHALCacheController ROUT
......@@ -2929,6 +3114,7 @@ KnownHALCaches ROUT
DCD HALDeviceID_CacheC_PL310
01
DCD Cache_CleanInvalidateAll_PL310
DCD Cache_CleanInvalidateRange_PL310
DCD Cache_CleanAll_PL310
DCD Cache_InvalidateAll_PL310
DCD Cache_RangeThreshold_PL310
......@@ -2985,6 +3171,7 @@ ARMopPtrTable
ARMopPtr DMB_ReadWrite
ARMopPtr DMB_Write
ARMopPtr DMB_Read
ARMopPtr Cache_CleanInvalidateRange
ARMopPtrTable_End
ASSERT ARMopPtrTable_End - ARMopPtrTable = ARMop_Max*4
......
This diff is collapsed.
......@@ -96,6 +96,7 @@
GET s.ModHand
$GetUnsqueeze
GET s.ArthurSWIs
$GetKernelMEMC
GET s.ChangeDyn
$GetHAL
GET s.Arthur2
......@@ -114,7 +115,6 @@
$GetMessages
GET s.Middle
GET s.Super1
$GetKernelMEMC
$GetMemInfo
! 0, "Main kernel size = &" :CC: :STR: (.-KernelBase)
StartOfVduDriver
......
......@@ -1184,6 +1184,8 @@ MMUon_nol1ptoverlap
MOV a3, v1
BL memset
; Flush the workspace from the cache & TLB so we can unmap it
; Really we should make the pages uncacheable first, but for simplicity we just
; do a full cache clean+invalidate later on when changing the ROM permissions
MOV a1, #4<<20
MOV a2, v1, LSR #12
ARMop MMU_ChangingEntries
......@@ -1220,15 +1222,12 @@ MMUon_nol1ptoverlap
MOV a1, a1, LSR #12
MOV a1, a1, LSL #12
29
Push "a2,a4"
MOV a3, #(AP_ROM * L2X_APMult) + L2_C + L2_B
BL Init_MapIn
; Flush & invalidate cache/TLB to ensure everything respects the new page access
; Putting a flush here also means the decompression code doesn't have to worry
; about IMB'ing the decompressed ROM
Pull "a1,a2"
MOV a2, a2, LSR #12
ARMop MMU_ChangingEntries
ARMop MMU_Changing ; Perform full clean+invalidate to ensure any lingering cache lines for the decompression workspace are gone
DebugTX "ROM access changed to read-only"
30
; Allocate the CAM
......@@ -2270,6 +2269,14 @@ ClearPhysRAM ROUT
MSR CPSR_c, #F32_bit+SVC32_mode
; Make page uncacheable so the following is safe
Push "r0-r3"
MOV r0, #L1_B
MOV r1, r10
MOV r2, #0
BL RISCOS_AccessPhysicalAddress
Pull "r0-r3"
; Clean & invalidate the cache before the 1MB window closes
[ StrongARM
; StrongARM requires special clean code, because we haven't mapped in
......
......@@ -239,22 +239,45 @@ MemoryConvert ROUT
LDR r5, [r4] ; Get L2 entry (safe as we know address is valid).
BIC r5, r5, #(L2_C+L2_B+L2_TEX) :AND: 255 ; Knock out existing attributes (n.b. assumed to not be large page!)
ORR r5, r5, lr ; Set new attributes
STR r5, [r4] ; Write back new L2 entry.
|
LDR r5, [r4] ; Get L2 entry (safe as we know address is valid).
TST r0, #cacheable_bit
BICEQ r5, r5, #L2_C ; Disable/enable cacheability.
ORRNE r5, r5, #L2_C
STR r5, [r4] ; Write back new L2 entry.
]
BNE %FT63
; Making page non-cacheable
; There's a potential interrupt hole here - many ARMs ignore cache hits
; for pages which are marked as non-cacheable (seen on XScale,
; Cortex-A53, Cortex-A15 to name but a few, and documented in many TRMs)
; We can't be certain that this page isn't being used by an interrupt
; handler, so if we're making it non-cacheable we have to take the safe
; route of disabling interrupts around the operation.
; Note - currently no consideration is given to FIQ handlers.
; Note - we clean the cache as the last step (as opposed to doing it at
; the start) to make sure prefetching doesn't pull data back into the
; cache.
PHPSEI r11 ; IRQs off
STR r5, [r4] ; Write back new L2 entry.
MOV r5, r0
ASSERT (L2PT :SHL: 10) = 0 ; Ensure we can convert r4 back to the page log addr
MOV r0, r4, LSL #10
ARMop MMU_ChangingUncachedEntry,,,r3 ; Clean TLB
MOV r0, r4, LSL #10
MOV r10, r1
ADD r1, r0, #4096
ARMop Cache_CleanInvalidateRange,,,r3 ; Clean page from cache
PLP r11 ; IRQs back on again
MOV r1, r10
B %FT65
63
; Making page cacheable again
; Shouldn't be any cache maintenance worries
STR r5, [r4] ; Write back new L2 entry.
MOV r5, r0
ASSERT (L2PT :SHL: 10) = 0 ; Ensure we can convert r4 back to the page log addr
MOV r0, r4, LSL #10
; *** KJB - this assumes that uncacheable pages still allow cache hits (true on all
; ARMs so far).
ADR lr, %FT65
ARMop MMU_ChangingEntry,EQ,tailcall,r3 ; Clean cache & TLB
ARMop MMU_ChangingUncachedEntry,NE,tailcall,r3 ; Clean TLB
ARMop MMU_ChangingUncachedEntry,,,r3 ; Clean TLB
65
MOV r0, r5
B %BT10
......
......@@ -45,6 +45,21 @@ UseProcessTransfer SETL {FALSE}
KEEP
; Convert given page flags to the equivalent temp uncacheable L2PT flags
MACRO
GetTempUncache $out, $pageflags, $pcbtrans, $temp
ASSERT DynAreaFlags_CPBits = 7*XCB_P :SHL: 10
ASSERT DynAreaFlags_NotCacheable = XCB_NC :SHL: 4
ASSERT DynAreaFlags_NotBufferable = XCB_NB :SHL: 4
AND $out, $pageflags, #DynAreaFlags_NotCacheable + DynAreaFlags_NotBufferable
AND $temp, $pageflags, #DynAreaFlags_CPBits
ORR $out, $out, #XCB_TU<<4 ; treat as temp uncacheable
ORR $out, $out, $temp, LSR #10-4
LDRB $out, [$pcbtrans, $out, LSR #4] ; convert to X, C and B bits for this CPU
MEND
TempUncache_L2PTMask * L2_B+L2_C+L2_TEX
; **************** CAM manipulation utility routines ***********************************
; **************************************************************************************
......@@ -202,24 +217,37 @@ BangL2PT ; internal entry point used only
TST r11, #DynAreaFlags_DoublyMapped
BNE BangL2PT_sledgehammer ;if doubly mapped, don't try to be clever
;we sort out cache coherency _before_ remapping, because some ARMs might insist on
;that order (write back cache doing write backs to logical addresses)
;we need to worry about cache only if mapping out a cacheable page
;In order to safely map out a cacheable page and remove it from the
;cache, we need to perform the following process:
;* Make the page uncacheable
;* Flush TLB
;* Clean+invalidate cache
;* Write new mapping (r6)
;* Flush TLB
;For uncacheable pages we can just do the last two steps
;
TEQ r6, #0 ;EQ if mapping out
TSTEQ r11, #DynAreaFlags_NotCacheable ;EQ if also cacheable (overcautious for temp uncache+illegal PCB combos)
MOV r0, r3 ;MMU page entry address
ADR lr, %FT20
LDR r4, =ZeroPage
ARMop MMU_ChangingEntry, EQ, tailcall, r4
ARMop MMU_ChangingUncachedEntry, NE, tailcall, r4
BNE %FT20
; Potentially we could just map as strongly-ordered + XN here
; But for safety just go for temp uncacheable (will retain memory type + shareability)
LDR lr, [r4, #MMU_PCBTrans]
GetTempUncache r0, r11, lr, r4
LDR lr, [r1, r3, LSR #10] ;get current L2PT entry
LDR r4, =TempUncache_L2PTMask
BIC lr, lr, r4 ;remove current attributes
ORR lr, lr, r0
STR lr, [r1, r3, LSR #10] ;Make uncacheable
LDR r4, =ZeroPage
MOV r0, r3
ARMop MMU_ChangingUncachedEntry,,, r4 ; TLB flush
MOV r0, r3
ADD r1, r3, #4096
ARMop Cache_CleanInvalidateRange,,, r4 ; Cache flush
LDR r1, =L2PT
20 STR r6, [r1, r3, LSR #10] ;update L2PT entry
; In order to guarantee that the result of a page table write is
; visible, the ARMv6+ memory order model requires us to perform TLB
; maintenance (equivalent to the MMU_ChangingUncached ARMop) after we've
; performed the write. Performing the maintenance beforehand (as we've
; done traditionally) will work most of the time, but not always.
Pull "lr"
MOV r0, r3
ARMop MMU_ChangingUncachedEntry,,tailcall,r4
......@@ -432,7 +460,7 @@ MMUControl_ModifyControl ROUT
ARMop Cache_CleanAll,,,r3
15
ARM_write_control r2
myISB ,r3 ; Must be running on >=ARMv6, so perform ISB to ensure CP15 write is complete
myISB ,lr ; Must be running on >=ARMv6, so perform ISB to ensure CP15 write is complete
BIC lr, r1, r2 ; lr = bits going from 1->0
TST lr, #MMUC_C ; if cache turning off then flush cache afterwards
TSTNE lr, #MMUC_I
......