1. 28 Jul, 2021 3 commits
    • Add AP 1 emulation for long descriptor page tables · f93d930d
      Jeffrey Lee authored
      The long descriptor page table format doesn't support RISC OS access
      privilege 1 (user RX, privileged RWX). Previously we were downgrading
      this to AP 0 (user RWX, privileged RWX), which obviously weakens the
      security of the memory. However now that we have an AbortTrap
      implementation, we can map the memory as "user none, privileged RWX" and
      provide user read support via AbortTrap's instruction decode & execute
      logic.
      
      There's no support for executing usermode code from the memory, but the
      compatibility issues caused by that are likely to be minimal.
    • Add abortable DA support · fccd5e2f
      Jeffrey Lee authored
      This implementation should be compatible with RISCOS Ltd's
      implementation.
    • Allow RW/ZI sections to be used · 2b665896
      Jeffrey Lee authored
      * Instruct the linker to place any RW/ZI data sections in the last ~16MB
      of the memory map, starting from &ff000000 (with the current toolchain,
      giving it a fixed base address is much easier than giving it a variable
      base address)
      * The RW/ZI section is mapped as completely inaccessible to user mode
      * The initial content of the RW section is copied over shortly after MMU
      startup (in Continue_after_HALInit)
      * Since link's -bin option produces a file containing a copy of the
      (zero-initialised) ZI section, the kernel binary is now produced from a
      "binary with AIF header" AIF with the help of the new 'kstrip' tool.
      kstrip extracts just the RO and RW sections, ensuring the ROM doesn't
      contain a redundant block of zeros for the ZI section.
      
      This should make it easier to use C code & arbitrary libraries within
      the kernel, providing they're compiled with suitable settings (e.g.
      non-module, no FP, no stack checking, like HALs typically use)
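      
      A minimal C sketch of the startup work described above (the kernel does
      this in assembler in Continue_after_HALInit; the Image$$...$$ symbols
      follow the usual ARM linker convention, and the assumption here is that
      the RW initialisation data sits immediately after the RO section in the
      ROM image):
      
      /* Hedged sketch: copy the RW section's initial contents to its
       * execution address near &ff000000, then clear the ZI section that
       * kstrip removed from the ROM image. Symbol names are the conventional
       * ARM linker region symbols, assumed to apply to this link. */
      #include <string.h>
      
      extern char Image$$RO$$Limit[];   /* end of RO; assumed load address of RW data */
      extern char Image$$RW$$Base[];    /* execution address of the RW section        */
      extern char Image$$RW$$Limit[];   /* end of initialised RW data                 */
      extern char Image$$ZI$$Base[];    /* start of zero-initialised data             */
      extern char Image$$ZI$$Limit[];   /* end of zero-initialised data               */
      
      void init_rw_zi(void)
      {
          /* Copy the initial content of the RW section into place */
          memcpy(Image$$RW$$Base, Image$$RO$$Limit,
                 (size_t)(Image$$RW$$Limit - Image$$RW$$Base));
      
          /* The ROM no longer carries a block of zeros for ZI, so clear it here */
          memset(Image$$ZI$$Base, 0,
                 (size_t)(Image$$ZI$$Limit - Image$$ZI$$Base));
      }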
  2. 28 Apr, 2021 5 commits
    • Support runtime selection of pagetable format · ba993cb5
      Jeffrey Lee authored
      Runtime selection between long descriptor and short descriptor page
      table format is now possible (with the decision based on whether the HAL
      registers any high RAM or not). The main source changes are as follows:
      
      * LongDesc and ShortDesc switches are in hdr.Options to control what
      kernel variant is built
      * PTOp and PTWhich macros introduced in hdr.ARMops to allow for
      invocation of functions / code blocks which are specific to the page
      table format. If the kernel is being built with only one page table
      format enabled, PTOp is just a BL instruction, ensuring there's no
      performance loss compared to the old code.
      * _LongDesc and _ShortDesc suffixes added to various function names, to
      allow both versions of the function to be included at once if runtime
      selection is enabled
      * Most of the kernel / MMU initialisation code in s.HAL is now encased
      in a big WHILE loop, allowing it to be duplicated if runtime switching
      is enabled (easier than adding dynamic branches all over the place, and
      only costs a few KB of ROM/RAM)
      * Some more functions (notably AccessPhysicalAddress,
      ReleasePhysicalAddress, and MapInIO) have been moved to s.ShortDesc /
      s.LongDesc since they were already 90% specific to page table format
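      
      A conceptual C analogue of the PTOp/PTWhich idea (the real kernel uses
      assembler macros, and a plain BL in single-format builds; every name
      below is hypothetical and only illustrates the dispatch pattern):
      
      /* Hypothetical sketch of build-time vs runtime dispatch between the
       * short descriptor and long descriptor page table implementations. */
      typedef struct {
          void (*AllocateBackingLevel2)(void *params);
          void (*UpdatePageReplacement)(void *params);
          /* ...one entry per format-specific routine... */
      } PageTableOps;
      
      extern const PageTableOps PT_ShortDesc;   /* short descriptor implementation */
      extern const PageTableOps PT_LongDesc;    /* long descriptor implementation  */
      
      #if defined(ONLY_ONE_FORMAT)
      /* Single-format build: "PTOp" degenerates to a direct call, so there is
       * no performance loss compared to the old code. */
      #define PTOp(op, params) PT_ShortDesc.op(params)
      #else
      /* Runtime-selection build: pick the table once during MMU init (e.g.
       * based on whether the HAL registered any high RAM), then indirect. */
      extern const PageTableOps *PT_Current;
      #define PTOp(op, params) PT_Current->op(params)
      #endif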
    • Add Service_PagesUnsafe64 & PagesSafe64 · 15a7d5ee
      Jeffrey Lee authored
      These use a page block with 64bit address fields (matching OS_Memory
      64). The page list(s) contain the full list of pages involved in the
      operation, unlike the 32bit PagesUnsafe / PagesSafe calls, which only
      list pages which have 32bit addresses. The kernel issues the service
      calls in the following order:
      
      1. Service_PagesUnsafe64
      2. Service_PagesUnsafe
      3. Service_PagesSafe
      4. Service_PagesSafe64
      
      Since only one PagesUnsafe operation can occur at a time, a program
      which supports both service calls can safely ignore the PagesUnsafe /
      PagesSafe calls if a PagesUnsafe64 operation is in progress (the
      PagesUnsafe call will only list a subset of the pages from the
      PagesUnsafe64 call). The 32bit PagesUnsafe / PagesSafe calls will be
      skipped if no 32bit pages are being replaced.
      
      The addition of these calls means that NeedsSpecificPages DAs (and PMPs)
      can now request pages which have large physical addresses.
      
      Note that the page replacement logic now has the restriction that pages
      which have 32bit physical addresses can only be replaced by other pages
      which have 32bit physical addresses. This is necessary to ensure that
      users of the old 32bit APIs see the page replacement take place. However
      it does mean that programs will be unable to claim pages of low RAM
      which are in use if there are not enough free low RAM pages in the free
      pool.
      
      A future optimisation would be to update the service calls so that they
      don't list required pages which are in the free pool; if all the
      required pages are in the free pool this would allow the service calls
      (and FIQ claiming) to be skipped completely.
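      
      A hedged sketch of how a client that understands both generations of the
      service calls might apply the rule above; the service numbers are taken
      from a hypothetical header and the handler routines are placeholders,
      not real definitions:
      
      #include "ServiceNumbers.h"  /* hypothetical: defines Service_PagesUnsafe(64) etc. */
      
      extern void handle_full_pagelist_unsafe(void *pageblock);   /* hypothetical */
      extern void handle_full_pagelist_safe(void *pageblock);     /* hypothetical */
      extern void handle_legacy_32bit(int service, void *pageblock);
      
      static int pages_unsafe64_active;
      
      void memory_service_handler(int service, void *pageblock)
      {
          switch (service)
          {
          case Service_PagesUnsafe64:                  /* issued first */
              pages_unsafe64_active = 1;
              handle_full_pagelist_unsafe(pageblock);
              break;
          case Service_PagesSafe64:                    /* issued last */
              pages_unsafe64_active = 0;
              handle_full_pagelist_safe(pageblock);
              break;
          case Service_PagesUnsafe:
          case Service_PagesSafe:
              /* These only carry the 32bit-addressable subset of the pages,
               * so they can be ignored if the 64bit call was already handled. */
              if (!pages_unsafe64_active)
                  handle_legacy_32bit(service, pageblock);
              break;
          }
      }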
    • Add OS_Memory 64, to supersede OS_Memory 0 · d5e91a02
      Jeffrey Lee authored
      OS_Memory 64 is an extended form of OS_Memory 0 which uses 64bit
      addresses instead of 32bit. Using 64bit physical addresses allows
      conversions to/from physical addresses to be performed on pages with
      large physical addresses. Using 64bit logical addresses provides some
      future-proofing for an AArch64 version of RISC OS, with a 64bit logical
      memory map.
    • Define OS_Memory 0 page block format · 7ddbbeed
      Jeffrey Lee authored
      Add to s.ChangeDyn a definition of the OS_Memory 0 page block format,
      and update all relevant code to use those definitions instead of
      hardcoded offsets.
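      
      For illustration, a hedged C sketch of a three-word-per-entry page block
      being passed to OS_Memory 0 via _swix. The field order and the exact
      conversion flag bits should be checked against the OS_Memory 0
      documentation rather than taken from this sketch:
      
      #include "kernel.h"
      #include "swis.h"
      
      typedef struct {
          unsigned int page_number;    /* physical page number                     */
          unsigned int logical_addr;   /* logical address                          */
          unsigned int physical_addr;  /* physical address                         */
      } os_memory0_entry;              /* field order assumed - verify against docs */
      
      _kernel_oserror *convert_pages(os_memory0_entry *block, int count,
                                     unsigned int conversion_flags)
      {
          /* R0 = reason 0 in bits 0-7 plus the "given/wanted" conversion flags
           * (placeholder here), R1 = page block pointer, R2 = entry count. */
          return _swix(OS_Memory, _INR(0,2), conversion_flags, block, count);
      }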
    • Support RAM banks with high physical addresses · df4efb68
      Jeffrey Lee authored
      This changes PhysRamTable to store the address of each RAM bank in terms
      of (4KB) pages instead of bytes, effectively allowing it to support a 44
      bit physical address space. This means that (when the long descriptor
      page table format is used) the OS can now make use of memory located
      outside the lower 4GB of the physical address space. However some
      public APIs still need extending to allow for all operations to be
      supported on high RAM (e.g. OS_Memory logical to physical address
      lookups)
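      
      The arithmetic behind the 44 bit figure: a 32bit PhysRamTable entry now
      counts 4KB pages rather than bytes, so it can describe 2^32 pages of
      2^12 bytes each, i.e. a 2^44 byte physical address space. A small C
      sketch of the conversion back to a byte address:
      
      #include <stdint.h>
      
      /* 4KB pages: a 32-bit page number plus a 12-bit page offset gives a
       * 44-bit physical address. */
      static inline uint64_t phys_addr_from_page(uint32_t page_number)
      {
          return (uint64_t)page_number << 12;
      }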
      
      OS_Memory 12 (RecommendPage) has been extended to allow R4-R7 to be used
      to specify a (64bit) physical address range which the recommended pages
      must lie within. For backwards compatibility this defaults to 0-4GB.
  3. 17 Mar, 2021 2 commits
    • Remove CAM size limit · 79bc3343
      Jeffrey Lee authored
      Previously the CAM sat inside a fixed 16MB window, restricting it to
      storing the details of 1 million pages, i.e. 4GB of RAM. Shuffle things
      around a bit to allow this restriction to be removed: the CAM is now
      located just above the IO region, and the CAM start address /
      IO top will be calculated appropriately during kernel init. This change
      paves the way for us to support machines with over 4GB of RAM.
      
      FixedAreasTable has also been removed, since it's no longer really
      necessary (DAs can only be created between the top of application space
      and the bottom of the used IO space, and it's been a long time since
      we've had any fixed bits in the middle of there)
    • Initial long descriptor support · b51b5540
      Jeffrey Lee authored
      This adds initial support for the "long descriptor" MMU page table
      format, which allows the CPU to (flexibly) use a 40-bit physical address
      space.
      
      There are still some features that need fixing (e.g. RISCOS_MapInIO
      flags), and the OS doesn't yet support RAM above the 32bit limit, but
      this set of changes is enough to allow for working ROMs to be produced.
      
      Also, move MMUControlSoftCopy initialisation out of ClearWkspRAM, since
      it's unrelated to whether the HAL has cleared the RAM or not.
  4. 13 Feb, 2021 4 commits
    • [RISCOS_]AccessPhysicalAddress uses page flags · 7924aae2
      Jeffrey Lee authored
      Currently RISCOS_AccessPhysicalAddress allows the caller to specify the
      permissions/properties of the mapped memory by directly specifying some
      of the L1 page table entry flags. This will complicate things when
      adding support for more page table formats, so change it so that
      standard RISC OS page flags are used instead (like the alternate entry
      point, RISCOS_AccessPhysicalAddressUnchecked, already uses).
      
      Also, drop the "RISCOS_" prefix from RISCOS_AccessPhysicalAddress and
      RISCOS_ReleasePhysicalAddress, and remove the references to these
      routines from the HAL docs. These routines have never been exposed to
      the HAL, so renaming them and removing them from the docs should make
      their status clearer.
      
      Version 6.52. Tagged as 'Kernel-6_52'
    • OS_FindMemMapEntries now uses logical_to_physical · b60d3a70
      Jeffrey Lee authored
      Reduce the number of routines which directly examine the page tables, by
      changing OS_FindMemMapEntries to use logical_to_physical.
    • Start moving page table code into s.ShortDesc · ca69793c
      Jeffrey Lee authored
      In preparation for the addition of long descriptor page table support,
      start moving low-level page table routines into their own file
      (s.ShortDesc) so that we can add a corresponding long descriptor
      implementation in the future.
      
      * logical_to_physical, MakePageTablesCacheable,
      MakePageTablesNonCacheable, AllocateBackingLevel2, AMB_movepagesin_L2PT,
      AMB_movecacheablepagesout_L2PT, AMB_moveuncacheablepagesout_L2PT
      routines, and PageNumToL2PT macros, all moved to s.ShortDesc with no
      changes.
      * Add new UpdateL1PTForPageReplacement routine (by splitting some code
      out of s.ChangeDyn)
    • Prepare logical_to_physical for 64bit phys addrs · 4fd2dd01
      Jeffrey Lee authored
      ppn_to_physical, logical_to_physical & physical_to_ppn have now all
      been changed to accept or return 64bit physical addresses in
      R8,R9 instead of a 32bit address in R5. However, where a phys addr is
      being provided as an input, they may currently only pay attention to the
      bottom 32 bits of the address.
  5. 16 Jan, 2021 1 commit
    • Make supervisor stack inaccessible to user mode · bbc7ad20
      Jeffrey Lee authored
      Previously the supervisor stack was read-only in user mode, but since
      the supervisor stack is typically empty when the CPU is in user mode,
      it's questionable whether any software actually makes use of this
      facility.
      
      To simplify support for the long descriptor page table format (which
      doesn't support the user-RO + privileged-RW access mode), let's
      try and remove usermode SVC stack access completely.
      
      Tested on Raspberry Pi 4
      
      Version 6.48. Tagged as 'Kernel-6_48'
  6. 23 Nov, 2020 1 commit
    • Increase RamFS limit to 2GB · a81fa868
      Julie Stamp authored
      Detail:
      RamFS now supports disc sizes up to 2GB-4KB, so raise the dynamic area limit from 508MB.
      
      Admin:
      Tested with a disc size up to 928MB
      
      Version 6.46. Tagged as 'Kernel-6_46'
  7. 19 Sep, 2020 1 commit
    • OS_DynamicArea 22 fixes · 88219988
      Jeffrey Lee authored
      Multiple fixes, mostly related to error handling.
      
      1. Ensure R1 is initialised correctly when generating BadPageNumber
      errors (labels 94 & 95). Generally this involves setting it to zero to
      indicate that no call to LogOp_MapOut is required. Failing to do this
      would typically result in a crash.
      2. When branching back to the start of the loop after calling
      GetNonReservedPage, ensure R0 is reset to zero. Failing to do this would
      have a performance impact on LogOp_MapOut, but shouldn't be fatal.
      3. In the main routine, postpone writing back DANode_Size until after
      the call to physical_to_ppn (because we may decide to abort the op
      and return an error without moving a page).
      4. Fix stack offset when accessing PMPLogOp_GlobalTBLFlushNeeded.
      Getting this wrong could potentially result in some TLB maintenance
      being skipped when moving uncacheable pages.
      5. Fix stack imbalance at label 94
      
      Version 6.43. Tagged as 'Kernel-6_43'
  8. 01 Jul, 2020 6 commits
    • Add missing AMBControl appspace shrink check · 0634b535
      Jeffrey Lee authored
      Fix GrowFreePoolFromAppSpace (i.e. appspace shrink operation) to issue
      UpCall_MemoryMoving / Service_Memory when attempting to shrink PMP-based
      appspace (i.e. AMBControl nodes). This fixes (e.g.) BASIC getting stuck
      in an abort loop if you try and use OS_ChangeDynamicArea to grow the
      free pool.
      
      Version 6.40. Tagged as 'Kernel-6_40'
    • Fix PMP appspace size check · 5014117d
      Jeffrey Lee authored
      Fix AreaGrow to read appspace size correctly when appspace is a PMP
      (i.e. an AMBControl node). Reading DANode_Size will only report the
      amount of memory currently paged in (e.g. by lazy task swapping),
      causing AreaGrow to underestimate how much it can potentially take from
      the area.
    • Fix GrowFreePool · 1a3c927f
      Jeffrey Lee authored
      Buggy since its introduction in Kernel-5_35-4_79_2_284, GrowFreePool was
      attempting to grow the free pool by shrinking application space, an
      operation which OS_ChangeDynamicArea doesn't support. Change it to grow
      the free pool instead, and fix a couple of other issues that would have
      caused it to work incorrectly (register corruption causing it to request
      a size change of zero, and incorrect assumption that
      OS_ChangeDynamicArea returns the amount unmoved, when really it returns
      the amount moved)
    • Fix combined freepool + appspace shrink · 76d04b25
      Jeffrey Lee authored
      When a DA tries to grow by more than the free pool size, the kernel
      should try to take the necessary remaining amount from application
      space. Historically this was handled as a combined "take from freepool
      and appspace" operation, but with Kernel-5_35-4_79_2_284 this was
      changed to use a nested call to OS_ChangeDynamicArea, so first appspace
      is shrunk into the free pool and then the target DA is grown using just
      the free pool.
      
      However the code was foolishly trying to use ChangeDyn_AplSpace as the
      argument to OS_ChangeDynamicArea, which that call doesn't recognise as a
      valid DA number. Change it to use ChangeDyn_FreePool ("grow free pool
      from appspace"), and also fix up a stack imbalance that would have
      caused it to misbehave regardless of the outcome.
    • Fix shrinkables check in AreaGrow · 86fe0712
      Jeffrey Lee authored
      TryToShrinkShrinkables_Bytes expected both R1 and R2 to be byte counts,
      but AreaGrow was calling with R1 as a byte count and R2 as a page count.
      This would have caused it to request the first-found shrinkable to
      shrink more than necessary, and also confuse the rest of AreaGrow when
      the page-based R2 result of TryToShrinkShrinkables gets converted to a
      byte count (when AreaGrow wants it as a page count)
    • Fix Service_Memory when shrinking appspace · 323e88c6
      Jeffrey Lee authored
      Update AreaShrink so that (when shrinking appspace) CheckAppSpace is
      passed the change amount as a negative number, so that Service_Memory is
      issued with the correct sign.
      
      Fixes issue reported on the forums, where BASIC was getting confused
      because appspace shrinks were being reported as if they were a grow
      operation:
      
      https://www.riscosopen.org/forum/forums/4/topics/15067
      
      It looks like this bug was introduced in Kernel-5_35-4_79_2_284
      (introduction of PMPs), where the logic for appspace shrinks (which must
      be performed via a grow of the free pool or some other DA) was moved
      out of AreaGrow and into AreaShrink (because appspace shrinks are now
      internally treated as "shrink appspace into free pool")
  9. 27 Feb, 2020 1 commit
  10. 12 Feb, 2020 3 commits
    • Fixes for zero-size PMPs · 0830af41
      Jeffrey Lee authored
      OS_DynamicArea 21, 22 & 25 were using the value of the PMP page list
      pointer (DANode_PMP) to determine whether the dynamic area is a PMP or
      not. However, PMPs which have had their max physical size set to zero
      will don't have the page list allocated, which will cause the test to
      fail. Normally this won't matter (those calls can't do anything useful
      when used on PMPs with zero max size), except for the edge case of where
      the SWI has been given a zero-length page list as input. Because the
      calls checked the value of DANode_PMP, they would incorrectly return
      an error.
      
      Fix this by having the code check the DA flags instead. Also, add a
      check to OS_DynamicArea 23 (PMP resize), otherwise non-PMP DAs could end
      up having page lists allocated for them.
    • Fix stack imbalance in DA release · 3a26f20e
      Jeffrey Lee authored
      In OS_DynamicArea 2, a stack imbalance would occur if an error is
      encountered while releasing the physical pages of a PMP (R1-R8 pushed,
      but only R1-R7 pulled). Fix it, but also don't bother storing R1, since
      it's never modified.
    • PMP LogOp_MapOut fixes · a4ab6171
      Jeffrey Lee authored
      * Fix caching of page table entry flags (was never updating R9, so the
      flags would be recalculated for every page)
      * Fix use of flag in bottom bit of R6; if the flag was set, the
      early-exit case for having made all the cacheable pages uncacheable would
      never be hit, forcing it to loop through the full page list instead
  11. 18 Jan, 2020 1 commit
    • Fix OS_DynamicArea 21 handling of MaxCamEntry · 5f7b9b37
      Jeffrey Lee authored
      OS_DynamicArea 21 was treating MaxCamEntry as if it was the exclusive
      upper bound, when really it's the inclusive bound. The consequence of
      this was that PMPs were unable to explicitly claim the highest-numbered
      RAM page in the system.
      
      Version 6.31. Tagged as 'Kernel-6_31'
  12. 24 Nov, 2019 1 commit
    • Add OS_DynamicArea 27+28, for supporting lots of RAM · 9224a6ca
      Jeffrey Lee authored
      OS_DynamicArea 27 is the same as OS_DynamicArea 5 ("return free
      memory"), except the result is measured in pages instead of bytes,
      allowing it to behave sensibly on machines with many gigabytes of RAM.
      
      Similarly, OS_DynamicArea 28 is the same as OS_DynamicArea 7 (internal
      DA enumeration call used by TaskManager), except the returned size
      values are measured in pages instead of bytes. A flags word has also
      been added to allow for more expansion in the future.
      
      Hdr:OSMem now also contains some more definitions which external code
      will find useful.
      
      Version 6.29. Tagged as 'Kernel-6_29'
  13. 19 Nov, 2019 1 commit
    • Allow reservation of memory pages · 1f84ad9f
      Jeffrey Lee authored
      This change adds a new OS_Memory reason code, 23, for reserving memory
      without actually assigning it to a dynamic area. Other dynamic areas can
      still use the memory, but only the code that reserved it will be allowed
      to claim exclusive use over it (i.e. PageFlags_Unavailable).
      
      This is useful for systems such as the PCI heap, where physically
      contiguous memory is required, but the memory isn't needed all of the
      time. By reserving the pages, it allows other regular DAs to make use of
      the memory when the PCI heap is small. But when the PCI heap needs to
      grow, it guarantees that (if there's enough free memory in the system)
      the previously reserved pages can be allocated to the PCI heap.
      
      Notes:
      
      * Reservations are handled on an honour system; there's no checking that
      the program that reserved the memory is the one attempting to map it in.
      
      * For regular NeedsSpecificPages DAs, reserved pages can only be used if
      the special "RESV" R0 return value is used.
      
      * For PMP DAs, reserved pages can only be made Unavailable if the entry
      in the page block also specifies the Reserved page flag. The actual
      state of the Reserved flag can't be modified via PMP DA ops, the flag is
      only used to indicate the caller's permission/intent to make the page
      Unavailable.
      
      * If a PMP DA tries to make a Reserved page Unavailable without
      specifying the Reserved flag, the kernel will try to swap it out for a
      replacement page taken from the free pool (preserving the contents and
      generating Service_PagesUnsafe / Service_PagesSafe, as if another DA
      had claimed the page)
      
      Version 6.28. Tagged as 'Kernel-6_28'
  14. 30 Sep, 2019 1 commit
    • Allow runtime adjustment of AplWorkMaxSize · 0aeea07f
      Jeffrey Lee authored
      Detail:
      This adds a new OS_DynamicArea reason code, 26, for adjusting
      AplWorkMaxSize at runtime. This allows compatibility tools such as
      Aemulor to adjust the limit without resorting to patching the kernel.
      Any adjustment made to the value will affect the upper limit of
      application space, and the lower limit of dynamic area placement.
      Attempting to adjust beyond the compile-time upper/default limit, or
      such that it will interfere with existing dynamic areas / wimpslots,
      will result in an error.
      
      Relevant forum thread:
      https://www.riscosopen.org/forum/forums/11/topics/14734
      
      Admin:
      Tested on BB-xM, desktop active & inactive
      
      Version 6.24. Tagged as 'Kernel-6_24'
  15. 16 Aug, 2019 2 commits
    • Support supersection-mapped memory in OS_Memory 24 · bd294cf9
      Ben Avison authored
      To achieve this:
      * DecodeL1Entry and DecodeL2Entry return 64-bit physical addresses in
        r0 and r1, with additional return values shuffled up to r2 and r3
      * DecodeL1Entry now returns the section size, so callers can distinguish
        section- from supersection-mapped memory
      * PhysAddrToPageNo now accepts a 64-bit address (though since the physical
        RAM table is currently still all 32-bit, it will report any top-word-set
        addresses as being not in RAM)
      
      Version 6.22. Tagged as 'Kernel-6_22'
    • Support temporary mapping of IO above 4GB using supersections · 96913c1f
      Ben Avison authored
      Add a new reason code, OS_Memory 22, equivalent to OS_Memory 14, but
      accepting a 64-bit physical address in r1/r2. Current ARM architectures can
      only express 40-bit or 32-bit physical addresses in their page tables
      (depending on whether they feature the LPAE extension or not) so unlike
      OS_Memory 14, OS_Memory 22 can return an error if an invalid physical
      address has been supplied. OS_Memory 15 should still be used to release a
      temporary mapping, whether you claimed it using OS_Memory 14 or OS_Memory 22.
      
      The logical memory map has had to change to accommodate supersection mapping
      of the physical access window, which needs to be 16MB wide and aligned to a
      16MB boundary. This results in there being 16MB less logical address space
      available for dynamic areas on all platforms (sorry) and there is now a 1MB
      hole spare in the system address range (above IO).
      
      The internal function RISCOS_AccessPhysicalAddress has been changed to
      accept a 64-bit physical address. This function has been a candidate for
      adding to the kernel entry points from the HAL for a long time - enough that
      it features in the original HAL documentation - but has not been so added
      (at least not yet) so there are no API compatibility issues there.
      
      Requires RiscOS/Sources/Programmer/HdrSrc!2
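      
      A hedged sketch of claiming and releasing a temporary mapping from C
      with _swix. Only the details stated above are certain (reason code 22,
      the 64-bit physical address split across R1/R2, release via OS_Memory
      15); the size register and the register carrying the returned logical
      address are assumptions for illustration, so check the OS_Memory
      documentation before relying on them:
      
      #include <stdint.h>
      #include "kernel.h"
      #include "swis.h"
      
      void *map_high_io(uint64_t phys, unsigned int size)
      {
          void *log = NULL;
          /* R0 = 22, R1 = low word of the physical address, R2 = high word,
           * R3 = size (assumed); logical address assumed returned in R3. */
          if (_swix(OS_Memory, _INR(0,3) | _OUT(3),
                    22, (unsigned int)phys, (unsigned int)(phys >> 32), size,
                    &log) != NULL)
              return NULL;   /* e.g. address not expressible in the page tables */
          return log;
      }
      
      void unmap_high_io(void)
      {
          /* OS_Memory 15 releases the temporary mapping, whether it was
           * claimed with OS_Memory 14 or OS_Memory 22. */
          _swix(OS_Memory, _IN(0), 15);
      }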
  16. 30 Jun, 2018 1 commit
    • Simplify initial AplSpace claim · 526764e1
      ROOL authored
      Detail:
        As the application slot is now a normal dynamic area, there's no need to manipulate the CAM directly. Convert FudgeSomeAppSpace into an OS_ChangeDynamicArea SWI followed by memset().
        ChangeDyn.s: Offset by 32k to account for the -32k that dynamic area -1 has.
        NewReset.s: Delete FudgeSomeAppSpace and replace as above.
      Admin:
        Submission from Timothy Baldwin.
      
      Version 6.08. Tagged as 'Kernel-6_08'
  17. 14 Apr, 2018 1 commit
    • Fix ability for PMPs to claim specific pages · 5e3e9d38
      Jeffrey Lee authored
      Detail:
        s/ChangeDyn - Due to the way that some page flags map to the same bits as (different) DA flags, the Batcall that PMP_PreGrow makes in order to claim the requested page was getting confused and thinking that the special DMA PreGrow handler should be used instead of the DA-specific one (which in this case is a custom one responsible for claiming the right page). Modify PMP_PreGrow so that it only supplies DA flags to the Batcall, and patches in any custom page flags afterwards.
        Also swap magic number for appropriate symbol in PMPGrowHandler.
      Admin:
        Tested on BB-xM
        Fixes CAM corruption when a PMP claims a specific page, due to the PMP code and DA code disagreeing about which page should be used
      
      
      Version 6.00. Tagged as 'Kernel-6_00'
  18. 11 Jan, 2017 1 commit
  19. 13 Dec, 2016 4 commits
    • Implement support for cacheable pagetables · 65fa6a28
      Jeffrey Lee authored
      Detail:
        Modern ARMs (ARMv6+) introduce the possibility for the page table walk hardware to make use of the data cache(s) when performing memory accesses. This can significantly reduce the cost of a TLB miss on the system, and since the accesses are cache-coherent with the CPU it allows us to make the page tables cacheable for CPU (program) accesses also, improving the performance of page table manipulation by the OS.
        Even on ARMs where the page table walk can't use the data cache, it's been measured that page table manipulation operations can still benefit from placing the page tables in write-through or bufferable memory.
        So with that in mind, this set of changes updates the OS to allow cacheable/bufferable page tables to be used by the OS + MMU, using a system-appropriate cache policy.
        File changes:
        - hdr/KernelWS - Allocate workspace for storing the page flags that are to be used by the page tables
        - hdr/OSMem - Re-specify CP_CB_AlternativeDCache as having a different behaviour on ARMv6+ (inner write-through, outer write-back)
        - hdr/Options - Add CacheablePageTables option to allow switching back to non-cacheable page tables if necessary. Add SyncPageTables var which will be set {TRUE} if either the OS or the architecture requires a DSB after writing to a faulting page table entry.
        - s/ARM600, s/VMSAv6 - Add new SetTTBR & GetPageFlagsForCacheablePageTables functions. Update VMSAv6 for wider XCBTable (now 2 bytes per element)
        - s/ARMops - Update pre-ARMv7 MMU_Changing ARMops to drain the write buffer on entry if cacheable pagetables are in use (ARMv7+ already has this behaviour due to architectural requirements). For VMSAv6 Normal memory, change the way that the OS encodes the cache policy in the page table entries so that it's more compatible with the encoding used in the TTBR.
        - s/ChangeDyn - Update page table page flag handling to use PageTable_PageFlags. Make use of new PageTableSync macro.
        - s/Exceptions, s/AMBControl/memmap - Make use of new PageTableSync macro.
        - s/HAL - Update MMU initialisation sequence to make use of PageTable_PageFlags + SetTTBR
        - s/Kernel - Add PageTableSync macro, to be used after any write to a faulting page table entry
        - s/MemInfo - Update OS_Memory 0 page flag conversion. Update OS_Memory 24 to use new symbol for page table access permissions.
        - s/MemMap2 - Use PageTableSync. Add routines to enable/disable cacheable pagetables
        - s/NewReset - Enable cacheable pagetables once we're fully clear of the MMU initialisation sequence (doing it earlier would be trickier due to potential double-mapping)
      Admin:
        Tested on pretty much everything currently supported
        Delivers moderate performance benefits to page table ops on old systems (e.g. 10% faster), astronomical benefits on some new systems (up to 8x faster)
        Stats: https://www.riscosopen.org/forum/forums/3/topics/2728?page=2#posts-58015
      
      
      Version 5.71. Tagged as 'Kernel-5_71'
    • Make MMU_Changing ARMops perform the sub-operations in a sensible order · 9a96263a
      Jeffrey Lee authored
      Detail:
        For a while we've known that the correct way of doing cache maintenance on ARMv6+ (e.g. when converting a page from cacheable to non-cacheable) is as follows:
        1. Write new page table entry
        2. Flush old entry from TLB
        3. Clean cache + drain write buffer
        The MMU_Changing ARMops (e.g. MMU_ChangingEntry) implement the last two items, but in the wrong order. This has caused the operations to fall out of favour and cease to be used, even in pre-ARMv6 code paths where the effects of improper cache/TLB management perhaps weren't as readily visible.
        This change re-specifies the relevant ARMops so that they perform their sub-operations in the correct order to make them useful on modern ARMs, updates the implementations, and updates the kernel to make use of the ops wherever relevant.
        File changes:
        - Docs/HAL/ARMop_API - Re-specify all the MMU_Changing ARMops to state that they are for use just after a page table entry has been changed (as opposed to before - e.g. 5.00 kernel behaviour). Re-specify the cacheable ones to state that the TLB invalidatation comes first.
        - s/ARM600, s/ChangeDyn, s/HAL, s/MemInfo, s/VMSAv6, s/AMBControl/memmap - Replace MMU_ChangingUncached + Cache_CleanInvalidate pairs with equivalent MMU_Changing op
        - s/ARMops - Update ARMop implementations to do everything in the correct order
        - s/MemMap2 - Update ARMop usage, and get rid of some lingering sledgehammer logic from ShuffleDoublyMappedRegionForGrow
      Admin:
        Tested on pretty much everything currently supported
      
      
      Version 5.70. Tagged as 'Kernel-5_70'
    • Place restrictions on the use of cacheable doubly-mapped DAs · 2704c756
      Jeffrey Lee authored
      Detail:
        The kernel has always allowed software to create cacheable doubly-mapped DAs, despite the fact that the VIVT caches used on ARMv5 and below would have no way of keeping both of the mappings coherent
        This change places the following restrictions on doubly-mapped areas, to ensure that cache settings which can't be supported by the cache architecture of the CPU can't be selected:
        * On ARMv6 and below, cacheable doubly-mapped areas aren't supported.
          * Although ARMv6 has VIPT data caches, it's also subject to page colouring constraints which would require us to force the DA size to be a multiple of 16k. So for now keep things simple and disallow cacheable doubly-mapped areas on ARMv6.
        * On ARMv7 and above, cacheable doubly-mapped areas are allowed, but only if they are marked non-executable
          * The blocker to allowing executable cacheable doubly-mapped areas is the VIPT instruction caches; OS_SynchroniseCodeAreas (or callers of it) would need to know that a doubly-mapped area is in use so that they can flush both mappings from the I-cache. Although some chips do have PIPT instruction caches, again it isn't really worth supporting executable cacheable doubly-mapped areas at the moment.
        These changes also allow us to get rid of the expensive 'sledgehammer' logic when dealing with doubly-mapped areas
        File changes:
        - s/ARM600, s/VMSAv6 - Remove the sledgehammer logic, only perform cache/TLB maintenance for the required areas
        - s/ChangeDyn - Implement the required checks
        - s/MemMap2 - Move some cache maintenance logic into RemoveCacheabilityR0ByMinusR2, which previously would have had to be performed by the caller due to the sledgehammer paranoia
      Admin:
        Cacheable doubly-mapped DAs tested on iMx6 (tried making screen memory write-through cacheable; decent performance gain seen)
        Note OS_Memory 0 "make temporarily uncacheable" doesn't work on doubly-mapped areas, so cacheable doubly-mapped areas are not yet safe for general DMA
      
      
      Version 5.69. Tagged as 'Kernel-5_69'
    • Make s/ChangeDyn slightly more readable by splitting some routines out into a separate file · 4a6150dc
      Jeffrey Lee authored
      Detail:
        s/MemMap2 - New file containing assorted low-level memory mapping routines taken from s/ChangeDyn. N.B. There's no special significance to this being named "MemMap2", it's just a name that stuck due to some earlier (abandoned) changes which added a file named "MemMap".
        s/ChangeDyn - Remove the routines/chunks of code that were moved to s/MemMap2. Also some duplicate code removal (Regular DA grow code and DoTheGrowNotSpecified now rely on the new DoTheGrowCommon routine for doing the actual grow)
        s/GetAll - GET s/MemMap2 at an appropriate time
      Admin:
        Tested on pretty much everything currently supported
      
      
      Version 5.67. Tagged as 'Kernel-5_67'