1. 24 Nov, 2019 1 commit
    • Jeffrey Lee's avatar
      Add OS_DynamicArea 27+28, for supporting lots of RAM · 9224a6ca
      Jeffrey Lee authored
      OS_DynamicArea 27 is the same as OS_DynamicArea 5 ("return free
      memory"), except the result is measured in pages instead of bytes,
      allowing it to behave sensibly on machines with many gigabytes of RAM.
      Similarly, OS_DynamicArea 28 is the same as OS_DynamicArea 7 (internal
      DA enumeration call used by TaskManager), except the returned size
      values are measured in pages instead of bytes. A flags word has also
      been added to allow for more expansion in the future.
      Hdr:OSMem now also contains some more definitions which external code
      will find useful.
      Version 6.29. Tagged as 'Kernel-6_29'
  2. 19 Nov, 2019 1 commit
    • Jeffrey Lee's avatar
      Allow reservation of memory pages · 1f84ad9f
      Jeffrey Lee authored
      This change adds a new OS_Memory reason code, 23, for reserving memory
      without actually assigning it to a dynamic area. Other dynamic areas can
      still use the memory, but only the code that reserved it will be allowed
      to claim exclusive use over it (i.e. PageFlags_Unavailable).
      This is useful for systems such as the PCI heap, where physically
      contiguous memory is required, but the memory isn't needed all of the
      time. By reserving the pages, it allows other regular DAs to make use of
      the memory when the PCI heap is small. But when the PCI heap needs to
      grow, it guarantees that (if there's enough free memory in the system)
      the previously reserved pages can be allocated to the PCI heap.
      * Reservations are handled on an honour system; there's no checking that
      the program that reserved the memory is the one attempting to map it in.
      * For regular NeedsSpecificPages DAs, reserved pages can only be used if
      the special "RESV" R0 return value is used.
      * For PMP DAs, reserved pages can only be made Unavailable if the entry
      in the page block also specifies the Reserved page flag. The actual
      state of the Reserved flag can't be modified via PMP DA ops, the flag is
      only used to indicate the caller's permission/intent to make the page
      * If a PMP DA tries to make a Reserved page Unavailable without
      specifying the Reserved flag, the kernel will try to swap it out for a
      replacement page taken from the free pool (preserving the contents and
      generating Service_PagesUnsafe / Service_PagesSafe, as if another DA
      had claimed the page)
      Version 6.28. Tagged as 'Kernel-6_28'
  3. 30 Sep, 2019 1 commit
    • Jeffrey Lee's avatar
      Allow runtime adjustment of AplWorkMaxSize · 0aeea07f
      Jeffrey Lee authored
      This adds a new OS_DynamicArea reason code, 26, for adjusting
      AplWorkMaxSize at runtime. This allows compatibility tools such as
      Aemulor to adjust the limit without resorting to patching the kernel.
      Any adjustment made to the value will affect the upper limit of
      application space, and the lower limit of dynamic area placement.
      Attempting to adjust beyond the compile-time upper/default limit, or
      such that it will interfere with existing dynamic areas / wimpslots,
      will result in an error.
      Relevant forum thread:
      Tested on BB-xM, desktop active & inactive
      Version 6.24. Tagged as 'Kernel-6_24'
  4. 16 Aug, 2019 2 commits
    • Ben Avison's avatar
      Support temporary mapping of IO above 4GB using supersections · 96913c1f
      Ben Avison authored
      Add a new reason code, OS_Memory 22, equivalent to OS_Memory 14, but
      accepting a 64-bit physical address in r1/r2. Current ARM architectures can
      only express 40-bit or 32-bit physical addresses in their page tables
      (depending on whether they feature the LPAE extension or not) so unlike
      OS_Memory 14, OS_Memory 22 can return an error if an invalid physical
      address has been supplied. OS_Memory 15 should still be used to release a
      temporary mapping, whether you claimed it using OS_Memory 14 or OS_Memory 22.
      The logical memory map has had to change to accommodate supersection mapping
      of the physical access window, which needs to be 16MB wide and aligned to a
      16MB boundary. This results in there being 16MB less logical address space
      available for dynamic areas on all platforms (sorry) and there is now a 1MB
      hole spare in the system address range (above IO).
      The internal function RISCOS_AccessPhysicalAddress has been changed to
      accept a 64-bit physical address. This function has been a candidate for
      adding to the kernel entry points from the HAL for a long time - enough that
      it features in the original HAL documentation - but has not been so added
      (at least not yet) so there are no API compatibility issues there.
      Requires RiscOS/Sources/Programmer/HdrSrc!2
    • Ben Avison's avatar
      Support permanent mapping of IO above 4GB using supersections · 9024d1f6
      Ben Avison authored
      This is facilitated by two extended calls. From the HAL:
      * RISCOS_MapInIO64 allows the physical address to be specified as 64-bit
      From the OS:
      * OS_Memory 21 acts like OS_Memory 13, but takes a 64-bit physical address
      There is no need to extend RISCOS_LogToPhys, instead we change its return
      type to uint64_t. Any existing HALs will only read the a1 register, thereby
      narrowing the result to 32 bits, which is fine because all existing HALs
      only expected a 32-bit physical address space anyway.
      Internally, RISCOS_MapInIO has been rewritten to detect and use supersections
      for IO regions that end above 4GB. Areas that straddle the 4GB boundary should
      also work, although if you then search for a sub-area that doesn't, it won't
      find a match and will instead map it in again using vanilla sections - this is
      enough of an edge case that I don't think we need to worry about it too much.
      The rewrite also conveniently fixes a bug in the old code: if the area being
      mapped in went all the way up to physical address 0xFFFFFFFF (inclusive) then
      only the first megabyte of the area was actually mapped in due to a loop
      termination issue.
      Requires RiscOS/Sources/Programmer/HdrSrc!2
  5. 19 Aug, 2017 1 commit
    • Jeffrey Lee's avatar
      Add a compatibility page zero for high processor vectors / zero page relocation builds · ffac5791
      Jeffrey Lee authored
        When HiProcVecs is enabled, there will now be a read-only page located at &0 in order to ease compatibility with buggy software which reads from null pointers
        Although most of the page is zero-filled, the start of the page contains a few words which are invalid pointers, discouraging dereferencing them, and a warning message if the memory is interpreted as a string.
        On ARMv6+ the page is also made non-executable, to deal with branch-through-zero type situations
        OS_Memory 20 has been introduced as a way of determining whether the compatibility page is present, and also to enable/disable it
        File changes:
        - hdr/Options - Add CompatibilityPage option
        - hdr/OSMem - Declare OS_Memory reason code 20
        - hdr/KernelWS - When CompatibilityPage is enabled, make sure nothing else is located at &0
        - s/NewReset - Enable compatibility page just before Service_PostInit (try and keep zero-tolerance policy for null pointer dereferencing during ROM init)
        - s/MemInfo - OS_Memory 20 implementation. Add knowledge of the compatibility page to OS_Memory 16 and 24.
        Tested on BB-xM
      Version 5.87. Tagged as 'Kernel-5_87'
  6. 12 Aug, 2017 1 commit
    • Jeffrey Lee's avatar
      Add OS_Memory 19, which is intended to replace the OS_Memory 0 "make... · b47fdbb1
      Jeffrey Lee authored
      Add OS_Memory 19, which is intended to replace the OS_Memory 0 "make uncacheable" feature, when used for DMA
        Making pages uncacheable to allow them to be used with DMA can be troublesome for a number of reasons:
        * Many processors ignore cache hits for non-cacheable pages, so to avoid breaking any IRQ handlers the page table manipulation + cache maintenance must be performed with IRQs disabled, impacting the IRQ latency of the system
        * Some processors don't support LDREX/STREX to non-cacheable pages
        * In SMP setups it may be necessary to temporarily park the other cores somewhere safe, or perform some other explicit synchronisation to make sure they all have consistent views of the cache/TLB
        The above issues are most likely to cause problems when the page is shared by multiple programs; a DMA operation which targets one part of a page could impact the programs which are using the other parts.
        To combat these problems, OS_Memory 19 is being introduced, which allows DMA cache coherency/address translation to be performed without altering the attributes of the pages.
        Files changed:
        - hdr/OSMem - Add definitions for OS_Memory 19
        - s/MemInfo - Add OS_Memory 19 implementation
        Tested on Raspberry Pi 3, iMx6
      Version 5.86, Tagged as 'Kernel-5_86-4_129_2_3'
  7. 13 Dec, 2016 1 commit
    • Jeffrey Lee's avatar
      Implement support for cacheable pagetables · 65fa6a28
      Jeffrey Lee authored
        Modern ARMs (ARMv6+) introduce the possibility for the page table walk hardware to make use of the data cache(s) when performing memory accesses. This can significantly reduce the cost of a TLB miss on the system, and since the accesses are cache-coherent with the CPU it allows us to make the page tables cacheable for CPU (program) accesses also, improving the performance of page table manipulation by the OS.
        Even on ARMs where the page table walk can't use the data cache, it's been measured that page table manipulation operations can still benefit from placing the page tables in write-through or bufferable memory.
        So with that in mind, this set of changes updates the OS to allow cacheable/bufferable page tables to be used by the OS + MMU, using a system-appropriate cache policy.
        File changes:
        - hdr/KernelWS - Allocate workspace for storing the page flags that are to be used by the page tables
        - hdr/OSMem - Re-specify CP_CB_AlternativeDCache as having a different behaviour on ARMv6+ (inner write-through, outer write-back)
        - hdr/Options - Add CacheablePageTables option to allow switching back to non-cacheable page tables if necessary. Add SyncPageTables var which will be set {TRUE} if either the OS or the architecture requires a DSB after writing to a faulting page table entry.
        - s/ARM600, s/VMSAv6 - Add new SetTTBR & GetPageFlagsForCacheablePageTables functions. Update VMSAv6 for wider XCBTable (now 2 bytes per element)
        - s/ARMops - Update pre-ARMv7 MMU_Changing ARMops to drain the write buffer on entry if cacheable pagetables are in use (ARMv7+ already has this behaviour due to architectural requirements). For VMSAv6 Normal memory, change the way that the OS encodes the cache policy in the page table entries so that it's more compatible with the encoding used in the TTBR.
        - s/ChangeDyn - Update page table page flag handling to use PageTable_PageFlags. Make use of new PageTableSync macro.
        - s/Exceptions, s/AMBControl/memmap - Make use of new PageTableSync macro.
        - s/HAL - Update MMU initialisation sequence to make use of PageTable_PageFlags + SetTTBR
        - s/Kernel - Add PageTableSync macro, to be used after any write to a faulting page table entry
        - s/MemInfo - Update OS_Memory 0 page flag conversion. Update OS_Memory 24 to use new symbol for page table access permissions.
        - s/MemMap2 - Use PageTableSync. Add routines to enable/disable cacheable pagetables
        - s/NewReset - Enable cacheable pagetables once we're fully clear of the MMU initialision sequence (doing earlier would be trickier due to potential double-mapping)
        Tested on pretty much everything currently supported
        Delivers moderate performance benefits to page table ops on old systems (e.g. 10% faster), astronomical benefits on some new systems (up to 8x faster)
        Stats: https://www.riscosopen.org/forum/forums/3/topics/2728?page=2#posts-58015
      Version 5.71. Tagged as 'Kernel-5_71'
  8. 02 Aug, 2016 1 commit
    • Jeffrey Lee's avatar
      Add support for shareable pages and additional access privileges · 9cd4cbe4
      Jeffrey Lee authored
        This set of changes:
        * Refactors page table entry encoding/decoding so that it's (mostly) performed via functions in the MMU files (s.ARM600, s.VMSAv6) rather than on an ad-hoc basis as was the case previously
        * Page table entry encoding/decoding performed during ROM init is also handled via the MMU functions, which resolves some cases where the wrong cache policy was in use on ARMv6+
        * Adds basic support for shareable pages - on non-uniprocessor systems all pages will be marked as shareable (however, we are currently lacking ARMops which broadcast cache maintenance operations to other cores, so safe sharing of cacheable regions isn't possible yet)
        * Adds support for the VMSA XN flag and the "privileged ROM" access permission. These are exposed via RISC OS access privileges 4 and above, taking advantage of the fact that 4 bits have always been reserved for AP values but only 4 values were defined
        * Adds OS_Memory 17 and 18 to convert RWX-style access flags to and from RISC OS access privelege numbers; this allows us to make arbitrary changes to the mappings of AP values 4+ between different OS/hardware versions, and allows software to more easily cope with cases where the most precise AP isn't available (e.g. no XN on <=ARMv5)
        * Extends OS_Memory 24 (CheckMemoryAccess) to return executability information
        * Adds exported OSMem header containing definitions for OS_Memory and OS_DynamicArea
        File changes:
        - Makefile - export C and assembler versions of hdr/OSMem
        - Resources/UK/Messages - Add more text for OS_Memory errors
        - hdr/KernelWS - Correct comment regarding DCacheCleanAddress. Allocate workspace for MMU_PPLTrans and MMU_PPLAccess.
        - hdr/OSMem - New file containing exported OS_Memory and OS_DynamicArea constants, and public page flags
        - hdr/Options - Reduce scope of ARM6support to only cover builds which require ARMv3 support
        - s/AMBControl/Workspace - Clarify AMBNode_PPL usage
        - s/AMBControl/growp, mapslot, mapsome, memmap - Use AreaFlags_ instead of AP_
        - s/AMBControl/main, memmap - Use GetPTE instead of generating page table entry manually
        - s/ARM600 - Remove old coments relating to lack of stack. Update BangCam to use GetPTE. Update PPL tables, removing PPLTransL1 (L1 entries are now derived from L2 table on demand) and adding a separate table for ARM6. Implement the ARM600 versions of the Get*PTE ('get page table entry') and Decode*Entry functions
        - s/ARMops - Add Init_PCBTrans function to allow relevant MMU_PPLTrans/MMU_PCBTrans pointers to be set up during the pre-MMU stage of ROM init. Update ARM_Analyse to set up the pointers that are used post MMU init.
        - s/ChangeDyn - Move a bunch of flags to hdr/OSMem. Rename the AP_ dynamic area flags to AreaFlags_ to avoid name clashes and confusion with the page table AP_ values exported by Hdr:MEMM.ARM600/Hdr:MEMM.VMSAv6. Also generate the relevant flags for OS_Memory 24 so that it can refer to the fixed areas by their name instead of hardcoding the permissions.
        - s/GetAll - GET Hdr:OSMem
        - s/HAL - Change initial page table setup to use DA/page flags and GetPTE instead of building page table entries manually. Simplify AllocateL2PT by removing the requirement for the user to supply the access perimssions that will be used for the area; instead for ARM6 we just assume that cacheable memory is the norm and set L1_U for any L1 entry we create here.
        - s/Kernel - Add GetPTE macro (for easier integration of Get*PTE functions) and GenPPLAccess macro (for easy generation of OS_Memory 24 flags)
        - s/MemInfo - Fixup OS_Memory 0 to not fail on seeing non-executable pages. Implement OS_Memory 17 & 18. Tidy up some error generation. Make OS_Memory 13 use GetPTE. Extend OS_Memory 24 to return (non-) executability information, to use the named CMA_ constants generated by s/ChangeDyn, and to use the Decode*Entry functions when it's necessary to decode page table entries.
        - s/NewReset - Use AreaFlags_ instead of AP_
        - s/VMSAv6 - Remove old comments relating to lack of stack. Update BangCam to use GetPTE. Update PPL tables, removing PPLTransL1 (L1 entries are now derived from L2 table on demand) and adding a separate table for shareable pages. Implement the VMSAv6 versions of the Get*PTE and Decode*Entry functions.
        Tested on Raspberry Pi 1, Raspberry Pi 3, Iyonix, RPCEmu (ARM6 & ARM7), comparing before and after CAM and page table dumps to check for any unexpected differences
      Version 5.55. Tagged as 'Kernel-5_55'