1. 14 Jan, 2023 1 commit
  2. 28 Jul, 2021 3 commits
    • Jeffrey Lee's avatar
      Add OS_AbortTrap implementation · c199c178
      Jeffrey Lee authored
      This supports all the load/store instructions, including FPA & VFP/NEON.
      Most instructions are handled directly via the base version of the
      AbortTrap API that was first implemented in RISC OS Select. However, to
      properly cope with LDREX/STREX, and future support for prefetch aborts,
      the API has been extended to allow the kernel to request that a block of
      memory is mapped in with certain permissions. For LDREX/STREX the kernel
      will then rewind the PC so that the instruction can be retried directly.
      
      Test code in Dev/AbortTrap exists in order to allow checking of all
      major functionality, along with code for building the code in a
      softloadable module for easier/quicker testing.
      c199c178
    • Jeffrey Lee's avatar
      Add extra LTORG to s.HAL · e7152ebd
      Jeffrey Lee authored
      Needed to resolve some literal pool range issues when long descriptor
      page table support is enabled
      e7152ebd
    • Jeffrey Lee's avatar
      Allow RW/ZI sections to be used · 2b665896
      Jeffrey Lee authored
      * Instruct the linker to place any RW/ZI data sections in the last ~16MB
      of the memory map, starting from &ff000000 (with the current toolchain,
      giving it a fixed base address is much easier than giving it a variable
      base address)
      * The RW/ZI section is mapped as completely inaccessible to user mode
      * The initial content of the RW section is copied over shortly after MMU
      startup (in Continue_after_HALInit)
      * Since link's -bin option produces a file containing a copy of the
      (zero-initialised) ZI section, the kernel binary is now produced from a
      "binary with AIF header" AIF with the help of the new 'kstrip' tool.
      kstrip extracts just the RO and RW sections, ensuring the ROM doesn't
      contain a redundant block of zeros for the ZI section.
      
      This should make it easier to use C code & arbitrary libraries within
      the kernel, providing they're compiled with suitable settings (e.g.
      non-module, no FP, no stack checking, like HALs typically use)
      2b665896
  3. 30 Apr, 2021 1 commit
    • Jeffrey Lee's avatar
      Fix compressed ROM support · 2a3ad40a
      Jeffrey Lee authored
      When PhysRamTable was updated to store addresses in page units instead
      of byte units (commit df4efb68), the code which allocates the ROM
      decompression workspace didn't get updated, causing it to break. Add a
      few extra shifts to the code in order to account for the changes.
      
      Fixes issue reported on forums with (compressed) OMAP3 ROM failing to
      boot: https://www.riscosopen.org/forum/forums/5/topics/16446
      
      Version 6.57. Tagged as 'Kernel-6_57'
      2a3ad40a
  4. 28 Apr, 2021 6 commits
    • Jeffrey Lee's avatar
      Log -> phys conversion improvements · 46081bca
      Jeffrey Lee authored
      * RISCOS_LogToPhys upgraded to allow it to cope with all page types
      (added support for 64KB "large" pages and lazily-mapped pages)
      
      * Added OS_Memory 65, which calls through to RISCOS_LogToPhys, to allow
      regular software to do logical-to-physical conversions for all page
      types (other calls, like OS_Memory 0/64, typically only work with 4KB
      pages)
      
      * LoadAndDecodeL2Entry updated to always return a page/entry size, like
      LoadAndDecodeL1Entry
      
      Version 6.56. Tagged as 'Kernel-6_56'
      46081bca
    • Jeffrey Lee's avatar
      Support runtime selection of pagetable format · ba993cb5
      Jeffrey Lee authored
      Runtime selection between long descriptor and short descriptor page
      table format is now possible (with the decision based on whether the HAL
      registers any high RAM or not). The main source changes are as follows:
      
      * LongDesc and ShortDesc switches are in hdr.Options to control what
      kernel variant is built
      * PTOp and PTWhich macros introduced in hdr.ARMops to allow for
      invocation of functions / code blocks which are specific to the page
      table format. If the kernel is being built with only one page table
      format enabled, PTOp is just a BL instruction, ensuring there's no
      performance loss compared to the old code.
      * _LongDesc and _ShortDesc suffixes added to various function names, to
      allow both versions of the function to be included at once if runtime
      selection is enabled
      * Most of the kernel / MMU initialisation code in s.HAL is now encased
      in a big WHILE loop, allowing it to be duplicated if runtime switching
      is enabled (easier than adding dynamic branches all over the place, and
      only costs a few KB of ROM/RAM)
      * Some more functions (notably AccessPhysicalAddress,
      ReleasePhysicalAddress, and MapInIO) have been moved to s.ShortDesc /
      s.LongDesc since they were already 90% specific to page table format
      ba993cb5
    • Jeffrey Lee's avatar
      Remove 1MB bodge from LongDesc LoadAndDecodeL1Entry · ce95d42e
      Jeffrey Lee authored
      LoadAndDecodeL1Entry will now always return the size/alignment of the
      entry. This allows ConstructCAMfromPageTables to walk over a 2MB long
      descriptor page table pointer in one go, instead of splitting it into
      two 1MB chunks (as if short descriptor page tables were in use) and
      calling LoadAndDecodeL1Entry twice. This has allowed the 1MB result
      alignment bodge to be removed from the LongDesc version of
      LoadAndDecodeL1Entry.
      ce95d42e
    • Jeffrey Lee's avatar
      Add MaxCamEntry32 & CPUFlag_HighRAM · 5bd42912
      Jeffrey Lee authored
      MaxCamEntry32 is an internal variable which the kernel can use to
      quickly determine whether a RAM page has a 32bit physical address or
      something larger, by comparing with the physical page number (currently
      entries in PhysRamTable are sorted such that all 32bit pages come first)
      
      CPUFlag_HighRAM (aka OS_PlatformFeatures 0 bit 21) is a flag that
      external code can use to detect whether any high RAM is present, and
      thus whether 64bit physical address APIs should be preferred over 32bit
      ones (once the new APIs are implemented!). Using APIs which only support
      32bit physical addresses will result in functionality being limited.
      5bd42912
    • Jeffrey Lee's avatar
      Support RAM banks with high physical addresses · df4efb68
      Jeffrey Lee authored
      This changes PhysRamTable to store the address of each RAM bank in terms
      of (4KB) pages instead of bytes, effectively allowing it to support a 44
      bit physical address space. This means that (when the long descriptor
      page table format is used) the OS can now make use of memory located
      outside the lower 4GB of the physical address space. However some
      public APIs still need extending to allow for all operations to be
      supported on high RAM (e.g. OS_Memory logical to physical address
      lookups)
      
      OS_Memory 12 (RecommendPage) has been extended to allow R4-R7 to be used
      to specify a (64bit) physical address range which the recommended pages
      must lie within. For backwards compatibility this defaults to 0-4GB.
      df4efb68
    • Jeffrey Lee's avatar
      Fix RISCOS_AddRAM memory table description · 402f32c2
      Jeffrey Lee authored
      There are 20 length bits per entry, not 22
      402f32c2
  5. 17 Mar, 2021 4 commits
    • Jeffrey Lee's avatar
      Initial large phys addr support for RISCOS_AddRAM · 21a340f4
      Jeffrey Lee authored
      Define that bit 12 of the RISCOS_AddRAM flags indicates that the
      supplied start, end, and sigbits values are in 4KB units instead of byte
      units. This allows a 44 bit address space to be used, higher than the 40
      bit LPAE limit.
      
      The page list that RISCOS_AddRAM constructs will now store everything in
      4KB page units, however any RAM above 4GB will currently be thrown away
      when the list is later transferred to the PhysRamTable which the OS uses
      at runtime.
      
      Version 6.54. Tagged as 'Kernel-6_54'
      21a340f4
    • Jeffrey Lee's avatar
      Remove CAM size limit · 79bc3343
      Jeffrey Lee authored
      Previously the CAM sat inside a fixed 16MB window, restricting it to
      storing the details of 1 million pages, i.e. 4GB of RAM. Shuffle things
      around a bit to allow this restriction to be removed: the CAM is now
      located just above the IO region, and the CAM start address /
      IO top will calculated appropriately during kernel init. This change
      paves the way for us to support machines with over 4GB of RAM.
      
      FixedAreasTable has also been removed, since it's no longer really
      necessary (DAs can only be created between the top of application space
      and the bottom of the used IO space, and it's been a long time since
      we've had any fixed bits in the middle of there)
      79bc3343
    • Jeffrey Lee's avatar
      Initial long descriptor support · b51b5540
      Jeffrey Lee authored
      This adds initial support for the "long descriptor" MMU page table
      format, which allows the CPU to (flexibly) use a 40-bit physical address
      space.
      
      There are still some features that need fixing (e.g. RISCOS_MapInIO
      flags), and the OS doesn't yet support RAM above the 32bit limit, but
      this set of changes is enough to allow for working ROMs to be produced.
      
      Also, move MMUControlSoftCopy initialisation out of ClearWkspRAM, since
      it's unrelated to whether the HAL has cleared the RAM or not.
      b51b5540
    • Jeffrey Lee's avatar
      Ensure PhysIllegalMask is initialised correctly · a7240617
      Jeffrey Lee authored
      Remove the lazy initialisation of PhysIllegalMask and instead manually
      initialise it during MMU init. This fixes some situations where the lazy
      initialisation doesn't work (PhysIllegalMask isn't in a zero-initialised
      area of workspace, so if the HAL isn't doing a RAM clear then it could
      be garbage)
      a7240617
  6. 13 Feb, 2021 4 commits
    • Jeffrey Lee's avatar
      [RISCOS_]AccessPhysicalAddress uses page flags · 7924aae2
      Jeffrey Lee authored
      Currently RISCOS_AccessPhysicalAddress allows the caller to specify the
      permissions/properties of the mapped memory by directly specifying some
      of the L1 page table entry flags. This will complicate things when
      adding support for more page table formats, so change it so that
      standard RISC OS page flags are used instead (like the alternate entry
      point, RISCOS_AccessPhysicalAddressUnchecked, already uses).
      
      Also, drop the "RISCOS_" prefix from RISCOS_AccessPhysicalAddress and
      RISCOS_ReleasePhysicalAddress, and remove the references to these
      routines from the HAL docs. These routines have never been exposed to
      the HAL, so renaming them and removing them from the docs should make
      their status clearer.
      
      Version 6.52. Tagged as 'Kernel-6_52'
      7924aae2
    • Jeffrey Lee's avatar
      Remove more direct page table access · 858949b6
      Jeffrey Lee authored
      RISCOS_LogToPhys and OS_Memory 20 (compatibility page) changed to use
      suitable subroutines for reading the page tables instead of accessing
      them directly.
      858949b6
    • Jeffrey Lee's avatar
      DecodeL1/L2Entry -> LoadAndDecodeL1/L2Entry · 846eee02
      Jeffrey Lee authored
      Change the DecodeL1/L2Entry routines so that instead of accepting a page
      table entry as input, they accept a (suitable-aligned) logical address
      and fetch the page table entry themselves. This helps insulate the
      calling code from the finer details of the page table format.
      846eee02
    • Jeffrey Lee's avatar
      Prepare logical_to_physical for 64bit phys addrs · 4fd2dd01
      Jeffrey Lee authored
      ppn_to_physical, logical_to_physical, physical_to_ppn & ppn_to_physical
      have now all been changed to accept/receive 64bit physical addresses in
      R8,R9 instead of a 32bit address in R5. However, where a phys addr is
      being provided as an input, they may currently only pay attention to the
      bottom 32 bits of the address.
      4fd2dd01
  7. 23 Jan, 2021 1 commit
    • Timothy E Baldwin's avatar
      Preserve LR and CPSR in DebugTX · 59d9f948
      Timothy E Baldwin authored
      This ensures that the debug macros may be freeley placed without
      concern for if LR and CPSR contain important values, and avoid the
      of code only working because the debug code changes the flags with
      the effect of a bug appearing when debugging is enabled.
      
      Version 6.49. Tagged as 'Kernel-6_49'
      59d9f948
  8. 28 Oct, 2020 1 commit
    • John Ballance's avatar
      Build fix to DebugTerminal · 070ae73d
      John Ballance authored
      Changes in Kernel-6_44 left an undefined symbol in s.HAL in DebugTerminal_Rdch. These mods resolve that, and also add a note in s.Kernel to reflect this usage.
      
      Version 6.45. Tagged as 'Kernel-6_45'
      070ae73d
  9. 12 Feb, 2020 1 commit
  10. 21 Sep, 2019 1 commit
    • Jeffrey Lee's avatar
      Fix a couple of RISCOS_MapInIO bugs · 0ee90f05
      Jeffrey Lee authored
      Detail:
      
      - s/HAL - Fix ADD v. SUB muddle that could prevent addresses from being rounded down correctly. Fix incorrect logical address being returned to caller on pre-ARMv6 machines due to PageTableSync corrupting a1.
      - s/NewReset - Initialising the CMOS RAM cache while in the middle of setting up the processor vectors feels a bit silly. Move the code to just afterwards so that it feels a bit safer, and so that early crashes are easier to debug (processor vectors in stable state)
      
      Admin:
      
      Tested on Iyonix.
      Fixes ROM softload failure reported on forums:
      https://www.riscosopen.org/forum/forums/11/topics/14749
      
      Version 6.23. Tagged as 'Kernel-6_23'
      0ee90f05
  11. 16 Aug, 2019 3 commits
    • Ben Avison's avatar
      Support supersection-mapped memory in OS_Memory 24 · bd294cf9
      Ben Avison authored
      To achieve this:
      * DecodeL1Entry and DecodeL2Entry return 64-bit physical addresses in
        r0 and r1, with additional return values shuffled up to r2 and r3
      * DecodeL1Entry now returns the section size, so callers can distinguish
        section- from supersection-mapped memory
      * PhysAddrToPageNo now accepts a 64-bit address (though since the physical
        RAM table is currently still all 32-bit, it will report any top-word-set
        addresses as being not in RAM)
      
      Version 6.22. Tagged as 'Kernel-6_22'
      bd294cf9
    • Ben Avison's avatar
      Support temporary mapping of IO above 4GB using supersections · 96913c1f
      Ben Avison authored
      Add a new reason code, OS_Memory 22, equivalent to OS_Memory 14, but
      accepting a 64-bit physical address in r1/r2. Current ARM architectures can
      only express 40-bit or 32-bit physical addresses in their page tables
      (depending on whether they feature the LPAE extension or not) so unlike
      OS_Memory 14, OS_Memory 22 can return an error if an invalid physical
      address has been supplied. OS_Memory 15 should still be used to release a
      temporary mapping, whether you claimed it using OS_Memory 14 or OS_Memory 22.
      
      The logical memory map has had to change to accommodate supersection mapping
      of the physical access window, which needs to be 16MB wide and aligned to a
      16MB boundary. This results in there being 16MB less logical address space
      available for dynamic areas on all platforms (sorry) and there is now a 1MB
      hole spare in the system address range (above IO).
      
      The internal function RISCOS_AccessPhysicalAddress has been changed to
      accept a 64-bit physical address. This function has been a candidate for
      adding to the kernel entry points from the HAL for a long time - enough that
      it features in the original HAL documentation - but has not been so added
      (at least not yet) so there are no API compatibility issues there.
      
      Requires RiscOS/Sources/Programmer/HdrSrc!2
      96913c1f
    • Ben Avison's avatar
      Support permanent mapping of IO above 4GB using supersections · 9024d1f6
      Ben Avison authored
      This is facilitated by two extended calls. From the HAL:
      * RISCOS_MapInIO64 allows the physical address to be specified as 64-bit
      
      From the OS:
      * OS_Memory 21 acts like OS_Memory 13, but takes a 64-bit physical address
      
      There is no need to extend RISCOS_LogToPhys, instead we change its return
      type to uint64_t. Any existing HALs will only read the a1 register, thereby
      narrowing the result to 32 bits, which is fine because all existing HALs
      only expected a 32-bit physical address space anyway.
      
      Internally, RISCOS_MapInIO has been rewritten to detect and use supersections
      for IO regions that end above 4GB. Areas that straddle the 4GB boundary should
      also work, although if you then search for a sub-area that doesn't, it won't
      find a match and will instead map it in again using vanilla sections - this is
      enough of an edge case that I don't think we need to worry about it too much.
      
      The rewrite also conveniently fixes a bug in the old code: if the area being
      mapped in went all the way up to physical address 0xFFFFFFFF (inclusive) then
      only the first megabyte of the area was actually mapped in due to a loop
      termination issue.
      
      Requires RiscOS/Sources/Programmer/HdrSrc!2
      9024d1f6
  12. 23 Jun, 2019 1 commit
  13. 04 Aug, 2018 1 commit
    • Jeffrey Lee's avatar
      Fix OS_Hardware 3 to be re-entrant · 6f9f922b
      Jeffrey Lee authored
      Detail:
        s/HAL - OS_Hardware 3 (remove HAL device) will now re-scan the device list for the device following the Service_Hardware call, so that the device list won't become corrupt if the service call triggers addition/removal of devices.
      Admin:
        Tested on iMX6
        *HDMIOff now correctly removes the HDMI audio device and SoundDMA's software mixer device (SoundDMA removes mixer in response to the HDMI audio device vanishing, but re-entrancy bug meant that the HDMI device was left on the list)
        Note that this only covers re-entrancy via Service_Hardware. OS_Hardware 2/3/4/5 are not re-entrant from other locations (e.g. IRQ handlers or memory allocation service calls).
      
      
      Version 6.12. Tagged as 'Kernel-6_12'
      6f9f922b
  14. 09 Sep, 2017 1 commit
    • ROOL's avatar
      Change module initialisation to be a two pass scheme · ac1ea0f5
      ROOL authored
      Detail:
        To make it easier to support arbitrary complexity keyboard controllers (eg. USB via DWCDriver on the Pi) have the kernel do the early keyboard recovery key press detection instead of the HAL.
        During the first pass those modules used for reading the keyboard are started, ignoring the CMOS frugal bits.
        The keyboard is then scanned for 3s, during which time the RAM is cleared (unless the HAL indicated it has already been done).
        During the second pass the remaining modules are started respecting the CMOS frugal bits. Any which were already started in the first pass are inserted into the new chain, so the keyboard is reset once and only once.
      
        Boot times, with a 300cs key scan time in NewReset.
        Risc PC with 160MB RAM (128+32+0).
        Times from turning on power to initial "beep", using a stopwatch.
                      RISC OS 3.70 RISC OS 5.22 This OS
        ARM610        12.5         10.4         10.3
        ARM710        11.8         10.2         9.7
        StrongARM 233 11.1         9.5          8.4
      
        In NewReset.s:
        Remove old KbdScan code (leave Reset_IRQ_Handler for IIC only)
        If HAL_KbdScanDependencies returns a null string then present KbdDone flag and skip to full init.
        A few vestiges of soft resets removed.
        Do RAM clear when waiting for INKEY (being careful not to trash the running modules...).
        Clearing just the freepool on a 2GB Titanium cleared 7EFD6 pages (99.2%).
      
        In ModHand.s:
        2nd pass need to sneaky renumber the nodes (so *ROMModules is in the right order, frugal bits line up) without resetting the chain
      
        In HAL.s:
        Change ClearPhysRAM to ClearWkspRAM, such that it only clears the kernel workspace rather than all RAM. The bulk of the RAM is cleared during the keyboard scan by new function ClearFreePoolSection.
        Add a variant of Init_MapInRAM which clears the mapped in RAM too (as these very early claims will not be in the free pool when the RAM is cleared later).
        Remove HAL keyboard scan setup & IRQ handler.
        Fix bug in HALDebugHexTX2, the input value needs pre-shifting by 16b before continuing.
      
        In GetAll.s, PMF/osbyte.s:
        Use Hdr:Countries and Hdr:OsBytes for constants.
      
        In PMF/key.s, PMF/osinit.s:
        Relocate the key post init from PostInit to KeyPostInit.
        Changed PostInit to not tail call KeyPostInit so they can be called independently.
      
        In hdr/KernelWs:
        Improve comments, add InitWsStart label to refer to.
      
        In hdr/HALEntries:
        Add HAL_KbdScanDependencies.
        Delete KbdFlag exports.
        Took the opportunity to reorder some of the higher numbered HAL entries and re-grouping, specifically (112,120) (84,106,108,117).
      Admin:
        Tested on an ARM6/ARM7/SA Risc PC, BeagleBoard xM, Iyonix, Pandaboard ES, Wandboard Quad, IPEGv5, Titanium, Pi 2 and 3.
        Requires corresponding HAL change.
        Submission for USB bounty.
      
      Version 5.89. Tagged as 'Kernel-5_89'
      ac1ea0f5
  15. 13 Dec, 2016 4 commits
    • Jeffrey Lee's avatar
      Implement some ARM11 errata workarounds · bfd65b5c
      Jeffrey Lee authored
      Detail:
        s/ARMops, s/HAL - Add workarounds for some of the scary errata that were previously weren't dealing with (720013, 716151, 714068)
      Admin:
        Tested on Raspberry Pi 1
      
      
      Version 5.72. Tagged as 'Kernel-5_72'
      bfd65b5c
    • Jeffrey Lee's avatar
      Implement support for cacheable pagetables · 65fa6a28
      Jeffrey Lee authored
      Detail:
        Modern ARMs (ARMv6+) introduce the possibility for the page table walk hardware to make use of the data cache(s) when performing memory accesses. This can significantly reduce the cost of a TLB miss on the system, and since the accesses are cache-coherent with the CPU it allows us to make the page tables cacheable for CPU (program) accesses also, improving the performance of page table manipulation by the OS.
        Even on ARMs where the page table walk can't use the data cache, it's been measured that page table manipulation operations can still benefit from placing the page tables in write-through or bufferable memory.
        So with that in mind, this set of changes updates the OS to allow cacheable/bufferable page tables to be used by the OS + MMU, using a system-appropriate cache policy.
        File changes:
        - hdr/KernelWS - Allocate workspace for storing the page flags that are to be used by the page tables
        - hdr/OSMem - Re-specify CP_CB_AlternativeDCache as having a different behaviour on ARMv6+ (inner write-through, outer write-back)
        - hdr/Options - Add CacheablePageTables option to allow switching back to non-cacheable page tables if necessary. Add SyncPageTables var which will be set {TRUE} if either the OS or the architecture requires a DSB after writing to a faulting page table entry.
        - s/ARM600, s/VMSAv6 - Add new SetTTBR & GetPageFlagsForCacheablePageTables functions. Update VMSAv6 for wider XCBTable (now 2 bytes per element)
        - s/ARMops - Update pre-ARMv7 MMU_Changing ARMops to drain the write buffer on entry if cacheable pagetables are in use (ARMv7+ already has this behaviour due to architectural requirements). For VMSAv6 Normal memory, change the way that the OS encodes the cache policy in the page table entries so that it's more compatible with the encoding used in the TTBR.
        - s/ChangeDyn - Update page table page flag handling to use PageTable_PageFlags. Make use of new PageTableSync macro.
        - s/Exceptions, s/AMBControl/memmap - Make use of new PageTableSync macro.
        - s/HAL - Update MMU initialisation sequence to make use of PageTable_PageFlags + SetTTBR
        - s/Kernel - Add PageTableSync macro, to be used after any write to a faulting page table entry
        - s/MemInfo - Update OS_Memory 0 page flag conversion. Update OS_Memory 24 to use new symbol for page table access permissions.
        - s/MemMap2 - Use PageTableSync. Add routines to enable/disable cacheable pagetables
        - s/NewReset - Enable cacheable pagetables once we're fully clear of the MMU initialision sequence (doing earlier would be trickier due to potential double-mapping)
      Admin:
        Tested on pretty much everything currently supported
        Delivers moderate performance benefits to page table ops on old systems (e.g. 10% faster), astronomical benefits on some new systems (up to 8x faster)
        Stats: https://www.riscosopen.org/forum/forums/3/topics/2728?page=2#posts-58015
      
      
      Version 5.71. Tagged as 'Kernel-5_71'
      65fa6a28
    • Jeffrey Lee's avatar
      Make MMU_Changing ARMops perform the sub-operations in a sensible order · 9a96263a
      Jeffrey Lee authored
      Detail:
        For a while we've known that the correct way of doing cache maintenance on ARMv6+ (e.g. when converting a page from cacheable to non-cacheable) is as follows:
        1. Write new page table entry
        2. Flush old entry from TLB
        3. Clean cache + drain write buffer
        The MMU_Changing ARMops (e.g. MMU_ChangingEntry) implement the last two items, but in the wrong order. This has caused the operations to fall out of favour and cease to be used, even in pre-ARMv6 code paths where the effects of improper cache/TLB management perhaps weren't as readily visible.
        This change re-specifies the relevant ARMops so that they perform their sub-operations in the correct order to make them useful on modern ARMs, updates the implementations, and updates the kernel to make use of the ops whereever relevant.
        File changes:
        - Docs/HAL/ARMop_API - Re-specify all the MMU_Changing ARMops to state that they are for use just after a page table entry has been changed (as opposed to before - e.g. 5.00 kernel behaviour). Re-specify the cacheable ones to state that the TLB invalidatation comes first.
        - s/ARM600, s/ChangeDyn, s/HAL, s/MemInfo, s/VMSAv6, s/AMBControl/memmap - Replace MMU_ChangingUncached + Cache_CleanInvalidate pairs with equivalent MMU_Changing op
        - s/ARMops - Update ARMop implementations to do everything in the correct order
        - s/MemMap2 - Update ARMop usage, and get rid of some lingering sledgehammer logic from ShuffleDoublyMappedRegionForGrow
      Admin:
        Tested on pretty much everything currently supported
      
      
      Version 5.70. Tagged as 'Kernel-5_70'
      9a96263a
    • Jeffrey Lee's avatar
      Add new ARMops. Add macros which map the ARMv7/v8 cache/TLB maintenance... · 48485eee
      Jeffrey Lee authored
      Add new ARMops. Add macros which map the ARMv7/v8 cache/TLB maintenance mnemonics (as featured in recent ARM ARMs) to MCR ops.
      
      Detail:
        - Docs/HAL/ARMop_API - Document the new ARMops. These ops are intended to help with future work (DMA without OS_Memory 0 "make temp uncacheable", and minimising cache maintenance when unmapping pages) and aren't in use just yet.
        - hdr/Copro15ops - Add new macros for ARMv7+ which map the mnemonics seen in recent ARM ARMs to the corresponding MCR ops. This should make things easier when cross-referencing docs and reduce the risk of typos.
        - hdr/KernelWS - Shuffle kernel workspace a bit to make room for the new ARMops
        - hdr/OSMisc - Expose new ARMops via OS_MMUControl 2
        - s/ARMops - Implement the new ARMops. Change the ARMv7+ ARMops to use the new mnemonic macros. Also get rid of myDSB / myISB usage from ARMv7+ code paths; use DSB/ISB/etc. directly to ensure correct behaviour
        - s/HAL - Mnemonic + ISB/DSB updates. Change software RAM clear to do 16 bytes at a time for kernel workspace instead of 32 to allow the kernel workspace tweaks to work.
      Admin:
        Binary diff shows that mnemonics map to the original MCR ops correctly
        Note: Raspberry Pi builds will now emit lots of warnings due to increased DSB/ISB instruction use. However it should be safe to ignore these as they should only be present in v7+ code paths.
        Note: New ARMops haven't been tested yet, will be disabled (or at least hidden from user code) in a future checkin
      
      
      Version 5.68. Tagged as 'Kernel-5_68'
      48485eee
  16. 02 Aug, 2016 1 commit
    • Jeffrey Lee's avatar
      Add support for shareable pages and additional access privileges · 9cd4cbe4
      Jeffrey Lee authored
      Detail:
        This set of changes:
        * Refactors page table entry encoding/decoding so that it's (mostly) performed via functions in the MMU files (s.ARM600, s.VMSAv6) rather than on an ad-hoc basis as was the case previously
        * Page table entry encoding/decoding performed during ROM init is also handled via the MMU functions, which resolves some cases where the wrong cache policy was in use on ARMv6+
        * Adds basic support for shareable pages - on non-uniprocessor systems all pages will be marked as shareable (however, we are currently lacking ARMops which broadcast cache maintenance operations to other cores, so safe sharing of cacheable regions isn't possible yet)
        * Adds support for the VMSA XN flag and the "privileged ROM" access permission. These are exposed via RISC OS access privileges 4 and above, taking advantage of the fact that 4 bits have always been reserved for AP values but only 4 values were defined
        * Adds OS_Memory 17 and 18 to convert RWX-style access flags to and from RISC OS access privelege numbers; this allows us to make arbitrary changes to the mappings of AP values 4+ between different OS/hardware versions, and allows software to more easily cope with cases where the most precise AP isn't available (e.g. no XN on <=ARMv5)
        * Extends OS_Memory 24 (CheckMemoryAccess) to return executability information
        * Adds exported OSMem header containing definitions for OS_Memory and OS_DynamicArea
        File changes:
        - Makefile - export C and assembler versions of hdr/OSMem
        - Resources/UK/Messages - Add more text for OS_Memory errors
        - hdr/KernelWS - Correct comment regarding DCacheCleanAddress. Allocate workspace for MMU_PPLTrans and MMU_PPLAccess.
        - hdr/OSMem - New file containing exported OS_Memory and OS_DynamicArea constants, and public page flags
        - hdr/Options - Reduce scope of ARM6support to only cover builds which require ARMv3 support
        - s/AMBControl/Workspace - Clarify AMBNode_PPL usage
        - s/AMBControl/growp, mapslot, mapsome, memmap - Use AreaFlags_ instead of AP_
        - s/AMBControl/main, memmap - Use GetPTE instead of generating page table entry manually
        - s/ARM600 - Remove old coments relating to lack of stack. Update BangCam to use GetPTE. Update PPL tables, removing PPLTransL1 (L1 entries are now derived from L2 table on demand) and adding a separate table for ARM6. Implement the ARM600 versions of the Get*PTE ('get page table entry') and Decode*Entry functions
        - s/ARMops - Add Init_PCBTrans function to allow relevant MMU_PPLTrans/MMU_PCBTrans pointers to be set up during the pre-MMU stage of ROM init. Update ARM_Analyse to set up the pointers that are used post MMU init.
        - s/ChangeDyn - Move a bunch of flags to hdr/OSMem. Rename the AP_ dynamic area flags to AreaFlags_ to avoid name clashes and confusion with the page table AP_ values exported by Hdr:MEMM.ARM600/Hdr:MEMM.VMSAv6. Also generate the relevant flags for OS_Memory 24 so that it can refer to the fixed areas by their name instead of hardcoding the permissions.
        - s/GetAll - GET Hdr:OSMem
        - s/HAL - Change initial page table setup to use DA/page flags and GetPTE instead of building page table entries manually. Simplify AllocateL2PT by removing the requirement for the user to supply the access perimssions that will be used for the area; instead for ARM6 we just assume that cacheable memory is the norm and set L1_U for any L1 entry we create here.
        - s/Kernel - Add GetPTE macro (for easier integration of Get*PTE functions) and GenPPLAccess macro (for easy generation of OS_Memory 24 flags)
        - s/MemInfo - Fixup OS_Memory 0 to not fail on seeing non-executable pages. Implement OS_Memory 17 & 18. Tidy up some error generation. Make OS_Memory 13 use GetPTE. Extend OS_Memory 24 to return (non-) executability information, to use the named CMA_ constants generated by s/ChangeDyn, and to use the Decode*Entry functions when it's necessary to decode page table entries.
        - s/NewReset - Use AreaFlags_ instead of AP_
        - s/VMSAv6 - Remove old comments relating to lack of stack. Update BangCam to use GetPTE. Update PPL tables, removing PPLTransL1 (L1 entries are now derived from L2 table on demand) and adding a separate table for shareable pages. Implement the VMSAv6 versions of the Get*PTE and Decode*Entry functions.
      Admin:
        Tested on Raspberry Pi 1, Raspberry Pi 3, Iyonix, RPCEmu (ARM6 & ARM7), comparing before and after CAM and page table dumps to check for any unexpected differences
      
      
      Version 5.55. Tagged as 'Kernel-5_55'
      9cd4cbe4
  17. 30 Jun, 2016 2 commits
    • Jeffrey Lee's avatar
      Delete lots of old switches · f655fcf6
      Jeffrey Lee authored
      Detail:
        This change gets rid of the following switches from the source (picking appropriate code paths for a 32bit HAL build):
        * FixCallBacks
        * UseProcessTransfer
        * CanLiveOnROMCard
        * BleedinDaveBell
        * NewStyleEcfs
        * DoVdu23_0_12
        * LCDPowerCtrl
        * HostVdu
        * Print
        * EmulatorSupport
        * TubeInfo
        * AddTubeBashers
        * TubeChar, TubeString, TubeDumpNoStack, TubeNewlNoStack macros
        * FIQDebug
        * VCOstartfix
        * AssemblingArthur (n.b. still defined for safety with anything in Hdr: which uses it, but not used explicitly by the kernel)
        * MouseBufferFix
        * LCDInvert
        * LCDSupport
        * DoInitialiseMode
        * Interruptible32bitModes
        * MouseBufferManager
        * StrongARM (new CacheCleanerHack and InterruptDelay switches added to hdr/Options to cover some functionality that StrongARM previously covered)
        * SAcleanflushbroken
        * StrongARM_POST
        * IrqsInClaimRelease
        * CheckProtectionLink
        * GSWorkspaceInKernelBuffers
        * EarlierReentrancyInDAShrink
        * LongCommandLines
        * ECC
        * NoSPSRcorruption
        * RMTidyDoesNowt
        * RogerEXEY
        * StorkPowerSave
        * DebugForcedReset
        * AssembleKEYV
        * AssemblePointerV
        * ProcessorVectors
        * Keyboard_Type
        Assorted old files have also been deleted.
      Admin:
        Identical binary to previous revision for IOMD & Raspberry Pi builds
      
      
      Version 5.51. Tagged as 'Kernel-5_51'
      f655fcf6
    • Jeffrey Lee's avatar
      Delete pre-HAL and 26bit code · 7d5bfc66
      Jeffrey Lee authored
      Detail:
        This change gets rid of the following switches from the source (picking appropriate code paths for a 32bit HAL build):
        * HAL
        * HAL26
        * HAL32
        * No26bitCode
        * No32bitCode
        * IncludeTestSrc
        * FixR9CorruptionInExtensionSWI
        Various old files have also been removed (POST code, Arc/STB keyboard drivers, etc.)
      Admin:
        Identical binary to previous revision for IOMD & Raspberry Pi builds
      
      
      Version 5.49. Tagged as 'Kernel-5_49'
      7d5bfc66
  18. 15 Jun, 2016 1 commit
    • Jeffrey Lee's avatar
      Clear the exclusive monitor when returning to pre-empted code · d10d2336
      Jeffrey Lee authored
      Detail:
        s/Kernel - Add macro for CLREX, which uses a dummy STREX on basic ARMv6 machines. Clear the exclusive monitor after issuing transient callbacks, to cope with callbacks being triggered on exit from IRQ
        s/ArthurSWIs, s/HAL, s/NewIRQs - Clear the exclusive monitor on exit from IRQ handlers & default FIQ handler
        s/VMSAv6 - Clear the exclusive monitor on entry to the data abort pre-veneer
      Admin:
        Tested on Raspberry Pi
        Non-transient callback handlers, custom abort handlers, FIQ handlers, and anything else which returns directly to interrupted user code is responsible for issuing its own CLREX if the code has done something that could have left the local monitor in the exclusive state (e.g. calling a SWI counts towards this, as there's no guarantee the monitor will be open on exit from the SWI)
      
      
      Version 5.35, 4.79.2.327. Tagged as 'Kernel-5_35-4_79_2_327'
      d10d2336
  19. 20 May, 2016 1 commit
    • Jeffrey Lee's avatar
      Fix CPU features being clobbered by software RAM clear · 36bf6e21
      Jeffrey Lee authored
      Detail:
        s/ARMops, s/HAL - Move CPU feature init to after the RAM clear, to prevent the cached values being clobbered on platforms where the HAL doesn't perform the RAM clear
        s/CPUFeatures - Update/clarify comment
      Admin:
        Tested on Raspberry Pi
        Fixes issue spotted by Sprow
      
      
      Version 5.35, 4.79.2.320. Tagged as 'Kernel-5_35-4_79_2_320'
      36bf6e21
  20. 10 Mar, 2016 1 commit
    • Jeffrey Lee's avatar
      Cache maintenance fixes · b0682acb
      Jeffrey Lee authored
      Detail:
        This set of changes tackles two main issues:
        * Before mapping out a cacheable page or making it uncacheable, the OS performs a cache clean+invalidate op. However this leaves a small window where data may be fetched back into the cache, either accidentally (dodgy interrupt handler) or via agressive prefetch (as allowed for by the architecture). This rogue data can then result in coherency issues once the pages are mapped out or made uncacheable a short time later.
          The fix for this is to make the page uncacheable before performing the cache maintenance (although this isn't ideal, as prior to ARMv7 it's implementation defined whether address-based cache maintenance ops affect uncacheable pages or not - and on ARM11 it seems that they don't, so for that CPU we currently force a full cache clean instead)
        * Modern ARMs generally ignore unexpected cache hits, so there's an interrupt hole in the current OS_Memory 0 "make temporarily uncacheable" implementation where the cache is being flushed after the page has been made uncacheable (consider the case of a page that's being used by an interrupt handler, but the page is being made uncacheable so it can also be used by DMA). As well as affecting ARMv7+ devices this was found to affect XScale (and ARM11, although untested for this issue, would have presumably suffered from the "can't clean uncacheable pages" limitation)
          The fix for this is to disable IRQs around the uncache sequence - however FIQs are currently not being dealt with, so there's still a potential issue there.
        File changes:
        - Docs/HAL/ARMop_API, hdr/KernelWS, hdr/OSMisc - Add new Cache_CleanInvalidateRange ARMop
        - s/ARM600, s/VMSAv6 - BangCam updated to make the page uncacheable prior to flushing the cache. Add GetTempUncache macro to help with calculating the page flags required for making pages uncacheable. Fix abort in OS_MMUControl on Raspberry Pi - MCR-based ISB was resetting ZeroPage pointer to 0
        - s/ARMops - Cache_CleanInvalidateRange implementations. PL310 MMU_ChangingEntry/MMU_ChangingEntries refactored to rely on Cache_CleanInvalidateRange_PL310, which should be a more optimal implementation of the cache cleaning code that was previously in MMU_ChangingEntry_PL310.
        - s/ChangeDyn - Rename FastCDA_UpFront to FastCDA_Bulk, since the cache maintenance is no longer performed upfront. CheckCacheabilityR0ByMinusR2 now becomes RemoveCacheabilityR0ByMinusR2. PMP LogOp implementation refactored quite a bit to perform cache/TLB maintenance after making page table changes instead of before. One flaw with this new implementation is that mapping out large areas of cacheable pages will result in multiple full cache cleans while the old implementation would have (generally) only performed one - a two-pass approach over the page list would be needed to solve this.
        - s/GetAll - Change file ordering so GetTempUncache macro is available earlier
        - s/HAL - ROM decompression changed to do full MMU_Changing instead of MMU_ChangingEntries, to make sure earlier cached data is truly gone from the cache. ClearPhysRAM changed to make page uncacheable before flushing cache.
        - s/MemInfo - OS_Memory 0 interrupt hole fix
        - s/AMBControl/memmap - AMB_movepagesout_L2PT now split into cacheable+non-cacheable variants. Sparse map out operation now does two passes through the page list so that they can all be made uncacheable prior to the cache flush + map out.
      Admin:
        Tested on StrongARM, XScale, ARM11, Cortex-A7, Cortex-A9, Cortex-A15, Cortex-A53
        Appears to fix the major issues plaguing SATA on IGEPv5
      
      
      Version 5.35, 4.79.2.306. Tagged as 'Kernel-5_35-4_79_2_306'
      b0682acb
  21. 29 Feb, 2016 1 commit
    • Jeffrey Lee's avatar
      OS_Memory 13/14/15 fixes · eb908a1e
      Jeffrey Lee authored
      Detail:
        s/HAL - Change RISCOS_AccessPhysicalAddress & RISCOS_ReleasePhysicalAddress (aka OS_Memory 14 & 15) to use the MMU_ChangingUncached ARMop instead of TLB_InvalidateEntry, as on ARMv6+ the MMU version ensures the write has been flushed to be visible by the TLB, while the TLB invalidate call doesn't.
        Fix RISCOS_MapInIO (aka OS_Memory 13) not detecting regions which have already been mapped in due to L1_XN flag masking issue. Also issue DSB+ISB after the page table write(s) to ensure it's visible by the TLB hardware.
      Admin:
        Tested on IGEPv5
      
      
      Version 5.35, 4.79.2.305. Tagged as 'Kernel-5_35-4_79_2_305'
      eb908a1e