RISC OS and the "HAL"
=====================

RISC OS currently is tied to the IOMD20 and VIDC20 peripheral set,
descendents of the original IOC, MEMC and VIDC devices designed in parallel
with the original ARM. These devices provide a close fit with RISC OS, and
their functionality is well suited to general purpose and embedded systems,
but the continuing drive to reduce cost requires us to support other
peripheral sets on off-the-shelf ARM system on chips.

First targets for support are L7205/L7210 for Customer L and CL92xx (the new
ARM920T based devices) for Customer A. Enclosed are a summary of their
advantages and disadvantages over the ARM7500FE for our Information
Appliance designs.

              L7205                       CL92xx
            + Faster (50% or so)        + Faster (400%+)
            + SDRAM support             + SDRAM support
            + USB                       + USB
                                        + EIDE interface
                                        + Lots of GPIO
            - No hardware cursor
            - No floating point         - Incompatible floating point
            - No video DACs             - No video DACs
            - No PS/2                   - No PS/2
                                        - Bizarre MS-Windows video system


To support these devices, and others in the future, a simple HAL is to be
inserted underneath RISC OS. This will provide two functions. Firstly, it
will be responsible for initial system bootstrap, much like a PC BIOS, and
secondly it will provide simple APIs to allow hardware access.

The HAL APIs are a thin veneer on top of the hardware. They are designed to
act as replacements for all the hardware knowledge and manipulation performed
by the RISC OS Kernel, together with some APIs that will allow RISC OS driver
modules to become more hardware independent. No attempt will be made (at this
stage) to perform such tasks as separating the video drivers from the Kernel,
for example.

One tricky design decision is the amount of abstraction to aim for. Too
little, and the system is not flexible enough; too much and HAL design is
needlessly complicated for simple hardware. The present design tries to
err on the side of too little abstraction. Extra, more abstract APIs can
always be added later. So, initially, for example, the serial device API
will just provide discovery, some capability flags and the base address
of the UART register set. This will be sufficient for the vast majority
of devices. If new hardware comes along later that isn't UART compatible,
a new API can be defined. Simple hardware can continue to just report
UART base addresses.

The bulk of device driver implementation remains in RISC OS modules - the
difference is that the HAL will allow many device drivers to no longer
directly access hardware. For example, PS2Driver can now use HAL calls to
send and receive bytes through the PS/2 ports, and thus is no longer tied to
IOMD's PS/2 hardware. Similarly, interrupt masking and unmasking, as
performed by any device vector claimant, is now a HAL call. Note that HAL
calls are normally performed via a Kernel SWI - alternatively the Kernel
can return the address of specific Kernel routines. There is nothing to stop
specific drivers talking to hardware directly, as long as they accept that
this will tie them to specific devices.

This dividing line between the HAL and RISC OS driver modules is crucial. If
the HAL does everything, then we have achieved nothing - we have just as much
hardware dependent code - it's just in a different place. It is important to
place the dividing line as close to the hardware as possible, to make it easy
to design a HAL and to prevent large amounts of code  duplication between
HALs for different platforms.

The Kernel remains responsible for the ARM's MMU and all other aspects of the
CPU core. The HAL requires no knowledge of details of ARM implementations,
and thus any HAL implementation should work on any processor from the ARM610
to the ARM940T or XScale.


OS independence
===============

Notionally, the HAL implementation is OS independent. It makes no assumptions
about the virtual memory map of the OS, and only uses the defined HAL->OS
entries. The HAL should not call RISC OS SWIs.

In practice, however, the HALs are unlikely to be used on anything other
than RISC OS, and many HALs are likely to be written. This makes it sensible
to place as much intelligence as possible within RISC OS itself, to prevent
duplicated effort.


Calling standards
=================

RISC OS and the HAL are two separate entities, potentially linked separately.
Thus some simple dynamic linking is required. This occurs via a hybrid of the
RISC OS module header and Shared C Library stubs. Each RISC OS/HAL entry is
given a unique (arbitrary) number, starting at 0. The offset to each entry is
given in an entry table. Calls can be made manually through this table, or
stubs could be created at run-time to allow high-level language calls.

Every entry (up to the declared maximum) must exist. If not implemented, a
failure response must be returned, or the call ignored, as appropriate.

To permit high-level language use in the future, the procedure call standard
in both directions is ATPCS, with no use of floating point, no stack limit
checking, no frame pointers, and no Thumb interworking. HAL code is expected
to be ROPI and RWPI (hence it is called with its static workspace base in
sb). The OS kernel is neither ROPI nor RWPI (except for the pre-MMU calls,
which are ROPI).

The HAL will always be called in a privileged mode - if called in an
interrupt mode, the corresponding interrupts will be disabled. The HAL should
not change mode. HAL code should work in both 26-bit and 32-bit modes (but
should assume 32-bit configuration).


Header formats
==============

The OS is linked to run at a particular base address. At present, the address
will be at <n>MB + 64KB. This allows a HAL of up to 64K to be placed at the
bottom of a ROM below the OS, and the whole thing to be section-mapped.
However, if a different arrangement is used, the system will still work
(albeit slightly less efficiently).

The OS starts with a magic word - this aids probing and location of images.
Following that is a defined header format:

Word 0: Magic word ("OSIm" - &6D49534F)
Word 1: Flags (0)
Word 2: Image size
Word 3: Offset from base to entry table
Word 4: Number of entries available

The HAL itself may have whatever header is required to start the system. For
example on ARM7500 16->32 bit switch code is required, and on the 9500 parts
a special ROM header and checksum must be present. Instead of a header,
a pointer to the HAL descriptor is passed to the OS in the OS_Start call:

Word 0: Flags
            bit 0 => uncachable workspace (32K) required
            bits 1-31 reserved
Word 1: Offset from descriptor to start of HAL (will be <= 0)
Word 2: HAL size
Word 3: Offset from descriptor to entry table
Word 4: Number of entries available
Word 5: Static workspace required

Each of the HAL and the OS must be contiguous within physical memory.


RISC OS entry points from HAL init
==================================


Entry 0:
void RISCOS_InitARM(unsigned int flags)

    flags: reserved - sbz

On entry:
  SVC mode
  MMU and caches off
  IRQs and FIQs disabled
  No RAM or stack used

On exit:
  Instruction cache may be on

Usage:
  This routine must be called once very early on in the HAL start-up, to accelerate the
  CPU for the rest of HAL initialisation. Typically, it will just enable the instruction
  cache (if possible on the ARM in use), and ensure that the processor is in 32-bit
  configuration and mode.

  Some architecture 4 (and later) ARMs have bits in the control register that affect
  the hardware layer - eg the iA and nF bits in the ARM920T. These are the HAL's
  responsibility - the OS will not touch them. Conversely, the HAL should not touch the
  cache, MMU and core configuration bits (currently bits 0-14).

  On architecture 3, the control register is write only - the OS will set bits 11-31 to
  zero.

  Likewise, such things as the StrongARM 110's register 15 (Test, Clock and Idle Control)
  are the HAL's responsibility. The OS does not know about the configuration of the
  system, so cannot program such registers.

  This entry may not be called after RISCOS_Start.


Entry 1:
void *RISCOS_AddRAM(unsigned int flags, void *start, void *end, uintptr_t sigbits, void *ref)
   flags
        bit 0: video memory (only first contiguous range will be used)
        bits 8-11: speed indicator (arbitrary, higher => faster)
        other bits reserved (SBZ)
   start
        start address of RAM (inclusive) (no alignment requirements)
   end
        end address of RAM (exclusive) (no alignment requirements, but must be >= start)
   sigbits
        significant address bit mask (1 => this bit of addr decoded, 0 => this bit ignored)
   ref
        reference handle (NULL for first call)

Returns ref for next call

On entry:
  SVC32 mode
  MMU and data cache off
  IRQs and FIQs disabled

Other notes:
  This entry point must be the first call from the HAL to RISC OS following a hardware
  reset. It may be called as many times as necessary to give all enumerate RAM that
  is available for general purpose use. It should only be called to declare video
  memory if the video memory may be used as normal RAM when in small video modes.

  To permit software resets:
    The HAL must be non-destructive of any declared RAM outside the first 4K of the first
    block.
    The stack pointer should be initialised 4K into the first block, or in some non-
    declared RAM.
    Must present memory in a fixed order on any given system.

  Current limitations:
    The first block must be at least 256K and 16K aligned. (Yuck)
    Block coalescing only works well if RAM banks are added in ascending address order.

  RISC OS will use RAM at the start of the first block as initial workspace. Max usage
  is 16 bytes per block + 32 (currently 8 per block + 4). This limits the number of
  discontiguous blocks (although RISC OS will concatanate contiguous blocks where
  possible).

  This call must not be made after RISCOS_Start.


Entry 2:
void RISCOS_Start(unsigned int flags, int *riscos_header, int *hal_entry_table, void *ref)

   flags
        bit 0: power on reset
        bit 1: CMOS reset inhibited (eg protection link on Risc PC)
        bit 2: perform a CMOS reset (if bit 1 clear and bit 0 set - eg front panel
                                     button held down on an NC)

On entry:
  SVC32 mode
  MMU and data cache off
  IRQs and FIQs disabled

Usage:
  This routine must be called after all calls to RISCOS_AddRAM have been completed.
  It does not return. Future calls back to the HAL are via the HAL entry table, after
  the MMU has been enabled.


Entry 3:
void *RISCOS_MapInIO(unsigned int flags, void *phys, unsigned int size)

   flags: bit 2 => make memory bufferable
    phys: physical address to map in
    size: number of bytes of memory to map in

Usage:
  This routine is used to map in IO memory for the HAL's usage. Normally it would
  only be called during HAL_Init(). Once mapped in the IO space cannot be released.

  It returns the resultant virtual address corresponding to phys, or 0 for failure.
  Failure can only occur if no RAM is available for page tables, or if the virtual
  address space is exhausted.


void *RISCOS_AccessPhysicalAddress(unsigned int flags, uint64_t phys, void **oldp)

   flags: bit 2 => make memory bufferable
          other bits must be zero
    phys: physical address to access
    oldp: pointer to location to store old state (or NULL)

On entry:
  Privileged mode
  MMU on
  FIQs on
  Re-entrant

On exit:
  Returns logical address corresponding to phys

Usage:
  Arranges for the physical address phys to be mapped in to logical memory.
  In fact, at least the whole megabyte containing "phys" is mapped in (ie if phys =
  &12345678, then &12300000 to &123FFFFF become available). The memory is
  supervisor access only, non-cacheable, non-bufferable by default, and will
  remain available until the next call to RISCOS_Release/AccessPhysicalAddress
  (although interrupt routines or subroutines may temporarily map in something
  else).

  When finished, the user should call RISCOS_ReleasePhysicalAddress.


void RISCOS_ReleasePhysicalAddress(void *old)

  old: state returned from a previous call to RISCOS_AccessPhysicalAddress

On entry:
  MMU on
  FIQs on
  Re-entrant

Usage:
  Call with the a value output from a previous RISCOS_ReleasePhysicalAddress.

Example:

  void *old;
  uint64_t addr_physical = (uint64_t) 0x80005000;
  uint64_t addr2_physical = (uint64_t) 0x90005000;
  uint32_t *addr_logical;
  uint32_t *addr2_logical;

  addr_logical = (unsigned int *) RISCOS_AccessPhysicalAddress(0, addr_physical, &old);
  addr_logical[0] = 3; addr_logical[1] = 5;

  addr2_logical = (unsigned int *) RISCOS_AccessPhysicalAddress(0, addr2_physical, NULL);
  *addr2_logical = 7;

  RISCOS_ReleasePhysicalAddress(old);


HAL entries
===========


void HAL_Start(int *riscos_header)