GENERAL
                                  =======

RISC OS will either be running 26-bit, or 32-bit, with no in-between state.

When running 32-bit, it will make no use of 26-bit modes, and will call all
routines in a 32-bit mode. There will be no support for 26-bit code, as this
would entail run-time selection of whether to call entry points in 26 or 32
bit mode, an added complication.

When running 26-bit, it will make no more use of 32-bit modes than it does
currently, and the same restrictions on 32-bit code will apply as now.

Desktop systems will probably always be 26-bit. NCs may be 32-bit, as they
have little or no requirement to run external software, but could just as well
be 26-bit. Anything running on an ARM 9/10 etc will have to be 32-bit.

Standard (non-Kernel) software should be modified to work whether it is
called in a 26-bit or a 32-bit mode. For pure C applications and modules,
this can just be a recompile; the compiled code will then run on both
existing 26-bit systems, back to RISC OS 3.1, and 32-bit systems. A new
Shared C Library will be required to support 32-bit programs on old systems.

Assembler modules should also run on either 26 or 32-bit systems; however to
achieve this use of MSR and MRS instructions will often be required to
manipulate the PSR - if this cannot be avoided the module will become RISC OS
3.1 incompatible. In our build system we will hide most of the differences
inside macros - use macros like SETPSR instead of using TEQP directly, and
26/32-bit forms will be selected on compile time by build switches. You will
either have a module that uses TEQP that will work on ARM2-ARM8, or a module
that uses MSR which will work on ARM6 onwards.

This way we will continue to be able to test software on our desktop systems -
it can be the same 32-bit binary - it will just run in a 26-bit mode.

RISC OS is initially largely unmodified - almost all binary APIs will act the same
in a 32-bit system as they do now, except that they will be called in a 32-bit
version of the documented processor mode.

One side-effect of this is that R14 on entry to routines will contain just
the return address, with no flags. Hence to preserve flags, the CPSR must be
stacked on entry, and restored on exit. This is cumbersome, but can be hidden
inside EntryS and EXITS macros. Note that this behaviour is then slightly
different - you are preserving flags across the call, not restoring the flags
passed in in R14. Most of the time this doesn't matter, as the API is
documented in terms of preserving flags. There are exceptions to this rule,
notably SWI handling.

On a 32-bit system, SWIs are no longer expected to preserve the N Z and C
flags. They may still set/clear them to return results. 32-bit code should
not assume that SWIs preserve flags. Requiring flag preservation would impose
an unacceptable burden on SWI dispatch. This effectively respecifies *all*
SWIs by changing the default rule in PRM 1-29. Also, it becomes impossible
for SWIs outside the Kernel to depend on the NZCV flags on entry. SWIs inside
the Kernel, such as OS_CallAVector, can still manipulate flags freely. This
should not be an onerous restriction; it is impossible to specify entry flags
for SWIs in C or BASIC, for example.

Many existing APIs do not actually require flag preservation, such as service
call entries. In this case, simply changing MOVS PC... to MOV PC... and LDM
{}^ to LDM {} is sufficient to achieve 32-bit compatibility.


                               NASTY HACKY BITS
                               ================

To just set and clear NZCV flags you can use the SETV etc macros, which will
do the right thing (TM) for the different processor types. To actually
preserve flags, you will probably be forced to use MRS and MSR instructions.
These are NOPs on pre-ARM 6 ARMs, so you may be able to do clever stuff to
keep ARM2 compatibility.

For example, the recommended general-purpose code to check whether you're in
a 26-bit mode is:

        MOV     R0, #0          ; for ARM 2+3 compatibility only
        MRS     R0, CPSR        ; NOP on 26-bit only ARMs
        TST     R0, #2_11100    ; EQ if in a 26-bit mode, NE if 32-bit

If you know you are in a privileged mode, then you can use:

        TEQ     PC, PC          ; EQ if in a 32-bit mode, NE if 26-bit

Sometimes you may be forced to play with the SPSR registers. Beware: interrupt
code will corrupt SPSR_svc if it calls a SWI. Existing interrupt handlers know
to preserve R14_svc before calling a SWI, but not SPSR_svc. Hence you must
disable interrupts around SPSR manipulations.

                                   MODULE ENTRIES
                                   ==============

Most module entries are treated the same in the 32-bit world, except they
will be entered in a 32-bit mode, and hence R14 will be a return address
with no flags.

Init,Final
----------
Unchanged. Flag preservation not required - only V on exit is looked at.

Service
-------
Unchanged. Flags on exit ignored.

Help/command
------------
Unchanged. Flag preservation not required - only V on exit is looked at.

SWI decoding code
-----------------
Unchanged. Flag preservation not required - V on exit is looked at in number
to text case.

SWI handler
-----------
On 26 bit systems, R14 is a return address (inside the Kernel) with the
user's NZCIF flags in it, V clear, mode set to SVC. The CPSR NZCV
flags on exit are then passed back to the SWI caller. Hence MOVS PC,R14
preserves the SWI caller's NZC flags and clears V. The NZ and C flags in the
current PSR on entry are undefined, and are NOT the caller's (but V is
clear). Thus you can simply read, modify and preserve the caller's flags.

On 32 bit systems, R14 is a return address only. There is no way of
determining the caller's flags, so you are not expected to preserve them. The
NZC and V flags you exit with will be returned to the caller.

If writing a new module, simply specify that all your SWIs corrupt flags,
then your SWI dispatchers can return with MOV PC,R14, regardless of whether
running on a 26 or 32 bit system.

If converting an existing module to run on 32-bit, it is highly recommended that
the same binary continue to work on 26-bit systems. You should therefore take
steps to preserve flags when running in a 26-bit mode, if the module did
before. When running on a 32-bit system, you needn't preserve flags. The
following wrapper around the original SWI entry (converted to be 32-bit safe)
achieves this, assuming you always want NZ preserved on a 26-bit system.

        Push    R14
        BL      Original_SWI_Code
        Pull    R14
        TEQ     PC,PC                   ; are we in a 32-bit mode?
        MOVEQ   PC,R14                  ; 32-bit exit: NZ corrupted, CV passed back
      [ PassBackC
        BICCC   R14,R14,#C_bit          ; Extra guff to pass back C as well
        ORRCS   R14,R14,#C_bit
      ]
        MOVVCS  PC,R14                  ; 26-bit exit: NZC preserved, V clear
        ORRVSS  PC,R14,#V_bit           ; 26-bit exit: NZC preserved, V set

Yes, this is cumbersome, but it can be removed when backwards compatibility is
no longer desired. The alternative, which would be to pass in caller flags in
R14, would impose a permanent carbuncle on the 32-bit API.


Module flags
------------
This is a new module header entry at &30. It is an offset to the module
flags word(s). The first module flag word looks like:

          Bit 0       Module is 32-bit compatible
          Bits 1-31   Reserved (0)

Non 32-bit compatible modules will not be loaded by a 32-bit RISC OS.
If no flags word is present, the module is assumed to be 26-bit compatible.


        
                              ENVIRONMENT HANDLERS
                              ====================

Undefined instruction handler
-----------------------------
32 bit system: Now called in UND32 mode. No preveneer.
26 bit: as before

Prefetch abort, data abort
--------------------------
32 bit system: Now called in ABT32 mode. No preveneer.
26 bit: as before

Error
-----
32 bit system: USR32 mode. PC contains no PSR flags.
26 bit: as before - PC contains PSR flags, but may not be reliable.

BreakPoint
----------
32 bit system: register block must be 17 words long.
               contains R0-R15,CPSR.
               entered in SVC32 mode
26 bit: as before

Handlers can check format by looking at mode on entry to handler.

The following code is suitable to restore the user registers and return
in the 32-bit case:

        ADR     R14, saveblock          ; get address of saved registers
        LDR     R0, [R14, #16*4]        ; get user PSR
        MRS     R1, CPSR                ; get current PSR
        ORR     R1, R1, #&80            ; disable interrupts to prevent
        MSR     CPSR_c, R1              ; SPSR_SVC corruption by IRQ code (yuck)
        MSR     SPSR_cxsf, R0           ; put it into SPSR_SVC
        LDMIA   R14, {R0-R14}^          ; load user registers
        MOV     R0, R0                  ; no-op after forcing user mode
        LDR     R14, [R14, #15*4]       ; load user PC into R14_SVC
        MOVS    PC, R14                 ; return to correct address and mode

Escape
------
32 bit system: as before, but called in SVC32

Event
----
32 bit system: as before, but in IRQ32 or SVC32

Exit
----
32 bit system: as before, but in USR32

Unused SWI
----------
26 bit system: called in SVC26 mode.
               R14 = a return address in the Kernel, with NZCVIF flags the same
                     as the caller's (except V clear).
               PSR flags undefined (except I+F as caller)
32 bit system: called in SVC32 mode.
               R14 = return address in the Kernel
               No way to determine caller condition flags
               PSR flags undefined (except I+F as caller)

UpCall
------
32 bit system: as before, but SVC32 mode

CallBack
--------
32 bit system: register block must be 17 words long.
               contains R0-R15,CPSR.
               entered in SVC32 mode, IRQs disabled
26 bit: as before

Handlers can check format by looking at mode on entry to handler.

The following code is suitable to restore the user registers and return
in the 32-bit case:

        ADR     R14, saveblock          ; get address of saved registers
        LDR     R0, [R14, #16*4]        ; get user PSR
        MSR     SPSR_cxsf, R0           ; put it into SPSR_SVC/IRQ
        LDMIA   R14, {R0-R14}^          ; load user registers
        MOV     R0, R0                  ; no-op after forcing user mode
        LDR     R14, [R14, #15*4]       ; load user PC into R14_SVC/IRQ
        MOVS    PC, R14                 ; return to correct address and mode
               
Exception registers
-------------------
32 bit system: block must be 17 words long.
               will contain R0-R15,PSR
Exception handlers can determine block format by looking at mode on entry
to handler.


                     SOFTWARE VECTORS
                     ================

Software vectors have a number of different properties. They can be called
under a variety of conditions, and the flags they exit with may or may not
be significant.

When called using OS_CallAVector, the caller's NZCV flags always used to be
passed in in R14, and the claimant's flags on exit would be passed back.

In a 32-bit system, the caller's flags are not passed in R14. Their C and V
flags are visible in the PSR though, just as in a 26-bit system. N and Z are
not visible. Again, exit flags are passed back.

Most vectors are not intended to be called with OS_CallAVector, and their
exit flags have never had significance, for example KeyV, EventV and TickerV.

Others are vectored SWIs, such as ByteV and ReadLineV. These pass back
C and V flags only.

A few vectors, like RemV, attach significance to entry flags. If not claiming,
you mustn't change those flags for the next callee. In 26-bit mode this might
have been achieved by:

        CMP     R1,#mybuffer
        MOVNES  PC,LR

In the 32-bit world, you could change the CMP to a TEQ to preserve C and V, or
you could use something like:

        Push    R14
        MRS     R14, CPSR
        CMP     R1, #maxbuffers
        BLS     handleit
        MSR     CPSR_f, R14
        Pull    PC
handleit
        ...


                     INSIDE THE KERNEL
                     =================

The 32-bit Kernel will stay in 32-bit modes, and pass out to apps/modules in
32-bit modes. Hardware vectors will save R14+SPSR separately to allow return
to any mode/address.

The 26-bit Kernel will switch into 26-bit modes fairly early, but
still save R14+SPSR separately to simplify the state of assembly switches.
It will switch back to 32-bit mode to execute the return, and will munge fake
26-bit R14s when about to call environment handlers.

When the Kernel dispatches an internal SWI, the stack currently looks like:

       SVCSTK-4    Caller's R12
             -8    Caller's R11
             -12   Caller's R10
             -16   SWI number

R11 is the SWI number with the X bit clear, R14 is the return address with
caller's flags.

This will be changed to look like:

       SVCSTK-4    Caller's R12
             -8    Caller's R11
             -12   Caller's R10
             -16   Return R14 (32-bit - no flags)
             -20   SWI number

R11 will still be the SWI number with the X bit clear, but R14 actually will
be the caller's PSR. This minimises changes to the Kernel, as most routines
only actually interpret R14 as holding the flags. For example, they might
exit with:

        ORRVS   lr, lr, #V_bit
        B       SLVK

Such code will not need altering between 26 and 32-bit versions.

A similar change will be made to the structure of the stack during ErrorV:

       SVCSTK-4    Caller's R12
             -8    Caller's R11
             -12   Caller's R10
             -16   Return R14 (26-bit - including flags)

will become:

       SVCSTK-4    Caller's R12
             -8    Caller's R11
             -12   Caller's R10
             -16   Return R14 (32-bit - no flags)
             -20   Return PSR (32-bit)


                               MEMORY MAP
                               ==========

The "Shadow ROM" at FF800000-FFFFFFFF has been removed. This was originally
intended to allow hardware vectors to branch "down" to the ROM, but this scheme
was replaced by LDR PC,xxx hardware vectors. The only place that used it was
the Kernel's start-up "Ctrl-or-R" keyboard interrupt handler. This has been
modified, as it is technically unpredictable to branch around the ends of the
address space.

With abort handlers now being called in Abort mode, an Abort stack is required.
This has been positioned at 02002000, in the area currently "reserved for fake
screen" (VIDC1 compatibility). The stacks are now:

               Limit      Base
          IRQ: 01F00000   01F01xxx
          SVC: 01C00000   01C02000
          UND: 01E00000   01E02000
          ABT: 02000000   02002000

All are defined to be on a 1MB boundary. The area beyond the limit is usually
unmapped, to cause an abort.

Their actual locations and sizes are subject to change. In particular, they
may not be in the lower 64MB in future.


                           MISCELLANEOUS SWIS
                           ==================

OS_EnterOS
----------
If called in a 26-bit mode, takes you into SVC26, else into SVC32.