ReadMe.txt 10.6 KB
Newer Older
Jeffrey Lee's avatar
Jeffrey Lee committed
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159
Zero page relocation & ZeroPain
===============================

This build of RISC OS makes use of "zero page relocation", a change to the
standard memory map which moves the processor vectors and the kernel's "zero
page" workspace away from address zero and up to the high end of the memory
map. This greatly increases the system's resilience to a common type of
software bug known as a "null pointer dereference". However, there is a catch:
because the OS has always had memory mapped to address zero, large amounts of
RISC OS software contains cases of "harmless" null pointer dereferences where
the code reads from page zero and then does something insignificant with the
result (usually ignoring it completely). With the new memory map, these
programs will most likely cease to operate, exiting with a data abort as soon
as they try to access page zero.

Since relocating zero page is very beneficial to the stability and security of
the OS, the goal is to have the feature enabled for all future stable releases,
starting with RISC OS 5.24. But due to the large amount of buggy software out
there we can't simply turn it on and be done with it - we need to have a
transition period in which developers can fix their code without worrying about
the fact their compilers, text editors, etc. are buggy too. We also need a way
for regular users to get involved with the testing process.

This is where ZeroPain comes in. ZeroPain traps most attempts to read page zero
and emulates the operation, providing a safe level of compatibility with the
old memory map. And for any page zero access which it emulates, it adds an
entry to a log file so that the user/developer is aware of the issue.

Note that ZeroPain does not trap writes to page zero, nor any attempts to
execute code. Some reads operations (such as unaligned loads) are also not
dealt with. In these cases a data or prefetch abort will be raised as normal,
and the program will most likely terminate with an error.


Installation
============

The recommended method of installation is to add the ZeroPain module to your
Choices.Boot.PreDesk folder. This can be performed by using the boot merge
utility in Configure to merge in the !Boot structure that is provided here.

To ensure that the log file is clearly visible, the default log location is the
file 'ZeroPain' in the root of your boot drive. However if desired you can
specify a different location for the file by editing the file !Boot.Choices.
Boot.PreDesk.!!ZeroPain.!Run.

Note that there is no need to uninstall ZeroPain before running a ROM which
does not use zero page relocation. The module will refuse to start on such a
machine, and the provided loader in the skeleton !Boot will swallow the error
so that it does not affect the startup of your machine.

Also note that the module contains a built-in kill switch - it will only run on
ROMs built in 2015. This ensures that, as the RISC OS 5.24 release draws near,
we have a period of testing where the module is inactive. This will help to
make sure that as many bugs as possible are caught prior to the release. If
necessary, the testing period may be extended and a new version of ZeroPain
released to cover it.


Use
===

There is little to know about using the module. As long as the module is
running, it will be monitoring the data abort vector and watching for attempts
to read from page zero. If an attempt is caught and emulated, an entry will be
added to the log file. If Reporter is running the module will also send a
message to Reporter's log.

Remember to review the contents of the log file at regular intervals, to see if
there are any issues which haven't already been reported to the relevant
application developer. The information below can help you to try and work out
which application caused a particular access.

ZeroPain will pause logging once the log file exceeds 1MB in size. Deleting the
log will allow it to resume.


Understanding log entries
=========================

Log entries produced by ZeroPain look like the following:

  Time: Sun Jun 28 19:23:18 2015
  Location: Application space
  Current Wimp task: Unknown
  Last app to start: amu -E standalone
  
  R0  = 00013e20 R1  = 00013e28 R2  = 00000000 R3  = 0001818c
  R4  = 00013e20 R5  = 0001e0e6 R6  = 0003ef60 R7  = 00000000
  R8  = 0003f02c R9  = 0003efbc R10 = 0001540c R11 = 000160bc
  R12 = 000160c0 R13 = 000160b0 R14 = 0000d310 R15 = 0000cec0
  DFAR = 00000008  Mode USR32  Flags nzCv if   PSR = 20000010
  
  0000ce78 : e51f183c : LDR     R1,&0000C644
  0000ce7c : e590200c : LDR     R2,[R0,#12]
  0000ce80 : e5b13008 : LDR     R3,[R1,#8]!
  0000ce84 : e3520000 : CMP     R2,#0
  0000ce88 : 1a000007 : BNE     &0000CEAC
  0000ce8c : e5932008 : LDR     R2,[R3,#8]
  0000ce90 : e580200c : STR     R2,[R0,#12]
  0000ce94 : e5922008 : LDR     R2,[R2,#8]
  0000ce98 : e5a32018 : STR     R2,[R3,#24]!
  0000ce9c : e5b0200c : LDR     R2,[R0,#12]!
  0000cea0 : e3520000 : CMP     R2,#0
  0000cea4 : 191ba800 : LDMNEDB R11,{R11,R13,PC}
  0000cea8 : ea000008 : B       &0000CED0
  0000ceac : e5922000 : LDR     R2,[R2,#0]
  0000ceb0 : e5a32008 : STR     R2,[R3,#8]!
  0000ceb4 : e580200c : STR     R2,[R0,#12]
  0000ceb8 * e5b21008 * LDR     R1,[R2,#8]!
  0000cebc : e5902008 : LDR     R2,[R0,#8]
  0000cec0 : e5a21018 : STR     R1,[R2,#24]!
  0000cec4 : e590000c : LDR     R0,[R0,#12]
  0000cec8 : e3500000 : CMP     R0,#0
  0000cecc : 191ba800 : LDMNEDB R11,{R11,R13,PC}
  0000ced0 : ebffffcf : BL      &0000CE14
  0000ced4 : eaffffe1 : B       &0000CE60
  0000ced8 : e1a0c00d : MOV     R12,R13
  0000cedc : e92dd800 : STMDB   R13!,{R11,R12,R14,PC}
  0000cee0 : e24cb004 : SUB     R11,R12,#4
  0000cee4 : e15d000a : CMP     R13,R10
  0000cee8 : 4b001989 : BLMI    &00013514
  0000ceec : e51f18b0 : LDR     R1,&0000C644
  0000cef0 : e3500001 : CMP     R0,#1
  0000cef4 : 03a00004 : MOVEQ   R0,#4

The timestamp is for your convenience, to help you identify how old the log
entry is.

'Location' indicates the location of the code performing the access. Usually
this will be 'Application space' or the name of a module (plus the address
offset into the module).

If 'Application space' was listed as the location then the page zero access is
almost certainly occurring due to a bug in the current application. 'Current
Wimp task' and 'Last app to start' can help to identify which application this
is. Note however that sometimes this information can be misleading - there is
no easy way for ZeroPain to detect when an application exits, and so 'Last app
to start' may in fact refer to an app which has already quit. Similarly,
'Current Wimp task' may be listed as 'Unknown' if an application hasn't yet
registered itself with the Wimp, or if it exits soon after performing the page
zero access.

If the location is listed as being inside a module, then it's possibly a bug in
the module, or it's possibly a bug in the code that called the module. For
example, in order to provide the best performance, many of the functions in the
Shared C Library don't perform error checking on their arguments. For these
functions it is the caller's responsibility to ensure that the arguments are
correct.

The register dump and disassembly is similar to that which can be produced by
the Debugger module, with a couple of key differences:

* The register dump includes the DFAR (data fault address register), indicating
  the address for which the access was attempted
* The highlighted line in the disassembly is the instruction that caused the
  access - in comparison to the output from the Debugger module which (for data
  aborts) will be 2 instructions ahead of the fault

Robert Sprowson's avatar
Robert Sprowson committed
160
In some cases there will be fewer lines of disassembly than shown above. This
Jeffrey Lee's avatar
Jeffrey Lee committed
161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182
will happen if the instruction which caused the abort is close to a page
boundary (the abort handler errs on the side of caution and avoids reading
across page boundaries). However there should still be enough information to
allow the problem code to be identified.


Zero page workspace emulated by ZeroPain
========================================

When ZeroPain emulates a read from zero page, it will return the value 0 for
most locations. However, special handling is performed for the following
addresses, which correspond to the old locations of some kernel workspace
values. If your code is accessing these locations, please consider following
the given instructions for alternative ways of getting at the same value. As a
last resort, you can modify your code to use OS_ReadSysInfo 6 to read the new
location of the value. However, directly accessing kernel workspace (even via
an address obtained through OS_ReadSysInfo 6) is strongly discouraged, as it
may limit our ability to make changes to the OS in future.

&104 - 'ESC_Status'. Software should use OS_ReadEscapeState instead.
&108 - 'IRQsema'. Used by some software to detect if it's running in the
       'foreground' or not.
Robert Sprowson's avatar
Robert Sprowson committed
183 184
&10C - 'MetroGnome'. This is the centisecond counter. Use OS_ReadMonotonicTime
       instead.
Jeffrey Lee's avatar
Jeffrey Lee committed
185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220
&FE8 - 'CLibCounter'. This is used by old versions of the shared C library, and
       the statically linked version (ansilib). Update to a newer version,
       and/or switch to using the shared C library.
&FEC - 'RISCOSLibWord'. As above.
&FF0 - 'CLibWord'. As above.
&FF8 - 'DomainId'. May be used by some software to identify the current Wimp
       task. Use Wimp_ReadSysInfo instead.


High processor vectors
======================

Modern ARM CPUs support a feature known as 'high processor vectors', whereby
the processor vectors are moved from their old location at &0 to a new location
at &FFFF0000. Zero page relocation within RISC OS makes use of this feature -
without it we would still need some memory mapped to address zero in order to
contain the processor vectors. Although high processor vectors is currently a
requirement for zero page relocation, this is likely to change in the future,
in order to support zero page relocation on the IOMD (RiscPC/A7000/RPCEmu)
build of RISC OS.

OS_PlatformFeatures 0 can be used to determine whether high processor vectors
are in use (flag returned in bit 20). Software which needs to interact with the
processor vectors directly should use this to determine their location.

https://www.riscosopen.org/wiki/documentation/show/OS_PlatformFeatures%200

The address of the processor vectors should not be used to infer the address of
the zero page workspace, and vice-versa.

Apart from the change in location, the processor vectors continue to operate as
normal - so for FIQ handlers, you can still rely on having 228 bytes of space
available for your handler code (from &FFFF001C to &FFFF00FF inclusive). Also,
for future compatibility, code should not assume that the processor vectors are
readable from user mode - regardless of whether high processor vectors are in
use or not.