I've been experimenting with the implementation of fwrite, and built several different versions in order to compare them:

1) "for (i=0; i<nbytes; ++i) putc(*ptr++, stream);" (roughly). This is what current C libraries up to and including 5.26 do.

2) align write buffer using (1) or (4); OS_GBPB blocks_to_write * block_size bytes in one go; write remainder using (1) or (4)

3) as (2), but call OS_GBPB to write block_size bytes blocks_to_write times.

In addition, an alternative strategy for doing the buffer alignment:

4) if (space_in_write_buffer >= nbytes)
     memcpy the data into the write buffer and exit (note, you don't flush the buffer until the next write is done)
   else
     memcpy space_in_write_buffer bytes into the write buffer; then force the buffer to be flushed.

When writing the remainder in (2) or (3), the first condition in (4) will always be true, because the buffer has just been flushed and the remainder is smaller than one block.

The choice between methods (2) and (3) is controlled by the WRITE_ONCE macro in stdio.c. Methods (2) and (3) avoid the bytewise data copy from the caller's buffer into the stdio buffer. It's always going to be necessary to align the write buffer before attempting the block write, because there might be unflushed data in the write cache already, or this might be the first write, in which case the internal data structures are not yet set up for writing.

The test was to write an 8M file to ADFS, NFS, ATAFS, SCSIFS and RAMFS. Times are in seconds. The block size is the size parameter passed to fwrite in the test program. Times varied by up to around 2%. The machine was a 200MHz StrongARM RiscPC running our RISC OS 4 builds.
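For anyone who wants to see how (2), (3) and (4) fit together, here is a minimal sketch in portable C. It is not the actual stdio.c code. sketch_stream, fill_buffer, sketch_fwrite and backend_write (which merely stands in for OS_GBPB by appending to an in-memory array and counting calls) are all made-up names, BLOCK_SIZE is deliberately tiny, and the write_once argument only imitates at run time the choice that the WRITE_ONCE macro makes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 16  /* assumption: tiny block size, just for the sketch */

/* Hypothetical stand-in for the filing system (the role OS_GBPB plays). */
static unsigned char sink[4096];
static size_t sink_len = 0;
static int backend_calls = 0;

static void backend_write(const void *data, size_t n)
{
    assert(sink_len + n <= sizeof sink);
    memcpy(sink + sink_len, data, n);
    sink_len += n;
    backend_calls++;
}

typedef struct {
    unsigned char buf[BLOCK_SIZE];
    size_t used;               /* bytes currently held in buf */
} sketch_stream;

static void sketch_flush(sketch_stream *s)
{
    if (s->used > 0) {
        backend_write(s->buf, s->used);
        s->used = 0;
    }
}

/* Strategy (4): returns how many of the n bytes were consumed. */
static size_t fill_buffer(sketch_stream *s, const unsigned char *p, size_t n)
{
    size_t space = BLOCK_SIZE - s->used;
    if (space >= n) {
        memcpy(s->buf + s->used, p, n);   /* fits: no flush until a later write */
        s->used += n;
        return n;
    }
    memcpy(s->buf + s->used, p, space);   /* top the buffer up ... */
    s->used += space;
    sketch_flush(s);                      /* ... and force it out */
    return space;
}

/* Strategy (2) when write_once is non-zero, strategy (3) otherwise. */
static size_t sketch_fwrite(sketch_stream *s, const void *ptr,
                            size_t nbytes, int write_once)
{
    const unsigned char *p = ptr;
    size_t left = nbytes;

    /* Align: deal with pending data, or with writes shorter than a block. */
    if (s->used > 0 || left < BLOCK_SIZE) {
        size_t took = fill_buffer(s, p, left);
        p += took;
        left -= took;
    }

    /* Bulk part: whole blocks go straight from the caller's memory. */
    size_t blocks = left / BLOCK_SIZE;
    if (blocks > 0) {
        if (write_once) {
            backend_write(p, blocks * BLOCK_SIZE);
        } else {
            for (size_t i = 0; i < blocks; i++)
                backend_write(p + i * BLOCK_SIZE, BLOCK_SIZE);
        }
        p += blocks * BLOCK_SIZE;
        left -= blocks * BLOCK_SIZE;
    }

    /* Remainder: the buffer is empty here, so the first branch of (4) applies. */
    if (left > 0) {
        assert(s->used == 0);
        fill_buffer(s, p, left);
    }
    return nbytes;
}
```

In this sketch, writing 5 bytes and then 100 bytes results in one call to top up and flush the buffer, then either one 80-byte call (write_once set) or five 16-byte calls (write_once clear), with the last 9 bytes left in the buffer until the next flush.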
Strategy  Block size   ADFS     NFS   RAMFS   ATAFS

   1          5K       7.50   20.68    1.87    7.16
             32K       7.51   20.76    1.94    7.32
              8M       7.47   20.85    1.95    7.42

   2          5K       7.44   19.89    0.76    6.40
             32K       7.46   14.39    0.62    3.40
              8M       7.45   10.29    0.52    2.40

   3          5K       7.44   20.29    0.78    6.40
             32K       7.47   19.35    0.70    6.10
              8M       7.47   18.92    0.63    6.09

We tried SCSIFS on an A540, and the timings are of course completely different from my StrongARM RiscPCs - but the relative stats were that strategy 3 was twice as fast as strategy 1, and strategy 2 was twice as fast as strategy 3 for 5K blocks, but 9-10 times as fast for 32K and 8M blocks! We also tried SCSIFS on a StrongARM Risc PC and found that 32K or larger blocks in strategy (2) made a significant difference, but (3) didn't and neither did (2) at 5K.

Using or not using (4) didn't make any measurable difference. These are all word-aligned writes. Misaligning the stdio buffer by writing a few bytes before the large write made no measurable difference.

The variation in NFS is accounted for by the NFS module's ability to run multiple transactions in parallel when more than 8K of data is to be transferred in a single call through the filing system entry point. RAMFS benefits a great deal from removing the extra memory copy. The bulk transfer really helps SCSIFS - but that may be down to the architecture of an A540. None of the changes, alone or combined, seems to make any difference to ADFS (ADFSBuffers didn't affect the speed measurably).

The major downside is that you lose TaskWindow multi-tasking during writes of large blocks.

If applications used setvbuf to set up a 32K buffer, then they should benefit quite a bit - particularly on non-Risc PC IDE bus bound filing systems.

-- 
Stewart Brodie, Senior Software Engineer
Pace Micro Technology PLC
645 Newmarket Road
Cambridge, CB5 8PB, United Kingdom
WWW: http://www.pacemicro.com/