FAQ: Emulator accuracy
I/O emulation
68881 floating point coprocessor emulation
68020 emulation
68020/68881 Testing
Results of gcc testsuite testing
TME bugs
I/O emulation
Q: How good is the emulator's ethernet emulation?
A: Perfect!
Q: How good is the emulator's serial line emulation?
A: Great! I get a rare assertion during keyboard input after logging in over the
emulated serial line, but as I usually use ethernet (or the graphical console),
this is not a big deal.
Q: How good is the tape drive emulation?
A: Perfect!
Q: How do I write to the tape drive?
A: You can't.
Q: How do I get files out of the emulator without using ethernet?
A: Ethernet is cool. Use it! But if you insist, you can use a blank
(but labeled) emulated second drive, then just dd a tar file (in the OS running
in the emulator) over the entire second drive. Then, on the host system,
you can extract the files by running tar xvf on the host file that backs
the second drive. After that, the second emulated drive is no good until
you relabel it, because the tar data overwrote the disk label.
68881 floating point coprocessor emulation
The NetBSD and SunOS kernels are built without any hardware floating point instructions, but the NetBSD and SunOS userlands do call hardware floating point instructions.
Q: How good is the emulator's floating point coprocessor emulation?
A: Pretty darn good. There are a few differences in Sun3 emulation
on different host platforms as compared to a real Sun3, but they are very
minor. See Results of gcc testsuite testing for specifics.
Q: I'm scared of any floating point differences - to ensure ultimate
emulation perfection, can't I just turn off floating-point coprocessor emulation
in TME to avoid emulation problems?
A: Yes, you would have a perfectly emulated Sun3 without a floating point
coprocessor, such as one of the original sun3/50's
that came without one, but portions of the NetBSD and SunOS userlands expect
a floating point coprocessor, and so what you have isn't optimally useful.
With SunOS (since you can't rebuild the userland), you
can:
1) Use TME floating point emulation,
then the userland will run fine.
2) Disable TME 68881 emulation, then
the userland will NOT run fine (as the userland calls hardware floating point
instructions - and there is no 68881).
With NetBSD (since you can rebuild the userland, and which
does have limited 68881 emulation built into the kernel), you can:
1) Use TME floating point emulation,
then the userland will run fine.
2) Disable TME 68881 emulation, use
the NetBSD 68881 kernel emulation, then the userland will NOT run fine (as the
userland calls hardware floating point instructions - and there is no 68881 -
and the NetBSD kernel 68881 emulation performs less well than TME's 68881
emulation).
3) Disable TME 68881 emulation, and
build a custom userland with -MKSOFTFLOAT so that no hardware floating point
instructions will ever be generated by the userland, then the userland will NOT
run fine (although the userland will not call any hardware floating point
instructions, there is a NetBSD MKSOFTFLOAT bug
such that the resulting userland is less reliable than just using a regular
userland and relying on TME's 68881 emulation).
Q: Does TME emulate the 68881 FNEG of x to be (0.0 - x) or (-1.0 * x)?
A: Yes Virginia, there is an IEEE-754 positive and negative zero. Positive
zero equals negative zero, but they are stored differently in memory. TME
previously used the former method (0.0 - x), and it now uses the latter
method (-1.0 * x), thereby allowing some more GCC testsuite cases to
work identically in the emulator as on a real Sun3.
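The difference between the two methods only shows up when x is zero. Here is a minimal sketch in host C (not TME's code; the function names are made up for illustration) that assumes IEEE-754 doubles and the default round-to-nearest mode:

```c
#include <math.h>   /* signbit */

/* Former method: negate by subtracting from zero.
 * Under round-to-nearest, 0.0 - (+0.0) is +0.0, so the sign of a
 * zero operand is never flipped. */
double negate_by_subtraction(double x)
{
    volatile double zero = 0.0;   /* volatile discourages compile-time folding */
    return zero - x;
}

/* Latter method: negate by multiplying by -1.0.
 * The sign of a product is the XOR of the operand signs, so +0.0
 * becomes -0.0 as a real negate would produce. */
double negate_by_multiplication(double x)
{
    volatile double minus_one = -1.0;
    return minus_one * x;
}

/* Returns 1 when the sign bit is set, even for a zero. */
int is_negative(double x)
{
    return signbit(x) != 0;
}
```

For any nonzero x both functions agree. For x = +0.0 the two results still compare equal with ==, but only the multiplication result has its sign bit set, which is exactly the kind of difference a testsuite that inspects the stored bits will notice.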
Q: Is negative zero actually important?
A: YES, seriously. Check out how a badly handled negative zero caused improper streamline plots on pages 13 and 14 of How Java’s Floating-Point Hurts Everyone Everywhere: "A circulating component, necessary to generate lift, speeds the flow of an idealized fluid above the wing and slows it below. One streamline splits at the wing’s leading edge and recombines at the trailing edge. But when –0 is mishandled, as Fortran-style Complex arithmetic must mishandle it, that streamline goes only over the wing."
Q: Emulated negative zero in floating-point arithmetic is cool! How
about emulated negative zero in integer arithmetic?
A: Don't be silly. We're emulating a two's complement Sun3 here, not a one's
complement Univac.
68020 emulation
TME originally had the following m68k instruction emulation problems:
Now... there are no known m68k instruction/addressing emulation bugs! Woot!
68020/68881 Testing
In order to test a wide variety of instructions and addressing modes, I tested the following entire testsuites on a real sun3 and inside the emulator:
The goal is to find simple test cases for any TME emulation problem. Here's an example of how running a testsuite helped locate a problem:
NetBSD 3.0 and SunOS 4.1.1 ran fine under TME compiled with NetBSD 3.1 gcc-3.3.3. But if TME was compiled with NetBSD 3.1 gcc-4.1.1, then NetBSD 3.0 would not boot inside the emulator (fsck would fail - and the device manager would panic during boot). Fortunately, SunOS 4.1.1 would boot under TME compiled with gcc-4.1.1.
So I ran the entire gcc-3.2.3 testsuite w/ SunOS 4.1.1 under TME gcc-4.1.1 and compared the results against those generated under TME gcc-3.3.3. I found the following differences:
Following compile and execute fine under TME gcc3, but go into infinite loop
during compile (compile never finishes) under TME gcc4:
compile/920501-6.c
execute/ieee/980619-1.c
--
Following generate same .s files under TME gcc3 and TME gcc4, but generate
different a.out files, and TME gcc4 a.out file aborts running under TME gcc4.
20000605-1.c (-O3 option)
20010605-2.c
builtin-complex-1.c
conversion.c
gofast.c
--
Following generate same .s files and same a.out files under TME gcc3 and TME
gcc4, but a.out aborts when running under TME gcc4.
ashrdi-1.c
lshrdi-1.c
ieee/rbug.c
Note: executables were stripped so that comparisons could be made on a.outs
So the first two sets of differences were not entirely helpful - because they showed that the gcc-3.2.3 compiler running under SunOS 4.1.1 inside TME gcc4 ran incorrectly. But it would be difficult to determine WHY it was running incorrectly. It would be difficult to find the exact m68k instruction that was emulated incorrectly that caused gcc-3.2.3 to not work correctly.
But the last set of files, such as ashrdi-1.c, were much more useful.
In this case, the gcc-3.2.3 compiler and linker ran identically under TME gcc3
and under TME gcc4. The a.out files were identical, but the a.out file RAN
incorrectly under TME gcc4. If we could run NetBSD inside TME, this would
be easy to track down, as we could just step through the code in gdb, one
machine instruction at a time, running one under TME gcc3 and another under TME
gcc4 and see where the program behaved differently. But I don't have a gdb
that runs on SunOS 4.1.1 under TME. So I modified the assembly code
generated by: gcc -S ashrdi-1.c to add print statements of each register after
each instruction and finally found that the instruction:
roxrl #1,d1
performed incorrectly under TME gcc4. Stepping through the TME source
while running the test case under TME gcc3 and TME gcc4 showed that the problem
was in the tme_m68k_roxr32 routine, and that SHIFTMAX_INT32_T had a different
value under gcc4 - which caused it to incorrectly perform the emulated roxrl
#1,d1 instruction. After correcting the value, TME works fine under
gcc4. The detailed description of the problem is here,
tagged on the end of the patches Izumi submitted to allow gcc4 to compile TME.
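For readers who do not have the m68k manual at hand, here is a rough sketch of what "roxrl #1,d1" is supposed to do: rotate the register right through the X (extend) flag, so the value and X together behave like a 33-bit ring. This is NOT the code of tme_m68k_roxr32; the name roxr32_sketch is hypothetical, and only the X flag is modeled (the other condition codes are ignored).

```c
#include <stdint.h>

/* Hypothetical illustration of a 32-bit rotate-right-through-extend.
 * value   : the 32-bit operand (d1 in the example above)
 * count   : number of bit positions to rotate
 * x_flag  : in/out, the X flag (only bit 0 is used)
 * Shifting one bit per step keeps every shift well inside the
 * operand width, so the result does not depend on how the host
 * compiler treats wide shift counts or shift-limit constants. */
uint32_t roxr32_sketch(uint32_t value, unsigned count, unsigned *x_flag)
{
    unsigned x = *x_flag & 1u;

    count %= 33u;                       /* 32 data bits plus X form a 33-bit ring */
    while (count-- > 0) {
        unsigned out = value & 1u;      /* bit leaving at the bottom */
        value = (value >> 1) | ((uint32_t)x << 31);  /* old X enters at the top */
        x = out;                        /* departing bit becomes the new X */
    }
    *x_flag = x;
    return value;
}
```

With X clear, rotating 0x00000001 by one gives 0 and sets X; rotating again brings that bit back in at the top as 0x80000000. A per-register print after each instruction, as described above, makes a wrong result from this kind of instruction stand out immediately.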
Another thing that could have been done in this case was to modify TME to print out the program counter to a file whenever it changed. Run gcc-3.2.3 on one of the other groups of files - under TME gcc3 and TME gcc4. Then examine the PC sequence and determine where it diverged. Then "objdump -d" on the gcc-3.2.3 executable and see what it was trying to do when it diverged.
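The comparison step of that idea could look something like the sketch below. It assumes the two PC traces have already been read into arrays of 32-bit addresses; the function name and the array representation are assumptions for illustration, since TME itself has no such trace facility described here.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare two program counter traces, one recorded under each TME build.
 * Returns the index of the first entry where they differ.
 * If one trace is a strict prefix of the other, returns the length of
 * the shorter trace (the point where one run stopped early).
 * Returns -1 when both traces are identical. */
long first_divergence(const uint32_t *trace_a, size_t len_a,
                      const uint32_t *trace_b, size_t len_b)
{
    size_t shortest = len_a < len_b ? len_a : len_b;
    size_t i;

    for (i = 0; i < shortest; i++) {
        if (trace_a[i] != trace_b[i])
            return (long)i;
    }
    return (len_a == len_b) ? -1L : (long)shortest;
}
```

The entry just before the returned index is the last PC both runs agreed on, so the instruction at that address is the first one to look up in the "objdump -d" output.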
Another trick to keep in your TME emulation-debugging kit bag: when comparing NetBSD inside TME against NetBSD on a real sun3, run the same executable under gdb on both, print the stack and program counter after each machine instruction, and see where the two runs diverge. It is time consuming, but it helped find the "movel sp,-(sp)" problem. The PC where the execution diverged offered little help in determining the root cause, other than showing that something on the stack was different. I stepped through several hundred thousand machine instructions in gdb before noting that the stack contents differed after "movel sp,-(sp)".
p.s. Why do I test the full gcc-3.2.3 testsuite under SunOS 4.1.1 in the emulator instead of the gcc-3.3.3 testsuite under NetBSD 3.1 in the emulator? Because the entire testsuite runs by itself under SunOS 4.1.1 and I just need to come back after a few days and examine the results. NetBSD 3.1 has some bugs that cause the testsuite to fail due to running out of memory. Note that these are NetBSD 3.1 for Sun3 bugs - not TME bugs. So I run the testsuite under SunOS 4.1.1 inside TME first, then try the test cases under NetBSD 3.1 in TME second, hoping that they are duplicated and that I can use the integrated NetBSD gdb / objdump utilities to find the problem.
Results of gcc testsuite testing
Everything is great, other than a few floating point differences.
Sun3 emulation
| Host architecture running TME | SunOS 4.1.1 / gcc-3.2.3 testsuite running inside emulated Sun3 | NetBSD 3.1 / gcc-3.3.3 testsuite running inside emulated Sun3 |
| amd64 | pass | pass |
| amd64-32bit i386 load | pass | fail - note 1 |
| i386 | pass | fail - note 1 |
| ultrasparc10 | pass | pass - note 3 |
| ultrasparc10 32-bit sun4u load | pass | pass - note 3 |
| sparc20 | fail - note 2 | not tested |
fail note 1 - failures:
gcc.c-torture/execute/conversion.c -O0 -O1 -O2 -Os
Here is a reduced code fragment that produces the difference:
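#include <stdio.h>
#include <stdlib.h>   /* abort() */
/* Convert inside a function, so the conversion happens at run time. */
long double
ull2ld(unsigned long long int u)
{
  return u;
}
int
main(void)
{
  long double a, b;
  a = ull2ld(~0ULL);          /* run-time conversion */
  b = (long double) ~0ULL;    /* constant conversion the compiler may fold */
  if (a != b) abort();        /* aborts when the two conversions disagree */
  return 0;
}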
Per Matt: The ull2ld() program is a good example of how not-easy it is to emulate floating-point. This program checks that conversions of 64-bit unsigned integers to the IEEE754 80-bit format work. The 80-bit format has a 64-bit significand, so it should always work without loss of precision.
The IEEE754 compliance level setting in SUN3-CARRERA is "unknown". "unknown" is meant to give the greatest possible functionality, with potentially the least IEEE754 compliance. It gives *something* for almost all floating-point operations, but uses whatever built-in types the host has (which may not even be IEEE754 types) and makes no guarantees about precision or rounding direction. On most host systems, "unknown" actually isn't too bad. Most host systems have IEEE754 types, and i386/x86-64 systems even have the 80-bit format.
Why, then, does "unknown" break ull2ld() on an i386 host, which has the 80-bit format? The i386 FPU has a "precision control" setting in its control word, and it's usually set to round all results to 53-bits (for the IEEE754 64-bit format, built-in type "double"). When ull2ld() converts 0xffffffff.ffffffff to long double, it gets rounded to that plus one (which fits in a 53-bit significand).
As far as I can tell, NetBSD doesn't have any function that tme could use to set the precision control. I can't find anything on my old RedHat either, but it looks like FreeBSD has an fpsetprec(). I'm reluctant to add host-CPU-specific asm() to tme, especially at this late point and since "unknown" makes no guarantees about precision.
The other compliance levels have their own issues. "partial" is basically the same as "unknown" on an i386 host. "strict" makes ull2ld() work, but "strict" doesn't have any of the IEEE754 80-bit transcendental functions (softfloat doesn't provide them), and so "strict" will break anything that uses them.
That's the long story. I'm going to leave "unknown" in SUN3-CARRERA and just continue to remind people to be wary of floating-point.
It's very hard to do perfect floating-point emulation using built-in types, libm functions and no asm(), which is what tme strives to do.
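The "rounded to that plus one" effect can be reproduced on most hosts without touching the FPU control word, simply by forcing the value through a 53-bit significand. The sketch below assumes the host "double" is the IEEE754 64-bit format with round-to-nearest; the helper names and the TWO_TO_THE_64 macro are made up for illustration and are not part of tme.

```c
#include <float.h>   /* LDBL_MANT_DIG */

/* 2^64, which is exactly representable in a double. */
#define TWO_TO_THE_64 18446744073709551616.0

/* Convert a 64-bit unsigned integer through a 53-bit significand.
 * The volatile store forces the value to be rounded to double width
 * even on hosts that evaluate expressions in wider registers. */
double ull_to_double(unsigned long long u)
{
    volatile double d = (double)u;
    return d;
}

/* Returns 1 when the host long double has at least a 64-bit
 * significand, which is the minimum needed to hold every 64-bit
 * unsigned integer exactly. */
int long_double_holds_64_bits(void)
{
    return LDBL_MANT_DIG >= 64;
}
```

Here ull_to_double(~0ULL) cannot represent 2^64 - 1, so it comes back as exactly 2^64 (TWO_TO_THE_64). That is the same value ull2ld() ends up with when precision control limits the i386 FPU to 53 bits, while the constant conversion in main() keeps all 64 bits, hence the abort().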
fail note 2 - failures:
gcc.dg/Wparentheses-1.c
gcc.dg/Wreturn-type.c
gcc.dg/Wunknownprag.c
gcc.dg/array-2.c
gcc.dg/attr-nest.c
gcc.dg/c99-func-4.c
gcc.dg/c99-impl-int-2.c
gcc.dg/cast-qual-2.c
gcc.dg/va-arg-1.c
gcc.dg/vla-init-1.c
gcc.dg/weak-7.c
gcc.dg/cpp/gnuc89-pedantic.c
gcc.dg/cpp/if-1.c
gcc.dg/cpp/if-2.c
gcc.dg/cpp/if-mpar.c
gcc.dg/format/c90-scanf-3.c
gcc.dg/noncompile/20001228-1.c -O3 -fomit-frame-pointer -funroll-loops
gcc.dg/noncompile/20010425-1.c -O0
gcc.dg/noncompile/20010425-1.c -O1
gcc.dg/noncompile/20010425-1.c -O2
gcc.dg/noncompile/20010425-1.c -O3 -g
gcc.dg/noncompile/20010425-1.c -Os
gcc.dg/noncompile/20010524-1.c -O0
gcc.dg/noncompile/20011025-1.c -O1
gcc.dg/noncompile/20011025-1.c -O3 -fomit-frame-pointer
gcc.dg/noncompile/20011025-1.c -O3 -g
gcc.dg/noncompile/20020130-1.c -O2
gcc.dg/noncompile/20020130-1.c -O3 -fomit-frame-pointer
gcc.dg/noncompile/20020130-1.c -Os
gcc.dg/noncompile/20020207-1.c -O0
gcc.dg/noncompile/20020207-1.c -O1
gcc.dg/noncompile/20020207-1.c -O3 -fomit-frame-pointer -funroll-loops
gcc.dg/noncompile/init-3.c -O2
gcc.dg/noncompile/init-3.c -Os
gcc.dg/noncompile/invalid_asm.c -O0
gcc.dg/noncompile/invalid_asm.c -O1
gcc.dg/noncompile/invalid_asm.c -O2
gcc.dg/noncompile/invalid_asm.c -O3 -fomit-frame-pointer
gcc.dg/noncompile/invalid_asm.c -O3 -g
gcc.dg/noncompile/invalid_asm.c -Os
gcc.dg/noncompile/label-lineno-1.c -O0
gcc.dg/noncompile/label-lineno-1.c -O1
gcc.dg/noncompile/label-lineno-1.c -O2
gcc.dg/noncompile/label-lineno-1.c -Os
gcc.dg/noncompile/redecl-1.c -O0
gcc.dg/noncompile/redecl-1.c -O1
gcc.dg/noncompile/redecl-1.c -O3 -fomit-frame-pointer
gcc.dg/noncompile/redecl-1.c -O3 -g
gcc.dg/noncompile/redecl-1.c -Os
gcc.dg/noncompile/va-arg-1.c -O3 -fomit-frame-pointer
gcc.dg/noncompile/va-arg-1.c -Os
pass note 3: only a subset of tests was run: the floating-point tests that
caused problems in the past, plus all of execute/ieee
gcc.c-torture/execute/20000603-1.c
gcc.c-torture/execute/20000731-1.c
gcc.c-torture/execute/20020314-1.c
gcc.c-torture/execute/20020413-1.c
gcc.c-torture/execute/960405-1.c
gcc.c-torture/execute/960513-1.c
gcc.c-torture/execute/991019-1.c
gcc.c-torture/execute/builtin-complex-1.c
gcc.c-torture/execute/complex-1.c
gcc.c-torture/execute/complex-5.c
gcc.c-torture/execute/complex-6.c
gcc.c-torture/execute/conversion.c
gcc.c-torture/execute/gofast.c
and everything in gcc.c-torture/execute/ieee
Sparcstation emulation
| Host architecture running TME | NetBSD 3.1 / gcc-3.3.3 testsuite running inside emulated Sparc |
| i386 | fail - note 1, 2 |
| ultrasparc10 | fail - note 1, 2 |
| ultrasparc10 32-bit sun4u load | fail - note 1, 2 |
| sparc20 | |
fail note 1 - failures:
gcc.c-torture/execute/960405-1.c (all compile options)
gcc.c-torture/execute/ieee/hugeval.c (all compile options)
gcc.c-torture/execute/ieee/inf-1.c -O0 -O1
fail note 2: all tests were run other than the following two, which took
forever to run and finally locked up the real and emulated sparcstation:
gcc.c-torture/compile/20001226-1.c
gcc.dg/c99-intconst-1.c
TME bugs
1) NetBSD 3.0 hangs inside the emulator after running for two days
2) The host system must be rebooted between runs of the gcc-3.3.3 testsuite
under NetBSD 3.0 inside the emulator
3) Weird login prompt via serial line: "tíeóèóõî loçiî:"
instead of "tmeshsun login:"
4) I sometimes (rarely) get an assertion on line 946, threads-sjlj.c, when
typing in via the emulated serial line. The same assertion appears when
starting tmesh with - (a blank device) as the serial device
5) Stop-A doesn't work on some PCs (NetBSD ticket 34903)
6) No keyboard entry at the PROM monitor after shutdown with SunOS (NetBSD
ticket 35306)