FAQ: Emulator accuracy

I/O emulation
68881 floating point coprocessor emulation
68020 emulation
68020/68881 Testing
Results of gcc testsuite testing
TME bugs

 

I/O emulation

Q: How good is the emulator's ethernet emulation?
A: Perfect!

Q: How good is the emulator's serial line emulation?
A: Great! I get a rare assertion during keyboard input after logging in over the emulated serial line, but as I usually use ethernet (or the graphical console), this is not a big deal.

Q: How good is the tape drive emulation?
A: Perfect!

Q: How do I write to the tape drive?
A: You can't.

Q: How do I get files out of the emulator without using ethernet?
A: Ethernet is cool.  Use it!  But if you insist, you can use a blank (but labeled) emulated second drive: in the OS running in the emulator, dd a tar file over the entire second drive.  Then on the host system, extract the files by running tar xvf on the file that backs the second emulated drive.  After that, the second emulated drive is no good until you relabel it.

 

68881 floating point coprocessor emulation

The NetBSD and SunOS kernels are built without any hardware floating point instructions, but the NetBSD and SunOS userlands do call hardware floating point instructions.

Q: How good is the emulator's floating point coprocessor emulation?
A: Pretty darn good.  Sun3 emulation on different host platforms shows a few differences from a real sun3, but they are very minor.  See Results of gcc testsuite testing for specifics.

Q: I'm scared of any floating point differences - to ensure ultimate emulation perfection, can't I just turn off floating-point coprocessor emulation in TME to avoid emulation problems?
A: Yes, you would have a perfectly emulated Sun3 without a floating point coprocessor, such as one of the original sun3/50's that came without one, but portions of the NetBSD and SunOS userlands expect a floating point coprocessor, and so what you have isn't optimally useful.

    With SunOS (since you can't rebuild the userland), you can:
        1) Use TME floating point emulation, then the userland will run fine.
        2) Disable TME 68881 emulation, then the userland will NOT run fine (as the userland calls hardware floating point instructions - and there is no 68881).

    With NetBSD (since you can rebuild the userland, and since the kernel has limited 68881 emulation built in), you can:
        1) Use TME floating point emulation, then the userland will run fine.
        2) Disable TME 68881 emulation, use the NetBSD 68881 kernel emulation, then the userland will NOT run fine (as the userland calls hardware floating point instructions - and there is no 68881 - and the NetBSD kernel 68881 emulation performs less well than TME's 68881 emulation).
        3) Disable TME 68881 emulation, and build a custom userland with -MKSOFTFLOAT so that no hardware floating point instructions will ever be generated by the userland, then the userland will NOT run fine (although the userland will not call any hardware floating point instructions, there is a NetBSD MKSOFTFLOAT bug such that the resulting userland is less reliable than just using a regular userland and relying on TME's 68881 emulation).

Q: Does TME emulate the 68881 FNEG of x to be (0.0 - x) or (-1.0 * x)?
A: Yes Virginia, there is an IEEE-754 positive and negative zero.  Positive zero equals negative zero, but they are stored differently in memory.  TME previously used the former method (0.0 - x), and it now uses the latter method (-1.0 * x), thereby allowing some more GCC testsuite cases to work identically in the emulator as on a real Sun3.
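To see why the choice matters, here is a minimal sketch in plain C doubles.  It is not TME's code: the function names are made up for illustration, and it assumes an IEEE-754 host using the default round-to-nearest mode.  The two formulas agree everywhere except on the sign of the result for a positive zero input.

```c
#include <assert.h>
#include <math.h>   /* signbit() */

/* Hypothetical helper: negate by subtracting from zero.
   0.0 - (+0.0) is +0.0, so a positive zero keeps its sign. */
double neg_by_subtract(double x)
{
    volatile double zero = 0.0;       /* volatile: force a run-time operation */
    return zero - x;
}

/* Hypothetical helper: negate by multiplying by -1.0.
   -1.0 * (+0.0) is -0.0, so the sign bit flips as a negate should. */
double neg_by_multiply(double x)
{
    volatile double minus_one = -1.0;
    return minus_one * x;
}

/* Self-check of the difference between the two formulas. */
void demo_fneg_zero(void)
{
    /* For ordinary values the two methods give the same answer. */
    assert(neg_by_subtract(1.5) == neg_by_multiply(1.5));

    /* For +0.0 the results still compare equal ... */
    assert(neg_by_subtract(0.0) == neg_by_multiply(0.0));

    /* ... but only the multiply form produces a negative zero. */
    assert(!signbit(neg_by_subtract(0.0)));
    assert(signbit(neg_by_multiply(0.0)));
}
```

Calling demo_fneg_zero() from a test program should pass silently on such a host.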

Q: Is negative zero actually important?
A: YES, seriously.  Check out how a badly handled negative zero caused improper streamline plots on pages 13 and 14 of How Java’s Floating-Point Hurts Everyone Everywhere.  "A circulating component, necessary to generate lift, speeds the flow of an idealized fluid above the wing and slows it below. One streamline splits at the wing’s leading edge and recombines at the trailing edge. But when –0 is mishandled, as Fortran-style Complex arithmetic must mishandle it, that streamline goes only over the wing."
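A much smaller illustration than the wing example, sketched in C under the assumption of an IEEE-754 host (the function names are hypothetical): the two zeros compare equal, yet a program can still tell them apart, because dividing by them gives infinities of opposite sign.

```c
#include <assert.h>
#include <math.h>   /* signbit() */

/* Returns 1 when a and b compare equal yet 1/a and 1/b differ.
   On an IEEE-754 host that happens for +0.0 versus -0.0. */
int equal_but_distinguishable(double a, double b)
{
    volatile double one = 1.0;        /* volatile: force a run-time divide */
    return a == b && (one / a) != (one / b);
}

/* Self-check using one zero of each sign. */
void demo_signed_zero(void)
{
    volatile double pz = 0.0, nz = -0.0;

    assert(pz == nz);                          /* the zeros compare equal */
    assert(signbit(nz) && !signbit(pz));       /* but the sign bits differ */
    assert(equal_but_distinguishable(pz, nz)); /* 1/+0 is +inf, 1/-0 is -inf */
}
```

Any computation that later divides by, or takes a branch cut through, such a zero inherits the difference, which is how a lost sign can end up changing a plot.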

Q: Emulated negative zero in floating-point arithmetic is cool!  How about emulated negative zero in integer arithmetic?
A: Don't be silly.  We're emulating a two's complement Sun3 here, not a one's complement Univac.

 

68020 emulation

TME originally had the following m68k instruction emulation problems:

Now... there are no known m68k instruction/addressing emulation bugs!  Woot!

 

68020/68881 testing

In order to test a wide variety of instructions and addressing modes, I tested the following entire testsuites on a real sun3 and inside the emulator:

The goal is to find simple test cases for any TME emulation problem.  Here's an example of how running a testsuite helped locate a problem:

NetBSD 3.0 and SunOS 4.1.1 ran fine under TME compiled with NetBSD 3.1 gcc-3.3.3.  But if TME was compiled with NetBSD 3.1 gcc-4.1.1, then NetBSD 3.0 would not boot inside the emulator (fsck would fail - and the device manager would panic during boot).  Fortunately, SunOS 4.1.1 would boot under TME compiled with gcc-4.1.1.

So I ran the entire gcc-3.2.3 testsuite w/ SunOS 4.1.1 under TME gcc-4.1.1 and compared the results against those generated under TME gcc-3.3.3.  I found the following differences:

The following compile and execute fine under TME gcc3, but go into an infinite loop during compile (the compile never finishes) under TME gcc4:

compile/920501-6.c
execute/ieee/980619-1.c

--

The following generate the same .s files under TME gcc3 and TME gcc4, but generate different a.out files, and the TME gcc4 a.out file aborts when run under TME gcc4:

20000605-1.c (-O3 option)
20010605-2.c
builtin-complex-1.c
conversion.c
gofast.c

--

The following generate the same .s files and the same a.out files under TME gcc3 and TME gcc4, but the a.out aborts when run under TME gcc4:

ashrdi-1.c
lshrdi-1.c
ieee/rbug.c

Note: executables were stripped so that comparisons could be made on a.outs

So the first two sets of differences were not entirely helpful - because they showed that the gcc-3.2.3 compiler running under SunOS 4.1.1 inside TME gcc4 ran incorrectly.  But it would be difficult to determine WHY it was running incorrectly.  It would be difficult to find the exact m68k instruction that was emulated incorrectly that caused gcc-3.2.3 to not work correctly.

But the last set of files, such as ashrdi-1.c, were much more useful.  In this case, the gcc-3.2.3 compiler and linker ran identically under TME gcc3 and under TME gcc4.  The a.out files were identical, but the a.out file RAN incorrectly under TME gcc4.  If we could run NetBSD inside TME, this would be easy to track down, as we could just step through the code in gdb, one machine instruction at a time, running one under TME gcc3 and another under TME gcc4 and see where the program behaved differently.  But I don't have a gdb that runs on SunOS 4.1.1 under TME.  So I modified the assembly code generated by: gcc -S ashrdi-1.c to add print statements of each register after each instruction and finally found that the instruction:
    roxrl #1,d1
performed incorrectly under TME gcc4.  Stepping through the TME source while running the test case under TME gcc3 and TME gcc4 showed that the problem was in the tme_m68k_roxr32 routine, and that SHIFTMAX_INT32_T had a different value under gcc4 - which caused it to incorrectly perform the emulated roxrl #1,d1 instruction.  After correcting the value, TME works fine under gcc4.  The detailed description of the problem is here, tagged on the end of the patches Izumi submitted to allow gcc4 to compile TME.
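For readers who have not met roxr before, here is a minimal C sketch of what a 32-bit rotate right through the X (extend) flag does.  It is written for illustration only: roxr32_sketch is a made-up name, it is not the tme_m68k_roxr32 routine, it does not model the N, Z, V and C condition codes, and it says nothing about how SHIFTMAX_INT32_T is used inside TME.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value right through the X flag, one bit per step.
   *x holds the X flag (0 or 1) on entry and on return.  The rotation
   is over 33 bits: the bit shifted out of bit 0 goes into X, and the
   old X goes into bit 31. */
uint32_t roxr32_sketch(uint32_t value, unsigned count, unsigned *x)
{
    assert(x != NULL && *x <= 1);

    while (count-- > 0) {
        unsigned out = value & 1u;                    /* bit leaving at the bottom */
        value = (value >> 1) | ((uint32_t)*x << 31);  /* old X enters at the top */
        *x = out;                                     /* X takes the departing bit */
    }
    return value;
}
```

With X clear, rotating 0x00000001 by one position returns 0 and sets X.  With X set, rotating 0 by one position returns 0x80000000 and clears X.  Comparing a simple reference like this against the emulator's result for the same inputs is one way to confirm a suspect instruction.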

Another thing that could have been done in this case was to modify TME to print out the program counter to a file whenever it changed.  Run gcc-3.2.3 on one of the other groups of files - under TME gcc3 and TME gcc4.  Then examine the PC sequence and determine where it diverged.  Then "objdump -d" on the gcc-3.2.3 executable and see what it was trying to do when it diverged.
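The "determine where it diverged" step could be sketched in C as below.  The trace format is an assumption (one program counter value per text line, each line shorter than the 128-byte buffer), TME does not produce such a file without being modified, and first_divergence is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Compare two text traces line by line.  Returns the 1-based number of
   the first line that differs (a trace that ends early counts as a
   difference), or 0 if the traces are identical. */
long first_divergence(FILE *a, FILE *b)
{
    char la[128], lb[128];
    long line = 0;

    assert(a != NULL && b != NULL);

    for (;;) {
        char *ra = fgets(la, sizeof la, a);
        char *rb = fgets(lb, sizeof lb, b);

        line++;
        if (ra == NULL && rb == NULL)
            return 0;       /* both traces ended together */
        if (ra == NULL || rb == NULL)
            return line;    /* one trace is shorter than the other */
        if (strcmp(la, lb) != 0)
            return line;    /* first entry where the traces differ */
    }
}
```

Open the gcc3 trace and the gcc4 trace with fopen(), pass both to first_divergence(), and look up the program counter on the reported line in the "objdump -d" output.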

Another trick to keep in your TME emulation-debugging kit bag: when comparing NetBSD inside TME against NetBSD on a real sun3, run an executable under gdb on both, print out the stack and program counter after each machine instruction, and see where they diverge.  Time consuming - but it helped find the "movel sp,-(sp)" problem.  The PC where the execution diverged offered little assistance in determining the root cause of the problem - other than showing that something on the stack was different.  I stepped through several hundred thousand machine instructions in gdb before noting that the stack contents differed after "movel sp,-(sp)".

p.s.  Why do I test the full gcc-3.2.3 testsuite under SunOS 4.1.1 in the emulator instead of the gcc-3.3.3 testsuite under NetBSD 3.1 in the emulator?  Because the entire testsuite runs by itself under SunOS 4.1.1 and I just need to come back after a few days and examine the results.  NetBSD 3.1 has some bugs that cause the testsuite to fail due to running out of memory.  Note that these are NetBSD 3.1 for Sun3 bugs - not TME bugs.  So I run the testsuite under SunOS 4.1.1 inside TME first, then try the test cases under NetBSD 3.1 in TME second, hoping that they are duplicated and that I can use the integrated NetBSD gdb / objdump utilities to find the problem.

 

Results of gcc testsuite testing

Everything is great, other than a few floating point differences.

Sun3 emulation

Host architecture running TME  | SunOS 4.1.1 / gcc-3.2.3 testsuite | NetBSD 3.1 / gcc-3.3.3 testsuite
                               | running inside emulated Sun3      | running inside emulated Sun3
-------------------------------+-----------------------------------+---------------------------------
amd64                          | pass                              | pass
amd64-32bit i386 load          | pass                              | fail - note 1
i386                           | pass                              | fail - note 1
ultrasparc10                   | pass                              | pass - note 3
ultrasparc10 32-bit sun4u load | pass                              | pass - note 3
sparc20                        | fail - note 2                     | not tested

fail note 1 - failures:
   gcc.c-torture/execute/conversion.c -O0 -O1 -O2 -Os

Here is a reduced code fragment that produces the difference:

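#include <stdio.h>
#include <stdlib.h>   /* abort() */

/* Run-time conversion: the argument is not a constant inside this function. */
long double
ull2ld(unsigned long long int u)
{
  return u;
}

int
main(void)
{
  long double a, b;

  a = ull2ld(~0ULL);            /* converted at run time */
  b = (long double) ~0ULL;      /* typically folded by the compiler */
  if (a != b)
    abort();
  return 0;
}

Per Matt: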
  The ull2ld() program is a good example of how not-easy it is to
  emulate floating-point.

  This program checks that conversions of 64-bit unsigned integers
  to the IEEE754 80-bit format work.  The 80-bit format has a 64-bit
  significand, so it should always work without loss of precision.

  The IEEE754 compliance level setting in SUN3-CARRERA is "unknown".
  "unknown" is meant to give the greatest possible functionality, with
  potentially the least IEEE754 compliance.  It gives *something* for
  almost all floating-point operations, but uses whatever built-in
  types the host has (which may not even be IEEE754 types) and makes
  no guarantees about precision or rounding direction.

  On most host systems, "unknown" actually isn't too bad.  Most host
  systems have IEEE754 types, and i386/x86-64 systems even have the
  80-bit format.

  Why, then, does "unknown" break ull2ld() on an i386 host, which has
  the 80-bit format?  The i386 FPU has a "precision control" setting
  in its control word, and it's usually set to round all results to
  53-bits (for the IEEE754 64-bit format, built-in type "double").
  When ull2ld() converts 0xffffffff.ffffffff to long double, it gets
  rounded to that plus one (which fits in a 53-bit significand).

  As far as I can tell, NetBSD doesn't have any function that tme
  could use to set the precision control.  I can't find anything on my
  old RedHat either, but it looks like FreeBSD has an fpsetprec().

  I'm reluctant to add host-CPU-specific asm() to tme, especially at
  this late point and since "unknown" makes no guarantees about
  precision.

  The other compliance levels have their own issues.  "partial" is
  basically the same as "unknown" on an i386 host.  "strict" makes
  ull2ld() work, but "strict" doesn't have any of the IEEE754 80-bit
  transcendental functions (softfloat doesn't provide them), and so
  "strict" will break anything that uses them.

  That's the long story.  I'm going to leave "unknown" in SUN3-CARRERA
  and just continue to remind people to be wary of floating-point.
  It's very hard to do perfect floating-point emulation using built-in
  types, libm functions and no asm(), which is what tme strives to do.
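The "rounded to that plus one" effect can be reproduced on most hosts without touching the x87 control word, simply by converting through double, which has the same 53-bit significand.  This is a minimal C sketch under that assumption (IEEE-754 double, round-to-nearest); the function names are hypothetical and none of this is tme code.

```c
#include <assert.h>
#include <float.h>

/* Convert through double, whose significand holds 53 bits.  The
   volatile store forces the value to be rounded to double precision. */
double ull_to_double(unsigned long long u)
{
    volatile double d = (double)u;
    return d;
}

/* Returns 1 if this host's long double can hold every 64-bit integer
   exactly (true for the x87 80-bit format, where LDBL_MANT_DIG is 64). */
int long_double_holds_u64(void)
{
    return FLT_RADIX == 2 && LDBL_MANT_DIG >= 64;
}

/* Self-check of the rounding behaviour described above. */
void demo_53_bit_rounding(void)
{
    /* 2^64 - 1 needs 64 significant bits.  Rounded to nearest with
       only 53 bits it becomes 2^64, that is, the original plus one. */
    assert(ull_to_double(~0ULL) == 18446744073709551616.0);

    if (long_double_holds_u64()) {
        volatile long double ld = (long double)~0ULL;

        assert(ld == 18446744073709551615.0L);  /* exact with 64 bits */
        assert(ld != 18446744073709551616.0L);  /* no rounding up */
    }
}
```

On a host whose long double is no wider than double, the second half of the check is skipped rather than failing.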

fail note 2 - failures:
   gcc.dg/Wparentheses-1.c
   gcc.dg/Wreturn-type.c
   gcc.dg/Wunknownprag.c
   gcc.dg/array-2.c
   gcc.dg/attr-nest.c
   gcc.dg/c99-func-4.c
   gcc.dg/c99-impl-int-2.c
   gcc.dg/cast-qual-2.c
   gcc.dg/va-arg-1.c
   gcc.dg/vla-init-1.c
   gcc.dg/weak-7.c
   gcc.dg/cpp/gnuc89-pedantic.c
   gcc.dg/cpp/if-1.c
   gcc.dg/cpp/if-2.c
   gcc.dg/cpp/if-mpar.c
   gcc.dg/format/c90-scanf-3.c
   gcc.dg/noncompile/20001228-1.c -O3 -fomit-frame-pointer -funroll-loops
   gcc.dg/noncompile/20010425-1.c -O0
   gcc.dg/noncompile/20010425-1.c -O1
   gcc.dg/noncompile/20010425-1.c -O2
   gcc.dg/noncompile/20010425-1.c -O3 -g
   gcc.dg/noncompile/20010425-1.c -Os
   gcc.dg/noncompile/20010524-1.c -O0
   gcc.dg/noncompile/20011025-1.c -O1
   gcc.dg/noncompile/20011025-1.c -O3 -fomit-frame-pointer
   gcc.dg/noncompile/20011025-1.c -O3 -g
   gcc.dg/noncompile/20020130-1.c -O2
   gcc.dg/noncompile/20020130-1.c -O3 -fomit-frame-pointer
   gcc.dg/noncompile/20020130-1.c -Os
   gcc.dg/noncompile/20020207-1.c -O0
   gcc.dg/noncompile/20020207-1.c -O1
   gcc.dg/noncompile/20020207-1.c -O3 -fomit-frame-pointer -funroll-loops
   gcc.dg/noncompile/init-3.c -O2
   gcc.dg/noncompile/init-3.c -Os
   gcc.dg/noncompile/invalid_asm.c -O0
   gcc.dg/noncompile/invalid_asm.c -O1
   gcc.dg/noncompile/invalid_asm.c -O2
   gcc.dg/noncompile/invalid_asm.c -O3 -fomit-frame-pointer
   gcc.dg/noncompile/invalid_asm.c -O3 -g
   gcc.dg/noncompile/invalid_asm.c -Os
   gcc.dg/noncompile/label-lineno-1.c -O0
   gcc.dg/noncompile/label-lineno-1.c -O1
   gcc.dg/noncompile/label-lineno-1.c -O2
   gcc.dg/noncompile/label-lineno-1.c -Os
   gcc.dg/noncompile/redecl-1.c -O0
   gcc.dg/noncompile/redecl-1.c -O1
   gcc.dg/noncompile/redecl-1.c -O3 -fomit-frame-pointer
   gcc.dg/noncompile/redecl-1.c -O3 -g
   gcc.dg/noncompile/redecl-1.c -Os
   gcc.dg/noncompile/va-arg-1.c -O3 -fomit-frame-pointer
   gcc.dg/noncompile/va-arg-1.c -Os

pass note 3: only a subset of tests was run - the floating point tests that caused problems in the past, plus everything in execute/ieee
    gcc.c-torture/execute/20000603-1.c
    gcc.c-torture/execute/20000731-1.c
    gcc.c-torture/execute/20020314-1.c
    gcc.c-torture/execute/20020413-1.c
    gcc.c-torture/execute/960405-1.c
    gcc.c-torture/execute/960513-1.c
    gcc.c-torture/execute/991019-1.c
    gcc.c-torture/execute/builtin-complex-1.c
    gcc.c-torture/execute/complex-1.c
    gcc.c-torture/execute/complex-5.c
    gcc.c-torture/execute/complex-6.c
    gcc.c-torture/execute/conversion.c
    gcc.c-torture/execute/gofast.c
    and everything in gcc.c-torture/execute/ieee

Sparcstation emulation

Host architecture running TME   | NetBSD 3.1 / gcc-3.3.3 testsuite
                                | running inside emulated Sparc
--------------------------------+---------------------------------
i386                            | fail - note 1, 2
ultrasparc10                    | fail - note 1, 2
ultrasparc10-32 bit sun4u load  | fail - note 1, 2
sparc20                         |

fail note 1 - failures:
    gcc.c-torture/execute/960405-1.c (all compile options)
    gcc.c-torture/execute/ieee/hugeval.c (all compile options)
    gcc.c-torture/execute/ieee/inf-1.c -O0 -O1

fail note 2: all tests were run other than the following two, which took forever to run and finally locked up the real and emulated sparcstation:
    gcc.c-torture/compile/20001226-1.c
    gcc.dg/c99-intconst-1.c

 

TME bugs

1) NetBSD 3.0 hangs inside the emulator after running for two days
2) The host system must be rebooted between runs of the gcc-3.3.3 testsuite under NetBSD 3.0 inside the emulator
3) Weird login prompt via the serial line: "tíeóèóõî loçiî:" instead of "tmeshsun login:"
4) I sometimes (rarely) get an assertion on line 946 of threads-sjlj.c when typing via the emulated serial line.  The same assertion appears when starting tmesh with - (a blank device) as the serial device
5) Stop-A doesn't work on some PCs (NetBSD ticket 34903)
6) No keyboard entry at the PROM monitor after shutdown with SunOS (NetBSD ticket 35306)

