YOUR FEEDBACK
A Dozen Predictions for 2008
Joachim Van der Auwera wrote: I agree that code generation can be very imp...


2007 West
GOLD SPONSORS:
Active Endpoints
Your SOA Needs BPEL for Orchestration
BEA
Virtualized SOA: Adaptive Infrastructure for Demanding Applications
Nexaweb
Overcoming Bandwidth Challenges with Nexaweb
TIBCO
What is Service Virtualization?
SILVER SPONSORS:
WSO2
Using Web Services Technologies and FOSS Solutions
Click For 2007 East
Event Webcasts

2008 East
PLATINUM SPONSORS:
Appcelerator
Think Fast: Accelerate AJAX Development with Appcelerator
GOLD SPONSORS:
DreamFace Interactive
The Ultimate Framework for Creating Personalized Web 2.0 Mashups
ICEsoft
AJAX and Social Computing for the Enterprise
Kaazing
Enterprise Comet: Real–Time, Real–Time, or Real–Time Web 2.0?
Nexaweb
Now Playing: Desktop Apps in the Browser!
Sun
jMaki as an AJAX Mashup Framework
POWER PANELS:
The Business Value
of RIAs
What Lies Beyond AJAX?
KEYNOTES:
Douglas Crockford
Can We Fix the Web?
Anthony Franco
2008: The Year of the RIA
Click For 2007 Event Webcasts
SYS-CON.TV
TOP LINKS YOU MUST CLICK ON


Sun engineer's response to 'How Sun can pull out of its slump'
Consistency is key

Digg This!

Page 1 of 2   next page »

Dear Mr. Murphy,

We appreciate the high estimation in which you hold Sun Microsystems, as expressed in your LinuxWorld article of 1/29/03. We would like to address a small part of the article in which you call attention to what you see as a problem with SPARC and scientific computing, quoted below; in fact, although there is something to be learned from your square-root summation example, it is not what you conclude. More follows the quotation:

  • The actual problem is that SPARC, like other RISC CPUs, defaults to the IEEE 754 64-bit floating-point specification. Therefore, it doesn't get the right answers in cases requiring extensive iterative computation. If, for example, you get it to sum the square roots of the first billion positive integers using the GCC compiler and default libraries, the result is off by 0.18.

    Although completely meaningless in business processing, that error is devastating if you're trying to simulate the three-body problem over a long period. You can get around this by using extended precision libraries, but that cuts performance significantly.

  • The self-inflicted version of this problem is that SPARC has an answer to floating-point performance problems in its SIMD instruction set, but Sun doesn't go out of its way to tell anyone about it and hasn't given the research community the support needed to motivate its use. The main problem is that researchers tend to develop for the box on their desk, and these days that's more likely to be an Intel box running FreeBSD or Linux than a SPARC. As a result, when they do run that code on a SPARC, there's nothing there to take advantage of the short-array capabilities provided in the hardware, and they get neither the performance nor the accuracy that the machine is capable of."

The main points we want to make are:

  • The error as measured by the more suitable unit of ulps is not really large and it is further controllable by improving the summation algorithm.
  • The SPARC results are uniform across systems while the IA32 results vary from somewhat better to much worse; uniformity is a virtue. Note that the results on other RISC systems match Sun's and also note that high performance scientific computing is still much more practiced on the RISC systems exemplified in the table.
  • The variability of IA32 results stems from improper management of extended precision registers vis-a-vis memory precision, a subtlety that vitiates significantly the potential real advantage of extended precision intermediate computation. Note that IA32 does indeed comply with IEEE754; its differences with SPARC or other RISCs is within IEEE754 latitude.
  • The example is typical of kernels in the solution of differential equations, like the three-body problem to which you refer, which will overwhelm eventually any finite-bounded precision representation, so better numerical approaches must be used anyway.
  • The SIMD instruction set, VIS, available in successively improved versions in UltraSPARC processors, does not as yet implement floating-point arithmetic. Until recently this disadvantage with respect to the IA32 SIMD sets, SSE and SSE2, was not material since the latter was addressing a more egregious floating-point performance issue. VIS is used to accelerate some FP codes on SPARC by doing logical and bit-wise operations on FP registers. Note that SSE and SSE2 implement SPARC-style pure double and float precision FP arithmetic; there's no use of extended precision.

Next is a more detailed discussion. For a good foundation in floating-point computing we recommend the ACM article by David Goldberg "What Every Computer Scientist Should Know about Floating-Point Computing" especially with the addendum by Douglas Priest that addresses IA32. Also of interest might be Joe Darcy's JavaOne talk or even better, but requiring jdc registration, is What Everybody Using the Java Programming Language Should Know About Floating-Point Arithmetic

I) The Code and Compilation

The source code at the heart of the discussion:

        ------- sqr.c -------
        #include 
        #include 
        main()
        {
        register i;
        double f=0.0;
        double sqrt();

for(i=0;i<=1000000000;i++) { f+=sqrt( (double) i); } printf("Finished %20.14f\n",f); } ------- sqr.c -------

Investigators were asked to compile this program with the following command:

cc -o sqr -O sqr.c -lm

and to run the executable as follows:

time ./sqr

II) The Results: Correct versus IA32 versus SPARC

RUN-NameResultDouble Prec Rep
1) Correct: 21081851083600.37596259382529338=0x42b32c803ebb5060
2) SPARC-quad: 21081851083600.37500000000000=0x42b32c803ebb5060
3) Correct+1ULP: 21081851083600.3789=0x42b32c803ebb5061
4) IA32: 21081851083600.38281250000000=0x42b32c803ebb5062
5) SPARC double: 21081851083600.55859375000000=0x42b32c803ebb508f

The table below presents results from several sources/runs in order of accuracy.

Entry number 1 is the result from the Euler-Maclaurin formula as presented by Murphy. The third column of the table gives the IEEE-754 floating-point double-precision hexadecimal representation of the result.

Entry number 3 presents not the result of a run, but rather serves to illustrate the error in the decimal representation when the hexadecimal number varies by 1 bit, or unit in the last place, denoted ULP.

The best computed results are in error by 2 ULPs and are represented by entry number 4.

The SPARC results which are in error by 47 ULPs, are shown as entry number 5.

Entry number 2, shows a result that in decimal representation, is in error by -.00096259382529338. However, observe that the hex representation shows an error of 0 ULPs, an exact answer. The reason for this apparent contradiction is that IEEE-754 double precision represents only 16- or 17-decimal digits. Additional digits are used in rounding when converting from decimal to the double precision format. Entry number 2 presents the double-precision results when the sum is kept in IEEE-754 128-bit format.

I) Magnitude of the Errors

RUN-NameDecimal ErrorULP Errorerr/correct
SPARC-quad-0.0009625938252933800
Correct+1ULP0.0029374061747066211.39333E-16
BEST0.0068499061747066223.24920E-16
SPARC-double0.182631156174706624786.6296E-16

Consider now the errors encountered in the survey, based on the table in section II. Consider #1 as perfect. Errors can be considered in terms of decimal representation, and ULPs.

When considering the absolute error alone, SPARC results appear bad. However, considering the ratio error/correct, the variance between results, in fact, is discovered to be very small.


Page 1 of 2   next page »

About Gregory V. Tarsy
Gregory V. Tarsy works in the Floating Point and Numerical Computing Group at Sun Microsystems.

LATEST LINUX STORIES
3rd International Virtualization Conference & Expo: Themes & Topics
From Application Virtualization to Xen, a round-up of the virtualization themes & topics being discussed in NYC June 23-24, 2008 by the world-class speaker faculty at the 3rd International Virtualization Conference & Expo being held by SYS-CON Events in The Roosevelt Hotel, in midtown
Verizon Becomes a Counter-Android Linux Convert
Verizon Wireless is snubbing Google's Linux-based Android initiative to go with the LiMo Foundation's mobile Linux spec for its next wave of mobile phones expected next year. Along with Verizon, Mozilla signed up - giving the consortium its first major open source ISV - and a key one f
Adaptec Launches New Series 2 RAID Controller For Linux Users
Adaptec unveiled a new family of entry-level Unified Serial RAID controllers. The new low-profile Series 2 RAID controllers, built on the same Adaptec dual core RAID-on-Chip (ROC) architecture used in its successful Series 5 RAID controllers, provide significant performance enhancement
JavaOne 2008: Sun Challenges Linux
Sun's mule train has finally pulled into Indiana after three years on the road. Indiana is the Linux-friendly Fedora-like OpenSolaris project meant to move the Solaris-shy Linux community off Linux and on to Solaris tempted by Solaris widgetry like the highly scalable, rollback-easy, 1
Kevin Hoffman's Review of Iron Man
I took the advice of a friend of mine and steered clear of the 'normal' movie theaters and went a little out of the way to go to a DLP movie theater. The experience of comparing a regular movie theater to a DLP movie theater is like comparing standard def analog TV with a 1080i HDTV si
Curl Announces Support for Ubuntu for Enterprise RIA Platform
Curl announced it has released the availability of an Ubuntu Installer for the Curl Rich Internet Application (RIA) platform. Curl is a Rich Internet Application platform that competes with Adobe AIR/Flex, Silverlight, and Ajax. Curl has been shipping with Linux support for RedHat 9, S
SUBSCRIBE TO THE WORLD'S MOST POWERFUL NEWSLETTERS
SUBSCRIBE TO OUR RSS FEEDS & GET YOUR SYS-CON NEWS LIVE!
Click to Add our RSS Feeds to the Service of Your Choice:
Google Reader or Homepage Add to My Yahoo! Subscribe with Bloglines Subscribe in NewsGator Online
myFeedster Add to My AOL Subscribe in Rojo Add 'Hugg' to Newsburst from CNET News.com Kinja Digest View Additional SYS-CON Feeds
Publish Your Article! Please send it to editorial(at)sys-con.com!

Advertise on this site! Contact advertising(at)sys-con.com! 201 802-3021

SYS-CON FEATURED WHITEPAPERS

ADS BY GOOGLE