NVIDIA Tesla™ GPU - Bring Home Unprecedented High-Density Parallel Processing
Anonymous Coward (OP) User ID: 206079 United States 06/22/2007 08:49 AM | |
Anonymous Coward (OP) User ID: 206079 United States 06/22/2007 11:45 PM | |
Eliot Rosewater User ID: 161502 United States 06/23/2007 08:20 AM | |
Anonymous Coward (OP) User ID: 206079 United States 06/23/2007 08:24 AM | Can you overclock the cheap one and make it a teraflop? Quoting: Eliot Rosewater 161502
I'll bet you can with just the regular GeForce 8800 that it's based on. Only half the RAM, but you can OC to your heart's content. Time to dig out that old liquid nitrogen cooling system.... |
Bao2 User ID: 254833 Spain 06/23/2007 10:54 AM | |
Anonymous Coward User ID: 234305 United States 06/23/2007 12:55 PM | |
Space Doubt (OP) User ID: 206079 United States 06/23/2007 01:17 PM | And I believe I heard it'll cost about $5000. Quoting: Anonymous Coward 234305
I rather doubt that. Maybe for the four-chip server, but not individual cards. It is a GeForce 8800 with double the RAM. Sure, the support will add to the cost, but it won't make it more than double the $500-$600 price for a top-shelf GeForce. |
Anonymous Coward User ID: 234305 United States 06/23/2007 09:21 PM | I rather doubt that. Maybe for the four-chip server, but not individual cards. It is a GeForce 8800 with double the RAM. Sure, the support will add to the cost, but it won't make it more than double the $500-$600 price for a top-shelf GeForce. Quoting: Space Doubt 206079
Uh, NO! [link to www.extremetech.com] The company will initially release three products: the C870 board, which will retail for about $1,499; a "deskside" unit, the D870, which will cost about $7,500; and the S870 server, which will cost about $12,000. |
Anonymous Coward (OP) User ID: 206079 United States 06/23/2007 09:33 PM | The company will initially release three products: the C870 board, which will retail for about $1,499; a "deskside" unit, the D870, which will cost about $7,500; and the S870 server, which will cost about $12,000. Quoting: Anonymous Coward 234305
The price for the individual card seems reasonable, but why five times as much for two cards? What could that case have that's worth $5,000? |
Anonymous Coward User ID: 160259 Netherlands 06/23/2007 09:38 PM | |
Anonymous Coward (OP) User ID: 206079 United States 06/23/2007 09:51 PM | Wow. And how did they achieve this performance? Is it some kind of quantum processor, nanotechnology or something more "ordinary"? Quoting: Anonymous Coward 160259
It's a video card, the same as the GeForce 8800 with twice the RAM. They have come a long way. Each has 128 floating-point processors called Universal Shaders or Stream Processors. Alone it can't run software. The card is designed to be installed in a computer, and the others connect to a computer with a cable. Programs need to be written for CUDA in order to take advantage of the Tesla. [link to developer.nvidia.com] |
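The CUDA programming model mentioned above can be sketched in plain Python. This is only an illustration of the index arithmetic a CUDA kernel uses, not NVIDIA's actual API; the `vector_add` function and the 128-thread block size are stand-ins showing how work gets spread across the card's 128 stream processors:

```python
# Conceptual sketch of CUDA's data-parallel model: the work is split into
# blocks of threads, and each thread computes one output element. On a Tesla,
# every (block, thread) pair runs on one of the 128 stream processors;
# here the same index arithmetic is emulated sequentially.
def vector_add(a, b, threads_per_block=128):
    n = len(a)
    c = [0.0] * n
    num_blocks = (n + threads_per_block - 1) // threads_per_block  # ceil(n / tpb)
    for block in range(num_blocks):
        for thread in range(threads_per_block):
            i = block * threads_per_block + thread  # blockIdx.x * blockDim.x + threadIdx.x
            if i < n:  # bounds guard, just as in a real CUDA kernel
                c[i] = a[i] + b[i]
    return c

print(vector_add([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # [5.0, 7.0, 9.0]
```

The point of the model is that the loop body has no dependence between iterations, so the hardware is free to run all the (block, thread) pairs at once.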
Anonymous Coward (OP) User ID: 206079 United States 06/23/2007 09:56 PM | The promise of GPGPU has been covered for the past several years, and with the introduction of programmable graphics processors, that dream was finally able to come to fruition. The previous generation of DX9 cards dipped their toes into the general computing world, but the architectures were still very oriented towards graphics and fixed-function processing. ATI had a bit of a better time of it due to their thread management, programmable memory controller, and better branching performance with the X1800 and X1900 series, but the architecture was still rather limited in what it could do. NVIDIA was a bit farther behind with their 7900 series of cards, and even though some basic GPGPU work was done with them, they simply were not flexible enough to handle complex programs and calculations.
Fast forward to November of last year and we see NVIDIA release a radical new architecture which excels not only in its flexibility, but also overall performance. The 8800 series of GPUs are the first to support Microsoft's DirectX 10 and Shader Model 4.0, but these chips are also designed from the outset to be outstanding performers in general-purpose single precision floating-point operations.
Seemingly, the price for next-generation performance in games, DX10 compatibility, and GPGPU support is a huge one. The G80 chip the 8800 series is based on is comprised of 680+ million transistors packed into a die that is supposedly around 460 mm². Another price that has to be paid is power and heat. Currently the top-end 8800 GTX/Ultra boards consume around 150 to 175 watts at full bore. They also require a pretty hefty system to whisk the heat away. NVIDIA uses the older 90 nm process from TSMC for their G80 chips, and the newer 80 nm optimized process for their lower-end G84 and G86 chips.
But when considering the alternatives for achieving these types of performance numbers, the G80 chips look incredibly cheap from a price/performance/power standpoint. It would take quite a few dual-core/quad-core x86 CPUs to match the theoretical output of the G80 in single precision floating-point operations.
Tesla
NVIDIA is branding their GPGPU products as the Tesla family. For the first generation of these products we will see three distinct offerings, all based upon the G80 processor on a GTX/Ultra PCB, but without the video outputs. Initially NVIDIA will be the only company actually selling these products, but perhaps in the future, depending on how successful the Tesla initiative is, we may see other partners taking over responsibility for production of the actual Tesla cards, personal supercomputers, and rackmount servers.
The most basic product is a single Tesla board. This is virtually identical to the above-mentioned 8800 GTX/Ultra, but without the output ports and with double the memory (1.536 GB). This card is designed to be plugged into a workstation and used as a co-processing board with GPGPU software. The initial cost of the Tesla C870 is around $1,499. These boards are specifically designed for GPGPU use, so there potentially could be some significant differences between the $1,499 board and a $529 8800 GTX beyond the doubled memory. The board itself has a max power rating of 170 watts and can output (in specific situations) 500 gigaFLOPS per GPU.
The second tier is the "personal supercomputer," a box containing 2 x C870 boards, branded the Tesla D870. Inside the box are the two C870 boards, an internal power supply, and the PCI-E switch necessary to allow the box to communicate with the workstation it is attached to. A PCI-E switch card is installed in the workstation's PCI-E 16X slot.
An external PCI-Express cable goes from this card to the supercomputing box, where it attaches to the internal PCI-E switch that NVIDIA developed (or at least utilizes). This switch is then connected to the individual cards. These are all PCI-Express Gen 2 products, so their throughput is pretty impressive (16 GB/sec aggregate bandwidth). The box consumes around 500 watts at maximum, which is more than double the individual power ratings of the cards contained. This again is due to other internal components such as the switch and cooling solutions. This little monster will set a user back $7,500. For average people, this is a lot of money. For companies working in the gas and energy field, this is chump change and well worth the cost in terms of space, power consumption, and overall computational performance in one small package.
The third tier is the GPU Computing Server. This 1U rackmount product packs in 4 x C870 boards and is branded the Tesla S870. The unit typically consumes around 550 watts with a max power rating of 800 watts. In certain circumstances, this setup could produce around 2 teraFLOPS. The amount of single precision computing power this unit has is tremendous. The rackmount connects to another server the same way as the personal supercomputer above, utilizing PCI-Express Gen 2 cables and switches. These racks are designed to be integrated into larger stacks and connected with other 2P/4P/8P servers which handle the rest of the computational requirements. They are not standalone, and do need to be connected to another server which handles all the OS, networking, and storage duties. NVIDIA eventually plans to have rackmounts which support up to 8 GPGPUs, but that does not appear to be an option at this time. If one of the master servers has 2 x 16X PCI-E slots, then it can theoretically connect to two of the S870s.
In terms of hardware NVIDIA appears to have all the bases covered for a comprehensive lineup of GPGPU products. [link to www.penstarsys.com] |
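The "500 gigaFLOPS per GPU" figure above can be roughly reconstructed from the chip's published specs. As a back-of-envelope sketch (assuming the ~1.35 GHz shader clock of the 8800 GTX-class parts and up to 3 single precision operations per shader per cycle, a multiply-add plus a multiply; these are peak numbers, not sustained):

```python
# Peak single precision throughput of a G80-based board.
# Assumptions: 128 stream processors, ~1.35 GHz shader clock,
# 3 flops per shader per cycle (MAD + MUL dual issue).
def peak_gflops(shaders, clock_ghz, ops_per_cycle):
    return shaders * clock_ghz * ops_per_cycle

print(peak_gflops(128, 1.35, 3))  # ~518 GFLOPS, i.e. "over 500 gigaflops per GPU"
```

Four such boards in the S870 rackmount give the ~2 teraFLOPS peak figure for the server.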
Anonymous Coward (OP) User ID: 206079 United States 06/23/2007 09:59 PM | The Software Side
NVIDIA released the beta of the CUDA software this past February, and they have been working with the GPGPU community to improve support. Initially there were a lot of complaints about how hard it was to program effectively for CUDA, but over time those complaints appear to have died down. NVIDIA has been working with a variety of entities to further develop their applications to work with the Tesla products. Companies such as Acceleware are helping to develop solutions aimed at geophysical and electromagnetic simulation. NVIDIA's goal is for these GPGPU applications to be programmable in C without any issues, and with good performance. NVIDIA claims it is at a point where it can offer these products with tools available and mature enough to enable a wider spectrum of applications to be ported to the Tesla products.
The Future of Tesla
Currently the Tesla products utilize the G80 processor. There are no plans to utilize the lower-end G84 and G86 parts, mainly because it would take about 4 G84 chips to equal the overall performance of a single G80. In high-performance computing, power is not so much the issue as space: being able to use a 1U rackmount to house 4 C870s, versus getting the same computing power (at slightly better power consumption) from 16 G84-based units, is a big deal. So for the time being we will only see G80-based products in the market. This does not rule out some small-form-factor uses for the G84, as I would imagine it could be handy for workstation-class laptops. Being able to plug in an ExpressCard version of a Tesla product based on G84 could appeal to some users, but that market is likely so small that it is not worth developing.
The G80 does have one large potential disadvantage over other solutions: it currently supports only single precision floating-point operations. Applications which require double precision will obviously not work on the Tesla series of products. By the end of this year NVIDIA does expect to introduce a new Tesla product that supports double precision, likely based on the 65 nm G80 replacement, but with a few changes. At first glance one would honestly think, "How do double precision floating-point operations help my Unreal Tournament 3 performance?" The quick and correct answer is that they do not help one bit. But the upcoming GPUs that will go into the Tesla products later this year featuring double precision will not be the same GPUs that go into retail boards this holiday season. NVIDIA is looking to differentiate its products at the chip level, which means two separate 65 nm parts still based on the same overall architecture, one supporting double precision and the other not. This does not mean that NVIDIA has had to hire double the engineers to complete such a design. Due to the modular nature of current GPUs, it is not as daunting a task to introduce new products based on the same general architecture; we see this currently with the whole range of products that NVIDIA ships. Modern EDA software and standard cell design rules allow engineers to lay out and verify a new design based on the current architecture far faster than they used to, and other tools such as FPGA prototyping machines help accelerate the process by exposing flaws in timing that could be missed by the EDA software. Later this year we will see NVIDIA release their 65 nm desktop chips for the high end, and around the same time there will be a standalone Tesla product generally based on that same architecture, but supporting double precision operations.
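The single vs. double precision gap is easy to demonstrate. A minimal Python sketch (using the standard `struct` module to round a value through the 32-bit format the G80 computes in; double precision carries roughly 16 decimal digits, single only about 7):

```python
import struct

def to_single(x):
    """Round a Python float (64-bit double) through 32-bit single precision."""
    return struct.unpack('f', struct.pack('f', x))[0]

# An increment of 1e-8 survives in double precision but is lost in single,
# because the spacing between single precision values near 1.0 is ~1.2e-7.
a = 1.0 + 1e-8
print(a > 1.0)             # True  -- distinguishable as a double
print(to_single(a) > 1.0)  # False -- rounds back to exactly 1.0 as a single
```

For iterative scientific codes, this kind of rounding accumulates, which is why some HPC applications simply cannot use a single precision part.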
NVIDIA realizes that it probably cannot survive forever by producing only graphics chips. By diversifying its product lines to include HPC initiatives, it is expanding its product base as well as providing more security for the future should the standalone graphics market be rendered obsolete. This is merely the starting point of a new product line and a new market. The potential of GPGPUs in high-performance computing is largely untapped, but with NVIDIA making a strong push into this market it will likely become commonplace in the very near future. AMD is also pushing hard into this market, and will likely take the same approach as NVIDIA. The differentiation, and the potential benefits such work will likely inspire, will help these companies through difficult times when a product from one side or the other falls down or does not live up to expectations. The determining factors of success in the next year for this budding industry will likely be software support and the presence of products supporting double precision operations. So far NVIDIA has the upper hand, with products available as well as a clear roadmap for both its hardware and software support. Something else that should be mentioned is that CUDA support is not relegated only to the Tesla series of products. Software written for CUDA should work on any G8x-class product, but users should probably only consider the full 8800 GTX to avoid any compatibility issues. This means that anyone who needs this type of computing can go down to the local brick and mortar, buy an 8800 GTX for around $550, slap it into their workstation, and have a fully functional board with the same features as the more expensive Tesla models (albeit with half the memory). [link to www.penstarsys.com] |
Anonymous Coward (OP) User ID: 206079 United States 06/23/2007 10:16 PM | |
WhiteCosmicWizard User ID: 99906 Australia 06/23/2007 11:40 PM | sounds good but doesn't GPU stand for GRAPHICS processing unit? i.e. it will be a supercomputer for graphics but not for CPU speed? and what the hell is a gigaflop? never in my life have i heard anyone use the term gigaflop. |
Anonymous Coward (OP) User ID: 206079 United States 06/23/2007 11:50 PM | sounds good but doesn't GPU stand for GRAPHICS processing unit? i.e. it will be a supercomputer for graphics but not for CPU speed? and what the hell is a gigaflop? Quoting: WhiteCosmicWizard
Those are the figures for floating-point operations per second with an ideal thread feed, i.e. no latency and idle time. Yes, it is based on a video chip. With DX10, MS gave its blessing to new standards for programmability. No longer are there separate units for processing pixels, geometry, and triangle setup; there is one standard unit. Each Tesla chip has 128 of these Universal Shaders, and the whole card has 1.5 GB of some of the fastest RAM you can buy, along with a management system and memory controller to match its capabilities. With four chips and 6 GB of RAM in each rackmount unit, a full stack would be competitive with the top one hundred computers in the world. |
Anonymous Coward (OP) User ID: 206079 United States 06/23/2007 11:54 PM | Gigaflop:
Giga = billion
FLOP = floating-point operation (FLOPS = floating-point operations per second)
There are two types of math on a computer: integer and floating point. FP is the more difficult of the two; an FP unit can easily do integer work. So each Tesla card can do half a trillion mathematical computations every second. |
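The arithmetic behind "half a trillion per second" is simple to lay out (using the peak ~500 gigaFLOPS per C870 figure from earlier in the thread; real workloads sustain less than peak):

```python
GIGA = 1_000_000_000  # giga = billion

def ops_per_second(gigaflops):
    """Convert a gigaFLOPS rating to raw floating-point operations per second."""
    return gigaflops * GIGA

# One Tesla C870 at ~500 gigaFLOPS peak:
print(ops_per_second(500))  # 500000000000 ops/sec -- half a trillion

# Two cards (the D870 deskside box) reach the teraflop mark:
print(ops_per_second(500) * 2 >= 10**12)  # True
```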
Anonymous Coward User ID: 234305 United States 06/24/2007 12:29 AM | |
Anonymous Coward User ID: 234305 United States 06/24/2007 12:31 AM | Something else that should be mentioned is that CUDA support is not relegated only to the Tesla series of products. Software written for CUDA should work on any G8x-class product, but users should probably only consider the full 8800 GTX to avoid any compatibility issues. This means that anyone who needs this type of computing can go down to the local brick and mortar, buy an 8800 GTX for around $550, slap it into their workstation, and have a fully functional board with the same features as the more expensive Tesla models (albeit with half the memory). Quoting: Anonymous Coward 206079
AND at a slower speed. There is no video output on the Tesla cards, and the speed is increased. |
Anonymous Coward (OP) User ID: 206079 United States 06/24/2007 12:56 AM | The dual-card setup is an entire system. Quoting: Anonymous Coward 234305
No, it's not. The Tesla D870 deskside supercomputer is used in tandem with multi-core CPU systems to create a flexible computing solution that fits seamlessly into your existing IT infrastructure. [link to www.nvidia.com] |
Anonymous Coward (OP) User ID: 206079 United States 06/24/2007 12:57 AM | Tesla D870 Deskside Supercomputer
* Two GPUs (128 thread processors per GPU)
* Over 500 gigaflops per GPU
* 3 GB system memory (1.5 GB dedicated memory per GPU)
* Quiet operation (40 dB), suitable for office environments
* Connects to host via cabling to a low-power PCI Express x8 or x16 adapter card
* Optional rack mount kit
[link to www.nvidia.com] |
Anonymous Coward (OP) User ID: 206079 United States 06/24/2007 12:10 PM | |
WhiteCosmicWizard User ID: 99906 Australia 06/24/2007 01:37 PM | ok so is this just some super duper video card? or will it increase your CPU speed too? |
Anonymous Coward (OP) User ID: 206079 United States 06/24/2007 02:55 PM | ok so is this just some super duper video card? or will it increase your CPU speed too? Quoting: WhiteCosmicWizard
Programs will need to be written and compiled with CUDA or something similar to take advantage. Your CPU will be just as fast, but the program may run as much as 100 times faster with one single card. Let's put this into perspective. Two Tesla cards have a theoretical speed of just over 1 teraflop. The IBM Blue Gene/L system is the world's fastest computer. Quote: [T]he IBM BlueGene/L system, installed at DOE's Lawrence Livermore National Laboratory (LLNL), retains the No. 1 spot with a Linpack performance of 280.6 teraflops (trillions of calculations per second, or Tflop/s). More...... [link to www.top500.org]
Its peak theoretical speed is 360 teraflops. [link to domino.research.ibm.com] It's quite bulky. Quote: On October 27, 2005, LLNL and IBM announced that Blue Gene/L had once again broken its speed record, reaching 280.6 TFLOPS on Linpack, upon reaching its final configuration of 65,536 "Compute Nodes" (i.e., 2^16 nodes) and an additional 1024 "IO nodes" in 64 air-cooled cabinets. More.... [link to en.wikipedia.org]
64 racks at $2 million each is $128 million. 180 Tesla servers with four cards each would have the same theoretical FP power, costing about $2.2 million. Each would need a dual-core server, and each rack would need an I/O computer or switch, but even the finest machines would add less than a million to the total. And they would take up much less space than 64 Blue Gene racks..... |
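The comparison above can be checked with quick arithmetic, using the ~2 teraFLOPS peak per S870 and the list prices quoted earlier in the thread (peak figures, not Linpack, so this is a rough sketch rather than a fair benchmark):

```python
# Blue Gene/L: 64 racks at ~$2M each, ~360 TFLOPS peak.
bluegene_cost = 64 * 2_000_000          # $128,000,000
bluegene_peak_tflops = 360

# Tesla S870: 4 x C870 at ~0.5 TFLOPS each = ~2 TFLOPS peak, ~$12,000 list.
server_tflops = 4 * 0.5
servers_needed = bluegene_peak_tflops / server_tflops   # 180 servers
tesla_cost = int(servers_needed) * 12_000               # $2,160,000

print(servers_needed, tesla_cost, bluegene_cost)
```

So roughly 180 S870s (~$2.2M) match the peak FP throughput of 64 Blue Gene racks ($128M), before adding the host servers and switches the racks would need.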
Anonymous Coward User ID: 255853 United Kingdom 06/24/2007 06:41 PM | |
Anonymous Coward User ID: 162054 United States 06/24/2007 06:54 PM | |
Anonymous Coward (OP) User ID: 206079 United States 06/24/2007 07:11 PM | |
Anonymous Coward (OP) User ID: 206079 United States 06/24/2007 07:13 PM | AND at a slower speed. There is no video output on the Tesla cards, and the speed is increased. Quoting: Anonymous Coward 234305
The speed is not increased on the Tesla. It's slower! They tend to do that with higher-end parts that may be running 24/7. |