|By Paul Bemowski||
|August 11, 2003 12:00 AM EDT||
In an SMT system, a single physical processor duplicates some of the on-chip architectural state, allowing the processor core to make greater use of available resources. The second architectural state holds another thread context, allowing the processor to more completely use its resources when an active thread encounters some type of latency.
For example, when a processor encounters a cache miss, there is a slice of time that is normally wasted while the processor makes a long-latency read from main memory. In this brief slice of time, the vast majority of the processor's resources sit idle, while the processor reports itself as busy to the operating system. In an SMT system, the processor will use an on-board thread scheduler to immediately execute the second on-chip thread context's instructions, making use of otherwise wasted cycles.
SMT does incur some overhead. When two threads contend for the same processor resources, it is the responsibility of the on-chip thread scheduler to interleave the two active threads. For this reason, in certain situations a non-HT processor will outperform an HT processor. The net effect however is an overall improvement in performance for multi-threaded applications running on HT-enabled systems.
From a hardware perspective, three subsystems must work together to enable HT: the processor, the chipset, and the BIOS.
Currently, all members of Intel's Xeon processor family support HT. Xeon here is not to be confused with PIII Xeon. When Intel converted the Xeon's architecture to the P4 core, it dropped the Pentium designation, calling the new processors simply Xeon.
Xeons currently come in three flavors: Xeon, Xeon DP, and Xeon MP. All recent versions of these processors will support HT. Some older Xeon and Xeon DP processors, commonly characterized by a smaller 256 Kb L2 cache, do not support HT. If you are purchasing a used Xeon system or used Xeon processors, be sure to confirm that they support HT.
In early 2003, Intel released the 3.06GHz P4 on 0.13 micron technology. This new P4 supports HT, and signals the introduction of HT to desktop systems. Look for Intel to continue to support HT on all of its subsequent P4 releases.
HT requires chipset and BIOS support. Most of Intel's newer chipsets are supporting HT. The following link presents a table of Intel's current server/workstation chipset offerings. The last row in the table indicates whether the chipset supports HT technology.
The Basic Input/Output System, or BIOS, allows a user to set parameters affecting system hardware, before the system boots to an operating system. As such, the BIOS is generally tightly coupled to the chipset on which it is installed. In a BIOS that supports HT, the user will have an option to enable/disable HT support on the processor/chipset. With HT enabled on the system, the BIOS presents each physical processor to the operating system as a pair of logical processors. From that point, it is the responsibility of the operating system to make intelligent use of the additional hardware resources.
Linux Support for Hyper-Threading
Given a processor/chipset/BIOS combination that supports HT, the operating system also needs to support the feature. SMT introduces many nuances that affect thread scheduler performance. The first Linux kernel with explicit support for HT was 2.4.18. Since then the 2.5.x kernel's thread scheduler has incorporated numerous enhancements that will further increase performance on HT-enabled systems.
Next, we'll look at HT support in the 2.4 and 2.5(2.6) series kernels.
Hyper-Threading in the 2.4.18+Linux Kernel
The current stable Linux kernel branch is 2.4.x, initially released in January 2001. The 2.4 kernel has since undergone extensive patching, initially for critical bug fixes, later for feature enhancements and support for new hardware.
Because the BIOS will present even a single HT-enabled processor to the OS as two logical processors, all HT configurations should use SMP (Symmetric Multi-Processing) kernels. Pre-2.4.18 SMP kernels may recognize two processors in an HT configuration; however, the scheduler is completely unaware of the logical/physical processor differentiation. The 2.4.18 patch release added some features to the stock scheduler to make it behave better with HT hardware. A 2.4.18+ kernel is strongly recommended for HT configurations.
Enabling Hyper-Threading in a 2.4 system
Given an HT-enabled hardware configuration, use the following steps to enable HT in a 2.4 kernel:
1. First, confirm that your kernel is version 2.4.18 or later, with SMP support. There are many ways to do this, the easiest is to execute the "uname -a" command in a shell. For Red Hat users, Red Hat 7.3 was the first distribution release to support HT, incorporating a 2.4.18 kernel. If you are using another distribution, check the kernel version before attempting to use HT.
2. Next, modify your bootloader (grub or lilo), adding the following parameter to any other boot parameters currently necessary for your system:
It would be wise to add this as a different boot configuration so that you can boot HT or non-HT. (To create an explicitly non-HT configuration, add the 'noht' boot flag.)
3. Finally, reboot the system. Before it restarts, enter the BIOS setup program. Under the processor options you will be able to enable or disable HT. Enable HT, and boot to the 2.4.18 or later SMP kernel with the additional parameters.
Once you have successfully booted the HT configuration, run top. If HT is properly configured, you should see twice as many CPU states as you have physical processors (two virtual CPUs per physical CPU).
Figure 2 is an example of top running on a Red Hat 7.3 system (2.4.18) with two physical Xeons and HT fully enabled. Note the CPU states 0-4, indicating the four logical processors.
Hyper-Threading on 2.4.18+Thread Scheduler
Performance testing multithreaded benchmarks under the 2.4 kernel series still shows some wide scatter in the data. This is because the scheduler still cannot make intelligent choices regarding logical/physical processors in many situations. Under some conditions, 2.4 will still schedule two active threads on the same physical CPU, causing performance degradation. This condition is often random, causing data points from multithreaded benchmarks to vary considerably. "Full" HT scheduler support was not incorporated into the kernel until 2.5.32.
Hyper-Threading in the 2.5.xLinux Kernel
As is standard in Linux kernel versioning, the 2.5.x versions of the kernel are the development branch that will become the 2.6.x stable releases. The 2.5.x kernel added a number of features to its thread scheduler that should extend the performance improvements of HT even further.
2.5.x Thread Scheduler Improvements
A scheduler patch in 2.5.32 introduced the concept of a shared runqueue. The shared runqueue allows two (logical) CPUs, which share resources like cache, to have a scheduler parallel known as a shared runqueue. The shared runqueue may have many applications, but the initial implementation was created specifically with HT in mind. This new concept optimizes the kernel thread scheduler for HT in the following ways:
- HT-aware passive load balancing: This feature addresses the physical CPU imbalance problem - one physical CPU may be running two active threads, while a second physical CPU sits idle. Passive load balancing will attempt to schedule new active threads on an idle physical processor.
- HT-aware active load balancing: Active load balancing also addresses the physical CPU imbalance problem, this time for currently active threads. If three threads are running on three logical CPUs, and one thread goes idle freeing a physical processor, the scheduler will migrate an active thread from the physical processor running two threads to a physical processor running none.
- Thread affinity: Thread affinity is important in SMP as well as SMT systems. Processors use cache memory to hold data and instructions that the processor is using at the moment. By attempting to keep threads scheduled on the same processor, the efficiency of the cache is greatly increased. Moving a thread between physical processors requires the processor to repopulate its cache from main memory, causing performance degradation.
In an SMT system, because the logical processors share cache, the thread scheduler need only attempt to keep threads attached to a physical processor. The scheduler is free to move threads between adjacent logical processors with no performance degradation due to a stale cache.
- HT-aware task pickup: This will allow the scheduler to pick up tasks on a per-physical CPU basis, rather than per-logical CPU basis. Task pickup is related to thread affinity above.
- HT-aware wakeup: This allows threads that were woken up on active logical processors with an idle sibling to be woken up on the sibling processor. (As you might imagine, sibling processors are adjacent logical processors.)
These features work together in the 2.5.32+ kernel to make more efficient use of the new hardware features of HT systems. In addition, the kernel performs in a more consistent manner by continually making optimal use of the processors. The 2.4.18 kernel still performs better as a whole on an HT system, however, it does so in a less predictable manner.
Performance Gains Using Hyper-Threading
OK, you've built a Xeon-based HT system. What kind of performance improvement can be expected? Which applications will benefit from HT, and which will suffer?
Needless to say, HT is targeted at heavily threaded applications. Single-threaded, compute-intensive applications will see minimal performance enhancements. It should be noted, however, that nearly all modern desktop and server systems make extensive use of threads. Server applications generally process socket IO on a thread-per-socket basis. Desktop applications under X Windows will often be processing socket or disk io, X calls, and the application code in parallel.
To date, performance benchmarks for HT systems have focused on server-side systems. This should not be surprising; Intel only recently released HT on a desktop-focused processor (the recent P4). A Web search will quickly find many papers from the past year detailing performance of HT systems.
A recent IBM white paper by Duc Vianney ran several benchmarks both with and without HT enabled on 2.4 and 2.5 kernels. Vianney's work showed a slight performance degradation of single-threaded processes with HT enabled, but performance improvement for the 2.4.19 kernel was approximately 30%. With the enhanced scheduler in the 2.5.32 kernel, the same benchmarks showed a 51% improvement.
Data from an upcoming Java Developer's Journal article exploring heavily threaded Java applications on HT systems indicated typical performance gains of 10-15%, with some tests indicating gains of up to 75% running Java 1.4 on a 2.4.18 HT system.
SMT is here to stay. As processors become more sophisticated, the raw speed of the processor will become even less of a factor in overall system performance due to added features like HT. Some have speculated that SMT and related technologies will spell the end of the megahertz wars.
As with any new hardware technology, software is catching up. Subsequent Linux kernel releases will make more sophisticated use of the available hardware features. Over time, Linux support for HT will mature, resulting in further performance gains.
The Linux community is waiting with bated breath for Linus and crew to tackle the final bugs in 2.5.x, and release the 2.6 Linux kernel. After a stabilization period (which could be significant), major distributions will migrate to the 2.6 kernel. All the while, HT-enabled hardware will be finding its way into enterprise server racks. When the 2.6-enabled distributions hit this hardware, server-side performance will measurably increase, with no hardware investment whatsoever.
Hyper-Threading technology specifically targets performance gains on heavily threaded applications. These applications are most commonly found in enterprise server platforms - application servers, Web servers, Web services platforms, and Java-based systems. Dell, HP (Compaq), and IBM are all putting forth powerful Xeon-based systems with 2-16 processors running Linux. If HT can improve performance by a conservative 25% in heavily threaded server applications, there's an even stronger case for Linux servers over major Unix platforms for data center use on a cost/performance basis.
Hyper-Threading technology promises to make the Intel/Linux combination even more attractive to IT managers and systems architects looking to upgrade their enterprise software platforms.
|tcx 12/05/03 07:23:10 AM EST|
very useful and detailed information.
for details search g**gle.com for
Apache Hadoop is emerging as a distributed platform for handling large and fast incoming streams of data. Predictive maintenance, supply chain optimization, and Internet-of-Things analysis are examples where Hadoop provides the scalable storage, processing, and analytics platform to gain meaningful insights from granular data that is typically only valuable from a large-scale, aggregate view. One architecture useful for capturing and analyzing streaming data is the Lambda Architecture, represent...
Mar. 27, 2017 08:15 PM EDT Reads: 6,277
SYS-CON Events announced today that Ocean9will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Ocean9 provides cloud services for Backup, Disaster Recovery (DRaaS) and instant Innovation, and redefines enterprise infrastructure with its cloud native subscription offerings for mission critical SAP workloads.
Mar. 27, 2017 07:45 PM EDT Reads: 2,171
With billions of sensors deployed worldwide, the amount of machine-generated data will soon exceed what our networks can handle. But consumers and businesses will expect seamless experiences and real-time responsiveness. What does this mean for IoT devices and the infrastructure that supports them? More of the data will need to be handled at - or closer to - the devices themselves.
Mar. 27, 2017 07:30 PM EDT Reads: 4,566
SYS-CON Events announced today that SoftLayer, an IBM Company, has been named “Gold Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York, New York. SoftLayer, an IBM Company, provides cloud infrastructure as a service from a growing number of data centers and network points of presence around the world. SoftLayer’s customers range from Web startups to global enterprises.
Mar. 27, 2017 02:45 PM EDT Reads: 1,952
SYS-CON Events announced today that Linux Academy, the foremost online Linux and cloud training platform and community, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Linux Academy was founded on the belief that providing high-quality, in-depth training should be available at an affordable price. Industry leaders in quality training, provided services, and student certification passes, its goal is to c...
Mar. 27, 2017 02:30 PM EDT Reads: 4,024
SYS-CON Events announced today that CA Technologies has been named “Platinum Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY, and the 21st International Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. CA Technologies helps customers succeed in a future where every business – from apparel to energy – is being rewritten by software. From ...
Mar. 27, 2017 02:00 PM EDT Reads: 2,060
In his session at @ThingsExpo, Eric Lachapelle, CEO of the Professional Evaluation and Certification Board (PECB), will provide an overview of various initiatives to certifiy the security of connected devices and future trends in ensuring public trust of IoT. Eric Lachapelle is the Chief Executive Officer of the Professional Evaluation and Certification Board (PECB), an international certification body. His role is to help companies and individuals to achieve professional, accredited and worldw...
Mar. 27, 2017 01:15 PM EDT Reads: 748
SYS-CON Events announced today that Loom Systems will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Founded in 2015, Loom Systems delivers an advanced AI solution to predict and prevent problems in the digital business. Loom stands alone in the industry as an AI analysis platform requiring no prior math knowledge from operators, leveraging the existing staff to succeed in the digital era. With offices in S...
Mar. 27, 2017 01:00 PM EDT Reads: 1,478
SYS-CON Events announced today that Interoute, owner-operator of one of Europe's largest networks and a global cloud services platform, has been named “Bronze Sponsor” of SYS-CON's 20th Cloud Expo, which will take place on June 6-8, 2017 at the Javits Center in New York, New York. Interoute is the owner-operator of one of Europe's largest networks and a global cloud services platform which encompasses 12 data centers, 14 virtual data centers and 31 colocation centers, with connections to 195 add...
Mar. 27, 2017 12:45 PM EDT Reads: 1,390
SYS-CON Events announced today that T-Mobile will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. As America's Un-carrier, T-Mobile US, Inc., is redefining the way consumers and businesses buy wireless services through leading product and service innovation. The Company's advanced nationwide 4G LTE network delivers outstanding wireless experiences to 67.4 million customers who are unwilling to compromise on ...
Mar. 27, 2017 11:15 AM EDT Reads: 2,388
SYS-CON Events announced today that HTBase will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. HTBase (Gartner 2016 Cool Vendor) delivers a Composable IT infrastructure solution architected for agility and increased efficiency. It turns compute, storage, and fabric into fluid pools of resources that are easily composed and re-composed to meet each application’s needs. With HTBase, companies can quickly prov...
Mar. 27, 2017 10:30 AM EDT Reads: 3,023
SYS-CON Events announced today that Infranics will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Since 2000, Infranics has developed SysMaster Suite, which is required for the stable and efficient management of ICT infrastructure. The ICT management solution developed and provided by Infranics continues to add intelligence to the ICT infrastructure through the IMC (Infra Management Cycle) based on mathemat...
Mar. 27, 2017 10:30 AM EDT Reads: 3,164
SYS-CON Events announced today that Cloudistics, an on-premises cloud computing company, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Cloudistics delivers a complete public cloud experience with composable on-premises infrastructures to medium and large enterprises. Its software-defined technology natively converges network, storage, compute, virtualization, and management into a ...
Mar. 27, 2017 09:30 AM EDT Reads: 2,130
There are 66 million network cameras capturing terabytes of data. How did factories in Japan improve physical security at the facilities and improve employee productivity? Edge Computing reduces possible kilobytes of data collected per second to only a few kilobytes of data transmitted to the public cloud every day. Data is aggregated and analyzed close to sensors so only intelligent results need to be transmitted to the cloud. Non-essential data is recycled to optimize storage.
Mar. 27, 2017 08:15 AM EDT Reads: 3,124
"I think that everyone recognizes that for IoT to really realize its full potential and value that it is about creating ecosystems and marketplaces and that no single vendor is able to support what is required," explained Esmeralda Swartz, VP, Marketing Enterprise and Cloud at Ericsson, in this SYS-CON.tv interview at @ThingsExpo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Mar. 27, 2017 08:00 AM EDT Reads: 4,354
SYS-CON Events announced today that Outlyer, a monitoring service for DevOps and operations teams, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Outlyer is a monitoring service for DevOps and Operations teams running Cloud, SaaS, Microservices and IoT deployments. Designed for today's dynamic environments that need beyond cloud-scale monitoring, we make monitoring effortless so you ...
Mar. 27, 2017 02:15 AM EDT Reads: 4,305
My team embarked on building a data lake for our sales and marketing data to better understand customer journeys. This required building a hybrid data pipeline to connect our cloud CRM with the new Hadoop Data Lake. One challenge is that IT was not in a position to provide support until we proved value and marketing did not have the experience, so we embarked on the journey ourselves within the product marketing team for our line of business within Progress. In his session at @BigDataExpo, Sum...
Mar. 27, 2017 01:45 AM EDT Reads: 3,054
Keeping pace with advancements in software delivery processes and tooling is taxing even for the most proficient organizations. Point tools, platforms, open source and the increasing adoption of private and public cloud services requires strong engineering rigor - all in the face of developer demands to use the tools of choice. As Agile has settled in as a mainstream practice, now DevOps has emerged as the next wave to improve software delivery speed and output. To make DevOps work, organization...
Mar. 27, 2017 01:15 AM EDT Reads: 1,983
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm.
Mar. 27, 2017 12:45 AM EDT Reads: 2,289
What sort of WebRTC based applications can we expect to see over the next year and beyond? One way to predict development trends is to see what sorts of applications startups are building. In his session at @ThingsExpo, Arin Sime, founder of WebRTC.ventures, will discuss the current and likely future trends in WebRTC application development based on real requests for custom applications from real customers, as well as other public sources of information,
Mar. 27, 2017 12:30 AM EDT Reads: 944