By Paul Bemowski
August 11, 2003 12:00 AM EDT
In a simultaneous multithreading (SMT) system, a single physical processor duplicates some of the on-chip architectural state, allowing the processor core to make greater use of available resources. The second architectural state holds another thread context, allowing the processor to use its resources more fully when an active thread encounters some type of latency.
For example, when a processor encounters a cache miss, there is a slice of time that is normally wasted while the processor makes a long-latency read from main memory. In this brief slice of time, the vast majority of the processor's resources sit idle, while the processor reports itself as busy to the operating system. In an SMT system, the processor will use an on-board thread scheduler to immediately execute the second on-chip thread context's instructions, making use of otherwise wasted cycles.
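The latency-hiding effect described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not a model of any real Xeon: the function name `run_cycles`, the operation strings, and the five-cycle miss latency are all hypothetical, and the core is simplified to a single issue slot per cycle, so the sketch captures only the filling of stall cycles, not the sharing of execution units within a cycle.

```python
def run_cycles(threads, miss_latency):
    """Return the cycles needed to finish every thread on one toy core.

    Each thread is a string of operations: 'c' is one compute cycle and
    'm' is a cache miss that blocks that thread for `miss_latency` cycles.
    The core holds one hardware context per thread and issues at most one
    operation per cycle, taken from the first thread that is not stalled.
    """
    next_op = [0] * len(threads)    # index of the next operation per thread
    ready_at = [0] * len(threads)   # first cycle at which each thread may issue
    cycle = 0
    while any(next_op[i] < len(ops) for i, ops in enumerate(threads)):
        for i, ops in enumerate(threads):
            if next_op[i] < len(ops) and ready_at[i] <= cycle:
                if ops[next_op[i]] == "m":
                    ready_at[i] = cycle + miss_latency
                next_op[i] += 1
                break               # the single issue slot is used up
        cycle += 1
    # A miss issued last must still complete before the thread is finished.
    return max(cycle, *ready_at)


one_thread = run_cycles(["cmcm"], miss_latency=5)
two_contexts = run_cycles(["cmcm", "cmcm"], miss_latency=5)
print(one_thread * 2, two_contexts)
```

With these inputs the model needs 24 cycles to run the two threads back to back on a single context and 14 cycles when both contexts are active, because the second thread issues while the first waits on memory.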
SMT does incur some overhead. When two threads contend for the same processor resources, the on-chip thread scheduler must interleave them. For this reason, in certain situations a processor without Hyper-Threading (HT), Intel's implementation of SMT, will outperform an HT processor. The net effect, however, is an overall performance improvement for multithreaded applications running on HT-enabled systems.
From a hardware perspective, three subsystems must work together to enable HT: the processor, the chipset, and the BIOS.
Currently, all members of Intel's Xeon processor family support HT. Xeon here is not to be confused with PIII Xeon. When Intel converted the Xeon's architecture to the P4 core, it dropped the Pentium designation, calling the new processors simply Xeon.
Xeons currently come in three flavors: Xeon, Xeon DP, and Xeon MP. All recent versions of these processors support HT. Some older Xeon and Xeon DP processors, commonly characterized by a smaller 256 KB L2 cache, do not support HT. If you are purchasing a used Xeon system or used Xeon processors, be sure to confirm that they support HT.
In late 2002, Intel released the 3.06GHz P4 on 0.13 micron technology. This new P4 supports HT, and signals the introduction of HT to desktop systems. Look for Intel to continue to support HT on all of its subsequent P4 releases.
HT requires chipset and BIOS support. Most of Intel's newer chipsets support HT. The following link presents a table of Intel's current server/workstation chipset offerings. The last row in the table indicates whether the chipset supports HT.
The Basic Input/Output System, or BIOS, allows a user to set parameters affecting system hardware, before the system boots to an operating system. As such, the BIOS is generally tightly coupled to the chipset on which it is installed. In a BIOS that supports HT, the user will have an option to enable/disable HT support on the processor/chipset. With HT enabled on the system, the BIOS presents each physical processor to the operating system as a pair of logical processors. From that point, it is the responsibility of the operating system to make intelligent use of the additional hardware resources.
Linux Support for Hyper-Threading
Given a processor/chipset/BIOS combination that supports HT, the operating system also needs to support the feature. SMT introduces many nuances that affect thread scheduler performance. The first Linux kernel with explicit support for HT was 2.4.18. Since then the 2.5.x kernel's thread scheduler has incorporated numerous enhancements that will further increase performance on HT-enabled systems.
Next, we'll look at HT support in the 2.4 and 2.5 (2.6) series kernels.
Hyper-Threading in the 2.4.18+ Linux Kernel
The current stable Linux kernel branch is 2.4.x, initially released in January 2001. The 2.4 kernel has since undergone extensive patching, initially for critical bug fixes, later for feature enhancements and support for new hardware.
Because the BIOS will present even a single HT-enabled processor to the OS as two logical processors, all HT configurations should use SMP (Symmetric Multi-Processing) kernels. Pre-2.4.18 SMP kernels may recognize two processors in an HT configuration; however, the scheduler is completely unaware of the logical/physical processor differentiation. The 2.4.18 patch release added some features to the stock scheduler to make it behave better with HT hardware. A 2.4.18+ kernel is strongly recommended for HT configurations.
Enabling Hyper-Threading in a 2.4 system
Given an HT-enabled hardware configuration, use the following steps to enable HT in a 2.4 kernel:
1. First, confirm that your kernel is version 2.4.18 or later, with SMP support. There are many ways to do this; the easiest is to execute the "uname -a" command in a shell. For Red Hat users, Red Hat 7.3 was the first distribution release to support HT, incorporating a 2.4.18 kernel. If you are using another distribution, check the kernel version before attempting to use HT.
2. Next, modify your bootloader (grub or lilo), adding the following parameter to any other boot parameters currently necessary for your system:
It would be wise to add this as a different boot configuration so that you can boot HT or non-HT. (To create an explicitly non-HT configuration, add the 'noht' boot flag.)
3. Finally, reboot the system. Before it restarts, enter the BIOS setup program. Under the processor options you will be able to enable or disable HT. Enable HT, and boot to the 2.4.18 or later SMP kernel with the additional parameters.
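The version check in step 1 can also be scripted. The following is a minimal sketch, assuming the kernel release looks like `2.4.18-3smp` and that SMP builds include the string `SMP` in the version field reported by `uname`; the helper names `parse_release` and `kernel_ready_for_ht` are hypothetical, and the sample strings in the comments are illustrative rather than taken from a real system.

```python
import re


def parse_release(release):
    """Return (major, minor, patch) from a kernel release such as '2.4.18-3smp'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def kernel_ready_for_ht(release, version):
    """True if the release is 2.4.18 or later and the build string mentions SMP.

    Assumption: SMP kernels advertise 'SMP' in the uname version field,
    for example '#1 SMP Thu Apr 18 07:17:10 EDT 2002' (illustrative value).
    """
    return parse_release(release) >= (2, 4, 18) and "SMP" in version
```

On a running system, `platform.release()` and `platform.version()` from the Python standard library return the same release and version fields that `uname` prints, so they can be passed directly to `kernel_ready_for_ht`.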
Once you have successfully booted the HT configuration, run top. If HT is properly configured, you should see twice as many CPU states as you have physical processors (two virtual CPUs per physical CPU).
Figure 2 is an example of top running on a Red Hat 7.3 system (2.4.18) with two physical Xeons and HT fully enabled. Note the CPU states 0-3, indicating the four logical processors.
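The same check can be made without top by counting entries in /proc/cpuinfo. This is a hedged sketch: it assumes the kernel reports a `processor` line for each logical CPU and a `physical id` line identifying the package, as x86 Linux kernels commonly do; whether a particular 2.4 kernel reports `physical id` is an assumption, and the function name `count_processors` is hypothetical.

```python
def count_processors(cpuinfo_text):
    """Return (logical, physical) processor counts from /proc/cpuinfo-style text.

    `physical` is None when no 'physical id' field is present, since the
    text then carries no information about physical packages.
    """
    logical = 0
    physical_ids = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor entries
        key = key.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            physical_ids.add(value.strip())
    physical = len(physical_ids) if physical_ids else None
    return logical, physical


# Illustrative input for a two-Xeon system with HT enabled (not real output).
sample = "\n\n".join(
    f"processor\t: {cpu}\nphysical id\t: {cpu // 2}" for cpu in range(4)
)
print(count_processors(sample))
```

On a live system the text would come from reading `/proc/cpuinfo`; a logical count that is twice the physical count matches the situation shown in Figure 2.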
Hyper-Threading on the 2.4.18+ Thread Scheduler
Performance testing multithreaded benchmarks under the 2.4 kernel series still shows some wide scatter in the data. This is because the scheduler still cannot make intelligent choices regarding logical/physical processors in many situations. Under some conditions, 2.4 will still schedule two active threads on the same physical CPU, causing performance degradation. This condition is often random, causing data points from multithreaded benchmarks to vary considerably. "Full" HT scheduler support was not incorporated into the kernel until 2.5.32.
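A short calculation shows why this scatter is so visible. As a sketch only, assume the scheduler places two active threads on distinct logical CPUs uniformly at random (the 2.4 scheduler is not literally random, so this is a simplifying assumption); the function name `same_package_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def same_package_probability(packages, threads_per_package=2, active=2):
    """Probability that all `active` threads land on one physical CPU.

    Assumes each set of distinct logical CPUs is equally likely.
    """
    # Label each logical CPU with the physical package it belongs to.
    logical_cpus = [
        pkg for pkg in range(packages) for _ in range(threads_per_package)
    ]
    placements = list(combinations(range(len(logical_cpus)), active))
    collisions = sum(
        1
        for placement in placements
        if len({logical_cpus[cpu] for cpu in placement}) == 1
    )
    return Fraction(collisions, len(placements))


print(same_package_probability(packages=2))
```

Under this assumption, two threads on a dual-Xeon HT system share a physical CPU in one placement out of three, which is frequent enough to spread benchmark results noticeably from run to run.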
Hyper-Threading in the 2.5.x Linux Kernel
As is standard in Linux kernel versioning, the 2.5.x versions of the kernel are the development branch that will become the 2.6.x stable releases. The 2.5.x kernel added a number of features to its thread scheduler that should extend the performance improvements of HT even further.
2.5.x Thread Scheduler Improvements
A scheduler patch in 2.5.32 introduced the concept of a shared runqueue. With a shared runqueue, two logical CPUs that share resources such as cache are served by a single runqueue rather than one runqueue each. The shared runqueue may have many applications, but the initial implementation was created specifically with HT in mind. This new concept optimizes the kernel thread scheduler for HT in the following ways:
- HT-aware passive load balancing: This feature addresses the physical CPU imbalance problem - one physical CPU may be running two active threads, while a second physical CPU sits idle. Passive load balancing will attempt to schedule new active threads on an idle physical processor.
- HT-aware active load balancing: Active load balancing also addresses the physical CPU imbalance problem, this time for currently active threads. If three threads are running on three logical CPUs, and one thread goes idle freeing a physical processor, the scheduler will migrate an active thread from the physical processor running two threads to a physical processor running none.
- Thread affinity: Thread affinity is important in SMP as well as SMT systems. Processors use cache memory to hold data and instructions that the processor is using at the moment. By attempting to keep threads scheduled on the same processor, the efficiency of the cache is greatly increased. Moving a thread between physical processors requires the processor to repopulate its cache from main memory, causing performance degradation.
In an SMT system, because the logical processors share cache, the thread scheduler need only attempt to keep threads attached to a physical processor. The scheduler is free to move threads between adjacent logical processors with no performance degradation due to a stale cache.
- HT-aware task pickup: This will allow the scheduler to pick up tasks on a per-physical CPU basis, rather than per-logical CPU basis. Task pickup is related to thread affinity above.
- HT-aware wakeup: This allows threads that were woken up on active logical processors with an idle sibling to be woken up on the sibling processor. (As you might imagine, sibling processors are adjacent logical processors.)
These features work together in the 2.5.32+ kernel to make more efficient use of the new hardware features of HT systems. In addition, the kernel performs in a more consistent manner by continually making optimal use of the processors. The 2.4.18 kernel still performs better overall with HT enabled than without; however, it does so in a less predictable manner.
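The placement rule behind HT-aware passive load balancing can be sketched in a few lines. This is an illustration of the idea only, not the kernel's implementation: the function name `pick_cpu` and the list-of-booleans representation are hypothetical, and sibling logical CPUs are assumed to be numbered adjacently (0-1, 2-3, and so on).

```python
def pick_cpu(busy, siblings_per_package=2):
    """Choose a logical CPU for a new thread, preferring an idle physical CPU.

    `busy` holds one boolean per logical CPU. Returns the index of the chosen
    logical CPU, or None if every logical CPU is already busy.
    """
    packages = [
        list(range(start, start + siblings_per_package))
        for start in range(0, len(busy), siblings_per_package)
    ]
    # First choice: a physical CPU on which no sibling is running a thread.
    for package in packages:
        if not any(busy[cpu] for cpu in package):
            return package[0]
    # Fallback: any idle logical CPU, even if its sibling is busy.
    for cpu, is_busy in enumerate(busy):
        if not is_busy:
            return cpu
    return None
```

For example, with one thread already running on logical CPU 0 of a dual-Xeon system, `pick_cpu([True, False, False, False])` skips the idle sibling (logical CPU 1) and selects logical CPU 2 on the idle physical processor. A scheduler unaware of the physical layout could just as easily choose logical CPU 1.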
Performance Gains Using Hyper-Threading
OK, you've built a Xeon-based HT system. What kind of performance improvement can be expected? Which applications will benefit from HT, and which will suffer?
Needless to say, HT is targeted at heavily threaded applications. Single-threaded, compute-intensive applications will see minimal performance enhancements. It should be noted, however, that nearly all modern desktop and server systems make extensive use of threads. Server applications generally process socket I/O on a thread-per-socket basis. Desktop applications under the X Window System will often process socket or disk I/O, X calls, and application code in parallel.
To date, performance benchmarks for HT systems have focused on server-side systems. This should not be surprising; Intel only recently released HT on a desktop-focused processor (the recent P4). A Web search will quickly find many papers from the past year detailing performance of HT systems.
A recent IBM white paper by Duc Vianney reported several benchmarks run both with and without HT enabled on 2.4 and 2.5 kernels. Vianney's work showed a slight performance degradation for single-threaded processes with HT enabled, but an improvement of approximately 30% on multithreaded benchmarks under the 2.4.19 kernel. With the enhanced scheduler in the 2.5.32 kernel, the same benchmarks showed a 51% improvement.
Data from an upcoming Java Developer's Journal article exploring heavily threaded Java applications on HT systems indicated typical performance gains of 10-15%, with some tests indicating gains of up to 75% running Java 1.4 on a 2.4.18 HT system.
SMT is here to stay. As processors become more sophisticated, the raw speed of the processor will become even less of a factor in overall system performance due to added features like HT. Some have speculated that SMT and related technologies will spell the end of the megahertz wars.
As with any new hardware technology, software is catching up. Subsequent Linux kernel releases will make more sophisticated use of the available hardware features. Over time, Linux support for HT will mature, resulting in further performance gains.
The Linux community is waiting with bated breath for Linus and crew to tackle the final bugs in 2.5.x, and release the 2.6 Linux kernel. After a stabilization period (which could be significant), major distributions will migrate to the 2.6 kernel. All the while, HT-enabled hardware will be finding its way into enterprise server racks. When the 2.6-enabled distributions hit this hardware, server-side performance will measurably increase, with no hardware investment whatsoever.
Hyper-Threading technology specifically targets performance gains on heavily threaded applications. These applications are most commonly found in enterprise server platforms - application servers, Web servers, Web services platforms, and Java-based systems. Dell, HP (Compaq), and IBM are all putting forth powerful Xeon-based systems with 2-16 processors running Linux. If HT can improve performance by a conservative 25% in heavily threaded server applications, there's an even stronger case for Linux servers over major Unix platforms for data center use on a cost/performance basis.
Hyper-Threading technology promises to make the Intel/Linux combination even more attractive to IT managers and systems architects looking to upgrade their enterprise software platforms.