
Time for a new installation paradigm, Part 2

Today's package managers fail to make installing & upgrading software easy & error-free

(LinuxWorld) — This is Part 2 in a series calling for a radically new approach to Linux software installation. Part 1 examined many (though not all) of the problems with the current approaches to software installation. This time, we'll take a closer look at the technological considerations behind one of the biggest issues for software installation: shared libraries. The best way to solve the problem of shared libraries is to understand why they pose a potential problem and how Linux uses them, so let's explore these issues.

Shared libraries remain the pivotal issue for software installation. Of course, the talent and attention to detail of programmers and library maintainers determine how compatible different versions of a shared library actually are. In general, however, you can expect fewer compatibility issues as the changes to the version numbers move farther to the right. One can usually expect libsomething version 1.x to be incompatible with libsomething version 2.x. You are less likely to experience problems when you move from libsomething 1.3 to libsomething 1.4, and even less likely to have trouble moving from libsomething 1.4.3 to libsomething 1.4.4.

The nightmare: Fun with ldd

Linux is not immune to DLL hell, but the Linux version of DLL hell generally takes a different form than it does on Windows. One usually crosses over into Windows DLL hell when an installed program overwrites a DLL file (a Windows shared library) with one that causes problems. This often poses a catch-22: if you restore the old DLL, the new program breaks; if you keep the new DLL, the old program breaks. The Windows API does provide ways to avoid this problem, but few people have used them. It is certainly possible to make the same mistake on Linux, but Linux's shared-library maintainers have traditionally been more careful about compatibility issues, so fewer problems arise, even when one overwrites a widely used shared library with a newer version.

One reason a Linux library doesn't typically get overwritten by a later version is that shared libraries on Linux are generally installed under filenames that include the version of the library, such as libsomething.so.6.0.3. Usually, this is of practical consequence only when you move from one major version of a library to another. Normally, you don't have both libsomething.so.6.0.3 and libsomething.so.6.2.5 on the same system in the same directory, and if you do, one of them is likely to be ignored. Programs rarely request a library by so specific a version. They tend to load libsomething.so.6, which exists as a symbolic link to the latest installed version, in this case libsomething.so.6.2.5.

You can use a GNU utility called ldd to shed some light on how this all plays out in practice (WARNING: Although highly unlikely, it is possible for programs to exploit security holes in the ldd program, so use it at your own risk).

The ldd program prints information about an application's shared-library dependencies and what the program will use when you run it. For example, when I type ldd /usr/bin/mutt, the output looks something like this:

libncurses.so.5 => /lib/libncurses.so.5 (0x419d4000)
libsasl.so.7 => /usr/lib/libsasl.so.7 (0x4001c000)
libdb-4.0.so => /usr/lib/libdb-4.0.so (0x44b02000)
libdl.so.2 => /lib/libdl.so.2 (0x41126000)
libidn.so.9 => /usr/lib/libidn.so.9 (0x40027000)
libc.so.6 => /lib/libc.so.6 (0x41014000)
libdb2.so.2 => /lib/libdb2.so.2 (0x4312b000)
libcrypt.so.1 => /lib/libcrypt.so.1 (0x41a15000)
libpam.so.0 => /lib/libpam.so.0 (0x41a61000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x41000000)

This tells you that, given your current system configuration and the way the program mutt was compiled and built, mutt will find and load the libraries listed above.

Let's look at one of the libraries in the above list. When I look at libdb2.so.2 on my system, I can see that this particular library is not a file, but a symbolic link to the file libdb2.so.2.7.7. Make a mental note of this, because it will play an important part later in our search for a solution.
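
You can see this for yourself with a long directory listing. On my system, the output looks something like this (the owner, size, and date are illustrative; your version numbers will differ):

ls -l /lib/libdb2.so.2
lrwxrwxrwx    1 root     root    15 Jan  1 12:00 /lib/libdb2.so.2 -> libdb2.so.2.7.7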

What happens if we change the name of this symbolic link? For reasons that will become clear in a moment, let's assume that you have only one file or symbolic link called libdb2.so.2 on your system, and that it resides in the /lib directory. Let's see what happens when we change to the /lib directory and rename the symbolic link from libdb2.so.2 to something else, say, libdb2.so.2.old. When we run ldd /usr/bin/mutt again, we should see the following change in the output line for this library:

libdb2.so.2 => not found

The application can no longer find the library, even though it still exists as the file libdb2.so.2.7.7. If you've been trying this out for yourself, do not rename the symbolic link back to libdb2.so.2. Instead, run another library tool called ldconfig. This program examines /lib, /usr/lib, and any other library paths listed in the configuration file /etc/ld.so.conf, and (among other things) creates standard symbolic links for the libraries it finds. If your system is configured like mine, then ldconfig will recreate the symbolic link /lib/libdb2.so.2. (You can delete the /lib/libdb2.so.2.old symbolic link now.)
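
If you want to reproduce the whole experiment as a single session, it looks roughly like this. It assumes the layout described above, requires root privileges, and temporarily breaks every program that uses the library, so try it only on a machine you can afford to disturb:

cd /lib
mv libdb2.so.2 libdb2.so.2.old    # break the standard link
ldd /usr/bin/mutt                 # now reports: libdb2.so.2 => not found
ldconfig                          # rebuilds the standard symbolic links
ldd /usr/bin/mutt                 # reports: libdb2.so.2 => /lib/libdb2.so.2 (...)
rm libdb2.so.2.old                # clean up the leftover link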

If you're thinking way ahead, you might make the mistake of assuming that Linux expects all file names for shared libraries to have the major version at the end of the file name. However, if you look carefully at the first example of ldd output, you'll see one exception in the list, libdb-4.0.so. In other words, one cannot count on this particular rule of thumb.

Now let's move the library file libdb2.so.2.7.7 and its symbolic link to the directory /usr/lib and then run ldd /usr/bin/mutt again. The line that refers to this library should change to read something like this:

libdb2.so.2 => /usr/lib/libdb2.so.2 (0x4312b000)

Searching the paths

The system follows a built-in library search path, which includes both /lib and /usr/lib, in that order (older loaders actually searched these two locations in reverse). So, as ldd demonstrates, mutt will still find the needed library even though we moved it. If we move the library to a directory that is not on the search path at all, however, mutt fails to find it once again, and the "not found" message reappears in the ldd output.
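
For example, assuming the library currently lives in /usr/lib, and using an arbitrary directory such as /opt/oldlibs that no entry in /etc/ld.so.conf mentions:

mkdir -p /opt/oldlibs
mv /usr/lib/libdb2.so.2* /opt/oldlibs/
ldd /usr/bin/mutt                        # reports: libdb2.so.2 => not found
mv /opt/oldlibs/libdb2.so.2* /usr/lib/   # put things back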

Now let's assume that you have three versions of libdb2 on your system. One is version 2.7.7, another is version 2.7.6 and the third is version 2.1.8. (Note: I chose the second and third version numbers at random for the sake of example, so please don't e-mail me if no such versions ever existed.)

These libraries reside in the following places, with the corresponding symbolic links:

/lib/libdb2.so.2 -> /lib/libdb2.so.2.7.7
/usr/lib/libdb2.so.2 -> /usr/lib/libdb2.so.2.7.6
/usr/local/lib/libdb2.so.2 -> /usr/local/lib/libdb2.so.2.1.8

When you run ldd /usr/bin/mutt, you are most likely to see that it loads the library from the /lib directory:

libdb2.so.2 => /lib/libdb2.so.2 (0x4312b000)

You can tell the system to search these library paths in some other order by changing the settings in /etc/ld.so.conf or by setting an environment variable such as LD_LIBRARY_PATH. Please read the sidebar for reasons why messing with LD_LIBRARY_PATH is a bad idea. It is the easiest way to demonstrate one of the principles of how libraries are loaded, however, so I hope you'll excuse its use here for illustration. For example, run the following command to make the loader search /usr/lib before /lib:

export LD_LIBRARY_PATH=/usr/lib:/lib

When you now run ldd /usr/bin/mutt, you should see that it finds the library in /usr/lib instead.

Likewise, if you run the following command, ldd /usr/bin/mutt will tell you it loads the library from /usr/local/lib:

export LD_LIBRARY_PATH=/usr/local/lib:/usr/lib:/lib

If there are no compatibility problems between versions 2.7.6 and 2.7.7, then it shouldn't matter whether mutt finds the library in /lib or /usr/lib. If version 2.1.8 has compatibility problems, however, it makes a big difference if mutt tries to load the library from /usr/local/lib. The mutt program may refuse to load, crash during execution, or malfunction in minor, unpredictable ways. One might think that it will either malfunction or fail outright depending on the severity of the compatibility problem, but that is not always true. It often depends on how one uses the dlopen() function, which is the function that loads a shared library. If you call it one way, the linker tries to resolve undefined symbols in the code only as they are needed, so the program may function (more or less) until it hits a symbol the linker cannot resolve. If you call it another way, the linker resolves all the undefined symbols immediately, and the program fails up front if any symbol cannot be resolved.
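
Here is a minimal sketch in C of the difference. RTLD_LAZY and RTLD_NOW are the standard dlopen() flags for lazy and immediate resolution; the library name comes from our running example, and the program is purely illustrative:

#include <stdio.h>
#include <dlfcn.h>

int main(void)
{
    /* RTLD_LAZY resolves function symbols only when they are first
       called, so an incompatible library may not cause trouble until
       the program is deep into its run. RTLD_NOW resolves every
       undefined symbol immediately, so the failure, if there is one,
       happens right here. A real program would pick one or the other. */
    void *handle = dlopen("libdb2.so.2", RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    dlclose(handle);
    return 0;
}

Compile it against the dynamic-loading library, for example: gcc -o trydl trydl.c -ldl (trydl is a made-up name).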

If it were easy to control or predict how libraries are loaded, the ability to do both might make it easier to install and manage software. As it turns out, it's not easy to predict or control, but it is not impossible, either.

Traps in LD_LIBRARY_PATH

Why is messing with LD_LIBRARY_PATH a bad idea? On the surface, it may seem as if one could use LD_LIBRARY_PATH to solve compatibility problems: just install all the libraries your applications need, and set the variable on a per-application basis so that each application finds the correct libraries.

However, LD_LIBRARY_PATH is subject to various subtle problems. For one thing, the program loader ignores LD_LIBRARY_PATH if the executable file sets the user ID or group ID when you run it (that is, if the setuid or setgid property is set on the executable file).

You can also run into confusing situations where you make one application run properly with LD_LIBRARY_PATH, after which other applications mysteriously break. One possible explanation is that the application you fixed launched other applications, which inherited the custom LD_LIBRARY_PATH setting. Those other applications probably expected the default library search order, not the modified one.

The bottom line is that you generally want to solve library-path problems some way other than with LD_LIBRARY_PATH.

Linux has changed the way it handles shared libraries over time, but as far as I can tell, here is how the current Linux loader searches for shared libraries (in order of priority):

  1. If the programmer passes a fully qualified path (one that starts with "/") to the loader function dlopen(), the loader loads the library from that path, if it exists.

    Otherwise, the loader searches for the library using the following order of preference:

  2. The directories listed in the environment variable LD_LIBRARY_PATH, if it is set
  3. The contents of /etc/ld.so.cache (which you generate by running ldconfig against the paths in /etc/ld.so.conf)
  4. The default search path, which is /lib and then /usr/lib.
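
You can check step 3 directly: the -p switch to ldconfig prints the contents of the cache. Something like the following (output abridged, and your paths will vary) shows exactly which copy of the library the loader will pick:

ldconfig -p | grep libdb2
        libdb2.so.2 (libc6) => /lib/libdb2.so.2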

Given only the above, the following factors can be crucial to proper library handling:

  1. The settings in /etc/ld.so.conf
  2. Whether the ldconfig program was executed recently
  3. Whether the shared libraries load other shared libraries
  4. Whether the main program loads plugins or launches other applications
  5. The settings of environment variables such as LD_LIBRARY_PATH and PATH
  6. Specific environment variables for the program(s)
  7. Configuration settings for the program, plugins, and child programs
  8. Configuration settings for the environment (KDE or GNOME configuration, for example)
  9. Whether the program uses setgid or setuid
  10. Whether you have duplicate or conflicting libraries in /lib and /usr/lib
  11. Link-time settings vs. run-time settings (for example, whether one uses the -rpath switch for the linker, which embeds the runtime link path into the executable; see the example after this list)
  12. Many other factors...
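
To make item 11 concrete, here is roughly what a link command using the -rpath switch might look like; myapp, myapp.c, and /opt/myapp/lib are made-up names for illustration:

gcc -o myapp myapp.c -L/opt/myapp/lib -ldb2 -Wl,-rpath,/opt/myapp/lib

The -Wl,-rpath portion embeds /opt/myapp/lib in the executable itself, and the loader typically consults that embedded path ahead of the locations described above, regardless of LD_LIBRARY_PATH.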

These are many of the issues we'll have to consider when creating a new installation paradigm, but not nearly all of them. We will lay more of the foundation in the next article. After that, we can begin to pull it all together and assess what it would take to make a radical but positive change in software installation and management.

Nicholas Petreley is a computer consultant and author in Asheville, NC.
