Time for a new installation paradigm, Part 4

Our Hero sounds off about how application developers and packagers can improve the software-installation process

(LinuxWorld) — It is finally that time in the series to formulate changes and new approaches to software installation on Linux. Let's first summarize the important factors, the primary goals and what we have learned. Here are the important factors:

  • Administration and troubleshooting are very expensive
  • Any given application may share files with other applications
  • Systems don't necessarily store shared files in the same places
  • People often install the latest versions of software
  • Unofficial versions of software are often required
  • RAM is cheap
  • Disk storage is cheap
  • CPU power is cheap
  • Broadband access, where available, is (relatively) cheap
  • CDs are cheaper still

Here are the primary goals:

  • Installation should be easy for amateurs, flexible for professionals
  • Installation should consume as little time as possible
  • Installations and updates should not break anything on your system
  • Updates and maintenance should be as automatic and hands-off as possible, where appropriate
  • The software-installation process should be distribution-agnostic
  • The software-installation process should be version-agnostic for any given distribution

Here is what we learned:

  • Packaging systems are distribution-agnostic, but packages are distribution-centric
  • Unofficial packages can cause problems, even for the target distribution
  • Shared libraries are a key reason why installations are fragile
  • Package managers tend to resolve library dependencies by package, not by library
  • It is difficult to predict how any given application will find shared libraries
  • Applications may search for libraries depending on system/user configuration
  • Applications may search for libraries depending on how they are coded
  • It may be possible to learn something from how automake/autoconf/libtool utilities address platform compatibility issues

We must address two specific groups to solve installation problems:

  1. Application developers: those who design, build and test the applications.
  2. Application packagers: those who design and manage the installation process for the applications.

These can be virtual groups, of course, because the same person or persons may be responsible for everything from start to finish. However, it may be easier to understand the issues if we separate the procedures and recommendations.

Rules for application developers

Given all of the above, there are some very basic and simple procedures application developers can follow that will simplify the task of creating a more ideal Linux software installation paradigm.

1. Use static linking whenever possible

Application developers should examine more carefully whether their application benefits from shared libraries enough to justify their use. If the application doesn't really need shared libraries, then it shouldn't use them. This simple rule will make any such application far more distribution-agnostic, and it does so without even having to address any of the more complicated installation issues.

The reasoning behind this rule is that static linking eliminates the potential traps caused by shared libraries, at the cost of disk space and memory. Given that disk storage and memory are cheap, whereas administration is expensive, there is some threshold at which it is more cost-effective to link statically than dynamically.

Suppose your high-end database server is meant to be the primary application on a machine. You have to make a choice: will you link to libthingy.so.3 dynamically, or will you statically link libthingy into your application? One must examine the following:

  • The size of libthingy
  • The number of instances of your application that use libthingy
  • The likely number of other applications that will load libthingy

The following numbers are hypothetical and do not represent the true tradeoff, but they should serve well enough to make the point. If libthingy is 5K, and your application launches a maximum of 10 instances, all of which are statically linked with libthingy, you would only save about 45K by linking to libthingy dynamically. In normal environments, that is hardly worth the risk of having your application break because some other build or package overwrites the shared version of libthingy.

If there are as many as 10 other applications using libthingy that are likely to run simultaneously with your database server, then we're still talking about a negligible use of memory. It is also possible that those other applications will indeed load the shared library libthingy.so.3. If you chose to link statically, it won't matter to those applications.

Client applications are less likely to benefit from static linking, especially if the client application is built to use a sophisticated GUI framework such as GNOME, KDE, Qt, et al. One might be able to justify static linking in some cases, but there's no point in loading the entire Qt toolkit for every application that uses it. On the other hand, it doesn't always hurt to offer a statically linked version of your application. We'd like to avoid this, as a rule, but it is worthwhile to note that the Opera browser is available both statically and dynamically linked to Qt. The statically linked version has solved many a problem for Opera users on Linux.

2. Design the application to look for resources in separate, identifiable directory trees

Among the problems that users of client applications encounter, one comes up most often: they upgrade to a new version of a GUI framework, such as Qt 3.1, but need to run one or more applications that are available only for Qt 3.0 or Qt 2.0.

Current directory structures and packaging methods do not work well in this case, as your distribution is likely to assume that when you upgrade from Qt 2.0 to Qt 3.0 or 3.1, the prior version no longer exists. This leads to dependency problems and many other installation headaches.

Several readers jumped way ahead of me by suggesting variations on the following solution. Install commonly used frameworks in their own directories, identified by framework and version number according to an open-standards-regulated system. Because there is no such system at present, here is an arbitrary example.

The root directory for Qt 3.1 might be /usr/qt/3.1/ (or /opt/qt/3.1/ for those who would argue against loading up the /usr partition), and the root directory for Qt 2.0 might be /usr/qt/2.0/. Application developers should use one or more of the various compile and build methods available to make any Qt-based application look for the Qt libraries in the appropriate version-specific directories first, then search elsewhere only if the libraries are not found.
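
What this might look like in practice: a minimal sketch, assuming the hypothetical directory scheme above rather than any existing standard, in which the application builds an ordered list of candidate directories and tries the version-specific tree first. (Compile with -ldl.)

    #include <stdio.h>
    #include <dlfcn.h>

    /* Try the version-specific framework directory first, then fall
     * back to the conventional system locations. The paths here are
     * illustrative, not a proposed standard. */
    static void *load_qt_library(const char *soname)
    {
        const char *dirs[] = {
            "/usr/qt/3.1/lib",   /* version-specific tree: searched first */
            "/usr/lib",          /* conventional fallbacks */
            "/usr/local/lib",
            NULL
        };
        char path[4096];
        void *handle = NULL;
        int i;

        for (i = 0; dirs[i] != NULL && handle == NULL; i++) {
            snprintf(path, sizeof(path), "%s/%s", dirs[i], soname);
            handle = dlopen(path, RTLD_NOW);
        }
        return handle;  /* NULL if the library was found nowhere */
    }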

3. Create and adopt a new dynamic-link wrapper

This is arguably the most radical of these rules, but it could prove to be one of the most beneficial. As noted in the previous installment in this series, libtool is a utility, and its companion ltdl is a wrapper for the Linux dynamic-library loader. These tools hide the differences in how the various Unix platforms support statically and dynamically linked libraries.

Libtool and ltdl do not address the problems of installing and using shared libraries on different distributions or different versions of a distribution. But they do prove one thing — that developers are willing to use tools and wrappers to solve compatibility issues. Why not enhance these tools, expand the ltdl dynamic link wrapper or create a whole new wrapper to address these problems?

The function dlopen() is the default method one would use to load a shared library. The function lt_dlopen() is the ltdl replacement for dlopen(). Let's imagine that we create a new "LinuxWorld" wrapper, where lw_dlopen() replaces the other functions.

We could design our new function to handle the library search and dynamic linking in a more elegant fashion, which should result in fewer problems with dynamically loaded libraries. For example, we might design the function to work this way (this is intended only as an example, not a well-thought-out strategy, so some of these steps could easily be rearranged; a code sketch follows the list):

  • Look for an application-specific library cache file and try to load the library from the cache.
  • Then, look for the library in a lib directory directly beneath the location where the application is installed.

    Note: This could be referenced as LW_ROOT/lib, where LW_ROOT is a variable the application understands as the path to its own installation directory, or it could use ./lib or some other method.

  • Then, look for the library according to a predefined, fully-qualified path wherever appropriate, such as /opt/qt/3.1/lib or /opt/KDE/3.1/lib.

    Even better: When appropriate, use framework paths that are publicly available as configuration settings or environment variables, so that lw_dlopen() searches for LW_KDE_FRAMEWORK_3.1/lib as its "hard-coded" path, where LW_KDE_FRAMEWORK_3.1 is the variable. This lets users install these frameworks anywhere on the system, yet still makes them available to the widest range of applications that use them.

  • If the path is not hard-coded, or if the library is not found, then look for the library in the application directory, /lib, /usr/lib or according to the paths defined in the LW_LIBRARY_PATH environment variable.
  • Whenever a matching library is found, attempt to resolve all the data and function symbols.
  • If any symbol cannot be resolved, continue looking according to the predefined algorithm.
  • If no library can be loaded properly, then report the error accurately.
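
To make the idea concrete, here is a minimal sketch of such a wrapper. Everything in it is hypothetical: lw_dlopen(), LW_ROOT, LW_QT_FRAMEWORK and LW_LIBRARY_PATH are names invented for this article, and the cache lookup and symbol-verification steps are omitted for brevity.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <dlfcn.h>

    /* Hypothetical replacement for dlopen() implementing the search
     * order sketched above. A real version would consult the
     * per-application cache first and verify every needed symbol. */
    void *lw_dlopen(const char *soname)
    {
        char path[4096];
        void *handle;
        const char *root, *fw, *lwpath;

        /* 1. Application-private library directory: LW_ROOT/lib */
        if ((root = getenv("LW_ROOT")) != NULL) {
            snprintf(path, sizeof(path), "%s/lib/%s", root, soname);
            if ((handle = dlopen(path, RTLD_NOW)) != NULL)
                return handle;
        }

        /* 2. A framework path published as an environment variable,
         *    e.g. LW_QT_FRAMEWORK=/opt/qt/3.1 */
        if ((fw = getenv("LW_QT_FRAMEWORK")) != NULL) {
            snprintf(path, sizeof(path), "%s/lib/%s", fw, soname);
            if ((handle = dlopen(path, RTLD_NOW)) != NULL)
                return handle;
        }

        /* 3. Colon-separated LW_LIBRARY_PATH. */
        if ((lwpath = getenv("LW_LIBRARY_PATH")) != NULL) {
            char *copy = strdup(lwpath);
            char *dir;
            for (dir = strtok(copy, ":"); dir; dir = strtok(NULL, ":")) {
                snprintf(path, sizeof(path), "%s/%s", dir, soname);
                if ((handle = dlopen(path, RTLD_NOW)) != NULL) {
                    free(copy);
                    return handle;
                }
            }
            free(copy);
        }

        /* 4. Finally, let the system loader search /lib, /usr/lib and
         *    the rest of its usual locations. */
        return dlopen(soname, RTLD_NOW);
    }

An application would then call lw_dlopen("libthingy.so.3") wherever it now calls dlopen(), and the search policy would live in one place instead of being scattered across applications.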

One of the most obvious objections will be that this adds bloat to a process designed to minimize bloat. Fair enough, but the alternatives are probably limited to sticking with the broken approach we have now or linking statically, which has a far greater potential to waste memory and disk space. Unless someone can demonstrate with hard numbers that it is less beneficial to provide these features, I will assume that the tradeoff is worthwhile.

On the other hand, the search priority above (or one like it) will tend to use application-provided shared libraries only when needed, and it will use the more general, system-wide libraries whenever they will do the job.

Another objection will be that the above process will have a negative impact on speed. This is a more serious consequence, but we can attempt to solve the problem by creating certain rules for application packagers, which we will address next.

4. Firewall incompatible versions or build in backward compatibility

Many applications depend on services accessed via sockets, pipes or other means. For example, KDE runs various component services, many applications use network-audio services, and so on. If one tries to run a version 2.2 application on top of a 3.1 framework, one risks the possibility that it will try to communicate with a service that is incompatible. Therefore, either the new framework should put a firewall between itself and the old one, so that you can run two different versions of the same services simultaneously, or the new framework should be prepared to handle requests from applications that used the old framework.
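
One low-tech way to "firewall" service generations, sketched here with invented names and paths, is to embed the framework version in the rendezvous point itself, so that a 2.2 service and a 3.1 service can listen side by side:

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    /* Bind a Unix-domain socket whose name includes the framework
     * version, so incompatible service generations never collide.
     * The path scheme is illustrative only. */
    int bind_versioned_service(const char *service, const char *version)
    {
        struct sockaddr_un addr;
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;

        memset(&addr, 0, sizeof(addr));
        addr.sun_family = AF_UNIX;
        snprintf(addr.sun_path, sizeof(addr.sun_path),
                 "/tmp/%s-%s.sock", service, version);

        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            close(fd);
            return -1;
        }
        return fd;  /* e.g. /tmp/audio-2.2.sock next to /tmp/audio-3.1.sock */
    }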

Rules for application packagers

First, one should examine the following rules with the capabilities of autoconf, automake and libtool in mind. It is not these specific tools that are important, but the fact that they exist. They demonstrate that it is possible to create intelligent tools that search out and discover characteristics of your system, then make decisions based on what is found.

1. Place user applications in their own directories

The extent to which one applies this rule is debatable, because plenty of programs will function fine in traditional /usr/bin or /usr/local/bin directories. It is also easier to put almost everything in /usr/bin because it reduces the search path requirements. But there is no reason why one cannot install a program in /opt/myprogram/3.2/bin and create a symbolic link to the program in /usr/bin.

It is important, however, not to hard-code the upper portion of the tree. In other words, the installation program should make it possible for the user to install everything under any given directory. For example, if the user or administrator chooses /opt, everything would be installed in /opt/myprogram/3.2. The installation program will have to record this top-level directory somewhere for later use. For the purpose of discussion, we'll call it LW_ROOT so that we don't have to worry about what the user chooses during installation.

One would typically expect there to be a set of traditional subdirectories such as LW_ROOT/etc, LW_ROOT/bin, LW_ROOT/lib. Some of these directories could be optional, but LW_ROOT/bin is mandatory, and LW_ROOT/lib should be mandatory for any application that uses shared libraries.
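
For instance, if the user chose /opt at install time, the resulting tree might look like this (the layout is illustrative):

    /opt/myprogram/3.2/        <- LW_ROOT, recorded at install time
    /opt/myprogram/3.2/bin/    mandatory: the executables
    /opt/myprogram/3.2/lib/    mandatory if the application uses shared libraries
    /opt/myprogram/3.2/etc/    optional: configuration files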

In the case of applications built on a framework, the installation should put them in a subdirectory of that framework.

This will no doubt be one of the more controversial issues, but I believe the advantages will become apparent as we continue through the rules.

2. Resolve dependencies by file, not by package

First, the notion that we have to resolve dependencies by package — the idea that a needed shared library must be supplied by the particular package that provides it — has to go. Granted, this could be a hard nut to crack. Distributions may be inclined to rely on their own methods of packaging and resolving dependencies, because anything that is distribution-specific discourages people from switching to another distribution. However, that problem is political and beyond the scope of this discussion.

The installation program should resolve dependencies by file, not by package. This means that the installation program must search for the appropriate shared libraries using a method similar to the one we described for our LinuxWorld dynamic loader wrapper.

One of the more elusive problems is the potential for unresolved symbols in shared libraries. There is no reason why an installation program has to take chances with this. There are various ways we could attempt to solve this problem. Here is one suggestion, but I am by no means claiming it is the only solution or the best one, and it requires a utility that does not yet exist (to my knowledge, anyway).

The application packager runs a utility to create a file that includes all the libraries and symbols used by every executable in the package, whether that executable is a main program, library or plug-in. As the installation program finds needed dependencies in shared libraries, it examines each library it finds to make sure all the needed symbols are there. If it fails to find all the symbols, the search continues.
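
Such a utility does not exist yet, but the verification half of the job is already possible with the standard loader interface. A minimal sketch, assuming the packager's hypothetical utility has produced a list of required symbol names (compile with -ldl):

    #include <stdio.h>
    #include <dlfcn.h>

    /* Return 1 if 'libpath' provides every symbol in the
     * packager-generated list, 0 otherwise. */
    int library_satisfies(const char *libpath,
                          const char **symbols, int count)
    {
        int i;
        void *handle = dlopen(libpath, RTLD_LAZY | RTLD_LOCAL);
        if (handle == NULL)
            return 0;       /* not loadable at all */

        for (i = 0; i < count; i++) {
            if (dlsym(handle, symbols[i]) == NULL) {
                fprintf(stderr, "%s: missing symbol %s\n",
                        libpath, symbols[i]);
                dlclose(handle);
                return 0;   /* keep searching elsewhere */
            }
        }
        dlclose(handle);
        return 1;           /* this library will do */
    }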

3. Use an intelligent algorithm for supplying unresolved dependencies

If an installation program finds the libraries it needs but some other problem prevents the installation from being satisfied with what it finds (such as missing symbols in the library), then it should set a flag to that effect. We will revisit this in a moment in rule 4.

If the installation program cannot find a library it needs, it should use a package-appropriate means of providing the missing libraries. In some cases, this will be as simple as including all the dependencies in the package itself. If the system doesn't have one or more of these libraries, the installation program can install them directly.

This is not feasible in some cases, such as GNOME or KDE applications, because it is tantamount to including the entire framework with every GNOME or KDE application. Installation programs for these applications should be able to access a resource, acquire the dependencies and request permission to install them. In cases where this is impossible, it should give the user a detailed explanation of what is missing and what the user must do to supply what is missing.
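
In outline, the installer's per-dependency decision might look like this. The names are invented, and the stubs merely stand in for the real actions described in rules 3 and 4:

    #include <stdio.h>

    /* Possible outcomes when the installer searches for a dependency. */
    enum dep_status {
        DEP_FOUND_OK,        /* a usable library is already present  */
        DEP_FOUND_UNUSABLE,  /* found, but e.g. symbols were missing */
        DEP_NOT_FOUND        /* no candidate found anywhere          */
    };

    /* Stubs standing in for real installer actions. */
    static void flag_for_private_copy(const char *lib)
    { printf("flag: put a private copy of %s in LW_ROOT/lib\n", lib); }

    static int fetch_dependency(const char *lib)
    { printf("acquiring %s from the package or a resource...\n", lib); return 0; }

    static void explain_to_user(const char *lib)
    { printf("cannot supply %s; here is what is missing and why...\n", lib); }

    /* The decision itself. */
    void handle_dependency(enum dep_status status, const char *lib)
    {
        switch (status) {
        case DEP_FOUND_OK:
            break;                       /* nothing to do */
        case DEP_FOUND_UNUSABLE:
            flag_for_private_copy(lib);  /* see rule 4 below */
            break;
        case DEP_NOT_FOUND:
            if (fetch_dependency(lib) != 0)
                explain_to_user(lib);    /* last resort: tell the user */
            break;
        }
    }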

4. Replacements go in the application library directory

Recall that flag we set above? If the installation program found the library it needed but couldn't use it for some reason, then it should not install a replacement over that library, as doing so may break another application. It should install the new library in LW_ROOT/lib, where LW_ROOT is the fully qualified, topmost directory chosen by the user or administrator at installation. Yes, this uses up more disk space and memory, but only because we already determined the cost of doing otherwise would be an application that doesn't work properly — or at all.

5. Generate a configuration file based on the end results

The installation program must record certain install-time parameters for later use. An example would be LW_ROOT, which the application may use later to locate application-specific shared libraries.
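
A format as simple as one key=value pair per line would do. A sketch, with the file name and keys invented for illustration:

    #include <stdio.h>

    /* Record install-time parameters for the application's later use.
     * The file location and key names are illustrative. */
    int write_install_config(const char *lw_root)
    {
        char path[4096];
        FILE *fp;

        snprintf(path, sizeof(path), "%s/etc/install.conf", lw_root);
        fp = fopen(path, "w");
        if (fp == NULL)
            return -1;

        fprintf(fp, "LW_ROOT=%s\n", lw_root);
        /* ...plus any other discovered values: framework paths,
         * the library cache location, and so on. */
        fclose(fp);
        return 0;
    }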

6. Generate a script based on the end results

The new installation paradigm does not stand or fall on this particular rule, but it might be a good idea to run all applications from a script rather than run the executable directly. This approach can cover a multitude of sins. It can "unset" potentially problematic variables (such as LD_LIBRARY_PATH) and set or access variables it needs (such as LW_LIBRARY_PATH or LW_ROOT). It can check for changes to the application directories.

Best of all, in many cases (such as missing or out-of-date files) it is much easier to write a launch script that fails quickly and gracefully with human-readable error messages than it is to write an application to do the same.
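
The same duties could also be handled by a tiny launcher program. A minimal sketch, assuming the LW_ROOT layout described earlier; the paths and names are illustrative:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Minimal launcher: sanitize the environment, verify the install
     * tree, then hand off to the real executable. */
    int main(int argc, char *argv[])
    {
        const char *root = "/opt/myprogram/3.2"; /* recorded at install */
        char exe[4096];

        (void)argc;

        /* Unset variables known to cause library-search surprises. */
        unsetenv("LD_LIBRARY_PATH");

        /* Publish the variables the application expects. */
        setenv("LW_ROOT", root, 1);

        /* Fail quickly and readably if the tree is damaged. */
        snprintf(exe, sizeof(exe), "%s/bin/myprogram.real", root);
        if (access(exe, X_OK) != 0) {
            fprintf(stderr, "myprogram: %s is missing or not "
                            "executable; try reinstalling.\n", exe);
            return 1;
        }

        execv(exe, argv);
        perror("myprogram: exec failed");  /* reached only on error */
        return 1;
    }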

The bottom line is that there is no limit to the advantages, and one can eliminate many potential disadvantages with an intelligent script.

7. Build a library cache for the application

Here is where we address the speed issue. The installation program can take the end results of everything it discovered and create a cache file that speeds up application loading.
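
The cache could be as simple as one soname=resolved-path line per library, written once at install time and read back as the dynamic-link wrapper's first step. A sketch, with the format invented for illustration:

    #include <stdio.h>

    /* Write the per-application library cache: one "soname=path" line
     * for each dependency the installer resolved. */
    int write_library_cache(const char *cache_path, const char **sonames,
                            const char **paths, int count)
    {
        int i;
        FILE *fp = fopen(cache_path, "w");
        if (fp == NULL)
            return -1;
        for (i = 0; i < count; i++)
            fprintf(fp, "%s=%s\n", sonames[i], paths[i]);
        fclose(fp);
        return 0;
    }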

8. Make a redo utility part of every package

Of course, when one creates a cache file, one introduces the possibility that the cache will become out of date. Therefore, each package should include a redo utility that repeats the process of resolving dependencies as if it were installing the application for the first time. This would regenerate the library cache based on new information — better libraries in better paths, for example. One might even want to add a cron job to the system that regenerates these cache files on a regular basis.
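
For example, assuming the utility were installed as lw-redo (a hypothetical name and flag), a weekly crontab entry might read:

    # Regenerate library caches for every installed application,
    # Sundays at 3 a.m. (lw-redo and --all are hypothetical)
    0 3 * * 0  /usr/sbin/lw-redo --all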

Final thoughts

As far as I can tell, if developers and packagers flesh out the details of the above suggestions, modify them to iron out wrinkles and then adopt the practices, most if not all Linux software installations will be trouble-free and distribution-agnostic.

The approach has other tangential advantages. First, it is easy to migrate to this type of system because it avoids replacing system-wide libraries whenever there is a conflict. You can keep an existing Red Hat or Debian system intact while installing applications that apply the above rules. Second, application-specific directories make uninstallation fast, easy, and clean.

But the most difficult problems may be political and economic. Once you make it easy to install a package intended for one distribution on another, you take away yet another reason why multiple distributions need to exist. Whether this is sufficient to keep the broken system intact remains to be seen.

In the interest of keeping the length of this installment within reasonable limits (a goal I may already have failed to achieve), I may have neglected a number of important issues. One would be application-executable naming conflicts. Of course, standards groups exist for this reason; we should rely on them where they exist and create them where they don't. Another is the issue of shared configuration files, which I had hoped to address but avoided in order to manage the length of the series. I hope that if you think the above procedures and rules have any potential, you will call to my attention any other missing factors. If we can refine the rules, perhaps the series would inspire a new project or changes to an existing project so that we Linux users could eventually enjoy tangible results.

Nicholas Petreley is a computer consultant and author in Asheville, NC.
