Open Source "Spying" On Open Source: The CIA Project

LinuxWorld Exclusive Interview With Micah Dowty, Founder and Principal Contributor

LinuxWorld.com recently had the pleasure of interviewing Micah Dowty, founder and principal contributor to a rather unique project aptly named CIA (http://cia.navi.cx/). CIA monitors a wide range of open source projects in real time: it tracks changes, builds statistics, and sends alerts about events through a number of channels.
 
[LW] Tell us a brief history of yourself and how you came up with the idea for CIA.  What were you trying to solve?
 
[MT] CIA is really the survivor in a chain of failed projects. It started with the Kiwi, my attempt at building a very inexpensive and completely open PDA device. That project failed at its original goals, but I did end up with a "from scratch" Motorola 68k board that booted Linux, which taught me a lot about embedded systems.
 
During the Kiwi's development, I decided I needed to write a GUI. Honestly, this probably originated as Not Invented Here syndrome, but its architecture evolved into something really interesting to me: the PicoGUI project.
 
There might still be people using PicoGUI today, but I lost interest in it a couple years ago. Luckily for CIA, at some point we decided PicoGUI needed a bot reporting commits to our IRC channel.
 
This early bot was the first incarnation of CIA. It was a quick afternoon hack, written by myself and named by Lalo Martins. About a week later, Mike Hearn suggested modifying it to work with any number of projects, and putting it in a central IRC channel. This was June 1, 2003, the birth of #commits on Freenode. In just the space of a few hours, #commits grew from nothing to about the size it is today.
 
Originally, CIA was just created to make the PicoGUI project easier and more fun to keep track of. When we set up #commits, the motivation was mostly just for the novelty of seeing what everyone else is working on at any given time. I think it was much later that we realized just how useful CIA could be to the projects using it. This is one of the things that sparked the complete rewrite in December 2003.
 
[LW] CIA is open-source looking/spying on open-source.  How has your project been received by many of the larger open-source bodies?
 
[MT] The response to CIA has been very positive. Really, the only negative comments I remember getting are related to server downtime or bugs in the IRC code. CIA has been pretty reliable on its current home, but there have been periods in the past when, for either software or hardware reasons, it was crashing all the time. It seemed like every time it went down for a few hours I'd get someone threatening to reimplement CIA as a 50-line shell script. Of course, that's pretty much what CIA was before its rewrite, and there are a lot of advantages that these 15,000 lines of Python have over the old pile of shell scripts.
 
There are several large projects that are making use of CIA and showing their support by linking to the web interface. Gaim, AnkhSVN, Enlightenment, Gentoo, Adium, and Beagle are just a few of the larger projects that use CIA and link to it prominently on their web sites. I don't think CIA has received any official endorsements from large open-source projects or organizations, but some powerful members of these organizations have shown interest. Nat Friedman of GNOME fame was quite excited about CIA and sent a big donation.
 
[LW] What do you see as the top 3 features of CIA?
 
[MT] I think CIA's top feature is that anyone can use it, and it's about as easy to set up as possible for the version control system you're using. With about the same effort it would take to set up a commits mailing list, you can connect your project to a server that will get your commits onto IRC, the web, and RSS.
 
The next best thing about CIA is how it isn't tied to any particular version control system. Internally, CIA is just an architecture for publishing, filtering, and formatting arbitrary messages. CIA supports version control systems I've never used, and it's being used for more esoteric purposes like reporting automated build results. I know I've seen several projects out there for mailing commit messages or generating RSS feeds, but they're all designed for one specific version control system. CIA's client scripts act as an abstraction layer, so by writing a new client you can use it with pretty much anything.
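The publish, filter, and format idea described above can be illustrated with a short Python sketch. This is not CIA's actual code: the `Message`, `Hub`, `svn_client`, `build_client`, and `irc_line` names, the message fields, and the output format are all hypothetical, chosen only to show how thin client functions can hide the event source from everything downstream.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Message:
    """A generic event. Nothing here is specific to one version control system."""
    project: str
    author: str
    text: str
    kind: str = "commit"  # hypothetical field; "build" marks automated build results


Filter = Callable[[Message], bool]
Formatter = Callable[[Message], str]
Sink = Callable[[str], None]


class Hub:
    """Hypothetical hub: delivers each published message to matching subscribers."""

    def __init__(self) -> None:
        self._rules: List[Tuple[Filter, Formatter, Sink]] = []

    def subscribe(self, wants: Filter, fmt: Formatter, deliver: Sink) -> None:
        self._rules.append((wants, fmt, deliver))

    def publish(self, msg: Message) -> int:
        """Filter, format, and deliver; return how many subscribers received it."""
        delivered = 0
        for wants, fmt, deliver in self._rules:
            if wants(msg):
                deliver(fmt(msg))
                delivered += 1
        return delivered


# Client adapters: each one turns a source-specific event into a Message.
def svn_client(project: str, author: str, log: str) -> Message:
    return Message(project, author, log.strip())


def build_client(project: str, passed: bool) -> Message:
    status = "passed" if passed else "failed"
    return Message(project, "buildbot", f"build {status}", kind="build")


def irc_line(msg: Message) -> str:
    """Assumed one-line format for an IRC channel."""
    return f"{msg.project}: {msg.author} * {msg.text}"


if __name__ == "__main__":
    channel: List[str] = []
    hub = Hub()
    hub.subscribe(lambda m: m.project == "picogui", irc_line, channel.append)
    hub.publish(svn_client("picogui", "alice", "Fix redraw bug\n"))
    hub.publish(build_client("picogui", passed=False))
    print("\n".join(channel))
```

In this sketch, supporting a new version control system or a new kind of event means writing one more small adapter function. The hub, filters, and formatters stay untouched.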
 
The web interface has always been secondary to IRC commit delivery, but I see its ability to create a community of projects as the next most important feature of CIA. Every person and every project on CIA automatically gets a web page, and they're all linked together. Each page has a "related" box that lets you see who works on a particular project, what projects a particular author works on, which version control systems an author uses regularly, etc. These associations actually form an undirected graph that ends up tying most projects together in some way. Back when CIA was smaller, we could visualize this graph. Nowadays it just takes way too much CPU time.
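As a rough illustration of the "related" associations described above, the following Python sketch builds an undirected graph from (author, project) pairs and looks up a node's neighbours. The data, function names, and node encoding are hypothetical, and this is not how CIA stores its statistics.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple

Node = Tuple[str, str]  # hypothetical encoding: ("author", name) or ("project", name)


def build_graph(commits: Iterable[Tuple[str, str]]) -> Dict[Node, Set[Node]]:
    """Add an undirected edge between each author and each project they commit to."""
    graph: Dict[Node, Set[Node]] = defaultdict(set)
    for author, project in commits:
        a, p = ("author", author), ("project", project)
        graph[a].add(p)
        graph[p].add(a)
    return graph


def related(graph: Dict[Node, Set[Node]], node: Node) -> List[Node]:
    """Direct neighbours of a node, similar in spirit to a 'related' box."""
    return sorted(graph.get(node, ()))


def connected(graph: Dict[Node, Set[Node]], start: Node, goal: Node) -> bool:
    """Breadth-first search: is there any chain of associations between two nodes?"""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [("alice", "picogui"), ("alice", "kiwi"), ("bob", "kiwi")]
    g = build_graph(sample)
    print(related(g, ("project", "kiwi")))
    print(connected(g, ("project", "picogui"), ("author", "bob")))
```

A single lookup like `related` stays cheap as the graph grows. Anything that needs the whole graph at once, such as drawing it, gets more expensive with every project added.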
 
[LW] How have "users" used the data, stats, and events published from the CIA Notification server?
 
[MT] Many people link to their author page from a personal homepage or blog, and more and more projects are including links from their web site to their CIA stats page. A few projects are including CIA stats directly on their web site using RSS aggregators. CIA does provide a low-level XML feed with more detailed stats, and there's an XML-RPC interface that gives you easy programmatic access to all the data used to generate the web site. I don't think anyone is actually making use of this yet, but it's hard to expect people to use interfaces I haven't got around to documenting.
 
The coolest practical use of CIA I've seen recently was on the Planet Gentoo site. Since Gentoo contributors have the same username everywhere, they could link every blog post directly to that user's CIA stats page.
 
I expect people will find even more diverse ways to use CIA once I make the details of the XML-RPC interface well-known. I'm also really hoping that publish-subscribe becomes more common, as polling the RSS feeds really generates a huge amount of web traffic.
 
[LW] The community aspect of CIA is interesting, with people making a big play of their own 'commit' status. Do you see the need to feed people's egos as a big part of what CIA can deliver?
 
[MT] Definitely. Commit reporting has been done before, but one of the things that makes CIA really unique is that it brings projects together into a larger community. Anybody's CVS to RSS gateway or commit mailing list can be useful to developers in pretty much the same way, but CIA has a way of introducing a bit of healthy competition. People love seeing their work show up in public IRC channels. It seems less like they're locked in a closet pounding away at code in isolation, and more like they're doing something interactive that everyone else can see. CIA lets everyone know when you're making progress and gives you a virtual pat on the back for it. I know many people have trouble developing when CIA isn't around, since it just isn't quite as much fun.
 
[LW] CIA is watching itself, which is pretty cool. Have you had much help from the community development-wise?
 
[MT] User contributions have been very important to the CIA client scripts, and in defining the XML message format. I wrote the client script for Subversion repositories, but all other clients were contributed by users. On the server side though, I've been mostly alone. The server's codebase is pretty clean and well-organized, but it's big and largely undocumented. The server is tricky to set up, and it has a steep learning curve, so it has much less appeal for random hacking than the client scripts.
 
[LW] Fundamentally, CIA requires a small script to be installed on the CVS/SVN servers to alert it when something changes. How do you go about asking for support from, say, SourceForge-based projects? Have they been supportive?
 
[MT] CIA has spread really well just by word of mouth. Generally a project admin or enthusiast hears about CIA and sets it up, and the first news I get about it is a request for a metadata key or IRC bot. When the project was brand new, Mike, Lalo, and I advertised it to a few other projects and set up scripts to scrape commits off of email lists. There are still a few projects that are connected to CIA via mailing lists, but the vast majority of projects were set up without any direct encouragement from us.
 
[LW] What are the longer-term plans for CIA? Where do you see it heading?
 
[MT] There are some loose ends that I'd like to tie up, like web-based registration for IRC bots and metadata keys. I'm sure there are more bugfixes to be had. That's all just polishing what's already there. I don't see CIA changing a whole lot, just becoming easier to use, more robust, and more scalable. CIA already has a lot of feature bloat for what it is, really. The biggest change I see happening in CIA's future is making it easier for people to set up their own CIA servers in such a way that the load can be shared across many machines but the large-scale relationships between people and their work can be maintained.
 

Micah Dowty: Bio Details

Micah started tinkering with electronics and software at a very early age thanks to having an engineer for a father and a teacher for a mother. He finds himself learning more from his own personal projects than from school, and he has also contributed to a handful of larger open source projects including BZFlag, Crystal Space, and the Linux kernel.
 
 
  
 

More Stories By Alan Williamson

Alan Williamson is widely recognized as an early expert on Cloud Computing. He is Co-Founder of aw2.0 Ltd, a software company specializing in deploying software solutions within Cloud networks. Alan is a Sun Java Champion and creator of OpenBlueDragon (an open source Java CFML runtime engine). With many books, articles and speaking engagements under his belt, Alan likes to talk passionately about what can be done TODAY and not get caught up in the marketing hype of TOMORROW. Follow his blog, http://alan.blog-city.com/, or e-mail him at cloud(at)alanwilliamson.org.
