Build a DIY Cloud with Eucalyptus, Nimbus and Amazon EC2
Cloud computing is all the rage right now, but did you know that you can also run your own cloud? The main selling points of elastic cloud computing are that you can buy CPU and memory as needed, and that you no longer have to maintain hardware. However, running your own cloud is great for flexibility, for testing, and even for selling cloud computing if you choose. In this article, we will clear up some confusion around cloud computing and give a brief overview of two software suites that implement a DIY cloud.
What Is the Cloud?
Oftentimes we talk about "moving applications to the cloud," which is another way to say we utilize software as a service, or SaaS. Web-based applications can be said to live in the cloud, and they might even run on a cloud service, but they are not the topic at hand. We are talking about cloud clusters, similar to Amazon's EC2 service. This is sometimes called Infrastructure-as-a-Service, or IaaS.
A cloud is a group of servers that each run virtual machines. Two things make it a cloud: VM creation is automated, and new VMs can be deployed automatically based on pre-defined usage thresholds. For example, you may run a Web application server on a VM. When CPU or RAM utilization rises above 70 percent, a new VM automatically fires up. That is elastic (cloud) computing.
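The threshold rule described above can be sketched in a few lines of Python. This is only an illustration, not how any particular cloud product implements it: the `VmStats` record, the `needs_new_vm` and `scale` functions, and the `launch_vm` callback are all hypothetical names, and the 70 percent figure simply mirrors the example in the text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class VmStats:
    """Utilization snapshot for one running VM (hypothetical record)."""
    name: str
    cpu_percent: float
    ram_percent: float


def needs_new_vm(stats: Iterable[VmStats], threshold: float = 70.0) -> bool:
    """Return True when any VM is above the threshold for CPU or RAM."""
    return any(
        s.cpu_percent > threshold or s.ram_percent > threshold
        for s in stats
    )


def scale(stats: Iterable[VmStats],
          launch_vm: Callable[[], str],
          threshold: float = 70.0) -> Optional[str]:
    """Call the (assumed) launch_vm callback if the threshold is exceeded.

    Returns whatever launch_vm returns, or None when no scaling is needed.
    """
    if needs_new_vm(stats, threshold):
        return launch_vm()
    return None
```

In a real deployment, `launch_vm` would wrap whatever call your cloud exposes for starting an instance, and the statistics would come from your monitoring system.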
What they don't tell you is that this is not magic. So you have another VM capable of serving requests. What now? You must have a way to send customer requests to the new VM. This is usually accomplished by dedicating a VM to load-balancing duties with a pre-defined list of possible IP addresses, or via networking tricks. Regardless, realize that using cloud computing to dynamically scale based on demand is only part of the picture, and you must handle the rest.
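To make the load-balancing point concrete, here is a minimal sketch of a round-robin balancer that works from a pre-defined list of possible IP addresses and only hands out the ones that have been marked active. The class name, its methods, and the addresses in the usage note are invented for illustration; a production setup would normally rely on a dedicated load balancer rather than code like this.

```python
class RoundRobinBalancer:
    """Rotate requests across the active subset of a pre-defined address pool."""

    def __init__(self, possible_addresses):
        # Addresses that new VMs are allowed to come up on.
        self.possible = list(possible_addresses)
        # Addresses of VMs currently able to serve requests.
        self.active = []
        self._next = 0

    def register(self, address):
        """Mark an address from the pool as active once its VM is up."""
        if address not in self.possible:
            raise ValueError(f"{address} is not in the pre-defined pool")
        if address not in self.active:
            self.active.append(address)

    def unregister(self, address):
        """Stop sending requests to an address, e.g. when its VM shuts down."""
        if address in self.active:
            self.active.remove(address)

    def pick(self):
        """Return the address that should receive the next request."""
        if not self.active:
            raise RuntimeError("no active backends")
        address = self.active[self._next % len(self.active)]
        self._next += 1
        return address
```

For example, after creating `RoundRobinBalancer(["10.0.0.1", "10.0.0.2"])` and registering both addresses, successive calls to `pick()` alternate between the two. The scaling logic would call `register()` whenever a new VM finishes booting.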
Software: Infrastructure Options
Both open source IaaS kits we are aware of implement the Amazon EC2 API (or at least parts of it). This allows DIY cloudies to create and test their own tools without paying for EC2 service. Both are billed as research tools, but that is not to say they are unstable. Whether they are suitable for enterprise production use is ultimately up to you.
Before we explore the options, Eucalyptus and Nimbus, one final note about these services. VMware and other virtualization infrastructure automation tools do support dynamic provisioning and the other features mentioned here. The difference with Eucalyptus and Nimbus is that they are open source, and they implement the EC2 API, which avoids vendor lock-in. You can move your infrastructure to Amazon if you decide to stop running your own servers, and likewise you can migrate back in-house to regain control.
Eucalyptus stands for "Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems." It was developed at the University of California, Santa Barbara as a research tool for cloud infrastructures. Its developers implemented Amazon's EC2 interface, which lets administrators build portable automation and management tools. Eucalyptus is also designed to allow multiple interfaces, so should another cloud service become popular, it can work with that interface as well.
Eucalyptus runs on Linux systems, and RPMs are available for RPM-based distributions. The source is also available for building on unsupported Linux systems, but even more exciting is that you can deploy Eucalyptus on a Rocks cluster. With Rocks, Eucalyptus is deployed with basically one command.
If you're familiar with EC2 and use a supported distribution of Linux (especially Rocks), Eucalyptus is easy to get up and running, and it even includes a Web GUI for some management tasks. VM images are used to deploy the VMs that will run in the cloud, and they are configured the same way Amazon requires.
Management of clusters is done, literally, with the Amazon EC2 tools. You download the EC2 client from Amazon, and point it at your Eucalyptus setup. Eucalyptus even comes with Walrus, which acts just like Amazon's S3 storage infrastructure, for even greater portability between the two, and S3 flexibility when using Eucalyptus.
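As a rough sketch of what "pointing the tools at your setup" can look like, the Python helper below builds an environment in which the service URLs refer to a private front end instead of Amazon. Several things here are assumptions to check against your own installation: the helper name and the host in the usage note are hypothetical, the port 8773 and the `/services/Eucalyptus` and `/services/Walrus` paths are assumed defaults, and `S3_URL` is an assumed variable name. `EC2_URL` is the variable Amazon's command-line API tools read for the service endpoint.

```python
import os


def private_cloud_env(host,
                      port=8773,                       # assumed default port
                      ec2_path="/services/Eucalyptus",  # assumed EC2-style path
                      s3_path="/services/Walrus",       # assumed S3-style path
                      base_env=None):
    """Return a copy of the environment with endpoints set to a private cloud."""
    env = dict(os.environ if base_env is None else base_env)
    base = f"http://{host}:{port}"
    env["EC2_URL"] = base + ec2_path
    env["S3_URL"] = base + s3_path   # assumed variable name for the storage endpoint
    return env
```

You could then pass the result as the `env` argument to `subprocess.run` when invoking the EC2 client, for instance with `private_cloud_env("cloud.example.internal")`. Credentials are deliberately left out of this sketch; they are configured separately.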
Nimbus is another infrastructure that implements a cloud. Like Eucalyptus, Nimbus provides EC2 interfaces to give users control over their VMs. The difference is that Nimbus runs inside a Globus Java container.
The installation is a bit more manual. You must first install AutoContainer, then verify Xen is installed and working. Then, download and run the Nimbus install script, and perform some manual configuration steps. Once it's up and running, though, you can deploy VMs all day long.
Not surprisingly, Nimbus is similar to Eucalyptus from a user perspective, given that it too implements the Amazon EC2 interface. From an administrative standpoint, however, it is a bit more "manual" than Eucalyptus.
Both solutions (and commercial ones too) provide a mechanism by which users can quickly deploy servers--a one-click cluster, if you wish. Of the two, Eucalyptus seems to have the more developed tools and user interfaces, and the wider user base. If you are a university, ISP, or hosting company, Eucalyptus can be used to provide EC2-like services for customers. For general users, the sysadmins of the world, deploying and testing cloud services will allow you to either test future uses of EC2 or run your own cloud. In either scenario, investing the time to learn this technology is sure to pay off, especially in terms of job knowledge. Clouds are the future.