Create a CloudBolt Plug-in: Check EC2 Instance Reachability

11/2/15 11:30 AM

UPDATE: As of Version 5.3.1, it will no longer be necessary to check EC2 instance reachability. This functionality has been rolled into the product. This article is still a great example of what it takes to write a CloudBolt plug-in and will be useful in many other scenarios. ~Rick

A use case I see frequently is the need to make sure new EC2 instances are up and ready to accept SSH connections before CloudBolt marks the provisioning job as complete. In this article, we’re going to work together to write a CloudBolt plug-in that will add this functionality to our CloudBolt environments. In doing so, I hope you'll not only gain an appreciation for the power of CloudBolt as a cloud automation platform, but also see how easy it is to extend our base feature set using upgrade-safe scripts.

Getting Started

Writing Python code is a relatively painless process that usually starts with a text editor. I use OSX, so I prefer TextMate. If you’re a Windows user, I suggest Sublime Text 2 (http://www.sublimetext.com/2) or Notepad++. Another great option is to use PyCharm for all your CloudBolt plug-in development projects. I plan to expand on this topic in a future article.

Planning Our Attack

Let’s talk briefly about what we want to accomplish with this plug-in: When we provision a VM to EC2 via CloudBolt, we want to wait until that server is finished initializing and ready for SSH access before marking the entire CloudBolt provisioning job as complete. By default, CloudBolt marks the job complete once the VM state is set to “OK” by AWS. Unfortunately, this isn’t the full story on the VM's readiness. The “OK” state is set before the VM is initialized and before the user can log in via SSH. Imagine your poor users – they just used the awesome CloudBolt platform to spin up a VM, and once their job is “complete”, they get a “Connection Refused” error when they try to connect via SSH – not cool.

To address this issue, we'll extend CloudBolt to wait until our new EC2 instance has passed all EC2 status checks before marking the job as successfully completed. To accomplish this, we’ll trigger an action at the post-provision stage of the “Provision Server” Orchestration Action that will poll EC2 every two seconds to see if our new instance is reachable according to the EC2 status checks. We’ll implement this action as a CloudBolt plug-in script written in Python.

Starting our Plug-in

Let's start our plug-in with a file called “poll_for_init_complete.py” with the following contents:

def run(job, logger=None, **kwargs):
    return "", "", ""

The CloudBolt platform knows to call this function when it’s time to execute the plug-in, so it's essential that it exists in your plug-in script. Note that the first and only required parameter to this function is called job. This implies that we should expect the CloudBolt platform to call this function with the originating provisioning job passed as a jobs.models.Job object.

Returning a tuple of ("", "", "") is the default way of communicating to the CloudBolt platform that the script was a success.
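To make that contract concrete, here is a minimal sketch that calls run outside of CloudBolt, the way the text above says the platform calls it. FakeJob is a hypothetical stand-in invented for this illustration; it is not a CloudBolt class, and the real platform passes a jobs.models.Job instead.

```python
class FakeJob(object):
    """Hypothetical placeholder for the Job object CloudBolt would pass in."""


def run(job, logger=None, **kwargs):
    # Three empty strings tell CloudBolt the plug-in succeeded.
    return "", "", ""


# Call run() with the job as its first argument and inspect what comes back.
result = run(FakeJob())
print(result)
```

Checking the return value like this is a cheap way to catch a missing return statement before uploading the script.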

Let's Get Busy

Let's add a few more lines to our plug-in script to get the server (our new EC2 instance) from the Job object and wait until it's reachable:

import time

from jobs.models import Job

TIMEOUT = 600


def run(job, logger=None, **kwargs):
    server = job.server_set.first()
    timeout = time.time() + TIMEOUT

    while True:
        if is_reachable(server):
            job.set_progress("EC2 instance is reachable.")
            break
        elif time.time() > timeout:
            job.set_progress("Waited {} seconds. Continuing...".format(TIMEOUT))
            break
        else:
            time.sleep(2)

    return "", "", ""

Let's walk through what we have so far:

server = job.server_set.first() sets the variable called server to the Server object associated with this job. Since we're working with a server provisioning job, it's safe to assume we're only going to have one Server associated with this job, therefore we call first() on our job's server_set property.

We defined a constant called TIMEOUT in our plug-in module and set it to 600. We then use it in timeout = time.time() + TIMEOUT to compute the time after which we should stop waiting for our EC2 instance to initialize. This prevents CloudBolt from waiting indefinitely if for some reason EC2 cannot determine the reachability of our server. Since TIMEOUT is in seconds, we'll wait a maximum of 10 minutes before giving up and marking the job as complete. This should be the exception, not the norm.

We then start an infinite loop that will only stop when either our timeout elapses or we determine that our EC2 instance is reachable with the function is_reachable(server) which we haven't yet defined.
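The same poll-until-deadline pattern can be pulled out into a small helper and tried without CloudBolt or AWS. The sketch below is an illustration only: wait_until, FakeClock, and the clock and sleep parameters are names made up for this example, not part of CloudBolt or Boto. The fake clock lets the loop run instantly instead of actually sleeping.

```python
import time


def wait_until(check, timeout_seconds, interval_seconds,
               clock=time.time, sleep=time.sleep):
    """Poll check() until it returns True or the deadline passes.

    Returns True if check() succeeded, False if the wait timed out.
    """
    deadline = clock() + timeout_seconds
    while True:
        if check():
            return True
        if clock() > deadline:
            return False
        sleep(interval_seconds)


class FakeClock(object):
    """Hypothetical clock for dry runs: sleeping just advances the time."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


# Simulate an instance that becomes reachable on the third poll.
clock = FakeClock()
answers = iter([False, False, True])
reached = wait_until(lambda: next(answers), 600, 2,
                     clock=clock.time, sleep=clock.sleep)
print("reached={} after {} simulated seconds".format(reached, clock.now))
```

With a 600-second timeout and a 2-second interval, a check that never succeeds is polled roughly 300 times before the helper gives up.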

Is it Reachable or Not?

The script above is still missing the implementation of our is_reachable function. Given the server object associated with this job, this function will use the AWS Boto API to determine the reachability status for our new EC2 instance. Note: Boto is the name of the Python API used to access the AWS API.

Let's add our is_reachable function to our script above our run function:

import time

TIMEOUT = 600


def is_reachable(server):
    instance_id = server.ec2serverinfo.instance_id
    ec2_region = server.ec2serverinfo.ec2_region

    rh = server.resource_handler.cast()
    rh.connect_ec2(ec2_region)
    wc = rh.resource_technology.work_class
    instance = wc.get_instance(instance_id)

    conn = instance.connection
    status = conn.get_all_instance_status(instance_id)
    return True if status[0].instance_status.details[u'reachability'] == u'passed' else False


def run(job, logger=None, **kwargs):
    # SNIP...

Let's walk through this function step by step:

instance_id = server.ec2serverinfo.instance_id
Get the EC2 instance ID associated with our new server being provisioned through CloudBolt. This is a string that looks like i-2423c494 in the EC2 console.

ec2_region = server.ec2serverinfo.ec2_region
Get the AWS region into which our new EC2 instance is being deployed.

rh = server.resource_handler.cast()
rh.connect_ec2(ec2_region)
wc = rh.resource_technology.work_class
instance = wc.get_instance(instance_id)
A few CloudBolt platform API gymnastics to get the Boto Instance object associated with our new server's instance ID without specifying any credentials. Always keep credentials out of your scripts!

conn = instance.connection
status = conn.get_all_instance_status(instance_id)
Using the connection associated with our Boto Instance object, return the instance status for our server.

return True if status[0].instance_status.details[u'reachability'] == u'passed' else False
If the reachability status for our server is “passed”, return True because our new server is now reachable. If not, return False. We use status[0] because the get_all_instance_status call above returns a list. In this case we're only asking for the status of one instance, so we expect the list to hold a single Status object and thus we use status[0].
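One edge case is worth a defensive check: if EC2 returns no status entries for the instance (the DescribeInstanceStatus API reports only running instances by default), status[0] raises IndexError. The sketch below is one possible guard, not the article's code. reachability_passed and fake_status are hypothetical names, the fake objects mimic only the attributes used in is_reachable, and details is assumed to behave like a dict, as the indexing above suggests. It uses types.SimpleNamespace, so it needs Python 3.

```python
from types import SimpleNamespace


def reachability_passed(statuses):
    """Return True only if the first status entry reports reachability 'passed'.

    An empty list is treated as "not reachable yet" instead of raising
    IndexError, so the polling loop simply tries again.
    """
    if not statuses:
        return False
    details = statuses[0].instance_status.details
    return details.get("reachability") == "passed"


def fake_status(reachability):
    # Hypothetical stand-in exposing only .instance_status.details.
    return SimpleNamespace(
        instance_status=SimpleNamespace(details={"reachability": reachability})
    )


print(reachability_passed([]))
print(reachability_passed([fake_status("initializing")]))
print(reachability_passed([fake_status("passed")]))
```

Inside is_reachable, the final line could then become return reachability_passed(status), keeping the rest of the function unchanged.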

Going back to our loop you can now see how the is_reachable function is used to keep the loop going if the answer is false:

while True:
    if is_reachable(server):
        job.set_progress("EC2 instance is reachable.")
        break
    elif time.time() > timeout:
        job.set_progress("Waited {} seconds. Continuing...".format(TIMEOUT))
        break
    else:
        time.sleep(2)

If our server is NOT reachable, and our timeout hasn't expired, we wait two seconds and try again.

Putting it All Together

The complete script can be downloaded from cloudbolt-forge. CloudBolt Forge is a source of user-contributed actions and plug-ins.

Now that it's ready, let's add it to the appropriate trigger point in CloudBolt.

In your CloudBolt instance, navigate to Admin > Actions > Orchestration Actions and click “Provision Server” on the left tab bar. Find the “Post-Provision” trigger point at the bottom of the page and click the “Add an Action” button.

Select “CloudBolt Plug-in” and in the next dialog, click "Add new cloudbolt plug-in".

Specify a name for our new plug-in (Poll for EC2 Init Complete), select the "Amazon Web Services" resource technology, browse to your script, and click "Create". Selecting the "Amazon Web Services" resource technology ensures this plug-in only runs against AWS resource handlers that you've defined and not others to which this plug-in is not applicable.

Give it a try

Provision a server to one of your AWS-backed CloudBolt environments. Watching the job progress, you'll see that the job is not marked as complete until the server is fully reachable and SSH access is available.

Questions? Comments? Concerns?

Don't hesitate to reach out to me (rkilcoyne@cloudbolt.io) or any of the CloudBolt Solutions team for help!

Topics: Automation, AWS, CloudBolt

If It Isn’t Self-Service, It Isn’t a Cloud

Posted by Ephraim Baron

10/28/15 8:30 AM

A while back, I was working for a large storage company. We had a marketing campaign called “Journey to the Cloud” where we advised enterprises about cloud computing – as we defined it. For us, the cloud was all about storage. Of course, for server vendors the cloud was all about servers. Ditto for networks, services, or whatever else you were selling. There was a lot of “cloud-washing” going on. I knew we’d reached the Trough of Disillusionment when, as I got up to present to a prospect, they told me “if you have the word ‘cloud’ in your deck, you can leave now.”

Fast-forward five years, and cloud computing appears to have reached the Slope of Enlightenment. By nearly all measures, cloud adoption has increased. Ask any CIO about their cloud strategy, and they’ll give you a well-rehearsed answer about how they’re exploiting cloud to increase agility and drive partnership with the business. Then ask, “How are you enabling user self-service?” Typical responses start with blank stares or visible shudders, followed by “oh, we don’t do that!” They may say “we’re only using private cloud”, or they may mention OpenStack or containers. If so, you should point out “If it isn’t self-service, it isn’t really a cloud.”

Unless it provides self-service it is not a cloud

Defining Cloud Computing

When looking for a definition of cloud computing, the National Institute of Standards and Technology (NIST) version is widely cited as the authoritative source. NIST lists five “essential characteristics” of cloud computing. The operative word is ‘essential’; not suggested; not nice-to-have. If a service doesn’t have all five, it’s not a cloud. The five are:

Broad network access
Rapid elasticity
Measured service
Resource pooling
On-demand self-service

The NIST model of cloud computing

For the last of these, on-demand self-service, the cloud test is simple. If users can request systems or applications and get them right away – without directly involving IT – they are getting on-demand self-service. If they have to submit a ticket and wait for an intermediary to review and fulfill their request, it’s not a cloud.

Working With You or Around You

At this point, you may be told “we don’t offer self-service because our users don’t understand IT. They need our help.” There was a time when that reasoning may have worked. The C-I-‘no’ of the recent past had the power to rule by fiat and ban anything that wasn’t explicitly on the IT approved list. Users had no choice. But times have changed. Now, users can simply create an account with a public cloud service, swipe their credit card, and get what they want, when they want it.

As a result, companies are seeing a marked increase in so-called shadow IT – pockets of information technology that exist and are managed by users rather than by formal IT groups. And while this may cause wailing and gnashing of teeth by everyone from security, to finance, to IT operations, it’s nearly impossible to stop. The genie is out of the bottle.

Rather than trying to prevent or shut down rogue users, IT must take a different approach. They need to ask their users “how can we help you?” rather than “how can we stop you?”

“Be the cloud, Danny”

IT needs to become a cloud services provider to their users

If you work in IT and want to stay relevant, you need to be as easy to work with as a cloud service provider. Do that, and users won’t look for alternatives. After all, they have their own jobs to do.

So how do you get started? That’s where CloudBolt comes in. We’re a cloud management platform that was designed from the start with the end-user in mind. We enable systems administrators to establish standard configurations and to publish them to their users via an online service catalog. Users get rapid access to capacity; IT maintains control and compliance. Best of all, CloudBolt isn’t restricted to a single cloud vendor’s services and APIs. We work with more than a dozen cloud providers, from private to public, as well as with a wide variety of configuration management and orchestration tools. We even integrate with legacy, brownfield environments giving you a single place for managing existing as well as new deployments.

The CloudBolt Service Catalog is where end users get what they need

If simple and powerful cloud management sounds appealing, try it for yourself. Just download the CloudBolt virtual appliance. It’s free to use for lab environments. Deployment and setup are fast and easy. Before you know it, you’ll be providing real cloud services to your users.

“Inconceivable!” you say? Think again.

Topics: Cloud Management, Automation, IT Self Service

Accelerate DevOps by Combining Automation and Cloud Management

Posted by Justin Nemmers

10/15/14 4:20 PM

The advent of DevOps in Corporate IT has dramatically increased the value that Configuration Management tools (CM, lately also known as Configuration Automation or Data Center Automation tools) provide in a complex data center environment. Popular examples include Ansible, Puppet, and Chef. Whether your IT organization has implemented an end-to-end DevOps model, or you’re interested in implementing one, the unification of Cloud Management and Data Center Automation is a great way to ensure that your DevOps teams get the most out of IT-provided and supported services and resources.

At the core of highly productive and agile DevOps teams is rapid access to required resources, and the ability to control what is deployed where. Long wait times for resource provisioning will not just delay releases and products, but also likely anger your team. On the other hand, granting the DevOps team unfettered access to on-prem virt and public cloud resources is a capacity planning and potential financial disaster just waiting to happen.

As DevOps automates more of the application management and provisioning process with tooling (Related posting: Why Manual Provisioning Workflows Don't Work Anymore), it becomes more critical to effectively integrate CM with the actual infrastructure. Providing end users and developers alike with access to DevOps work product becomes more complex and challenging.

DevOps and Cloud Management go together like peanut butter and jelly. Each makes the other more awesome. (Image Credit: Shutterstock)

So how does an IT organization achieve maximum value from the time and cost investment in these CM tools? By tightly integrating Cloud Management with their entire stack of CM tools.

Advantages

Using a cloud manager such as CloudBolt to integrate CM with the infrastructure provides immediate value. By deploying both tools, IT can provide DevOps with:

Controlled access to required infrastructure, including networks, storage, and public cloud environments.
A single API and UI capable of front-ending numerous providers, which means when IT changes cloud providers, DevOps doesn’t need to re-tool scripts and automations.
Fully automated provisioning and management for real-time resource access.

CloudBolt allows IT to natively configure and import application and configuration definitions as well as automations directly from your CM tool of choice. End users can then select the desired components, and deploy them onto an appropriately sized system or systems in any environment.

IT organizations can put into place hard divisions between critical environments—such that only certain users and groups can deploy systems, services, and applications into specific environments. For instance, CloudBolt will prevent a developer from deploying a test app onto a system that has access to a production network and production data.

Results

Customers that have implemented CloudBolt are also able to choose from one or more CM tools based on the capabilities of a specific tool. Does one team prefer Puppet over Chef? Each team can be presented with a discrete slice of underlying infrastructure that makes use of their preferred CM tool(s).

The result is clear: more effective DevOps teams that spend less time dealing with resource access, and more time getting their work done. IT is happy because CloudBolt enables them to improve governance of entire enterprise IT environments, and finally offers IT the ability to alter underlying infrastructure technology choices in ways that are fully abstracted from end users. By using a single CloudBolt API to access and deploy resources, DevOps isn’t disrupted when IT alters underlying infrastructure technology.

Interested? You can be up and running with CloudBolt today. All you need is access to a Virt Manager or a Cloud Platform, and less than 30 minutes.

Topics: Cloud Management, Automation, Puppet, Chef

Automation of the Trinity: Virtualization, Network, and Security

Posted by Justin Nemmers

6/28/13 3:51 PM

Danelle Au wrote an excellent article for SecurityWeek that is essentially a case study for why organizations need CloudBolt C2 in their environments. She talks about how, at scale, the only way to achieve the needed environment security is with significant automation, making the key point that “automation and orchestration is no longer a ‘nice to have.’”


Yep. It’s a must. A requirement.

In her description of a manual provisioning process, Danelle accurately points out that there are numerous variables that need to be accounted for throughout the process, and that one-off choices, combined with human error can often open up organizations to broader security issues.

In order to achieve the “trinity” (as Danelle calls it) of “virtualization, networking and security”, a tool must have domain knowledge of each of the separate toolsets that control those aspects. Tools like vCenter, RHEV, or Xen handle Virtualization Management (just to name a few). Each of those tools also has some level of their own networking administration and management, but a customer might also be looking to implement Software Defined Networking that’s totally separate from the virtualization provider. So now couple Virtualization Management with a tool such as Nicira, or perhaps Big Switch Networks, and the picture only grows more complicated.

Security, the last pillar of this trinity, is really the most difficult, but absolutely the one that benefits not just from automation, but also strict permissions on who can deploy what to where on what network. Automation might be able to grasp the “deploy a VM onto this network when I press this button” concept, but you need something quite a bit smarter when you take a deeper look at the security impacts of not just applications, but which systems they can be deployed on, in which environments.

So how do you expect admins to juggle this, with 1,000 different templates covering all the permutations of application installs in the virt manager? It’s probably not sustainable, even with a well-automated environment.

What is an admin to do? Well, for starters, admins use Data Center automation/Configuration Management tools like Puppet, Chef, HP Server Automation, GroundWorks, and AnsibleWorks to name a few. But in order to fully satisfy the security requirement, those applications and tools must also be fully incorporated into the automation environment. And then governed, to make sure that the production version of application X (which potentially has access to production data) can never be deployed by a QA admin into the test environment. An effective automation tool must be able to natively integrate with the CM as well, otherwise

And Danelle’s point of view was largely from the private cloud. What happens when it’s private cloudS, not cloud? And let’s not forget about AWS and their compatriots. Adding multiple destinations and target environments can drastically increase the complexity.

I do, however, have one glaringly huge issue with one of her comments: “It may not be sexy…” I happen to think that “The ability to translate complex business and organization goals” is more than a little sexy. It is IT nirvana.

Topics: Software Defined Network, Challenges, Automation

CloudBolt Blog
