Category: DevOps
-
Redundancy Planning, more work than adding one of everything
Since I started my career, redundancy has been featured in almost every deployment discussion. The general best practice is to add an additional element for each service tier, also known as N+1 redundancy. This approach is straightforward, but many people would be surprised by how often these schemes fail. At a very famous…
-
Three Monitoring Tenets
This week at work I saw a drop in average back-end performance: average page load time went from ~250ms to around 500ms. This seemed to be an intermittent problem, and we searched through our graphs at NewRelic with no clear culprit. Then we started looking at our internal MediaWiki profiling…
-
Groovy, A Reasonable JVM Language for DevOps
I’ve worked in several environments where most of our product ran on the JVM. I’ve always used the information available to me in MBeans, but the overhead of exposing them to a monitoring system like Ganglia or Nagios has always been problematic. What I’ve been looking for is a simple JVM language that allows…
-
A few things you should know about EC2
Availability Zones are Randomized Between Accounts: I had someone from Amazon tell me this, so I assume it to be true. To prevent people from gaming the system and over-allocating instances in a single AZ, zone IDs are randomized across customers. So for any two accounts, us-east-1a != us-east-1a. Amazon promises…
-
Configuration Management Tools Still Fall Short
I have a gripe with almost every configuration management tool I’ve used. I’m most familiar with Chef, but I’ve used Puppet a bit, so I apologize to the fine people at OpsCode in advance since my examples will be Chef-based. The Cake is a Lie: Every time I run Chef I tell myself…
-
SSH Do’s and Don’ts
Do Use SSH Keys: Whenever you can, use a key for SSH. Once you create it, you can distribute the public side widely to enable access wherever you need it. Generating one is easy: ssh-keygen -t dsa Don’t Use a Blank Passphrase on Your Key: This key is now your identity. Protect it.…
-
Technical Debt Better Than Not Doing It
It’s time to admit that sometimes it’s okay to incur technical debt, particularly when it comes to getting it done. So many times, I’ve run into places that have constipated operations environments or automation processes because something is hard to do automatically. If you can’t automate it, don’t block all other tasks because of…
-
User Acceptance Testing for Successful Failovers
Things fail, we all know that. What most people don’t take into account is that things fail in combination and in unexpected ways. We spend time and effort planning redundancy and failover schemes to seamlessly continue operations, but often neglect to fully test these plans before rolling services and equipment into production. What inevitably happens is…
-
Solr Query Change Beats JVM Tuning
I’ve spent the last few days at work trying to improve our search performance, and have been banging my head against the dismax query target and parser in Solr. For those not familiar with Dismax, it’s a simplified parser for Solr that eliminates the complexity of the Standard query parser. Instead of search…
-
Dealing with Outages
No matter what service you’re building, at some point you can expect to have an outage. Even if your software is designed and scaled perfectly, one of your service providers may let you down, leading to a flurry of calls from customers. Plus the internet has many natural enemies in life (rodents, backhoes, and bullets),…