DevOps, from Wikipedia:

DevOps (a portmanteau of “development” and “operations”) is a concept dealing with, among other things: software development, operations, and services. It emphasizes communication, collaboration, and integration between software developers and information technology (IT) operations personnel. DevOps is a response to the interdependence of software development and IT operations. It aims to help an organization rapidly produce software products and services.

I think everybody in the IT world knows about the DevOps concept, or has at least heard of it. Some might even be hiring DevOps Engineers.

But if you ask yourself what DevOps means precisely, or what a DevOps Engineer should do during the work day, the answer is probably clear to you only if you’re a DevOps Engineer (it should be!) or if you work very closely with one. For the rest of the world it’s some kind of leprechaun that magically solves all the problems for the team (or application). Well, that’s usually true (not the leprechaun part), but it seems to me there are no good common standards or rules for this job.

I’d like to share my thoughts about what day-to-day activities every DevOps Engineer should have. I wrote them in a form that every developer should understand, especially if you’re interested in doing more DevOps stuff on your team.

Two Simple Rules

So, let’s start with two very simple rules that every DevOps person should adopt:

  • Everything should be automated
  • Everything should be automated in a way that other members of the team can use it

That’s it. You can apply them to anything: running tests, accessing logs, using monitoring, doing releases… The whole point of having a DevOps Engineer is to glue your Product/Dev/Operations/QA teams together and eliminate any unnecessary communication and manual work. As a result you should see faster development and fewer bugs (at least fewer human errors).

OK, those were very high-level things. Let’s go deeper.

DevOps Activities

Development tools support

It’s important to increase development speed as much as you can - it affects budget, happiness and even business metrics. But sometimes you have to work with really complicated applications, especially if you use service-oriented architecture.

Having the same setup for all developers can be challenging. Luckily tools like Vagrant can reduce the struggle. For example, Vagrant + Docker is a really powerful combination to reproduce any complicated stack.

Also, make sure you use a VCS for everything ;-)

Continuous Integration (CI)

Jenkins, TeamCity, Bamboo, CircleCI, Travis… - all these tools share a very simple but very powerful idea: run your builds or tests automatically, triggered by events (usually commits to the VCS) or on a schedule. This lets you, as a developer, sleep well, because all your changes are tested almost in real time.

As a DevOps Engineer it’s important to make sure developers can see results and debug any issues, but at the same time they shouldn’t have to deal with the CI system too much - use email/chat notifications. They will like it.

Configuration management

When a codebase is tested, it’s time to deploy it, at least to a staging server. We’ll talk about deployments in the next section, but here let’s talk about environments. Usually every team has a local environment for local development, a dev or staging environment for running tests and showing demos and, of course, a production environment.

So, imagine you decide to update the version of some library, or of the programming language itself. That means you have to go and update it in every environment! And in the case of the local environment, you have to do it for every team member! That’s a nightmare!

Fortunately, we have really nice tools for configuration management. Check the Tools section below.
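
To make that concrete, here’s a minimal sketch of what such a tool buys you, written as an Ansible playbook. The inventory groups `local`, `staging` and `production` and the `java_package` variable are names I made up for illustration, and I’m assuming yum-based hosts; the point is only that the version lives in one place and every environment converges to it on the next run.

```yaml
---
# Sketch only: group names and the variable name are assumptions.
- name: Keep Java at the same version in every environment
  hosts: local:staging:production
  become: yes
  vars:
    # Bump the version here once instead of on every machine by hand.
    java_package: java-1.7.0-openjdk
  tasks:
    - name: Install the pinned Java package
      yum:
        name: "{{ java_package }}"
        state: present
```

You’d run it with something like `ansible-playbook -i inventory java.yml`, where both file names are placeholders.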

Deployment / Continuous Delivery

OK, the codebase is ready, and we also have a few environments that were configured automatically. It’s time to deploy our application!

Sometimes you might think that it’s a trivial task: you can just write a small bash script that takes the latest version from your VCS and restarts a service or something.

But it can be really complicated as well: load balancers for rolling updates, feature flags, blue-green deployments, RDBMS replicas and shards…

And again, even a very complicated deployment can be automated. Check the Tools section for more details.
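
As a rough illustration of the rolling-update case, here’s a sketch of an Ansible play that updates one web server at a time. The `webservers` group, the `lb_host` variable, the `lb-disable` and `lb-enable` scripts, the repository URL, the `myapp` service and port 8080 are all hypothetical; substitute whatever your load balancer and application actually provide.

```yaml
---
# Sketch only: hosts, scripts, URL, service name and port are assumptions.
- name: Rolling deploy, one host at a time
  hosts: webservers
  become: yes
  serial: 1    # finish one host before touching the next
  vars:
    app_version: master
  pre_tasks:
    - name: Take this host out of the load balancer
      command: /usr/local/bin/lb-disable {{ inventory_hostname }}
      delegate_to: "{{ lb_host }}"
  tasks:
    - name: Check out the requested version from the VCS
      git:
        repo: https://example.com/myapp.git
        dest: /opt/myapp
        version: "{{ app_version }}"

    - name: Restart the application
      service:
        name: myapp
        state: restarted

    - name: Wait until the application accepts connections again
      wait_for:
        port: 8080
        timeout: 60
  post_tasks:
    - name: Put this host back into the load balancer
      command: /usr/local/bin/lb-enable {{ inventory_hostname }}
      delegate_to: "{{ lb_host }}"
```

Because of `serial: 1`, a host only goes back into rotation after its own restart and port check have passed, so the rest of the servers keep serving traffic while each one is updated.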

By the way, you can combine CI and deployments! It’s called Continuous Delivery (or Continuous Deployment, when every passing change goes straight to production without a manual step), and the idea behind it is obvious: deploy your change right away if all tests pass. That’s a huge win: you can deploy very often and get feedback very fast.
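
One way to sketch that gate with the tools from this post is a playbook with two plays: the first runs the tests, the second deploys. When every host in a play has failed, Ansible stops the playbook, so the deploy play is never reached if the tests fail. The `make test` command, the `/srv/build/myapp` checkout, the `webservers` group and the `deploy` role are all assumptions; in practice your CI server would usually run the tests itself and trigger only the deploy part.

```yaml
---
# Sketch only: test command, paths, group and role names are assumptions.
- name: Run the test suite on the control machine
  hosts: localhost
  connection: local
  gather_facts: no
  tasks:
    - name: Run tests (a non-zero exit code fails this task)
      command: make test
      args:
        chdir: /srv/build/myapp

- name: Deploy (only reached if the play above succeeded)
  hosts: webservers
  become: yes
  roles:
    - deploy
```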

Security

When you prepare your application for a production release, it’s very important to understand who can access what. Things like SSH keys, VPNs and IP whitelisting should make your life easier.
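
Access rules can be automated like everything else. Here’s a small sketch that manages SSH keys with Ansible’s `authorized_key` module (shipped in the `ansible.posix` collection in recent releases). The `deploy` account, the `public_keys/` directory and the user names are hypothetical.

```yaml
---
# Sketch only: account name, key directory and user lists are assumptions.
- name: Manage who can SSH in as the deploy user
  hosts: all
  become: yes
  vars:
    current_members: [alice, bob]
    former_members: [mallory]
  tasks:
    - name: Grant access to current team members
      authorized_key:
        user: deploy
        key: "{{ lookup('file', 'public_keys/' + item + '.pub') }}"
        state: present
      loop: "{{ current_members }}"

    - name: Revoke access from people who left
      authorized_key:
        user: deploy
        key: "{{ lookup('file', 'public_keys/' + item + '.pub') }}"
        state: absent
      loop: "{{ former_members }}"
```

Keeping the lists in the VCS means that "who can access what" is answered by reading one file instead of logging in to every server.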

Monitoring

So, the application is running and getting some traffic. Nice!

But the production environment is always different. Someday you’ll see that one part of your app is really slow. Or it just behaves strangely. Or the traffic is too high. And you can’t reproduce it locally :(

That’s why you should use monitoring. And when I say monitoring, I don’t mean set up a New Relic integration (which is great, actually) and relax. Measure Anything, Measure Everything - that’s a very good idea, especially for your future. Your business folks will say thank you, you’ll see.

Maintenance

Monitoring by itself is usually not enough. First of all, when something is not working, you should be the first to know. Notifications and alerts that wake you up at 3am on Sunday are really helpful.

All kinds of logs help you investigate issues, and if you can afford Splunk - just buy it.

You can design some systems to have self-healing procedures. That’s not easy, but it can save you a lot of pain.

Backup & Restore

You probably do some backups, don’t you? But have you ever tried to actually use them?

Backups can give you false confidence; the only thing you should rely on is a restore procedure that you have actually tried. Make sure you have backups for the database, file storage, etc., and that they can be put to use quickly. Otherwise you’re in trouble.

Hint: restore = [configuration management + ] deployment + backup data.
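
Following that hint, a restore drill can be sketched as one playbook: roles do the configuration and deployment, then the tasks load the backup and check it. I’m assuming MySQL here, plus hypothetical `common` and `app` roles, a `restore_test` host group, a `myapp` database and a `users` table; `mysql_db` lives in the `community.mysql` collection in recent Ansible releases.

```yaml
---
# Sketch only: roles, group, database, table and file names are assumptions.
- name: Rebuild a server from scratch and load the latest backup
  hosts: restore_test
  become: yes
  roles:
    - common    # configuration management
    - app       # deployment
  tasks:        # roles run before tasks within a play
    - name: Copy the latest dump to the host
      copy:
        src: backups/latest.sql
        dest: /tmp/latest.sql

    - name: Make sure the target database exists
      mysql_db:
        name: myapp
        state: present

    - name: Import the dump
      mysql_db:
        name: myapp
        state: import
        target: /tmp/latest.sql

    - name: Count rows in a table that must not be empty
      command: mysql --batch --skip-column-names -e "SELECT COUNT(*) FROM users" myapp
      register: row_count
      changed_when: false

    - name: Fail the drill if the restored data looks empty
      assert:
        that:
          - row_count.stdout | int > 0
```

Run something like this on a schedule against a throwaway machine; if it ever fails, you found out before you actually needed the backup.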

Tools

Chef, Puppet, Ansible, SaltStack - these are the main DevOps tools, and every DevOps person should be familiar with at least one of them. They all have important features like configuration management, multi-node deployments, task execution, etc.

Usually, if you’re about to write a bash script and put it on a remote machine, one of those tools is a better solution.

Let me show you an example of using Ansible for Tomcat installation and configuration:

```yaml
---
# Task list for a Tomcat role. The "restart tomcat" and "restart iptables"
# handlers and the http_port variable are expected to be defined elsewhere.
- name: Install Java 1.7
  yum: name=java-1.7.0-openjdk state=present

- name: add group "tomcat"
  group: name=tomcat

- name: add user "tomcat"
  user: name=tomcat group=tomcat home=/usr/share/tomcat
  become: yes

# Remove the home directory created above so the path can become a symlink
- name: delete home dir for symlink of tomcat
  file: path=/usr/share/tomcat state=absent
  become: yes

- name: Download Tomcat
  get_url: url=http://www.us.apache.org/dist/tomcat/tomcat-7/v7.0.55/bin/apache-tomcat-7.0.55.tar.gz dest=/opt/apache-tomcat-7.0.55.tar.gz

# "creates" skips the extraction when the directory already exists
- name: Extract archive
  command: chdir=/usr/share /bin/tar xvf /opt/apache-tomcat-7.0.55.tar.gz -C /opt/ creates=/opt/apache-tomcat-7.0.55

- name: Symlink install directory
  file: src=/opt/apache-tomcat-7.0.55 path=/usr/share/tomcat state=link

- name: Change ownership of Tomcat installation
  file: path=/usr/share/tomcat/ owner=tomcat group=tomcat state=directory recurse=yes

- name: Configure Tomcat server
  template: src=server.xml dest=/usr/share/tomcat/conf/
  notify: restart tomcat

- name: Configure Tomcat users
  template: src=tomcat-users.xml dest=/usr/share/tomcat/conf/
  notify: restart tomcat

- name: Install Tomcat init script
  copy: src=tomcat-initscript.sh dest=/etc/init.d/tomcat mode=0755

- name: Start Tomcat
  service: name=tomcat state=started enabled=yes

- name: deploy iptables rules
  template: src=iptables-save dest=/etc/sysconfig/iptables
  notify: restart iptables

- name: wait for tomcat to start
  wait_for: port={{http_port}}
```

As you can see, it’s very easy to read. Don’t be afraid. Just pick one of those tools and act (pick Ansible).

Summary

Constantly apply the Two Simple Rules and you’ll see how much more time you spend on actual development instead of struggling with configuration, environments or deployments. You can’t automate the development process (yet), but you should automate everything else.