According to Wikipedia, DevOps is
an emerging set of principles, methods and practices for communication, collaboration and integration between software development (application/software engineering) and IT operations (systems administration/infrastructure) professionals. It has developed in response to the emerging understanding of the interdependence and importance of both the development and operations disciplines in meeting an organization’s goal of rapidly producing software products and services.
In this post, I will explain how I have come to realize that I have been doing DevOps for some time without really noticing.
Setting the stage
I have been working as a consultant within the Diams iQ project team at Dennemeyer for about 4 years now. The product is an Intellectual Property management system. It is a long-running Agile project, and with time our development process has evolved a lot.
We started as an (almost by-the-book) Scrum project with one team, one product backlog, one sprint backlog and so forth. Then we tried to split the bigger team into two smaller teams to gain some velocity and work in parallel on different areas of the product. It turned out we did not gain much, so we merged the teams back into one. Let’s just say that we have embraced the Agile principle of inspect and adapt and evolved our process a lot… and there are some very good reasons for that.
One of the particularities of this project is that the product is both used internally by Dennemeyer people and made available to Dennemeyer customers. Because of this, the product has been designed to be highly customizable, and we had to involve more people in our development process, including internal stakeholders, customer representatives, technical people from Dennemeyer and IT operations at the customers.
On the one hand, working with a customer means that some of the customer’s representatives act as stakeholders and can ask for or propose a new feature they would like to see in the product. The team takes it into account, and after validation by the product owner, the feature is added to the product backlog.
On the other hand, we have to take the customers’ IT infrastructure into account and make sure our deployment process fits it. This is not trivial. Of course we published system requirements for the product, but different customers mean different infrastructures and different levels of flexibility:
- Some allow ClickOnce deployment, some don’t.
- Some have a single Active Directory domain, some have a Forest with multiple domains.
- Some are ready to open their firewall to allow communication with a web service, some would rather die than have to change one single rule on their firewall.
- Some allow RDC on their servers for initial setup and upgrades, some want to supervise everything that is done, and others would not allow remote access.
- Some have proxies or reverse proxies, some don’t.
And so forth and so on…
The early days
When we rolled out our first few customers, the developers were in charge of deployment. At that point, the deployment process consisted of a series of manual steps that had to be carefully executed one after the other. Even though these steps were documented, a lot of things could go wrong, and the slightest human mistake or difference in the IT infrastructure could lead to hours of troubleshooting.
At that time, the team was composed of
- 2 product owners
- 9 developers
- 1 dedicated tester
- 1 scrum master
The development team was lucky enough to work in the office next to the Dennemeyer IT guys, which meant we just had to open one door to ask for anything IT-infrastructure related.
Dev: We need to test Diams iQ compatibility with Windows Server 2008. Do you think it would be possible to get a new 2008 server?
IT Ops: No problem! Give me 15 minutes to create a new virtual machine from our 2008 template and it will be ready.
Dev: We need to access customer XXX’s network through RDP via VPN. Could you synchronize with their IT and make it happen?
IT Ops: I will let you know as soon as the connection is available and provide you with the VPN client.
As you can understand, this is already DevOps: Dev people speaking directly with IT Ops without having to ask for management approval to do this or that. That’s already quite something. And let me tell you, it is pretty amazing to have this level of understanding from IT guys regarding development and deployment issues, and vice versa.
All in all it was working quite well. But then…
Things got more complicated
With time, we started to have more and more customers and more and more environments to deploy for each release: test and QA systems as well as production systems, where different versions of the product were installed. Let me tell you that when you release on a monthly basis, upgrading all of these environments is quite some work and very time-consuming.
We also had more and more customization work to do for one customer or another. Developers could just not go on like that, or half of their time would have been eaten by deployment and customization work, which means development would have slowed down even as more customer requests were coming in.
Diams iQ is built as a Smart Client application, with an application server running as a Windows service, a SQL Server database, and a WPF client that can be deployed via ClickOnce or an MSI package.
We offer four types of hosting for Diams iQ:
- Internal instances (intranet, TCP communication): Only used by Dennemeyer internally, either for production or for development, testing and staging.
- Demo instances (extranet, HTTPS communication): Prospective customers can test the product before they decide to buy it. It can be a standard demo instance or a dedicated demo instance with migrated customer data for more important customers.
- Hosted instances (extranet, HTTPS or TCP over VPN): Production instances where the application and database servers are hosted on the Dennemeyer IT infrastructure. The client is installed on the customers’ machines.
- On-premises instances (intranet, TCP): Production or QA instances installed directly on the customer’s infrastructure, mostly using TCP communication.
Relationships with the customers were handled by the product owners and one person who was also in charge of data migration for customers who had one of the previous IP software products delivered by Dennemeyer, as well as on-site support for the first days of the roll-out.
But we knew it could not last.
Involving new people
That is exactly when we started to include new profiles in our team. We now have three people who are part of what we call the implementation team. Two are co-located, working in the same room as the rest of the team; the third is in the same building and has to go down two floors to join us for the daily meetings.
This person is actually the one mentioned above who is in charge of data migration (among other things) and spends a lot of time at the customer’s. After a go-live, he is the one collecting the customer’s feedback and updating the team on the customer’s (dis)satisfaction.
What are the other two people doing?
- They participate in most (if not all) of the meetings (daily meeting, requirements sessions, technical sessions, etc.)
- They participate in the testing effort
- They ensure second-level support
- They make sure that a new customer satisfies all system requirements and has provided all necessary access before we start installing an instance of Diams iQ
- They deploy and upgrade the different internal and external environments for each release on a monthly basis
- They carry out customer-specific implementation and customization
- And much more…
We also have a first-level support (helpdesk) which receives all the feedback from the customers if something goes wrong (which of course never happens). The product is able to send error reports that go directly into the helpdesk mailbox; the helpdesk people enter each one as a defect in our issue tracking system after checking that it actually is a defect.
While involving more people was necessary, we also realized our process could be improved. The implementation team’s work did not really fit the iterative rhythm of Scrum. We tried to manage their work in sprints, but most of the time items would be moved from one sprint to the next, because customers did not provide information in time to finish the work or because something more urgent came up.
In the meantime, the two-week sprint rhythm also felt more and more constraining for the developers, and we often had items that were done, but not done done, on the day of the demo. This sometimes led to lower quality because there was just not enough time to test the feature properly. And, not often but sometimes, a feature would blow up in our face during the demo, which is never a good sign.
We had already tweaked our Scrum board, adding columns for testing and acceptance testing, and some of us had already shown a lot of interest in Kanban. We started suggesting that we could adapt our process and move from the iteration planning of Scrum to the pull system of Kanban. We spent some time discussing it during a retrospective and, after management approval, we just went for it. We designed a new board, started filling the input queue, clarified some rules and decided on the work-in-progress limits we would put in place. Most of it happened almost overnight. Of course we have been adapting the process since then, but the switch from Scrum to Kanban was actually quite natural and all of us felt comfortable with it within days.
What Kanban brought us was, among other things, better visibility into the implementation and deployment process, and better involvement of the implementation team. Some of the customers act as stakeholders for complete modules of the application, proposing features and making sure that delivered features fit their business needs. This was already the case when we were doing Scrum, but we now have better visibility into which features are developed as standard features and which are customer requests, because we introduced different card colors on our board, one for each type of item.
Involving all these people and using Kanban actually made the team more efficient and responsive to change. We are now delivering on a regular basis, for more and more customers, whether they are European based, in the US or even multinational corporations.
The US market is special. The first reason is that the business there is a bit different, so the companies have different needs. The second main reason is that we do not share the same time zone, which makes things more complicated. That is why we decided to have people in the States who know the product and are able to set up a Diams iQ instance on their own.
To achieve that, one member of the implementation team spent two months in Chicago training the US people. Some internal and external stakeholders in the US have also given feedback after testing Diams iQ. They came up with some features that were missing to really target the US market. These features have been integrated into the development process and are now being delivered as standard features for all customers.
Here is a story of how it goes. A few weeks ago, some US stakeholders reported that one of the features most often asked for by the US market was the ability to schedule reports and automatically attach them to an e-mail. An epic about possible reporting improvements already existed in the product backlog. The product owners started splitting the epic into smaller items that were placed in our Kanban board input queue:
- Scheduling reports
- Attaching a report to an e-mail (client side)
- Sending a scheduled report via e-mail
The first two items were implemented and moved smoothly from the input queue to the Ready to deliver column. The next item in the queue was the third one, and because I had just finished working on something else, I picked it up and dragged it into the Design & development: in progress column.
We had already implemented another feature that sent e-mails from the application server, so I already knew we had an SMTP server available internally. I refactored our code a bit to use an SMTP configuration file that could differ per environment, and used the SMTP settings that were already there. Development went on, until I realized while testing the feature that it worked fine when sending internal e-mails on the dennemeyer.com domain, but not when sending e-mails to addresses on a different domain. Not exactly the expected behavior. I was getting the error Mailbox unavailable. The server response was: 5.7.1 Unable to relay. It seemed to me that we had an infrastructure issue, or that something was misconfigured on the mail account I was using. I went directly to see the IT guys and asked them if they had any idea what was wrong.
We had a short discussion and came up with a few things to tweak and some tests to run. I went back to my station, made the changes, and tried the different solutions we had spoken about until I finally understood what was going wrong. I will not go into too many details here, but it turned out the From address needed to be in sync with the user name of the account we were using to connect to the SMTP server, and specifically had to use the format email@example.com.
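To make the fix concrete, here is a minimal sketch in Python (the actual product is a .NET application, so the host, port and account names below are purely illustrative assumptions, not our real code or settings):

```python
import smtplib
from email.message import EmailMessage

# Illustrative settings -- in the real system these live in a
# per-environment SMTP configuration file.
SMTP_HOST = "smtp.example.com"
SMTP_PORT = 587
SMTP_USER = "noreply@example.com"  # account user name on the SMTP server
SMTP_PASSWORD = "secret"

def build_report_message(to_addr, subject, body):
    msg = EmailMessage()
    # The From address has to stay in sync with the authenticated
    # user name; a mismatch is what produced "5.7.1 Unable to relay"
    # when sending outside the internal domain.
    msg["From"] = SMTP_USER
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def send_report(to_addr, subject, body):
    with smtplib.SMTP(SMTP_HOST, SMTP_PORT) as server:
        server.starttls()                       # encrypt before authenticating
        server.login(SMTP_USER, SMTP_PASSWORD)  # send-only account
        server.send_message(build_report_message(to_addr, subject, body))
```

The key point is simply that the same account identity is used both to authenticate and as the sender, so the server accepts relaying to external domains.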
Feeling a little more knowledgeable about SMTP configuration and Exchange server, I went back to the IT guys and explained what I had found out. I asked them to create a new e-mail account specially configured to be able to connect to the SMTP server and send e-mails, but not receive any. This was done in a matter of minutes and I could reconfigure our system to use the new settings.
In the meantime, I had already prepared empty configuration files for the different customer environments that are not hosted by Dennemeyer and that would need different settings. I went to the implementation team and told them about it. I explained that they would need to ask our customers to provide the right settings if they wanted to have this feature available. I explained where the config file was and what settings they would have to ask for, and of course I made sure they knew the trick about the From address / user name. I also documented the whole thing by quickly writing a page on our development wiki.
The guys came back to me asking what would happen if a customer did not provide proper settings. They wanted the system to disable the feature if no proper config was available. So I quickly implemented a check on the config file and told them as soon as I was done, so they could verify it was working fine.
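That check can be sketched as follows, again in Python with an assumed INI-style config file (the real product is a .NET application with its own configuration format; the file name and keys here are hypothetical):

```python
import configparser
from pathlib import Path

def load_smtp_settings(path="smtp.config"):
    """Return the SMTP settings, or None when the feature must stay disabled.

    Customers that do not provide proper settings simply do not get the
    scheduled-report-by-e-mail feature, instead of getting runtime errors.
    """
    cfg_file = Path(path)
    if not cfg_file.exists():
        return None  # no config file provided at all
    parser = configparser.ConfigParser()
    parser.read(cfg_file)
    if not parser.has_section("smtp"):
        return None
    smtp = parser["smtp"]
    required = ("host", "port", "user", "password")
    if any(not smtp.get(key) for key in required):
        return None  # incomplete settings -> feature disabled
    return {key: smtp[key] for key in required}

# The feature is enabled only when a complete configuration is available.
email_feature_enabled = load_smtp_settings() is not None
```

The design choice is to fail quietly into a disabled state rather than raise errors, which is what the implementation team asked for.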
This small story shows how the feature development involved different people with different profiles and responsibilities, and not only developers. It was initiated by a stakeholder, included in the backlog by the product owners, and developed by developers in coordination with the IT and implementation teams. It was tested and accepted by our dedicated tester and the product owners, who proposed improvements that were made on the fly. And just before the item was moved to Ready to deliver, the implementation team also proposed a small improvement: disabling the feature when no SMTP configuration is available.
Different people, with different roles and different interests, participated in the development of the feature, ensuring a high-quality and fast release. Software engineering, quality assurance and technology operations people were involved, using direct communication channels as much as possible and therefore avoiding long feedback loops.
All of this has been happening on this project for a while now. Without even realizing it, and without formalizing any methodology, we have included operations and delivery in our software development process, and using Kanban made it even easier to achieve. In the end, I think that is what DevOps is all about.