Notes from building an internal developer platform, Part 1: Why are we here
I will start this part with a classic story that I first observed when I entered this industry around 13 years ago. I was working as a network engineer at a system integration company, and a common issue I encountered was deciding who to blame when an application was not functioning properly. Was it purely an application mistake? Or should the network or infrastructure take the blame? Engineers from both camps were constantly trading arguments with the other side, and this is where I came in: providing visibility into how the system worked so that problems could be tackled correctly.
Years passed, and architectural best practices in the industry shifted in response to growing demand for the functions and value that software provides. Enterprises and mid-sized startups began to embrace microservices in order to scale both their systems’ capabilities and their organizational structure. Reflecting the split between domains in both code ownership and governance has given many organizations the agility they need to respond to market demand.
However, that alone is not enough. Architectural change is a very expensive activity, and shifting to microservices also requires a shift in mindset. Thus, the DevOps movement was born, along with its famous mantra of “you build it, you own it”. And it’s actually a very valid state to aim for. Microservices give developers ownership of a specific part of the system, which means they should be able to change it as requirements dictate. But that also means it becomes harder to have a single team (let alone a single person) with complete knowledge of how the whole system operates, inside and out. Thus, operating the code should be part of its owner’s responsibility in order to achieve the promised agility.
As mentioned before, DevOps was intended to be a cultural and mindset shift, as opposed to a specific role within an organization. That being said, practically speaking, it’s not straightforward to achieve. “Ensuring that my code is logically correct” and “ensuring that my code can be integrated successfully into the system” are two different problems that require deep knowledge of at least two different domains. Thus, the DevOps role was born to fill the gap that keeps developers from successfully operating their own services. Initially, this means that the DevOps folks do a lot of Ops tasks on an ad hoc basis, and that is acceptable. However, if we stop there, we’re not moving past the “application vs network” narrative, only repeating it in a different context.
I ran a quick survey on Twitter and discovered that this situation may be common across many organizations. Out of 129 respondents, 54.3% of developers said that they aren’t confident enough to deploy their services without guidance from DevOps folks. This is of course relatable. Wrangling programming languages and frameworks is vastly different from wrestling with Kubernetes deployment configuration or a CI/CD pipeline. But then again, whatever happened to “you build it, you own it”?
All is not doom and gloom though. I’ve learned from the aforementioned survey that those who do feel confident mainly feel that way because their DevOps team took the time to build a system that enforces a specific workflow. As we’ve learned from many things in life, a standardized workflow will eventually push a certain culture to take hold, and that goes hand in hand with the DevOps vision. We refer to such a system as a platform: an entity that acts as an intermediary between developers and the underlying operational infrastructure, and that provides capabilities to help developers run and operate their services as independently as possible.
I could stop here, but that would feel very half-baked. A platform has many faces and perspectives that we can discuss. What exactly is a platform? How can it actually benefit developers? How can I build one? These questions, along with more intricacies that I discovered while managing one, will be discussed in the next entry in this series.