“We must be an innovative and data-driven company.” Let’s go! This is the mantra of many companies today. They all have in mind the model of GAFA and similar companies, where innovation is a key driver of the organization and data is everywhere. Looking closer, how could it be otherwise? The concept is not really new: companies have always needed to be more creative than their competitors, to measure their performance, and to know their customers. The only difference is that nowadays everything goes faster. It is now clearly “about fast companies eating slow ones”.
The result of this mantra? New departments and management teams have been created to address these areas. It is now common to find, alongside the product/engineering organization, a data organization and an innovation organization. Each has dedicated teams, which can be sketched as follows (many other organizations are possible):
- Data organization: Data Governance, Data Collection, Data Engineering, Data Analytics, Data Science…
- Innovation organization: Sourcing, Technical team, Innovation advocate…
Organizations are now able to have dedicated people working and delivering on these fundamental subjects. But does it work that way and deliver value as expected?
The first question to answer may be “what is really expected from an innovative and data-driven company?” On the data side, is it about gathering as much data as possible and providing all the company KPIs in one central place, as we tend to see in current implementations? Not sure… When you ask about the real motivation, it is rather about offering a trustworthy data platform that is easy to access and easy to use, so that each team can quickly build and test new use cases in production.
This is exactly the same on the innovation side. The ability to identify an opportunity, develop a response, and validate it as quickly as possible in a real use case is the most important part.
Pitfalls of these organizations
On the data side, this organization has led to a dedicated team responsible for a global data lake, gathering and providing data for the entire company. This could enable KPI delivery and new use cases based on this new oil. But let’s analyse why companies are struggling to really get what they expect.
- The data team faces many functional domains; in fact, all the company's domains. What is functionally well decomposed at the product level is grouped here into a single place. This concentrates interpreted, second-hand knowledge in a small number of people. It leads to data quality and accuracy issues, and makes the skills inside the data team fragile.
- The data team faces many different customers from these domains, with very diverse demands. It is unlikely that one team can provide a relevant response to specific requests in so many business areas.
- The data team has lots of dependencies. To ingest the data, the team needs to work with every product team. But the relationship is unbalanced: how do you get what you need from teams that have little or no incentive to provide it?
- The opposite is also true. A product team that needs access to data to enable a new feature depends on the data team. And this does not make the process any easier.
- The data team manages data that it does not own. This creates a problem of customer trust and of accountability for data quality, as the team that provides the data does not own the original data source and is not in a position to know all its specificities.
As a result, this model scales poorly. As the organization becomes more complex and domain-based, the data team faces more and more demands. For each request, the team needs to learn the domain and depends on other teams. Its customers ask for accurate data, but the team does not own that data. This setup does not follow the flow of information.
On the innovation side, the dedicated organization sources issues by meeting with operational staff and develops answers. The difficulties here are:
- Capabilities to test: the technical architecture and the dependencies on product teams can make it very difficult to test answers in a production environment, especially when these answers need to be tested quickly and frequently to identify their impact.
- Different priorities: once the right answer has been identified, it is time for the transfer, which means giving ownership to the relevant product team to implement, deploy and maintain the final solution. This is rarely the main priority of the product team, which has its own backlog.
Being fast is key to being innovative. All these dependencies prevent the agility required to innovate.
Another way
None of the issues mentioned above is new, and they have already been addressed on the engineering side. Product teams bring together all the necessary skills within a well-defined scope, in order to limit dependencies and develop accountability. Why not use the same organization for data and innovation?
Let’s take a product organization broken down into tribes, each related to a well-defined domain and composed of several product teams. Why not distribute the data and innovation organizations within these tribes? Depending on the size of the organization, each tribe could have a dedicated data/innovation team, or spread data/innovation skills across its product teams.
On the data side, this results in a model where the content of the data lake is no longer managed by a centralized team. Instead, each tribe is responsible for exposing its own data in a way that the whole organization can use.
This organization could provide the following benefits:
- Ownership: Each tribe, which manages the operational products, owns the data it exposes and its innovation tests. This drastically reduces dependencies.
- Knowledge: The team in charge of the domain data belongs to the tribe that has the knowledge of this data.
- Prioritization: The roadmaps for the operational products, data products, and innovation projects related to the domain all sit within the same tribe.
Decentralizing the data and innovation teams does not remove the need for a central part of these organizations, whose role is to:
- Provide the necessary platforms to support distributed teams
  - #Data: scalable platform, data schema, data security, access control, data discovery, log management, …
  - #Innovation: inner source facilities, easy-to-use A/B test platform, …
- Define the governance
- Define methodologies and training
- Lead communities
- Define the tribes' data and innovation objectives
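To make the split of responsibilities more concrete, here is a minimal sketch of what the "data schema" and "data discovery" parts of such a central platform could look like. It is only an illustration under assumptions: the `DataProduct` and `Catalog` names, the tribe names, and the datasets are all hypothetical, and a real platform would add security, access control, and much more. The idea is that the central team only provides the registry, while each tribe registers and remains the owner of the data it exposes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class DataProduct:
    """Description of a dataset that a tribe exposes to the organization."""
    name: str
    owner_tribe: str           # tribe accountable for content and quality
    schema: Dict[str, str]     # column name -> type, declared by the owner
    description: str = ""


@dataclass
class Catalog:
    """Central registry: holds descriptions only, never the data itself."""
    products: Dict[str, DataProduct] = field(default_factory=dict)

    def register(self, product: DataProduct) -> None:
        # Names are unique, so every dataset has exactly one accountable owner.
        if product.name in self.products:
            raise ValueError(f"{product.name!r} is already registered")
        self.products[product.name] = product

    def owner_of(self, name: str) -> str:
        return self.products[name].owner_tribe

    def search(self, keyword: str) -> List[str]:
        # Naive discovery: case-insensitive match on name or description.
        kw = keyword.lower()
        return sorted(
            p.name
            for p in self.products.values()
            if kw in p.name.lower() or kw in p.description.lower()
        )


# Hypothetical tribes and datasets, for illustration only.
catalog = Catalog()
catalog.register(DataProduct(
    name="orders.daily",
    owner_tribe="checkout",
    schema={"order_id": "string", "amount": "decimal"},
    description="Confirmed orders per day",
))
catalog.register(DataProduct(
    name="shipments.events",
    owner_tribe="logistics",
    schema={"parcel_id": "string", "status": "string"},
    description="Parcel tracking events",
))

print(catalog.search("orders"))              # ['orders.daily']
print(catalog.owner_of("shipments.events"))  # logistics
```

In this sketch, a product team looking for data asks the catalog, finds the dataset, and knows immediately which tribe to talk to about its meaning and quality. No central team has to interpret the data on the owner's behalf.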
This of course brings other challenges, such as the interoperability of the different data domains, the discovery and understanding of the data, the duplication of data, or the coordination of the different innovation initiatives. But this model, by emphasizing distributed ownership, accountability, and closeness to a field of expertise, provides the tools to overcome the obstacles that currently prevent companies from getting the full value out of data and innovation.