That is probably not news to anyone who has worked with data. If it is news to you, I’m glad you read the title.
“But we now have all the data science and big data technologies that solve it, so what are you on about?”
Do you remember those movies where the heroes are trying to find a treasure? They get through a maze to a cave and, after overcoming several traps, they reach the treasure chest. They grab the lid and slowly open it, with light shining on our heroes. For whatever reason, there is a light bulb in there. The camera moves inside the chest and you watch several surprised faces looking in. Do you know what the treasure was? The friendship that was created on the way, of course. It was in them the whole time. Unfortunate for Joey, whose head was chopped off by a massive axe dropping from the ceiling.
Anyway, where were we? Managing data. Right…
Same for data management. It is in you the whole time and it is unlikely that another technology will sort out your problem. Think about some basic piece of your data. For example, the address of one of your clients. Now take a stab at how many copies of it you have. Is it 1? Maybe 5? What about a thousand? Do you think that’s ridiculous? Maybe.
“We have it in our CRM system and it’s our source of truth, so we have it just once.” How did you obtain the address? Did they send it to you by email? In a personal conversation? Or do you have it on a paper contract? How did you get it into the CRM system? Did it come from some other system? What is the integration like – a manual monthly bulk upload or automated real-time replication?
All of a sudden, you might have it in an email, on paper, in a bulk upload file stored on a shared file system and on a staff member’s personal laptop.
Then you might have it in your order system, your service desk and with some third party that manages shipping. All of those places will have it in a slightly different format. Why? Because every single system and every move of data from one place to another was set up by different people working independently, so it would be pure luck to have it consistent everywhere.
Once in a while you run a report or an export, which you save or send somewhere. So again, it will be in emails, Office 365, laptops, mobile phones. Now back up each device on a daily basis and you easily have a thousand copies. #winningTheArgument
Now when a client changes address, what happens? Where do you change it? Do you delete and update everything?
No, I don’t want you to modify your backups, calm down.
Or do you change it only in the place where you immediately need it? Does everyone know that they can’t use any of the versions in emails, files and other systems? And as with any change, a new accidental mistake can be introduced, such as a typo or pasting the postcode into the address field and the other way around. So now you might have the address two thousand times in 7 versions, while none of them is 100% correct. The post office copes with some typos, so you don’t even have to fix all of the mistakes, do you?
You hire a new administrator – how can they decide which address is correct? Fun, huh?
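To make the “many copies, several versions” point a bit more concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the system names, the sample address and the `normalise` and `count_versions` helpers are my assumptions, not something from a real CRM. It only shows that even a handful of copies collapses into more than one version, and that no amount of lowercasing tells you which version is the correct one.

```python
from collections import Counter

# Hypothetical copies of one client's address, as different systems might hold it.
copies = [
    ("crm", "12 High Street, Leeds, LS1 4AB"),
    ("email", "12 high street,  Leeds LS1 4AB"),
    ("order_system", "12 High St, Leeds, LS1 4AB"),
    ("shipping_partner", "LS1 4AB, 12 High Street, Leeds"),
    ("laptop_export", "12 High Street, Leeds, LS1 4AB"),
]


def normalise(address: str) -> str:
    """Deliberately naive: lowercase, drop commas, collapse whitespace."""
    return " ".join(address.lower().replace(",", " ").split())


def count_versions(copies):
    """Count how many copies share each normalised form of the address."""
    return Counter(normalise(address) for _, address in copies)


versions = count_versions(copies)
print(f"{len(copies)} copies, {len(versions)} versions")
for text, count in versions.most_common():
    print(count, text)
```

The naive clean-up merges the copies that differ only in case, commas and spacing, but “St” versus “Street” and a postcode in the wrong place still count as separate versions. Someone still has to decide which one is right.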
So there are a lot of people, systems and processes which are fairly independent, because they have not been designed around data but around operational needs, responsibilities and organisational structure. Which, to be fair, makes sense. There is no point in having shiny data when you are not able to ship a toothbrush to your neighbour. You need to make sure that someone orders it from a manufacturer, picks it up and pays for it, puts it in a warehouse, takes the customer’s order, packages it, ships it and delivers it. Plus someone needs to deal with the complaint that the bamboo toothbrush with charcoal fur (or whatever it is called) does not fix 50 years of neglect in one single movement.
It’s all in your hands, and your hands only.
I went off on a completely different tangent than I originally intended, so there will be some more related posts coming. It’s also been a while and I know my man PD can’t wait to read something.