Data is all around us. From what you wear (Apple Watch and Google Glass), to the sensors in your car recording where you go and when, to security cameras, the list of things around us that generate data is astonishingly long. We are swimming in all of the ‘Everything’ that we produce, log and record every day. It is estimated that 2.5 quintillion bytes of data are created every single day, that by 2015 the amount of data crossing the internet every five minutes or so will be equivalent to the total size of all movies ever made, and that annual internet traffic will reach a zettabyte.
Let’s put that in perspective. A quintillion bytes would fill roughly 31 million 32-gigabyte iPads, and a zettabyte is roughly 200 times the total size of all words ever spoken by humans. That is a massive volume of information being generated and stored, usually in proprietary data warehouses available to only a few. Most of it is passive, unstructured and almost wholly lacking in context. It is dirty big data, collected because it is there and really just taking up space while it waits for the right question to be asked.
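For anyone who wants to check the scale for themselves, here is a minimal sketch of the arithmetic. It assumes decimal units (1 gigabyte = 10^9 bytes) and an iPad with all 32 gigabytes free for storage; the constant and function names are purely illustrative.

```python
# Rough scale check, assuming decimal (SI) units: 1 GB = 10**9 bytes.
QUINTILLION_BYTES = 10**18   # one quintillion bytes (an exabyte)
ZETTABYTE = 10**21           # one zettabyte
IPAD_BYTES = 32 * 10**9      # a 32 GB iPad, assuming every byte is usable


def ipads_needed(total_bytes: int) -> int:
    """Number of 32 GB iPads needed to hold total_bytes, rounded up."""
    return -(-total_bytes // IPAD_BYTES)  # ceiling division on integers


daily_bytes = 25 * 10**17    # 2.5 quintillion bytes, the estimated daily output

print(f"{ipads_needed(QUINTILLION_BYTES):,}")  # 31,250,000
print(f"{ipads_needed(daily_bytes):,}")        # 78,125,000
print(f"{ipads_needed(ZETTABYTE):,}")          # 31,250,000,000
```

Integer arithmetic is used throughout so that the very large numbers stay exact rather than being rounded as floating-point values.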
The concept behind “open data” first appeared in 1995, with the idea of sharing geophysical and environmental data across countries and borders. A group of authors from an American scientific agency released a report promoting an open and transparent exchange between countries in order to better analyze and understand the global phenomena they were studying. What it came down to, and what it has since embraced, was the idea that knowledge shared amongst all is beneficial to all, and that the open and free dissemination of information is of massive benefit not only to the individual but also to the masses.
Governments, government agencies, private enterprises and individuals around the world are becoming more and more wary of the amount of private and personal data being collected every day. The collection of data in and of itself is not the main concern. The worry is that we, as consumers and private individuals, don’t always know what is being collected or for what purpose it is being used.
The volume of data that is being stored and collated gives businesses, governments and individuals an unprecedented power to understand, analyze, and ultimately change the world we live in. However, it also gives them an immense opportunity to do us harm. The bigger and wider this digital world gets, the more of our lives will be laid bare, for better or for worse. Perhaps there is nothing we could do about this slow march of progress, even if we wanted to. However, in knowing what is being collated and how it is being used, there is accountability and transparency.
Open data is perhaps the most effective way we have to ensure that the power inherent in the possession of large amounts of data is kept honest. By sharing and disseminating big data, passive data, small data and archival data, we have an unprecedented opportunity to change the world we live in. Organizations like Code4SA are forming all over the world: like-minded individuals coming together with a shared vision, people who specialize in the very human stories that can quite often be found in the numbers. Those numbers can only be analyzed and used to tell the story of a nation if the data behind them is freely accessible and open to all. A perfect example: using the data available from StatsSA, the Director of Code4SA very eloquently told the story of marriage in South Africa, using numbers and an XKCD-style illustration. There is the disturbing tale of how many girls under 20 marry men older than 60. There is the heartwarming fairytale of the 92-year-old bride and her 94-year-old beau. And the landscape in between tells the tale of sugar daddies, sugar mommies, children getting married and adults getting hitched.
Non-profit organizations like Code4SA and Code for Africa are determined to create a landscape where all data, but especially governmental data, is shareable, usable and available to all who may want to use it commercially, for civic information or any other purpose that they choose.
For a government, perhaps it can lead to a fairer and more honest reflection of democracy, one in which resources are channelled to the right places and departments and municipalities are held accountable. Only last year, the City of Cape Town approved its Open Data Policy, albeit with some disclaimers and hurdles still to clear. The city’s open data portal is new but is slowly being populated with budget data, ward boundaries and transport maps. Policies like these go a long way towards proving that a government or municipality is serious about transparency and change. When you are not afraid to have the numbers out there for people to see, you most certainly don’t have anything to hide.
And South Africa doesn’t fare too badly compared with more developed nations either. We rank 41st out of 86 in the World Wide Web Foundation’s annual Open Data Barometer report: a middling score with room for improvement, but at least the best in Africa.
To improve, we have to lobby all arms of government to adopt policies which emphasize publishing data wherever possible, rather than only on request. After all, if my tax money is being used to collect, collate and analyze information about me and my immediate environment, do I not have a fundamental right to access that data? I think so. Are the clinics in my area fully furnished with all the equipment they need? What is their safety record? Do the dams in the area I live in have any environmental issues? Does the area I am moving to have good schools and safe, well-established public transport? These are all things that an open data policy, unhindered by red tape, could reveal to the everyday person on the street.
However, effective open data policies reach further than just my back yard. Does the country I am going to visit in December have safe water? Does it have accessible and safe hospitals? Does the country where I may open a branch of my business have sound internet, broadband and telecommunications infrastructure?
Every day we walk a very fine line between who we truly are and who we are online. It is human nature to want to show only the best. Be the best. Sometimes even invent the best. That holds whether on a personal, governmental or business level.
Open data, and the successful implementation of open data policies, are a mirror. You cannot lie to a mirror. It shows the truth whether you want to see it or not.