What’s the problem with data?

An essential aspect of digital transformation is the representation of all business and private life in databases as ones and zeros. The resulting transparency arouses a feeling of insecurity as long as one has not consciously dealt with the topic. That’s why we should be aware of the flood of data. IDC predicts a data volume of 40 zettabytes by 2020 (http://ow.ly/Ao5v7). This is 40,000,000,000,000,000,000,000 bytes. Novel algorithms allow evaluations even without a clear idea of the question. With this amount of data, it becomes more and more difficult to derive meaningful insights. At the same time, valuable information can be identified in a targeted manner. So what’s the problem with data?

Let us remember where the data comes from.

  • Connection data is continuously stored in the telephone network, including geographical movement data.
  • Every use of an EC card triggers the storage of the complete payment transaction.
  • Each time a credit card is used, it triggers a series of checks and data storage operations.
  • The use of the Internet leaves its traces in the systems of access providers, network operators, service providers and, in the case of mobile access, with all “eavesdroppers”.
  • A purchase leads to the storage of customer data: less personal if no personal data has been provided, very personal with, for example, a customer card that links all interactions to the user.
  • Wish lists, which record products for a possible future purchase or other use, are available to the providers at any time.
  • Free offers are often linked to the e-mail address.
  • Reading e-books records the user’s reading habits.
  • Media providers, like YouTube or Spotify, remember the status and history of usage.
  • On social media platforms, users openly share whatever comes to mind.
  • Electronic logbooks store all movements of a vehicle.
  • Books, films, music, photos, sculptures: in short, any cultural asset can be documented digitally.
  • Residence data is stored at the registration offices.
  • Once a year, the tax office processes income data.
  • Current camera systems allow surveillance of public spaces with face recognition and can be connected to a system, such as the Social Credit System in China, that observes and rewards good conduct.

These are just examples of how our world has become digital and virtual. All this data can be retrieved from anywhere at any time. Data can be copied and passed on at will without losing the “original”. In the past, data was tied to the physical object of storage, such as a book, picture, or record. Without the “data carrier”, one no longer had the data. With the virtualization of data in ones and zeros, this physical dependency becomes history. The free flow of digital data raises new questions regarding ownership, the related rights and obligations, and the associated security options. When books were burnt in the past, every further use went up in smoke. Nowadays, a purchased e-book can become unreadable simply because the operating system changes. The rights of use dissolve unintentionally. Customers have to buy cultural goods again when they change media. Here, new and fertile ground is emerging for content providers.

According to copyright law in Germany, the data belongs to the originator, i.e. the artist or author who created it. Authorized users, i.e. those who buy data for personal use, have only a right of use. Anything beyond that is illegal.

  • Commercial use
    Data found on the Internet may not be used or resold for commercial purposes without the written consent of the author. This means that books, articles or website content, as well as all other types of media such as films/videos, pictures, music or sounds, may only be consumed. The creation of user profiles, for example for purchase recommendations, falls into a grey area of legal use. Passing on the data, with or without a fee, is illegal.
  • Competitive intelligence
    The collection and linking of publicly available data for the purpose of competitive analysis is a particular form of commercial exploitation that is difficult to detect. The data can be found on company websites, on news pages and above all on social networks, for example when employees report mundane details of their daily work and business trips. Linking this data with other data reveals the strategies, plans or even problems of an enterprise.
  • Governmental use
    Using the same mechanisms as competitive intelligence, such as data mining, big data or business intelligence, government agencies can process the available data. They collect it and link it with the telecommunications data they can access. The descendants of the dragnet search thus obtain user profiles that go far beyond people’s self-image.

Bottom line: The huge amount of data can be evaluated according to specific questions. However, this involves high costs for personnel and computers as well as focused attention, which are only available in special cases. This is best seen in reading recommendations, which are based on user behavior and oriented to the clicks on a website, but clumsily promote products that have already been purchased or even offer authors their own books as a reading recommendation. Unauthorised re-use by companies and the government goes unnoticed. It starts with finding new contacts and the related background research, which is possible for everyone. It extends to Cambridge Analytica’s election manipulations and far beyond the possibilities that we can imagine. This professional use of the data by interest groups and government agencies is the real problem with data.