Tag Archives: Data

What we can know

What we can know already concerned Socrates, who according to Plato said: “I know that I do not know” (more literally: I know as a non-knower, οἶδα οὐκ εἰδώς, oîda ouk eidōs). Heinrich von Pierer also had a view on his company’s knowledge: “If Siemens only knew what Siemens knows.” Despite all this attention, it is often overlooked that knowledge only becomes tangible indirectly, through appropriate media (paper, electronic media, and channels). What takes place is not an exchange of knowledge but the copying of data. The sender does not hand over their knowledge but offers formulations distorted by filters (see Meta model of language). It is the receiver who interprets the data. The continuous evolution of our knowledge reinforces the distortions: what is true today may be false tomorrow. In 2017, Kellyanne Conway pointed in the right direction by talking about alternative facts. In the last century, Niklas Luhmann described information, communication, and understanding as contingent occurrences, i.e., a statement is neither necessary nor impossible and could always be otherwise. With this insight, what can we still know today?

If we set aside the sender’s intentions for a moment and look only at the content, the knowledge, a few questions remain that must be clarified before we can criticize one another: What is knowing? How does it differ from not knowing? The following points provide food for thought.

  • Sign – Data – Information – Knowledge – Wisdom
    Since IT accelerated the processing of data, we have faced the question: What is information? After the turn of the millennium, knowledge management answered this question with the knowledge pyramid (see also meme units). On the lowest level are the signs (icons, pictures, light or sound signals), which are represented on the next level as data (e.g., 101010). As soon as the data is placed in a context of meaning on the level above, information results (e.g., 42). Information accumulated over time then links up into knowledge on the next layer (e.g., the ultimate answer to the question of life, the universe, and everything), which becomes wisdom at the top (e.g., science fiction provides exciting ideas). There are no interfaces for assessing the top three levels by factual means. Nobody can look into another person’s head.
  • Seeing is believing – is knowing?
    In the pre-Internet era, the fourth estate, the media, alongside the executive, legislative, and judicial branches of government, controlled how circumstances were interpreted. Things that were written in black and white or existed as a photo or film were accepted as given. In the absence of accounts from other eyewitnesses, no doubts arose. Today, everyone has a cell phone, access to social media, and even their own website. This makes everyone a publisher with global reach, without universally accepted values and in a legal vacuum with only national regulations that are difficult to enforce. Take Wikipedia as an example: it only publishes articles that are relevant to an encyclopedia, verifiably published elsewhere, and compiled by the author. This excludes undocumented knowledge. Thus, we understand only half.
  • Is oral tradition knowledge?
    Writing was invented just over 7000 years ago. Before that, knowledge was passed on for millennia by oral tradition. In Fahrenheit 451, Ray Bradbury even imagined a bookless future in which the classics of world literature are again passed on orally. Such spoken words, for example, would not find their way into Wikipedia. Limiting our knowledge to data stored on a physical carrier would stultify us. The truth would be left to the authors, photographers, filmmakers, and other archivers without publication. Only when thoughts are externalized into a medium do they become knowledge. Thus, the world does not know what the world knows.
  • Personal explanations
    Knowledge is formed exclusively in the mind of the observer. The personal environment, learned models of thinking, experiences made over time, and unconscious feelings are decisive for the respective interpretation. This leads to different facts, as the above interpretation by Douglas Adams or the other reading by Lewis Carroll, namely any signification, shows. It was new to me that 42 is also a primary pseudoperfect number. And who knows that it is Frank’s current age? Since the receiver determines the meaning, the sender’s intent gets lost because it cannot be transmitted. The world will have to put up with alternative facts, willy-nilly.
  • Confirmed facts – if so, how many proofs
    The description of a situation depends on the standpoint. Different views automatically result in alternative realities that are each coherent in themselves. Moreover, observers standing close together tend to agree on a description due to unconscious group pressure, although they may have observed something different. Do many observers make a fact better? If so, how many must there be for it to count as known? For the classical media, at least two concurring sources are enough to adopt a fact. Do isolated observations not enrich the knowledge of the world?
  • Knowledge gets crafted by science
    Knowledge is located in science. The various disciplines have expanded their echo chambers to such an extent that overarching approaches now seek to integrate different pieces of knowledge, for example when engineering incorporates knowledge from biology into the new field of bionics. The advantage of science comes from the extensive evidence gathered in laboratories or in the field and published in studies. It becomes difficult with phenomena that cannot be measured, such as the effects of homeopathic compounds. The natural reaction of scientists to such black-box effects is rejection because they cannot be detected. The proponents of such (non-)knowledge are disparaged as esotericists, mystics, sectarians, or conspiracy theorists. Do we exclude the wisdom of a shaman or of Socrates from world knowledge?
  • Only verity is knowledge
    With his principle of falsification, Karl Popper dispelled the misconception that only confirmed facts are true. Certainty is only obtained when a claim is disproved. And how do we classify the broad field of literature? Is it knowledge if we know the characters of the Human Comedy or of Steppenwolf? Are fictional stories true? Does an unusual perspective lead to special knowledge? Can this knowledge be questioned because the majority perceived it differently? In the end, all facts are remarkable and valid; they simply turn their attention to different aspects. Even statements that deliberately falsify create knowledge (in this case, one would need to know why). Doesn’t world knowledge include everything?
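
The step from data to information in the knowledge pyramid can be made concrete with a small sketch. It assumes Python and two hypothetical helper functions, as_number and as_character, which are introduced here only for illustration. The same signs, 101010, turn into 42 when read as a binary number and into an asterisk when read as a character code; the context decides, not the data.

```python
# Illustrative sketch: identical data, two contexts of meaning.
bits = "101010"  # the data level from the pyramid example


def as_number(bit_string: str) -> int:
    """Read the bit string as an unsigned binary integer."""
    return int(bit_string, 2)


def as_character(bit_string: str) -> str:
    """Read the bit string as a character code point."""
    return chr(int(bit_string, 2))


number = as_number(bits)
character = as_character(bits)
print(number, character)  # prints: 42 *
```

Whether 42 is then the answer to everything or just a number is decided on the levels above, in the mind of the reader.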

Bottom line: The intangible nature of knowledge makes it difficult to classify. The different data formats tell us nothing about where the knowledge comes from, except: out of a person’s mind. Do you understand the sentence? In Japanese: この文を理解していますか?; or in Arabic: هل تفهم هذه الجملة؟. We are open to all kinds of knowledge as soon as we see or hear it. What information we understand, however, depends on us and on our wealth of experience. As group beings, we tend to follow the many, which explains the influx to conspiracy theorists. Still, you should ask yourself whether a so-called conspiracy theory is not also knowledge. After all, we also believe the results of scientists without being able to check them. Libraries are full of scientific papers describing our state of knowledge, but what is declared valid today may be outdated tomorrow. Even truth does not help us, because everyone has their own reality, which fits coherently into their own concepts. What we can know is that many views cannot be reduced to one fact. It is much more important to avoid holy wars, because it lies in the nature of propositions that they could be different. We have to learn to deal with alternative facts instead of negating them categorically.

What’s the problem with data?

An essential aspect of digital transformation is the representation of all of business and private life in databases as ones and zeros. The resulting transparency creates a feeling of insecurity as long as one has not consciously dealt with the topic. That’s why we should be aware of the flood of data. IDC predicts a data volume of 40 zettabytes by 2020 (http://ow.ly/Ao5v7). That is 40,000,000,000,000,000,000,000 bytes. Novel algorithms allow evaluations even without a clear idea of the question. With this amount of data, it becomes more and more difficult to derive meaningful insights. At the same time, valuable information can be identified in a targeted manner. So what’s the problem with data?
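
The size of the predicted 40 zettabytes can be checked with a few lines of Python. The sketch assumes the decimal SI prefix (1 zettabyte = 10^21 bytes). The comparison with 1-terabyte drives is a hypothetical yardstick added here for illustration and is not part of the IDC forecast.

```python
ZETTA = 10**21  # SI prefix zetta (decimal, assumed here)
TERA = 10**12   # SI prefix tera, used for the hypothetical 1 TB drive


def zettabytes_to_bytes(zettabytes: int) -> int:
    """Convert a volume in zettabytes to bytes."""
    return zettabytes * ZETTA


total_bytes = zettabytes_to_bytes(40)
drives = total_bytes // TERA  # number of 1 TB drives needed

print(f"{total_bytes:,} bytes")    # 40,000,000,000,000,000,000,000 bytes
print(f"{drives:,} drives of 1 TB")  # 40,000,000,000 drives of 1 TB
```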

Let us remember where the data comes from.

  • The connection data is continuously stored in the telephone network – including geographical movement data.
  • The use of every EC card triggers the storage of the complete payment process.
  • Each use of a credit card triggers a series of checks and stored records.
  • The use of the Internet leaves its traces in the systems of access providers, network operators, service providers and, in case of mobile access, with all “eavesdroppers”.
  • A purchase leads to the storage of customer data: less personal if no personal data has been provided, very personal with, e.g., a customer card that links all interactions to the user.
  • Wish lists, which remember all products for a possible, future purchase or other use, are available to the providers at any time.
  • Free offers are often linked to the e-mail address.
  • Reading e-books stores the reading habits.
  • Media providers, like YouTube or Spotify, remember the status and history of usage.
  • On social media platforms, users openly share whatever comes to their mind.
  • Electronic logbooks store all movements of a vehicle.
  • Books, films, music, photos, sculptures, in short, any cultural asset can be documented digitally.
  • The residence data is stored in the registration offices.
  • Once a year, the tax office processes income data.
  • Current camera systems allow surveillance of public space with face recognition and can be connected to a system, as in the Social Credit System in China, in order to observe and reward good conduct.

These are just examples of how our world has become digital and virtual. All this data can be retrieved from anywhere at any time. Data can be copied and passed on at will, without losing the “original”. In the past, data was tied to the physical object of storage, such as a book, picture, or record. Without the “data carrier”, one no longer had the data. With the virtualization of data in ones and zeros, this physical dependency becomes history. The free flow of digital data leads to new questions regarding ownership, the related rights and obligations, and the associated security options. When books were burnt in the past, every further use went up in smoke. Nowadays, a purchased e-book can already become unreadable when the operating system is changed. The rights of use dissolve unintentionally. Customers have to buy cultural goods again when they change media. Here a new, fertile ground is evolving for content providers.

According to copyright law in Germany, the data belongs to the originator, i.e., the artist or author who created it. Authorized users, i.e., the buyers of data for personal use, have only a right of use. Anything beyond that is illegal.

  • Commercial use
    Data found on the Internet may not be used or resold for commercial purposes without the written consent of the author. This means that books, articles, and website content, as well as all other types of media such as films and videos, pictures, music, or sounds, may only be consumed. The creation of user profiles, for example for purchase suggestions, falls into a legal grey area. Passing on the data, with or without a fee, is illegal.
  • Competitive intelligence
    The collection and linking of publicly available data for the purpose of competition analysis is a particular form of commercial exploitation that is difficult to detect. The data can be found on company websites, on news sites, and above all on social networks, for example when an employee posts mundane details from daily work and business trips. Linking this data with other data unveils the strategies, plans, or even problems of an enterprise.
  • Governmental use
    Using the same mechanisms as competitive intelligence, such as data mining, big data, or business intelligence, government agencies can process the available data. They collect it and link it with the telecommunications data they can access. The descendants of the dragnet search thus obtain user profiles that go far beyond people’s self-image.
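
How little it takes to link data from different sources can be shown with a minimal sketch. Everything in it is hypothetical: the three record lists, the field names, and the function link_records are invented for illustration and do not describe any real system. The only assumption is that the sources share a common identifier.

```python
from collections import defaultdict

# Hypothetical, invented records from three separate sources.
# The shared field "person" is the assumed common identifier.
phone_records = [
    {"person": "person-1", "cell": "cell-17", "time": "08:05"},
    {"person": "person-2", "cell": "cell-03", "time": "09:30"},
]
card_payments = [
    {"person": "person-1", "shop": "bookstore", "amount": 19.90},
]
social_posts = [
    {"person": "person-1", "text": "On a business trip again"},
    {"person": "person-2", "text": "New project kicked off"},
]


def link_records(sources):
    """Group the records of all sources by their shared identifier."""
    profiles = defaultdict(list)
    for source_name, records in sources.items():
        for record in records:
            # Keep everything except the identifier as profile detail
            details = {k: v for k, v in record.items() if k != "person"}
            profiles[record["person"]].append((source_name, details))
    return dict(profiles)


profiles = link_records({
    "phone": phone_records,
    "card": card_payments,
    "social": social_posts,
})

for person, entries in profiles.items():
    print(person, entries)
```

Each source on its own says little. Only the linked profile shows where someone was, what they bought, and what they wrote.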

Bottom line: The huge amount of data can be evaluated according to specific questions. However, this involves high costs for personnel and computers and requires dedicated attention, all of which are only available in special cases. This is best seen in reading recommendations, which are based on user behavior and oriented to the clicks on a website, but clumsily promote products that were already purchased or even offer authors their own books as a reading recommendation. Unauthorized re-use by companies and the government goes unnoticed. It starts with looking up new contacts and the related background research, which is possible for everyone; it extends to Cambridge Analytica’s election manipulations and goes far beyond the possibilities that we can imagine. This professional use of data by interest groups and government agencies is the real problem with data.