Data – the (very long-winded) second language we all speak

I was fortunate enough to attend SINET 61, a major cybersecurity event supported by ACSGN. It was a very interesting event (hot-button jobs alert: security, blockchain and IoT) with a lot of expertise in one room. Fundamental to many of the discussions was data. So what is data?

Data traditionally means pieces of information stored or transmitted by a computer. It's the universal language that most modern electronics speak to each other. Google it and you will see 1s and 0s, and lots of green screens.

Let’s ignore all that, and go back to where it all started… {cue dream sequence music :)}

Once upon a time, when computers first evolved into the shape we know today, most electronic sensors could only reliably read two voltage states: on and off. By convention these were labelled 1 and 0.

Now let’s relabel them as As and Bs, and the computer becomes just another person like us – albeit a tourist who doesn’t speak English. Just as in foreign languages, putting letters in different orders forms words that mean different things.

The downside of making words out of only two letters is that you are limited in how many meanings you can express in a given word length. A 2-letter word (AA, AB, BA, BB) only lets you represent 4 different ideas. In contrast, because English has 26 letters, 2-letter words give you 26 × 26 = 676 possible unique words.
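
To make the counting concrete, here is a rough sketch in Python (the language choice is my own illustration, not something from the original discussion) comparing how many distinct "words" a 2-letter alphabet and the 26-letter English alphabet can form at various lengths:

```python
# How many distinct "words" can an alphabet of a given size form
# at a given word length? It is simply size ** length.
def word_count(alphabet_size: int, length: int) -> int:
    return alphabet_size ** length

for length in range(1, 6):
    binary = word_count(2, length)    # computer alphabet: just A and B (1 and 0)
    english = word_count(26, length)  # English alphabet: 26 letters
    print(f"length {length}: 2-letter alphabet = {binary:>3}, "
          f"26-letter alphabet = {english:,}")

# At length 2 this prints 4 vs 676 -- the same numbers as in the text above.
```

Notice that by length 5 the two-letter alphabet already gives 32 combinations, enough to cover all 26 English letters – a hint of how the agreement described next becomes possible.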

In order to emulate the English alphabet (so that humans can interact with machines without having to talk gibberish), a group called the American National Standards Institute agreed on a set of strings of 0s and 1s to represent the alphabet and commonly used symbols. This standard is now known as the ASCII character set. The added bonus is that machines from different manufacturers can now understand each other, as long as they both operate on ASCII.
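
To get a feel for what that agreement looks like, here is a small Python sketch (again my own illustration, not part of the standard itself) that prints the ASCII code and bit pattern for a few characters:

```python
# Each character maps to an agreed-upon number (its ASCII code),
# which the machine stores as a string of 0s and 1s.
for char in "Aa!?":
    code = ord(char)            # the ASCII code point, e.g. 'A' -> 65
    bits = format(code, "08b")  # the same number written as 8 binary digits
    print(f"{char!r} -> {code:3d} -> {bits}")

# 'A' -> 65 -> 01000001
# 'a' -> 97 -> 01100001
# '!' -> 33 -> 00100001
# '?' -> 63 -> 00111111
```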

But what about the others...

At this point, I am expecting a lot of passionate people to write in and ask: what about the other standards? Yes, there are other standards out there, but fundamentally each of them is just a different agreed-upon set of strings of 1s and 0s.

It’s like the human world, where different people speak different languages. This causes a lot of issues when you try to take data from a machine that uses one standard to a machine that uses a different one, but that’s the subject of another post. 🙂

Every time you press a key on your keyboard, you are actually speaking to your computer in strings of 0s and 1s. You didn’t know you were speaking a second language every time you used a keyboard, did you?
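
And if you want to see that second language spelled out, here is one last Python sketch that turns a typed word into the stream of 0s and 1s your computer actually handles (the word "data" is just my example):

```python
# Spell out a typed word as the stream of bits the computer sees,
# using 8 bits per ASCII character.
word = "data"
bit_stream = " ".join(format(ord(char), "08b") for char in word)
print(f"{word} -> {bit_stream}")

# data -> 01100100 01100001 01110100 01100001
```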