The slides that follow are very general in nature - they present the 'big picture' concepts of Big Data. By nature, the content is, relatively speaking, soft/squishy/fluffy.
It *is* important to understand the context in which we will discuss data mining etc. in upcoming lectures; otherwise, the material will seem dry/irrelevant.
How many of these buzzwords do _you_ know? :)
Big Data has indeed become something of a catchphrase/buzzword. But we can provide an operational definition: Big Data is data that is 'too big' to be stored on a single machine and/or processed by a single machine. This definition is intentionally vague, to keep it relevant for the future as well.
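To make the operational definition concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Machine` class, the `exceeds_single_machine` function, the use of RAM as a crude stand-in for "can be processed by a single machine", and the example capacities.

```python
from dataclasses import dataclass

GB = 10**9
TB = 10**12


@dataclass
class Machine:
    """Hypothetical description of one machine's capacity."""
    storage_bytes: int  # disk capacity
    memory_bytes: int   # RAM, used here as a rough proxy for processing capacity


def exceeds_single_machine(dataset_bytes: int, machine: Machine) -> bool:
    """Return True if the dataset is 'too big' to store and/or process on this machine."""
    too_big_to_store = dataset_bytes > machine.storage_bytes
    too_big_to_process = dataset_bytes > machine.memory_bytes
    return too_big_to_store or too_big_to_process


# Illustrative capacities only, not taken from the slides.
laptop = Machine(storage_bytes=2 * TB, memory_bytes=16 * GB)
server = Machine(storage_bytes=100 * TB, memory_bytes=1 * TB)

print(exceeds_single_machine(500 * GB, laptop))  # fits on disk, but not in memory
print(exceeds_single_machine(500 * GB, server))  # fits on both counts
```

Note that the same 500 GB dataset is 'big' for the laptop but not for the server. That is the vagueness the definition deliberately keeps: the threshold moves as machines grow.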
Big Data is data that has:
In other words, it is data that is varied in nature (comprises diverse types), changes often, and comes in large quantities.
Big Data is not only big, but getting bigger at a rapid rate.
Big Data can result from:
Wikipedia: Datafication is a modern technological trend turning many aspects of our life into computerised data and transforming this information into new forms of value. Examples of datafication as applied to social and communication media are how Twitter datafies stray thoughts or datafication of HR by LinkedIn and others.
In other words, it is the notion that people, our built environment (e.g. the number of freeways in the US), etc. can all generate data.
"Once we datafy things, we can transform their purpose and turn the information into new forms of value."
IoT is the 'Internet of Things' - what if (almost) every lightbulb, tire, building, plane engine, bridge, fridge etc. had an IP address and a sensor, and transmitted data periodically through a network? Among other things, it would lead to an *explosion* of data :)
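A back-of-the-envelope calculation shows how quickly periodic sensor readings add up. The device count, reading size and reporting interval below are made-up round numbers chosen for illustration, not figures from the slides, and `daily_bytes` is a hypothetical helper.

```python
def daily_bytes(num_devices: int, bytes_per_reading: int, readings_per_day: int) -> int:
    """Total bytes produced per day by a fleet of identical sensors."""
    return num_devices * bytes_per_reading * readings_per_day


# Assumed scenario: one billion devices, each sending a 100-byte reading once a minute.
MINUTES_PER_DAY = 24 * 60
total = daily_bytes(
    num_devices=1_000_000_000,
    bytes_per_reading=100,
    readings_per_day=MINUTES_PER_DAY,
)

print(total / 10**12, "TB per day")  # 144.0 TB per day
```

Even with tiny readings, this assumed fleet produces on the order of a hundred terabytes every day, before any video, audio or image data is counted.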
Here is an IoT infographic:
Big Data can be quite useful if collected, analyzed and interpreted properly. Here are things that can be problematic:
Again, these are Big Data's characteristics:
What are things we can do now that we couldn't before?
* combine multiple sources of data (however small or seemingly insignificant) for a better 'bigger picture'
* exploit unstructured data - voice, video, images, tweets, blog posts...
* provide insights to [internal] frontline managers in near-real-time (to enable making more agile business decisions)
* experiment with the marketplace (fluid price-setting) as often as needed!
So here's what is new: better insight, quicker action.
According to IEEE (and others), a long time.
Here are some links:
We are at the start of a transformative phase, fed by our relatively new ability to collect, store, analyze and benefit from MASSIVE amounts of data from every walk of life.