Data science in a nutshell
Data science is not rocket science. Although a data scientist needs many skills in different areas, you don’t need to be an omniscient unicorn to take advantage of some parts of data science.
Definition of data science
According to Wikipedia:
“Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics, similar to Knowledge Discovery in Databases.”
According to Peter Naur:
“The science of dealing with data, once they have been established, while the relation of the data to what they represent is delegated to other fields and sciences.”
According to the Journal of Data Science:
“By ‘Data Science’ we mean almost everything that has something to do with data: Collecting, analyzing, modeling…”
Some practical advice
- Do use experiments and A/B tests when possible
You make a lot of decisions every day, and the effects of those decisions cost you money. If you want to make the best decision, you’d better take A/B test results into account.
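To make "take A/B test results into account" a bit more concrete, here is a minimal sketch of one common way to compare two variants: a two-proportion z-test on conversion rates. The function name, the visitor counts, and the conversion counts are made up for illustration, and the test relies on a normal approximation that is only reasonable for fairly large samples.

```python
from math import sqrt, erfc


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b: number of conversions in variant A / B (hypothetical names)
    n_a / n_b: number of visitors who saw variant A / B
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that both variants perform the same.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Made-up numbers: 1000 visitors per variant, 100 vs. 130 conversions.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A small p-value suggests the difference between the variants is unlikely to be pure chance. It does not tell you how large or how valuable the difference is, so look at the conversion rates themselves as well.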
- Do calculate median, not just mean
A common mistake is to calculate only the arithmetic mean and ignore the median. If your data contains outliers (and it almost always does), the mean can be misleading, because it is pulled toward the outliers. In most situations, the median is a much better indicator of a typical value.
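A tiny example shows how much a single outlier can move the mean. The order values below are invented for illustration; the code uses only Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical order values in dollars; the last one is an outlier.
order_values = [20, 22, 25, 27, 30, 1000]

average = mean(order_values)
typical = median(order_values)

print(f"mean:   {average:.2f}")   # about 187.33, pulled up by the 1000 order
print(f"median: {typical:.2f}")   # 26.00, close to what most customers spend
```

Five of the six orders are 30 dollars or less, yet the mean suggests a typical order of almost 190 dollars. The median stays near the bulk of the data.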
- Do gather as much data as you can from your own business
You have more data than you might imagine. Gmail, Google Analytics, and CRM software all have functions to export data to a spreadsheet. You may also have customer surveys, or data from your fitness app or smart bracelet, and so on. You should gather data from different sources, because data storage is easy and cheap. Even if you now think that this data is unnecessary, you may later realize that you can take advantage of the historical data you stored previously.
- Do not think that correlation is causation
If you look at a chart or a scatterplot, you may notice a relationship between two things. This relationship helps you understand your business and helps you predict the future, but never assume that one causes the other! (There may be a hidden factor that causes both.)
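The hidden-factor case is easy to reproduce with simulated data. In the sketch below, all numbers and variable names are invented: temperature drives both ice cream sales and sunburn cases, and neither of the two affects the other. They still end up strongly correlated.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


random.seed(0)  # fixed seed so the simulation is repeatable

# The hidden factor: daily temperature in degrees Celsius.
temperature = [random.uniform(10, 35) for _ in range(500)]

# Both quantities depend on temperature plus independent noise,
# but not on each other.
ice_cream_sales = [5 * t + random.gauss(0, 10) for t in temperature]
sunburn_cases = [2 * t + random.gauss(0, 5) for t in temperature]

r_raw = pearson(ice_cream_sales, sunburn_cases)

# Remove the temperature effect. We can do this exactly here only
# because we built the data ourselves and know the coefficients.
ice_cream_rest = [s - 5 * t for s, t in zip(ice_cream_sales, temperature)]
sunburn_rest = [c - 2 * t for c, t in zip(sunburn_cases, temperature)]
r_controlled = pearson(ice_cream_rest, sunburn_rest)

print(f"correlation, raw:                 {r_raw:.2f}")
print(f"correlation, temperature removed: {r_controlled:.2f}")
```

The raw correlation is high, but once the shared cause is removed, almost nothing is left. With real data you rarely know the hidden factor in advance, which is one more reason to run experiments when you can.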
- Do visualize your data
Humans evolved to understand visual information, not numbers. So if you want yourself and others to understand the story behind your data, you must visualize it.
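Even a very rough picture can say more than a column of numbers. As a minimal sketch, the hypothetical helper below draws a text histogram using only the standard library; the session lengths are made up, and the helper assumes whole-number values. For real work you would more likely use a charting tool.

```python
from collections import Counter


def text_histogram(values, bin_width):
    """Return one text line per bin, with one '#' per value in that bin."""
    # Map every value to the lower edge of its bin, then count per bin.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    # Walk over all bins, including empty ones, so gaps stay visible.
    for start in range(min(counts), max(counts) + bin_width, bin_width):
        end = start + bin_width - 1
        lines.append(f"{start:>3}-{end:<3} {'#' * counts[start]}")
    return lines


# Hypothetical session lengths in minutes.
session_minutes = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 12, 14, 31]

for line in text_histogram(session_minutes, 5):
    print(line)
```

In the printed chart, most sessions pile up in the first bin, and the single 31-minute session stands apart after a run of empty bins. That pattern is much harder to spot in the raw list.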
- Do not listen to your intuition
Your intuition is just the output of the neural network in your brain, which was trained on a very limited amount of data and was shaped mainly to help you survive, not to get rich.
Want to learn more?
- Read our blog at https://answerminer.com/blog
- Watch TED Talks at https://www.ted.com/playlists/56/making_sense_of_too_much_data
- Take courses on Coursera at https://www.coursera.org/browse/data-science
- Take courses on edX at https://www.edx.org/course?search_query=data+science
- Grab your own data and try to mine it for information using different tools like Excel, Tableau, R, SPSS or AnswerMiner