What Machine Learning Engineers Understand that Others Don’t
I was recently asked on Quora, "What is something that machine learning engineers know that others don’t?" This seems like a fascinating question, so here is an exploration of some unique insights.
A Bit of Background
I have spent over 35 years in the field of data science, which used to be called data analysis or statistical analysis. I have worked for governments, universities, non-profits, and businesses, and have done some consulting as well. I got into the field more or less by accident. My undergraduate degree was in geophysics, but work in that area quickly dried up, so I honed my skills in mathematics, statistics, and computing. The transition from mainframe computers to personal computers and servers in the 1980s and 1990s opened up a lot of opportunities and led me to where I am today.
Common Misconceptions About Data Scientists
Many people believe that data scientists move into the field after studying something completely unrelated. While some do come from very different backgrounds, a significant portion have a STEM or mathematics-based education. For instance, a close relative of mine has a PhD in astrophysics and has since moved into data science professionally. Moving from another quantitative science into data science is quite common.
Hidden Depths of Data Science
Data science is not as monolithic as it might seem. While there are many technical matters that a data scientist must know, such as higher mathematics, statistical theory, and computer coding, these are just the tip of the iceberg. Here are some lesser-known insights:
Isolated Data Science Techniques
Data science techniques can be very useful for predictive purposes, but improving a model gets increasingly difficult as the required level of accuracy rises. More data and faster processing can help, but the gains do not necessarily scale linearly: each additional improvement tends to cost more than the last. This can be hard for people to accept, especially business people who want to leverage data science to make a lot of money.
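One simplified way to see this kind of diminishing return is the standard error of a sample mean, which shrinks with the square root of the number of observations rather than with the number itself. The sketch below is only an analogy for model error, not a real model; the function name `standard_error` and the noise level `sigma = 1.0` are assumptions chosen for illustration.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of a sample mean from n independent observations
    whose individual standard deviation is sigma."""
    return sigma / math.sqrt(n)


# Assumed noise level, purely for illustration.
sigma = 1.0

# Each step multiplies the amount of data by 100.
for n in (100, 10_000, 1_000_000):
    print(f"n={n:>9,}  error={standard_error(sigma, n):.4f}")
```

In this toy setting, a hundredfold increase in data buys only a tenfold reduction in error. Real models rarely follow such a clean formula, but the general shape (large early gains, expensive late ones) is the point being made above.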
Artificial Intelligence Limitations
Another understated aspect of data science is the limitations of artificial intelligence. While artificial intelligence can outperform humans at specific tasks, human-level intelligence is still a long way off. An AI model, such as a multilayer perceptron, can be trained to recognize cats about as well as a four-year-old child can. However, it is challenging to explain how the model makes its decisions, even with a deep understanding of the model's architecture and detailed programming knowledge.
Fits and Starts in Research
AI progresses in fits and starts, and the slow periods can turn into a "recession" known as an AI winter. Private investment money dries up, and corporate research projects can wither due to a lack of funding. Once AI research is no longer seen as a surefire way to get tenure and research money, interest at the university level can fall as well. This highlights the financial and institutional challenges facing the AI field.
Explaining vs. Predicting
Another nuanced aspect is the difference between explaining and predicting. Regression techniques are great for explaining how individual variables relate to an outcome, because each coefficient can be read directly, but they may not give the best predictions. Conversely, many machine learning techniques can offer superior predictive power, but the reasons behind their predictions can be hard to understand.
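The contrast can be sketched with a small example. Below, a straight-line regression yields a slope that can be read as "each extra unit of x adds about this much to y", while a nearest-neighbour predictor (used here only as a simple stand-in for less transparent machine learning methods) returns a number with no coefficient to interpret. The data and the helper names `fit_line` and `knn_predict` are hypothetical, made up for illustration.

```python
from statistics import mean

# Hypothetical toy data: y is roughly 2*x + 1 plus a little noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def knn_predict(x_new, xs, ys, k=2):
    """Predict y as the average of the k training points nearest to x_new."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x_new))[:k]
    return mean(y for _, y in nearest)


slope, intercept = fit_line(xs, ys)
print(f"regression: y = {slope:.2f} * x + {intercept:.2f}")  # slope is the explanation
print(f"k-NN prediction at x=3.5: {knn_predict(3.5, xs, ys):.2f}")  # a number, no 'why'
```

Both approaches give a usable prediction on data this simple. The difference is that the regression also hands back a statement about the relationship itself, whereas the nearest-neighbour answer depends on whichever training points happen to be close.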
Objectivity and Impartiality
Machine learning models do not care about our feelings or political attitudes. Supervised learning models will make predictions based on the data they receive. If the results contradict our assumptions, the models don’t care. Carefully selecting data to support a preferred view will not result in useful predictive models.
All Models Are Wrong, but Some Are Useful
Finally, it’s important to remember that all models are imperfect, but some can be useful. This aphorism, usually attributed to the statistician George Box, is a common piece of advice in the field. There is a lot more to know, of course, but keeping things concise is good practice, especially for those entering the field.