Machine learning and deep learning have become an important part of many applications we use every day. There are few domains that the fast expansion of machine learning hasn’t touched.
FREMONT, CA: Many businesses have thrived by developing the right strategy to integrate machine learning algorithms into their operations and processes. Others have lost ground to competitors after ignoring the undeniable advances in artificial intelligence.
But mastering machine learning is a difficult process. You need to start with a solid knowledge of linear algebra and calculus, master a programming language such as Python, and become proficient with data science and machine learning libraries such as NumPy, Scikit-learn, TensorFlow, and PyTorch.
And if you want to create machine learning systems that integrate and scale, you’ll have to learn cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud.
Naturally, not everyone needs to become a machine learning engineer. But almost everyone who runs a business or organization that systematically collects and processes data can benefit from some knowledge of data science and machine learning. Fortunately, there are several courses that provide a high-level overview of machine learning and deep learning without going too deep into math and coding.
But in my experience, a good understanding of data science and machine learning requires some hands-on experience with algorithms. In this regard, a very valuable and often-overlooked tool is Microsoft Excel.
To most people, MS Excel is a spreadsheet application that stores data in tabular format and performs very basic mathematical operations. But in reality, Excel is a powerful computation tool that can solve complicated problems. Excel also has many features that allow you to create machine learning models directly in your workbooks.
While I’ve been using Excel’s mathematical tools for years, I didn’t come to appreciate its use for learning and applying data science and machine learning until I picked up Learn Data Mining Through Excel: A Step-by-Step Approach for Understanding Machine Learning Methods by Hong Zhou.
Learn Data Mining Through Excel takes you through the basics of machine learning step by step and shows how you can implement many algorithms using basic Excel functions and a few of the application’s advanced tools.
While Excel will in no way replace Python for machine learning, it is a great window into the basics of AI and a way to solve many basic problems without writing a line of code.
Linear regression machine learning with Excel
Linear regression is a simple machine learning algorithm that has many uses for analyzing data and predicting outcomes. Linear regression is especially useful when your data is neatly arranged in tabular format. Excel has several features that enable you to create regression models from tabular data in your spreadsheets.
One of the most intuitive is the data chart tool, which is a powerful data visualization feature. For instance, the scatter plot chart displays the values of your data on a Cartesian plane. But in addition to showing the distribution of your data, Excel’s chart tool can create a machine learning model that predicts how the values of your data change. The feature, called Trendline, creates a regression model from your data. You can set the trendline to one of several regression types, including linear, polynomial, logarithmic, and exponential. You can also configure the chart to display the parameters of your machine learning model, which you can use to predict the outcome of new observations.
You can add several trendlines to the same chart. This makes it easy to quickly test and compare the performance of different machine learning models on your data.
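For readers who want to see the same idea outside a chart, here is a minimal Python sketch of comparing trendline types on one dataset. It is not taken from the book or from Excel's internals: the data points are made up, the helper names fit_line and r_squared are hypothetical, and the logarithmic and exponential fits are obtained by running an ordinary least-squares line on log-transformed values.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """Coefficient of determination, computed on the original y values."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up observations that grow roughly exponentially.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.7, 7.4, 20.1, 54.6, 148.4, 403.4]

# Linear trendline: y = m * x + b
m, b = fit_line(xs, ys)
linear_preds = [m * x + b for x in xs]

# Logarithmic trendline: y = a * ln(x) + c
a, c = fit_line([math.log(x) for x in xs], ys)
log_preds = [a * math.log(x) + c for x in xs]

# Exponential trendline: y = A * exp(k * x), fitted as ln(y) = ln(A) + k * x
k, ln_a = fit_line(xs, [math.log(y) for y in ys])
exp_preds = [math.exp(ln_a + k * x) for x in xs]

scores = {
    "linear": r_squared(ys, linear_preds),
    "logarithmic": r_squared(ys, log_preds),
    "exponential": r_squared(ys, exp_preds),
}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: R^2 = {score:.4f}")
print("best fit:", best)
```

R-squared is computed here on the original y values, which is a choice made for this sketch and may not match the figure Excel displays for a transformed trendline.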
In addition to exploring the chart tool, Learn Data Mining Through Excel takes you through several other procedures that can help develop more advanced regression models. These include functions such as LINEST and LINREG, which calculate the parameters of your machine learning models based on your training data.
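To give a sense of the kind of calculation a function like LINEST performs, the following Python sketch fits a regression with two input variables by solving the least-squares normal equations. It is an illustration under stated assumptions, not a description of how Excel implements the function: the training data is made up, and solve_linear_system and fit_multiple_regression are hypothetical helper names.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple_regression(rows, ys):
    """Return [intercept, coef_1, coef_2, ...] minimizing squared error."""
    # Prepend a column of ones so the first coefficient is the intercept.
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up training data generated from y = 1 + 2 * x1 + 3 * x2.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
ys = [9, 8, 19, 18, 29]

intercept, coef_x1, coef_x2 = fit_multiple_regression(rows, ys)
print(intercept, coef_x1, coef_x2)

# Use the fitted parameters to predict a new observation.
new_x1, new_x2 = 6, 5
prediction = intercept + coef_x1 * new_x1 + coef_x2 * new_x2
print(prediction)
```

Because the sample data follows the formula exactly, the recovered parameters should come out very close to 1, 2, and 3, which makes the sketch easy to check by hand.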
The author also takes you through the step-by-step creation of linear regression models using Excel’s basic formulas such as SUM and SUMPRODUCT. This is a recurring theme in the book: You’ll see the mathematical formula of a machine learning model, learn the basic reasoning behind it, and create it step by step by combining values and formulas in several cells and cell arrays.
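As a rough illustration of that approach, here is a Python sketch that builds a simple linear regression from nothing but sums and sums of products, the same quantities SUM and SUMPRODUCT return in a worksheet. The data, the cell ranges mentioned in the comments, and the sumproduct helper are assumptions made for this example and are not taken from the book.

```python
def sumproduct(a, b):
    """Sum of element-wise products, like SUMPRODUCT on two equal-size ranges."""
    return sum(x * y for x, y in zip(a, b))


# Made-up training data; imagine x in cells A2:A6 and y in cells B2:B6.
xs = [1, 2, 3, 4, 5]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]
n = len(xs)

sum_x = sum(xs)              # like =SUM(A2:A6)
sum_y = sum(ys)              # like =SUM(B2:B6)
sum_xy = sumproduct(xs, ys)  # like =SUMPRODUCT(A2:A6,B2:B6)
sum_xx = sumproduct(xs, xs)  # like =SUMPRODUCT(A2:A6,A2:A6)

# Closed-form least-squares solution for y = slope * x + intercept.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")

# Predict the outcome for a new observation.
new_x = 6
predicted_y = slope * new_x + intercept
print(f"prediction for x = {new_x}: {predicted_y:.2f}")
```

Each intermediate value corresponds to one cell in a worksheet, so working through the numbers by hand shows exactly where the slope and intercept come from.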
While this might not be the most efficient way to do production-level data science work, it is certainly a very good way to learn the workings of machine learning algorithms.