For machine learning algorithms, as for algorithms in general, performance can be assessed along two dimensions:
- Computational efficiency of the algorithm
- Accuracy of the algorithm
Computational efficiency – This measures how much time and/or space the algorithm needs to produce its output on an input of size N. For machine learning algorithms we generally don’t discuss it much: model development is itself a time-consuming process, and we don’t train models at run time in production, so computational efficiency is rarely the main concern.
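As a rough illustration of what "time and space on an input of size N" means in practice, here is a minimal sketch that measures both empirically. The helper name `measure` and the choice of `sorted` as the function under test are assumptions made for this example only; the numbers it prints will vary by machine.

```python
import time
import tracemalloc


def measure(func, n):
    """Return (seconds, peak_bytes) for one call of func on an input of size n.

    `measure` is a hypothetical helper for this illustration only.
    """
    data = list(range(n, 0, -1))  # sample input: n integers in descending order
    tracemalloc.start()           # only allocations made after this point are counted
    start = time.perf_counter()
    func(data)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak


if __name__ == "__main__":
    # Watch how time and peak memory grow as N grows.
    for n in (1_000, 10_000, 100_000):
        seconds, peak = measure(sorted, n)
        print(f"N={n:>7}  time={seconds:.6f}s  peak memory={peak} bytes")
```

Measurements like these only hint at the growth rate; a formal complexity analysis is a separate exercise.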
Accuracy of algorithm – “Accuracy” is a loose term here. When checking the performance of an algorithm, we use different performance metrics depending on the problem at hand. For example, for a classification problem, one person may be interested in how many points are correctly labelled, while another may be interested in the AUC metric or the F1 score.
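To make the difference between these metrics concrete, here is a minimal pure-Python sketch of accuracy, F1 score and AUC for binary labels (0 and 1). The function names and the toy data are assumptions for illustration; in practice you would normally use a library implementation.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of points that are labelled correctly."""
    tp, _, _, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall; 0.0 if there are no positives at all."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative
    (ties count as half). Assumes both classes are present."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy imbalanced data: a model that always predicts the majority class.
    y_true = [0] * 8 + [1] * 2
    y_pred = [0] * 10
    print("accuracy:", accuracy(y_true, y_pred))
    print("F1:", f1_score(y_true, y_pred))
    # Toy scores for the AUC calculation.
    print("AUC:", roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

On the imbalanced toy data the always-negative model scores well on accuracy but gets an F1 of zero, which is why two people can judge the same classifier very differently.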
So, to check the performance of a machine learning algorithm, people use whichever metrics fit their problem. For example, regression and classification cannot both be evaluated with the same metric. There is no absolute formula or score that applies to all algorithms; the choice depends on your assumptions and business needs.
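For a regression problem, counting "correct labels" makes little sense because predictions are continuous, so error-based metrics are used instead. The sketch below shows two common ones, mean absolute error and root mean squared error; the function names and sample values are assumptions for illustration.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference; penalises large errors more."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


if __name__ == "__main__":
    y_true = [3.0, 5.0, 2.5]  # made-up target values
    y_pred = [2.5, 5.0, 4.0]  # made-up predictions
    print("MAE: ", mean_absolute_error(y_true, y_pred))
    print("RMSE:", root_mean_squared_error(y_true, y_pred))
```

Which of the two is more appropriate again depends on the business need, for example on how costly an occasional large error is compared with many small ones.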
In some cases we do look at computational complexity at training time, especially memory complexity. SVMs are notoriously bad in this respect, as is hierarchical clustering. However, much of this is expert knowledge, and we don’t usually compute the complexity ourselves. Also, in most cases we use library functions; writing our own algorithm is much rarer, except in research environments. Expert knowledge of which models to use on a given dataset therefore often suffices.
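A back-of-the-envelope calculation shows why memory becomes the bottleneck for such methods. The sketch below assumes the algorithm holds a full pairwise kernel or distance matrix of 64-bit floats in memory; real libraries may avoid this through caching or condensed storage, so treat the numbers as an upper-bound illustration. The function names are hypothetical.

```python
def dense_matrix_bytes(n_samples, bytes_per_value=8):
    """Memory for a full n x n matrix (assumed 64-bit floats by default)."""
    return n_samples * n_samples * bytes_per_value


def condensed_matrix_bytes(n_samples, bytes_per_value=8):
    """Memory when only the n*(n-1)/2 distinct pairs of a symmetric matrix are stored."""
    return n_samples * (n_samples - 1) // 2 * bytes_per_value


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        dense_gib = dense_matrix_bytes(n) / 2**30
        condensed_gib = condensed_matrix_bytes(n) / 2**30
        print(f"n={n:>7}  dense={dense_gib:10.3f} GiB  condensed={condensed_gib:10.3f} GiB")
```

Because the cost grows quadratically, going from 10,000 to 100,000 samples multiplies the memory by 100; under these assumptions the full matrix for 100,000 samples needs about 80 GB.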
However, if you are creating your own algorithm or your own implementation of a standard algorithm, then you will need to analyse its complexity. Last year a colleague and I created our own implementation of locality-sensitive hashing and had to look at both memory and time complexity. We also built an online model, constructed in a way similar to how a decision tree is built using information gain. Even there we had to look at time complexity.
A detailed discussion on the topic can be found here:
Authored by Sujoy Roychowdhury, Senior Data Scientist – Cognitive Computing