Embarking on a Machine Learning Journey
The 5 things you must look out for, from a data scientist's point of view
When introducing Machine Learning (ML) into your organization’s processes, it is imperative to have a thorough implementation strategy, as well as a calculated data management strategy.
GDS Link’s Chief Data Scientist, Florian Lyonnet, compiled 5 Best Practices to ensure your organization is turning its machine learning efforts into tangible action plans.
Focus on what you want to achieve
Detecting these changes in a timely fashion is also crucial, so that the decision to re-tune a model can be taken as soon as the first signs of degradation appear. Proper analytical tools for monitoring your models will pay for themselves in the long run, because the time lost to underperforming models adds up.
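As a rough illustration of what such monitoring can look like, the sketch below computes a Population Stability Index (PSI) between a baseline sample of model scores and a more recent one. PSI is one common drift measure, not necessarily the one used by the author; the function name, the bin count and the small floor value are assumptions made for this example.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    n_bins: int = 10,  # assumed default, not taken from the article
) -> float:
    """PSI between a baseline score sample and a recent score sample."""
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the baseline range; fall back to 1.0 if all values match.
    width = (hi - lo) / n_bins or 1.0

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range go into the first or last bin.
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        floor = 1e-6  # assumed floor to avoid log(0) for empty bins
        return [max(c / len(values), floor) for c in counts]

    e_shares = bin_shares(expected)
    a_shares = bin_shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_shares, a_shares))


# Hypothetical data: scores at training time vs. scores seen recently.
baseline = [i / 100 for i in range(100)]
recent = [0.5 + x / 2 for x in baseline]  # scores have shifted upward

psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, recent)
```

A value near zero means the two distributions look alike. A frequently quoted rule of thumb treats values above roughly 0.25 as a sign that the model deserves a closer look, but the right threshold depends on your own portfolio and should be validated there.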
– How do I plan to productionize these models? Ask yourself whether you will deploy the models in their native language or export them to a standard format such as PMML. If the latter, consider how well the language or library can export to that format.
– The algorithm is only one part of the problem. You also need to think about where you will execute the code that creates the variables feeding these models. Data scientists often rely on libraries that manipulate multidimensional arrays and implement functions such as sorting, filtering and array operations. Translating this logic into another language can be time-consuming and error-prone, and it limits your capacity to create and deploy new models quickly. Again, choosing an infrastructure that lets you run these transformation functions natively will be a huge boost to your productivity.
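To make the second point concrete, here is a minimal sketch of a variable-creation function written once and called both when building the training set and when scoring live requests, so that nothing has to be translated into another language. The field names (`day`, `amount`) and the three derived variables are hypothetical. The example sticks to the standard library so that it runs anywhere; in practice an array library would replace the list operations.

```python
from statistics import mean
from typing import Dict, List


def build_features(transactions: List[Dict[str, float]]) -> Dict[str, float]:
    """Turn one applicant's raw transactions into model inputs (hypothetical fields)."""
    # Filtering: keep only the debits (negative amounts).
    debits = [t["amount"] for t in transactions if t["amount"] < 0]
    # Sorting: order by day, newest first, and keep the three latest entries.
    latest = sorted(transactions, key=lambda t: t["day"], reverse=True)[:3]
    return {
        "n_debits": float(len(debits)),
        "avg_debit": mean(debits) if debits else 0.0,
        "recent_net": sum(t["amount"] for t in latest),
    }


# The same function serves model training and production scoring.
transactions = [
    {"day": 1, "amount": -10.0},
    {"day": 2, "amount": 50.0},
    {"day": 3, "amount": -30.0},
    {"day": 4, "amount": 5.0},
]
features = build_features(transactions)
```

Because training and scoring share one code path, a change to a variable definition is made in a single place and cannot drift between the two environments.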
This also highlights the importance of having a platform that gives you direct access to hundreds of data sources as well as partnering with data experts.
About the Author