Even though every Machine Learning Engineer uses a slightly different set of tools, at their core the toolkits are similar: a scripting language, an ML/DL framework, maybe some data preparation tools. But when it comes to working as a Machine Learning Engineer - especially working remotely - you might want to add a few skills to stand out.
In some companies, Machine Learning models are not only something to analyze data or play around with - they actively get deployed on the web, on mobile phones or on embedded devices.
Any skill that you can acquire which helps you optimize inference speeds, minimize model sizes and enhance deployability can help a team massively.
In many cases - and for many frameworks - this means being able either to use pre-built tools or to write some C++ (or a similar language) against low-level APIs. But it’s also important to know what steps you can take to bring down model sizes and which performance pitfalls to watch out for.
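To make "bringing down model sizes" a bit more concrete, here is a minimal sketch of one common technique, symmetric int8 weight quantization. It is written in plain Python purely for illustration; the function names `quantize_int8` and `dequantize` are made up for this example, and in practice you would most likely rely on the quantization tooling that ships with your framework.

```python
from array import array

def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats onto the int8 range."""
    max_abs = max(abs(w) for w in weights)
    # Fall back to 1.0 for an all-zero tensor so we never divide by zero.
    scale = max_abs / 127 or 1.0
    # array("b") stores signed 8-bit integers, one byte per weight.
    q = array("b", (round(w / scale) for w in weights))
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

# Illustrative weights, not taken from a real model.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
as_float32 = array("f", weights)  # typically 4 bytes per weight
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print("float32 bytes:", len(as_float32) * as_float32.itemsize)
print("int8 bytes:   ", len(q) * q.itemsize)
print("max abs error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

The trade-off is visible in the last line: each weight takes a quarter of the storage, at the cost of a rounding error of at most about half a quantization step (`scale / 2`). Whether that error is acceptable depends on the model and has to be checked against your accuracy metrics.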
When learning machine learning, you often get nicely cleaned and processed training data. In some cases, you might even be able to just pull it down locally to your machine and use it.
Well, especially when working in startup environments, this might not be the case. You might need to handle massive amounts (1TB+) of noisy, uncleaned data. So how do you do that?
Teach yourself to process data on distributed clusters, for example with Apache Spark, and learn some best practices for dealing with remote data. Hint: If you have to work with remote data storage, pulling down images one by one won’t work.
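As a rough illustration of why fetching one by one hurts, the sketch below overlaps many remote reads with a thread pool. The `fetch_image` function is a hypothetical stand-in that only simulates network latency with `time.sleep`; in a real project it would call whatever client your storage provides, and the key names are invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_image(key):
    """Hypothetical stand-in for a remote read (e.g. an object-store GET)."""
    time.sleep(0.05)  # simulated round-trip latency, not a real request
    return key, b"fake-image-bytes"

def fetch_all(keys, max_workers=16):
    """Fetch many objects concurrently; results keep the order of `keys`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_image, keys))

keys = [f"images/{i:04d}.jpg" for i in range(32)]

start = time.perf_counter()
results = fetch_all(keys)
elapsed = time.perf_counter() - start

print(f"fetched {len(results)} objects in {elapsed:.2f}s")
```

Fetched sequentially, the 32 simulated reads would take about 32 × 0.05 s = 1.6 s, because each request waits for the previous one to finish. With 16 workers the waiting overlaps, so the total should be much shorter. Threads suit this case because the work is dominated by waiting on I/O; packing many small files into larger archives before training is another option worth knowing.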
This is one of the most sought-after skills in the Machine Learning job market nowadays, and yet one of those most often missing from the toolbelts of young Machine Learning Engineers.
Many people are visual people. They like images over numbers, movies over books. Showing your results to the world - or at least to your manager - is just as important as achieving them. A well-crafted and logical plot or graph is just the right way to do it.
Most people know the basics of a plotting library like matplotlib, but once you know some advanced features and tricks, your graphs will be clearer, better and easier for non-technical people to understand. This becomes a valuable skill, especially if you give presentations or work on Machine Learning in a non-ML company.
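A few small matplotlib habits already go a long way. The sketch below is one possible take, assuming matplotlib is installed: it states the takeaway in the title, annotates the point that matters and removes visual clutter. The loss values are synthetic and exist only to give the plot a shape; the file name `loss_curve.png` is likewise just an example.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # render off-screen so the script runs without a display
import matplotlib.pyplot as plt

epochs = list(range(1, 11))
# Synthetic, made-up curves purely for illustration (not real results).
train_loss = [1.0 / e for e in epochs]
val_loss = [1.0 / e + 0.02 * e for e in epochs]

# Index of the epoch with the lowest validation loss.
best = min(range(len(epochs)), key=val_loss.__getitem__)

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(epochs, train_loss, label="Training loss")
ax.plot(epochs, val_loss, label="Validation loss")

# Point the reader at the one thing they should notice.
ax.annotate(
    "Lowest validation loss",
    xy=(epochs[best], val_loss[best]),
    xytext=(epochs[best], val_loss[best] + 0.3),
    arrowprops={"arrowstyle": "->"},
    ha="center",
)

# Say the conclusion in the title instead of a generic "Loss vs. epoch".
ax.set_title(f"Validation loss is lowest at epoch {epochs[best]}")
ax.set_xlabel("Epoch")
ax.set_ylabel("Loss")

# Remove clutter: no box around the plot area or the legend.
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.legend(frameon=False)

fig.tight_layout()
out_path = Path("loss_curve.png")
fig.savefig(out_path, dpi=150)
```

None of these calls are exotic, but together they turn a default plot into one that a non-technical audience can read without a verbal explanation.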
With these three skills, you will broaden your profile and be more valuable to a range of different companies. A research facility might not need production-ready code, but might be able to use your graph skills. For a Machine Learning startup, it may be the other way around.