Managing outsourced machine learning projects

Advice · July 17, 2018

As more businesses experiment with artificial intelligence, they run into the unique challenges of managing outsourced research and development projects. It helps for companies to be prepared and well informed, and for freelancers to learn to lead from behind.

Even if you’re otherwise comfortable outsourcing software development or other IT jobs, sharing business-critical databases, collecting know-how or, more generally, managing outsourced R&D workflows can be intimidating. Failing to address the big questions first can lead to the project getting derailed or abandoned halfway through.

My preferred approach is based on the agile methodology, which acts as a safety check: the potential downside is capped while creativity stays in free flight. The point of this agile approach is to limit the scope of work and hire freelancers for incremental steps. Each of these chunks ends with a shipped product and with the know-how transferred from the freelancer to the company.

Agile is a fantastic tool, but it’s difficult to apply to artificial intelligence, where sometimes a lot of effort is required only to figure out what can’t be done. Designing the scope well goes a long way towards meeting commercial targets or, in a worst-case scenario, keeping the investment to a minimum.

Big data is front-loaded

The motivation to experiment with AI is that all data sets, even the smallest ones, can provide insights that are extremely valuable for the business. The bad news is that no two companies are exactly alike. While almost every business has something useful to work with, the process has to start with a discovery phase to find out which insights are actually possible to gain.

Most artificial intelligence projects are therefore heavily front-loaded: a lot of thought and effort has to be put into the first iterations, and we only get to see the result much later. Even the simplest and most generic recommendation engine can be derailed by a perfectly normal-looking site’s perfectly clean data, if that particular niche or traffic is less predictable than others.

Therefore, it’s best practice to experiment on the cheap side first: test a few smaller ideas and see where they lead. These bite-sized jobs are easier to manage and cheaper to run. There’s no magic bullet, but a combination of common sense, a structured approach and learning from all those experiences goes a long way.

Starting super small

Especially when working with new freelancers, starting small is best practice for many reasons:

  1. It makes it easier to set challenging yet realistic goals. It certainly helps to be well informed, but the focus should always be on what’s truly important for your business. It’s alright to ask a freelancer to help scope the project, and it’s similarly fine to start with a discovery project on finding a role for ML within the company. However, if the time estimate for an iteration is over a month from start to finish, you want to find a smaller test run first.

  2. Starting small makes it easier to retain the right level of control in the relationship. It’s expected that the bigger part of project management will be shifted to the freelancer, but it’s still a good idea to keep an eye on the schedule and to make absolutely sure that all relevant know-how is transferred to the company once the freelancer has finished the job.

  3. You’ll get better prices. With a smaller scope, developers are more comfortable giving a flat price, which aligns motivations much better than a day rate. It also keeps away agencies that would lowball their first offer in the hope of recouping it all as the scope changes.

Is this safe at all?

Sharing data with freelancers is always a security matter, and you might not be allowed to share user data at all. It’s better to stay on the safe side of the law with data, especially since GDPR came into force in May 2018.

Working in small iterations also helps with sharing the minimum amount of data required for freelancers to work with. It’s all the more important if this is the first time working together with them.

The best practices in transferring data are:

  1. Mask all user information that can be masked. Include hash values instead of emails, IP addresses and anything else that could identify a user. Remove everything that’s not absolutely necessary for the job. Your freelancer will be able to tell you what the minimum is that they need or would find useful.

  2. If possible, don’t share live data but instead, generate test values that follow the structure of a real dataset. Ideally, the model should be designed to use multiple input sources, and you should be able to switch between staging and live datasets to experiment without the freelancer’s help initially.
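As a rough illustration of the masking step, here is a minimal Python sketch. It assumes records are plain dictionaries, and the field names (`email`, `ip_address`, `full_name`, `phone`) are hypothetical placeholders for whatever your own schema contains. It also swaps the plain hash for a keyed one (HMAC-SHA-256): emails and IP addresses are easy to guess, so an unkeyed hash can often be reversed by hashing candidate values, while a key that never leaves the company makes that much harder. Treat this as a starting point and check it against your own legal requirements.

```python
import hashlib
import hmac

# Hypothetical field names; adjust them to the real schema.
IDENTIFYING_FIELDS = ("email", "ip_address")   # replaced by a keyed hash
DROPPED_FIELDS = ("full_name", "phone")        # not needed for the job, removed


def mask_value(value: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest of a value, ignoring case and outer whitespace."""
    normalised = value.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()


def mask_record(record: dict, secret_key: bytes) -> dict:
    """Copy a record, hashing identifying fields and dropping unneeded ones."""
    masked = {}
    for field, value in record.items():
        if field in DROPPED_FIELDS:
            continue
        if field in IDENTIFYING_FIELDS and value is not None:
            masked[field] = mask_value(str(value), secret_key)
        else:
            masked[field] = value
    return masked


# Example: the key stays in-house and is never shared with the freelancer.
in_house_key = b"replace-with-a-long-random-secret"
row = {
    "email": "Jane@Example.com ",
    "ip_address": "203.0.113.7",
    "full_name": "Jane Doe",
    "basket_value": 42.5,
}
masked_row = mask_record(row, in_house_key)
```

Because the same input always maps to the same digest under one key, the freelancer can still group events by user without ever seeing who the user is.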

Future proofing

This project is likely not your last one, and all upcoming artificial intelligence experiments should be able to use the knowledge gained in previous ones.

To future-proof the R&D efforts, the goal is to make it easy for any developer to pick up the job where the previous one left off:

  1. It’s a lot easier to write code than it is to read it, and developers don’t enjoy writing documentation. To help the next developer, the absolute minimum is a short and easy-to-read README that explains how the system works and how to use it.

  2. Open standards and popular frameworks help. These days most AI developers will have similar tools in their toolbox: Python with a framework like TensorFlow, scikit-learn or similar. Always make sure freelancers plan to use well-known frameworks as opposed to their own home-grown, proprietary code.

  3. You can use automated or manual tests, but you can’t outsource all testing to the same people who build the system. If you make sure you test every project in-house, you’re already halfway through the knowledge transfer, and you’ll be well positioned to hire the next freelancer.
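An in-house test does not have to be elaborate. The sketch below is one possible shape for it, under a few assumptions: the delivered model can be called as a plain Python function, you have kept back a handful of labelled examples that the freelancer never saw, and simple accuracy is a meaningful measure for your problem. The names `holdout_accuracy` and `stand_in_model` and the toy data are hypothetical.

```python
from typing import Callable, Iterable, Tuple


def holdout_accuracy(predict: Callable[[dict], str],
                     examples: Iterable[Tuple[dict, str]]) -> float:
    """Fraction of held-out examples where the prediction matches the label."""
    examples = list(examples)
    if not examples:
        raise ValueError("need at least one held-out example")
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)


def stand_in_model(features: dict) -> str:
    """Hypothetical placeholder for the model the freelancer delivers."""
    return "buy" if features["visits"] >= 3 else "skip"


# Labelled examples kept in-house and never shared during development.
held_out = [
    ({"visits": 5}, "buy"),
    ({"visits": 1}, "skip"),
    ({"visits": 4}, "skip"),
    ({"visits": 0}, "skip"),
]

score = holdout_accuracy(stand_in_model, held_out)
```

Agreeing on a target score for a check like this before the work starts gives both sides a shared definition of done.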

Keeping the creative momentum

Once a project finishes and deliverables are transferred to the company, freelancers will be eager to start a new project. This is a good time to encourage innovation and idea generation: what could be the next opportunity to explore?

Use this time for a discussion to draw conclusions and gather a large number of ideas to continue, then form those into more concrete potential projects.

There’s a temptation to continue as long as it’s fun, but make sure that continuing makes sense for the business. Keep control on your side: only approve proposals when you’re ready to translate them into projects and actions, and check whether you’ve passed into a phase where the added value is no longer worth the investment.

About the Author: Richard Dancsi

Richard is an entrepreneur and digital strategist with a master’s degree in Mathematics and Computer Science. As an interim CTO, an independent consultant and tech coach, he helps organisations excel in an environment shaped by rapid change. Richard has over 15 years’ experience in delivering technologies and workflows for businesses small and large, to expand the reach of their product and mission.
