Industrialisation of machine learning techniques – Challenges bigger than algorithms

Better and cleaner products can be built when they are designed with the best available estimates. Machine learning helps calculate those estimates, especially when the data is large, valuable, and of good quality. The Storage Portal project is the child of this idea. It started with a vision of developing machine learning applications that can do ‘magic’ with engines, their design, and their simulations.

Storage Portal, a platform that eases engine data communication and analysis, is one example project that has made it clear that the challenges in industrialising machine learning techniques are bigger than algorithms. We had to ask and answer some difficult questions when developing our product. Some of the questions were anticipated from the start, and some could not have been. The ambitious job came with a couple of requirements. The first was acquiring the engine measurement data together with its engine metadata, since one is meaningless without the other. The second was a user-friendly method to acquire and display these data. The application’s user interface (UI) in turn needed a mechanism to store data in a database and to communicate with it through the application’s backend. All of these units of the application needed to be hosted on a server. Do you see how the simple machine learning code needs to be supported by a vast and complex infrastructure?

In the first stages of our project, the driving force was saving time and resources by centralising and standardising the way of working. In the beginning, the issues we faced were related to data structuring. We found out that data is acquired by different software in different formats and saved under different naming conventions in different file locations and drives. When talking about data, we have learnt that it is important to examine what is considered ‘real’ data: there might be big data with no value. The other pain point was the lack of metadata associated with the measurement data. The measurement data means plenty to the experts who use it, but that is where the ball stops rolling. There are cases where an engineer from another department cannot understand what it means or how to read it. For our project, this meant a couple of months of meetings with the experts who understand the data, to get a translation of what the data means, which part of it is useful, how often it is collected, what proprietary software is used, and so on. Acquiring the sample data was not an easy task either, as there were security requirements we needed to follow: a defined protocol was necessary, but it took time and resources.
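
As a rough illustration of what keeping measurement data together with its metadata, and unifying naming conventions, can look like in code, here is a minimal Python sketch. It is not the Storage Portal implementation: the field names in REQUIRED_METADATA, the MeasurementRecord type, and both helper functions are hypothetical and chosen only for illustration. In practice the required fields would come out of the meetings with the domain experts.

```python
from dataclasses import dataclass, field

# Hypothetical metadata fields, assumed for this sketch only.
REQUIRED_METADATA = ("engine_id", "signal_name", "unit", "sample_rate_hz")


@dataclass
class MeasurementRecord:
    """A measurement series bundled with the metadata needed to read it."""
    values: list[float]
    metadata: dict[str, str] = field(default_factory=dict)


def missing_metadata(record: MeasurementRecord) -> list[str]:
    """Return the required metadata keys that are absent or empty."""
    return [key for key in REQUIRED_METADATA if not record.metadata.get(key)]


def normalise_signal_name(raw_name: str) -> str:
    """Map differently styled names to one lower-case, underscore-separated form."""
    cleaned = raw_name.strip().lower().replace("-", " ").replace("_", " ")
    return "_".join(cleaned.split())
```

A record that arrives without, say, a signal name or sample rate can then be flagged before it is stored, instead of becoming data that only its original author can read.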

While making sense of the data, we were simultaneously working on a user-friendly user interface. This meant defining the use cases and user types, and iteratively designing the different pages and features. When designing the UI, we needed to consider the user experience as well. A good user experience attracts more users, which makes the application useful to more departments and brings in more data for developing a better engine. Existing Wärtsilä applications were studied and the internal UI libraries were researched in order to reuse them rather than reinvent the wheel. The UI designs were presented to, and commented on by, the experts who will use the application.

Building the infrastructure for such a platform is no walk in the park. There is always a question of resources. One needs to consider the amount of storage space, the retention policy, the technology frameworks that best suit one's own organisation, how to adapt to or complement solutions that already work, and so on. The aspect of working with people should also be considered: waiting on replies, arranging meetings, asking for support, and planning a timeline all take time. In conclusion, applying a machine learning algorithm is the smallest part of the challenge.

Hindia Mohammed

Hindia Mohammed is an experienced and professional software developer who works as a Software Consultant at FINETO Oy. She has been indirectly involved in the Clean Propulsion Technologies project through the Storage Portal project where she worked together with Wärtsilä. Hindia’s Bachelor’s thesis “Storage Portal – A React-Redux Web Application Utilizing Polarion for Test Data Management” is available via this link: https://www.theseus.fi/bitstream/handle/10024/335500/Storage%20Portal.pdf?sequence=2&isAllowed=y