Auto insurance companies face more complex challenges than ever, including social and economic inflation, unpredictable vehicle repair costs, and increasingly sophisticated insurance fraud. Insurers need new ways to retain good customers while offering competitive rates, which in turn means reducing operational costs.
One way to tackle insurance fraud is to use AI damage inspection tools to detect fraudulent claims. As this use case demonstrates, an AI damage inspection tool takes pictures of a damaged car, assesses the extent of the damage, and predicts the repair cost. To be as effective as possible, the AI model must be trained on a large data set. Each insurance company has its own claims database, including pictures of damaged cars, so its historical claims records can be used to train the model. However, each company’s data set is limited, and training on a larger number of images would yield a better model.
To overcome this limitation, three players in the insurance market agreed to make their car insurance claims images available for training one another’s AI models for damage inspection. Each of the three companies has its own AI model, and each model is trained on the common set of training data formed by combining the three companies’ claims images. Data confidentiality is critical for insurance companies, however, so one company’s claims data cannot be disclosed to the other two. The question is how the three players can train their AI models on the common training data while preserving data confidentiality.
The solution to this problem is for the three companies to use the advanced Compute-To-Data (C2D) feature of Ocean Enterprise.
The first step towards this solution is for each company to register as a participant in an Ocean Enterprise-powered data space. Such a data space provides a marketplace - a user interface to publish, search, and consume assets within the data space - as well as a dedicated C2D environment, where algorithms can be securely executed on data sets published in the data space.
Being a participant in this data space allows a company to publish assets and consume other participants' assets.
Second, each company publishes its training data set as an asset in the marketplace associated with that data space. The assets are registered as “compute-only”, meaning that a consumer cannot download the training data and can only use it to run an algorithm in the C2D environment.
Similarly, each company registers its AI model training algorithm as an asset in the same marketplace, also as “compute-only”.
The training algorithm needs some preparation to run in a C2D environment. For instance, it needs to know where the data sets are located once they are uploaded to the C2D environment.
The algorithm also has to save the updated model parameters in a persistent form after the training has run.
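The two preparation steps above can be sketched as follows. This is a minimal illustration, not Ocean Enterprise's actual interface: the default paths `/data/inputs` and `/data/outputs`, the environment variables `C2D_INPUTS` and `C2D_OUTPUTS`, the output file name `model_params.json`, and the placeholder `train` function are all assumptions made for the example and should be checked against the real C2D environment.

```python
import json
import os
from pathlib import Path


def find_input_files(inputs_root: Path) -> list[Path]:
    """Return every file under the input directory, sorted for reproducibility."""
    return sorted(p for p in inputs_root.rglob("*") if p.is_file())


def train(params: dict, image_paths: list[Path]) -> dict:
    """Placeholder for the real training step: it only counts the images seen."""
    updated = dict(params)
    updated["images_seen"] = updated.get("images_seen", 0) + len(image_paths)
    return updated


def run_job(inputs_root: Path, outputs_root: Path, initial_params: dict) -> Path:
    """Locate the mounted data sets, train, and persist the updated parameters."""
    image_paths = find_input_files(inputs_root)
    updated = train(initial_params, image_paths)
    outputs_root.mkdir(parents=True, exist_ok=True)
    out_file = outputs_root / "model_params.json"  # hypothetical file name
    out_file.write_text(json.dumps(updated, indent=2))
    return out_file


def main() -> None:
    # Assumed locations; override via the (hypothetical) environment variables.
    inputs = Path(os.environ.get("C2D_INPUTS", "/data/inputs"))
    outputs = Path(os.environ.get("C2D_OUTPUTS", "/data/outputs"))
    print(run_job(inputs, outputs, {"images_seen": 0}))
```

In this sketch, the container entry point would call `main()`. Keeping the input and output locations as parameters of `run_job` lets the same code be exercised locally against a temporary directory before it is published as an asset.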
Next, each company’s AI model training algorithm is granted access to the other companies’ training data assets. With this last step, the environment is ready.
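The access rules described so far (compute-only data sets, plus an explicit grant to the other companies' training algorithms) can be modelled in a few lines. The sketch below is purely illustrative: `DatasetAsset`, `build_assets`, `can_download`, `can_compute`, and the identifier scheme are hypothetical names invented for this example and do not represent the Ocean Enterprise API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetAsset:
    """Hypothetical model of a published training data set."""
    asset_id: str
    owner: str
    compute_only: bool = True
    trusted_algorithms: frozenset = frozenset()


def can_download(asset: DatasetAsset) -> bool:
    """Compute-only assets can never be downloaded by a consumer."""
    return not asset.compute_only


def can_compute(asset: DatasetAsset, algorithm_id: str) -> bool:
    """Only algorithms that were explicitly granted access may run on the asset."""
    return algorithm_id in asset.trusted_algorithms


def build_assets(companies: list[str]) -> dict[str, DatasetAsset]:
    """Each company's data set trusts the training algorithms of the other companies."""
    assets = {}
    for owner in companies:
        trusted = frozenset(f"algo-{c}" for c in companies if c != owner)
        assets[owner] = DatasetAsset(
            asset_id=f"data-{owner}", owner=owner, trusted_algorithms=trusted
        )
    return assets
```

With three companies `A`, `B`, and `C`, `can_compute(assets["A"], "algo-B")` is true, while an algorithm that was never granted access is rejected and `can_download` is false for every data set.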
Now, a company can run the training algorithm of its AI model on the training data sets of the other two participating companies in the C2D environment. The data sets and the algorithm are uploaded to the C2D environment, where the algorithm is executed. During the execution of the C2D job, nobody can access the C2D environment, so both the training data sets and the AI model are secure. Once the C2D job is finished, the results are returned to the consumer. In this case, the result of the C2D job is the updated parameters of the AI model.
By using the C2D environment, the company that publishes the training data can be confident that its data is not disclosed, while the company that runs the algorithm obtains an improved AI model. Each company can thus improve its own proprietary AI model and, in turn, its ability to process insurance claims quickly and accurately.
This case study is being actively worked on by Ocean Protocol Foundation, a founding member of the Ocean Enterprise Collective. Learn more and get in touch with Ocean Protocol Foundation here.