In the ever-evolving field of technology, continuous learning of new tools and methodologies is essential. Occasionally, we encounter projects where the recently acquired knowledge can be directly applied to solve existing issues. In this instance, I was exploring on-device machine learning and LLM technologies when I was presented with an opportunity to leverage these advancements to significantly reduce the service cost by 40x.
Background
Without disclosing the actual project, let's discuss the product and its functionality. The product was a specialized document scanner mobile application where users could scan their documents. The app utilized advanced scanner SDKs and APIs to process the documents and use the extracted data for further processing and providing document data related features. It relied on multiple third-party paid APIs to achieve this.
For a basic cost idea, let's say for processing a single document, it was using mainly two paid services. The first service was for live document capture, which scans, crops, enhances, and saves the image, and it cost around $5,000 for lifetime use. The second service was to process the image we got after scanning, extract data, and run some AI tasks on the data. This was the main service, costing us around $10 per 1,000 requests. So, if a single user is doing, let's say, 100 scans a year (which users are doing), then we have to pay $1 for just one user. And if we have 10,000 users, then we have to pay $10,000 for just this service every year.
Problem
The main problem was the high cost of the service. The service was costing them a lot, and because of this, the founder put the project on hold for almost a year. The primary reason for this was the cost of the service, and they did not have any paid plan that could reduce the cost. Therefore, the only way to reduce the cost was to decrease the usage of the service.
Solution
Recently, I had the opportunity to work on the project and connected with the founder, who explained the project and the problem to me. I began exploring alternative solutions.
Since the project was completely free for users, my first approach was to find some open-source alternatives to replace the paid services. However, our main concern was the quality of the service.
I had been working with OpenAI on some projects and had use cases in mind for the OpenAI ecosystem. However, directly using OpenAI with the images and performing the same tasks was accurate but did not reduce the cost significantly. So, I started exploring other approaches.
I looked into on-device ML solutions that could perform image-related tasks on the device itself. I found some good libraries that could handle OCR and other image processing tasks on the device.
After implementing on-device ML, the accuracy was good, and for some parts of the service, we could cut the cost to zero. However, for the next service that worked with data and performed AI tasks, I tried using Gemma and some TFLite on-device models. Unfortunately, the accuracy was too poor to be usable.
For this part, I decided to use OpenAI's GPT-4o-mini model. The accuracy was good, and the cost was very low. So, we decided to use OpenAI for this task.
Result
Overall, after implementing on-device ML and OpenAI, we achieved the same results with better accuracy using OpenAI. Additionally, the cost was reduced by 40 times compared to the previous cost. With the current solution, for the same 10,000 users using the service 100 times a year, the cost was reduced from $10,000 to $250. The service is now live, and after a few weeks of testing in production with real users, everything is working fine.
Conclusion
After working on this project, I learned that it is essential to keep exploring new technologies and tools. You never know when you will get the opportunity to use them in real-world projects. The most important thing is to keep the cost in mind while developing the product. Always try to find the best solution that can reduce the cost and increase the efficiency of the product.
So, this was the story of how I used on-device machine learning and OpenAI to reduce a service cost by 40x.
Thanks for reading.
Note: I don't write that much, so I hope you like the story. If you have any questions or suggestions, feel free to ask.