By Genie Yuan, Regional Vice President APAC Japan, Couchbase
Imagine a consumer renovating their house who is browsing an e-commerce platform for materials. With predictive analytics, the platform’s mobile app can tap the user’s browsing and purchase history to suggest products they are likely to buy. But with generative artificial intelligence (Gen AI), the app can draw up a complete architectural plan for a room, list all the needed items, and identify which of them are available in the platform’s inventory.
Improving the customer experience is just one of the many ways generative AI can potentially transform how companies do business. With smartphones serving as the primary interface through which individuals interact with applications, enterprises are likely to race to integrate this new technology into their existing apps or launch new ones with generative AI capabilities.
For example, there were 168.3 million cellular mobile connections in the Philippines at the start of 2023, according to data from GSMA Intelligence. Between two companies releasing their own versions of a novel generative AI-powered tool, it is the one with a mobile app that can reach the larger audience and potentially have the bigger impact on users.
But to unlock the full potential of generative AI apps, enterprises must overcome several barriers: the limitations of mobile devices, issues with cloud servers, data privacy and security challenges, and the need to ensure data integrity.
AI on the edge
A barrier to generative AI-powered mobile apps is that mobile devices still fall short of the computational capabilities required by Large Language Models (LLMs). Cloud servers do not fully close this gap either: while they are well suited to computationally intensive tasks such as training deep learning models and LLMs, relying on them for every user interaction creates problems of its own.
AI systems that depend on sending data to a centralised server cannot deliver instantaneous responsiveness. Latency leads to delays that undermine the timeliness of AI-generated insights, while continuous data transfer incurs significant bandwidth expenses.
The key to maximising the potential of mobile and edge AI is a strategic approach to model architecture, effective data management, and use of the device’s own computing resources. This includes moving tasks that demand real-time interaction between AI systems and users, along with other machine learning operations, to the device itself. Moving this processing closer to the network’s edge improves overall performance and enhances user privacy by reducing the amount of data transmitted.
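As a minimal sketch of what on-device inference can look like, the snippet below loads a small, locally stored model with the open-source llama-cpp-python bindings and generates a response without any round trip to a server. The model file path and prompt are placeholders for illustration, not specific recommendations.

```python
from llama_cpp import Llama

# Load a small, quantised model stored on the device itself (placeholder path)
llm = Llama(
    model_path="./models/small-model.gguf",  # hypothetical on-device model file
    n_ctx=2048,     # context window kept modest to fit mobile-class memory
    n_threads=4,    # use the device's own CPU cores
)

# Inference happens locally: no data leaves the device, no network round trip
output = llm(
    "List the materials needed to repaint a small living room:",
    max_tokens=128,
)
print(output["choices"][0]["text"])
```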
Better performance with less computing load
Reducing the device’s workload when running AI is another way to improve performance. Model quantisation, for example, is a process similar to compressing large files such as music and videos so they can be attached to an email. It works by rounding the numbers (weights) a model stores to lower-precision values, reducing the storage space and memory the model consumes.
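To make the idea concrete, here is a minimal illustrative sketch of symmetric 8-bit quantisation in Python, using a randomly generated matrix as a stand-in for a real model layer’s weights.

```python
import numpy as np

# Stand-in for a float32 weight matrix from one layer of a model
weights = np.random.randn(256, 256).astype(np.float32)

# Symmetric 8-bit quantisation: map each value to an integer in [-127, 127]
scale = np.abs(weights).max() / 127.0
quantised = np.round(weights / scale).astype(np.int8)  # 4x smaller than float32

# Dequantise at inference time; a small rounding error is the trade-off
restored = quantised.astype(np.float32) * scale
print("max rounding error:", np.abs(weights - restored).max())
```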
A related technique, GPTQ, also simplifies a model’s weights, but does so after training is complete. It can be compared to a verbose classical novel rewritten in simpler language: the result fills fewer pages while retaining the original’s general themes and ideas.
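As an illustrative sketch only, the snippet below shows how a small open model could be quantised after training using the GPTQ support in Hugging Face’s transformers library. The model name is a placeholder, and the optional auto-gptq/optimum dependencies and a supported GPU are assumed to be available.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"  # small model used purely for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Post-training quantisation to 4 bits, calibrated on a public text dataset
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=gptq_config,
    device_map="auto",
)

# The saved weights are a fraction of the original model's size
model.save_pretrained("opt-125m-gptq")
```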
Another technique, LoRA (Low-Rank Adaptation), adapts a trained model to a task by training only small additional adapter matrices rather than all of the model’s parameters. Because the bulk of the model stays frozen and the adapters capture just the patterns relevant to the task, the model can be tuned to make better predictions while using far fewer resources.
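A hedged sketch of the approach using the open-source peft library is shown below; the base model and the attention modules being adapted are illustrative choices rather than recommendations.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# LoRA trains small low-rank adapter matrices instead of updating every weight
lora_config = LoraConfig(
    r=8,                                   # rank of the adapter matrices
    lora_alpha=16,                         # scaling factor for the adapters
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)

# Typically well under 1% of the full model's parameters are trainable
model.print_trainable_parameters()
```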
Improving data privacy, security, and synchronisation
Other factors businesses should take into account when deploying mobile AI include safeguarding data privacy and security, as well as ensuring the highest levels of data synchronisation. While moving processing to the edge already improves user privacy and data protection, these can be further strengthened by implementing robust data encryption techniques.
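As one illustrative example of protecting data at rest before it is stored or synchronised, the sketch below uses the widely used Python cryptography library; in a production app the key would live in the platform’s secure keystore rather than in code.

```python
from cryptography.fernet import Fernet

# Generate a key; in practice this would sit in the device's secure keystore
key = Fernet.generate_key()
cipher = Fernet(key)

# Encrypt user data before it is written to local storage or sent for syncing
record = b'{"user_id": 123, "browsing_history": ["paint", "tiles"]}'
token = cipher.encrypt(record)

# Only holders of the key can recover the original data
assert cipher.decrypt(token) == record
```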
Equally critical to the deployment of mobile AI as data privacy and security is data integrity. Without it, edge devices and their AI applications cannot deliver their expected value in the form of valuable insights, analytics, and better decision-making.
An effective mechanism for synchronising data between edge devices and centralised servers or the cloud is a cohesive data platform that can handle multiple data formats. By allowing AI models to access and work with local data repositories, whether online or offline, data integrity and consistency are ensured throughout a network. This, in turn, helps AI applications remain swift, dependable, and adaptable to diverse settings.
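The sketch below is a deliberately simplified, hypothetical illustration of the offline-first pattern such a platform enables: writes land in a local store first and are pushed to a central server whenever connectivity allows. The EdgeStore class and push_to_server callback are invented for illustration and do not represent any specific product’s API.

```python
import json
import sqlite3
import time

class EdgeStore:
    """Hypothetical offline-first store: write locally, sync when possible."""

    def __init__(self, path="edge_cache.db"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pending (id INTEGER PRIMARY KEY, doc TEXT, ts REAL)"
        )

    def write(self, doc: dict):
        # Local writes succeed even with no network, keeping the app responsive
        self.db.execute(
            "INSERT INTO pending (doc, ts) VALUES (?, ?)", (json.dumps(doc), time.time())
        )
        self.db.commit()

    def sync(self, push_to_server):
        # Replay queued documents in order once a connection is available
        rows = self.db.execute("SELECT id, doc FROM pending ORDER BY ts").fetchall()
        for row_id, doc in rows:
            push_to_server(json.loads(doc))
            self.db.execute("DELETE FROM pending WHERE id = ?", (row_id,))
        self.db.commit()

# Usage (hypothetical):
#   store = EdgeStore()
#   store.write({"item": "paint", "qty": 2})
#   store.sync(push_to_server=lambda doc: print("pushed", doc))
```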
Simplicity is key
Given the current limitations of smartphone hardware, as well as the inherent challenges of relying on cloud servers, the optimal architecture for mobile AI emphasises simplicity. Streamlining allows more resources to be allocated to AI algorithms, a crucial consideration in mobile settings where resource constraints are prevalent. With many companies looking to develop their own AI-powered tools and solutions, those who make a successful breakthrough on the mobile app front can potentially take the lead over the competition.