Project highlights
- Capture and recognize handwritten text on touchscreens
- Correct and predict input words
- Provide safe car control while driving
- Industry: Automotive, Machine Learning
- Market: Global
- Team size: 10 engineers
- Project duration: 2 months
Business challenge
The Intellias team has completed an R&D project for an Android car keyboard application with a handful of premium features including handwriting recognition. This was our company’s response to market demand for a solution that would address some of the most common challenges and inclusivity needs faced by drivers on a daily basis:
- Complex and distracting interfaces compromise safety
- Noisy car environments muffle voice assistants
- Speed limits set constraints on the use of keyboard text input
- Vehicle vibration makes it hard to hit keys and distorts the quality of handwritten content
- Input convenience for left-handed and hearing-impaired drivers is often overlooked
With all these driver pain points in mind, we came up with a checklist of hard requirements for our product. The completed R&D project demonstrates our strong technological capabilities in neural networks, machine learning, and artificial intelligence, and after a series of team meetings and workshops, we confirmed that our solution offers an efficient input method.
Solution delivered
Our team of Android and machine learning developers, security and maintenance specialists, designers, and QA engineers started to work full steam ahead on our R&D product.
The custom Android keyboard application supports several modes of user input: standard, swipe, and handwriting recognition. It can understand finger-written text and lets drivers communicate efficiently with their car infotainment system without taking their eyes off the road. This solution is extremely intuitive, recognizing any natural handwriting: uppercase, lowercase, cursive, block, and even superimposed text.
Thanks to the system’s integration with user information, drivers can enter a destination, view points of interest on the map, change climate control settings, turn on their favorite music, and call or message their contacts, all with the flick of a finger. The application’s ability to capture handwritten content from scanned images and even suggest and predict words based on a user’s habits and writing style provides convenience to drivers and increases safety.
Handwriting recognition
Our handwriting detection software, implemented with the TensorFlow framework, transcribes images of words into digital text. We built a convolutional neural network (CNN) and trained it on the IAM offline dataset of word images. An input image, represented as stroke vectors with time deltas, is fed into the CNN model, which extracts relevant features. The resulting feature map is then decoded: candidate character sequences are matched against words, and digital text is generated. This model recognizes handwritten text from images with considerable accuracy.
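As an illustration, here is a minimal sketch of this kind of recognizer in TensorFlow/Keras: a small convolutional stack extracts features, and a bidirectional LSTM turns them into per-timestep character scores suitable for CTC decoding. The layer sizes, image dimensions, and character-set size are illustrative assumptions, not our production configuration.

```python
import tensorflow as tf

NUM_CHARS = 80          # assumed charset size (letters, digits, punctuation)
IMG_W, IMG_H = 128, 32  # assumed word-image dimensions

def build_recognizer() -> tf.keras.Model:
    inputs = tf.keras.Input(shape=(IMG_H, IMG_W, 1), name="word_image")
    x = inputs
    # Convolutional feature extractor: each block halves both spatial dims.
    for filters in (32, 64, 128):
        x = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(x)
    # Treat the width axis as time: (H/8, W/8, 128) -> (W/8, H/8 * 128).
    x = tf.keras.layers.Permute((2, 1, 3))(x)
    x = tf.keras.layers.Reshape((IMG_W // 8, (IMG_H // 8) * 128))(x)
    # BiLSTM decodes the feature sequence into character scores per timestep.
    x = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(128, return_sequences=True))(x)
    # +1 class for the CTC blank symbol; train and decode with CTC
    # (e.g., tf.keras.backend.ctc_batch_cost).
    outputs = tf.keras.layers.Dense(NUM_CHARS + 1, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)

model = build_recognizer()
model.summary()  # 16 timesteps of 81-way character distributions
```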
Multi-character recognition (MCR)
To implement a multi-character recognition model, we used MXNet and ResNet as backbone neural networks as well as specific pre- and post-processing functionality to deal with finger movement while driving. Our MCR approach is composed of three steps:
- Detecting text areas by way of feature extraction using a convolutional neural network
- Recognizing characters using a convolutional neural network (see the sketch after this list)
- Applying a language model to correct errors
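To illustrate the second step, the following MXNet Gluon sketch classifies cropped text areas with a ResNet backbone. The ResNet-18 variant, crop size, and 62-character class set are assumptions for the example, not the production setup.

```python
import mxnet as mx
from mxnet.gluon import nn
from mxnet.gluon.model_zoo import vision

NUM_CLASSES = 62  # assumed character set: a-z, A-Z, 0-9

# ResNet feature extractor plus a per-character classification head.
net = nn.HybridSequential()
net.add(vision.resnet18_v1(pretrained=False).features)
net.add(nn.Dense(NUM_CLASSES))
net.initialize(mx.init.Xavier())

# Dummy batch: one 64x64 RGB crop per text area found in the detection step.
crops = mx.nd.random.uniform(shape=(4, 3, 64, 64))
logits = net(crops)
print(logits.shape)  # (4, 62): per-crop character scores for the LM step
```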
Correction and word suggestions
We developed word correction and word suggestion/prediction models using classical machine learning approaches.
By calculating the Levenshtein distance, a word correction model corrects wrongly identified words and enhances word detection accuracy.
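A minimal sketch of this idea in Python, with a three-word dictionary standing in for the real language and location dictionaries:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two words."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def correct(word: str, dictionary: list[str]) -> str:
    """Replace a recognized word with its closest dictionary entry."""
    return min(dictionary, key=lambda entry: levenshtein(word, entry))

print(correct("navigatoin", ["navigation", "destination", "music"]))
# -> "navigation"
```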
Additionally, we implemented a reinforcement learning approach based on handwriting samples from drivers in test cars, enhancing the models with user interaction feedback: text corrections on user input are collected, and the model is updated accordingly.
A word suggestion model predicts the next word or phrase to minimize the amount of input required. This model is based on bigram and n-gram extraction as well as statistics from historical user data. It takes into account the context of writing, language and location dictionaries, and even a user’s behavioral and spelling preferences.
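The following simplified sketch shows the bigram half of this approach, trained on a toy corpus that stands in for historical user data:

```python
from collections import Counter, defaultdict

def train_bigrams(corpus: list[str]) -> dict[str, Counter]:
    """Count which word follows which across past user input."""
    model: dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

def suggest(model: dict[str, Counter], prev_word: str, k: int = 3) -> list[str]:
    """Return the k words most often seen after prev_word."""
    return [word for word, _ in model[prev_word.lower()].most_common(k)]

bigrams = train_bigrams(["play my favorite music",
                         "navigate to my office",
                         "call my wife"])
print(suggest(bigrams, "my"))  # ['favorite', 'office', 'wife']
```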
The solution also includes a mechanism for location-based suggestions to prompt the user when searching for a specific address or place. Based on the user’s location, the system extracts the requested information from an NDS map database and suggests a location or street name. With years of experience building location and mapping solutions, our engineers have all the necessary expertise for creating and updating location databases and integrating them with car keyboard applications.
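A sketch of such a lookup: NDS map databases are SQLite-based, but the table and column names below are hypothetical placeholders, not the actual licensed NDS schema.

```python
import sqlite3

def suggest_places(db_path: str, prefix: str,
                   lat: float, lon: float, k: int = 5) -> list[str]:
    """Return up to k place names matching the prefix, nearest first."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        """SELECT name FROM places              -- hypothetical table
           WHERE name LIKE ? || '%'
           ORDER BY (lat - ?) * (lat - ?) + (lon - ?) * (lon - ?)
           LIMIT ?""",
        (prefix, lat, lat, lon, lon, k),
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]
```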
Apart from developing a technical solution that includes the keyboard itself along with a back end that allows it to achieve high recognition accuracy, we also addressed other machine learning-related challenges:
- Updating models
Our solution includes an over-the-air (OTA) update service that allows for easily distributing freshly trained handwriting recognition machine learning models to vehicles. In this way, recognition accuracy can improve over time in cars on the road.
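A simplified sketch of this flow, with a hypothetical update endpoint, response format, and storage path:

```python
import json
import urllib.request

UPDATE_URL = "https://updates.example.com/handwriting/latest"  # hypothetical endpoint
MODEL_DIR = "/data/models"                                     # hypothetical path

def check_and_update(current_version: str) -> bool:
    """Download a newer recognition model if the service offers one."""
    with urllib.request.urlopen(UPDATE_URL) as resp:
        meta = json.load(resp)  # assumed fields: "version", "model_url"
    if meta["version"] <= current_version:  # simplified version comparison
        return False
    urllib.request.urlretrieve(
        meta["model_url"], f"{MODEL_DIR}/model-{meta['version']}.bin")
    return True
```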
- Acquiring sufficient training data
We employed state-of-the-art data augmentation techniques to extend the training data set, augmenting multi-character recognition data with a mixture of real and artificial samples. To accomplish this, we followed autoencoder and GAN approaches, using only real handwriting data to generate augmented samples. In addition, we used classical augmentation techniques, including rotation, scaling, shifting, and distortions, to account for vehicle movement.
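A minimal sketch of the classical augmentation step using SciPy; the parameter ranges below are illustrative:

```python
import numpy as np
import scipy.ndimage as ndi

def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Random rotation, scaling, and shift; output keeps the input shape."""
    img = ndi.rotate(img, angle=rng.uniform(-5, 5), reshape=False, mode="nearest")
    scale = rng.uniform(0.9, 1.1)
    shift = rng.uniform(-3, 3, size=img.ndim)  # pixels
    # The affine matrix scales about the origin; the offset applies the shift.
    img = ndi.affine_transform(img, np.eye(img.ndim) / scale,
                               offset=shift, mode="nearest")
    return img

rng = np.random.default_rng(0)
word_image = np.zeros((32, 128))  # dummy grayscale word image
assert augment(word_image, rng).shape == word_image.shape
```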
Business outcome
As automotive giants worldwide continue to raise the bar for driving safety, comfort, and efficiency, it’s important for car companies to stay ahead of the curve in the face of fierce competition. The keyboard application we devised will allow automakers to maintain customer satisfaction rates above the industry average and even increase them with time. Our highly responsive solution offers intuitive interactions and nearly effortless input. It is inclusive by nature and boasts impressive word recognition accuracy. This application is a fundamental step toward changing the driving experience for users.
Intellias engineers brought a lot of industry-specific expertise and important skills to the table for this project, including automotive competence, experience with location and navigation systems, Android software development skills, and knowledge of machine learning.
Now that we have achieved an acceptable and sufficient level of accuracy for handwritten text recognition, the next step is to improve our neural network by testing it on model cars. We’re also planning to conduct additional research on classical computer vision approaches (edge extraction and edge classification) to further increase performance and optimize this product for embedded use. Once quality is close to the target level, our next goal will be to train the model to recognize over 30 languages and turn our solution into a multilingual product.