Persistent Systems has developed an app that employs object-recognition technology to help visually impaired people navigate their everyday lives.
The Drishti app uses the smartphone’s camera to identify and classify objects and people, and it describes what it sees to users, according to TECH2.COM. However, Drishti isn’t limited to perceiving individual items: the app is capable of interpreting the events and objects that make up an entire scene, providing critical context to users.
At its most basic level, the app can relay a simple audio description of something it sees, such as text written on a sign. But Drishti can also understand and relate the details of its surroundings in a comprehensive fashion. For example, it can note that a man in a black shirt is walking along a path flanked by green foliage.
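Persistent Systems has not published Drishti’s model details, so the following is only a minimal sketch of this kind of spoken scene description. It assumes an off-the-shelf image-captioning model (BLIP, via the Hugging Face transformers pipeline) and the pyttsx3 text-to-speech library as stand-ins for whatever the app actually uses.

```python
# Illustrative sketch only: Drishti's actual models are not public.
# An open-source captioning model (BLIP) and the pyttsx3 text-to-speech
# library stand in for the app's pipeline.
from transformers import pipeline
import pyttsx3

# Load a pretrained image-captioning model (assumption: any
# image-to-text model would serve for this demonstration).
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

def describe_scene(image_path: str) -> str:
    """Generate a one-sentence description of a photo, e.g. a camera frame."""
    result = captioner(image_path)
    return result[0]["generated_text"]

def speak(text: str) -> None:
    """Read the description aloud, as Drishti does for its users."""
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

if __name__ == "__main__":
    # Hypothetical input file; output might read
    # "a man in a black shirt walking down a path".
    speak(describe_scene("street_scene.jpg"))
```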
To train the app, Pune, India-based Persistent Systems said it connected its software to a user’s Google+ photos and Facebook account to gather images. The company then used its machine-learning algorithm to study the data and develop contextual information about the user’s daily life. With that context, Drishti’s object-recognition capabilities allowed it to identify people the user frequently encounters, such as friends and family.
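Persistent Systems hasn’t described the matching algorithm itself. A common way to implement this kind of familiar-face recognition is to compute face embeddings from labeled photos and compare each new camera frame against them; the sketch below uses the open-source face_recognition library, with hypothetical names and file paths, purely as an illustration.

```python
# Hypothetical sketch of familiar-face matching, not Persistent Systems' code.
# Uses the open-source face_recognition library; names and paths are examples.
import face_recognition

# Build a gallery of known people from labeled photos (in Drishti's case,
# these came from the user's Google+ and Facebook images).
known_names = ["Asha", "Ravi"]
known_encodings = [
    face_recognition.face_encodings(face_recognition.load_image_file("asha.jpg"))[0],
    face_recognition.face_encodings(face_recognition.load_image_file("ravi.jpg"))[0],
]

def identify(frame_path: str) -> list[str]:
    """Return the names of any known people found in a camera frame."""
    frame = face_recognition.load_image_file(frame_path)
    seen = []
    for encoding in face_recognition.face_encodings(frame):
        matches = face_recognition.compare_faces(known_encodings, encoding, tolerance=0.6)
        for name, matched in zip(known_names, matches):
            if matched:
                seen.append(name)
    return seen

print(identify("camera_frame.jpg"))  # e.g. ["Asha"]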
Drishti requires no specialized hardware. Lighter processing tasks are performed locally on the smartphone, while more processor-intensive work is handled in the cloud. Persistent Systems said it developed the software and launched it on Google Compute Engine, the infrastructure-as-a-service component of the Google Cloud Platform.
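That split, quick tasks on the handset and heavier inference on a remote service, can be sketched roughly as follows. The endpoint URL, the task routing, and the on-device helper are hypothetical placeholders, not details the company has published.

```python
# Rough sketch of an on-device/cloud split; the endpoint URL, the task
# routing, and the helper function are hypothetical illustrations.
import requests

CLOUD_ENDPOINT = "https://drishti-backend.example.com/describe"  # placeholder

def describe(image_bytes: bytes, task: str) -> str:
    if task == "read_text":
        # Lightweight work, such as reading a sign, can run on the phone.
        return run_on_device_ocr(image_bytes)
    # Full scene interpretation is heavier, so it goes to a model hosted
    # in the cloud (on Google Compute Engine, per the company).
    response = requests.post(
        CLOUD_ENDPOINT,
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["description"]

def run_on_device_ocr(image_bytes: bytes) -> str:
    # Stand-in for a lightweight on-device OCR model.
    return "Exit 12, Main Street"
```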
The company said it developed the app in just 24 hours as part of a global hackathon event it hosts. Persistent Systems is now considering whether to monetize the product. Rather than offering Drishti as a standalone product, the company said its technology might be integrated into other applications with broader functionality, according to TECH2.COM.
Persistent Systems said Drishti was inspired by Seeing AI, a Microsoft project built with Microsoft Cognitive Services.
Seeing AI pairs smart sunglasses with a mobile app that describes real-world events to visually impaired users by voice. Like Drishti, Seeing AI interprets the nearby scene. For example, when the user touches the sunglasses, Seeing AI can describe the appearance and mood of people nearby. It can also relay descriptions of actions, such as noting that a young man nearby is performing a skateboard trick.
To build Seeing AI, a Microsoft researcher employed the company’s artificial intelligence application programming interfaces (APIs). The five APIs used were Microsoft Face Recognition, Barcode and QR Code Scanning, Object Recognition, Text-to-Speech, and Custom Recognition.
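As one concrete illustration of these building blocks, the sketch below calls the Cognitive Services Face API’s v1.0 detect endpoint, which at the time could return emotion scores of the kind Seeing AI draws on for mood descriptions. The region and subscription key are placeholders, and this is an assumption-laden example, not Seeing AI’s actual code.

```python
# Illustration of one Cognitive Services building block: the Face API's
# v1.0 detect endpoint with emotion attributes. The region in the URL
# and the subscription key are placeholders.
import requests

FACE_API_URL = "https://westus.api.cognitive.microsoft.com/face/v1.0/detect"
SUBSCRIPTION_KEY = "<your-face-api-key>"  # placeholder

def detect_moods(image_bytes: bytes) -> list[str]:
    """Return the dominant emotion for each face found in the image."""
    response = requests.post(
        FACE_API_URL,
        params={"returnFaceAttributes": "emotion"},
        headers={
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
            "Content-Type": "application/octet-stream",
        },
        data=image_bytes,
    )
    response.raise_for_status()
    moods = []
    for face in response.json():
        emotions = face["faceAttributes"]["emotion"]  # e.g. {"happiness": 0.9, ...}
        moods.append(max(emotions, key=emotions.get))
    return moods

with open("people.jpg", "rb") as f:  # hypothetical input image
    print(detect_moods(f.read()))    # e.g. ["happiness", "neutral"]
```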
Nirel Marofsky is a project analyst for the cognitive engine and application ecosystem at Veritone. She acts as a liaison to strategic partners, integrating developers and their capabilities into the Veritone Platform. Learn more about our platform and join the Veritone developer ecosystem today.