Is 2018 pointing towards the end of mobile as we know it?
The latest developments in Voice, Vision and AI
The pocket computer is becoming more powerful and interactive than ever before. In late 2017 we saw significant improvements in the hardware available in mobile devices, greatly expanding the opportunities for interaction design. Cameras are evolving from an input device into an intelligent interactive medium, the microphone is now a direct gateway to artificial intelligence, and NFC is bringing the IoT even closer.
Flagship devices such as the Pixel 2, iPhone X and Samsung Galaxy S8 are making all of this accessible while providing interaction designers and software engineers with opportunities to create new, meaningful experiences for users.
Are these new devices simply the latest cool gadgets or highly polished prototypes for the next generation of devices?
Hints in User Interaction
The traditional input elements are naturally still available on today’s devices. Haptic actions such as tapping and gestures still dominate the way we interact with content on our mobile devices. Yet the power of AI, manifested through improved voice and vision interaction, is changing the game.
Voice Interaction
The microphone is one of the most essential input devices on a phone, yet it is generally taken for granted: apart from its role in phone calls, it was rarely put to proper use. With the rise of voice-driven AI agents such as the Google Assistant and Siri, the microphone has become a gateway to artificial intelligence.
The Voice User Interface (VUI) has evolved over recent years. It started off as an alternative to the keyboard, then came to be seen as a medium for giving commands to machines. Subsequently, in the early versions of Siri and Google Voice Search, it also became a gateway to search engines, providing an alternative way for users to search the web. Over the past year, these techniques have become more refined and are truly realising the idea of an assistant in one’s pocket. The software can now be linked to calendars and other applications for better context. It also acts as an interface to the IoT, while letting users have small talk or play games with the agent itself. This video outlines some features of the Google Assistant.
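To make the “microphone as a gateway” idea concrete, here is a minimal Kotlin sketch of how an Android app can hand a spoken query to the platform speech recogniser via RecognizerIntent; the request code and prompt string are arbitrary choices for illustration, not part of any particular assistant API.

```kotlin
import android.app.Activity
import android.content.Intent
import android.speech.RecognizerIntent

// Arbitrary request code for this example.
const val REQUEST_SPEECH = 42

// Minimal sketch: launch the platform speech recogniser from an Activity.
fun Activity.startVoiceQuery() {
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(
            RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
        )
        putExtra(RecognizerIntent.EXTRA_PROMPT, "What would you like to do?")
    }
    startActivityForResult(intent, REQUEST_SPEECH)
}

// In onActivityResult, the recognised text arrives as a list of candidate strings:
// data?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)?.firstOrNull()
```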
Smart(phone) Vision
In the late 1990s, when mobile devices became truly accessible, they were devices that could make and receive calls and send SMS. In the following decade they gained colour displays and an integrated camera. Up until recent years, the camera in a smartphone served mostly a single purpose, that of a digital camera, with the convenience of being part of your personal device: we no longer needed to carry a phone and a camera separately, a single device was enough.

However, efforts were being made to make better use of this rich and powerful input device, and the camera is now becoming another primary means of interaction on our phones.
In reality, it is no longer even correct to refer to the phone camera. The latest phones carry multiple cameras, and the hardware is finally becoming available for software engineers to integrate computer vision techniques into mobile applications. These multiple cameras include the emerging trend of stereo cameras on the back of the phone, opening the door to stereo vision and the depth computation that comes with it. So far, manufacturers and OS designers are making minimal use of this powerful capability, and its most popular application is computational photography. The results are impressive, and we can’t wait to see it fully exploited for effective augmented reality, which looks promising following the release of ARCore and ARKit.
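As a taste of what that could look like in code, below is a minimal Kotlin sketch of setting up an ARCore session that tracks horizontal surfaces; it assumes the ARCore SDK is on the classpath and omits camera permission checks, error handling and the GL rendering loop.

```kotlin
import android.content.Context
import com.google.ar.core.Config
import com.google.ar.core.Plane
import com.google.ar.core.Session
import com.google.ar.core.TrackingState

// Minimal sketch of an ARCore session configured to look for horizontal surfaces.
fun createArSession(context: Context): Session {
    val session = Session(context)
    session.configure(Config(session).apply {
        planeFindingMode = Config.PlaneFindingMode.HORIZONTAL
    })
    return session
}

// Called once per rendered frame: ask ARCore which planes it is currently tracking.
fun trackedPlanes(session: Session): List<Plane> =
    session.getAllTrackables(Plane::class.java)
        .filter { it.trackingState == TrackingState.TRACKING }
```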
With the iPhone X, Apple also revolutionised the configuration of the front camera system. The front camera is accompanied by an infrared camera, a dot projector and three light sensors that provide precise tracking of facial expressions. This powers the Face ID feature, which lets users unlock the phone just by looking at it, and Apple’s engineers also found a playful application for this sophisticated technology in Animoji. Yet this is only the tip of the iceberg. Through AI techniques, developers may soon be able to detect users’ expressions while they use an application and relate those expressions to the content being presented. The possibilities are endless.
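As a hint of how expression detection might look on Android today, here is a minimal Kotlin sketch using the Google Mobile Vision face detector to estimate whether the user is smiling; the helper function name and the idea of mapping the result to on-screen content are illustrative assumptions, not an established pattern.

```kotlin
import android.content.Context
import android.graphics.Bitmap
import com.google.android.gms.vision.Frame
import com.google.android.gms.vision.face.FaceDetector

// Minimal sketch: estimate how likely the user is smiling in a single camera frame
// using the Google Mobile Vision face detector with classification enabled.
fun smilingProbability(context: Context, frame: Bitmap): Float? {
    val detector = FaceDetector.Builder(context)
        .setClassificationType(FaceDetector.ALL_CLASSIFICATIONS)
        .build()
    try {
        val faces = detector.detect(Frame.Builder().setBitmap(frame).build())
        // Take the first detected face, if any; negative values mean "unknown".
        val face = (0 until faces.size()).map { faces.valueAt(it) }.firstOrNull()
        return face?.isSmilingProbability?.takeIf { it >= 0f }
    } finally {
        detector.release()
    }
}
```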

The recent developments in deep learning are also enabling rich features on mobile devices. Open-source frameworks such as TensorFlow now ship a lite version (TensorFlow Lite) that developers can deploy on smartphones, opening up tasks such as object detection and recognition. The Google Lens project is already showcasing features such as recognising paintings, OCR and detecting products in images. All this paves the way to proper Augmented Reality.
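A minimal Kotlin sketch of on-device inference with TensorFlow Lite is shown below; the model file, 224×224 input assumption and the 1001-class output size are illustrative and depend entirely on the model you bundle with your app, and preprocessing is reduced to packing floats into a buffer.

```kotlin
import java.io.File
import java.nio.ByteBuffer
import java.nio.ByteOrder
import org.tensorflow.lite.Interpreter

// Minimal sketch: run an on-device image classifier with TensorFlow Lite.
// `pixels` is assumed to be the preprocessed RGB input the model expects.
fun classify(modelFile: File, pixels: FloatArray): FloatArray {
    val interpreter = Interpreter(modelFile)

    // Pack the preprocessed pixels into a direct buffer for the interpreter.
    val input = ByteBuffer.allocateDirect(4 * pixels.size).order(ByteOrder.nativeOrder())
    pixels.forEach { input.putFloat(it) }
    input.rewind()

    // One probability per class; 1001 matches the standard MobileNet label set.
    val output = Array(1) { FloatArray(1001) }
    interpreter.run(input, output)
    interpreter.close()
    return output[0]
}
```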

Handheld Controllers
High-performing mobile devices are making Virtual Reality accessible through headsets such as Samsung’s Gear VR or Google’s Daydream View. The challenge with such a configuration is that the device sits inside a headset and therefore serves solely as a display, which naturally makes interaction difficult.
At first, headsets such as Google Cardboard had a built-in magnetic trigger, which later evolved into a tapping button, to let the user interact. This was nonetheless restricted to a single button, so interaction remained limited.
With the Daydream controller, the game changed. This sensor-packed controller lets users interact within the virtual world like never before. It has buttons to exit or switch apps, as well as a sensitive, clickable trackpad for finer movement and interaction. It also packs a gyroscope and an accelerometer that track hand movements as you explore the virtual world.
Other Sensors
While we enjoy the above-mentioned cutting-edge interaction media on our mobile devices, we should not forget the other sensors that add context to our apps. All of the above can be integrated with location data from GPS or Galileo signals, and further enriched by coupling it with compass readings and accelerometer/gyroscope data. The possibilities are, once again, endless.
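As a small illustration, the Kotlin sketch below derives a compass heading from Android’s rotation-vector sensor, which fuses accelerometer, gyroscope and magnetometer readings; combining the heading with a GPS/Galileo fix is left as a comment, since the location API you use will vary by app.

```kotlin
import android.content.Context
import android.hardware.Sensor
import android.hardware.SensorEvent
import android.hardware.SensorEventListener
import android.hardware.SensorManager

// Minimal sketch: derive the device's compass heading from the rotation-vector sensor.
class HeadingListener : SensorEventListener {
    override fun onSensorChanged(event: SensorEvent) {
        val rotationMatrix = FloatArray(9)
        val orientation = FloatArray(3)
        SensorManager.getRotationMatrixFromVector(rotationMatrix, event.values)
        SensorManager.getOrientation(rotationMatrix, orientation)
        val headingDegrees = Math.toDegrees(orientation[0].toDouble())
        // Combine headingDegrees with a GPS/Galileo location fix here for spatial context.
    }

    override fun onAccuracyChanged(sensor: Sensor, accuracy: Int) = Unit
}

fun startHeadingUpdates(context: Context, listener: HeadingListener) {
    val sensorManager = context.getSystemService(Context.SENSOR_SERVICE) as SensorManager
    val rotationSensor = sensorManager.getDefaultSensor(Sensor.TYPE_ROTATION_VECTOR)
    sensorManager.registerListener(listener, rotationSensor, SensorManager.SENSOR_DELAY_UI)
}
```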
The Pixel 2 just introduced a new way of interacting with a mobile device… squeezing. No, that’s not a typo. You can squeeze the device and it will bring up the Google Assistant interface. It’s like squeezing someone’s arm to get their attention, except this time your agent lives in the phone, so you squeeze the phone instead. This, too, opens up opportunities for new experience design on mobile devices.
End of mobile as we know it?
Well, it could be, and all of this points precisely in that direction. In retrospect, today’s latest phones will look like the prototypes for the next generation of devices. Those devices will probably be something along the lines of Google Glass, relying only minimally on haptic interaction. The level of performance being achieved by VUI agents and on-device artificial vision is, in my opinion, the confirmation we were waiting for.
During the F8 conference in 2017, Mark Zuckerberg clearly indicated that Facebook’s strategy is to provide a new view of the world through Augmented Reality, delivered via devices that look like glasses.
A few months later, Google also relaunched its Glass project through its X company. This time Google is focusing on enterprise applications, helping workers perform better by using this hands-free device. Interestingly, browsing through the official page makes it even clearer that the enablers are Voice, Vision and AI. All the items I highlighted in this blog post point in one direction: today’s latest phones are prototyping the next generation of devices.
For now, we’ll just wait, design and code 😎