Currently, Facebook’s Spark AR Studio is restrictive with the audio formats it supports: only M4A with specific settings is allowed. This short tutorial shows how to convert artificially generated neural voices (in this case, an MP3 file produced by Amazon Polly) to the M4A format accepted by Spark AR. I’m using the free Audacity tool, which integrates the open-source FFmpeg plug-in.
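If you prefer a scripted alternative to the Audacity workflow, the same conversion can be sketched in Python with pydub, which also delegates to FFmpeg. This is only a minimal sketch assuming FFmpeg is installed; the file names are placeholders, and you should verify that the exported file really meets Spark AR’s specific requirements:

```python
# Minimal sketch: MP3 -> M4A (AAC) conversion via pydub.
# Assumes FFmpeg is installed and available on the PATH.
from pydub import AudioSegment

voice = AudioSegment.from_mp3("polly-voice.mp3")  # placeholder file name
# "ipod" is FFmpeg's preset for AAC audio in an MP4 container (.m4a)
voice.export("polly-voice.m4a", format="ipod", bitrate="128k")
```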
Neither Amazon Polly nor the Microsoft Azure Text-to-Speech cognitive service can directly produce an M4A audio file. In its additional settings, Polly offers MP3, OGG, PCM and Speech Marks. MP3 goes up to a sample rate of 24,000 Hz, while PCM is limited to 16,000 Hz.
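For reference, this is roughly what such a Polly request looks like with the boto3 SDK; the voice, text and file name in this sketch are placeholder choices:

```python
# Minimal sketch: requesting neural speech from Amazon Polly via boto3.
# Polly returns MP3 / OGG / PCM - the M4A conversion happens afterwards.
import boto3

polly = boto3.client("polly")
response = polly.synthesize_speech(
    Engine="neural",
    VoiceId="Joanna",      # placeholder voice
    OutputFormat="mp3",    # MP3 supports sample rates up to 24,000 Hz
    SampleRate="24000",
    Text="Hello from Amazon Polly!",
)
with open("polly-voice.mp3", "wb") as f:
    f.write(response["AudioStream"].read())
```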
In a recent research project, we explored possibilities for interactive storytelling, usability, and interaction methods of an Augmented Reality app for patient education. We developed an ARCore app with Unity that helps patients with strabismus to better understand the processes of examinations and eye surgeries. Afterwards, we performed a two-phase evaluation with a total of 24 test subjects.
Low health literacy is a well-known and serious issue: 1 in 5 American adults lacks the skills to fully understand the implications of processes related to their health. Audio and computer-aided instructions can be helpful; spoken instructions in particular lead to a higher rate of understanding. A smartphone app that combines multiple approaches can therefore provide great benefits.
We developed and evaluated a prototype Augmented Reality (AR) mobile application called Enlightening Patients with Augmented Reality (EPAR). The app is designed for patient education about strabismus and the corresponding eye surgery. It is intended to be used in addition to the doctors’ mandatory consultations.
With 2D image tracking, you can create real-life anchors. You need pre-defined markers; Google calls this system Augmented Images. Just point your phone at the image, and your app immediately makes the 3D model appear on top of it.
In the previous part of the tutorial, we wrote Unity scripts so that the user could place 3D models in the Augmented Reality world. A raycast from the smartphone’s screen hit a trackable in the real world, where we then anchored the object. However, this approach requires user interaction, as well as a well-designed user experience that guides users, especially if they’re new to AR.
Using 2D Image Tracking
You need to provide reference images, which your app’s users will then encounter in the real world. AR Foundation recognizes these images in the camera stream and tracks their physical location.
There are several usage scenarios where 2D image tracking is helpful.
In the first two parts, we set up an AR Foundation project in Unity. Next, we looked at how to handle trackables in AR. Now, we’re finally ready to place virtual objects in the real world. For this, we perform a raycast and then create an anchor at the target position. How do you perform this with AR Foundation? How do you attach an anchor to the world or to a plane?
AR Raycast Manager
If you’d like to let the user place a virtual object in relation to a physical structure in the real world, you need to perform a raycast. You “shoot” a ray from the position of the finger tap into the perceived AR world. The raycast then tells you if and where this ray intersects with a trackable like a plane or a point cloud.
A traditional raycast only considers objects present in its physics system, which isn’t the case for AR Foundation trackables. Therefore, AR Foundation comes with its own variant of raycasts. They support two modes: casting a ray from a 2D position on the screen (such as a finger tap), or casting an arbitrary ray with any origin and direction in world space.
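To make the underlying math tangible, here is a small Python sketch of a ray / plane intersection. It is purely a conceptual illustration of what such a raycast computes; AR Foundation’s actual API is C# and also handles point clouds and other trackables:

```python
# Conceptual sketch: intersecting a ray with a plane trackable.
import numpy as np

def ray_plane_intersection(origin, direction, plane_point, plane_normal):
    """Return the point where the ray hits the plane, or None."""
    denom = np.dot(direction, plane_normal)
    if abs(denom) < 1e-6:  # ray runs parallel to the plane
        return None
    t = np.dot(plane_point - origin, plane_normal) / denom
    if t < 0:              # the plane lies behind the ray origin
        return None
    return origin + t * direction

# Example: a ray shot straight down from the camera onto a floor plane
hit = ray_plane_intersection(
    origin=np.array([0.0, 1.5, 0.0]),
    direction=np.array([0.0, -1.0, 0.0]),
    plane_point=np.array([0.0, 0.0, 0.0]),
    plane_normal=np.array([0.0, 1.0, 0.0]),
)
print(hit)  # -> [0. 0. 0.]
```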
After setting up the initial AR Foundation project in Unity in part 1, we’re now adding the first basic augmented reality features to our project. How does AR Foundation ensure that your virtual 3D objects stay in place in the live camera view by moving them accordingly in Unity’s world space? AR Foundation uses the concept of trackables. For each AR feature you’d like to use, you add a corresponding trackable manager to your AR Session Origin.
Trackables
In general, a trackable in AR Foundation is anything that can be detected and tracked in the real world. This starts with basics like anchors, point clouds and planes. More advanced trackables include environmental probes for realistic reflection cube maps, tracked faces, and even information about other participants in a collaborative AR session.
Each type of trackable has a corresponding manager class as part of the AR Foundation package that we added to our project.
When developing mobile Augmented Reality apps, you usually want to target both Android and iOS phones. AR Foundation is Unity’s approach to providing a common layer that unifies both Google’s ARCore and Apple’s ARKit. As such, it is the recommended way to build AR apps with Unity.
However, few examples and instructions are available. This article provides a thorough, step-by-step guide for getting started with AR Foundation. The full source code is available on GitHub.
AR Foundation Architecture and AR SDKs
To work with AR Foundation, you first have to understand its structure. The top layer of its modular design doesn’t hide everything else: sometimes, the platform-dependent layers and their respective capabilities shine through, and you must consider these as well.
AR Foundation is a highly modular system. At the bottom, individual provider plug-ins contain the glue to the platform-specific native AR functionality (ARCore and ARKit). On top of that, the XR Subsystems provide different functionalities through a platform-agnostic interface.
In the final part, let’s look at how we can generate and use the AR depth maps through Unity’s AR Foundation. In the previous part, we tested the ready-made example. Now, it’s time to write code ourselves.
In this case, I’m using Unity 2021.1 (Alpha) together with AR Foundation 4.1.1 to make sure we have the latest AR support & features in our app. But as written in the previous article, Unity 2020.2 should be sufficient.
I’ve tested the example on Android (Google Pixel 4 with Android 11 & ARCore), but it should work fine also on iOS with ARKit.
XR Plug-in Management: activate the management in the project settings and enable the ARCore plug-in provider. To check that everything is installed, open Window > Package Manager. You should see both AR Foundation and the ARCore XR Plugin, each with at least version 4.1.1.
Android player settings: switch to the Android build platform, uncheck multithreaded rendering, remove Vulkan from the rendering APIs, make sure the package name is personalized and finally set the minimum API level to at least 24 (Android 7.0).
Scene setup: add the required prefabs and GameObjects to your scene. Right-click in the hierarchy panel > XR > AR Session. Also add the AR Session Origin the same way.
By default, the AR depth map is always returned in Landscape Right orientation, no matter what screen orientation your app is currently in. While we could of course adapt the map to the current screen rotation, we want to keep this example focused on the depth map. Therefore, simply lock the screen orientation through Project Settings > Player > Resolution and Presentation > Orientation > Default Orientation: Landscape Right.
In the previous parts, we’ve taken a look behind the scenes and manually implemented a depth map with Python and OpenCV. Now, let’s compare the results to Unity’s AR Foundation.
How exactly do depth maps work in ARCore? While Google’s paper describes their approach in detail, their implementation is not open source.
However, Google has released a sample project along with a further paper called DepthLab. It directly accesses the ARCore Depth API and builds complete sample use cases on top of it.
In part 2, we rectified our two camera images. The last major step is stereo matching. The algorithm that Google is using for ARCore is an optimized hybrid of two previous publications: PatchMatch Stereo and HashMatch.
An implementation in OpenCV is based on Semi-Global Matching (SGM) as published by Hirschmüller. In Google’s paper, they compare their method against an implementation of Hirschmüller’s algorithm and outperform it; but for the first experiments, OpenCV’s default is good enough and provides plenty of room for experimentation.
3. Stereo Matching for the Disparity Map (Depth Map)
The OpenCV documentation includes two examples that cover stereo matching / disparity map generation: stereo image matching and depth map.
Most of the following code in this article is just an explanation of the configuration options based on the documentation. Choosing suitable values for the scenes you expect is crucial to the success of this algorithm. Some insights are listed in the Choosing Good Stereo Parameters article. These are the most important settings to consider (a configuration sketch follows the list):
Block size: if set to 1, the algorithm matches on the pixel level. Especially for higher resolution images, bigger block sizes often lead to a cleaner result.
Minimum / maximum disparity: this should match the expected movements of objects within the images. In freely moving camera settings, a negative disparity can occur as well – when the camera doesn’t only move but also rotates, some parts of the image might move from left to right between keyframes, while other parts move from right to left.
Speckle: the algorithm already includes some smoothing by filtering out small speckles whose depths differ from their surroundings.
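As a concrete starting point, this is how such a configuration could look with OpenCV’s StereoSGBM in Python; the values are illustrative defaults, not tuned for any specific scene:

```python
# Sketch: configuring OpenCV's semi-global block matcher.
import cv2

block_size = 5
min_disparity = -32       # allow right-to-left motion between keyframes
num_disparities = 16 * 6  # search range; must be a multiple of 16

stereo = cv2.StereoSGBM_create(
    minDisparity=min_disparity,
    numDisparities=num_disparities,
    blockSize=block_size,
    P1=8 * block_size ** 2,    # penalty for small disparity changes
    P2=32 * block_size ** 2,   # stronger penalty for large disparity jumps
    disp12MaxDiff=1,
    uniquenessRatio=10,
    speckleWindowSize=100,     # suppress small speckles...
    speckleRange=2,            # ...that differ from their surroundings
)
```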
Visualizing Results of Stereo Matching
I’ve chosen values that work well for the sample images I have captured. After configuring these values, computing the disparity map is a simple function call supplying both rectified images.
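In Python with OpenCV, that call could look like the following sketch; the file names are placeholders, and in practice you would reuse the matcher configured above:

```python
# Sketch: computing and visualizing the disparity map.
import cv2

left = cv2.imread("left_rectified.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rectified.png", cv2.IMREAD_GRAYSCALE)

# (Re)create the matcher - in practice, reuse the configured one from above
stereo = cv2.StereoSGBM_create(numDisparities=96, blockSize=5)

# StereoSGBM returns fixed-point disparities, scaled by 16
disparity = stereo.compute(left, right).astype("float32") / 16.0

# Normalize to 0..255 so the map can be saved / displayed as an image
vis = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX)
cv2.imwrite("disparity.png", vis.astype("uint8"))
```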
In part 1 of the article series, we identified the key steps to create a depth map. We captured a scene from two distinct positions and loaded the images with Python and OpenCV. However, the images don’t line up perfectly. A process called stereo rectification is crucial to easily compare pixels in both images and triangulate the scene’s depth!
For triangulation, we need to match each pixel in one image with the corresponding pixel in the other image. When the camera rotates or moves forward / backward, the pixels don’t just move left or right; they could also end up further up or down in the image. That makes matching difficult.
Warping Images for Stereo Rectification
Image rectification warps both images so that they appear as if they had been taken with only a horizontal displacement. This simplifies calculating the disparities of each pixel!
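As a brief aside on why disparities are so valuable: for an idealized rectified pair with focal length f (in pixels) and baseline B (the distance between the two camera positions), a pixel with disparity d lies at depth Z = f · B / d. Larger disparities therefore correspond to closer objects.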
With smartphone-based AR like in ARCore, the user can freely move the camera in the real world. The depth map algorithm only has the freedom to choose two distinct keyframes from the live camera stream. As such, the stereo rectification needs to be very intelligent in matching & warping the images!
In more technical terms, this means that after stereo rectification, all epipolar lines are parallel to the horizontal axis of the image.
Stereo rectification involves two important tasks:
Detect keypoints in each image.
Then, keep only the best keypoints – those we are confident are matched correctly in both images – and use them to calculate the reprojection matrices.
Using these, we can rectify the images to a common image plane. Matching keypoints then lie on the same horizontal epipolar line in both images. This enables efficient pixel / block comparison to calculate the disparity map (= how much offset the same block has between both images) for all regions of the image (not just the keypoints!).
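The following Python sketch chains these steps together with OpenCV; the file names and parameter choices are illustrative, and the article’s own implementation may differ in its details:

```python
# Sketch: uncalibrated stereo rectification with OpenCV.
import cv2
import numpy as np

left = cv2.imread("keyframe_1.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("keyframe_2.png", cv2.IMREAD_GRAYSCALE)

# 1. Detect keypoints and descriptors in each image
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(left, None)
kp2, des2 = orb.detectAndCompute(right, None)

# 2. Keep only the best matches between both images
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# 3. Estimate the fundamental matrix and derive the homographies
#    that map both images to a common image plane
F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
h, w = left.shape
_, H1, H2 = cv2.stereoRectifyUncalibrated(pts1, pts2, F, (w, h))

# 4. Warp both images; matching points now share horizontal epipolar lines
left_rect = cv2.warpPerspective(left, H1, (w, h))
right_rect = cv2.warpPerspective(right, H2, (w, h))
```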
Google’s research improves upon the work of Pollefeys et al. and additionally addresses issues that might occur, especially in mobile scenarios.