Depth sensing with a camera phone...

Hello,

I’m trying to work out whether depth sensing using a single camera is possible.

Had a Tango for a while, and I'm trying to make the same sort of content work on regular phones, so I've been messing around with ARToolKit and Vuforia. I've had some success using a 2D image target, but it's ultimately not as stable as I would like (it glitches when the camera is perpendicular to the image). I was wondering about solving that problem by building/3D printing a cone or pyramid with IR LEDs distributed on the faces as fiducial markers, then using a webcam or phone camera to see the IR light and work out the depth from the separation distance of the IR LED cluster.

Does this seem feasible? I'm not massively familiar with how IR depth tracking works, other than knowing you usually need a stereoscopic IR projector and camera. Does anyone know if the projector and camera need to be in sync, similar to ultrasonic pings, or is it possible to decouple them?

In terms of software I know of OpenCV, and I've just seen that it can work with Unity, which excites me. I just need to know if this seems possible to someone more knowledgeable than myself!
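To make the OpenCV part concrete: the first job with IR LEDs would be picking bright spots out of a frame and finding their centres. In real code you'd point OpenCV's cv2.SimpleBlobDetector or cv2.connectedComponentsWithStats at a thresholded camera image; this is just a dependency-free sketch of the idea on a toy grayscale frame (all numbers invented):

```python
# Toy sketch of bright-spot ("blob") detection, the core of tracking IR
# LEDs in a frame. A real pipeline would use OpenCV on actual images;
# this flood-fill version only illustrates the idea.

def find_bright_blobs(frame, threshold=200):
    """Return the centroid (row, col) of each connected bright region."""
    rows, cols = len(frame), len(frame[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if frame[r][c] >= threshold and not seen[r][c]:
                # Flood-fill the connected region of bright pixels.
                stack, pixels = [(r, c)], []
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and frame[ny][nx] >= threshold
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                cy = sum(p[0] for p in pixels) / len(pixels)
                cx = sum(p[1] for p in pixels) / len(pixels)
                blobs.append((cy, cx))
    return blobs

# Two bright LED spots on a dark background:
frame = [[0] * 8 for _ in range(6)]
frame[1][1] = frame[1][2] = 255      # LED 1
frame[4][5] = frame[4][6] = 255      # LED 2
print(find_bright_blobs(frame))      # two centroids, one per LED
```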

Is anyone around tonight to have a chat about it?

Rich, I have never really done anything this complex with a camera module. I understand what you are trying to achieve but it’s a bit out of my bailiwick. Some of the others have played around with camera modules and so may be able to assist you further…

There’s one technique called photogrammetry: this involves taking a series of pictures around an object, then using number crunching to work out the 3D model. This is different from the IR / depth sensing approach and requires more processing power but less hardware. I’m not sure if you’re interested in that form of 3D scanning, but the one app I tried that worked the best was the 123D Catch Android app:
http://www.123dapp.com/catch
I found this app, but it’s more difficult to use

Awesome, thanks for the links Richard. I got distracted with finishing off AR Chess, so I didn’t bosh down tonight.

I should elaborate a little further on the idea. I came across using cones as markers for Augmented Reality applications after experimenting with 2D markers:

2D markers work reasonably well, so I thought I could improve on them with a 3D marker to handle perpendicular viewing: a shape whose dimensions you know exactly, so you have a known reference point for scale. The reason for putting IR LEDs on top was for extra depth-sensing stability, but I could probably get away with tracking regular LEDs.

The links Richard provided are for scanning an unknown object, which is great, but I’m wondering: if you already know information about the object, can we gain some insight into the surroundings to help anchor points in an AR context?

I’m thinking it could work using a cone with a spiral-like distribution of dots/LEDs: track a specific colour and try to recognise clusters from their angle, separation and intensity. Just thinking out loud; not sure if it would work.
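To sketch the "recognise clusters" half of that out loud too: once dots have been detected (in real code, by colour thresholding with something like cv2.inRange in HSV), grouping them by proximity is a simple chaining step. The dot coordinates below are made-up pixel positions:

```python
# Sketch of grouping detected dot centroids into clusters by proximity,
# a first step toward recognising an LED "constellation" on the cone.
import math

def cluster_points(points, max_gap=20.0):
    """Group 2D points: two points share a cluster if a chain of points,
    each within max_gap pixels of the next, connects them."""
    unassigned = list(points)
    clusters = []
    while unassigned:
        cluster = [unassigned.pop()]
        grew = True
        while grew:
            grew = False
            for p in unassigned[:]:
                if any(math.dist(p, q) <= max_gap for q in cluster):
                    cluster.append(p)
                    unassigned.remove(p)
                    grew = True
        clusters.append(cluster)
    return clusters

dots = [(10, 10), (18, 12), (14, 20),        # tight cluster of three dots
        (200, 200), (210, 205)]              # second cluster, far away
clusters = cluster_points(dots)
print(len(clusters))   # 2
```

From there, the relative geometry within each cluster (angles and separations, as above) is what would let you tell which face of the cone you are looking at.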

If you are fixed on a phone camera then I’m going to struggle to add anything - but if you just want a single camera then I have a couple of ideas:

  • you could use a known-sized item (or a series of them receding into the distance) as a guide
  • you could project a grid onto the item you are scanning; the deformation of the grid will give you some idea of depth.
  • you could use a reflective sphere to judge the light source and work out your 3d space from that - this is how they do it with Harry the lizard in “Death in Paradise”. This video: https://www.youtube.com/watch?v=LWB6ESuarwY might help you understand the technique.
  • you could photograph your object in a box that already has the grid printed on the walls, ceiling and floor.
  • if your camera is a Lytro then you have all that information from the camera, I believe.
  • This video is pretty much what you are doing: https://www.youtube.com/watch?v=X8uDVL1iy9o
  • this video: https://www.youtube.com/watch?v=5otNClEEGSQ may also offer some clues.
  • a Kinect can, I believe, do what you want - but the definition is not great.
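The first bullet (the known-sized item) boils down to one line of pinhole-camera maths: an object of real width W appearing w pixels wide, seen through a focal length of f pixels, is at distance Z = f * W / w. The numbers below are made up purely for illustration:

```python
# The "known-sized item as a guide" idea as pinhole arithmetic.

def distance_from_known_size(focal_px, real_width_m, apparent_width_px):
    """Distance to an object of known real width from its apparent
    width in pixels: Z = f * W / w."""
    return focal_px * real_width_m / apparent_width_px

# A 0.10 m wide marker appearing 80 px wide, with an 800 px focal length:
z = distance_from_known_size(800.0, 0.10, 80.0)
print(z)   # 1.0 (metre)
```

A series of such items receding into the distance just gives you this estimate at several depths at once.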

Good list! Makes it sound do-able!

Was thinking about the apps for recognising QR codes, in relation to recognising a printed grid.

The reference object is relatively easy, but I'm not sure how much that will give you. Fundamentally, monocular vision is limited:

Is there any scope for using the front camera of the phone, in addition to the main camera, for 'rabbit style' 360 vision? Or chicken style, using the camera shake or movements?

Col

P.S. Hac people, have you heard of these? https://foxdogstudios.com/ They did their phone-interactive comedy show at Dead Cat Comedy last night and it was ace

If you are fixed on a phone camera then I’m going to struggle to add anything - but if you just want a single camera then I have a couple of ideas:

What’s the difference between a phone camera and a single camera? A phone camera is a single camera. I was thinking I could potentially even detect IR light, as some phone cameras might not have an IR filter, so I could pick it up if I had a dedicated emitter. The idea comes from the way some cameras in film studios have an IR LED array attached that beams light up to retroreflective stickers, to get an idea of where the camera is in space.

you could use a known-sized item (or a series of them receding into the distance) as a guide

you could project a grid onto the item you are scanning, the deformity of the grid will give you some idea of depth.

This is the method I’d like to explore at the moment: a known-sized item, or a series of them. What I’m trying to get is a reference point in space for the virtual content to be overlaid on.

I’m thinking that if I use a smartphone, I could take advantage of the IMU, gyro and compass for added reference points, or to manipulate the camera in the 3D scene. QR codes and 2D AR markers work well, but only at certain angles within a certain range; I want to investigate how to increase the range of interaction.

And yes, I understand monocular depth sensing is still very much an open question, but that “3D Fiducials for Scalable AR Visual Tracking” paper seems like it’s edging in the right direction. So: steal the cone idea, add some lights, and get my blob on with OpenCV!
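What the cone-of-LEDs idea asks of the software is pose estimation from known 3D points: given where the LEDs sit on the cone and where they appear in the image, recover the camera-to-marker transform. In practice OpenCV's cv2.solvePnP does exactly this; below is a minimal Direct Linear Transform (DLT) sketch of the same problem on synthetic data, where the LED coordinates, intrinsics and pose are all invented for the example:

```python
# Minimal DLT pose-from-known-3D-points sketch (what cv2.solvePnP does
# properly, with rotation handling and noise robustness).
import numpy as np

# Hypothetical LED positions on a cone, in the marker's own frame (mm).
leds = np.array([[0, 0, 0], [50, 0, 40], [-50, 0, 40],
                 [0, 50, 40], [0, -50, 80], [30, -30, 100]], float)

K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], float)  # intrinsics
t_true = np.array([10.0, -20.0, 500.0])       # marker 500 mm from camera

# Synthesise pixel observations (rotation = identity for simplicity).
cam = leds + t_true
uv = (K @ cam.T).T
uv = uv[:, :2] / uv[:, 2:3]

# DLT: two linear equations per correspondence, solved by SVD.
A = []
for (X, Y, Z), (u, v) in zip(leds, uv):
    A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
    A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
P = np.linalg.svd(np.array(A))[2][-1].reshape(3, 4)

M = np.linalg.inv(K) @ P          # M is [R|t] up to scale and sign
if M[2, 3] < 0:                   # marker must sit in front of the camera
    M = -M
M /= np.linalg.norm(M[2, :3])     # rows of a rotation have unit norm
t_est = M[:, 3]
print(np.round(t_est, 1))         # recovers something close to t_true
```

That recovered translation (and rotation, M[:, :3]) is exactly the anchor an AR overlay needs.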

I would imagine that, for rabbit vision, you would have to have cameras with near-180-degree vision. But it’s an interesting idea. Perhaps with a fish-eye adapter?

  • But rabbits are more about spotting predators, rather than working out how much farther away one blade of grass is from another. They pretty much just get their heads down and start munching.

I would suggest that most phone cameras are not really suitable. For one thing you cannot lock the focus.

One possible method might be to use a lens with a really wide aperture, giving a very shallow depth of field, then adjust the focus ring progressively to bring differing parts of the item into focus - this will give you a very linear cue as to depth.
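That focus-sweep idea needs a per-patch sharpness score, so you can tell at which focus step each region comes into focus; variance of the Laplacian is the usual metric (cv2.Laplacian(img, cv2.CV_64F).var() in OpenCV). A dependency-free toy version, on two invented patches:

```python
# Sharpness scoring for depth-from-focus: an in-focus patch has strong
# local contrast, so its Laplacian varies a lot; a defocused one doesn't.

def sharpness(img):
    """Variance of a 4-neighbour Laplacian over the interior pixels."""
    vals = []
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            lap = (img[r-1][c] + img[r+1][c] + img[r][c-1] + img[r][c+1]
                   - 4 * img[r][c])
            vals.append(lap)
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

in_focus  = [[255 if (r + c) % 2 else 0 for c in range(6)] for r in range(6)]
defocused = [[128] * 6 for _ in range(6)]   # blurred to uniform grey

print(sharpness(in_focus) > sharpness(defocused))   # True
```

Sweeping the focus ring and recording, per patch, which step maximises this score is what turns the shallow depth of field into a depth cue.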

Flat, even lighting is important for this sort of work, I would imagine.

You could really do with better kit than just a phone.

The difference between a single camera and a phone camera is primarily how much control you have over the lens focus and aperture.

But if you are looking to do infra-red then I would suggest that a Pi NoIR camera attached to a Pi would be your best bet, as phone cameras filter most of the IR out to get the colour balance correct. Better still, you could have two Pis with a camera each and do stereo vision very cheaply indeed (particularly if you used Pi Zeros) - which would be way better, particularly if you had the two cameras at right angles (or maybe 45/60 degrees?) to the object.
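For the parallel-camera version of that two-Pi setup, the depth recovery reduces to the standard disparity relation: with a baseline B between the cameras, a point seen d pixels apart between the two images lies at depth Z = f * B / d. OpenCV's cv2.StereoBM computes the disparity d densely; the arithmetic itself, with made-up numbers, is just:

```python
# Stereo depth from disparity, as used by any two-camera rig.

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a point from its pixel disparity between two parallel
    cameras a baseline apart: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

# Two Pi cameras 0.06 m apart, 800 px focal length, 24 px disparity:
z = depth_from_disparity(800.0, 0.06, 24.0)
print(z)   # 2.0 (metres)
```

(The right-angled / 45-degree arrangement suggested above would instead need the full two-view triangulation, since this simple formula assumes parallel optical axes.)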

You have not said much about the size of your object, which might rather inform the discussion.

That’s a great idea. It’s very similar to the way the Leap Motion works: that uses a stereoscopic camera and an IR emitter, and has been trained specifically to detect hands to a really high degree, all the bones and joints.

The object (I had either a pyramid or cone in mind) is of arbitrary proportions; they are just known variables, say a cone with a height of 100mm and a radius of 50mm. I was thinking about distributing LEDs around the surface, potentially in an uneven distribution, or at least one that doesn’t look the same from all sides, so you can detect clusters and work out where you are in 3D space relative to the object.

So you have a calibration step: you put your camera at a certain angle and distance from the dots/constellations on the cone, and it takes the cone as the 0,0,0 point in space. But this approach may falter if the focus cannot be locked on a phone camera.

Let’s say I’m making a tabletop AR game with a little cone you stick in the middle of the table for reference. The main piece of information you need is where the table is in space and where the graphics should be overlaid. Instead of detecting the table, what if I could infer information about it from the reference markers on the cone and its known size/shape? The reference would also help to match the virtual camera with the video feed overlay.
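Inferring the table from the cone is just a frame change: the table surface is the plane z = 0 in the cone's own coordinate frame, so once tracking gives you the cone's pose (rotation R, translation t), any point expressed relative to the cone maps into camera space as p_cam = R @ p_marker + t. A toy sketch with an invented pose:

```python
# Anchoring game pieces to the table via the cone's tracked pose.
import numpy as np

R = np.eye(3)                       # pose from tracking (identity here)
t = np.array([0.0, 0.0, 500.0])     # cone 500 mm in front of the camera

def marker_to_camera(p_marker):
    """Map a point from the cone's frame into the camera's frame."""
    return R @ np.asarray(p_marker, float) + t

# A game piece 100 mm to the side of the cone, on the table (z = 0):
p = marker_to_camera([100.0, 0.0, 0.0])
print(p)
```

Rendering that point through the same intrinsics as the video feed is what keeps the overlay matched to the camera.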

I should come and show some of the experiments I’ve been doing with 2D markers. It works surprisingly well, just not at some angles; this is just an idea to improve/expand on the current method.

Any reason you can’t use lidar for ranging?

Let’s say I’m making a tabletop AR game
Well, if /that/ is what you are doing, it sounds not unlike the PlayStation game for Harry Potter that was around a while back:
https://www.youtube.com/watch?v=2n5prI9L6G8

Might be helpful?

The horrific cost?

:wink:

Single point LIDAR is not too expensive, I was thinking of a unit like this - http://www.robotc.net/blog/wp-content/uploads/2011/11/GP2D12.pdf

I assume that works by knowing the size and orientation of the ‘optical codes’, which also means that you don’t need to range them.

I assume the wand has an IMU in it as well as the IR illuminated bulb.
