Interactable Motion Tracking User Interface Prototype

By Kenneth Olson

(Iteration Two)

Inspiration 

I was inspired by science fiction user interfaces (UI) from movies like “Oblivion” and “Minority Report,” and by other UI work from the motion designer Gmunk. I wanted to try to create a high-tech, interactable UI system using approachable low-tech tools, so that others could easily recreate it. The image above is the sample inspiration I made inside of Isadora. In the patch, the dots flash on and off, the numbers change, the lines move back and forth, and the circles rotate. Everything in the patch (except for the black grid) was made in Isadora and moved using several “Wave Generator Actors” and “Shape Actors.”

Approach

In most of the movie examples of “future UI,” the actors are interacting with some sort of black or clear display, using their hands as an input to alter or affect the objects on the display. To get Isadora to listen to and follow my hands, I used the “Eyes ++ Actor,” a web camera, black tape, and a white tabletop. My goal was to keep the overall system approachable and simple to create, and a web camera with black tape seemed like the simplest tools for the job.

The system works like this: first, wrap the user’s index fingers with black tape. Second, set up the web camera in a top-down orientation, looking down at the user’s hands. Third, use a white tabletop (a white sheet of paper also works great); this creates a high-contrast image for Isadora to track. Finally, direct the web camera output into an “Eyes ++ Actor.” From here anything is possible. Depending on lighting and other conditions, I found it helpful to add some extra Isadora actors to make the system run smoother (as shown below).

Eyes ++ Actor

The “Eyes ++ Actor” works great for this system; however, results may vary for other people. I was able to track up to three fingers at a time with relative ease. I should also note that the “Eyes ++ Actor” works by following the brightest object in the scene, so because I was using a white table and black-taped fingers I needed to turn “inverse” ON in the “Eyes ++ Actor” settings. I assume this system would also function with a black table or background and white-taped fingers; in that scenario you would leave the “inverse” setting OFF. Because my hands are so white, they blended into the white table easily, but for people with significantly darker skin than mine, I would suggest using white tape with a darker table.
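
Conceptually, the “inverse” setting just flips which brightness the tracker treats as foreground. Here is a minimal sketch of that same idea, not Isadora’s internals, just an illustration assuming a hypothetical 8-bit grayscale frame:

```python
import numpy as np

def foreground_mask(frame: np.ndarray, threshold: int = 128, inverse: bool = False) -> np.ndarray:
    """Return a boolean mask of 'trackable' pixels in an 8-bit grayscale frame.

    inverse=False : track bright objects on a dark background (white tape, dark table).
    inverse=True  : track dark objects on a bright background (black tape, white table).
    """
    if inverse:
        frame = 255 - frame  # flip brightness so the dark tape becomes the bright blob
    return frame >= threshold

# Example: a fake 4x4 frame with one dark "taped finger" pixel on a white table
frame = np.full((4, 4), 240, dtype=np.uint8)
frame[2, 1] = 20
print(foreground_mask(frame, inverse=True))
```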

Uses and Examples

I used this system three different ways:

1) Piano

2) Connect the dots

3) Multiple sliders

Piano

In this system, when I moved my finger, with the tape on it, from left to right or right to left, the lines on the screen would shrink. Sound could be added within this system, such as a piano note played when each line is triggered.
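
The mapping from finger position to piano line can be thought of as dividing the horizontal range into equal zones. Here is a hedged sketch of that logic; the 0–1 range and names are my assumptions, not the exact Eyes ++ output:

```python
def triggered_line(x_normalized: float, num_lines: int = 8) -> int:
    """Map a horizontal finger position (0.0 = left edge, 1.0 = right edge)
    to the index of the piano line it falls on."""
    x_normalized = min(max(x_normalized, 0.0), 1.0)  # clamp to the tracked range
    index = int(x_normalized * num_lines)            # which equal-width zone?
    return min(index, num_lines - 1)                 # keep 1.0 inside the last zone

# Sweeping a finger left to right would trigger lines 0, 1, 2, ... in order
for x in (0.05, 0.4, 0.95):
    print(x, "->", triggered_line(x))
```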

Connect The Dots

In this system, I used both hands, with tape on my left and right index fingers. The left dot follows my left index finger and the right dot follows my right index finger. The line is auto-generated with the “Lines Actor” and will always follow and connect the two dots.

Sliders

In this system I have two different sliders. The slider on the left controls the horizontal position of the small square found in the box on the right, and the slider on the right controls the vertical position of the small square. Used together, the sliders can move the square around within the box. An important feature I wanted was that when one slider was in use, the other slider would not move. I accomplished this with the “Panner Actor,” which selects a specific area of the web camera output to watch. In the other systems, the “Eyes ++ Actor” was using the entire web camera output to read and follow my taped finger. By using the “Panner Actor,” however, I could scale down what the “Eyes ++ Actor” could see, focusing it on a specific range of the web camera output. This meant the “Eyes ++ Actor” could only see my finger within a specific area of the table.
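
The Panner trick amounts to cropping the tracked image to a sub-region and then rescaling positions inside that region to the slider’s full travel. Here is a rough sketch of that math; the names and ranges are illustrative, not Isadora’s actual parameters:

```python
from typing import Optional

def slider_value(x: float, region_min: float, region_max: float) -> Optional[float]:
    """Convert a tracked finger position into a 0-1 slider value, but only
    when the finger is inside this slider's region of the table.

    x                      : horizontal position of the tracked finger (0-1 across the camera)
    region_min, region_max : the slice of the camera image this slider watches
    """
    if not (region_min <= x <= region_max):
        return None  # finger is outside this slider's zone: don't move it
    return (x - region_min) / (region_max - region_min)

# The left slider watches the left half of the table, the right slider the right half
print(slider_value(0.30, 0.0, 0.5))  # 0.6  -> left slider moves
print(slider_value(0.30, 0.5, 1.0))  # None -> right slider stays put
```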

Assessment

With the time I had, I accomplished what I set out to do by creating a hand-controlled science fiction user interface. I would have liked to be able to put all of the systems I created for this project together; however, my computer wouldn’t allow such things to happen. For future iterations I would like to play with scale more, perhaps replacing the finger with a full human body and having the “Eyes ++ Actor” follow the human form. The “Eyes ++ Actor” did work most of the time, but I did sometimes lose tracking of my finger, causing the visuals in the “Projector Actor” to “glitch out.” I’m not sure what was causing this issue, whether it was the web camera, the “Eyes ++ Actor,” or maybe the several other actors I used to edit the webcam footage. I would also like to find a way for the user to touch the objects being affected in Isadora; that is, the user could touch the computer screen or a projection screen and the projected objects would look like they were directly following the user’s hands on the screen, instead of indirectly following the movement of the hands on the white or black table.

Isadora Patch:


Tara Burns – “a canvasUnbound” (Cycle 3)

Cycle 3: Basement iteration

Goals
*To have panels that disappear when triggered
*To have the disappearing panels reveal an underlying theme/movie
*To use the Oculus Quest as the reveal movie

Challenges
*Everything worked in my office, and then when I moved to the basement I had to add a few more features in order for it to work. I think the version it ended up at will hopefully be better able to travel with slight modifications.
*It is very difficult to create an interactive system without a body in the space to test.
*The Oculus Quest doesn’t track without light. With directional light I did get it working, but then you couldn’t see the projection, so in the final video I opted to just use a movie; knowing that it did work is good enough for me at this point, and when/if I’m able to use directional light that doesn’t affect the projection, we can try it again. Alternately, the positive of this is that I can interact with the system more: if I’m painting in VR, I can’t see when and if I make the panels go away and where I need to dance in order to make that happen.

Moving forward
I would put this as big as possible and flip the panels to trigger on the same side as myself (the performer). I would take some time to rehearse more inside the system to come up with a score with repetition and duration that allows people to see the connections if they are looking for them. Perhaps I would use the VR headset if that works out, but I am also OK with painting and then recording (the recording is the score that corresponds with the dance) a new white score specific to the space I am performing in, to then use in performance. If the projection is large enough, I think it would be easy to see what I am triggering when the panels are on the same side as me. In my basement, I chose to trigger the opposite side because my shadow covered the whole image.

The 20 videos projection mapped to a template I made in Photoshop.
The test panel –> I used these projectors to show the whole image that the Orbbec was seeing and the slice of the frame I was using to trigger the NDI Tracker. I used the picture player to project my template for the above projection mapping.
The actors above come from the NDI tracker’s depth video -> Chroma Key (turns the tracked data into a color, in this case red) -> HSL Adjust (changes red to white) -> Zoomer (zooms the edges of the space to the exact area I want to track) -> IDlab Effect Horizontal Tilt Shift (allowed me to stretch the body so that it would cover the whole sliver we are tracking, whether upstage or downstage) -> Luminance Key (I actually don’t think it does anything, but we used it earlier to reduce the top (closest) and bottom (farthest) spaces to close in on the space I wanted to track), then on to the Panner seen below.
This is almost the whole patch. Continuing from above, the Panner allowed me to track the exact vertical space I wanted -> the Calc Brightness made that brighter, and then the Limit Scale Value created the trigger when the value fell between the two numbers requested (45–55). Here you see four of the 20 progressions to the projection-mapped panels, which are all triggered the same way; once I change the Panner / Calc Brightness / Limit Scale Value chain into a user actor, it will be easy to adjust for multiple spaces.
Here is my user actor that counts how many times each panel is triggered. It is attached to the inactive/red lines in the image above this one. The range from the Calc Brightness (the same range that goes into the Limit Scale Value) comes to this input; when triggered, it adds to the counter until it reaches 10, and the Inside Range actor spits out a trigger to turn off the “active” parameter on that panel’s projector after I have triggered it 10 times. As I move through my score, this actor deletes all the panels to reveal an underlying movie.
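
In code form, the chain described above boils down to a range check feeding a counter that disables the panel after ten hits. Here is a minimal sketch under those assumptions; the 45–55 range comes from the patch, everything else is illustrative rather than Isadora’s actual actors:

```python
class PanelCounter:
    """Counts how many times a brightness value lands in the trigger range
    and reports when the panel should be turned off."""

    def __init__(self, low=45.0, high=55.0, hits_to_disable=10):
        self.low, self.high = low, high
        self.hits_to_disable = hits_to_disable
        self.hits = 0
        self.active = True

    def update(self, brightness: float) -> bool:
        """Feed one brightness reading; returns True while the panel is still visible."""
        if self.active and self.low <= brightness <= self.high:
            self.hits += 1
            if self.hits >= self.hits_to_disable:
                self.active = False  # like switching off the projector's 'active' input
        return self.active

panel = PanelCounter()
for reading in [30, 50, 48, 60, 52] + [50] * 8:
    panel.update(reading)
print(panel.hits, panel.active)  # 10 hits -> panel turned off, movie revealed
```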

I recorded both 10 seconds of stationary and 10 seconds of slowly moving videos in the Oculus Quest (I wasn’t sure which would look better) and cut them into short clips in Davinci Resolve. I chose to use only the moving clips.

I converted all the movies to the HAP codec, and it cut my 450% load in Isadora down to 140%. This decision was prompted not because Isadora was crashing anymore, but because it was freezing when I would click through tabs.

After some research on HAP, I found a command line method using ffmpeg. My partner helped me do a batch conversion on all my videos at the same time with the addition shown above.
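
The exact command isn’t reproduced here, but a batch HAP conversion with ffmpeg generally looks something like the sketch below; the folder names are placeholders, and the plain `hap` variant is an assumption (ffmpeg’s HAP encoder also offers `hap_q` and `hap_alpha`):

```python
import subprocess
from pathlib import Path

SOURCE_DIR = Path("originals")   # hypothetical folder of source movies
OUTPUT_DIR = Path("hap")
OUTPUT_DIR.mkdir(exist_ok=True)

for movie in SOURCE_DIR.glob("*.mp4"):
    out = OUTPUT_DIR / (movie.stem + ".mov")  # HAP is normally wrapped in a QuickTime .mov
    subprocess.run(
        ["ffmpeg", "-i", str(movie), "-c:v", "hap", str(out)],
        check=True,
    )
    print("converted", movie.name, "->", out.name)
```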


Cycle Project 3

In my Cycle 3 project, I wanted to get a better filter with a harder edge to edit out the background of the images. Alex and I worked together and added some TT Sharpen effects, a Gaussian blur, and TT Sobel Edge Detection. These filters stacked on top of each other allowed me to get my entire body cut out from the background. I think if I had something like a green screen in the background, the effect would be even more precise.

This image shows the part of the picture that was removed through the filter
This image shows the larger filter I was able to create to encompass my entire body

My major goal for the third iteration of the project was adding some things to make the user experience more interesting. I added some sound effects when buttons were pushed on the Makey Makey, as well as some short animations that would play after the user took a picture with a hat on.

I also added a third environment, which is a party scene. Overall, this project allowed me to synthesize many of the tools we were working with during the class. I used the Makey Makey as the interface. I also used the Leap Motion hand sensor to allow users to rotate and resize an image.

Much of my work on this project involved compositing, as I used the depth camera to capture the image of the user as well as the filter that would allow for the removal of the background.

If I were to continue with this project further, I would want to take the composited image of the user with the hat and put them into a game-like situation, perhaps something like the games that came with the Game Boy Camera software. I found that I really enjoyed designing experiences that users would interact with and trying to figure out what would make them clear and easy to use.

Cycle 3 Isadora patch

https://1drv.ms/u/s!Ai2N4YhYaKTvgbYSo1Tsa-MdTV2ZGQ?e=EmjBNh

This is a recording of me showing the different parts of my cycle 3 project

https://1drv.ms/u/s!Ai2N4YhYaKTvgbYctMNYMgWOb7P3Sg?e=5hRoqY


Cycle Project 2

As the next step in the photo booth project, I wanted to switch from using my webcam to the Orbbec Astra camera so that I could capture depth data while I was capturing the image. With the depth data, I would use a luminance key to filter out the background portion of my image.

One of the difficult parts of this project was the resolution of the Astra camera. It incorrectly detected some parts of my face, so they became transparent when run through the luminance key. In order to combat this, I added a Gaussian blur, but it was not quite the tight filter I was looking for with my project.

This image shows how the Orbbec Astra captured the depth data
Here is the result of using the Gaussian blur to remove the background of the image.

https://1drv.ms/u/s!Ai2N4YhYaKTvgbYWfqACckMc8dOFkg?e=wsLF3P

This is a link for my code for cycle 2

https://1drv.ms/u/s!Ai2N4YhYaKTvgbYbCQq_7ZQ6ZIghtg?e=CXGBWk

This is a link to a video file of my cycle 2 presentation.


Cycle Project 1

For this project I wanted to synthesize some of the work I had done on my previous projects in the class. I wanted to create a kind of photobooth. Users would operate the booth with the Makey Makey, and then Isadora would capture a webcam image of the user. From there, the user could select different environments and add hats. For this iteration of the project, I offered a choice between a Western theme and a space theme.

After selecting the desired theme, users could adjust the size of the hat by pressing a button on the Makey Makey interface. From there, they could use the Leap Motion controller to reposition and rotate the hat, as well as resize it.

One of the most difficult parts of this project for me was figuring out the compositing using virtual stages. Additionally, I spent a lot of time trying to find ways to make the prompts (which appeared over the image) disappear before the image was taken.

This is a link to my code for the project.

https://1drv.ms/u/s!Ai2N4YhYaKTvgbYVl0LiupCRh2PFdg?e=wuKpZm

Here is the link for my class presentation.

https://1drv.ms/u/s!Ai2N4YhYaKTvgbYaMqgJRuOsmjBMog?e=rDAOMd


Depth Camera CT Scan Projection System

by Kenneth Olson

(Iteration one)

What makes dynamic projection mapping dynamic?

Recently I have been looking into dynamic projection mapping and questioned what makes dynamic projection mapping “dynamic.” I asked Google and she said: dynamic means characterized by constant change, activity, or progress. I took that to mean that for a projection mapping system to be called “dynamic,” something in the system has to involve actual physical movement of some kind, like the audience, the physical projector, or the object being projected onto. So, what makes dynamic projection mapping “dynamic”? By my classification, the use of physical movement within a projection-mapped system is what separates projection mapping from dynamic projection mapping.

How does dynamic projection mapping work?

So, most dynamic projection systems use a high-speed projector (one that can project images at a high frame rate, to reduce output lag), an array of focal lenses and drivers (to change the focus of the projector output in real time), a depth camera (to measure the distance between the projector and the object being projected onto), and a computer system with some sort of software that lets the projector, depth camera, and focusing lens talk to each other. After understanding the inner workings of some dynamic projection systems, I started to look further into how a depth camera works and how important depth is within a dynamic projection system.

What is a depth camera and how does depth work?

As I have mentioned before, depth cameras measure distance, specifically the distance between the camera and every pixel captured within the lens. The distance of each pixel is then transcribed into a visual representation such as color or value. Over the years, depth images have taken many appearances depending on the company and camera system. Some depth images use grayscale, with brighter values showing objects closer to the camera and darker values signifying objects farther in the distance; each shade of gray is also tied to a specific value, allowing the user to understand visually how far something is from the depth camera. Other systems use color, with warmer versus cooler colors representing depth visually.
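
As a concrete illustration of the grayscale convention described above, here is a small sketch that converts raw distances into an 8-bit image where nearer pixels are brighter; the millimetre range is an assumption, not any specific camera’s spec:

```python
import numpy as np

def depth_to_grayscale(depth_mm: np.ndarray, near: float = 400.0, far: float = 4000.0) -> np.ndarray:
    """Map per-pixel distances (millimetres) to 8-bit gray values.
    Closer objects -> brighter pixels; farther objects -> darker pixels."""
    clipped = np.clip(depth_mm, near, far)
    normalized = (far - clipped) / (far - near)  # 1.0 at 'near', 0.0 at 'far'
    return (normalized * 255).astype(np.uint8)

# A toy 1x3 "depth image": one near, one mid, one far pixel
print(depth_to_grayscale(np.array([[400.0, 2200.0, 4000.0]])))  # [[255 127   0]]
```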

How is the distance typically measured on an average depth camera?

Basically, most depth cameras work the same way your eyes create depth, through “stereoscopic vision.” For stereoscopic vision to work you need two cameras (or two eyes). In the top-down diagram pictured above, the cameras are the two large yellow circles, and the space between them is called the interocular (IN-ter-ocular) distance. This distance never changes; however, it needs to be precise, because if the interocular distance is too close or too far apart the effect won’t work. On the diagram, the dotted lines show the cameras both looking at the red circle. The point at which both camera sight lines cross is called the zero parallax plane, and on this plane all objects are in focus, which means every object in front of or behind the zero parallax plane is out of focus. Everyone at home can try this: hold your index finger a foot away from your face and look at your finger, and everything in your view except your finger becomes out of focus; then, with your other hand, slide it left and right across your imaginary zero parallax plane while your eyes stay focused on your finger, and you should notice your other hand is also in focus.

There are also different kinds of stereo setups; another common type is parallel. On the diagram, the two parallel solid lines coming from the yellow circles point straight out. Parallel means these lines will never meet, and it also means everything will stay in focus. If you look out your window toward the horizon, you will see everything is in focus: the trees, buildings, cars, people, the sky. For those of us who don’t have windows, stereoscopic and parallel vision can also be recreated and simulated inside different 3D animation software like Maya or Blender. For those who understand 3D animation cameras and rendering: if you render an animation with parallel vision and place the rendered video into Nuke (a very expensive and amazing node-and-wires-based effects and video editing software), you can add the zero parallax plane in post. This is also the system Pixar uses in all of its animated feature films.
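
The underlying geometry boils down to one relation: for two parallel cameras, the depth of a point is proportional to the baseline (the interocular distance) and the focal length, and inversely proportional to the disparity between the two images. Here is a minimal sketch of that standard formula; the symbols and units are my own, not from the diagram:

```python
def depth_from_disparity(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for a pair of parallel pinhole cameras.

    focal_length_px : focal length expressed in pixels
    baseline_m      : distance between the two cameras (the 'interocular distance')
    disparity_px    : how far the same point shifts between the left and right image
    """
    if disparity_px <= 0:
        raise ValueError("zero disparity means the point is effectively at infinity")
    return focal_length_px * baseline_m / disparity_px

# Example: 700 px focal length, 6.5 cm baseline, 35 px disparity -> about 1.3 m away
print(round(depth_from_disparity(700, 0.065, 35), 2))
```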

Prototyping

After understanding a little more about how depth cameras work, I decided to try to conceive a project using an Orbbec Astra (depth camera), a pico projector (small handheld projector), and Isadora (projection mapping software). Using the depth camera, I wanted to prototype a dynamic projection mapping system in which the object being projected onto would move in space, causing the projection to change or evolve in some way. I ended up using a set of top-down human brain computed tomography scans (CT scans) as the evolving or “changing” aspect of my system. The CT scans would be projected onto regular printer paper held in front of the projector and depth camera. The depth camera would read the depth at which the paper sits in space; as the piece of paper moves closer to or farther from the depth camera, the CT scan images cycle through. (Above is what the system looked like inside of Isadora, and below is a video showing the CT scans evolving in space in real time as the paper moves back and forth from the depth camera.) Within the system I added color signifiers to tell the user at what depth to hold the paper and when to stop moving it: I used green to tell the user to start “here” and red to tell the user to “stop.” I also added numbers to each CT scan image so the user can identify or reference a specific image.
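
The core of the patch is a mapping from the paper’s measured depth to which CT-scan frame gets projected. Here is a hedged sketch of that mapping; the depth range, frame count, and green/red bands are illustrative stand-ins, not the exact values in my patch:

```python
def scan_index(depth_cm: float, near: float = 40.0, far: float = 120.0, num_scans: int = 30) -> int:
    """Pick which CT-scan image to show based on how far the paper is from the camera."""
    clamped = min(max(depth_cm, near), far)
    fraction = (clamped - near) / (far - near)  # 0.0 at the green 'start' depth, 1.0 at the red 'stop' depth
    return min(int(fraction * num_scans), num_scans - 1)

# Moving the paper from 40 cm out to 120 cm cycles through scans 0 .. 29
for d in (40, 60, 80, 100, 120):
    print(d, "cm ->", scan_index(d))
```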

Conclusion 

The finished prototype works fairly well, and I am very pleased with the fidelity of the Orbbec depth reading. For my system, I could only work within a specific range in front of the projector, because the projected image would go out of focus if I moved the paper too far from or too close to the projector. While I worked with the projector, I found the human body could also be used in place of the piece of paper, with the projected image of the CT scans filling the front of my shirt. The projector could also be aimed at a different wall, with a person interacting with the depth camera alone, causing the CT scans to change as well. With a more refined system, I can imagine this being used in many circumstances: within an interactive medical museum exhibit, or even in a more professional medical setting to explain how CT scans work to child cancer patients. For possible future iterations, I would like to see if I could make the projection better follow the paper; having the projector tilt and scale with the paper would allow the system to become more dynamic and possibly more user-friendly.


Cycle 2 – Stanford

I took the Sketchup file I had been working on and put it in VR. I did this using a program called Sentio VR. After I created an account, I was able to install a plugin for Sketchup that allowed me to export scenes. Once the scenes were exported, I could go to the app on the Oculus Quest and input my account code to view my files.

I also had to find a way to mirror the Quest to my MacBook. I used the process outlined by the link below.

https://arvrjourney.com/cast-directly-from-your-oculus-quest-to-macbook-e22d5ceb792c

This gave me a mirrored image, but the result was not what I was looking for. I did not want to see two circles of image, so after I recorded the video, I cropped it to give a better product.

Screenshot of the video before I cropped it
A screen capture while I walked around the set
Another Scene (The Church)
Another Scene (Memphis)

Cycle 1 – Stanford

My final project is to take a scenic design that I had done in the past and put it in VR so that you can walk around it and see it from both the audience and actor view. I focused on my Sketchup file for the first cycle.

The design is for a show called Violet. It is a musical set in the South in 1964. It is about a woman named Violet who has a huge scar across her face and is traveling by bus to see a TV preacher in hopes that he can heal her.

I started from a base Sketchup file that had the Thurber already created.

Full Stage View
Close Up of the Truss
Additional Pieces

PP2 – Stanford

For this assignment, I used the Makey Makey to count the points of a card game. I created 3 buttons for each team with the labels 5, 10, and 20. These are the point values of the cards in the game. My goal was to have Isadora count the points for each team and when one reached the winning amount, a light would light up in the winning team’s color.

The buttons hooked up to the Makey Makey

In addition to the Makey Makey, I used an ENTTEC Pro. This allowed me to send a signal to an LED fixture from my computer.

Colorsource PAR

Each of the buttons was assigned a different letter on the Makey Makey. My patch used each letter to count by the value of the button it was associated with. It then added each team’s values together, and a comparator triggered a cue to turn on the light fixture in either red or blue when a team reached 300 points or more.
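
The counting logic itself is simple enough to sketch outside of Isadora. Here is a rough equivalent of the patch’s behavior in code; the button letters are hypothetical, while the card values and the 300-point threshold follow the description above:

```python
BUTTON_VALUES = {"a": 5, "s": 10, "d": 20,   # hypothetical Makey Makey letters for team 1
                 "j": 5, "k": 10, "l": 20}   # hypothetical letters for team 2
TEAM_OF = {"a": 1, "s": 1, "d": 1, "j": 2, "k": 2, "l": 2}
WINNING_SCORE = 300

scores = {1: 0, 2: 0}

def press(letter: str):
    """Add the button's card value to its team and light the fixture on a win."""
    team = TEAM_OF[letter]
    scores[team] += BUTTON_VALUES[letter]
    if scores[team] >= WINNING_SCORE:
        color = "red" if team == 1 else "blue"
        print(f"Team {team} wins - light fixture turns {color}")

for _ in range(15):
    press("d")  # fifteen 20-point cards = 300 points -> team 1 wins
```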

My Patch
The cue triggered by hitting 300 points

Tara Burns – Cycle Two

The 1st trigger corresponds with the audience-right panel, the 2nd trigger with the audience-left panel, the 3rd trigger with the panel 2nd from audience left, and the 4th trigger with the panel 2nd from audience right.

Goals
– Using Cycle 1’s setup and extending it into Isadora for manipulation
– Testing and understanding the connection between Isadora and OBS Virtual Camera
– Testing prerecorded video of paintings and live streamed Tilt Brush paintings in Isadora
– Moving to a larger space for position sensitive tracking through Isadora Open NDI Tracker
– Projection mapping

Challenges and Solutions
– macOS Catalina doesn’t function with Syphon, so I had to use OBS Virtual Camera in Isadora
– Not having a live body to test motion tracking and pinpointing specific locations required going back and forth. I wouldn’t be able to do this in a really large space, but for my smaller space I put my Isadora patch on the projection and showed half the product and half the patch, so I could see what was firing and what the projection looked like at the same time.
– Understanding the difference between the blob and skeleton trackers and what exactly I was going for took a while. I spent a lot of time on the blob tracker and then finally realized the skeleton tracker was probably what I actually needed in the end.
– I realized the headset will need more light to track if I’m to use it live.

Looking Ahead
The final product of this goal wasn’t finished for my presentation, but I finished it this week, which brought up some important choices I need to make. In my small space, if I’m standing in front of the projection it is very hard to see whether I’m affecting it because of my shadow, so either the projection needs to be large enough to see over my head or my costume needs to be able to show the projection.

I am also considering a reveal, where the feed is mixed up (pre-recorded or live or a mix; I haven’t decided yet) and as I traverse from left to right the paintings begin to show up in the right order (possibly right to left, the reverse of what I’m doing). Instead of audience participation, I’m thinking of having this performer-triggered: my own position tracking and triggering the shift in content perhaps 3–4 times, and then it stays in the live feed. Once I get to the other side, it is a full reveal of the live feed coming from my headset. This will be tricky, as the headset needs light to work (more than the projection provides), which is a reason I switched to using movies in my testing: I didn’t have the proper lights to light me so that the headset could track and you could still see the projection. I was also considering triggering the height of the mapped projection panel (like Kenny’s animation from class) and revealing what is behind it that way, although I do want to keep the fade in and out.

I used the same setup from Cycle 1 to wirelessly connect the headset to the computer and send it to OBS. I created these reminders in my patch to make sure I did all the steps necessary to make things work. Note: the Oculus Quest transmits a 1440×1600 resolution per “eye.” To be able to transmit that resolution to Isadora, make sure “Start Live Capture” in OBS is turned off, change to the appropriate resolution, then “Start Live Capture,” and Isadora should receive this information.
The Video In Watcher caught the Virtual Camera in Isadora from the live capture in OBS; I then projection mapped the four panels of projections and their alternate panels to be triggered. Knowing this works is a big step, and now I need to decide if it is necessary.
Later, I projection mapped movies I downloaded from the Oculus Quest, so I didn’t have to have the headset streaming a live feed of VR footage while testing.
I began using “Eyes++” and “Blob Decoder” to trigger the panels but wasn’t able to differentiate between blobs/areas of space.
This is what happens (although interesting) using the Blob Decoder. It was very difficult to achieve a depth that wasn’t being triggered by extraneous elements, even using a threshold. Perhaps using ChromaKey might have helped, but essentially I want the locations to correspond with specific panels, and the Blob Decoder seemed too carefree in that regard.
I switched to using the “Skeleton Decoder” and used “Calc Angle 3D” (see Mark Coniglio’s Guru Session #13) to calculate the specific area where I wanted to trigger the fade between movies. Mark explains it better, but essentially you stand (or ideally have someone else stand) in the space where you want the trigger, watch the numbers in x2, y2, z2, and catch the median numbers they send while you are standing in the space. Then put those numbers into x1, y1, z1. Send the “dist” output to the value input of a “Limit Scale Value” and determine the range where it can catch the number. In Mark’s tutorial he achieves ‘0’; however, I couldn’t do that, so I made a larger range in my Limit Scale Value actor, and that seems to work. I hypothesize that it might be the projection interfering with the depth camera, but I’m not sure. More testing is needed here; perhaps I can reduce my depth range in the OpenNDI Tracker.
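
Stripped of the actor names, the trigger is just “how far is the tracked joint from a stored reference point, and is that distance inside a tolerance band.” Here is a minimal sketch of that check; the reference point and tolerance are placeholders for the numbers read off the skeleton tracker:

```python
import math

REFERENCE = (0.42, 1.10, 2.35)  # placeholder x1, y1, z1 captured while standing on the spot
TOLERANCE = 0.25                # placeholder range used instead of an exact 0

def in_trigger_zone(x: float, y: float, z: float) -> bool:
    """True when the tracked joint is within TOLERANCE of the stored spot."""
    dist = math.dist((x, y, z), REFERENCE)  # same idea as the Calc Angle 3D 'dist' output
    return dist <= TOLERANCE

print(in_trigger_zone(0.45, 1.05, 2.30))  # close to the spot -> True, fade the panel
print(in_trigger_zone(1.60, 1.05, 2.30))  # far away -> False
```
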
I did this four times to trigger each panel. Note: they are all going into the same “Skeleton 1” id on the Open NDI Tracker because I only had one body to test with. So, choreographically, I would have to change the patch if I want more people in the work, by connecting each panel to a different skeleton id.
This is how I captured the numbers by myself: I was able to watch the screen, remember the numbers, and then input them into the actor.