Cycle One: The Movement-Based Sound Explorer

As I began thinking about what I wanted to make my cycles about, I found myself gravitating towards a question that had interested me when I began work on my senior project at the start of the year: How might technology allow for the creation of new modes of musical interface, where the relationship between audience and performer is almost entirely dissolved?

My primary resource was Max/MSP, as I know it best of all the computer music environments, and I find it very useful for developing new ways of making music.

Going into this cycle I knew one of the central resources that could help me answer this question was Google MediaPipe, a real-time motion-capture tool that uses webcam input instead of dedicated hardware/software requiring mo-cap suits. This allows for systems anyone can easily interact with, even without knowing how the system works or what each mo-cap landmark is controlling. I handled this part of my patch in TouchDesigner, as the Max integration of MediaPipe has some difficulties I don’t have time to get into here.

My main goals for this cycle were to create an interface for interacting with sound that was fun and interesting, but that also left room for potential emergent behavior when left in the hands of different users.

My TouchDesigner network takes the full MediaPipe data, selects only the landmarks I wish to use for musical control, and sends them over OSC into Max. I also send the black image with dots into Max over NDI as a monitor for movement. As this was just the first cycle, I only used the right wrist for control; I plan on implementing control for both limbs in future cycles.
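Conceptually, the landmark-selection step boils down to something like the Python sketch below. This is illustrative only: my actual version is a TouchDesigner network, and the horizontal mirroring, the rounding, and the `/wrist/right` OSC address are assumptions, not what the patch literally does.

```python
# Sketch of the landmark-selection step (illustrative; the real version
# is a TouchDesigner network, and actually sending the message would
# use a library like python-osc).

RIGHT_WRIST = 16  # MediaPipe pose landmark index for the right wrist

def select_wrist(landmarks):
    """Pick the right-wrist landmark out of the full 33-landmark list.
    `landmarks` is a list of (x, y, z) tuples in normalized 0..1
    coordinates. The x axis is mirrored (an assumption) so that moving
    right on camera moves right on screen."""
    x, y, _z = landmarks[RIGHT_WRIST]
    return (1.0 - x, y)

def to_osc_message(xy):
    """Build the (address, args) pair that would go out over OSC."""
    x, y = xy
    return ("/wrist/right", [round(x, 4), round(y, 4)])

# A dummy 33-landmark list with the right wrist at (0.25, 0.5, 0.0):
dummy = [(0.0, 0.0, 0.0)] * 33
dummy[RIGHT_WRIST] = (0.25, 0.5, 0.0)
print(to_osc_message(select_wrist(dummy)))  # ('/wrist/right', [0.75, 0.5])
```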

The second major piece of software I used was the Fluid Corpus Manipulation (FluCoMa) toolkit for Max/MSP (also available in SuperCollider and Pure Data). This toolkit uses machine learning tools to analyze, decompose, manipulate, and play back a large collection (or corpus) of samples. I initially chose this software because one of its modes of playback is a 2D plotter that can map two different aspects of the sample analysis onto an X and Y axis. I thought this would be a perfect interface for MediaPipe control, as the base 2D plotter uses mouse input, which I found detrimental to using it as an “instrument.”

The 2D plotter as it appears in the UI of my patch; the black square is where the MediaPipe skeleton is received over NDI.

I had initially wanted to expand the idea of the 2D plotter to a 3D one, as I felt being able to interact with the patch in a 3D space would be much more natural. However, I found expanding the logic to work in 3 dimensions was a much more difficult task than I’d thought, so I decided to stick with the 2D plotter for this cycle.

An overview of my Max code. The folder of media the user wishes to analyze is dumped into the fluid.audiofilesin object, which concatenates the files into one continuous buffer. The OSC MediaPipe data comes in on the left and is routed through the “query” subpatch, which finds the points in the plotter nearest to the scaled MediaPipe data and sends them to “playback,” where a play~ object is told what part of the combined buffer to play. By doing this, movement within the camera frame becomes an analogue for movement in the plotter.
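The query-and-playback logic can be sketched in plain Python. This is a brute-force stand-in for the fluid.kdtree~ lookup the “query” subpatch actually uses, and all the point and slice values are made up for illustration:

```python
import math

def nearest_point(points, query):
    """Brute-force stand-in for the fluid.kdtree~ lookup in the
    'query' subpatch: return the index of the plotter point closest
    to the scaled (x, y) wrist position."""
    return min(range(len(points)),
               key=lambda i: math.dist(points[i], query))

# Each plotter point corresponds to a (start, length) slice of the
# concatenated buffer; 'playback' tells play~ to read that region.
# All values here are invented for illustration.
plotter = [(0.1, 0.2), (0.8, 0.9), (0.5, 0.5)]
slices  = [(0, 44100), (44100, 22050), (66150, 44100)]

idx = nearest_point(plotter, (0.45, 0.55))
print(slices[idx])  # (66150, 44100) -- the region play~ would read
```

A real KD-tree only matters once the corpus has thousands of slices; for a handful of points, the brute-force search behaves identically.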
Inside the “playback” subpatch.
The analysis portion of my Max patch. It runs each sample (cut up by the “slicing” subpatch in the earlier photo) through Mel-frequency cepstral coefficient (MFCC) analysis, a short-term feature-extraction technique. Through fluid.bufstats~ and fluid.bufflatten~, the buffers storing the MFCC features are concatenated into a single channel. The following subpatches, “normalization scaling,” “fit kdtree,” and “dump normalized data to points,” reduce the MFCC data to two distinct dimensions, normalize the values, and then send them to the plotter .js object.

Results

I thought I was mostly successful with the goals I set out to accomplish. Everyone wanted to try out the patch, which I took as a testament to the “fun” and “interesting” aspects of it. The controls were also picked up quickly, which was a goal of mine, as I’m interested in systems that audiences can interact with regardless of whether they’re conscious of the mechanics of that interaction. I was most interested to see how different people had their own unique ways of interacting with it. Chad, for example, was really trying to make something rhythmic and intelligible out of it, while others went all over the place or hunted for specific sounds.

Some missed opportunities I want to expand on in future cycles are the use of the Z dimension in controlling sample playback and the use of multiple limbs. As you can see in the video, users were somewhat restricted by the Z direction not doing anything, as well as by the fact that only the right hand could trigger sounds. By expanding this idea to three dimensions instead of two, and allowing the use of multiple limbs, I think people will have more freedom in how they interact with the corpus of sounds.

This was the first real project I’ve done with FluCoMa, so I learned a ton about its mechanisms, particularly the storage of non-audio data in buffers. This concept is used much more in environments like SuperCollider or Pure Data, as Max has other objects for storing that kind of information. However, because of the way the machine learning tools in FluCoMa work, all of the information they may need has to be stored in RAM. This was also my first project sending OSC data between different apps on my computer, which had a bit of a learning curve when I discovered my OSC data was arriving as strings instead of floating-point numbers. This didn’t create any real difficulty, as the conversion took no time, but it made me aware of an important aspect of using OSC (particularly with Max, as certain objects process strings/floats/integers differently).
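The string-to-float conversion itself is trivial; in Python it might look like the sketch below (my actual conversion happened inside Max, so this is just to show the idea):

```python
def coerce_osc_args(args):
    """Convert OSC arguments that arrived as strings into floats
    where possible, leaving genuinely non-numeric strings alone.
    (In the actual patch this conversion happened inside Max.)"""
    out = []
    for a in args:
        if isinstance(a, str):
            try:
                a = float(a)
            except ValueError:
                pass  # not numeric; keep the string as-is
        out.append(a)
    return out

print(coerce_osc_args(["0.37", "1", "wrist"]))  # [0.37, 1.0, 'wrist']
```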


Cycle 1 – Solo (but at this moment) Paper Plate DJ

What is up?
Here is my cycle one post 🙂

I started this cycle with the hopes and dreams of creating a live performance experience that challenges the user to piece together a story from a song with lyrics they can’t understand. This branches from two research interests/questions.
1. How can interactive technology facilitate meaning-making and engagement with works of art?

2. How can designers best utilize interactive/immersive experiences to invoke a sense of power within their participants?

I tapped into my own lived experiences to explore these questions; I feel most powerful when moving to music and playing rhythm games. So I set out to recreate that feeling of being pretty good at a rhythm game, as demonstrated in the brainstorm documents below 🙂

I chose a song that is entirely in Japanese, knowing that no one in my class knows Japanese. I sectioned off a translation of the lyrics and attached the sections to 4 different inputs that align with the imagery described, using the music video as a reference. I also planned to buy buckets. The user drums on various parts of the bucket, which are labeled with lines of lyrics from the song, keeping the tempo in the area of the drum where they believe the lyrics are being sung.

Circles represent the drums (tap to interact); the stars represent footage that will distort the current imagery based on the emotional output of the lead singer (hold to interact).
After first declaring buckets as my control panel, I was advised to think more deeply about the device I wanted to use, so I drew up schematics for it before Spring Break. The user moves the joystick/drum surface to various sides based on their guess of what lyrics are being sung. They can hold various parts of the top of the pad to alter the footage based on the emotions they interpret. However, during Spring Break I decided to just go with buckets, as I am not an engineer and don’t have time to work with one on this. The control scheme will be the same, just drumming instead of shifting the position of the surface.

I decided to create the control scheme first – focusing entirely on that before setting up the physical controller.

The whole ding dang thing. There is more content now, but this is close enough.
The tap controls (how each hit cycles through clips)
Distortions (hold controls). The levels are turned up to one, and the comp is set to multiply, so when everything is white you cannot see the footage.
How the video playback works 🙂 Basically, once you tap a lyric space it stays in that space’s footage folder until you tap another.
On the day of presenting, I hastily crafted this plate to control the patch. However, I was only able to program 3 of the keys. For the distortion footage (the stars in the original plan), the user had to hold down the key; ideally, they would just hold down a conductive area of the bucket/controller. I used my Fitbit as the grounding agent.

This test was accompanied by instrumental music. I didn’t want to reveal the song yet, and wanted to focus on the physical reactions to the controls to influence how I construct the controller in cycle 2. Most of the songs were slower-paced, but once a faster one came on, Chad (shown in the video above, though the specific moment wasn’t captured on film) stood up and rapidly switched between visuals, which contrasted with the slow, exploratory manner he had shown earlier when testing out the various interactions. The song I chose is a bit funkier and faster-paced than the music played here, so I think it will add some excitement to the interactions.

I received feedback that having the grounding element be the user holding their thumb to the center of the plate felt more natural than the watch and avoided tangled wires. I also received feedback that the controls felt confined to the desk. I agreed, as I plan to have the controller in the center of the room, but I didn’t have that prepared for this cycle… So on to the next…


Cycle 1: The (bad) Friend

The Score

My idea for this cycle was simple (or at least it seemed so in my head): make an AI-powered interactive experience where the user shares a space with an AI ‘presence’. It lives on a screen, but it’s there for you, and it listens to whatever you have to say – or don’t have to say. The score: a participant enters a space, speaks naturally, and the environment responds to the quality of what they shared through a particle system. No text output, no voice back. Just the space changing around them. The framing I gave participants was: “this is a friend you can talk to.” That framing is what became the main problem.

Resources

  • TouchDesigner for the visual/particle system
  • Python + PyAudio for microphone input
  • OpenAI Whisper for speech-to-text transcription
  • Claude API to interpret the speech and return atmospheric parameters (brightness, movement, weight, density) as JSON
  • OSC to pipe values from Python into TouchDesigner
  • Orbbec depth camera for body tracking (ceiling-mounted, blob detection)
  • Motion Lab
  • A Michael for troubleshooting (1)
The original Score and process diagram

Process and Pivots

I wrote a Python script that takes user input through the microphone, then uses OpenAI Whisper for speech-to-text transcription. It then sends the speech to Claude, which parses it according to the system prompt I gave it into metrics like emotional register, weight, and intensity. The Python script in turn sends these metrics to TouchDesigner through OSC. Inside TouchDesigner, I made Table DATs that store the values of the incoming signals in order to apply them to the visual system (a particle system). The values were supposed to affect the movement and color of the particles.
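The last hop of that pipeline, from Claude’s JSON reply to OSC-ready messages, might look roughly like this in Python. The `/friend/` address scheme and the exact metric names are invented for illustration, not my actual prompt’s output format:

```python
import json

def metrics_to_osc(raw_json):
    """Parse the JSON metrics returned by Claude (brightness, movement,
    weight, density) into per-channel OSC messages for TouchDesigner.
    The /friend/ address scheme is an assumption for illustration."""
    metrics = json.loads(raw_json)
    return [(f"/friend/{name}", float(value))
            for name, value in sorted(metrics.items())]

raw = '{"brightness": 0.7, "movement": 0.2, "weight": 0.9, "density": 0.4}'
for address, value in metrics_to_osc(raw):
    print(address, value)
```

Each (address, value) pair maps onto one row of the Table DATs on the TouchDesigner side.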

The color registers
OSC inputs and the Table DATs
The particle system
The particle system (visuals)
The MediaPipe and Orbbec systems for motion detection

I initially built my system on MediaPipe for body tracking, but when I shifted the system to the Motion Lab, I had revelations. The system worked fine for a laptop, but for it to work in an open space with a big projection screen, it would need a camera directly in front of the participant’s face (and the screen), which sounds horrible for an immersive experience. So I switched to blob detection through the Orbbec ceiling camera. That took a while to get right. It wouldn’t even detect me, and I couldn’t figure out why, so I made the very obvious assumption that it hates me lol. Turns out it needs something to reflect off of, and I was wearing all black.

The original prompt to Claude tried to do emotional analysis: read how the person was feeling and respond to that. At some point I rewrote it to just read the texture and quality of what was shared, not the emotional content. That was actually the most important design decision I made: the difference between “I understand you” and “I am here.” The particle system was also jerking between states and felt mechanical, so I had to apply some smoothing to keep it from acting crazy.
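The smoothing amounts to a one-pole filter on each incoming parameter; a minimal Python sketch (the alpha value is illustrative, not my actual tuning, and in practice this runs per-frame in TouchDesigner):

```python
class Smoother:
    """One-pole smoothing per parameter, so the particle system eases
    between states instead of jerking. alpha near 0 = heavy smoothing,
    near 1 = nearly raw input."""
    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.state = {}

    def update(self, name, value):
        prev = self.state.get(name, value)  # first sample passes through
        cur = prev + self.alpha * (value - prev)
        self.state[name] = cur
        return cur

s = Smoother(alpha=0.5)
print(s.update("brightness", 1.0))  # 1.0 (first sample)
print(s.update("brightness", 0.0))  # 0.5 (halfway toward the target)
print(s.update("brightness", 0.0))  # 0.25
```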

What Worked, What Didn’t, What I Learned

What didn’t work: A LOT. Apart from the framing of the system, I had not realized how much time I needed to do this properly. I had only gotten a limited amount of time in the MOLA, so I was only able to troubleshoot the projection and not run through the whole pipeline. I did not anticipate a lot of things going wrong; the biggest example was the lag. There’s bad, bad latency in the pipeline (mic → Whisper → Claude → OSC → TouchDesigner), long enough that participants got confused. They’d speak, nothing would happen, they’d speak again, then two responses would arrive at once. A few people got genuinely frustrated. The “friend you can talk to” framing made this much worse because it set up an expectation of conversational timing that the system couldn’t meet. Lou said it was a bad, bad friend. Like one of those people who keep looking at their phone when you’re trying to talk to them.

What worked unexpectedly: The observers. People watching someone else use the system felt something – specifically, they felt empathy for the participant who was being poorly served by the AI. That observation became the most interesting research finding of the whole cycle.

What I learned: Time is the biggest resource, and you have to plan according to it instead of trying to force all of your bajillion ideas into the time that you have. Also, framing matters more than you think it does! Had the same system been framed a different way, I would’ve gotten away with it, but since I had framed it a specific way, there were specific expectations.


Pressure Project 2: One for All, All for one

I named my cell One for All, All for One because it is built around the idea that individuals and communities are constantly shaping each other. The cell itself is a constant conversation between the self and the communal archive. It takes a live video feed and layers it over a slideshow of images showing communities and people from different parts of the world.

Then interactive sound enters the picture. A glitch effect driven by audio input levels determines how much the live video overlay fractures: the louder the audio, the more the live layer breaks apart and reveals the slideshow underneath. Alongside this, I built an internal LFO paired with an Edge TOP to create a rhythmic pulse, something I called a “heartbeat.” Even without external input, the system works fine and feels alive.
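In rough Python terms, the two behaviors look something like this. The threshold, ceiling, and rate values are illustrative, not the patch’s actual tuning; the real versions are TouchDesigner operators:

```python
import math

def glitch_amount(audio_level, threshold=0.05, ceiling=0.8):
    """Map an audio input level (0..1) to how much the live layer
    fractures: quiet input leaves the overlay intact, loud input
    reveals the slideshow underneath. Threshold/ceiling are
    illustrative values."""
    if audio_level <= threshold:
        return 0.0
    return min(1.0, (audio_level - threshold) / (ceiling - threshold))

def heartbeat(t, rate_hz=1.0):
    """The internal LFO 'heartbeat' (0..1) that keeps the cell feeling
    alive even with no external input."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * rate_hz * t)

print(glitch_amount(0.02))  # 0.0 -- quiet: overlay stays intact
print(glitch_amount(0.80))  # 1.0 -- loud: fully broken apart
```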

The audio module of the system which controls the switch
The video module, which overlays the video input with the slideshow using the switch. Also shown here is the glitch-effect module, which uses Noise, Displace, Ramp, Texture 3D, and Time TOPs. The Edge TOP is applied to the very final video output received from all of the previous processes.
The signal module which plugs into Edge TOP

The structure is modular and layered, and honestly not that complicated. There is a live video input, and a media player (which controls the slideshow) which plug into a switch. The output of the switch goes into a glitch system, and a pulse system. Each could be replaced without breaking the overall logic. The audio input, live video input, and the signal (LFO) are designed so that they could be overridden by an external network signal as well. The cell has its own system, but it is designed to connect, following the true concept of one for all, all for one.

The External network of the Cell which enables it to communicate with other cells in the network. It includes audio input/output, video input/output, and signal input/output.

Reflection:

When all the cells were assembled, things became unstable. Signals were constantly dropping and connections were dying. For a while, I thought something was wrong with my cell because nothing would show up (it was actually a problem with the input signals I was getting). When I finally got it to work, very interesting emergent behaviors appeared. The glitches danced to different rhythms. Video overlays ended up in very interesting stacked outputs. I had designed my system knowing it had to be plugged into a bigger system, but I did not envision the results I got during testing. What I controlled alone became either amplified or distorted by others. The network did not just combine outputs; it reshaped them. I think where my careful planning fell short was the heartbeat. I had not accounted for the fact that other cells could have signals of different types. Instead of a steady pulse, I got an irregular signal input, which changed the whole heartbeat effect. At first it felt like something had gone wrong. My cell was no longer just reacting to my inputs; it was reacting to everyone. That is exactly what One for All, All for One means. Each cell affects the others. Each signal influences the collective behavior. My cell had a life of its own. In the network, it learned to respond, adapt, and sometimes surrender to the collective.

Project File: pp2_Zarmeen.zip


Pressure Project 2 – The Flipper

Description

The Flipper is a TouchDesigner patch that uses an audio input to create video, and a video input to create audio. When used in a network, this “cell-block” acts independently by creating entirely new audio and video instead of just modifying what it receives. Its modularity lies in its ability to provide other users in the cell-block network with new sources of audio and video that are themselves generated from other audio and video on the network.

Collective Documentation

Pending

Individual Documentation

Overview of my cell-block’s network. This is connected to three inputs and outputs on the outside of the container, which connect to other cell-blocks on the network. While there’s a lot on screen, it breaks down into a few simple sections.

This portion of the network takes in audio from across the patch through the in_audio CHOP. The envelope and math CHOPs slow the stream of data, and the audioparaeq CHOP boosts high frequencies. This is then turned into a spectrogram and sent directly to a chopto TOP.

This portion processes that audio spectrum into a new visual. Starting in the bottom left, I use a series of TOPs to create a flow-like visual, which is then composited with the spectrum. This new visual is colored using a series of ramps and a Lookup TOP. The ramps are cycled through using either an LFO or an input from in_osc over the network. An example of the visuals this produces is below.

Lastly, this portion of the patch processes video received over the network from the in_video TOP (or, in this case, a camera input) into audio. While I didn’t get quite as interesting an audio output as I wanted, I still think I was effective in transforming video to audio. The received video is sent directly to a topto CHOP, which reads RGB values over the X and Y planes of the video. The following objects then reduce the amount of data, and the merge CHOP turns those waves into a stereo audio signal. This wave is given an envelope by the math objects (I attempted to control this with another OSC input but failed to make it work) and is sent out over the network. An example of the audio is included below.
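The video-to-audio conversion can be sketched in Python as a rough analogue of the topto → reduce → merge chain. The channel-to-ear mapping and the thinning step are assumptions for illustration, not what my CHOPs literally compute:

```python
def frame_to_stereo(frame, step=4):
    """Rough analogue of the topto -> reduce -> merge chain: read RGB
    values across the frame, thin the data out, and recenter it around
    zero as a left/right pair. `frame` is rows of (r, g, b) tuples with
    channels in 0..1. The channel-to-ear mapping is an assumption."""
    left, right = [], []
    for row in frame:
        for x in range(0, len(row), step):      # reduce the data
            r, g, b = row[x]
            left.append(r * 2.0 - 1.0)          # map 0..1 -> -1..1
            right.append((g + b) / 2 * 2.0 - 1.0)
    return left, right

frame = [[(1.0, 0.0, 0.5)] * 8, [(0.0, 1.0, 0.5)] * 8]
l, r = frame_to_stereo(frame)
print(l)  # [1.0, 1.0, -1.0, -1.0]
print(r)  # [-0.5, -0.5, 0.5, 0.5]
```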

Reflection

Since I knew I wanted to flip the audio and video signals inside my patch, the independence of the cell-block was semi-inherent the entire time I was working on it. To ensure it was connectable with others, however, the patch needed to be interesting enough while still clearly using audio and video to influence the opposite output, so that it didn’t just seem like I was generating something entirely new.

I made choices about what to include and exclude primarily by trying to figure out what I could accomplish that was reasonably within my ability, but still interesting. For example, I’ve worked with spectrogram imagery in the past, so I knew I would be able to incorporate that the easiest. On the opposite end of that, I attempted to integrate FM synthesis into the audio part of my patch to get some more interesting sounds. However, with my inexperience in TouchDesigner, I found it really difficult to make FM work, so I chose to exclude it.

One thing that surprised me was how, even when cell-blocks didn’t work “perfectly” together, they were still able to have some sort of interaction, including unexplainable ones. I was also a bit surprised by how underutilized the OSC data we were sending was. I know I was personally having difficulty doing something interesting with the OSC signals, but it was interesting that it was a widespread problem. I think this might come from the fact that the other signals we were working with were both very tangible. Since the OSC input was just a number, I think we were a bit less motivated to find an interesting way to use it, as opposed to the audio and video, which we could immediately do interesting things with.

I don’t think we had quite enough time to experiment with combining our cell-blocks in different ways for much emergent behavior to appear. But one thing I enjoyed seeing was how the visuals would layer together through 2, 3, or 4 cell-blocks. I thought all of the cell-blocks were interesting on their own, but the most interesting visuals were created through the combination of several together. This relates to Halprin’s cell-block framework through the idea that we each create our own module that does its own thing, but the most exciting behaviors only emerge once we begin to combine the different cell-blocks and experiment with how they feed into each other.

Download Patch


Pressure Project #2 – Transcendence through Snares

  • Description of my cell-block: 
    • Independently, my cell-block uses the snare of the audio provided to cycle through a set of mouth shapes to simulate lip syncing (albeit not realistic lip syncing). It also takes the video input and, through Ramp and Displace, warps the image based on the “mid” register of the audio. Without outside input, the audio used is “Position Famous” by Frost Children, and the video is a looping timelapse POV of a subway traveling underground. This was to create a sense of motion and exhilaration (the movement of the subway and the displacement) and playfulness (the lip syncing). 
  • Collective documentation: 
    • Video/photos of the assembled system: Admittedly, I forgot to take footage of the showcase. I was a bit more nervous about this project, worried about everything working properly with the other cell-blocks. Once it was my turn, I only focused on presenting my work. I plan to reach out to classmates to see if they recorded footage. 
  • Process reflection:  
    • The cell-block was self-contained but, on the exterior, was connected to incoming TOP and CHOP inputs as well as feeding those inputs back out. So, on its own, the block would play as planned, but once outside audio and video were fed in, they would take on the effects of the previous media. There were some issues with feedback loops when testing this out, but mostly it worked. The lips were a last-minute add and therefore independent… so no matter what, the lips stayed on screen; how they reacted depended on the audio input.  
    • I made the choice to control the level of flashiness and movement with my visuals. It’s easy to fall into producing loud and flashy imagery with programs like TouchDesigner or even After Effects, however, I try to use media responsibly, and I also didn’t want to give myself a headache. I’ve made materials that are hard for photosensitive people to take in, and while some others loved the chaotic visuals, I wasn’t satisfied knowing a group of people wouldn’t be able to watch it (and enjoy it).  
    • I was surprised a lot of people didn’t use audio that contained a lot of snares (or used much audio at all)… I was also surprised that everything worked together for the most part (if you can’t tell, I was nervous). 
    • Everyone’s work offered me something new when combined. I would combine with Luke’s when I wanted the most cohesive combination, I would combine with Zarmeen’s when I wanted to destroy everything (or use her audio), I combined with Chad’s because I wanted to appear on his channels more, and I combined with Curtus when I wanted to see a dragon. 
    • This project was a new way to envision Halprin’s cell-block method, but in a strictly digital realm. The goal was to have every block exist on its own and influence others (multiply the possibilities of the content produced). I think we mostly did that, although networking still feels stressful to me; I at least know how it works (sort of). 
  • Individual documentation:

PressureProject.zip


Pressure Project 1: The Musical Spiral

Description

The Musical Spiral is a self-generating patch that randomly generates shapes at different sizes and positions and spins them in a random direction for a random-length cycle. When these shapes cross a line, they are (supposed to) trigger a random musical note.

Documentation

Before starting to code my patch, I did a quick sketch for my idea of what I generally wanted the patch to do to help me save time later. While I had to change and add a bit outside of this, this essentially became the outline of what my code would look like.

The overview of my patch. Upon entering the scene, the random numbers for the duration of the cycle and the direction of the spin are generated, since they’ll be applied to all of the shapes. When the cycle ends, the “spinning shape” user actor sends a trigger to the Jump++ actor, going to a duplicate scene, which jumps back to the first scene.

Inside my “spinning shape” actor, the final result of my original user-actor sketch. The bottom two-thirds of the screen contains the actors randomizing the attributes of the Shapes actor. The top third deals with spinning the shape clockwise or counter-clockwise (decided by the “flip coin router” user actor) for a cycle of random length, with a random delay from shape to shape.

Inside my “hitbox trigger” user actor. This actor takes each shape (each of which has been sent to its own virtual stage) and looks for when it makes contact with a small white rectangle I sent to every virtual stage in the “hitboxes” user actor. When contact is made, it was supposed to trigger a random sound in the “sound player” actor.

Random selection from 18 short samples of single notes, chromatic from C3–F4.

How I checked if one of the spinning shapes was inside the same area as the hitbox, sending a trigger when they “made contact.”
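The contact check amounts to a standard axis-aligned overlap test; in Python it might look like the sketch below (coordinates invented for illustration, and the actual check lives inside an Isadora user actor):

```python
def overlaps(shape, hitbox):
    """Axis-aligned overlap test -- the logic behind the 'hitbox
    trigger' user actor: fire when a shape's bounding box intersects
    the hitbox. Rectangles are (x, y, width, height); values here
    are invented for illustration."""
    sx, sy, sw, sh = shape
    hx, hy, hw, hh = hitbox
    return (sx < hx + hw and sx + sw > hx and
            sy < hy + hh and sy + sh > hy)

hitbox = (50, 0, 2, 100)  # a thin vertical strip, like the white line
print(overlaps((48, 10, 5, 5), hitbox))  # True: the shape crosses the line
print(overlaps((10, 10, 5, 5), hitbox))  # False: well clear of it
```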

Sound playback user actor.

A sample of how the final version of the patch behaved. The white line (actually smaller than the hitbox) was left on screen as a reference for when the sound was supposed to trigger (though it didn’t work that way due to the high load of the patch).

Reflection

One of the best ways I managed the 5-hour time constraint was to make the sketch of my idea as seen earlier in this post. By working backwards from my initial idea to solve the problem the best I could on paper, I gave myself a framework to easily build off of later when problems or changing ideas arose. It also meant that I had a general idea of all the different parts of the patch I would need to build before I actually started working on it. This also guided what I would include/exclude in the patch.

While my patch didn’t end up working the way I wanted (sounds were supposed to trigger immediately when the shapes crossed the line, unlike what is seen in the above video), I was very surprised that this didn’t “ruin” the experience, and that it even created a more interesting one. With the collision of the shapes and the white line decorrelated from the sounds, the class seemingly became more curious about what was actually going on, especially when the sounds would appear to line up with a collision after all. I was also interested to see the ways people “bootstrapped” meaning onto the patch. For example, Chad noticed that in one of the scenes the shapes were arranged in a question-mark sort of shape, leading him to ask about the “meaning” of the arrangement and properties of the shapes, despite them being entirely random.

During the performance of the patch, I unlocked the three achievements concerning holding the class’s attention for 30 seconds. I did not make someone laugh, or make a noise of some sort, as I think the more “abstract” nature of my patch seemed to focus the room once it started.


Pressure Project#1: Pitch, Please.

Description: Pitch, Please is a voice-activated, self-generating patch where your voice runs the entire experience. The patch unfolds across three interactive sequences, each translating the frequency from audio input into something you can see and play with. No keyboard, no mouse, just whatever sounds you’re willing to make in public.

Reflection

I did not exactly know what I wanted for this project, but I knew I wanted something light, colorful, interactive, and fun. While I believe I got what I intended out of this project, I also did get some nice surprises!

The patch starts super simple. The first sequence is a screen that says SING! That’s it. And the moment someone makes a sound, the system responds. Font size grows and shrinks, and background colors shift depending on frequency. It worked as both onboarding and instruction, and made everyone realize their voice was doing something.
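The frequency-to-visuals mapping can be sketched like this in Python. The ranges and the linear mapping are assumptions for illustration, not my actual tuning:

```python
def pitch_to_params(freq_hz, lo=100.0, hi=1000.0):
    """Map a detected voice frequency to a font scale and a background
    hue. The lo/hi range and linear mapping are illustrative, not the
    patch's actual tuning. Returns (scale in 1..3, hue in 0..1)."""
    # Clamp and normalize the frequency into 0..1
    t = max(0.0, min(1.0, (freq_hz - lo) / (hi - lo)))
    scale = 1.0 + 2.0 * t   # higher pitch -> bigger text
    hue = t                 # higher pitch -> shift around the color wheel
    return scale, hue

print(pitch_to_params(100.0))  # (1.0, 0.0)
print(pitch_to_params(550.0))  # (2.0, 0.5)
```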

The second sequence is a Flappy Bird-esque game where a ball has to dodge hurdles. The environment was pretty simple and bare-bones, with moving hurdles and a color-changing background. You just have to sing a note, and make the ball jump. This is where things got fun. Everyone had gotten comfortable at this point. There was a lot more experimentation, and a lot more freedom.

The final sequence is a soothing black screen, with a trail of rings moving across it like those old screensavers. Again, audio input controls the ring size and color. Honestly, this one was made as an afterthought because three sequences sounded about right in my head. So I was pretty surprised when the majority of the class enjoyed this one the best. There’s just something about the old-school screensaver aesthetic. Hard to beat.

What surprised me most was how social it became. I was alone at home when I made this and didn’t have anyone test it, so it wasn’t really made with collaboration in mind, but collaboration happened anyway. I thought people would interact one at a time. Instead, it turned into a group activity. There was whistling, clapping, and even opera singing. (Michael sang an aria!) At one point people were even teaming up and giving each other instructions on what to do.

When I started this project, I had a very different idea in mind. I couldn’t figure it out, though, and just wasted a couple of hours. I then moved on to this idea of a voice-controlled flappy-duck game and started thinking about executing it in the most minimal way possible (because, again, time). This one took me a while, but I reused the code for the other two sequences and managed to get decent results within the timeframe. There’s something about knowing there is a time limit. It awakens a primal instinct in me that kind of died after the era of formal timed exams in my life ended. In short, I pretty much went into hyperdrive and delivered. I’m sure I would’ve wasted a lot more time on the same project if there were no time limit. I’m glad there was.

That said, could it be more polished? Yes. Was this the best I could do in this timeframe? I don’t know, but it is what it is. If I HAD to work on it further, I’d add a buffer at the start so the stage doesn’t just start playing all of a sudden. I would also smooth out the hypersensitivity of the first sequence, which makes it look very glitchy and headache-inducing. But honestly, with the resources I had, Pitch, Please turned out decent. I mean, I got people to play, loudly, badly, collaboratively, and with zero shame, using nothing but their voices. Which was kind of the whole point.


Pressure Project #1 – A Walk In Nature

Description: “A Walk In Nature” is a self-generating experience that documents two individuals’ time together deep in the woods.

The Meat and Bones (view captions for descriptions):

Photos I took before production (I had no real clue what I was going to do)

I set up a “waiting room” to make sure everything was working properly, since a lot hinged on the audio working. This rotating image of me was made during an attempt to make myself “do a cartwheel.” I haven’t quite figured it out yet.
A screenshot of the intro scene. It’s a forest that distorts at random. The title text reads in Comic Sans “A Walk In Nature”.
How I generated the title: I am notorious for choosing horrible fonts in my projects, so this time I wanted to do it… but on purpose, for healing.
Horrible photo (sorry), but this is the setup that initiates Kaleidoscope CC on the forest background at random. I may have overdone it. This also contains Jump++ to transition scenes.
The setup for the intro voices (I will expand upon this process in the next scene).
A screenshot of the main scene: three deer appear on screen, one has a human face superimposed onto it, and another has eight legs and no head. (Beautiful).
Similar to the Kaleidoscope CC, I used TT Pixellate to make the background feel more 64-bit, as the deer images are pixel art.
The setup to superimpose a live camera feed on a deer.
My group of User Actors mainly contains deer, but one contains the conversations of the disembodied voices.
My setup for the deer actors; the messed-up one is the same, but with Reflector.
The setup for the conversation.
A screenshot of the secret end scene, a failed attempt at hand tracking; the idea was for viewers to get to pet a deer.
My setup trying to simulate hand tracking.

The Reactions:

I am very thankful for Zarmeen’s presence, as I don’t know if I would’ve achieved all the bonus points without her. While I received relatively affirming verbal feedback at the end, without her talent for reacting physically, I would have felt way more awkward showing this messed-up video.

Reflections:

I was actually extremely relieved to have a time limit on the project, as I am very limited on time as a grad student with a GTA position and a part-time job (it’s rough out here). I loved the idea of throwing something at the wall and seeing what sticks. I chose to do the majority of the work in one sitting; figuratively locking myself in a room for five hours and leaving with a thing felt correct. I did note ideas that popped up throughout the week, but I didn’t end up using any of them anyway.
I was far too hung up on the idea of making sure people pay attention; original ideas had the machine barking orders at the viewers to “not look away,” but that felt mean. So I went with the idea of making everyone so uncomfortable that they forget to look away, like how I feel watching Fantastic Planet. In the last hour, I realized that, aside from the robots talking, I needed user interaction to make this feel whole. However, the cartwheel and petting actions didn’t work out, as pictured above. So what if the audience could be the deer?
The last hour was me messing with an app to use my camera as a webcam (Eduroam ruined my dreams there), so I grabbed a webcam from the computer lab on the day of (sorry, Michael). I knew I was going to choose one lucky viewer to hold the camera; choosing Alex was improvised, as I just thought he would be the most excited to hold it. I was pleasantly surprised that there were expressions of joy while watching, because when I showed my partner, she was scared and mad at me. I am glad my stupid sense of humor worked out. 🙂


AI EXPERT AUDIT – DANDADAN

I chose the anime DanDaDan as my topic. I believe I am an expert in a lot of anime/manga-related topics because I have been reading manga and watching anime for more than a decade now. I love DanDaDan especially because it’s one of the few recent series that’s a little different in a world of oversaturated genres like leveling-up games. DanDaDan is a breath of fresh air: super weird, fun, and filled with all sorts of absurdity. So, in order to train NotebookLM on this topic, I used some YouTube videos. The videos focused on the storyline, major arcs, characters, and why it is such a hit.

1. Accuracy Check

I wasn’t so surprised that it got the gist of the story correct. I did give it sources where the YouTubers summarized the whole storyline and talked about its characters, arcs, and resolutions. So it wasn’t a bad generic overview; I would even say it was good for a summary. It’s only when you’ve been thoroughly into a certain subject area that you start understanding its nuances and tiny details. As for what it got wrong, it didn’t say anything outright absurd. It’s just that it sometimes mispronounced some names. With the names being Japanese, I am not surprised that they might be mispronounced, but the AI used a range of mispronunciations for the same name.

One of the voices in the podcast was too hung up on making the story into what it is not. Sure, it was justified at some points, but it insisted that the real ideas behind this absurd adventure-comedy are deeper themes like teenage loneliness, and that it’s actually a romance story, which it’s not (it’s a blend of sci-fi and horror). Sure, there are sub-themes, as in all anime, but romance isn’t the main theme. The other voice sometimes agreed with this idea. The podcast was not focused enough on just keeping it fun and light, which is what DanDaDan really is.

2. Usefulness for Learning

If I were listening to this topic for the first time, this podcast wouldn’t be a bad starter. Like I mentioned earlier, it gave a pretty decent summary of the whole plot. It definitely gets you started if you need a quick explanation of a subject area. I found the mind map to be pretty decent too; it was a good overview of the characters and the arcs. The infographic, on the other hand… so bad. The design is super cringe, and again, a lot of emphasis is on the romance and how it drives the action, which I disagree with.

3. The Aesthetic of AI

Overall, the conversation was SO very cringe, and it was difficult to get used to at first. I used the debate mode, and they were talking so intensely about a topic that’s nowhere near as serious as the AI made it out to be. I had to just stop and remind myself it’s just a weird, fun anime they’re talking about. AI has this tendency to make everything sound intense, I guess.

4. Trust & Limitations

I would recommend AI to someone who wants a quick summary or overview of a topic; that’s what AI is good at. What I wouldn’t recommend is dwelling on the details the AI talks about. If anyone wants details or wants to form an opinion about a topic, they should look into it themselves.

Link to the podcast:

https://notebooklm.google.com/notebook/e5c722e5-dd21-4dc4-ae39-a7f22076b7d8?artifactId=d912ed44-154e-422e-aa93-fc9307c9a2f2

AI-Generated Visuals:

Sources:
https://youtu.be/8XdTF5tnMVU?list=TLGG7J2IoA7cY1QwNTAyMjAyNg