
Hypercinema Assignment Week 4 - Runway ML

*Using Runway ML and a model (or models), augment an existing piece of media through the process of addition, subtraction, or generation. Think about how this could apply to the framework of storytelling.


Concept 1 - Generate an Instagram post caption

I always struggle to come up with interesting captions for my Instagram posts. What if I could ask the machine to do it for me, simply based on the photo I upload?


The DenseCap model detects visual concepts (objects, object parts, and the interactions between them) in images and labels each one with a short descriptive phrase.
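
(A side note for anyone who wants to script this instead of clicking through the app: here's a minimal Python sketch of querying a model running locally in Runway over HTTP. The port, route, and JSON field names are assumptions; Runway's Network tab shows the exact values your workspace exposes.)

import base64
import requests

# Sketch: query a model running locally in Runway over its HTTP interface.
# The port, route, and field names below are assumptions; check Runway's
# Network tab for the exact values in your workspace.
with open("banana_and_pepper.jpg", "rb") as f:  # hypothetical filename
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    "http://localhost:8000/query",  # assumed local Runway endpoint
    json={"image": "data:image/jpeg;base64," + image_b64},
)
response.raise_for_status()

# DenseCap outputs a set of short region captions; print each phrase.
for phrase in response.json().get("captions", []):  # assumed field name
    print(phrase)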

Test image:

I chose a photo of a banana and a banana pepper laid side by side. I was curious to see if the model could differentiate between the two.

Input Image / Output Image: [photos not reproduced here; the second shows DenseCap's labeled regions overlaid on the first]

It turned out the DenseCap model was able to see the banana but mis-detected the banana pepper as another banana. The color descriptions are pretty far off, though it did detect the pepper's stem and get its color right. It also tried to make sense of the background, detecting a plate (actually an upside-down lid) and a shirt (actually a tablecloth). Overall, judging by the output image, I'd still call DenseCap really smart.

Here's the image description text I copied from the output data:

{

a blue and white banana

a blue banana

the blue and white stripes

the toilet is silver

the kite is black

a blue and green banana

the stem is green

the blue and white banana

the plate is white

the blue shirt is blue

}

I have no idea where the toilet and the kite came from; I don't see them among the descriptive phrases on the output image, either.

Now, I don't think these phrases count as an interesting Instagram caption on their own. So I went looking for another model that could turn this generated text into one.


Like the title says, PPLM (Plug and Play Language Models) lets users flexibly plug one or more tiny attribute models, representing the desired steering objective, into a large, unconditional language model (LM).
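
To get a feel for what "plugging in an attribute model" means, here is a toy Python sketch of the core trick, not the real PPLM code: a tiny bag-of-words attribute model scores how much probability the LM puts on topic words, and the gradient of that score nudges the hidden activations before the next token is sampled. The vocabulary and weights below are random stand-ins.

import torch

torch.manual_seed(0)

# Tiny stand-ins for a pretrained LM: a hidden state h and an
# output head W that maps h to vocabulary logits.
vocab = ["banana", "blue", "white", "law", "court", "legal", "the", "is"]
V, H = len(vocab), 16
W = torch.randn(V, H)
h = torch.randn(H, requires_grad=True)

# Bag-of-words attribute model for a "Legal" topic: the log-probability
# mass the LM puts on topic words. PPLM ascends this w.r.t. the hidden
# activations, then lets the LM keep generating from the perturbed state.
topic_ids = torch.tensor([vocab.index(w) for w in ("law", "court", "legal")])

for _ in range(10):
    log_probs = torch.log_softmax(W @ h, dim=-1)
    attr_score = log_probs[topic_ids].sum()  # log p(topic | h)
    attr_score.backward()
    with torch.no_grad():
        h += 0.5 * h.grad  # nudge activations toward the topic
        h.grad.zero_()

# After a few steps, topic words dominate the next-token distribution.
probs = torch.softmax(W @ h, dim=-1)
for i in probs.argsort(descending=True)[:4]:
    print(vocab[i], round(probs[i].item(), 3))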

I tried using the entirety of the DenseCap output text as the input (prompt) to PPLM, but it failed to generate anything. I then shortened the prompt to just one sentence. Some of the provided control attributes didn't produce the results I expected, but here are some outputs I would probably use as the caption for my banana/banana pepper photo on Instagram. LOL.


Output 1 (control attribute: Legal)

a blue and white banana.

I have always wanted to be a vegan. My mother taught me a lot about it. I am a huge proponent of vegan, but I don't want anything more. The most important thing is to be a vegetarian. The more I eat the healthier I become.

If you have any questions or questions, please let me know in the comments section below or by e-mail


Output 2 (control attribute: Military, though I think it should be Politics. Either the option interface or the code is buggy.)

the toilet is silver-white.

I'm not sure the bathroom is white enough.The U.S. government has been using a toilet that has a white color and no white markings.I'm not sure the bathroom is white enough.The U.S. government has been using a toilet that has a white color and no white markings.

I'm not sure the bathroom is white enough.


Output 3 (control attribute: Religion, though I think it should be Science.)

the stem is green.

I'm not sure the green can't really be removed by a chemical.

I think that is a pretty big problem, but it's pretty important if you are looking for a way to prevent the growth and growth of a plant. I'd say if you are looking for a way to prevent the growth and growth of a plant, you might want to take a look at the various plant types.

So I'm not sure what you might want in this post.


Concept 2 - Generate a film script

This idea came from Sunspring (a 2016 experimental science-fiction short film written entirely by an artificial-intelligence bot using neural networks). I watched a clip of it and thought it was extremely awkward in an impressive way. I'm curious how the creators decided on the data set used to train the machine. And I found a model that could help me write something interesting.


This model (GPT-2) demonstrates that language models begin to learn tasks, such as question answering, machine translation, reading comprehension, and summarization, without any explicit supervision when trained on a new dataset of millions of webpages called WebText.

I tried inputting one sentence containing characters and action: "A three-legged dog, a bold man with a scar, and a bat in suit running down the street". I was quite happy with the result because it did capture the thrilling, absurd sentiment I was going for.
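
(For anyone who wants to reproduce this outside Runway: the same model is available through Hugging Face's transformers library. A minimal sketch, assuming transformers and torch are installed:)

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = ("A three-legged dog, a bold man with a scar, "
          "and a bat in suit running down the street.")
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Sampling (rather than greedy decoding) is what gives the script its
# unpredictable, Sunspring-like quality.
samples = model.generate(
    input_ids,
    max_length=200,
    do_sample=True,
    top_k=40,
    temperature=0.9,
    num_return_sequences=3,
    pad_token_id=tokenizer.eos_token_id,
)
for sample in samples:
    print(tokenizer.decode(sample, skip_special_tokens=True))
    print("---")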

Here's a snippet from one of the 50 output samples:

A three-legged dog, a bold man with a scar, and a bat in suit running down the street. Someone has a laser rifle.

HARDY

Think you can spot him?

JASON

That's no chance. A man doesn't get caught on a perch and run away with his life, Cole.

HARDY

I saw him running down the street a coupla times here -He was close. He's a coward.

BLINDMAN

I saw him too.

HARDY

But that dog.

JASON

The dog said it wasn't.

Cole shifts in his chair and thinks. They continue walking.

HARDY

Maybe the man's still chasing him.

Off the street, outside, a fox runs with an umbrella and no mask. The guy gives chase. A man jumps from a helicopter, and a cat gets shot in the tail.

HOLD ON:

A crate on a crate a half block away.

HOLD ON:

The FBI recruits in all their search and rescue gear are watching.

For a second they all look at each other.

They lean over the railing and look under the counter.

BLINDMAN

I'd rather lose my job than find him here.

JASON

We better get him.

They just keep walking.

At last, the Dog decides to give up. He stands in the cold.


It sort of reminds me of the storytelling exercise we did in our first Hypercinema class, where each person added a sentence to the story and passed it on to the next person. It feels very unpredictable yet somehow sensible.



Concept 3 - Generate a GM animal

I believe in slowing down the climate crisis and ending animal cruelty effectively by eating plant-based food. I wanted to try generating imaginary (genetically modified) animals that have traits of livestock AND pets. They should look completely strange and new but still trigger empathy the way cats and dogs do. These kinds of GM creatures could become characters in narrative art that makes people think twice about eating animals, in a less aggressive way.

I tried using a face-swap model on images of a beef cow and a dog, but it didn't work; apparently it only works on human faces. So I tried this instead:


BigBiGAN builds upon the BigGAN model, extending it to representation learning by adding an encoder and modifying the discriminator. It achieves state-of-the-art results in unsupervised representation learning on ImageNet, as well as in unconditional image generation.


Input Images / Output Images: [photos not reproduced here; the cattle and dog inputs alongside BigBiGAN's generated creatures]

I was amazed that the outputs are all very dog-like; it might have something to do with the training data. These creatures all look pretty outlandish, though. They will need lots of tweaking to achieve an adorable, empathy-evoking look, but I think we're on the right track.
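
One direction for that tweaking: since BigBiGAN pairs an encoder with a generator, the cattle and dog images could be blended in latent space instead of just reconstructed. Here's a Python sketch of the idea; encode and generate are self-contained stand-ins for the real model (DeepMind's TF Hub release exposes both), so only the blending logic is the point.

import numpy as np

rng = np.random.default_rng(0)
Z_DIM = 120  # BigGAN/BigBiGAN latent size at 128x128 (an assumption here)

# Stand-ins for BigBiGAN's encoder E and generator G so this sketch runs
# on its own; in practice these would wrap the real model.
def encode(image_path):
    # image -> latent code z (stand-in: a random code per image)
    return rng.standard_normal(Z_DIM)

def generate(z):
    # latent code z -> image (stand-in: a dummy 64x64 RGB array)
    return np.tanh(z @ rng.standard_normal((Z_DIM, 64 * 64 * 3))).reshape(64, 64, 3)

z_cow = encode("cattle.jpg")  # hypothetical filenames
z_dog = encode("dog.jpg")

# The "GM animal" idea as latent-space blending: interpolate between the
# two codes and decode each mixture into a hybrid creature.
for t in (0.25, 0.5, 0.75):
    hybrid = generate((1 - t) * z_cow + t * z_dog)
    print(f"t={t}: hybrid image shape {hybrid.shape}")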

