Currently, AI-based images are often generated from noise or from a reference image, guided by a phrase (prompt) that describes what should be visible in the image.
Sometimes the positioning of objects in such AI videos seems a bit arbitrary. Beautiful as that may be (and it may be reminiscent of works by M. C. Escher), I wonder whether anybody has tried AI dreaming not with a single seed image, but combined with a semantic segmentation/labeling video tracker that runs before the actual AI processing:
I currently have no time to go deeper into this topic, but I think the following algorithm could work:
- Segment and label the video stream images.
- Find the pixel areas corresponding to certain objects or object parts in your images:
  - for example people, dogs, cars and parts thereof: heads, feet, tires, perhaps even eyes, noses …
- Possibly exchange or change/refine the labels of your tracked areas to your liking:
  - add moods etc.
- Let the AI create new content for (some of) the found objects (a sketch follows after this list):
  - using the new labels as prompts and taking into account
    - the original image,
    - previously created AI images,
    - the surrounding pixel areas,
    - object velocities/optical flow.
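A minimal per-frame sketch of the steps above, assuming a pretrained Mask R-CNN from torchvision for the segmentation/labeling and a Stable Diffusion inpainting pipeline from Hugging Face diffusers for the generation step. The model names, COCO class ids, prompt texts, frame size, and score threshold are illustrative assumptions, not part of the proposal:

```python
import torch
import torchvision
from torchvision.transforms.functional import to_pil_image, to_tensor
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

# Steps 1-2: a pretrained Mask R-CNN finds pixel areas for COCO classes
# (person, dog, car, ...) and labels them.
segmenter = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
segmenter.eval()

# Step 3: refined labels plus a mood, keyed by COCO class id
# (1 = person, 18 = dog); purely illustrative choices.
PROMPT_MAP = {
    1: "a smiling person, dreamy watercolor style",
    18: "a friendly dog, dreamy watercolor style",
}

# Step 4: an inpainting model regenerates only the masked region, so the
# surrounding pixel areas automatically condition the new content.
inpainter = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting"
).to("cuda")

def dream_frame(frame: Image.Image) -> Image.Image:
    """Redream the tracked objects of a single video frame."""
    frame = frame.convert("RGB").resize((512, 512))  # keep image/mask sizes consistent
    with torch.no_grad():
        pred = segmenter([to_tensor(frame)])[0]
    out = frame
    for mask, label, score in zip(pred["masks"], pred["labels"], pred["scores"]):
        prompt = PROMPT_MAP.get(int(label))
        if prompt is None or float(score) < 0.8:
            continue  # only redream objects we explicitly relabeled
        mask_img = to_pil_image((mask[0] > 0.5).to(torch.uint8) * 255)
        out = inpainter(prompt=prompt, image=out, mask_image=mask_img).images[0]
    return out
```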
I think this way you could control much better where and how the AI dreams, especially since you enable it to know where a head is and where a leg or a foot is, and it could move the objects in lockstep with the video stream, too (a sketch of that motion step follows below).
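As a hedged sketch of the "object velocities/optical flow" point, one could use OpenCV's Farnebäck optical flow to warp the previously dreamed frame along the motion of the source video; the function name and flow parameters below are illustrative defaults:

```python
import cv2
import numpy as np

def warp_previous_dream(prev_src, cur_src, prev_dream):
    """Warp the previously dreamed frame so it follows the source motion.

    All arguments are H x W x 3 uint8 arrays (BGR, as read by OpenCV).
    """
    g_prev = cv2.cvtColor(prev_src, cv2.COLOR_BGR2GRAY)
    g_cur = cv2.cvtColor(cur_src, cv2.COLOR_BGR2GRAY)
    # Flow from the current frame back to the previous one: for each current
    # pixel it tells us where that pixel came from in the previous frame.
    flow = cv2.calcOpticalFlowFarneback(g_cur, g_prev, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = g_cur.shape
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    # Backward warping: sample the previous dream at those source positions.
    return cv2.remap(prev_dream, xs + flow[..., 0], ys + flow[..., 1],
                     cv2.INTER_LINEAR)
```

The warped result could then be fed back into the generator as the "previously created AI image", which should keep the dreamed content attached to the moving objects instead of flickering from frame to frame.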
Possibly useful links in this respect as of April 2022: