
Artificial Intelligence, or just artificially intelligent?

When AI text-to-image generators came to prominence some years back I had no interest in what they produced or how they worked. However, a few months back I started to review my inherent lack of concern and found myself thinking, 'What's the point of being critical of a process I don't understand or haven't used?'

I started researching the various AI models based on my understanding of how digital images are stored, encoded and decoded. I found myself looking at AI-generated images and thinking, 'Hang on, how is this happening? These are all just binary code, so how are they decoding text prompts and generating images?'

This is how I was introduced to diffusion models, generative adversarial networks and neural networks in general. Suddenly I found myself interested in the AI pipeline, and the controversy surrounding AI image generation somehow warranted my attention. The fact that, to facilitate machine learning, these neural networks are fed millions if not billions of images, building vast data sets that allow decoded text prompts to be re-encoded as new images, demystified and somehow neutralised the issues surrounding 'image integrity'.

Now the hunt was on to find something I could experiment with. Three AI tools stood out by virtue of their exposure on social media: DALL-E, Midjourney and Stable Diffusion. The idea of using text to describe what I wanted to make didn't appeal to me, but I investigated it regardless, just to see if my adversarial prejudice had any real foundation. Despite some interesting results, the artist and photographer in me questioned the need to have a third party do something I could do myself. It was then that the rebel in me found its voice, and this is where it got interesting: 'What if I can generate text prompts that the neural network can't resolve? What sort of image is that going to create?'

I'm good with language and a skilful writer, so I should be able to throw a spanner in the works. My earliest experiments proved that careful pairing of contextual anomalies or contradictions not only generated interesting results but helped me understand how the encoding/decoding pipeline might be working. However, I was still using the data sets the AI was trained on, and although the text prompts helped generate some interesting and at times unique images that were already stepping away from the characteristic AI stereotype, I needed another disruptor to make them 'my own', so to speak. What I realised was that I needed to be able to provide my own images (drawings, paintings, prints and photographs) as source material. I could do this to some degree with DALL-E, but I couldn't pair my own images with my own text prompts. Enter Wombo AI.

This small Canadian startup met and exceeded my every expectation. Despite the initial hefty investment, which enabled me to leverage the full capability of the software, the decision to buy proved to be the right one because, although I didn't know it at the time, I was going to reach the point of having full creative control. So here I am, some 10,000-12,000 images later, making work that I can with a clear conscience call my own. Do I need to do this? No, I don't. Would I include any of these images in exhibition submissions? Not yet, but it probably won't be far away. Can what I've learned be used in an educational setting? Absolutely.

These were the first AI images (text to image) I posted to one of my Instagram accounts, on Jan 27th, 2023.
A week later, on the 3rd of Feb, I posted this from the first batch of images where I used my own drawings as source material.
These were uploaded the following day, Feb 4th. The rest is history.
Recent works / random examples

With this type of working process the generated images often don't bear any resemblance to the source. What is encoded, passed downstream to the neural network, processed, then re-encoded and returned upstream as an image often bears only a very generalised resemblance to the original, if any at all. Here is an example, done just now as I'm writing this. The image sequence is source, process, output.

The text prompts have been omitted from this sequence because, in truth, that's where the magic happens.

Loaded source image / original drawing.
Processing selection
Output variations. Look closely at the source image and the output variations and you'll be able to identify three characteristics of the source image that the neural network combined with the text prompts to generate these images.
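For anyone curious what this source-plus-prompt step looks like under the hood, here is a minimal sketch using the open-source diffusers library and a Stable Diffusion image-to-image pipeline. This is my assumption about tooling for illustration only, not what Wombo actually runs; the model name, file names and prompt are placeholders.

```python
# Minimal image-to-image sketch with the open-source diffusers library.
# Assumes a CUDA GPU; model choice, file names and prompt are placeholders.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed base model
    torch_dtype=torch.float16,
).to("cuda")

# Load an original drawing as the source image.
source = Image.open("my_drawing.png").convert("RGB").resize((512, 512))

# 'strength' controls how far the output departs from the source:
# low values stay close to the drawing, high values let the prompt dominate.
result = pipe(
    prompt="a placeholder prompt pairing contradictory ideas",
    image=source,
    strength=0.75,
    guidance_scale=7.5,
    num_inference_steps=50,
).images[0]

result.save("output_variation.png")
```

Re-running the same call with a different strength value, or a different random seed via a torch.Generator, produces an entirely different output from the same source and prompt, which is loosely comparable to choosing a different processing selection in the tool described above.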

For anyone with an Instagram account, here's the link to where this work is being posted.

Choosing a different processing selection for the same source image and text prompts results in an entirely different output.
The same can be done with original student work.

This is a good free online text-to-image generator running on the Stable Diffusion AI. No login or account required.
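For anyone who would rather run Stable Diffusion locally than through a hosted site, here is a minimal text-to-image sketch using the open-source diffusers library; the model name and prompt are placeholders, and a GPU is assumed.

```python
# Minimal text-to-image sketch with the open-source diffusers library.
# Assumes a CUDA GPU; the model name and prompt are placeholders.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed base model
    torch_dtype=torch.float16,
).to("cuda")

# A fixed seed makes the result repeatable; change it for new variations.
generator = torch.Generator(device="cuda").manual_seed(42)

image = pipe(
    prompt="a placeholder prompt pairing contextual anomalies or contradictions",
    guidance_scale=7.5,
    num_inference_steps=50,
    generator=generator,
).images[0]

image.save("text_to_image_example.png")
```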