Recently I’ve been experimenting with AI generative tools from a technical and workflow perspective. By ingesting the text, images and other digital assets of humanity, these large language and visual models are as much a mirror as an interesting technology.
I’ve written two articles now on GPT Models and those are here and here.
The Reality of AI
How it’s all going to go down might have similarities with two previous innovations.
The emergence of graphics and publishing programs on desktop computers put publishing in the hands of the general public. This was made possible by laser printers using Adobe’s PostScript or Hewlett-Packard’s Printer Control Language (PCL) to produce output on standard office paper.
It took a little while, but professional-level software such as PageMaker, QuarkXPress, and Illustrator quickly became integral to the workflows of graphics professionals, agencies, printing companies, and typesetters.
Before this, even if your publication was only for internal use, you had to engage a professional individual or company to create the finished work. Those companies carried much higher overheads, with expensive typesetting and graphics equipment or large printing presses.
At trade shows, the pitch was that with the right gear and software, even secretaries (1980s vernacular) would be able to create a professional-looking company newsletter. The actual results, however, varied with the user’s skill level, ranging from impressive to trash.
It wasn’t perfect, and professionals could still do it better. A whole level of business disappeared almost overnight, though, as desktop publishing became cheaper, quicker, more efficient and sometimes “good enough”. This is being reflected again in the creation of assets in online services such as Canva.
The disruption depended on the company; desktop publishing didn’t affect everyone immediately. Still, companies scrambled to make sense of the coming wave and get on board. Some businesses took the attitude, “That thing is just a toy”, and then they were gone.
Desktop computers, scanners and sophisticated software soon became integrated into the workflows of agencies, printers and graphic designers. This increased the demand for digital assets (images and illustrations) to feed that workflow.
Skilled operators of the software were able to produce more complex designs and, after a time, deliver results exceeding the previous manual approach. In turn this influenced the design process, as designers experimented with software features to create new artworks.
Some businesses pivoted, some disappeared.
It’s not surprising that prompting feels like the evolution of search. Search doesn’t have to create any of the web assets and sites it finds. It just has to provide the most relevant results for your query.
Whoever had the best search results and the best-categorised pages changed the game. The quality of the resulting site pages was not connected to the search engine. Initially, search relied on metadata that let coders declare what a page was about. When that was gamed, search engines countered by relying on the content of the page itself.
Companies hired Search Engine Optimisation firms or staff to make their content rank higher, getting as near as possible to the first page for the search strings people might use. It was like the old game show “Family Feud”: the winning answer isn’t necessarily the right one, just the one everyone thinks of. That’s an important distinction where language models, such as ChatGPT, are concerned.
In a perfect world the best result and the top result would be the same, though it is not always the case. A further wrinkle is the advertising above the results related to the search performed.
Some people are better at searching than others because everyone has different levels of experience, patience and knowledge when it comes to researching topics or finding information online.
Like prompts, search allowed extra text operators such as quotes, plus and minus to refine the results or to exclude unwanted categories of pages. Now that many searches come from voice, or use ‘natural language’, that level of precision is slightly diminished even as the tools improve.
Prompting starts from the natural-language perspective, and as such even slight changes in wording can deliver very different results. This can be due to the meaning, or just to deficiencies in the model’s training.
Chat models may eventually be able to narrow down to exactly the thing you want to know, rather than returning a set of pages in which you have to find the answer yourself.
Now what we are doing is giving a large, trained model a prompt that causes it to give a response based on its training and data. This can be text, images, anything digital, as long as it is part of the data set. As usual, industry responses range from sceptical to evangelistic.
The Reality of Generative AI Workflow
The preceding information sets the scene for how disruption and innovation change the backdrop. In a practical sense these innovations go from ‘not that good’, to ‘better’, ‘equivalent in certain cases’, ‘better again’, and finally ‘I can’t remember what before was like’. Eventually they become part of the fabric.
Individuals and businesses might transition their workflow and get the best of potential outcomes. That means experimenting with even the current models in real workflows. I’ve run a number of experiments with audio, images and chat that have led up to this design experiment.
The purpose of this experiment was to create a physical finished art product not just a digital image. I wanted to hold the product in my hand.
Here is how it turned out.
Step 01: The Brief
The brief was to create a t-shirt design with the following elements: a monkey, Two Tone ska music, and a design that would work in a single colour on a black t-shirt.
Step 02: Generating Digital Assets
Previously, the illustrated assets would need to be sourced from an illustrator or from stock. Using stock could yield poor results due to lack of originality and potential copyright issues. If an illustrator was used, there would probably be one design and some reworking, with an agreed process for how this would work. The result would depend on the design brief, the skill of the illustrator, the budget and the client’s understanding.
The court cases are already piling up against the MidJourney, Stable Diffusion and DALL·E 2 image generators. Though the generated works are arguably original, or at least transformative, the cases centre on the use of images in the training data.
In both cases you might not get the result you can see in your mind: your bias. With the generative model, the low cost allows you to create many iterations, refining the prompt and basically refining your design brief on the fly.
Sometimes the generative process gives you results that make you think, “Hmmm… why don’t I follow this path and see where it takes me?” You have to be prepared to embrace the random as well as the iterative process. You need to think as much like a chef creating a recipe as a designer creating a design.
I used MidJourney as the generative tool. MidJourney can be fun as well as extremely frustrating. There is now a “/describe” command as well as “/imagine”, which can make it easier to find the words the language model understands. This is the ‘natural language’ element.
My prompts went through many changes. This is an indication of some of the wording, though not the individual prompts, as there were several.
Black and white illustration. Ska Monkey. Single Monkey Man Character. Full body standing skanking dancing. Ska. Black Suit and PorkPie Hat. Single white ink. TShirt Printing. Happy. Smiling. Joy. No background. --ar 2:3 --chaos 0 --no text --no background --no shading
“--ar” is the aspect ratio. “--chaos” controls variation: a low setting keeps results close to your prompt, while increasing it can get some wild results. “--no” filters out undesirable things the model pops into the result. For some reason MidJourney loves to add random text to purely graphic images, and you have to tell it not to.
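As a rough sketch, a prompt like this is really a description plus parameter flags. The flag names (--ar, --chaos, --no) are MidJourney’s; the helper function below is purely illustrative, not part of any tool:

```python
# Illustrative sketch only: assembling a MidJourney-style prompt from a
# description plus parameter flags. The flags (--ar, --chaos, --no) are
# MidJourney's; this helper function is hypothetical.
def build_prompt(description, aspect_ratio="2:3", chaos=0, exclude=()):
    parts = [description, f"--ar {aspect_ratio}", f"--chaos {chaos}"]
    # Each unwanted element gets its own --no flag.
    parts += [f"--no {item}" for item in exclude]
    return " ".join(parts)

prompt = build_prompt(
    "Black and white illustration. Ska Monkey. Single white ink.",
    aspect_ratio="2:3",
    chaos=0,
    exclude=("text", "background", "shading"),
)
print(prompt)
```

Thinking of the prompt this way makes iteration easier: the description changes constantly while the flags stay stable.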
In a couple of cases I fed some of my image composites back through the prompts to refine. I combined elements in PhotoShop, uploaded that image and used that as part of the prompt to generate new images using the image to ‘super prompt’ the result.
Eventually I had a number of generated images selected from many that had elements I wanted. Hat here, eyes there and so on.
Step 03: Design Process
The lowest level of using MidJourney is to just generate an image, possibly refining it further by pushing the variations button on one or more of the images in the four-image grid created.
In this case I wanted to produce a result that was emblematic and worked to a design brief. Art that was finished and could be used in a number of instances though primarily as a flexible t-shirt emblem.
This is where I refined the design brief (using the prompt ideation and images).
I wanted the design to incorporate a pork pie hat, a two tone ska chequerboard pattern, and a suit style à la Jason Statham in Snatch. I wanted it to sit on a floating background, so the main head and shoulders would stand out. If I took it further, I could then put a logo or horizontal branding below the emblem.
I used Adobe Photoshop to cut out the elements I wanted and stitch them together into an emblem as I could see it in my head.
To get usable files from MidJourney you pick the images you want and have the system generate a larger file with more pixels. The resolution stays the same (72 dpi); the pixel dimensions of the image get bigger. That is an important distinction to designers.
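That distinction can be made concrete with a little arithmetic: physical print size is pixel dimensions divided by dpi, so an upscale with more pixels at the same 72 dpi simply measures larger. The pixel values below are hypothetical examples, not MidJourney’s actual output sizes:

```python
# Physical print size is pixel dimensions divided by resolution (dpi).
# Upscaling adds pixels at the same 72 dpi, so the image measures larger.
def print_size_inches(width_px, height_px, dpi=72):
    return width_px / dpi, height_px / dpi

# Hypothetical sizes: a grid image vs a 2x upscale, both at 72 dpi.
small = print_size_inches(1024, 1536)   # roughly 14.2 x 21.3 inches
large = print_size_inches(2048, 3072)   # roughly 28.4 x 42.7 inches
```

Doubling the pixels doubles the physical measurement; the dots per inch never change.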
Even though the images generated were eventually black and white, the files were still RGB, and despite requesting no shading the edges tended to blend off into the white background. I set the images to grayscale and adjusted the brightness and contrast to get the working files as near as possible to pure black and white.
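Conceptually, that Photoshop step collapses each RGB pixel to a single luminance value and then stretches the contrast so soft anti-aliased edges snap toward pure black or white. A minimal pure-Python sketch of the idea, with illustrative pixel values and a standard luminance formula (not Photoshop’s internals):

```python
# Sketch of the grayscale + contrast step done in Photoshop.
# Collapse RGB to a single luminance value (ITU-R BT.601 weights),
# then stretch contrast around mid-grey so soft edges snap to black/white.
def to_gray(r, g, b):
    return 0.299 * r + 0.587 * g + 0.114 * b

def stretch_contrast(value, factor=2.0, midpoint=128):
    # Values above the midpoint move toward white, below toward black.
    adjusted = midpoint + (value - midpoint) * factor
    return max(0, min(255, round(adjusted)))

# A light anti-aliased edge pixel is pushed to pure white...
edge = stretch_contrast(to_gray(200, 200, 200))
# ...while a dark ink pixel is pushed to pure black.
ink = stretch_contrast(to_gray(40, 40, 40))
```

The same idea applied per pixel across the whole file is what turns a softly shaded RGB image into near-clean line art.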
I did toy with the idea of making the background and tie red. Although striking, this negated the ska chequerboard that was part of the brief.
I got to a finished result, but the initial face was too stern, too Jason Statham. Not fun, and a bit mean looking. I switched to a more relaxed, sunglasses-wearing monkey face, and that was great, although its hat was too ‘street’, so I kept the original hat.
Even though the ska chequerboard is square, I masked out the chequerboard in a circle to change it up a bit.
Once I had the elements combined, I used the tools in Photoshop to refine the elements. I had to mask the background of the head out of the chequerboard and make that part transparent. I had the shirtfront, though I wanted the jacket to fade out in the lower part so I made/adjusted the lapels and bottom into line art.
Finally I was happy with the result.
Step 04: Production
Printing the bitmap/pixel image at a high resolution on a t-shirt would still result in some edge artefacts due to the nature of the image. This is OK for a photograph, though on a black and white illustration it would look horrible.
I took the whole thing into Adobe Illustrator, did a master trace and fixed it up until I got a fully transparent, single white ink result. This meant that no matter how large I made the image, it would always render on the shirt with perfect edge detail.
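The reason the traced result holds perfect edge detail at any size is that vector art is just coordinates: scaling multiplies the numbers, whereas scaling a bitmap has to invent (interpolate) new pixels. A toy sketch of that difference, with made-up coordinates:

```python
# Toy sketch of why vector art scales cleanly: a path is a list of
# coordinates, and scaling just multiplies them, so edges stay exact
# at any output size. Bitmap scaling must interpolate new pixels instead.
def scale_path(points, factor):
    return [(x * factor, y * factor) for x, y in points]

# A hypothetical three-point path (e.g. one corner of a traced shape).
triangle = [(0, 0), (10, 0), (5, 8)]
big = scale_path(triangle, 40)  # still three exact points, just larger
```

No detail is lost or invented at any scale, which is exactly the property you want for crisp single-colour screen printing.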
I exported this to an EPS (Encapsulated Postscript) file as the finished art.
I loaded the EPS art file up to a t-shirt printing company called Create Apparel and received the t-shirt via Australia Post about a week later.
Step 05: Results
The experiment was a success and I’m very happy with the results.
The main fun of doing it was being able to try out lots of ideas and refine the brief. Even if I were going to brief an illustrator, I could have come clutching ideas that really represented what I was after, rather than just a written brief, no matter how detailed.
I already had skills; this gave me a powerful tool to refine and add to the illustration. Would an illustrator have created a better illustration or artwork? It would depend on the purpose of that artwork. As the designer, the main change was how I generated the assets I would otherwise have got from an illustrator or stock. After that it was pretty much the same, except with more go-arounds.
My refinement was like search. I got a lot of results, though it was up to me to refine and use those parts of the results that were relevant. General intelligence might just create the result if the explanation or input is good enough. I’ve seen enough design briefs to know how fraught that is.
I was able to do it from my apartment, given a nice computer and the right software.
The result also had to be technically sound to work in the t-shirt printing process. The result I got could be used in mass production.
That is one way Generative AI Tools can be used in a production workflow and potentially how to integrate them into yours.