[AI Art] Midjourney Mischief
Aug. 20th, 2022 10:52 am
So! My boss found a new toy called Midjourney, and encouraged me to set up an account (and, while I'm at it, a Discord account). I got a standard subscription for a month; if we were to actually use this for work-related applications, however, we'd need a yearly corporate account.
For work-related applications -- well, we'll see. But the important thing is just that I've been able to do some silly stuff with the tool. Plus, Koogrr is on Discord, so I've tried bouncing a few of the results for feedback. Among other things, I tried doing some Sasta portraits.
To back up a bit, the idea of Midjourney is much like other AI art interfaces such as DALL-E. You provide some text input and possibly an image URL. The program basically does some very fancy image-collaging, drawing from image sources on the internet. It's optimized toward certain types of subject matter; unlike DALL-E, you can end up with some fairly convincing-looking forward-facing portraiture of pretty girls, or some passable landscapes, though you might get some oddities such as more than one sun and/or moon in the sky, or some more trees and hills above what appears to be the clouds (but, hey, that works for fantasy), or your pretty girl face might have eyes that aren't quiiiiiite aligned correctly. Often what looks fine in a thumbnail will start to fall apart when you ask for the AI to "upscale" the image to add more detail: vaguely defined blobs that MIGHT be humanoid figures standing in a cityscape or landscape can work simply because the viewer's imagination can fill in the blanks. However, get the AI to start adding details and it becomes more evident that what COULD have been a car is a jumble of random parts, what COULD have been a human has a few too many limbs, what looked like writing is clearly gibberish, and so forth.
On certain forums I've seen folks boast of putting out a new RPG supplement that's entirely illustrated with AI art, with no need to pay some greedy ILLUSTRATOR to fill the pages with something other than walls of text. I could see some possibilities there for "filler art," though it takes some effort to churn out things that are consistent in style, and nigh impossible to have recurring SPECIFIC subjects.
So far, I've gotten some random interesting results, but largely a bunch of stuff that: a) looks cool only when it's really small; b) really could use some touch-up work. Honestly, my co-workers who've been toying with this seem to have gotten much better results than I have; I haven't really gotten the knack of the proper syntax yet, but perhaps it's a matter of overly challenging subject matter.

Mucha Medusa About Nothing
Prompt Line: /imagine prompt: symmetrical medusa priestess in the style of alphonse mucha --ar 9:16
Koogrr suggested a medusa portrait. I gave it a shot. Adding "in the style of {artist}" strongly drives what sort of image you're going to end up with. I've had a lot of fun with "in the style of Studio Ghibli" or "in the style of Thomas Kinkade." You can also get interesting results from "in the style of a tarot card" or "in the style of a travel poster."
"Symmetrical" is a keyword I've used to try to force a front-facing portrait. Sometimes Midjourney will try to depict a character at a slight angle, and then the eyes really don't line up well and various other things go wrong. In my experience so far, if you want a face that looks the least bit decent, it works best when the subject is staring straight at the camera.
A leading "--" marks a parameter, and in this case "--ar" is the aspect ratio. Without it, images come out square (256x256 thumbnails, for instance), but with an aspect ratio you can get portrait or landscape orientation. The choice of dimensions DOES seem to significantly affect the contents as well: with a portrait-oriented picture, I'm far more likely to get a full-body or close-up portrait of a single subject; with landscape orientation, the requested subject is far more likely to be merely standing within a landscape (possibly very small and not terribly detailed), AND to be merely one of multiple subjects.


Ghibli-Style Spiral Garden
prompt: /imagine prompt spiral-world garden landscape at sunset in style of Studio Ghibli --ar 9:16
So this starts to illustrate a little of the process, and this was actually what I started with first, after reading up on a few examples, and trying to think of an "art style" to try out. You type in some keywords, it grinds away, and then gives you four different interpretations of your instruction. You can simply try again, roll the dice, and see what else you get ... or try a completely different set of keywords ... or you can select one of the options and either generate new variations branching off from that particular interpretation ... or you can select one of the options and "upscale" it to create a more detailed version (though the process of "detailing" can make some pretty drastic changes, and often I PREFER the low-detail thumbnail).
I envisioned some sort of floating sky-island with a garden on it, but ... eh, how to describe that? Basically I got something that had "garden" in it and "spiral" somehow. Midjourney is pretty good at just combining all the keywords, but it ignores a lot of the prepositions and details such as "with" or "inside" or "on top of" or "in addition to," etc. If you want a thing and then separately another thing, you might be better off doing entirely different pictures and just trying to Photoshop the elements you want together into a single picture post-process. Maybe it's possible to get both in one go, but I haven't figured that out yet.
The image on the left is the first one I started with. The image on the right is where I picked one output and asked for variations on that theme.

The Magic Shop: Potions on a Shelf
prompt: /imagine prompt magic potions on wooden shelf in style of Thomas Kincade --ar 16:9
(Note: I misspelled "Kinkade." But I liked the way things turned out, and when I corrected to "Kinkade" ... well, more on that below.)
Possible application: greebly little unidentifiable items for decorating an "RP menu" for a Warcraft "magic shop." That is, some RP "vendors" enjoy making pretty menus to advertise their imaginary wares (sometimes represented as pseudo-items with attached descriptions, icons, and usage-related messages and sound effects thanks to the TRP3 add-on for World of Warcraft). So I thought that while these might not be identifiable as specific wares, perhaps imagery of potions and other trinkets might make for some interesting decorative elements. After all, I could run back through with some specific dimensions if there was a specific space to be filled.
"Upscaling" one of the images makes for some fascinatingly detailed texture, but not necessarily any greater clarity about what's being depicted in the labels. Also, it's more evident that there's some crazily-misshapen warping going on with the bottles, corks, etc., in certain places.

So, what other "generic" magic items could I put on shelves for a magic shop? Oh! I misspelled Thomas Kinkade! Let's fix that next time....

Magic Shop: Magic Rings
prompt: /imagine prompt magic rings on a wooden shelf, artwork by Thomas Kinkade --ar 16:9
Okay, suddenly everything is a LOT more colorful. I'm not sure what artist it was channeling for "Thomas Kincade with a C," but that seems to be distinct from "Thomas Kinkade with a K."
Also, "ring" is a word with varied visual interpretations, it seems, or at least more so than "potion." The thing I have to keep in mind is: if you looked up this word in Google Images, what would you find? Try to get a horror-movie poster of Freddy, and you might well get a strange amalgam between Freddy Krueger and "Freddy" from "Five Nights at Freddy's." (And that's exactly what happened in one example I saw.)
Here, I think invocation of the spirit of Thomas Kinkade demanded a LANDSCAPE, so "ring" got worked into it in #2. In #3, those look like ... donuts?!? And in #4, that looks like a magic ONION ring. Very weird. I think I'll go back to "Kincade" for consistency, even if it makes no sense.

Magic Shop: Amulets
prompt: /imagine prompt magic amulets on wooden shelf in style of Thomas Kincade --ar 16:9
This isn't what I was envisioning for "amulets," but that seems to be a word with multiple interpretations beyond what I was trained to expect from D&D. Why, in the recent Sandman series, we were taught that "amulet" is an archaic term for "ring." (Really?!) But this I suppose at least works for some miscellaneous magical knickknacks on a shelf, which is what I was ultimately going for anyway.
I wonder what I'll get if I drop the art style?

Magic Shop: Beer Bottles
prompt: /imagine prompt fantasy beer bottles on a wooden shelf --ar 16:9
Okay, so some beer bottles for a fantasy setting, perhaps for the beer booth at the next Dwarven moot! Those are some pretty intriguing bottles, I think.
Let's try putting the style back in.

Magic Shop: Beer Bottles
prompt: /imagine prompt fantasy beer bottles on a wooden shelf in the style of Thomas Kincade --ar 24:9
It looks a little more like the style of the other pictures, but only so much. Those are some weird bottles!
Hey, how about Dwarven steins?

Magic Shop: Dwarven Steins
prompt: /imagine prompt Dwarven steins, on a wooden shelf, in the style of Thomas Kincade --ar 24:9
Okay, those steins in the upper left corner actually look like they might have some Dwarven faces on them, but they're pretty mutated. Let's try some variants.

Magic Shop: Dwarven Steins (variant)
prompt: /imagine prompt Dwarven steins, on a wooden shelf, in the style of Thomas Kinkade --ar 24:9
Hmm. Okay, interesting, but I kind of liked the idea of some Dwarf faces. This is a job for ... PHOTOSHOP!

Magic Shop: Dwarven Steins (touchup)
Upscaling just made them look alien. So I decided to keep the "thumbnail" version, pull a color palette from the image itself, and do some touch-up work to turn some of those mutant scribbles into something like faces. Here's the result, for better or worse.
...
Now, I did a lot of experimentation with movie posters, post-apocalyptic scenes, imaginary travel posters, and such, but I should probably put those in another post.
I tried to see if I could do anthropomorphic animals. In short? Not very well at all. Okay, maybe I could start with someone like Anubis, since there should be images of him, and he's already "anthropomorphic." The trouble is ... he's usually depicted in profile, and Midjourney wants to do things straight-on. Well, here goes!

Anubis in a Tuxedo
Voila! Yeah, pretty weird. I tried variations and other keywords, but mostly I got a bunch of mutant horrors. This is the best of the crop, and upscaling only made things much, much worse.
Let's try ... Sasta! Maybe Cyber-Sasta. I learned that I could feed in an image to start with, and maybe that'd help it get to somewhere closer to where I wanted. I probably should have gone with a fresh drawing, straight-on, but for expediency, I tried a picture from a previous portrait:

So, here goes Cyber-Sasta!

Cyber-Sasta Experiment #1
prompt: /imagine prompt https://s.mj.run/9qvO56VAJu0 --iw 0.90 Cyberpunk Sasta Anthropomorphic Cougar Face, symmetrical :: background cyberpunk city, holograms, neon lights, at night, raining :: cinematic atmosphere, realistic, highly detailed, 8K, octane render --ar 9:16
And I get ... AIEEEEEEEEE! NIGHTMARE FUEL!
Okay, wait, the upper-left picture, #1, sort of looks like a cat-humanoid. I tried some other keywords and ended up with a bunch of mutant cyber-girls who didn't look the least bit cougar-like. Maybe I could try some variants of that first image instead?

Cyber-Sasta Variant
prompt: /imagine prompt https://s.mj.run/vo9EVdRCacM --iw 0.90 Cyberpunk Sasta Anthropomorphic Cougar Face, symmetrical :: background cyberpunk city, holograms, neon lights, at night, raining :: cinematic atmosphere, realistic, highly detailed, 8K, octane render --ar 9:16
Eeeeeek. Some of those are pretty awful. But #3 (lower left) seems on track.

Cyber-Sasta Variant-Variant
prompt: /imagine prompt https://s.mj.run/y2NlJsmKa2M --iw 0.90 Cyberpunk Sasta Anthropomorphic Cougar Face, symmetrical :: background cyberpunk city, holograms, neon lights, at night, raining :: cinematic atmosphere, realistic, highly detailed, 8K, octane render --ar 9:16
And ... it just goes downhill from there. Guess I should have quit while I was ahead. Let's go back and "upscale" that one and see what we get.

Cyber-Sasta Upscale
Huh. Well ... it's detailed, and it's very cyber. But I think I liked the thumbnail better. Maybe I should try this from another angle. First off, lose the tall proportions and maybe I'll get more face, less mutation. Who knows?

Cyber-Sasta Square Pic Experiment
prompt: /imagine prompt https://s.mj.run/9qvO56VAJu0 --iw 0.90 Cyberpunk Sasta Anthropomorphic Cougar Face, symmetrical :: background is cyberpunk city, holograms, neon lights, at night, raining :: cinematic atmosphere, realistic, highly detailed, 8K, octane render
Strangely, the cyberpunk city background has vanished entirely, and any holograms, neon lights, etc., are just part of the character. There are some interesting results here (and more horror), but it's not really what I was aiming for. Sasta needs some HAIR. Also, an eyepatch.

Cyber-Sasta Square Pic Experiment with Hair
prompt: /imagine prompt https://s.mj.run/9qvO56VAJu0 --iw 0.90 Cyberpunk Sasta Anthropomorphic Cougar Face, with eyepatch, long brown hair :: background is cyberpunk city, holograms, neon lights, at night, raining :: cinematic atmosphere, realistic, highly detailed, 8K, octane render
Well! That does interesting things. The city is back, but the upper-left one looks like a human with a bad facial-paint job. The one to the lower-left is most interesting, but nobody's got an eyepatch, and we're missing cat ears on that one. I'll save it for later, and try a few other approaches.
( ... lots of nightmare-fuel later ... )
AIEEEEE! Uhm ... let's try again.

Cyber-Sasta Square Pic Experiment #16-or-so
prompt: /imagine prompt https://s.mj.run/UsLKeYVf8so --iw 1.00 Cyberpunk Sasta Anthropomorphic Cougar Face, with cat ears, long brown hair, cyber eye :: background is cyberpunk city, holograms, neon lights, at night, raining :: cinematic atmosphere, highly detailed, 8K, octane render
The "--iw" by the way sets the "image weight": how strongly the source image counts against the text of the prompt. 1.00 is as high as I can go, I think. Still, given that the upper two pictures don't even give me a face, I really don't quite understand how the "seeding" works.
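To make the moving parts of that prompt line a little easier to see, here's a minimal Python sketch that pulls one apart into its source image, its "--" parameters, and its "::"-separated chunks of text. This is only a reading of the prompt layout used in this post, not Midjourney's actual parser: the name dissect is hypothetical, anything starting with "http" is assumed to be the image prompt, and each parameter is assumed to take exactly one value.

```python
import re

# Assumption: a parameter is "--name" followed by a single value token.
PARAM = re.compile(r"--(\w+)\s+(\S+)")

def dissect(prompt):
    """Split a prompt into image URLs, parameters, and '::' text sections."""
    params = dict(PARAM.findall(prompt))
    # Remove the parameters so only image URLs and descriptive text remain.
    words = PARAM.sub(" ", prompt).split()
    images = [w for w in words if w.startswith("http")]
    text = " ".join(w for w in words if not w.startswith("http"))
    sections = [s.strip() for s in text.split("::") if s.strip()]
    return {"images": images, "params": params, "sections": sections}

parts = dissect(
    "https://s.mj.run/UsLKeYVf8so --iw 1.00 Cyberpunk Sasta Anthropomorphic "
    "Cougar Face, with cat ears, long brown hair, cyber eye :: background is "
    "cyberpunk city, holograms, neon lights, at night, raining :: cinematic "
    "atmosphere, highly detailed, 8K, octane render"
)

print(parts["images"])
print(parts["params"])
for section in parts["sections"]:
    print("-", section)
```

Laid out this way, the prompt is one image, one weight applied to that image, and three blocks of text: the character, the background, and the rendering style.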
So, you know what? It's Photoshop time. I decided to combine elements from different pictures -- the bits I thought worked better. Also, a bit of the warp tool to realign things couldn't hurt (much).

Cyber-Sasta Square Pic Thumbnail
No eyepatch, but this is cyberpunk! I borrowed one of those "cyber-eyes" to put over one eye, and used the warp tool to create some ears out of hair highlights.

Cyber-Sasta Pic Upscale Experiment
I tried "upscaling" the image, but it went in weird directions. It added the rain finally, but got rid of the whiskers(?) and made the character more human, less cat-person. One of the eyes was really wonky, but I'm fine with that because it's going to be a CYBER EYE! This time I substituted in a cyber-eye from another "upscale" output.

Cyber-Sasta Variants with Touch-Up
Here, I've tweaked a few of those other variant pics with a bit of touch-up work. Some ears here, tail there, glowy eye, muzzle marks, etc.
Anyway, fun toy, and I'm still learning. I'll share some other results later. I'm not really sure about applications. For my own "art," I would have trouble putting my signature on it because most of the work isn't really mine, and furthermore it's essentially cobbled from other sources. For work, well, that's a whole 'nother topic.
no subject
Date: 2022-08-22 08:30 pm (UTC)
So, anyway, what ended up happening was that in some cases the original pic might seem to have a tail, ears, cat-like nose, etc., but during the Upscale process, it would add new details that transformed that apparent tail into ... a fold of drapery? And those ears became ... glowing nonsense kanji symbols in the hair? And the nose would just become a misshapen thing, a new human-like mouth with lips might form out of what was previously the chin, and other mayhem.
It's quite perplexing to watch the work-in-progress while it's working on a picture. A few blobs go down. Then it seems to carve these blobs into shapes. And then it works further ... and what seems like an accidental detail to a blob might get shaped into something new. It seems very iterative, in many ways as if this were a work being passed around the room, with different artists working on it, even if they still get the same initial keywords. What seems one moment to be a rightly-proportioned nose at one stage gets mutated into a mouth at another, or vice versa, and sometimes I may end up with a face that has multiple noses, multiple mouths, lapel buttons that transform into extra eyes, and other horrors! It's interesting, and clearly it can lead to some interesting results, but I find it quite bizarre. Apparently there's no underlying "structure" to the art -- no grand plan of "this shall be the face, here are the eyes, here's the nose, pointing this way," etc. -- but rather "Oh, I'll make this into a nose," or, "Here, I'll put a plug outlet," and then the next artist sees the plug outlet, does a bit of pattern matching and says, "Oh, that's a FACE!" and turns it into such, even though this makes no sense in the greater context.
no subject
Date: 2022-08-22 08:36 pm (UTC)
Here's an interesting thread if you hadn't seen it, by Ursula Vernon using an AI art generator:
https://twitter.com/ursulav/status/1467652391059214337