• 25 Posts
  • 6 Comments
Joined 1 year ago
Cake day: June 26th, 2023

  • To a degree, my question is: how do you feel about others being able to generate content, especially when that content is limited in flexibility and quality?

    Also, I’m curious whether you see a real potential market if you flip the perspective, adopt the tech, and use it to your advantage. Maybe it is layering and backgrounds for composition, maybe it is full-on training to generate content, or maybe it is simply saving time by letting the AI rework images.

    The typical image generation process most people think about turns a text prompt into an image. It starts from an image of mathematically random noise and turns that into a version of the prompt over a series of steps. There are other methods too. One method takes an image as input, overlays some noise, and then uses the result as the baseline to generate an image from. Basically, you can add just a small amount of noise to a blurry or bad image and the AI can render it better. This isn’t like photo filters or editing. I would be using this to my advantage. I would also look very carefully at what is hard to generate with AI right now and focus on making stuff that it cannot do well. There is a lot more generated content out there than I thought before I learned how this works and what AI does poorly.
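
    To make the "overlay some noise" idea concrete, here is a toy sketch in plain Python. It is not how Stable Diffusion is actually implemented: real tools add the noise in a latent space and follow a noise schedule. The names `add_noise` and `steps_to_run`, and the square-root blend, are my own assumptions for illustration. The point is only that a low strength keeps most of the input image and leaves fewer steps for the model to repaint.

```python
import random


def add_noise(pixels, strength, rng):
    """Blend pixel values toward Gaussian noise.

    strength = 0.0 returns the input unchanged; strength = 1.0 returns
    pure noise. Hypothetical helper, not a real library function.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    keep = (1.0 - strength) ** 0.5  # how much of the original survives
    mix = strength ** 0.5           # how much noise is mixed in
    return [keep * p + mix * rng.gauss(0.0, 1.0) for p in pixels]


def steps_to_run(total_steps, strength):
    """Assumed rule of thumb: only a fraction of the steps are needed,
    because the image starts out only partly noisy."""
    return min(int(total_steps * strength), total_steps)


if __name__ == "__main__":
    rng = random.Random(0)
    blurry_image = [0.2, 0.4, 0.6, 0.8]  # stand-in for real pixel data
    print(add_noise(blurry_image, 0.3, rng))
    print(steps_to_run(20, 0.5))
```

    With a strength near 0 the output is nearly the input image, and with a strength near 1 you are basically back to plain text-to-image.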


  • Honestly, may I ask, how do you perceive this?

    I have used images to help me learn how training works with AI. It is far easier to see that ass nipples are a mistake than it is to see that poor text training has resulted in a middle-aged woman with excessive hairiness and a passion for gardening who now goes by the name Harry Potter.

    I may have a database of images and trained models that I have used to learn, not your content in particular, and not any particularly good results. I’ve mostly explored why labias are so bad with Stable Diffusion, and scraped a couple of ftv galleries. I wouldn’t call myself a fan of anyone really. I’m certainly not a mark in this space. My real interest is in other AI applications. Posting trained models of people seems too much of a gray area for me. At the same time, this is becoming a super powerful tool that essentially expands exposure and likely attracts the type of person who would pay for more. For example, the recent creation of Open Dream makes it possible to do image layering for complex composition. I’m curious about a content creator’s take here.


  • It isn’t too hard to read the way the scripts parse prompts. I haven’t gone into much detail when it comes to Stable Diffusion. The GUIs written in Gradio, like Oobabooga for text or Automatic1111 for images, are quite simple Python scripts. If you know the basics of code like variables, functions, and branching, you can likely figure out how the text is parsed. This is the most technically correct way to figure this stuff out. Users tend to share a lot of bad information, especially in the visual arts space, and even more so if they use Windows.

    The prompt parsing method is part of the script, so if we don’t know what software you are using, it is hard to tell you what to do with certainty. I think most are compatible, but I don’t know for sure. In the LLM text space, things like characters are parsed differently across various systems.

    With Automatic1111, on the text2img page, there is a small red icon under the image that opens up a menu in the GUI and lists all the LoRAs you have placed in the LoRA folder on the host system where you installed A1111. Most of the LoRAs you download that show up on the text2img page will have a small circled “i” icon in one corner. This will usually contain a list of the text data that was used to train the LoRA. This text data was associated with each image. These are the keywords that will trigger certain LoRA attributes.

    When you have this LoRA menu open, if you click on any of the entries, it will automatically add the tag used to set the strength of the LoRA’s influence on the prompt. This defaults to 1.0, but that is almost always too high. Most of the time 0.2-0.7 works okay. You also need the main keyword used to trigger the LoRA added somewhere in the prompt. This can be difficult to find unless you keep this information from the place you downloaded the LoRA from. Personally, I rename all of my LoRAs to whatever the keyword is.

    Also, you’re likely going to collect a lot of LoRAs eventually. Get in the habit of putting an image relevant to what each LoRA does in the LoRA folder. The image should be named the same as the LoRA itself. A1111 will automatically add this image to each entry you see in the GUI menu. LoRAs are not hard to train either. Try it some time. If you can generate images, you can train LoRAs.
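
    If you want a feel for what that parsing looks like, here is a small sketch that pulls the LoRA tags out of a prompt. It is my own simplified version, not A1111's actual code. It assumes the tag looks like `<lora:name:weight>` and that a missing weight means 1.0. The function name `split_prompt` and the keyword `mykeyword` are made up for the example.

```python
import re

# Assumed tag shape: <lora:name> or <lora:name:weight>
LORA_TAG = re.compile(r"<lora:([^:>]+)(?::([^:>]+))?>")


def split_prompt(prompt):
    """Return (prompt_text_without_tags, [(lora_name, weight), ...])."""
    loras = []
    for name, weight in LORA_TAG.findall(prompt):
        # A missing weight falls back to 1.0 (the default mentioned above).
        loras.append((name, float(weight) if weight else 1.0))
    # Remove the tags and collapse the leftover whitespace.
    text = " ".join(LORA_TAG.sub("", prompt).split())
    return text, loras


if __name__ == "__main__":
    text, loras = split_prompt("a garden, mykeyword <lora:mykeyword:0.6>")
    print(text)
    print(loras)
```

    Notice the keyword still has to be in the prompt text itself. The tag only says which LoRA to load and how strongly to apply it.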