AI Images - Generating Images with Stable Diffusion

<div class="atitle" style="font-size:200% !important"><span class="db8orange">AI Images</span></div>
<div class="atitle2">Creating Images using AI<br><br><br><br><br>
<span class="apage1">by Peter Martin / <a href="https://db8.nl" target="_blank">db8.nl</a></span>
<br>
<span class="apage1">slides: <a href="https://slides.db8.nl" target="_blank">https://slides.db8.nl</a></span>
<style>.atitle{text-shadow: 2px 2px 2px #000;}
.atitle2{text-shadow: 2px 2px 2px #000;}
</style>
<div style="margin: 0 auto;">
<a href="https://db8.nl" title="Joomla Specialist db8 Nijmegen" target="_blank">
    <img height="50" style="border:0;background:transparent;margin-right:20px;
    box-shadow: 0 0 0px rgba(0, 0, 0, 0.15);" data-src="images/db8-logo.svg">
</a>
</div>
</div>

---

<div class="title"><span class="db8orange">Overview</span><br>
<ul style="font-size: 70%">
<li>Images on websites</li>
<li>AI in the Cloud</li>
<li>AI Locally</li>
<li>Stable Diffusion<ul>
  <li>Installation</li>
  <li>Usage</li>
</ul></li>
<li>Questions</li>
</ul>
</div>

---

<div class="title"><span class="db8orange">Images</span> on websites</div>

----

## Why images on websites
<ul>
<li>Visual appeal</li>
<li class="fragment">Communication - a picture is worth a thousand words</li>
<li class="fragment">Branding and identity</li>
<li class="fragment">Search Engine Optimization (SEO)</li>
</ul>

----

### Important aspects with images
<ul>
<li><span class="db8orange">Copyright</span> and Licensing</li>
<li class="fragment">Quality and Resolution</li>
<li class="fragment">Alt Text and Accessibility</li>
<li class="fragment">Image Size and <span class="db8orange">Load Time</span></li>
<li class="fragment">Context and Relevance</li>
<li class="fragment">Cultural Sensitivity and Diversity</li>
<li class="fragment">Originality and Creativity -> <span class="db8orange">SEO</span></li>
</ul>

----

### Issues with images
<ul>
<li>Images found on the internet > <span class="db8orange">Copyright claims</span></li>
<li class="fragment"><span class="db8orange">Royalty-free</span> images & <span class="db8orange">originality</span></li>
<li class="fragment">AI Images > <span class="db8orange">trained</span> on images > Copyright infringement?</li>
<li class="fragment">AI generated Images > <span class="db8orange">no manual labor</span> > No Copyright?</li>
<li class="fragment">AI Images: <span class="db8orange">Diminishing</span> supply of <span class="db8orange">original</span> work</li> 
</ul>

---

<div class="title"><span class="db8orange">AI in the cloud</span>
<br>
<span style="font-size: 60%">text-to-image models</span>
</div>

----

<div class="title"><span class="db8orange">DALL-E</span>
<br>
<ul style="font-size:60%">
<li class="fragment">DALL·E, <a href="https://openai.com/dall-e-2" target="_blank">DALL·E 2</a>, and DALL·E 3</li>
<li class="fragment">(WALL-E + Salvador Dalí)</li>
<li class="fragment">OpenAI (<a href="https://openai.com/dall-e-3" target="_blank">DALL·E 3</a> via ChatGPT-4)</li>
</ul>

</div>

----

### <span class="db8orange">DALL-E</span> via ChatGPT-4

> a photo realistic image of unwanted mail


----

### <span class="db8orange">DALL-E</span> via ChatGPT-4

> a photo realistic image of unwanted mail in landscape format


----

<div class="title"><span class="db8orange">Stable Diffusion</span>
<br>
<ul style="font-size:60%">
<li class="fragment">via <a href="https://clipdrop.co/stable-diffusion-turbo" target="_blank">clipdrop.co/stable-diffusion-turbo</a></li> 
<li class="fragment">Stability AI</li>
</ul>
</div>

----

### <span class="db8orange">stable-diffusion-turbo</span>

> a photo realistic image of unwanted mail


----

<div class="title"><span class="db8orange">Leonardo AI</span>
<br>
<ul style="font-size:60%">
<li class="fragment">Leonardo AI: AI Art Generator</li>
<li class="fragment">via <a href="https://app.leonardo.ai/" target="_blank">app.leonardo.ai</a></li> 
</ul>
</div>

----

### <span class="db8orange">Leonardo AI</span>

> a photo realistic image of unwanted mail


----

<div class="title"><span class="db8orange">Midjourney</span>
<br>
<ul style="font-size:60%">
<li class="fragment">Paid plans only, used via Discord</li>
<li class="fragment">Midjourney, Inc., San Francisco</li>
<li class="fragment">via <a href="https://www.midjourney.com/home" target="_blank">midjourney.com/home</a></li> 
</ul>
</div>

----

### <span class="db8orange">Midjourney</span>

> a photo realistic image of unwanted mail


----


### Generating Images in the Cloud

- Free services
  - limited 
- Paid services
- What happens with your data?

---

<div class="title"><span class="db8orange">AI Locally</span></div>

----

### AI Locally

- AI Software locally
  - Photoshop
  - Stable Diffusion

----

### <span class="db8orange">Stable Diffusion</span>

Checkpoint: <a href="https://huggingface.co/CompVis/stable-diffusion-v-1-4-original" target="_blank">stable-diffusion-v-1-4-original</a>

> a photo realistic image of unwanted mail


----

### <span class="db8orange">Stable Diffusion</span>

Checkpoint: <a href="https://civitai.com/models/15003/cyberrealistic" target="_blank">Cyberrealistic v4.1</a>

> a photo realistic image of unwanted mail


----

### <span class="db8orange">Stable Diffusion</span>

Checkpoint: <a href="https://civitai.com/models/241415/picxreal" target="_blank">PicX_real</a>

> a photo realistic image of unwanted mail


---

<div class="title">Installing <span class="db8orange">Stable Diffusion</span></div>

----

### Clone Stable Diffusion WebUI

```bash
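# Ubuntu prerequisites: venv support and the tcmalloc allocator
$ sudo apt install python3.10-venv libtcmalloc-minimal4
$ git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
$ cd stable-diffusion-webui
# Optional: webui.sh creates ./venv itself if it is missing
$ python3 -m venv venv
# First run installs dependencies, then serves http://127.0.0.1:7860
$ ./webui.sh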
```

----

### Download Checkpoint

- <a href="https://huggingface.co/CompVis/stable-diffusion-v-1-4-original" target="_blank">huggingface: stable-diffusion-v-1-4-original</a>

> The Stable-Diffusion-v-1-4 checkpoint was initialized 
> with the weights of the Stable-Diffusion-v-1-2 checkpoint 
> and subsequently fine-tuned on 225k steps 
> at resolution 512x512 on "laion-aesthetics v2 5+" 
> and 10% dropping of the text-conditioning 
> to improve classifier-free guidance sampling.

----

### Improve performance

- NVIDIA <span class="db8orange">CUDA</span> (Compute Unified Device Architecture)
  - parallel computing platform
  - allows software developers to use a GPU for general-purpose processing
  - proprietary and closed-source!

----

```bash
$ sudo ubuntu-drivers devices
```

```txt
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd000025A0sv00001028sd00000A61bc03sc02i00
vendor   : NVIDIA Corporation
model    : GA107M [GeForce RTX 3050 Ti Mobile]
driver   : nvidia-driver-535-server-open - distro non-free
driver   : nvidia-driver-525 - distro non-free
driver   : nvidia-driver-470 - distro non-free
driver   : nvidia-driver-535 - distro non-free recommended
driver   : nvidia-driver-535-server - distro non-free
driver   : nvidia-driver-535-open - distro non-free
driver   : nvidia-driver-525-open - distro non-free
driver   : nvidia-driver-525-server - distro non-free
driver   : nvidia-driver-470-server - distro non-free
driver   : xserver-xorg-video-nouveau - distro free builtin
```

----

```bash
$ sudo apt install nvidia-driver-535
$ sudo nvidia-smi
```
```txt
Tue Jan  2 10:48:47 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3050 ...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   50C    P0              N/A /  40W |      8MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2950      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+
```
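
----

### GPU memory from a script

A hedged sketch for reading the GPU name and total memory from Python, assuming `nvidia-smi` is on the PATH. The function names `parse_gpu_memory` and `query_gpus` are hypothetical; only the parser runs without a GPU.

```python
import subprocess


def parse_gpu_memory(csv_text: str) -> list[tuple[str, int]]:
    """Parse lines of the form '<gpu name>, <MiB>' into (name, MiB) pairs."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, total = line.rsplit(",", 1)
        gpus.append((name.strip(), int(total)))
    return gpus


def query_gpus() -> list[tuple[str, int]]:
    """Ask nvidia-smi for each GPU's name and total memory in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)
```

Calling `query_gpus()` on a machine with the driver installed returns one pair per GPU; a small total (such as the 4096MiB shown in the table) is a reason to keep batch sizes low.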

----

#### Result

- Side effect: the laptop no longer "suspends"...

----

#### First Results

----

#### More First Results

----

#### More First Results

---

<div class="title">Using <span class="db8orange">Stable Diffusion</span></div>

----

### Stable Diffusion <span class="db8orange">checkpoint</span>

- Checkpoint = trained model weights (the result of training on large datasets)
  - Models: <a href="https://civitai.com/models/" target="_blank">civitai.com/models</a> 
    - <a href="https://civitai.com/models/15003" target="_blank">CyberRealistic</a> 
    - <a href="https://civitai.com/models/241415/picxreal" target="_blank">PicX_real</a>

----

### Prompt / Negative <span class="db8orange">Prompt</span> 1/2

- Use English
- Use keywords (not sentences)
- Use: Subject + verb + object + adjectives (beautiful)
- Order is important! The first words carry the most weight.
- Or change a keyword's weight: (keyword:x.y), for example (keyword:1.33)
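
----

### Keyword weight, sketched

A minimal sketch of the `(keyword:x.y)` weight syntax. The helper name `weighted` is hypothetical, and formatting the weight with two decimals is an assumption.

```python
def weighted(keyword: str, weight: float = 1.0) -> str:
    """Format a keyword in the (keyword:weight) emphasis syntax."""
    if weight == 1.0:
        return keyword  # the default weight needs no decoration
    return f"({keyword}:{weight:.2f})"


# Most important keywords first, then the re-weighted ones.
prompt = ", ".join([
    "unwanted mail",
    weighted("photo realistic", 1.33),
    weighted("soft lighting", 0.8),
])
print(prompt)
```

Weights above 1 emphasise a keyword, weights below 1 tone it down.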

----

### Prompt / Negative Prompt 2/2

- Specify:
  - environment (indoor, outdoors, park)
  - lighting (soft, ambient)
  - tools or materials (pencil, photo)
  - color scheme (dark, vibrant, dynamic lighting)
  - camera (wide angle, close up)
  - photography (polaroid, etc)
  - art style
  - art inspiration -> list of artists
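
----

### Building a prompt, sketched

A small sketch of composing a prompt from the categories listed above, subject first because word order matters. The dictionary keys, the example values, and the helper `build_prompt` are all hypothetical.

```python
# Hypothetical categories, in the order the keywords should appear.
PARTS = {
    "subject": "unwanted mail on a doormat",
    "environment": "indoor",
    "lighting": "soft, ambient",
    "medium": "photo",
    "camera": "close up",
}


def build_prompt(parts: dict[str, str]) -> str:
    """Join the non-empty parts in insertion order, subject first."""
    return ", ".join(value.strip() for value in parts.values() if value.strip())


print(build_prompt(PARTS))
```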

----

### Prompt resources

- Extension: prompt generator
- check other images
- <a href="https://lexica.art" target="_blank">lexica.art - images</a>
- <a href="https://prompthero.com" target="_blank">prompthero.com</a>
- <a href="https://stablediffusion.fr/prompts" target="_blank">stablediffusion.fr/prompts</a>
- <a href="https://openart.ai" target="_blank">openart.ai</a>
- <a href="https://openart.ai/promptbook" target="_blank">openart.ai/promptbook</a> !!!

----

### Parameters

- <span class="db8orange">Sampling method</span> (see Settings tab) -> de-noising algorithm
- <span class="db8orange">Sampling Steps</span> (depends on Sampling Method)
- Size: 512x512 (most models are trained at this resolution)
- Batch count (how many runs) -> serial processing
- Batch size (how many images per run) -> parallel, needs more GPU memory!
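
----

### Batch count vs. batch size, sketched

A tiny sketch of how the two batch settings combine. The function name `total_images` is hypothetical; the point is that the total is the same while the memory pressure differs.

```python
def total_images(batch_count: int, batch_size: int) -> int:
    """Batches run one after another; each batch renders batch_size images at once."""
    if batch_count < 1 or batch_size < 1:
        raise ValueError("batch_count and batch_size must be at least 1")
    return batch_count * batch_size


# Same number of images, different GPU memory pressure:
print(total_images(batch_count=4, batch_size=1))  # four runs, one image in memory at a time
print(total_images(batch_count=1, batch_size=4))  # one run, four images in memory at once
```

On a GPU with little memory, raising the batch count is the safer way to get more images.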

----

### <a href="https://stable-diffusion-art.com/samplers/" target="_blank">Sampling method</a>

- Ordinary differential equations (ODE) solvers
  - Euler – simplest
  - Heun – more accurate, slower
  - LMS (Linear multi-step method)
- <span class="db8orange">A</span>ncestral samplers - add fresh noise at every step, so the result keeps changing with the step count
  - Euler a
  - DPM2 a
  - DPM++ 2S a

----

### Classifier Free Guidance

- CFG scale = how strictly the image follows the prompt
  - 2-6 = creative
  - 7-10 = creative/guided
  - 10-15 = detailed clear prompt
  - 16-20 = very detailed prompt
  - more than 20 mostly unusable
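
----

### CFG, sketched

A minimal sketch of the classifier-free guidance formula as it is commonly written: start at the prediction made without the prompt and move `cfg_scale` times toward the prediction made with it. The vectors are made-up numbers, not model output, and the WebUI's internals may differ in detail.

```python
def guided_noise(uncond: list[float], cond: list[float], cfg_scale: float) -> list[float]:
    """Combine the unconditional and prompt-conditioned noise predictions."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # made-up prediction without the prompt
cond = [1.0, 3.0]    # made-up prediction with the prompt

print(guided_noise(uncond, cond, 1.0))  # scale 1: simply the conditioned prediction
print(guided_noise(uncond, cond, 7.0))  # higher scale: pushed further toward the prompt
```

This is why very high values tend to become unusable: the prediction is pushed far beyond what the model produced for the prompt.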

----

### Seed
- initial random noise, -1 = random
- re-use seed = very similar image

source: <a href="https://stable-diffusion-art.com/samplers/" target="_blank">stable-diffusion-art.com/samplers</a>
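
----

### Seed, sketched

A small sketch of why re-using a seed reproduces the starting noise, using Python's own random generator as a stand-in. The function `initial_noise` is hypothetical; the WebUI draws its noise differently, but the principle is the same.

```python
import random


def initial_noise(seed: int, size: int = 4) -> list[float]:
    """Draw Gaussian noise from a generator seeded with `seed`."""
    if seed == -1:
        seed = random.randrange(2**32)  # -1 means: pick a random seed
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


print(initial_noise(1234) == initial_noise(1234))  # same seed, same noise
print(initial_noise(1234) == initial_noise(4321))  # different seed, different noise
```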

----

### Other

- Restore faces
- Tiling
- Hires. fix
- Dark mode: append ?__theme=dark to the URL
- Presets (styles): styles.csv
- <a href="https://stable-diffusion-art.com/automatic1111/" target="_blank">AUTOMATIC1111: A Beginner’s Guide</a>
- <a href="https://stable-diffusion-art.com/consistent-face/" target="_blank">5 methods to generate consistent face with Stable Diffusion</a>

----

### Other tabs

- img2img
- PNG Info - read the prompt and parameters stored in a generated PNG
- Extensions

----

#### Better Results

After changing the Checkpoint (model), Sampling Method and parameters:

----

#### Better Results

---

<div class="title">Using Stable Diffusion <span class="db8orange">via API</span></div>

----

### Stable Diffusion <span class="db8orange">via API</span>

In `webui-user.sh`:
> export COMMANDLINE_ARGS="--skip-torch-cuda-test --medvram --api"

----

### Stable Diffusion <span class="db8orange">via API</span>

http://127.0.0.1:7860/docs
<img style="border:0px;width:1650px" src="images/24-linux-ai-images/api-screen.png">
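
----

### Calling the API, sketched

A hedged sketch of a text-to-image request against a local WebUI started with `--api`, assuming the `/sdapi/v1/txt2img` endpoint and field names as listed on the `/docs` page; verify them there for your version. The helper names and the parameter values are illustrative, not recommendations.

```python
import base64
import json
import urllib.request

API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payload(prompt: str, seed: int = -1) -> dict:
    """Request body mirroring the settings from the Parameters slide."""
    return {
        "prompt": prompt,
        "negative_prompt": "",
        "sampler_name": "Euler a",
        "steps": 20,
        "cfg_scale": 7,
        "width": 512,
        "height": 512,
        "seed": seed,
        "n_iter": 1,      # batch count
        "batch_size": 1,  # images per batch
    }


def decode_images(response: dict) -> list[bytes]:
    """The response is assumed to carry base64 strings under "images"."""
    return [base64.b64decode(item) for item in response.get("images", [])]


def txt2img(prompt: str, url: str = API_URL) -> list[bytes]:
    """POST the payload as JSON and return the decoded image bytes."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return decode_images(json.load(reply))


def save_images(prompt: str, stem: str = "unwanted-mail") -> None:
    """Write each returned image to <stem>-<index>.png."""
    for index, data in enumerate(txt2img(prompt)):
        with open(f"{stem}-{index}.png", "wb") as handle:
            handle.write(data)
```

With the WebUI running, `save_images("a photo realistic image of unwanted mail")` writes the result to the current directory.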

---

<div class="title"><span class="db8orange">Questions?</span></div>

----

## Photo Credits

<ul style="font-size:50%">
<li>https://pixabay.com/photos/notebook-paper-pages-open-731212/</li>
<li>https://unsplash.com/photos/landmark-poster-lot-QNc9tTNHRyI</li>
<li>https://unsplash.com/photos/white-clouds-and-blue-sky-during-daytime-A9_IsUtjHm4</li>
<li>https://unsplash.com/photos/a-robot-made-out-of-legos-sitting-on-a-table-jMDtJtFs8EQ</li>
<li>https://unsplash.com/photos/three-drinking-glasses-Y1ge0B9_oGE</li>
<li>https://unsplash.com/photos/mona-lisa-painting-0WQOCx1g8hw</li>
<li>https://unsplash.com/photos/a-laptop-computer-sitting-on-top-of-a-wooden-desk-b25Eso94UH0</li>
<li>https://unsplash.com/photos/white-and-blue-tablet-computer-keyboard-VFiQvZPlm2k</li>
<li>https://unsplash.com/photos/a-group-of-electronic-devices-yJVpnfqu8GY</li>
<li>https://unsplash.com/photos/person-holding-click-pen-FwF_fKj5tBo</li>
<li>https://unsplash.com/photos/macro-photography-of-black-circuit-board-FO7JIlwjOtU</li>
<li>https://unsplash.com/photos/question-mark-neon-signage-8xAA0f9yQnE</li>
</ul>

</section>