Today’s Text Prompt Engineering Experiment 1 uses the Meta AI platform with the “Imagine” text prompt to generate images and animations. A high-level overview and technical details about text prompt engineering techniques follow the image examples in this experiment.
| Meta AI | Text Prompt |
| --- | --- |
| ![]() | sketch two transparent cats on a black background, uplight |
| ![]() | paint two transparent cats on a black background, uplight |
| ![]() | Two transparent cats |
| ![]() | two cats |
| ![]() | two cats, high contrast, shadows, three point lighting |
| ![]() | needlepoint art of two cats, high contrast, shadows, three point lighting |
| ![]() | needlepoint art of two cats, high contrast, shadows, downlight |
| ![]() | intarsia art of two cats, high contrast, shadows, leftlight, photo realistic |
| ![]() | sharpie graffiti art of two cats, high contrast, shadows, downlight, photo realistic |
| ![]() | sharpie graffiti art of two cats, high contrast, shadows, downlight |
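
Most of the prompts in the table change one element at a time: the medium (needlepoint art, intarsia art, sharpie graffiti art), the lighting keyword, or the optional “photo realistic” modifier. As a rough sketch of how a set of variations like this could be generated, the Python below builds prompts in the same comma-separated pattern. The helper name `build_variants` and its parameters are my own illustrative choices and are not part of Meta AI; the output is plain text that still has to be entered into the “Imagine” prompt by hand.

```python
from itertools import product


def build_variants(mediums, subject, lightings, extras=("",)):
    """Return one prompt per (medium, lighting, extra) combination.

    Hypothetical helper for illustration only; it just formats strings.
    """
    prompts = []
    for medium, lighting, extra in product(mediums, lightings, extras):
        # Fixed modifiers copied from the table rows above.
        parts = [f"{medium} of {subject}", "high contrast", "shadows", lighting]
        if extra:
            parts.append(extra)
        prompts.append(", ".join(parts))
    return prompts


if __name__ == "__main__":
    for prompt in build_variants(
        mediums=["needlepoint art", "sharpie graffiti art"],
        subject="two cats",
        lightings=["three point lighting", "downlight"],
        extras=["", "photo realistic"],
    ):
        print(prompt)
```

Changing only one list at a time makes it easier to see which keyword caused a visible difference between two generated images.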
Of course, I asked Meta AI for a high-level overview and technical details about the text prompt engineering standards it expects:
MoniGarr Request:
regarding text prompts for image generation, what are the technical details
Meta AI Response:
A great question! Text-to-Image Models like myself use a technique called Prompt Engineering to generate images from text prompts. Here are some technical details:
Text Encoding:
The text prompt is first encoded into a numerical representation using a tokenizer (like BERT or CLIP). This converts the text into a sequence of tokens (words or characters) that can be processed by the model.
Embeddings:
The encoded text is then passed through an embedding layer, which converts the tokens into a dense vector representation (embeddings). These embeddings capture the semantic meaning of the text.
Image Generation:
The embeddings are then passed through a generator network (like a GAN or VAE), which uses the text representation to generate an image. This network consists of multiple layers that progressively refine the image, starting from a random noise vector.
Training:
The model is trained on a large dataset of text-image pairs, using a contrastive loss function (like CLIP’s) to learn the relationship between text and images. The goal is to minimize the distance between the text embeddings and the image embeddings.
Prompt Engineering:
The art of crafting effective text prompts to elicit desired images from the model. This involves understanding how the model interprets language, using specific keywords, and structuring the prompt to guide the generation process.
These technical details are a simplified overview, but I hope they give you a sense of the amazing technology behind text-to-image generation!
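
To make the encoding and embedding steps in the response above a little more concrete, here is a toy sketch in plain Python. It is not how Meta AI or CLIP works internally: the whitespace tokenizer, the hash-based `embed_token` function, and the 8-dimensional vectors are all stand-ins I made up for illustration. Real systems use learned subword tokenizers and trained embedding tables, and the vectors here carry no actual meaning.

```python
import hashlib
import math


def tokenize(prompt):
    """Toy tokenizer: lowercase, drop commas, split on whitespace."""
    return prompt.lower().replace(",", " ").split()


def embed_token(token, dim=8):
    """Map a token to a fixed pseudo-random vector (stand-in for a learned embedding)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    # Scale each byte from 0..255 into the range -1.0..1.0.
    return [(byte / 255.0) * 2.0 - 1.0 for byte in digest[:dim]]


def embed_prompt(prompt, dim=8):
    """Average the token vectors into a single prompt vector."""
    tokens = tokenize(prompt)
    if not tokens:
        return [0.0] * dim
    vectors = [embed_token(token, dim) for token in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine similarity; CLIP-style contrastive training raises this score
    for matching text-image pairs and lowers it for mismatched ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    first = embed_prompt("two cats, high contrast, shadows, downlight")
    second = embed_prompt("two cats, high contrast, shadows, uplight")
    print(cosine_similarity(first, second))
```

The point of the sketch is only the shape of the pipeline: text becomes tokens, tokens become vectors, and vectors can be compared with a similarity score.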
—————————————————————————
MoniGarr Request:
provide technical information regarding the prompt engineering keywords and structure
Meta AI Response:
Prompt Engineering is a crucial aspect of text-to-image models like myself. Here are some technical details on keywords and structure:
KEYWORDS:
- Objects: Include specific objects you want to appear in the image (e.g., “cat,” “car,” “tree”).
- Attributes: Describe the objects’ properties (e.g., “red car,” “big house,” “happy dog”).
- Actions: Define actions or poses for the objects (e.g., “running cat,” “flying bird,” “smiling person”).
- Scenes: Set the context or environment (e.g., “beach,” “city,” “forest”).
- Styles: Specify art styles or genres (e.g., “Impressionist,” “anime,” “sketch”).
- Emotions: Convey emotions or moods (e.g., “peaceful,” “energetic,” “mysterious”).
STRUCTURE:
- Simple: Use a single sentence or phrase (e.g., “A black cat on a white background”).
- Compound: Combine multiple objects or scenes (e.g., “A cat and dog playing on a beach”).
- Adjective-Noun: Use a descriptive adjective followed by a noun (e.g., “A vibrant sunset over a tranquil lake”).
- Verb-Object: Use an action verb followed by an object (e.g., “A bird flying over a mountain”).
- Scene-Object: Set a scene and include an object (e.g., “A city street with a red sports car”).
TIPS:
- Be specific and concise
- Use relevant keywords
- Experiment with different structures and phrasing
- Avoid ambiguity and unclear requests
By leveraging these keywords and structures, you can craft effective prompts that help me generate amazing images for you!
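
The keyword categories and structures listed above can be treated as slots in a small template. The sketch below is one possible way to do that in Python, assuming a simple word order of style, attributes, object, action, scene, and mood. The `PromptSpec` class and `render` function are hypothetical names of my own; Meta AI does not publish or require this format.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Illustrative container for the keyword categories (hypothetical)."""
    obj: str                                         # Objects: "cat", "car", "tree"
    attributes: list = field(default_factory=list)   # Attributes: "red", "big"
    action: str = ""                                 # Actions: "running", "flying"
    scene: str = ""                                  # Scenes: "on a beach"
    style: str = ""                                  # Styles: "sketch", "anime"
    mood: str = ""                                   # Emotions: "peaceful"


def render(spec):
    """Join the filled-in slots into a single prompt string."""
    # Attributes come before the object ("red car"); action and scene follow it.
    subject = " ".join(spec.attributes + [spec.obj])
    if spec.action:
        subject = f"{subject} {spec.action}"
    if spec.scene:
        subject = f"{subject} {spec.scene}"
    if spec.style:
        subject = f"{spec.style} of {subject}"
    parts = [subject]
    if spec.mood:
        parts.append(spec.mood)
    return ", ".join(parts)


if __name__ == "__main__":
    spec = PromptSpec(
        obj="cats",
        attributes=["two", "transparent"],
        scene="on a black background",
        style="sketch",
        mood="mysterious",
    )
    print(render(spec))
```

Leaving a slot empty simply drops it from the prompt, which makes it easy to follow the “experiment with different structures” tip by filling in or clearing one slot at a time.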
References:
Meta AI. (n.d.). Meta AI. https://www.meta.ai/
Text Prompt Tips: materials. (2023, January 18). Retrieved June 27, 2024, from https://aigeneration.blog/2023/01/18/text-prompt-tips-art-materials/









