How to Build an AI-Powered WhatsApp Sticker Generator with Python

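Author: Li Rui 2025-11-11 09:54:38

This article explains how to build an AI-powered WhatsApp sticker generator with Python. The system runs in Google Colab, accepts either a freshly captured photo or an uploaded image, and turns everyday photos into personalized caricature or Pixar-style stickers.

Translator | Li Rui

Reviewer | Chong Lou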

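Imagine sending memes and cartoon stickers that are entirely your own, without relying on generic material from the web. With OpenAI's recently released GPT-Image-1 model, you can turn selfies or everyday photos into funny or stylistically distinctive personalized stickers. This article shows how to build a WhatsApp sticker generator with Python in Colab that supports several artistic treatments, including caricature and Pixar-style filters.

Specifically, it covers how to set up the OpenAI image editing API, capture or upload an image in Colab, choose a preset funny caption or enter custom text, and use multiple API keys to generate three stickers in different styles in parallel, which shortens the overall wait. The end result is a sticker-making tool driven by GPT-Image-1 and custom text prompts.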

Why GPT-Image-1?

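For this project, several leading image generation models, including Gemini 2.0 Flash, Flux, and Phoenix, were evaluated on the Leonardo.ai platform. The evaluation found that these models generally struggle to render text and facial expressions accurately. For example:

Google's Gemini 2.0 image API frequently produces misspellings or jumbled text even when given explicit instructions. For example, a prompt containing "Big Sale Today!" may come back as "Big Sale Todai" or as random gibberish.
Flux produces high overall image quality, but users commonly report that it "tends to introduce subtle errors when rendering text." Misspellings and garbling become more noticeable as the text gets longer. The model also tends to generate very similar facial features, so unless it is tightly constrained, every face starts to look alike.
Phoenix is optimized for image fidelity and prompt adherence, but like most diffusion models it treats text as a visual element rather than as semantic content, so text errors are common. In testing, Phoenix produced correctly worded stickers only occasionally, and it kept returning the same default facial features for similar prompts.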

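In short, these limitations in existing models motivated the choice of GPT-Image-1 for this project. Unlike the setups above, the GPT-Image-1 workflow used here relies on a specially tuned prompt pipeline that explicitly steers the model toward correct text and varied expressions.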

How GPT-Image-1 Handles Image Editing

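GPT-Image-1 is OpenAI's flagship multimodal model. It can create and edit images from text and image prompts and produces high-quality output. Its key capability for this project is applying specified edits to an original image based on text instructions. In this article, the GPT-Image-1 API is called to apply a playful, humorous filter to the input photo and overlay text on it, producing a personalized sticker.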

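A carefully constructed prompt constrains the model to output images that meet the sticker specification (1024×1024, PNG format). In effect, GPT-Image-1 becomes an AI-powered sticker creator: it changes the appearance of the subject in the photo and adds humorous text, automating the whole sticker-creation process.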
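
Because the sticker specification is enforced only through the prompt, it can be worth checking the returned file before using it. The following is a minimal sketch and not part of the original project: the helper names `png_dimensions` and `matches_sticker_spec`, and the 1024-pixel default, are assumptions for illustration. It reads the width and height straight from the PNG header using only the standard library.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of a PNG byte string."""
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE) or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers at offsets 16 and 20.
    return struct.unpack(">II", data[16:24])

def matches_sticker_spec(data: bytes, size: int = 1024) -> bool:
    """Hypothetical check: True if the bytes are a square PNG of the given size."""
    try:
        return png_dimensions(data) == (size, size)
    except ValueError:
        return False
```

A check like this could run on the decoded bytes right before they are written to disk, so that an off-spec result is reported instead of silently saved.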

Python

# Set up OpenAI clients for each API key (to run parallel requests)
clients = [OpenAI(api_key=key) for key in API_KEYS]

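A separate OpenAI client is created for each API key. With three independent keys configured, three API calls can run at the same time. This multi-key, multi-threaded setup relies on ThreadPoolExecutor for parallel processing, so each run generates three stickers concurrently. As the program's output indicates, the stickers are generated "in parallel with 3 API keys," which noticeably speeds up sticker creation.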
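
The article pairs exactly three styles with three keys. If the number of jobs and keys ever differ, one simple option is round-robin assignment. This is a minimal sketch under that assumption; `assign_clients` is a hypothetical helper that does not appear in the original code.

```python
def assign_clients(num_tasks: int, num_clients: int) -> list[int]:
    """Map each task index to a client index, cycling through the clients."""
    if num_clients < 1:
        raise ValueError("at least one client is required")
    return [task % num_clients for task in range(num_tasks)]

# Five sticker jobs spread over three API keys.
print(assign_clients(5, 3))  # [0, 1, 2, 0, 1]
```

With three jobs and three keys this reduces to the one-to-one mapping used in the article.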

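Step-by-Step Guide

Building your own AI sticker generator may sound complicated, but this guide breaks it into manageable steps. It starts with setting up the development environment in Google Colab, then covers using the API, understanding the phrase categories, validating text, generating stickers in different styles, and finally generating several stickers in parallel. Each step comes with code and an explanation. Let's start coding.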

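Installing and Running in Colab

A correct setup is a prerequisite for generating stickers. This project uses Python libraries such as PIL and rembg for basic image processing, along with the google-genai library for calling related services from the Colab instance. The first step is to install these dependencies directly in the Colab notebook.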

Python

!pip install --upgrade google-genai openai pillow rembg
!pip install --upgrade onnxruntime
!pip install python-dotenv

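OpenAI Integration and API Keys

After installation, import the modules and set the API keys. The script creates one OpenAI client per API key, which lets the code distribute image editing requests across several keys in parallel. The resulting list of clients is then used by the sticker generation functions.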

Python

API_KEYS = [  # 3 API keys
    "API KEY 1",
    "API KEY 2",
    "API KEY 3",
]
# Stickerverse
import os
import random
import base64
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed
from openai import OpenAI
from PIL import Image
from io import BytesIO
from rembg import remove
from google.colab import files
from IPython.display import display, Javascript
from google.colab.output import eval_js
import time
clients = [OpenAI(api_key=key) for key in API_KEYS]
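
Hardcoding keys in a notebook makes them easy to leak. As an alternative, the keys could be read from environment variables. This is a minimal sketch: the variable names `OPENAI_API_KEY_1`, `OPENAI_API_KEY_2`, and so on, and the helper `load_api_keys`, are assumptions and not part of the original script.

```python
import os
from typing import Mapping, Optional

def load_api_keys(env: Optional[Mapping[str, str]] = None,
                  prefix: str = "OPENAI_API_KEY_") -> list[str]:
    """Collect non-empty values of variables named <prefix>1, <prefix>2, ... in order."""
    env = os.environ if env is None else env
    keys = []
    index = 1
    # Stop at the first missing or empty variable.
    while env.get(f"{prefix}{index}"):
        keys.append(env[f"{prefix}{index}"])
        index += 1
    return keys
```

The result could replace the literal `API_KEYS` list. Since python-dotenv is installed in the setup step, calling its `load_dotenv()` beforehand is one way to populate the environment from a `.env` file.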

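Image Upload and Capture (Logic)

The next step is to obtain an image, either by taking a photo with the camera or by uploading a file. capture_photo() injects JavaScript into Colab to open the webcam, saves the captured frame, and returns its filename. upload_image() uses Colab's file upload widget and validates the file with PIL.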

Python

# Camera capture via JS
def capture_photo(filename='photo.jpg', quality=0.9):
    js_code = """
    async function takePhoto(quality) {
        const div = document.createElement('div');
        const video = document.createElement('video');
        const btn = document.createElement('button');
        btn.textContent = 'Capture';
        div.appendChild(video);
        div.appendChild(btn);
        document.body.appendChild(div);
        const stream = await navigator.mediaDevices.getUserMedia({video: true});
        video.srcObject = stream;
        await video.play();
        await new Promise(resolve => btn.onclick = resolve);
        const canvas = document.createElement('canvas');
        canvas.width = video.videoWidth;
        canvas.height = video.videoHeight;
        canvas.getContext('2d').drawImage(video, 0, 0);
        stream.getTracks().forEach(track => track.stop());
        div.remove();
        return canvas.toDataURL('image/jpeg', quality);
    }
    """
    display(Javascript(js_code))
    data = eval_js("takePhoto(%f)" % quality)
    binary = base64.b64decode(data.split(',')[1])
    with open(filename, 'wb') as f:
        f.write(binary)
    print(f"Saved: {filename}")
    return filename
# Image upload function
def upload_image():
    print("Please upload your image file...")
    uploaded = files.upload()
    if not uploaded:
        print("No file uploaded!")
        return None
    filename = list(uploaded.keys())[0]
    print(f"Uploaded: {filename}")
    # Validate if it's an image
    try:
        img = Image.open(filename)
        img.verify()
        print(f"Image verified: {img.format} {img.size}")
        return filename
    except Exception as e:
        print(f"Invalid image file: {str(e)}")
        return None
# Interactive image source selection
def select_image_source():
    print("Choose image source:")
    print("1. Capture from camera")
    print("2. Upload image file")
    while True:
        try:
            choice = input("Select option (1-2): ").strip()
            if choice == "1":
                return "camera"
            elif choice == "2":
                return "upload"
            else:
                print("Invalid choice! Please enter 1 or 2.")
        except KeyboardInterrupt:
            print("\nGoodbye!")
            return None
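
capture_photo() relies on the browser returning a base64 data URL, which it splits on the comma before decoding. A slightly more defensive version of that step might look like the sketch below. The helper name `decode_data_url` is hypothetical and not part of the original code.

```python
import base64

def decode_data_url(data_url: str) -> tuple[str, bytes]:
    """Split a base64 data URL into (mime_type, raw_bytes)."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    # Strip the "data:" prefix and ";base64" suffix to get the MIME type.
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)
```

Returning the MIME type alongside the bytes makes it possible to confirm that the camera actually delivered `image/jpeg` before the file is written.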

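Categories and Example Phrases

Next come the phrase categories used for sticker captions. A PHRASE_CATEGORIES dictionary defines several themes, such as corporate, Bollywood, Hollywood, Tollywood, sports, and memes. When the user picks a category, the system randomly selects three distinct phrases from it and applies one to each of the three sticker styles. The excerpt below shows three of these categories.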

Python

PHRASE_CATEGORIES = {
    "corporate": [
        "Another meeting? May the force be with you!",
        "Monday blues activated!",
        "This could have been an email, boss!"
    ],
    "bollywood": [
        "Mogambo khush hua!",
        "Kitne aadmi the?",
        "Picture abhi baaki hai mere dost!"
    ],
    "memes": [
        "Bhagwan bharose!",
        "Main thak gaya hoon!",
        "Beta tumse na ho payega!"
    ]
}
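
Selecting three distinct phrases works only when a category holds at least three entries. The sketch below wraps `random.sample` with explicit checks; `pick_phrases` is a hypothetical helper, and the original code calls `random.sample` directly.

```python
import random

def pick_phrases(categories: dict[str, list[str]], category: str, count: int = 3) -> list[str]:
    """Pick `count` distinct phrases, or raise a clear error if that is impossible."""
    phrases = categories.get(category)
    if phrases is None:
        raise KeyError(f"unknown category: {category!r}")
    if len(phrases) < count:
        raise ValueError(f"{category!r} has only {len(phrases)} phrases, need {count}")
    # random.sample never repeats an element, so the phrases are distinct.
    return random.sample(phrases, count)
```

For example, `pick_phrases(PHRASE_CATEGORIES, "memes")` would return the three meme phrases in random order.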

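Categories and Custom Text

The generator ships with a preset dictionary of phrase categories. Users can either have funny phrases drawn at random from a chosen category or type their own text. Interactive helper functions handle the selection, and a simple length check ensures that a custom phrase fits the sticker requirements.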

Python

def select_category_or_custom():
    print("\nChoose your sticker text option:")
    print("1. Pick from phrase category (random selection)")
    print("2. Enter my own custom phrase")
    while True:
        try:
            choice = input("Choose option (1 or 2): ").strip()
            if choice == "1":
                return "category"
            elif choice == "2":
                return "custom"
            else:
                print("Invalid choice! Please enter 1 or 2.")
        except KeyboardInterrupt:
            print("\nGoodbye!")
            return None
# NEW: Function to get custom phrase from user
def get_custom_phrase():
    while True:
        phrase = input("\nEnter your custom sticker text (2-50 characters): ").strip()
        if len(phrase) < 2:
            print("Too short! Please enter at least 2 characters.")
            continue
        elif len(phrase) > 50:
            print("Too long! Please keep it to 50 characters or fewer.")
            continue
        else:
            print(f"Custom phrase accepted: '{phrase}'")
            return phrase

A custom phrase is accepted only after its length has been checked (2 to 50 characters).
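
The length rule is easier to test when it is separated from the `input()` loop. The following is a minimal sketch of that idea; `phrase_error` is a hypothetical helper that applies the same 2 to 50 character limits as get_custom_phrase().

```python
from typing import Optional

def phrase_error(phrase: str, min_len: int = 2, max_len: int = 50) -> Optional[str]:
    """Return an error message for an invalid phrase, or None if it is acceptable."""
    length = len(phrase.strip())
    if length < min_len:
        return f"Too short: need at least {min_len} characters."
    if length > max_len:
        return f"Too long: keep it to {max_len} characters or fewer."
    return None
```

The interactive function could then loop until `phrase_error(...)` returns `None`, printing the message otherwise.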

Phrase Validation and Spelling Guardrails

Python

def validate_and_correct_spelling(text):
    spelling_prompt = f"""
    Please check the spelling and grammar of the following text and return ONLY the corrected version.
    Do not add explanations, comments, or change the meaning.
    Text to check: "{text}"
    """
    response = clients[0].chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": spelling_prompt}],
        max_tokens=100,
        temperature=0.1
    )
    corrected_text = response.choices[0].message.content.strip()
    return corrected_text
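
A language model asked to fix spelling may occasionally reply with commentary or rewrite the phrase entirely. One possible extra guardrail, which is not in the original script, is to accept the correction only when it stays close to the input. In the sketch below, `accept_correction` and its 0.6 similarity threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

def accept_correction(original: str, corrected: str, threshold: float = 0.6) -> str:
    """Return the corrected text only if it stays close to the original."""
    # Drop whitespace and any quotes the model may have wrapped around the answer.
    candidate = corrected.strip().strip('"\'')
    if not candidate:
        return original
    similarity = SequenceMatcher(None, original.lower(), candidate.lower()).ratio()
    return candidate if similarity >= threshold else original
```

Applied to the return value of validate_and_correct_spelling(), this keeps small fixes such as "Todai" to "Today" while falling back to the user's own text when the reply drifts too far.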

Next is a sample build_prompt function that gives the model a baseline set of instructions. Note that build_prompt() first calls the spelling validator and then embeds the corrected text in a strict prompt template:

Python

# Concise Prompt Builder with Spelling Validation
def build_prompt(text, style_variant):
    corrected_text = validate_and_correct_spelling(text)
    base_prompt = f"""
    Create a HIGH-QUALITY WhatsApp sticker in {style_variant} style.
    OUTPUT:
    - 1024x1024 transparent PNG with 8px white border
    - Subject centered, balanced composition, sharp details
    - Preserve original facial identity and proportions
    - Match expression to sentiment of text: '{corrected_text}'
    TEXT:
    - Use EXACT text: '{corrected_text}' (no changes, no emojis)
    - Bold comic font with black outline, high-contrast colors
    - Place text in empty space (top/bottom), never covering the face
    RULES:
    - No hallucinated elements or decorative glyphs
    - No cropping of head/face or text
    - Maintain realistic but expressive look
    - Ensure consistency across stickers
    """
    return base_prompt.strip()

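Style Variants: Caricature vs. Pixar

The three style templates are stored in STYLE_VARIANTS. The first two apply exaggerated caricature transformations, and the third produces a Pixar-style 3D look. These strings are passed directly to the prompt builder and determine the final visual style.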

Python

STYLE_VARIANTS = [
    "Transform into detailed caricature with slightly exaggerated facial features...",
    "Transform into expressive caricature with enhanced personality features...",
    "Transform into high-quality Pixar-style 3D animated character..."
]
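
The logging code later in the script derives a short label from each style string by looking for the word "caricature". That rule can be captured in one small function. `style_label` is a hypothetical name, and the demo strings are shortened stand-ins, not the real templates.

```python
def style_label(style_variant: str) -> str:
    """Label a style string as 'Caricature' if it mentions caricature, else 'Pixar'."""
    return "Caricature" if "caricature" in style_variant.lower() else "Pixar"

# Shortened, hypothetical style strings used only for this demonstration.
demo_styles = ["Transform into detailed caricature", "Transform into Pixar-style 3D character"]
print([style_label(s) for s in demo_styles])  # ['Caricature', 'Pixar']
```

Note that anything not mentioning "caricature" is labelled "Pixar", so adding a fourth style would require extending this rule.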

Generating Stickers in Parallel

The project's real strength is generating stickers in parallel. Using multithreading, the system runs three generation tasks at once, each with its own API key, which significantly shortens the wait.

Python

# Generate single sticker using OpenAI GPT-image-1 with specific client (WITH TIMING)
def generate_single_sticker(input_path, output_path, text, style_variant, client_idx):
    try:
        start_time = time.time()
        thread_id = threading.current_thread().name
        print(f"[START] Thread-{thread_id}: API-{client_idx+1} generating {style_variant[:30]}... at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        prompt = build_prompt(text, style_variant)
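        # Open the input inside a context manager so the file handle is always closed
        with open(input_path, "rb") as image_file:
            result = clients[client_idx].images.edit(
                model="gpt-image-1",
                image=[image_file],
                prompt=prompt,
                # input_fidelity="high"
                quality='medium'
            )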
        image_base64 = result.data[0].b64_json
        image_bytes = base64.b64decode(image_base64)
        with open(output_path, "wb") as f:
            f.write(image_bytes)
        end_time = time.time()
        duration = end_time - start_time
        style_type = "Caricature" if "caricature" in style_variant.lower() else "Pixar"
        print(f"[DONE] Thread-{thread_id}: {style_type} saved as {output_path} | Duration: {duration:.2f}s | Text: '{text[:30]}...'")
        return True
    except Exception as e:
        print(f"[ERROR] API-{client_idx+1} failed: {str(e)}")
        return False
# NEW: Create stickers with custom phrase (all 3 styles use the same custom text)
def create_custom_stickers_parallel(photo_file, custom_text):
    print(f"\nCreating 3 stickers with your custom phrase: '{custom_text}'")
    print("   - Style 1: Caricature #1")
    print("   - Style 2: Caricature #2")
    print("   - Style 3: Pixar Animation")
    # Map futures to their info
    tasks_info = {}
    with ThreadPoolExecutor(max_workers=3, thread_name_prefix="CustomSticker") as executor:
        start_time = time.time()
        print(f"\n[PARALLEL START] Submitting 3 API calls SIMULTANEOUSLY at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        # Submit ALL tasks at once (non-blocking) - all using the same custom text
        for idx, style_variant in enumerate(STYLE_VARIANTS):
            output_name = f"custom_sticker_{idx+1}.png"
            future = executor.submit(generate_single_sticker, photo_file, output_name, custom_text, style_variant, idx)
            tasks_info[future] = {
                'output_name': output_name,
                'text': custom_text,
                'style_variant': style_variant,
                'client_idx': idx,
                'submit_time': time.time()
            }
        print("All 3 API requests submitted! Processing as they complete...")
        completed = 0
        completion_times = []
        # Process results as they complete
        for future in as_completed(tasks_info.keys(), timeout=180):
            try:
                success = future.result()
                task_info = tasks_info[future]
                if success:
                    completed += 1
                    completion_time = time.time()
                    completion_times.append(completion_time)
                    duration = completion_time - task_info['submit_time']
                    style_type = "Caricature" if "caricature" in task_info['style_variant'].lower() else "Pixar"
                    print(f"[{completed}/3] {style_type} completed: {task_info['output_name']} "
                          f"(API-{task_info['client_idx']+1}, {duration:.1f}s)")
                else:
                    print(f"Failed: {task_info['output_name']}")
            except Exception as e:
                task_info = tasks_info[future]
                print(f"Error with {task_info['output_name']} (API-{task_info['client_idx']+1}): {str(e)}")
        total_time = time.time() - start_time
        print(f"\n [FINAL RESULT] {completed}/3 custom stickers completed in {total_time:.1f} seconds!")
# UPDATED: Create 3 stickers in  PARALLEL (using as_completed)
def create_category_stickers_parallel(photo_file, category):
    if category not in PHRASE_CATEGORIES:
        print(f" Category '{category}' not found! Available: {list(PHRASE_CATEGORIES.keys())}")
        return
    # Choose 3 unique phrases for 3 stickers
    chosen_phrases = random.sample(PHRASE_CATEGORIES[category], 3)
    print(f" Selected phrases for {category.title()} category:")
    for i, phrase in enumerate(chosen_phrases, 1):
        style_type = "Caricature" if i <= 2 else "Pixar Animation"
        print(f"   {i}. [{style_type}] '{phrase}' → API Key {i}")
    # Map futures to their info
    tasks_info = {}
    with ThreadPoolExecutor(max_workers=3, thread_name_prefix="StickerGen") as executor:
        start_time = time.time()
        print(f"\n [PARALLEL START] Submitting 3 API calls SIMULTANEOUSLY at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        # Submit ALL tasks at once (non-blocking)
        for idx, (style_variant, text) in enumerate(zip(STYLE_VARIANTS, chosen_phrases)):
            output_name = f"{category}_sticker_{idx+1}.png"
            future = executor.submit(generate_single_sticker, photo_file, output_name, text, style_variant, idx)
            tasks_info[future] = {
                'output_name': output_name,
                'text': text,
                'style_variant': style_variant,
                'client_idx': idx,
                'submit_time': time.time()
            }
        print("All 3 API requests submitted! Processing as they complete...")
        print("   - API Key 1 → Caricature #1")
        print("   - API Key 2 → Caricature #2")
        print("   - API Key 3 → Pixar Animation")
        completed = 0
        completion_times = []
        # Process results as they complete (NOT in submission order)
        for future in as_completed(tasks_info.keys(), timeout=180):  # 3 minute total timeout
            try:
                success = future.result()  # This only waits until ANY future completes
                task_info = tasks_info[future]
                if success:
                    completed += 1
                    completion_time = time.time()
                    completion_times.append(completion_time)
                    duration = completion_time - task_info['submit_time']
                    style_type = "Caricature" if "caricature" in task_info['style_variant'].lower() else "Pixar"
                    print(f"[{completed}/3] {style_type} completed: {task_info['output_name']} "
                          f"(API-{task_info['client_idx']+1}, {duration:.1f}s) - '{task_info['text'][:30]}...'")
                else:
                    print(f"Failed: {task_info['output_name']}")
            except Exception as e:
                task_info = tasks_info[future]
                print(f"Error with {task_info['output_name']} (API-{task_info['client_idx']+1}): {str(e)}")
        total_time = time.time() - start_time
        print(f"\n[FINAL RESULT] {completed}/3 stickers completed in {total_time:.1f} seconds!")
        if len(completion_times) > 1:
            fastest_completion = min(completion_times) - start_time
            print(f"Parallel efficiency: Fastest completion in {fastest_completion:.1f}s")

Here, generate_single_sticker() builds the prompt and calls the image editing endpoint; its client_idx parameter selects which API client to use. The parallel layer creates a ThreadPoolExecutor with at most three worker threads, submits all generation tasks at once, and uses as_completed to collect and handle results as they arrive. This way, the script logs each sticker as soon as it is finished.

The logs track each thread's status, recording the time taken by each task and the style applied (caricature or Pixar), which supports monitoring and analysis of the run.
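
The submit-then-as_completed pattern can be tried without any API key. In the sketch below, `fake_generate` is a stand-in that just sleeps, and the job names and durations are made up for the demonstration. It shows that results arrive in completion order and that the total time is close to the slowest job rather than the sum of all three.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fake_generate(name: str, seconds: float) -> str:
    """Stand-in for an image API call: sleep, then return the job name."""
    time.sleep(seconds)
    return name

def run_parallel(jobs: dict[str, float]) -> list[str]:
    """Run all jobs at once and return their names in completion order."""
    finished = []
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as executor:
        futures = [executor.submit(fake_generate, name, secs) for name, secs in jobs.items()]
        # as_completed yields each future as soon as its job finishes.
        for future in as_completed(futures, timeout=10):
            finished.append(future.result())
    return finished

start = time.time()
order = run_parallel({"caricature_1": 0.4, "caricature_2": 0.1, "pixar": 0.25})
elapsed = time.time() - start
print(order, f"{elapsed:.2f}s")
```

The shortest job is reported first even though it was submitted second, which is the same behavior the sticker script relies on when it logs each result as it completes.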

Main Execution Block

At the bottom of the script, the __main__ guard runs sticker_from_camera() by default. You can comment or uncomment lines as needed to run interactive_menu(), create_all_category_stickers(), or other functions instead.

Python

# Main execution
if __name__ == "__main__":
    sticker_from_camera()

Output video:

https://cdn.analyticsvidhya.com/wp-content/uploads/2025/10/final_op_stickerverse_1.mp4

The complete code for this WhatsApp sticker generator is available in the project's GitHub repository.

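Conclusion

This article covered how to configure GPT-Image-1 calls, build sticker-generation prompts, obtain an image by capture or upload, choose a preset funny phrase or enter custom text, and generate three style variants in parallel. With only a few hundred lines of code, the project turns ordinary photos into personalized caricature-style stickers.

By combining OpenAI's vision model with creative prompt engineering and multithreading, users can generate fun, personalized stickers in seconds. The resulting AI-powered WhatsApp sticker generator creates stickers with one click, ready to share with friends and groups. Try it with your best photos and favorite funny phrases.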