smart_toyAI/PROMPT ENGINEERING

Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7

sourceSimon Willison

calendar_todayApril 16, 2026

schedule2 min read

lightbulb

EXECUTIVE SUMMARY

Qwen3.6-35B-A3B Outshines Claude Opus 4.7 in Creative AI Benchmark

Summary

The article discusses a humorous benchmark comparison between two AI models, Qwen3.6-35B-A3B from Alibaba and Claude Opus 4.7 from Anthropic, focusing on their ability to generate creative illustrations. The author concludes that Qwen3.6 produces superior results for the given tasks.

Key Points

Qwen3.6-35B-A3B is a 20.9GB quantized model developed by Alibaba.
Claude Opus 4.7 is a new release from Anthropic.
The benchmark involves generating illustrations of a pelican riding a bicycle and a flamingo riding a unicycle.
The author ran Qwen3.6 on a MacBook Pro M5 using LM Studio and the llm-lmstudio plugin.
Qwen3.6 consistently outperformed Claude Opus 4.7 in generating the requested illustrations.
The pelican benchmark is described as a joke, highlighting the absurdity of comparing AI models.
Despite the humorous nature, there is a correlation between the quality of outputs and the models' general usefulness.
The author expresses skepticism about the actual utility of Qwen's latest model compared to Anthropic's proprietary release.

Analysis

This comparison illustrates the ongoing competition in the AI model landscape, particularly in generative tasks. While the benchmark is lighthearted, it reflects the increasing capabilities of AI models and their potential applications in creative fields.

Conclusion

IT professionals should explore the capabilities of both Qwen3.6 and Claude Opus for creative tasks, considering the context of their use. Staying updated on these developments can inform decisions on which AI tools to integrate into workflows.