BrandFusion: Aligning Image Generation with Brand Styles

Abstract

While recent text-to-image models excel at generating realistic content, they struggle to capture the nuanced visual characteristics that define a brand’s distinctive style, such as lighting preferences, photography genres, color palettes, and compositional choices. This work introduces BrandFusion, a novel framework that automatically generates brand-aligned promotional images by decoupling brand style learning from image generation. Our approach consists of two components: a Brand-aware Vision-Language Model (BrandVLM) that predicts brand-relevant style characteristics and corresponding visual embeddings from marketer-provided contextual information, and a Brand-aware Diffusion Model (BrandDM) that generates images conditioned on these learned style representations. Unlike existing personalization methods that require separate fine-tuning for each brand, BrandFusion maintains scalability while preserving interpretability through textual style characteristics. Our method generalizes effectively to unseen brands by leveraging common industry sector-level visual patterns. Extensive evaluation demonstrates consistent improvements over existing approaches across multiple brand alignment metrics, with a 66.11% preference rate in a human evaluation study. This work paves the way for AI-assisted on-brand content creation in marketing workflows.
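The sketch below illustrates the two-stage pipeline described in the abstract: a brand-aware vision-language model turns marketer-provided brand context into textual style traits plus a visual style embedding, and a brand-aware diffusion model consumes that embedding as a conditioning signal. It is a minimal toy, not the authors' implementation; every class, dimension, and method name (BrandVLM, BrandDM, style_head, etc.) is a hypothetical placeholder chosen for illustration.

```python
# Conceptual sketch of the BrandFusion two-stage pipeline (hypothetical
# placeholders, not the paper's actual architecture or code).
import torch
import torch.nn as nn


class BrandVLM(nn.Module):
    """Toy stand-in for the Brand-aware VLM: maps an encoded brand context
    to (a) scores for textual style traits and (b) a visual style embedding."""

    def __init__(self, ctx_dim: int = 512, style_dim: int = 768):
        super().__init__()
        self.style_head = nn.Linear(ctx_dim, style_dim)  # visual style embedding
        self.trait_head = nn.Linear(ctx_dim, 4)          # e.g. lighting, genre, palette, composition

    def forward(self, brand_context: torch.Tensor):
        return self.style_head(brand_context), self.trait_head(brand_context)


class BrandDM(nn.Module):
    """Toy stand-in for the Brand-aware diffusion model: one denoising step
    conditioned on the learned brand style embedding."""

    def __init__(self, style_dim: int = 768, image_dim: int = 3 * 64 * 64):
        super().__init__()
        self.denoiser = nn.Linear(style_dim + image_dim, image_dim)

    def forward(self, noisy_image: torch.Tensor, style_embedding: torch.Tensor):
        cond = torch.cat([noisy_image.flatten(1), style_embedding], dim=-1)
        return self.denoiser(cond).view_as(noisy_image)


# Usage: the VLM output conditions generation directly, so no per-brand
# fine-tuning is needed at inference time.
brand_context = torch.randn(1, 512)             # marketer-provided context, already encoded
vlm, dm = BrandVLM(), BrandDM()
style_emb, trait_scores = vlm(brand_context)
image = dm(torch.randn(1, 3, 64, 64), style_emb)
print(image.shape)                              # torch.Size([1, 3, 64, 64])
```

The point of the sketch is the decoupling: brand style is learned once by the VLM stage and passed to the generator as an embedding, which is what lets the approach scale across brands without separate fine-tuning.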

Publication
The IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Parul
Machine Learning Research Scientist
Varun Khurana
Research Associate
Yaman Kumar Singla
Senior Research Scientist
Balaji Krishnamurthy
Senior Principal Scientist and Senior Director