Intermediate Domain Alignment and Morphology Analogy for Patent-Product Image Retrieval


Haifan Gong 1,2 †      Xuanye Zhang 1,2 †      Ruifei Zhang1,2      Yun Su3      Zhuo Li1,2      Yuhao Du1,2      Anningzhe Gao1,2      Xiang Wan1,2*      Haofeng Li1,2*     

† These authors contributed equally to this paper.

*Corresponding email: lihaofeng@cuhk.edu.cn

1The Chinese University of Hong Kong, Shenzhen       2Shenzhen Research Institute of Big Data       3University of Waterloo      



Introduction

Recent advances in artificial intelligence have significantly impacted image retrieval tasks, yet Patent-Product Image Retrieval (PPIR) has received limited attention. PPIR, which retrieves patent images based on product images to identify potential infringements, presents unique challenges: (1) both product and patent images often contain numerous categories of artificial objects, but models pre-trained on standard datasets exhibit limited discriminative power in recognizing such unseen objects; and (2) the significant domain gap between binary patent line drawings and colorful RGB product images further complicates similarity comparison for product-patent pairs. To address these challenges, we formulate PPIR as an open-set image retrieval task and introduce a comprehensive Patent-Product Image Retrieval Dataset (PPIRD), which includes a test set with 439 product-patent pairs, a retrieval pool of 727,921 patent images, and an unlabeled pre-training set of 3,799,695 images. We further propose a novel Intermediate Domain Alignment and Morphology Analogy (IDAMA) strategy. IDAMA maps both image types into an intermediate sketch domain using edge detection to minimize the domain discrepancy, and employs a Morphology Analogy Filter to select discriminative patent images based on visual features via analogical reasoning. Extensive experiments on PPIRD demonstrate that IDAMA significantly outperforms baseline methods (+7.58 mAR) and offers valuable insights into domain mapping and representation learning for PPIR.

Publication

Paper - Coming soon | GitHub

If you find our work useful, please consider citing it:

        
        Coming soon
        

Dataset


PPIR is formulated as an open-set image retrieval task to simulate real-world scenarios.
The Patent-Product Image Retrieval Dataset (PPIRD) comprises two components: (1) a test set containing 439 product-patent pairs alongside a retrieval pool of 727,921 patent images, and (2) an unlabeled pre-training set with 3,799,695 product/patent images. For each product and its potentially infringed patent, we additionally provide highly detailed product descriptions to support verification of the potential infringement. (The level of detail required and the difficulty of verifying potential product-patent infringement both limit the number of labeled product-patent pairs.)
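
To make the open-set setting concrete, the sketch below shows one way the retrieval pool could be used at evaluation time: every product query is ranked against all 727,921 patent images and recall is measured at several cut-offs. The paper reports mAR; the function name, the pre-extracted embeddings, and the Recall@K cut-offs here are illustrative assumptions rather than the exact evaluation protocol.

```python
import numpy as np

def mean_recall_at_k(query_embs, pool_embs, gt_indices, ks=(1, 5, 10)):
    """Rank the whole patent pool for every product query and report Recall@K.

    query_embs: (Q, D) L2-normalized product-image embeddings (hypothetical encoder).
    pool_embs:  (P, D) L2-normalized patent-image embeddings (the 727,921-image pool).
    gt_indices: length-Q array; gt_indices[q] is the pool index of the patent
                paired with product query q.
    """
    sims = query_embs @ pool_embs.T                   # cosine similarities, shape (Q, P)
    gt_scores = sims[np.arange(len(gt_indices)), gt_indices]
    # Rank of the ground-truth patent = number of pool items scored strictly higher.
    ranks = (sims > gt_scores[:, None]).sum(axis=1)   # 0 means the true patent is ranked first
    return {f"R@{k}": float((ranks < k).mean()) for k in ks}
```

In the open-set setting the encoder never sees the test categories during training, so the quality of the embeddings, rather than any classifier head, determines the ranking.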
The Intermediate Domain Alignment and Morphology Analogy (IDAMA) strategy for PPIR consists of an Intermediate Domain Mapping (IDM) strategy and a Morphology Analogy Filter (MAF) strategy: 1) IDM aligns binary line-drawing patent images and colorful RGB product images by mapping both into an intermediate sketch domain with an edge detector; we provide a theoretical analysis showing that this alignment effectively mitigates the domain discrepancy and enables more accurate retrieval. 2) MAF builds on a cognitive principle of morphology analogy, namely that an unknown object can be described by analogy to a known one: it uses high classification confidence (regardless of the predicted label) to select discriminative patent images for retrieval similarity comparison. By selecting discriminative views of unseen artificial objects, MAF captures the distinctive visual features of a patent for PPIR.
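
The minimal sketch below illustrates the two ideas, using OpenCV's Canny detector as a stand-in edge detector for IDM and an ImageNet-pretrained ResNet-50 as a stand-in classifier for MAF. Both choices, along with the function names and thresholds, are assumptions made for illustration and are not the authors' implementation.

```python
import cv2
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T

def to_sketch(image_bgr, low=50, high=150):
    """IDM-style mapping: project an RGB product image (or a patent line
    drawing) into an intermediate sketch domain via edge detection."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, low, high)                     # binary edge map
    return cv2.cvtColor(255 - edges, cv2.COLOR_GRAY2RGB)   # dark strokes on white, sketch-like

@torch.no_grad()
def select_discriminative_views(patent_images, classifier, preprocess, top_k=1):
    """MAF-style filtering: keep the patent views on which a classifier is most
    confident (the predicted label itself is ignored), treating high confidence
    as a proxy for distinctive morphology."""
    confidences = []
    for img in patent_images:                              # each img: HxWx3 uint8 RGB array
        logits = classifier(preprocess(img).unsqueeze(0))
        confidences.append(torch.softmax(logits, dim=1).max().item())
    order = np.argsort(confidences)[::-1]
    return [patent_images[i] for i in order[:top_k]]

# Stand-in classifier and preprocessing (assumed, not the paper's backbone):
preprocess = T.Compose([
    T.ToPILImage(),
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
classifier = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2).eval()

# Example usage (paths and variables are placeholders):
# product_sketch = to_sketch(cv2.imread("product.jpg"))
# best_views = select_discriminative_views(patent_view_list, classifier, preprocess)
```

In this sketch, both the product photo and the patent drawing end up as edge maps before any feature extraction, so the retrieval model compares images within a single sketch-like domain; the confidence-based filter then decides which patent views are worth embedding at all.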