SIGIR 2007 Proceedings Poster

Document Layout and Color Driven Image Retrieval

Pere Obrador
Hewlett-Packard Laboratories
1501 Page Mill Road
Palo Alto, California 94304, USA
+1 650 857 42 70
pere.obrador@hp.com

ABSTRACT
This paper presents a contribution to image indexing applied to the document creation task. The presented method ranks a set of photographs by how well they work aesthetically within a predefined document. Color harmony, document visual balance and image quality are taken into consideration. A user study conducted with people with a range of expertise in document creation helped identify the visual features the algorithm should consider. The method shows some benefits for the traditional document creation task, as well as for the case of ever-changing web page banner colors and layout.

documents and images. The selection of features was driven by such findings, and the way they were combined is explained below. The results at the end of this paper show that these two features work in a somewhat orthogonal way, and that the combination of the two produces very promising results.

2. USER STUDIES
A set of documents was printed with a diverse set of images and shown to 8 subjects: an expert photo-book creator, an expert illustrator/publisher, an expert in color science, two experienced photographers, and three other users. Their feedback has been condensed into the following list of findings:
F1) The images should have little clutter, with well-defined homogeneous regions (also described in [4]).
F2) Left-right symmetrical visual balance [6] is preferred over center symmetry for document balancing.
F3) Analogous color harmonies [5] are preferred, with large homogeneous color patches representing such colors. One main color with one accent color seemed to be preferred.
F4) Slight color tone differences between regions are singled out.
F5) High contrasts between color patches in the document and analogous color patches in the image are undesirable; i.e., having the analogous colors close together is favorable.
F6) Users will reject images if one of the features, either visual balance or color harmony, is below a certain threshold, no matter how good the other feature may be. This threshold seems to depend on each user's level of expertise.
F7) Chosen images have to be above a certain quality threshold.

Categories and Subject Descriptors
H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing - Indexing methods.

General Terms
Algorithms, Experimentation, Human Factors.

Keywords
Image analysis and indexing, document balance, color harmony.

1. INTRODUCTION
When selecting an image to accompany a document, graphic artists/illustrators will usually follow these basic steps: (a) select the image based on content (i.e., semantic relevance to the article); (b) assess image quality; (c) consider the document's layout [3], color scheme [5] and image composition [7]; and (d) optionally perform adjustments (e.g., color, crop) on the final image. In real-life situations, graphic designers and journalists do not have all the time they need to find the photograph that would best match a document aesthetically (or to perform the adjustments mentioned above); rather, they tend to make acceptable choices [3].
The work described below focuses on steps (b) and (c) above, where image quality, layout and color scheme are taken into consideration. Given a document (query) with a blank area to accommodate an image (i.e., the layout of the document is not altered), the goal of the presented algorithm is to rank a set of images based on their image quality, how well they visually balance the document's layout, and how well they harmonize with the document's color scheme.
A set of user studies was performed in order to learn what users value when aesthetically matching
Copyright is held by the author/owner(s).
SIGIR'07, July 23-27, 2007, Amsterdam, The Netherlands.
ACM 978-1-59593-597-7/07/0007.

3. ALGORITHM DESCRIPTION
From these findings, a combination of quality assessment, visual balance and color harmony seems to be the right approach to this problem. Each of these features has been used individually in the literature, but combining them greatly improves the results (as suggested by F6), as will be shown below.

Figure 1. Top results.
Figure 2. Weighted hull.

3.1 Computing the Color Harmony Measure
Color harmony image indexing has been proposed in which certain attributes are derived from such harmony and used in keyword search [9] or in query by drawing [2]. In [5] the author presented an image indexing technique based on how well the image colors harmonize with the color layout of the document into which the image must be inserted. The measure is proportional to the color patch size (patchSize); for analogous harmony it can be tuned to be very sensitive to color changes (colorDist); and it can be tuned to favor closer patches of analogous colors (centroidDist). See [5] for details.

$$\mathrm{color\_measure} = \sum_{i \in \mathrm{document}} \sum_{j \in \mathrm{image}} \frac{\mathrm{patchSize}_i \cdot \mathrm{patchSize}_j}{(1 + \mathrm{centroidDist}_{ij}) \cdot (1 + (\mathrm{colorDist}_{ij})^4)}$$

were also run individually. Figure 3 shows those three average precision-recall curves. The color harmony result on its own is the worst of the three. This is because the method in [5] still allows a high color harmony score when a certain level of non-homogeneity is present and, as noted in our user studies (F1) and in [4], homogeneity is an important factor when assessing the relevance of a photograph. The visual-balance-only result, in contrast, is reasonably good on its own, since it favors images whose visual quality map is concentrated in a particular area, which in turn favors homogeneity.
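As an illustration of the color harmony measure of Section 3.1, the following is a minimal Python sketch, not the implementation from [5]. It assumes that each color patch is described by a size, a color and a centroid, that both distances are plain Euclidean distances, and that sizes and coordinates are normalized; the `Patch` type and its field names are hypothetical, and the color space is left unspecified because the text defers those details to [5].

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Patch:
    """Hypothetical description of one homogeneous color patch."""
    size: float      # patchSize: covered area (assumed normalized to [0, 1])
    color: tuple     # color coordinates (color space is an assumption)
    centroid: tuple  # (x, y) position on the page (assumed normalized)


def color_measure(document_patches, image_patches):
    """Sum, over all document/image patch pairs, of the product of patch
    sizes divided by (1 + centroidDist) * (1 + colorDist ** 4)."""
    total = 0.0
    for d in document_patches:
        for m in image_patches:
            centroid_dist = dist(d.centroid, m.centroid)  # Euclidean (assumed)
            color_dist = dist(d.color, m.color)           # Euclidean (assumed)
            total += (d.size * m.size) / (
                (1.0 + centroid_dist) * (1.0 + color_dist ** 4)
            )
    return total


if __name__ == "__main__":
    banner = Patch(size=0.3, color=(50.0, 20.0, 10.0), centroid=(0.5, 0.1))
    sky = Patch(size=0.4, color=(50.0, 20.5, 10.0), centroid=(0.5, 0.4))
    print(color_measure([banner], [sky]))
```

The fourth power on the color distance is what makes the score drop quickly as soon as two patches stop being close in color, while the centroid term only mildly penalizes patches that are far apart on the page.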
By combining the two algorithms, the homogeneity and symmetry favored by the visual balance measure compensate for the color harmony measure's lack of such, and the color harmony measure brings the well-balanced images with the right color scheme to the top of the list.

3.2 Computing the Visual Balance Measure
Visual balance is obtained by balancing the visual weight (i.e., left-right, F2) of all objects in the document. Objects are paragraphs, titles, banners, images, etc. For an image object, each of the regions within the image should be considered [4][6], since certain regions will have higher visual weight than others [1]. In [6] the author showed that the image visual weight map can be approximated by an image appeal map that takes into consideration sharpness, contrast and chroma; this map is thresholded, and the resulting region (visualWeight_region) is used to measure how well the image balances the rest of the objects in the document. The visual balance measure from [6] was improved by incorporating the difference in region sizes (width_height_dist) as presented in [8] (see [6] for details):

$$\mathrm{balance\_measure} = \frac{\mathrm{visualWeight\_regionQuality}}{1 + 5 \cdot \mathrm{visualWeightCentroidDist}^2 + \mathrm{width\_height\_dist}^2}$$

Figure 3. Average precision-recall graph (average precision vs. average recall; curves: Proposed Method, Color Harmony, Visual Balance). Thick line: top 20.

3.3 Combining Balance and Color Harmony
The color harmony measure was found to be less relevant than the visual balance measure (F1). This, together with the fact that the overall result is considered unacceptable when one of the features is below a threshold (F6), yielded the following measure between image i and the query document:

$$\mathrm{combined\_measure\_m}_i = \mathrm{balance\_measure}_i \cdot \mathrm{color\_measure}_i$$

5. REFERENCES
[1] Bajcsy, R. Active Perception, Proceedings of the IEEE, vol. 76, no. 8, pp. 996-1005, 1988.
[2] Corridoni, J.M., Del Bimbo, A., De Magistris, S.
Querying and retrieving pictorial data using semantics induced by colour quality and arrangement, Proc. Multimedia, 1996.
[3] Markkula, M. and Sormunen, E. End-user searching challenges indexing practices in the digital newspaper photo archive. Information Retrieval, 1:259-285, 2000.
[4] Martinet, J., Chiaramella, Y. and Mulhem, P. A model for weighting image objects in home photographs. In ACM CIKM'05, pages 760-767, Bremen, Germany, 2005.
[5] Obrador, P. Automatic color scheme picker for document templates based on image analysis and dual problem, in Proc. SPIE, vol. 6076, San Jose, CA, 2006.
[6] Obrador, P. Content Selection based on Compositional Image Quality, in Proc. SPIE, vol. 6500, San Jose, CA, 2007.
[7] Savakis, A., Etz, S., and Loui, A. Evaluation of image appeal in consumer photography, in Proc. SPIE, vol. 3959, 2000.
[8] Smith, R. J. and Chang, S.-F. Integrated Spatial and Feature Image Query, Multimedia Systems, 7(2):129-140, 1999.
[9] Vasile, A., Bender, W.R. Image query based on color harmony, in Proc. SPIE, vol. 4299, San Jose, CA, 2001.

This formula does not handle the extreme cases (close to the axes) where one of the measures may be small and the other very large. Therefore, a hard-coded threshold (optimized from the gathered ground truth; see next section) was integrated into the algorithm, down-weighting the images whose coordinates lie below the curve (Figure 2) (F6). In future work, a machine learning approach will be developed to optimize these thresholds based on a larger set of ground truth.

4. RESULTS AND CONCLUSION
A collection spanning 9 days was used (882 images overall); all images, good and bad, were considered, and no image adjustments were performed. Two different documents (Figure 1) were printed with each photograph and tagged as ground truth by the same users as above into three sets: works, maybe (weighted at 50% of works in the retrieval experiments), and doesn't_work.
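The three-way ground truth described above can be turned into precision-recall points by treating the labels as graded relevance. The sketch below is one possible reading, not the evaluation code used for the paper: it assumes that works counts as 1.0, maybe as 0.5 (the 50% weighting stated in the text) and doesn't_work as 0.0, and that precision and recall at each rank are computed from the accumulated weights. The function name and the `GRADE` table are hypothetical.

```python
# Relevance weights per ground-truth label; the 0.5 for "maybe" follows the
# 50% weighting stated in the text, the rest of the scheme is an assumption.
GRADE = {"works": 1.0, "maybe": 0.5, "doesn't_work": 0.0}


def graded_precision_recall(ranked_labels):
    """Return a list of (recall, precision) points, one per rank position.

    ranked_labels: ground-truth labels of the whole collection, ordered by
    the ranking under evaluation (best-ranked image first).
    """
    gains = [GRADE[label] for label in ranked_labels]
    total_gain = sum(gains)
    points = []
    accumulated = 0.0
    for rank, gain in enumerate(gains, start=1):
        accumulated += gain
        precision = accumulated / rank
        recall = accumulated / total_gain if total_gain else 0.0
        points.append((recall, precision))
    return points


if __name__ == "__main__":
    ranking = ["works", "maybe", "doesn't_work", "works"]
    for recall, precision in graded_precision_recall(ranking):
        print(f"recall={recall:.2f} precision={precision:.2f}")
```

Averaging such curves over the two query documents would give one curve per ranking method, which is the form of comparison the results section describes.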
The proposed method was run on the collection set twice (once per query). For comparison purposes, the ranking based on color harmony only, and the ranking based on document balance only