MACRO-LEVEL SIMILARITY MEASUREMENT IN VIZIR
Horst Eidenberger and Christian Breiteneder
Vienna University of Technology, Institute of Software Technology and Interactive Systems
Favoritenstrasse 9-11 / 188/2, A-1040 Vienna, Austria
{eidenberger, breiteneder}@ims.tuwien.ac.at
ABSTRACT
points out relevant related work, Section 3 is dedicated to the VizIR project goals, and in Section 4 we revisit the content-based
This paper analyzes the similarity measurement in Content-based
querying process and propose conditions for feature merging. In
Image and Video Retrieval systems (CBIR). The goal is to
Section 5 we analyze the linear weighted method for feature
identify preliminaries for successful queries as the basis for the
merging, and finally, in Section 6 we explain how querying will be
implementation of a query engine in the Content-based Visual
Information Retrieval framework (VizIR). VizIR is an open CBIR framework for researchers, software developers and
2. RELATED WORK
instructors. Past efforts in CBIR have led to several general-purpose prototypes. However, these prototypes differ in
Past efforts in CBIR have led to several general-purpose
implemented feature classes, user-interfaces and similarity
prototypes like QBIC ([3]), Virage ([1]), VisualSEEk ([9]),
measurement. VizIR aims at overcoming this unsatisfactory
Photobook ([5]) and MARS ([4]). Next to the implemented
situation. The paper surveys widespread techniques for
feature classes and user-interfaces these prototypes differ in their
similarity measurement in CBIR, derives a general querying
model and proposes conditions for similarity measurement
Usually, CBIR similarity measurement follows the Vector
algorithms on the macro-level. Based on these conditions two
Space Retrieval model and is done by measuring the distances of
methods (the Linear Weighted Merging method and the Query
feature vectors with distance functions that are based on the
Model approach) are evaluated and the superior method chosen
Metric Axioms, combining the distance values of a single object
for the VizIR project. Additionally, the major goals of the VizIR
for multiple features by a merging function to a distance sum and
project are outlined and interested researchers are invited to
presenting the user the objects with the lowest distance sum as
the most similar ones. In Section 4 we will introduce a general model for CBIR querying.
According to the Metric Axioms distance measures d() have
1. INTRODUCTION
In this paper we analyze the similarity measurement in Content-based Image and Video Retrieval systems (CBIR). The goal is to
for the feature vectors $f_A$ and $f_B$ of two stimuli A and B (in
identify preliminaries for successful queries as the basis for the
implementation of a query engine in the Content-based Visual
Information Retrieval framework (VizIR). VizIR is an open
CBIR framework for researchers, software developers and instructors (see Section 3 for details).
CBIR ([8]) is the attempt to search for visual content in
media databases by deriving meaningful features and measuring the dissimilarity of visual objects by distance functions. Major
advantages of CBIR systems are fully automated indexing and
$d(f_A, f_B) + d(f_B, f_C) \ge d(f_A, f_C)$
the description of visual content by visual features. Recently, the MPEG-7 standard for Multimedia content description was
Distance measures that fulfill the Metric Axioms are Minkowski
finalized. It contains a visual part with descriptors (features) for
distances, the Euclidean distance and the City Block measure.
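The relationship between these measures can be illustrated with a minimal sketch: the City Block measure and the Euclidean distance are the Minkowski distances of order 1 and 2. The function name `minkowski_distance` and the sample vectors are assumptions made for illustration and are not taken from any of the cited systems.

```python
from typing import Sequence


def minkowski_distance(f_a: Sequence[float], f_b: Sequence[float], p: float = 2.0) -> float:
    """Minkowski distance of order p between two feature vectors of equal length."""
    if len(f_a) != len(f_b):
        raise ValueError("feature vectors must have the same length")
    if p < 1:
        # For p < 1 the triangle inequality no longer holds.
        raise ValueError("p must be >= 1")
    return sum(abs(a - b) ** p for a, b in zip(f_a, f_b)) ** (1.0 / p)


# Hypothetical two-dimensional feature vectors.
f_a, f_b = [0.0, 0.0], [3.0, 4.0]
city_block = minkowski_distance(f_a, f_b, p=1)  # order 1: City Block measure
euclidean = minkowski_distance(f_a, f_b, p=2)   # order 2: Euclidean distance
```

For these vectors the City Block measure sums the coordinate differences (3 + 4), while the Euclidean distance is the length of the straight line between the two points.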
image and video objects. Nevertheless, CBIR is still an area of
Experimental investigations during the last fifty years have shown
intense research. Each year, prototypes with new intuitive user-
that the Metric Axioms may be too restrictive for human
interfaces and sophisticated methods for iterative refinement, new
similarity perception. The triangle inequality (in CBIR sometimes
querying methods and many other innovations are introduced.
used for query acceleration) was even falsified ([6]). Newer
The rest of this paper is organized as follows: Section 2
theories such as Monotone Proximity Structures or Tversky’s
Feature Contrast model suggest a better representation of human
on the visual part of the MPEG-7 standard for multimedia content
description. Reaching this goal requires the careful design of the
In many CBIR prototypes (e.g., in [3], [1]), when multiple
database structure and an extendible class framework as well as
features are employed for a query, the result set is ordered by a
seeking suitable extensions and supplements to the
ranking value derived from the weighted sum of the distance
MPEG-7 standard by additional descriptors and descriptor
values (position value). This method is called Linear Weighted
schemes, mathematically and logically fitting distance measures
Merging. The position value $P$ for each database object is defined as $P = \sum_{i=1}^{F} w_i d_i$.
for all descriptors (distance measures are not defined in the
standard) and defining an appropriate and flexible model for
similarity definition. MPEG-7 is not information retrieval-
specific. One goal of this project is to apply the definitions of the
standard to visual information retrieval problems.
F represents the number of features, $w_i$ the weight for feature
Additionally, we want to develop integrated, general-purpose
i and $d_i$ the distance value for feature i between the query object
user interfaces for visual information retrieval. Such user
and the database object. This evaluation method assumes that all
interfaces have to include a great variety of different properties:
distance functions are normalized to the same interval (e.g., [0,
methods for query definition from examples or sketches,
1]). Its major advantages are the simple calculation and
similarity definition by positioning of visual examples in 3D
application. The major disadvantages are first, the fact that not all
space, appropriate result display and refinement techniques and
features show a linear relationship and linear merging therefore is
cognitively easy handling of visual content, especially video.
not a suitable method to combine such features and second, that
Finally, VizIR will include methods and test sets for
in most systems weights have to be provided by the user who is
benchmarking (measurement of retrieval quality), performance
evaluation (query execution time, etc.) and usability testing of the
For these reasons, the authors of [4] propose the employment
of the Boolean Model instead of Linear Weighted Merging.
The VizIR project intends to integrate various directions of
According to this model two stimuli A and B are similar for a
past and current research in an open framework to push CBIR
certain feature F, if they fulfill the following condition $v$: $d(f_A, f_B) \le \varepsilon$
research and teaching towards practical usefulness by overcoming
some of the serious problems. In the next section we will focus
on the querying aspect, outline the general CBIR querying
process and propose conditions for feature merging.
$\varepsilon$ is called a degree of tolerance. It is a threshold for the
maximum distance of two stimuli. In Boolean retrieval multiple
4. CONTENT-BASED QUERYING PROCESS
conditions v can be combined by logical operators. The result set
Usually, the CBIR querying process for a set of example stimuli
consists of those stimuli that fulfill all AND-combined sub-
and an input data set consists of the following three steps (see
expressions. Boolean retrieval leads to better results than Linear
Weighted Merging but has the major drawback that it does not
1. Feature extraction–The properties of stimuli (e.g. images,
rank the stimuli in the result set. Before we go into details of the
video clips) are extracted by feature extraction functions and
querying process in VizIR, we will outline the project goals.
stored as descriptor vectors. This step transforms the media
3. VIZIR PROJECT GOALS
space into feature space. Normally, only the features of the example stimuli have to be extracted during the querying
The goal of the VizIR project is to develop an open CBIR
process. The descriptors of the data set are fetched from a
prototype as a basis for teaching and further research in various
directions. The term open means that VizIR will be free software
2. Micro-level similarity measurement–The dissimilarity values
(including the source code) and extensible. VizIR was started in
for all features between an example stimulus and elements of
summer 2001 as a consequence of experiences gained with earlier
the data set are measured with distance functions. Ideally, the
CBIR projects and is currently being evaluated for scientific funding.
output of all distance functions in a CBIR system should be
The motivation behind VizIR is: an open CBIR platform would
normalized to the same range of values. This step transforms
make research (especially for smaller institutions) easier and
feature space into distance space, where each media object is
more efficient (because of standardized evaluation sets and
represented by a vector of distance values.
3. Macro-level similarity measurement–In this step a decision is
The VizIR project aims at the implementation of successful
derived from the dissimilarity values of all features for each
methods for automated information extraction from images and
stimulus in the data set, if it is similar to the example stimuli
video streams, definition of similarity measures that can be
or not. The most similar stimuli are ranked and returned as an
applied to approximate human similarity judgment and new,
better concepts for the user interface aspect of visual information
Today, rules exist for the first and second step, how they should
retrieval, particularly for human-machine-interaction for query
be performed and which constraints should be kept. MPEG-7
definition and refinement and video handling. This includes the
descriptors should be used for feature extraction and distance
implementation of a working prototype system that is fully based
measures should be based on the Metric Axioms (see Section 2),
Ordinal Properties (see [6]) or another similarity model. To the
authors’ surprise no such rule set exists for the third step. Since
$m(I) = m(p(I))$
such rules would be a valuable help for CBIR system developers we will propose four conditions for macro-level similarity
for each permutation p(I) of I. The result set must be
[...] $r_{Y,j}$ for each $i, j \le m$ and $r_{X,i}, r_{Y,j} \in O$,
where i and j are the ranks of the result set elements $r_{X,i}$ and $r_{Y,j}$
(representing stimuli X and Y) and the result set O has m
elements. This means that m() must produce a ranked result set. It must derive at least a partial similarity order (objects with equal similarity may be ranked arbitrarily).
$m(I_1 + I_2) = m(I_1) + m(I_2)$
for all input object sets $I_1$ and $I_2$. That is, m() should produce
the same result set for each partition $(I_1, I_2)$ of I.
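As a rough illustration, three of the four conditions (independence from duplicates, independence from input order, and linearity) can be written as executable checks on a candidate merging function. This is a sketch under stated assumptions: result sets are compared as unordered sets of object identifiers, the ranking condition is not checked, the exhaustive loops are only practical for very small inputs, and all names (`is_minimal`, `is_order_independent`, `is_linear`, `threshold_filter`) as well as the sample distance values are hypothetical.

```python
from itertools import permutations
from typing import Callable, Hashable, List, Sequence, Tuple

# An input object: (object id, per-feature distances to the query example).
Item = Tuple[Hashable, Tuple[float, ...]]
Merger = Callable[[Sequence[Item]], List[Hashable]]


def is_minimal(m: Merger, items: Sequence[Item]) -> bool:
    """Duplicates: m(I) equals m(I + s(I)), checked for every prefix s(I) of I."""
    items = list(items)
    base = set(m(items))
    return all(set(m(items + items[:k])) == base for k in range(len(items) + 1))


def is_order_independent(m: Merger, items: Sequence[Item]) -> bool:
    """Order: m(I) equals m(p(I)) for every permutation p(I) of I."""
    base = set(m(list(items)))
    return all(set(m(list(p))) == base for p in permutations(items))


def is_linear(m: Merger, items: Sequence[Item]) -> bool:
    """Linearity: m(I1) joined with m(I2) equals m(I) for every two-way split of I."""
    items = list(items)
    base = set(m(items))
    for mask in range(2 ** len(items)):
        part1 = [x for i, x in enumerate(items) if (mask >> i) & 1]
        part2 = [x for i, x in enumerate(items) if not (mask >> i) & 1]
        if set(m(part1)) | set(m(part2)) != base:
            return False
    return True


def threshold_filter(items: Sequence[Item]) -> List[Hashable]:
    """Hypothetical merger: keep objects whose distances are all at most 0.5."""
    return [obj for obj, dists in items if all(d <= 0.5 for d in dists)]


# Three stimuli with made-up distances for two features.
sample = [("A", (0.1, 0.4)), ("B", (0.9, 0.2)), ("E", (0.0, 0.0))]
checks = (
    is_minimal(threshold_filter, sample),
    is_order_independent(threshold_filter, sample),
    is_linear(threshold_filter, sample),
)
```

A pure threshold filter passes all three checks because its decision for one object never depends on the other objects in the input.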
Valuable similarity information can get lost in the merging step. These conditions should prevent the CBIR system developer from implementing absolutely inappropriate merging algorithms. Part of the VizIR project will be the development of new macro-similarity measurement methods that fulfill these conditions. With these methods and the algorithms below we will try to falsify the proposed conditions in human-based evaluations in
Figure 1: example querying process for three stimuli A, B, E in
5. ANALYSIS OF THE LINEAR WEIGHTED MERGING
the input data set I and three features (f and invisible: g, h).
APPROACH
Stimulus E is the query example. The result set O consists of two elements: the query example and stimulus A.
A macro-level similarity measurement algorithm based on Linear Weighted Merging (LWM, Section 2) could look like this:
A merging algorithm m() for macro-level similarity measurement
1. Calculate the position value for each element of I.
2. Select O as the n elements of I with the lowest position values. n
is a parameter provided by the user or the CBIR system.
3. Rank the elements of O by the position values. The order of
where I is the set of input objects (described by their dissimilarity
objects with equal position value may be arbitrary.
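The three steps of the LWM algorithm can be sketched as follows. This is a minimal illustration, not the implementation used in QBIC or Virage; the function names, the weights and the distance values are assumptions, and all distances are taken to be normalized to [0, 1].

```python
from typing import List, Sequence, Tuple

# An input object: (object id, per-feature distances to the query example).
Item = Tuple[str, Sequence[float]]


def position_value(distances: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the per-feature distance values."""
    return sum(w * d for w, d in zip(weights, distances))


def lwm_query(items: Sequence[Item], weights: Sequence[float], n: int) -> List[Tuple[str, float]]:
    """Score every object, keep the n lowest position values, return them ranked."""
    scored = [(obj, position_value(dists, weights)) for obj, dists in items]
    scored.sort(key=lambda pair: pair[1])  # stable sort: ties keep input order
    return scored[:n]


# Hypothetical distances of three stimuli for two features.
data = [("A", (0.1, 0.4)), ("B", (0.9, 0.2)), ("E", (0.0, 0.0))]
ranked = lwm_query(data, weights=(0.5, 0.5), n=2)
# The same query on an input in which every object occurs twice.
ranked_with_duplicates = lwm_query(data + data, weights=(0.5, 0.5), n=2)
```

With these values the query returns E followed by A. On the duplicated input the two result slots are both taken by E, because n is fixed regardless of how many distinct objects are similar.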
values d for all F features) and O is the result set. I has n
This algorithm is implemented in QBIC and Virage. If we
evaluate this algorithm by our proposed conditions we get the
- LWM does not fulfill the minimality condition. If we set
I = I + I then the result set O contains only half of
the objects of O and each object twice. This is just a minor
problem. We can introduce a new first step in our algorithm:
“1. Eliminate all duplicate rows from I”. Then, LWM fulfils
- It fulfills the second and third condition: it is non-
Here, $r_{A,i}$ is the i-th of m elements in the result set. Index A
discriminating and generates a partial order and a ranked
indicates that it represents element A of I; i is the rank of $r_{A,i}$. We
propose that each implementation of m() has to fulfill the
LWM does not meet the linearity condition. This is obvious,
because for an arbitrary partition $(I_1, I_2)$ both $m(I_1)$ and $m(I_2)$
would produce result sets with n elements – no matter if the
$m(I) = m(I + s(I))$
objects in these result sets are similar or not. This cannot be corrected by a new rule. It is a structural problem of LWM.
for each subset s(I) of I. That is, the result set has to be
Even if we allowed $m(I_1)$ and $m(I_2)$ to produce
independent from duplicates in I.
result sets with n/2 elements, condition 4 would only be
fulfilled for input data sets $I_1$ and $I_2$ with half the similar
the average query execution time in our test environment by 66%
(in comparison to a QBIC system with the same feature classes
Because LWM does not fulfill the merging conditions and
because of our experiences from earlier work, we conclude that LWM is not a suitable algorithm for macro-level similarity
7. CONCLUSION
measurement. In the next section we will outline the algorithm
In this paper we have presented a general view on the CBIR
querying process, pointed out related work in the field of
6. QUERYING IMPLEMENTATION IN VIZIR
similarity measurement and proposed a set of rules for similarity measurement on the macro-level. Then we have investigated two
In our earlier work we have developed a querying paradigm that
approaches for macro-level similarity measurement: the widely
is based on the Boolean Retrieval Model (see Section 2) but uses
applied Linear Weighted Merging method and our Query Model
a reduced set of logical operators. We call it the Query Model
approach that is based on the Boolean Retrieval Model. From the
approach. A Query Model consists of a set of layers and each
results we draw the conclusion to implement the Query Model
layer of a feature extraction function, a threshold for the
approach in our Visual Information Retrieval framework.
maximum distance of two objects and a weight for the
Finally, we would like to invite interested research
importance of the layer. All layers are combined by AND. This
institutions to join the discussion and participate in the design and
means that each layer is an information filter, which sorts out all
implementation of the open VizIR framework.
objects from the input data set taken over from the preceding layer that do not have a distance smaller than the threshold (if the
8. REFERENCES
threshold is greater than or equal to 0) or greater than the threshold (if the
[1] J. Bach, C. Fuller, A. Gupta, A. Hampapur, B. Horowitz, R.
threshold is smaller than 0, logical NOT). No logical OR can be
Humphrey, R. Jain, and C. Shu, “The Virage image search
defined in a Query Model. The effect of the OR operator can be
engine: An open framework for image management”, Proc.
achieved better by running parallel independent queries. This is
SPIE Storage and Retrieval for Image and Video Databases
more transparent for the user. The querying process consists of
IV, San Jose CA, USA, pp. 76-87, February 1996.
[2] C. Breiteneder, and H. Eidenberger, “Performance-optimized
1. Apply the first Query Model layer on I. This is done with
feature ordering for Content-based Image Retrieval”, Proc.
function $v_1(I) = O_1$. $v_1()$ is an implementation of the first
European Signal Processing Conf., Tampere, Finland, 2000.
Query Model layer as described in Section 2.
[3] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang,
2. Apply the second layer on $O_1$ using function $v_2()$: $v_2(O_1) = O_2$.
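The layer-by-layer filtering can be sketched as follows. This is an illustration under assumptions, not the VizIR implementation: distances are taken as precomputed per feature, a negative threshold is read as "keep objects whose distance exceeds the absolute value of the threshold" (logical NOT), and the names `Layer`, `apply_layer` and `run_query_model` as well as all values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Layer:
    feature: str         # feature whose distance to the query example is tested
    threshold: float     # >= 0: keep distance < threshold; < 0: keep distance > |threshold|
    weight: float = 1.0  # importance of the layer (not used while filtering)


def apply_layer(layer: Layer, candidates: Sequence[str],
                distances: Dict[str, Dict[str, float]]) -> List[str]:
    """One layer acts as a filter on the output of the preceding layer."""
    kept = []
    for obj in candidates:
        d = distances[obj][layer.feature]
        if layer.threshold >= 0:
            passes = d < layer.threshold
        else:
            passes = d > abs(layer.threshold)  # assumed reading of logical NOT
        if passes:
            kept.append(obj)
    return kept


def run_query_model(layers: Sequence[Layer], candidates: Sequence[str],
                    distances: Dict[str, Dict[str, float]]) -> List[str]:
    """Apply all layers in sequence; the layers are implicitly AND-combined."""
    result = list(candidates)
    for layer in layers:
        result = apply_layer(layer, result, distances)
    return result


# Hypothetical precomputed distances to the query example.
distances = {
    "A": {"color": 0.1, "texture": 0.4},
    "B": {"color": 0.9, "texture": 0.2},
    "E": {"color": 0.0, "texture": 0.0},
}
layers = [Layer("color", 0.5), Layer("texture", 0.5)]
result = run_query_model(layers, ["A", "B", "E"], distances)
result_reversed = run_query_model(layers[::-1], ["A", "B", "E"], distances)
```

Because every layer only removes objects, applying the layers in reverse order yields the same set of objects in this sketch.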
B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D.
Steele, and P. Yanker, “Query by Image and Video Content:
If we apply our proposed conditions for macro-level similarity
The QBIC System”, IEEE Computer, vol. 28, no. 9, pp. 23-
measurement on this algorithm, we get the following result:
- It fulfills the first and second condition: duplicates displace
[4] M. Ortega, R. Yong, K. Chakrabarti, K. Porkaew, S.
no other objects from the result set O and O is independent
Mehrotra, and T.S. Huang, “Supporting Ranked Boolean
Similarity Queries in MARS”, IEEE Transactions on
- It does not fulfill condition 3, because it does not rank O.
Knowledge and Data Engineering, vol. 10, no. 6, pp. 905-
This can be repaired by extending the algorithm with a new
final step: “Use the layers’ weights and Linear Weighted
[5] A. Pentland, R.W. Picard, and S. Sclaroff, “Photobook:
Merging to derive position values and rank the objects in the
Content-Based Manipulation of Image Databases”, SPIE Storage and Retrieval Image and Video Databases, no.
It fulfills the linearity condition. Because of the always AND-
connected layers the result set for each partition $(I_1, I_2)$ is
[6] S. Santini, and R. Jain, “Similarity Measures”, IEEE
equal to the result set of $I_1 + I_2$.
Transactions on Pattern Analysis and Machine Intelligence,
The Query Model approach fulfills all four conditions. From
vol. 21, no. 9, pp. 871-883, September 1999.
this result and earlier experiments we are convinced that the
[7] G. Sheikholeslami, W. Chang, and A. Zhang, “Semantic
Query Model approach is an ideal solution for similarity
Clustering and Querying on Heterogeneous Features for
measurement in CBIR systems. Therefore we will implement a
Visual Data”, Proc. ACM Multimedia, Bristol, UK, pp. 3-
Query Model based querying engine in the VizIR framework.
In addition, the Query Model approach has a nice side-effect
[8] A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, and
on query execution time. Because of using only the logical AND
R. Jain, “Content-Based Image Retrieval at the End of the
to connect layers, the result set of a query is independent from the
Early Years”, IEEE Transactions on Pattern Analysis and
order of the layers. An algorithm that sorts the layers in a way
Machine Intelligence, vol. 22, no. 12, pp. 1349-1380,
that those, which sort out most objects and/or use the fastest
distance functions, are used first in the querying process, would
[9] J.R. Smith, and S.F. Chang, “VisualSEEk: a fully automated
lead to significant query acceleration. In [2] we have presented
content-based image query system”, Proc. ACM
the design and implementation of such an algorithm. It reduces
Multimedia, Boston MA, USA, pp. 87-98, 1996.