Report from Digital Asset Management (DAM) Conference - London, 1 July
Ian Davis — July 2, 2009 - 4:57am
I spent Wednesday 1st July at the Henry Stewart DAM Conference in London.
In my slot I talked about "Tagging Images for Findability - Making Your DAM System Work for You". I used my 30 minutes to raise the issue of organising images using metadata and controlled vocabulary to connect the images to the people who want to use them. I spent a little time looking at the ways to use text to categorise images and the advantages and disadvantages that brings. I devoted a lot of the presentation to issues to watch out for when tagging images, in particular: specificity and focus in image depictions; abstract concepts and image 'aboutness'; and the deceptive simplicity of visually simple images.
A far braver presentation than mine was given by Madi Solomon. Madi ditched the PowerPoint presentation to facilitate a refreshing debate on metadata. Questions from the floor came thick and fast. Madi did a great job of presenting 'on the edge' and drew out the experiences of many of the attendees and the challenges they were facing.
Also of note at the conference was a very informative presentation from Theresa Regli on 'Evaluating and Selecting Technologies' and a stimulating piece from Mark Davey on the old chestnut of ROI and Digital Asset Management Systems. Mark took a pretty dry subject and a slot directly after a good lunch and succeeded brilliantly in making it entertaining, informative and practical. Take a look at his excellent presentation Digital Asset Management ROI - the basics. I think this is a key resource for anyone interested in return on investment in the DAM space and it's fun to watch too.
I had a great day at DAM London and I hope my fellow delegates found the presentations as helpful and enlightening as I did.
Ian
Report from the ISKO Content Architecture Conference - 22-23 June, London, UK
Ian Davis — June 26, 2009 - 3:32am
I spent Monday and Tuesday of this week at the fascinating ISKO Content Architecture Conference.
On Monday I gave a presentation on "Still Digital Images - the hardest things to classify and find."
My presentation looked at the image market and the ways in which images can be annotated - or is that processed, classified, categorized, tagged, keyworded… We need a controlled vocabulary to control the vocabulary of controlled vocabulary!
SLA Tech Zone: Taxonomy and SharePoint -- A Powerful Combination
Dan Segal — June 4, 2009 - 1:29pm
If you are planning to attend the upcoming SLA Annual Conference in Washington, DC, then you won't want to miss the SLA Tech Zone workshop Taxonomy and SharePoint--A Powerful Combination.
SharePoint helps your organization connect people to business-critical information and expertise in order to increase productivity and reduce information overload. It achieves this by providing your employees with the ability to find relevant content in a wide range of repositories and formats. Understanding and using taxonomies within a SharePoint implementation to help users find content is an essential part of ensuring a successful SharePoint deployment. Taxonomies can range from quite simple to very complex. In this session, we will cover the basics of evaluating what you can do to create a simple taxonomy that will yield the most benefits for your SharePoint implementations. You will have a chance to learn a range of Best Practices, from the basics of building a taxonomy to the hands-on skills of deploying that taxonomy within a SharePoint site.
This workshop is suitable as either a quick-start or a refresher in taxonomy management for SharePoint. There are three sessions:
- Monday, 15 June 2009 9:00AM - 10:30AM (Ticketed Event #640)
- Monday, 15 June 2009 3:30PM - 5:00PM (Ticketed Event #660)
- Tuesday, 16 June 2009 11:30AM - 1:00PM (Ticketed Event #805)
Price: US $35 member / US $35 non-member / US $35 student member
For details and registration information, see the SLA 2009 site.
Classifying Images Part 3: Depicted Content
Ian Davis — June 2, 2009 - 4:44am
Welcome back to my occasional image classification series.
The last time I raised the topic of image classification I discussed the basic attributes of images. This time I want to focus on the thornier issue of the content, or concepts, depicted in them.
There is a danger of treating an image like a piece of text and classifying its attributes: Who created it? When? What techniques were used? Then writing a title or caption and leaving it at that. Sometimes little more need be done to a document than record this kind of information, especially with free text searching, but lots more needs to be done to most images.
Image findability
Image findability is the process of using search and browse to access the images required. A major aspect of image findability relates to the things depicted in them. Image users often search for images based on the generic things in them and also the proper names of these things. Classifying images based on depicted content means considering anything and everything that is and can be depicted in an image. When considering this I like to focus my efforts on understanding the images I'm dealing with, the users who are trying to find and work with the images, and the ways in which these people need to search and browse for the images they need. After an assessment of these areas I then tailor my approach.
Broadly speaking people searching for depicted content are looking for a number of types:
- Places: cities, towns, villages, streets...
- Built works: parks, skyscrapers, cottages, walls, doors, windows...
- Topography: mountains, valleys...
- Groups and organisations: air forces, choirs, police departments...
- People (roles, occupations, ethnicity and nationality): mothers, doctors, Caucasians, French, Germans...
- Actions, activities and events: running, writing, laughing, smiling, birthdays, parties, book signings, meetings...
- Objects: a myriad of items...
- Animals and plants: common and scientific names...
- Anatomy and attributes of people, animals and plants: arms, legs, adults, leaves, trunks, paws, tails...
- Depicted text: often signs or writing shown in images...
Many of these generic types can also have proper named instances:
- Proper names of people, places, buildings, topography, organisations, animals etc
When dealing with depicted content I've found some of the biggest issues to be:
- Identification - knowing what is in an image
- Focus and specificity - knowing what to include and what to exclude
- Consistency - applying the same term in the same way for the same depicted content
Identification - knowing what is in an image
Depicted content is a relatively black and white area - a dog is depicted so a dog is tagged. However, although it might sound a little weird, working out what is actually in an image can be a lot harder than you think.
Take a look at the image "Do You Know What This Is?" by Sister72 
This depicted content is fairly simple to see, but understanding what you're looking at is not that easy. Even if you know roughly what you're looking at, do you know what it's actually called?
One tip is to group similar images together when you're classifying them. Also, always start by assembling as much information as possible before you begin to classify images. It is especially important to gather together the information you have from the creator or custodians of the images.
Also important, when you have the luxury, is to get the image creator to add key metadata about the image at the point of creation, or soon after.
Focus and specificity
Knowing what to include and what to exclude, what to mention and what to ignore, is also much harder than it sounds.
Firstly, some image users will want a piece of depicted content tagged whenever it appears in an image, others will only want it tagged when the image shows a very good representation of that content, and of course many people will want something in between the two extremes.
Different users have different requirements. You need to understand the domain in which you're working and see the classification of depicted image content as supporting the needs of your users.
For example, would you tag everything in this 'Messy Room' image?
What would you miss out and why?
Now look at the image "Mountain Goats" from Thorne Enterprises.
Would you tag this with goats as well as mountains? Would this be helpful?
Let's look at four images depicting windows:
[Images: four photographs of windows, including 'What Light Through Yonder Window Breaks?']
Looking at these, it soon becomes clear that even deciding to apply a simple term like 'Windows' is not always easy.
Would you apply 'Windows' to the image of the cat looking out of the window? Is a window actually depicted in that image? If the image wasn't tagged with 'Windows' how else would anyone find an image of a cat looking out of a window?
The other three images show windows as parts of buildings, but is a building always depicted? Deciding when to apply a building type or the name of a building can be hard. Should you do this every time a part of a building is shown? Only when the whole building is shown? When enough of the building is visible? Or when a section of the building that to most people would represent the building is visible? For example, what part of the Empire State Building would you consider to depict that building? Rarely does anyone see it all - how much is enough? Would you treat the images of windows in a similar way and classify them all with a building type of 'Houses', or would you ignore the structure and focus on the parts - the window, the roof?
Consistency
Achieving consistent application of terms to images revolves partly around clear term definitions, well defined application rules and guidelines, and a robust quality assurance process.
Term definitions are very important. Defining the meaning of a term, and ensuring the people choosing which term to assign understand that meaning, can be crucial to term application. For example, creating a term such as 'Bow' without defining its meaning is not going to make it easy to apply.
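To make the 'Bow' example concrete, here is a minimal sketch of how a vocabulary might store a definition alongside each term. The Term structure, its field names and the three senses of 'Bow' are my own illustrative assumptions, not the schema of any particular tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Term:
    """One controlled-vocabulary term (hypothetical structure)."""
    label: str
    qualifier: str    # disambiguates homographs, e.g. "Bow (weapon)"
    scope_note: str   # plain-language definition shown to the tagger
    synonyms: List[str] = field(default_factory=list)

    @property
    def display_name(self) -> str:
        return f"{self.label} ({self.qualifier})"


# Illustrative senses only; a real vocabulary would define its own.
VOCABULARY = [
    Term("Bow", "weapon", "A flexible strip used to shoot arrows.", ["Longbow"]),
    Term("Bow", "knot", "A decorative knot tied with two loops.", ["Ribbon bow"]),
    Term("Bow", "ship part", "The forward end of a vessel."),
]


def candidates(label: str) -> List[str]:
    """Return every defined sense of a label so the tagger has to pick one."""
    wanted = label.strip().lower()
    return [t.display_name for t in VOCABULARY if t.label.lower() == wanted]


if __name__ == "__main__":
    for name in candidates("bow"):
        print(name)
```

Showing the qualifier and scope note at the point of tagging means nobody has to guess which 'Bow' was intended.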
Application rules that are well considered, thorough and clear are also very useful. Even a simple concept often needs some form of guidance linked to it. I remember a while ago needing two terms, 'Indoors' and 'Outdoors', to allow users to find images of people who were outside and inside - a simple concept you might think, one that people often need, and one that's easy to apply - who'd need guidelines for that? However, it soon became clear that guidelines were needed after I received a series of interesting questions: Is being on a train indoors? Should studio shots always be considered indoors? Does every shot of a person have to have indoors or outdoors assigned to it? If not, when should this term be used and when not? Is this a focus issue? If so, how much of a location needs to be seen before Indoors or Outdoors is used? A clear set of application guidelines followed an interesting meeting!
Strong quality assurance processes are very valuable. People make mistakes and images generate interesting issues. Appointing staff to review a percentage of classification work based on clear guidelines, and then sharing findings with the people who assigned the terms to the images, is an important way of assessing how well the image classification is progressing and keeping a classification team synchronised.
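Reviewing "a percentage" of the work can be as simple as drawing a random sample from each batch. The sketch below is only one way of doing it; the function name, the 10% default and the image identifiers are all assumptions made for illustration.

```python
import random


def select_for_review(image_ids, fraction=0.1, seed=None):
    """Pick a random sample of classified images for a reviewer to check.

    fraction is the share of the batch to review (an assumed default of 10%).
    At least one image is chosen from any non-empty batch.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be greater than 0 and at most 1")
    ids = list(image_ids)
    if not ids:
        return []
    sample_size = max(1, round(len(ids) * fraction))
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    return sorted(rng.sample(ids, sample_size))


if __name__ == "__main__":
    batch = [f"img_{n:03d}" for n in range(50)]  # hypothetical image IDs
    for image_id in select_for_review(batch, fraction=0.1, seed=1):
        print(image_id)
```

A random sample avoids reviewers only ever seeing the first or last images in a batch, which is where a tired classifier's mistakes may not be evenly spread.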
Today I’ve talked a lot about content depicted in images. Next time I’ll focus on abstract concepts, which are related to an image’s ‘aboutness’.
Synaptica Version 7.1 Feature Preview: Term AutoMatch
Jim Sweeney — May 11, 2009 - 9:14am
We are very excited to announce one of the many new features that will be available with our next release of Synaptica Version 7.1 (available mid-July). This new feature will provide a mechanism that will doubtless save tens if not hundreds of hours for those that currently do manual "mappings" between two different vocabularies.
The new Term AutoMatch tool will allow for the comparison of terms between two distinct Object Classes (vocabularies) and automatically perform a "mapping" between terms where there is a match based on whole or part of the term descriptions. The feature first allows for the selection of the two Object Classes and then for the matching method one would like to apply.
As you will see in detail in the above slidedeck, there are five matching levels: exact match, keywords match, Soundex match (based on phonetics) and, finally, single-word and multiple-word smart matches. One may then select the relationship type that should be applied between the terms as well as which types of matches should be displayed.
- Display all matches
- Display matches except where match already exists
- Display matches with no existing match from first to second object class
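For readers unfamiliar with these kinds of matching, the following is a rough sketch of how exact, keyword and Soundex comparisons between two term lists could work in general. It is not Synaptica's implementation: the function names are hypothetical, "keyword match" is assumed here to mean sharing at least one word, and the smart matches are left out because their rules are not described in this post.

```python
# Standard American Soundex letter codes.
_CODES = {}
for _digit, _letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                         "4": "L", "5": "MN", "6": "R"}.items():
    for _letter in _letters:
        _CODES[_letter] = _digit


def soundex(word):
    """Return the four-character Soundex code for a word ('' if no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate consonants with the same code
        code = _CODES.get(c, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code counts again
    return (result + "000")[:4]


def match_level(a, b):
    """Classify how two term names match; None means no match was found."""
    na, nb = a.strip().lower(), b.strip().lower()
    if na == nb:
        return "exact"
    if set(na.split()) & set(nb.split()):
        return "keyword"  # assumption: at least one shared word
    if soundex(na) and soundex(na) == soundex(nb):
        return "soundex"
    return None


def auto_match(first, second):
    """Compare every term in one list with every term in the other."""
    matches = []
    for a in first:
        for b in second:
            level = match_level(a, b)
            if level is not None:
                matches.append((a, b, level))
    return matches


if __name__ == "__main__":
    for row in auto_match(["Dogs", "Mountain goats", "Smith"],
                          ["dogs", "Goats", "Smyth"]):
        print(row)
```

Comparing every pair is fine for small vocabularies; a larger pair of vocabularies would need an index keyed on normalised names and Soundex codes.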
Next, one may use the built-in review tool to individually remove invalid matches and immediately submit the matches as new relationships in Synaptica. Optionally, the match results can be downloaded, adjusted within a spreadsheet and then re-uploaded to Synaptica to create new relationships. For the online version each term name is displayed as a link so one may enter the Item Summary for any term and edit the term directly. Each term will display an "Existing Relationship Count" that shows the number of relationships from that term to terms in the other object class.
This tool also uses synonyms to allow matching between “preferred” and “non-preferred” terms, finding matches that might otherwise be missed. A tips screen is available to assist with all of the features and the overall use of this tool.
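As a general illustration of why synonyms help, the sketch below matches two small vocabularies through their non-preferred terms. The data layout (a dictionary of preferred term to synonyms), the function names and the sample terms are assumptions for illustration and do not describe how Synaptica stores or matches terms.

```python
def build_index(vocabulary):
    """Map every preferred and non-preferred label to its preferred term."""
    index = {}
    for preferred, synonyms in vocabulary.items():
        for label in [preferred, *synonyms]:
            index.setdefault(label.strip().lower(), preferred)
    return index


def match_via_synonyms(first, second):
    """Return (preferred in first, preferred in second, label that matched)."""
    second_index = build_index(second)
    matches = []
    for preferred, synonyms in first.items():
        for label in [preferred, *synonyms]:
            target = second_index.get(label.strip().lower())
            if target is not None:
                matches.append((preferred, target, label))
                break  # one match per preferred term is enough here
    return matches


if __name__ == "__main__":
    # Hypothetical vocabularies: preferred term -> non-preferred terms.
    first = {"Automobiles": ["Cars"], "Felines": ["Cats"]}
    second = {"Cars": ["Motor cars"], "Lorries": ["Trucks"]}
    for row in match_via_synonyms(first, second):
        print(row)
```

Here 'Automobiles' and 'Cars' never match on their preferred names alone; the link is only found because 'Cars' is recorded as a non-preferred term in the first vocabulary.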
We have already had tremendous response from customers that have seen an advance release of this new feature. We expect that it will be a tremendous benefit to those that perform this type of task on a routine basis and look forward to delivering it to customers.
Please contact us to find out more about Synaptica AutoMatch or for a demonstration and of course stay tuned for more announcements on Synaptica's upcoming features!








