video
VideoSurf - a new way to search for video?
Anonymous — November 26, 2008 - 7:52am
If you have been keeping up with my posts on this blog you won't be surprised to learn that today I spent my lunch hour exploring a video search offering that's new to me called VideoSurf. I was so interested in this new search tool that I interrupted my usual run of image indexing articles, and my lunch hour, to do some research and write up this post.
In a September press release, VideoSurf claimed its computers can now "see inside videos to understand and analyze the content." I would encourage anyone with an interest in this area to take a look at the company's website, give it a whirl and see what they think.
In my experience, video search engines have relied on a combination of the metadata linked to the video clips, scene and key frame analysis, and automatic indexing of soundtracks synched with the video.
For example, a soundtrack synchronised to the video can be transcribed to text and indexed. The indexed text can then be linked to sections of the video: scenes are identified by looking for gaps in the video, and various techniques are used to create key frames that attempt to represent each scene. These techniques are backed up by the metadata that accompanies a video clip.
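To make the idea of linking an indexed soundtrack to scenes a little more concrete, here is a minimal sketch in Python. It is only an illustration, not how VideoSurf or any other engine actually works. The function name, the scene start times and the timed words are all made up for the example, and it assumes scene detection and speech-to-text have already been done elsewhere.

```python
from bisect import bisect_right
from collections import defaultdict


def index_transcript_by_scene(scene_starts, timed_words):
    """Map each spoken word to the scene numbers in which it is heard.

    scene_starts: sorted list of scene start times in seconds.
    timed_words: iterable of (time_in_seconds, word) pairs from a transcript.
    Both inputs are hypothetical; a real system would get them from
    scene detection and speech recognition.
    """
    index = defaultdict(set)
    for time, word in timed_words:
        # The scene containing `time` is the last one that starts at or before it.
        scene = bisect_right(scene_starts, time) - 1
        if scene >= 0:
            index[word.lower()].add(scene)
    return index


# Made-up data: three scenes and four transcribed words.
scene_starts = [0.0, 12.5, 40.0]
timed_words = [(1.0, "Hello"), (13.0, "storm"), (41.2, "Storm"), (45.0, "harbour")]

index = index_transcript_by_scene(scene_starts, timed_words)
print(sorted(index["storm"]))
```

A search for "storm" would then point the user at the second and third scenes rather than at the clip as a whole, which is roughly the kind of access a synched and indexed soundtrack can give.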
If you have worked in the industry you know that video metadata is expensive to create. Most of what people see online is either harvested for free from other sources, or limited in size and scope. Such metadata may cover the title of a video clip, text describing the clip, clip length, etc. It may even include some information about the content depicted in the video, or abstract concepts that try to specify what a clip is about. Though this level of video metadata is the most time-consuming and complex to create, it also offers the fullest level of access for users.
Audio tracks can also be of great use, and many information needs can be met by searching on the audio in a video. There are, however, limitations. For example, many VERY SCARY scenes have little dialogue in them and depend heavily on camera-work and music to give the feeling of fear. How easy is it to find these scenes based on dialogue alone, or even based on 'seeing inside a video'? How can you look for 'fear' as a concept?
Content-based image retrieval, which looks at textures, basic shapes and colours in still images, has yet to deliver the promised revolution in image indexing and retrieval. In some contexts it works quite well; in many others end-users don't really see how it works at all. So adding a layer to video search that tries to analyse the actual content, pixel by pixel, is an interesting development.
To my mind, a full set of access paths to all the layers of a video still demands the use of fairly extensive metadata, especially for depicted content and abstract concepts. Up to now, metadata has always been the way to find what an image, whether it's still or moving, is conceptually about, and what can be seen in individual images and videos. That holds even when the metadata is actually sound, turned into text and stored in a database.
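For readers who haven't seen content-based image retrieval up close, here is a toy sketch of one of its simplest ingredients: comparing images by their colour histograms. This is an illustration under my own assumptions, not a description of VideoSurf's technology. The function names and the tiny "images" (just lists of RGB pixels) are invented for the example, and real systems also use texture and shape features.

```python
from collections import Counter


def colour_histogram(pixels, bins_per_channel=4):
    """Return the share of pixels falling in each coarse colour bucket.

    pixels: non-empty list of (r, g, b) tuples with values from 0 to 255.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity between 0 (no shared colours) and 1 (identical histograms)."""
    return sum(min(share, h2.get(bucket, 0.0)) for bucket, share in h1.items())


# Made-up ten-pixel "images".
mostly_red = [(250, 10, 10)] * 9 + [(10, 10, 250)]
also_red = [(240, 20, 5)] * 8 + [(10, 250, 10)] * 2
mostly_blue = [(5, 5, 240)] * 10

red_vs_red = histogram_intersection(
    colour_histogram(mostly_red), colour_histogram(also_red)
)
red_vs_blue = histogram_intersection(
    colour_histogram(mostly_red), colour_histogram(mostly_blue)
)
print(red_vs_red, red_vs_blue)
```

The two reddish images score as far more similar than the red and blue pair, but nothing here knows whether the red is a sunset, a fire engine or a tomato. That gap between matching pixels and knowing what an image is about is exactly where metadata still earns its keep.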
Is VideoSurf's offering really any different from what's gone before?
Is this system, which seems to be using Content-Based Image Retrieval (CBIR) technology to some extent, a significant advance?
Reviewing some of the blog posts people have published, it seems many others are interested in VideoSurf's offering as well.
For an initial idea of how VideoSurf works, try taking a look at James McQuivey's OmniVideo blog post, "Video search, are we there yet?". As James describes in the article, one pretty neat aspect of what VideoSurf can do is match faces, enabling you to look for the same face in different videos and so reducing the need to rely exclusively on the depicted person being mentioned in the metadata. This clearly isn't much help if the person you're looking for is mentioned but not depicted, in which case indexed audio would help, or if the person is not well depicted, for example seen only from the side or the back. Quibbles aside, if this works, then it is a pretty useful function in itself.
Here are some of the other bloggers who have been writing up their thoughts on VideoSurf:
- An interesting post on this subject from the Rhondda's Reflections blog on Searching for videos with VideoSurf
- Phil Bradley comments on his Weblog on the VideoSurf Video Search
- And one of the best current reviews of VideoSurf that I've found comes from Chris Sherman at SearchEngineLand.
Clearly, we're on the right track and there is a lot of interest in the opportunities and technologies around video search. However, I think that there is a long way to go before detailed and automatic object recognition is of any meaningful use to people. As far as I can see, it's still not there with still or moving digital images. Metadata for me is still the 'king' of visual search. There are, however, a growing number of needs that automatic solutions can already resolve, and a growing case for solutions that combine automatic computer recognition of image elements, metadata schemes, and controlled vocabulary search and browse support.
I'd love to know what people think about VideoSurf and other services that provide video search.
Synaptica Central : Dow Jones Video Library
Daniela Barbosa — October 19, 2008 - 12:35pm

Video might have killed the Radio Star, but in today's video streaming world it certainly is helping distribute knowledge, and that is why we are publishing a video page to augment our blog postings.
Very often I talk to clients who need information to learn about key concepts, or who just want to share a third-party view with their colleagues on specific topics around controlled vocabularies that I know someone on the team has presented or written about. It could be, for example, a white paper about Audience Centric Views, a video overview of Taxonomy Management Tools and how to use these tools to collaborate around developing controlled vocabularies, or a real-life case study of an existing client using Synaptica. In the past, I have kept these references in a .txt file on my desktop that I consult when I need to, but since this blog is being used as a resource both by us internally here at Dow Jones and by the community, I figured it would be a good time to start a Video Library of our Dow Jones public resources.
So without any further ado, our Dow Jones Video Library has been published.
This is just the start of turning Synaptica Central into a must-go-to resource for our community, so please watch this space for additional resource pages, from recommended white papers and industry standards references to must-see videos, must-listen-to podcasts and must-read books!
Have suggestions of things we should make sure we add to our resource pages? Please leave them in the comments or drop me a note at daniela.barbosa@dowjones.com
Image|Flickr|traed mawr
What is the Hardest Content to Classify?
Anonymous — September 17, 2008 - 2:27pm
This week I saw that my first blog post on Synaptica Central had been published! After a few seconds of enjoyment, as a wave of pride washed over me, I realized that I now need to post pretty frequently to give our readers something interesting to read! So, without further ado…
The first topic that came to mind as I thought about what to blog about is the whole area of classifying different types of content: text, sound, video and images.
I often speak to clients who have a range of item types stored in a number of repositories. They're often looking to classify new content, or to work on older content in order to improve its findability. They are always looking to get more value from their content.
In these circumstances a content audit is often called for, to answer the 'What do you have?' question. This then leads to a general discussion of the content types and the ways in which they can be classified, usually using a controlled vocabulary either applied by a machine, by a person, or by a mixture of the two.
One thing that often makes people ask me questions is my fairly frequent assertion that images are easily the hardest content to classify.
Why are Images the Hardest Content to Classify?

- Textual items contain text. Auto-categorising software, free text storage and access, etc. make organising and finding textual items relatively easy.
- Sound can be digitised and turned into text.
- Video often has an audio track that can be turned into text too. Computers can be used to identify scenes. Breaking a video into scenes and linking a synched and indexed soundtrack to them can provide pretty good access for many people (though there's a whole blog post to be written on the many access points to video that this process doesn't provide).
Images, on the other hand, have no text and no scenes. All you have are individual images, with the meaning and access points held in the visuals.
Some will say that this is really not a problem: all you need to do is use content-based image retrieval software to identify colours, textures and shapes in your images, and you'll soon be searching for images without any manual indexing. However, whilst this technology is promising, it leaves a lot to be desired.
Today, the way to provide a wide and deep level of access to still images continues to be by using people to view images, write captions and assign keywords or tags to each image based on image 'depictions' and 'aboutness and attributes'. This manual process often requires the use of a controlled vocabulary to improve consistency and application.
However, how this indexing is done, and what structures support it, will be the subject of further posts. I just wanted to get my thoughts out there!
So stay tuned.
Ian
Why My Dog is Like a Search Engine Without a Taxonomy
Daniela Barbosa — September 15, 2008 - 9:22pm
A couple of weeks back I asked my 130 pound dog Townes if he wanted a 'biscuit' and he ignored me. I had forgotten that he isn't the smartest dog in the world and that he cannot make the automatic association that a 'biscuit' is a 'cookie', so as I spoke he just stared into another room. (Read: it really isn't that he is dumb, just that he wasn't trained.) Now he has become a bit of a taxonomy star and it might be going to his head.

Today I ran into someone who had recently read my personal blog for the first time, had seen my post titled "Why my dog is like a search engine without a taxonomy" and had loved the video I made. She told me that she used it in an internal discussion about taxonomies. She wasn't the first one to say they loved the video, so I figured I would repost it here.
So that morning, when I asked him if he wanted a 'biscuit' instead of a 'cookie' and all I got was a blank stare, I told him he was "like a search engine without a taxonomy" and then he looked even more confused. Then we made this video. He is a bit bored in this video because it is the second take, so the initial reaction is not 100%, but hopefully you get the idea!
The moral of the story? As humans we easily make associations between words. Machines and dogs can't unless they are "trained".
He is a 4-year-old Great Dane in case you are wondering, and yes, he got lots of biscuits and cookies that morning! (Like most days...)
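For the more technically minded, the biscuit and cookie problem can be sketched in a few lines of Python. This is only a toy illustration with made-up names and data, not how Synaptica or any particular search engine does it. It assumes the taxonomy has been boiled down to a simple list of synonym rings (sets of terms that mean the same thing).

```python
# Hypothetical synonym rings; a real taxonomy would hold many more,
# plus broader and narrower terms.
SYNONYM_RINGS = [{"biscuit", "cookie"}]


def expand_query(term, rings):
    """Return the term plus any synonyms the taxonomy knows about."""
    term = term.lower()
    for ring in rings:
        if term in ring:
            return set(ring)
    return {term}


def search(term, documents, rings=()):
    """Return the documents containing the term or one of its synonyms."""
    terms = expand_query(term, rings)
    return [doc for doc in documents if terms & set(doc.lower().split())]


documents = ["want a cookie", "go for a walk"]

without_taxonomy = search("biscuit", documents)
with_taxonomy = search("biscuit", documents, SYNONYM_RINGS)
print(without_taxonomy, with_taxonomy)
```

Without the synonym ring the search for 'biscuit' comes back empty, which is the blank stare. With it, the 'cookie' document is found, which is the trained dog.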

