Avatar Seeks Semantic Search

    July 27, 2007
    WebProNews Staff

Researchers at IBM Almaden have been developing a semantic search process that can delve into unstructured text to retrieve structured information.


While a lot of attention has been heaped upon Powerset and its almost-here natural language search, IBM has been working on a similar technology that may or may not be as close to public debut.

IBM calls its effort Avatar Semantic Search. Right now it doesn’t even have the nice minimalist home page Powerset has for early peek signups, but since everyone’s done reading ‘Harry Potter and the Deathly Hallows’, a little text to read is a good thing.

“Ongoing research in Avatar is at the cusp of a number of disciplines ranging from search and information retrieval to machine learning, information extraction, and probabilistic databases,” IBM announced on the project’s page.

We’ve looked at earlier IBM efforts to pull information out of unstructured resources. Its UIMA developments now occupy a place in the freely available IBM OmniFind Yahoo Edition enterprise search product, for example.

But UIMA is so 2005. While Powerset has drawn upon research performed by the Palo Alto Research Center, aka PARC, IBM reached out to the academic community to complement Avatar’s internal team.

They have approached the semantic search problem on three fronts: information extraction, semantic search itself, and the management of uncertainty. The information extraction system will allow Avatar to plunge into mounds of raw text and emerge with structured data produced by rule-based annotators.

IBM claimed this extraction system will permit unsophisticated users to build an annotator with Avatar and pull out the desired information from email, web pages, business reports, etc.
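To make the idea of a rule-based annotator concrete, here is a minimal sketch in Python. It is not Avatar's implementation, and IBM's project page does not describe its rule language; the `Annotation` class, the `RULES` table, and the `annotate` function are hypothetical names chosen for illustration, and the two regular expressions are deliberately simplistic.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One structured fact pulled out of raw text (hypothetical type)."""
    label: str
    text: str
    start: int
    end: int


# Hypothetical rules: each label maps to a pattern that recognizes it.
# Real annotators would use far richer rules than these two regexes.
RULES = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def annotate(text, rules=RULES):
    """Apply every rule to the text and return annotations in document order."""
    annotations = []
    for label, pattern in rules.items():
        for match in pattern.finditer(text):
            annotations.append(
                Annotation(label, match.group(), match.start(), match.end())
            )
    return sorted(annotations, key=lambda a: a.start)


if __name__ == "__main__":
    message = "Call Ana at 408-555-0100 or write to ana@example.com."
    for a in annotate(message):
        print(a.label, a.text, a.start, a.end)
```

Each annotation records what was found and where, which is the kind of structured output a search system could index alongside the original email, web page, or report.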

Through semantic search, the researchers think they can interpret queries people make, and model the real intent behind a query.

The real challenge comes from an effort they refer to as managing uncertainty and probabilistic databases. They’ve stepped deeply into theory here, well beyond any help Douglas Adams can provide for me.

IBM built momentum with UIMA starting well before I interviewed Marc Andrews about it in December 2005. It led to the co-branded, freely available OmniFind product I mentioned earlier, and I have to think Avatar may be on a similar track today.