All Posts Tagged: ‘Semantic’
Most of you know that my job focuses on IBM’s OmniFind enterprise search and text analytics products. And I’ve written before about semantic search—I’ve even written about what semantic search isn’t. I keep talking about it because semantic search is probably the easiest-to-understand application of text analytics.
The days of keyword stuffing, single-phrase optimization, and concentrating only on incoming links to gain traffic are slowly being phased out as a more holistic approach to judging website content comes online. This new concept has many webmasters hopping, and it should. Latent semantic indexing is quickly becoming the wave of now.
You may have heard the term "semantic search," but do you really know what it is? Some people have very big ideas of how computers will understand the meaning of text, but today’s semantic search falls far short of that. Regardless, what’s possible today is still very useful.
To understand how hard it is for computers to really understand the meaning of text, let’s not look at understanding entire documents or even paragraphs. Let’s not even look at sentences. No, let’s start with something extremely simple: noun phrases.
Having worked closely with latent semantic indexing during my time at FI, I’ve become a big advocate of making sure you have structured themes in your content, and that you include a supporting cast of semantically connected keywords.
In this clip from SMX Advanced, Matt Cutts shares how Google is continually testing the use of LSI and keyword themes.
The language used to describe the Semantic Web is complicated enough – at a glance, it looks a bit quantum theory-ish, just enough to make your eyes roll back into your head to look for ways to kill themselves – but Tim Berners-Lee, who’s responsible for all those Ws littering your URLs, inspired enough faith that whatever the Semantic Web was, it could be accomplished.
Web 2.0 was about innovation in making data relevant to the one or two words that we type into a search engine. Adding to the plethora of data is the advent of social networking, Ajax, and shared apps across the back-end internet cloud; frameworks have already been proposed to make Web 3.0 and Reputation 1.0 reliable in the greater context of the internet.
Living in Silicon Valley has been an intoxicating and suffocating experience all wrapped up into a lavish party with gourmet food and cocktails poured through a block of chiseled ice. Every day I live, breathe, and sleep everything two dot oh, and what started as a way of making the web more dynamic and interactive is now one big pool of punch where programmers, marketers, and startup founders are the new rock stars and everyone wants to jump in to take a dip and take a sip.
LSI is a methodology for automatic document classification. It examines all the words in all the documents of a corpus and calculates similarity measurements for each document or for individual terms.
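To make that description concrete, here is a minimal sketch of the idea behind LSI: build a term-document matrix, reduce it to a few latent “concept” dimensions with a truncated SVD, and compare documents in that reduced space. The three-document toy corpus and the choice of two latent dimensions are invented for illustration, not taken from any real system.

```python
import numpy as np

# Tiny invented corpus: two documents about cats, one about finance.
docs = {
    "d1": "cat feline pet".split(),
    "d2": "kitten cat pet".split(),
    "d3": "stock market finance".split(),
}
vocab = sorted({w for words in docs.values() for w in words})

# Term-document count matrix (rows = terms, columns = documents).
A = np.array([[words.count(t) for words in docs.values()] for t in vocab],
             dtype=float)

# Truncated SVD: keep only the top-k latent "concept" dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one k-dimensional vector per document

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_12 = cosine(doc_vecs[0], doc_vecs[1])  # cat doc vs. cat doc
sim_13 = cosine(doc_vecs[0], doc_vecs[2])  # cat doc vs. finance doc
```

In the reduced space the two cat documents land close together while the finance document stays apart, even though “feline” and “kitten” never co-occur in the same document. That is the similarity measurement the excerpt describes, computed over latent dimensions rather than raw keywords.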
One thing that I have learned in over a decade developing web sites is that the Net is continually changing, and to keep up you need to change with it.
The list of dubious means of search engine optimisation lengthens year on year. In theory, of course, we could all employ such means, but there are ethical issues to be tackled. And even if we ignore fair-play principles for a moment, it’s worth pointing out that cheap, scam-like promotion methods usually look cheap and scam-like, annoy Internet users and have a short lifespan because counter-measures are created.