It’s All Semantics: Searching for an Intuitive Internet

Newer generations of researchers not schooled in more traditional, library-based (pre-Internet) research methods are used to doing keyword searches on the Internet to discover information. “But if you come from outside a given field, you don’t necessarily know what those keywords are,” says Alyssa Goodman, a Harvard University astronomy professor. A Semantic Web setup would enable researchers to craft their queries in more natural language. Goodman adds, however, that a fully semantic Web that can read, comprehend and categorize information beyond keywords requires a level of artificial intelligence that is currently not available, something Rensselaer’s researchers are trying to address with this new tool kit.

The National Science Foundation (NSF) awarded a team of researchers at Rensselaer Polytechnic Institute in Troy, N.Y., $1.1 million in October to create a software programming tool kit by mid-2010 that scientists and other researchers will be able to use to make data from their work available to all.

This project is an excellent example of the groundwork happening behind the scenes that ultimately will affect what we in school libraries and our classrooms should be teaching students about effective search techniques.

These are issues I’m very curious about, and I’m going to spend time in 2010 digging deeper into the changes taking place.

I will be sharing what I find out at presentations at ACEC2010 and ISTE2010.

I think I’m going to enjoy this!


Content used to be king

There was a time when books, newspapers, magazines and journals were the prime source of content and information. It was always your move: navigating the authority maze and enjoying slow reading of (limited) information sources in order to gain a knowledge base that matched a particular curriculum outline.

This was when content was king and the teacher was the sage on the stage.

Now communication is the new curriculum, and content is but grist to the mill that churns new knowledge. Why?  I came across a few good reads this week that set me thinking and wondering about the changes that we must support in our teaching and in our library services.

Think about this:

The era of Teacher Librarians ‘taking a class’ in order to show kids how to search, get basic skills, or navigate resources is over. This is a teacher’s job! Teach the teacher by all means (that’s professional development) but don’t waste time doing repeat performances for a teacher who hasn’t caught up with how to integrate information resources into the curriculum. How can they claim to be good teachers if they can’t model how to use information effectively? How to use new search tools? How to navigate databases? These ARE NOT specialist skills any more – they are core skills for learning!

The era of collaborating, communicating and integrating resources flexibly and online is here to stay. Every form of interactive and social media tool should be deployed by school libraries to support learning, teaching and communicating with and between students. Are teachers ready for this? Are your own library staff ready for this?

So what is the situation with content?

Dave Pollard wrote about The Future of Media: Something More than Worthless News. Agreed, the reason he wrote the post is quite different to mine – but in a lateral kind of way, what he wrote has huge relevance to information professionals. Media is changing, and the way media can work for or against learning is deeply concerning. Dave writes:

Few people care to take the time needed either to do great investigative work, or to think creatively and profoundly about what all the mountains of facts really mean.

There’s the rub – mountains of facts. Authority and relevance are as nothing when we are confronted with mountains of information to sift and verify. The alternative is to grab ‘something’ and miss the opportunity to engage in real metacognitive knowledge activities.

The diagram Dave offers provides a strong framework for information professionals. How do we deal with new and urgent information needs? What value do we place on media scrutiny?

Of course we can’t answer these questions effectively without taking into consideration the shifting dimensions of interoperability and semantic search. We are data mining on the one hand, and creating data on the other.

So what are the implications of this? Semantic search depends on our tags, and our tags depend on our understanding of the strengths and weaknesses inherent in data sets. It all depends on how things are defined and linked! Poor search engine optimization and keyword cannibalisation create duplicate and meaningless content, which means that the info junk pile continues to grow. The Search Engine Journal provides a good set of graphics (with explanations) that spell out these problems.

Here’s a simple image that demonstrates a good interlinking strategy. Then go and examine the canonical solution – looks like the stuff of good information professionals to me!

Of course, alongside the need for good search engine optimization is the growth in search functionality and in search engine options. Google has been testing some new features in the past months. Google wants to expose some advanced search options that allow you to refine the results without opening a new page. The options are available in a sidebar that’s collapsed by default, but it can be expanded by clicking on “Show options”.

You’ll be able to restrict the results to forums, videos, reviews and recent pages. There’s an option that lets you customize the snippets by making them longer or by showing thumbnails, much like Cuil. Google wants to make the process of refining queries more fun and exploratory by adding a “wonder wheel” of suggestions.

Maybe I’ll just stop thinking and wander right off and do some Semantic Web Shopping!

What? More issues to consider? Not my move anymore? Massive change is pushing us into a 21st century information maze.

Change is coming (image by Maria Reyes-McDavis)


From Tim Berners-Lee to … Muriel?

Twenty years ago today, Tim Berners-Lee wrote his original proposal for a better kind of linked information system. He was doing consulting for CERN in Switzerland, and found that its communication infrastructure was leading to information loss. So he proposed a solution using something called Hypertext. This led to the Hypertext Markup Language, or, as it’s more commonly known now, HTML. That, in turn, led to the World Wide Web.

Were you around to see all these changes? I certainly was, and I definitely remember the trouble I had teaching teachers the concept of the WWW, what it might do for learning, and how to go about using it. Navigation nightmare – that’s what it was! But now we all use the Net for stuff – and mostly we incorporate it into our learning experiences for our students, albeit badly at times. But the argument is won and we have moved on to the whole new media thing – and the relevance of connectedness.

So what’s next?

In the TED Talk below Tim Berners-Lee provides insight into developments that will power the semantic web, and the basis for its development, which is rooted in linked data. Way back in 2006 Tim was already writing about ‘linked data’, which no doubt explains the advances made in subsequent years in semantic web research. As he explained then:

The Semantic Web isn’t just about putting data on the web. It is about making links, so that a person or machine can explore the web of data.  With linked data, when you have some of it, you can find other, related, data.
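
To make “when you have some of it, you can find other, related, data” a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the triples and the example.org URIs are invented, and real linked data would be published as RDF and fetched over the web rather than held in a list.

```python
from collections import deque

# Hypothetical (subject, predicate, object) triples; every URI here is invented.
TRIPLES = [
    ("http://example.org/person/tim", "http://example.org/vocab/proposed", "http://example.org/thing/www"),
    ("http://example.org/thing/www", "http://example.org/vocab/builtOn", "http://example.org/thing/html"),
    ("http://example.org/thing/html", "http://example.org/vocab/uses", "http://example.org/thing/hypertext"),
    ("http://example.org/person/student", "http://example.org/vocab/searches", "http://example.org/thing/catalog"),
]


def related(start, triples):
    """Return every resource reachable from `start` by following links outward."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for subject, _predicate, obj in triples:
            if subject == node and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    seen.discard(start)  # the starting point is not "related data"
    return seen


if __name__ == "__main__":
    for uri in sorted(related("http://example.org/person/tim", TRIPLES)):
        print(uri)
```

Starting from one resource, the function walks link after link and collects everything connected to it, which is the “web of data” idea in miniature.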

Now we understand the potential of the semantic web differently and the implications are profound. You must read The Future of Federated Search: Muriel doesn’t search, but DFAST does, by Lee LeBlanc. This will give you a ‘picture’ of what might be – in a way that we can understand. I would never have understood what Tim was trying to explain in his original proposal for the web.  But now I understand virtual environments and crave interoperability and interactivity 24/7!  I won’t be contributing to the evolution any time soon, like the folks over at LinkedOpenCommunity at W3C SWEO Community, but I sure am grateful for their efforts!

A couple of snippets here, then watch the video :-)

Our information seeking behaviors will come to be shaped by the information we seek. Devices and the access channels we seek information through will further define our search behaviors. The computer is only one of these devices; interaction search technologies another.

In 1995, a user expended time searching; in 2035, a user spends precious time thinking – differently. The days of sitting in front of a dumb search box are over. Users no longer pound the keys in frustration getting zero results or billions of results. How will this happen?

Technology trends and the Semantic Web

It’s the time of the year when we see the predictions for technology developments for the coming year. Michael Stephens at Tame the Web has published his Top Ten Trends and Technologies for 2009, and has made it easy for us to get hooked on his discussion by offering the post as a PDF download.

The ten on the list are:

  1. The Ubiquity of the cloud
  2. The Changing Role of IT
  3. The Value of the Commons
  4. The promise of micro-interaction
  5. The Care & Nurturing of the Tribe
  6. The triumph of the portable device
  7. The importance of Personalization
  8. The impact of Localization
  9. The evolution of the Digital Lifestyle
  10. The shift toward Open Thinking

There are many themes running through these trends and technologies, but you can’t go past the shift in devices, the power of the cloud, and the digital shifts that leave the environment and information services of school libraries with a big challenge ahead of them.

Kathryn Greenhill at Librarians Matter alerts us to Top Tech Trends for ALA midwinter.

My favourites are:

  • Linked data is a new name for the Semantic Web – The Semantic Web is about creating conceptual relationships between things found on the Internet. Believe it or not, the idea is akin to the ultimate purpose of a traditional library card catalog. Have an item in hand. Give it a unique identifier. Systematically describe it. Put all the descriptions in one place and allow people to navigate the space. By following the tracings it is possible to move from one manifestation of an idea to another, ultimately providing the means to the discovery, combination, and creation of new ideas. The Semantic Web is almost exactly the same thing except the “cards” are manifested using RDF/XML on computers through the Internet.
  • Blogging is peaking – There is no doubt about it. The Blogosphere is here to stay, yet people have discovered that it is not very easy to maintain a blog for the long haul. The technology has made it easier to compose and distribute one’s ideas, much to the chagrin of newspaper publishers. On the other hand, the really hard work is coming up with meaningful things to say on a regular basis.
  • Word/tag clouds abound – It seems very fashionable to create word/tag clouds now-a-days. When you get right down to it, word/tag clouds are a whole lot like concordances — one of the first types of indexes. Each word (or tag) in a document is itemized and counted. Stop words are removed, and the results are sorted either alphabetically or numerically by count.
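
The concordance comparison in the last point can be sketched in a few lines of Python. This is a rough illustration, not how any particular tag-cloud tool works: the stop-word list is a tiny assumed one, and the function names are my own.

```python
import re
from collections import Counter

# A tiny, assumed stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "it", "or"}


def concordance(text):
    """Itemize and count each word in `text`, ignoring stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in STOP_WORDS)


def by_count(counts):
    """Sort numerically by count (highest first), ties broken alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def alphabetical(counts):
    """Sort alphabetically by word."""
    return sorted(counts.items())


if __name__ == "__main__":
    counts = concordance("The web of data is a web of links and data")
    print(by_count(counts))
    print(alphabetical(counts))
```

A tag cloud is then just a display choice: the counts decide how large each word is drawn.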

The Semantic Web is really struggling to emerge, but I believe it will happen.

Tim Berners-Lee had a vision for the internet, believing that the Semantic Web would be able to assist the evolution of human knowledge as a whole.

Human endeavor is caught in an eternal tension between the effectiveness of small groups acting independently and the need to mesh with the wider community. The Semantic Web, in naming every concept simply by a URI, lets anyone express new concepts that they invent with minimal effort. Its unifying logical language will enable these concepts to be progressively linked into a universal Web. This structure will open up the knowledge and workings of humankind to meaningful analysis by software agents, providing a new class of tools by which we can live, work and learn together.
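
As I read it, the practical point of naming every concept by a URI is that descriptions written independently can be meshed later without any coordination. A minimal sketch in Python, assuming two made-up library data sets and invented example.org URIs (none of these are real vocabularies):

```python
# All URIs below are hypothetical examples, not real identifiers or vocabularies.
BOOK = "http://example.org/id/book/42"
TITLE = "http://example.org/vocab/title"
SUBJECT = "http://example.org/vocab/subject"

# Two groups describe the same item independently, each as a set of
# (subject, predicate, object) triples.
library_a = {(BOOK, TITLE, "An Example Title")}
library_b = {(BOOK, SUBJECT, "Semantic Web")}


def describe(uri, *datasets):
    """Collect every (predicate, value) pair stated about `uri` in any data set."""
    merged = set().union(*datasets)
    return {(predicate, value) for subject, predicate, value in merged if subject == uri}


if __name__ == "__main__":
    for predicate, value in sorted(describe(BOOK, library_a, library_b)):
        print(predicate, "->", value)
```

Because both groups used the same URI for the book, their statements combine into one description; had each used its own local identifier, the merge would find nothing in common.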

Roy Tennant considered this vision, writing about the Promises of the Semantic Web and the state of the Linked Data systems, programming and data structures that need to emerge to provide the kind of Semantic Web that Tim Berners-Lee envisioned.

Folksonomy and tagging are very useful, but they are not the Semantic Web – not in the way Tim Berners-Lee imagined. All we are doing is aggregating our information (and our collective intelligence), but we are doing so idiosyncratically. Without standards, we have erratic compilations. The ontology of our data structures is the challenge – if the data strings don’t match, then the inferences won’t hold across data sets for the meanings of the content being expressed. There is great wisdom in the clouds, but there is no precision without accuracy! Somehow the Semantic Web will eventually be able to utilise machine languages to snap ‘meaning’ to a grid of structured data.

Read the Future of MicroFormats and Semantic Technologies. You can’t escape metadata, and you have to rely on markup languages.

The future of microformats is bright: by making it simple to encode your data, there is no reason not to. Tackling very common facets of the web, such as people, places and events, microformats have helped to break the chicken and the egg issue. “Why should I mark-up my data if no one else is?” or “I’m not going to mark-up my data if there are no tools to extract it”.

Luckily the menagerie of tools is copious and being extended every day. But I must admit I didn’t know that Firefox has the Operator toolbar, which can detect and act on any information found in the page. Operator requires information on the Web to be encoded using microformats, and since this method for semantically encoding information is relatively new, not all sites are using microformats yet. However, Operator works great with any blog that uses rel-tag, and the sites Yahoo! Local, Flickr, and Upcoming.org, all of which contain millions of pieces of information expressed using microformats. As more sites begin to semantically encode data with microformats, Operator will automatically work with them as well.
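
To get a feel for what a tool like this looks for, here is a small Python sketch that pulls rel-tag links out of a page using only the standard library. It is my own illustration and not how Operator itself is implemented; the class name, function name and sample HTML are invented, and it does not URL-decode the tag names.

```python
from html.parser import HTMLParser


class RelTagParser(HTMLParser):
    """Collect (tag name, link) pairs from <a rel="tag" href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").split()
        href = attributes.get("href")
        if "tag" in rel_values and href:
            # In rel-tag, the tag itself is the last path segment of the link.
            name = href.rstrip("/").rsplit("/", 1)[-1]
            self.tags.append((name, href))


def extract_tags(html):
    """Return every rel-tag found in the given HTML string."""
    parser = RelTagParser()
    parser.feed(html)
    return parser.tags


if __name__ == "__main__":
    sample = (
        '<p><a href="http://example.org/tags/semanticweb" rel="tag">Semantic Web</a> '
        '<a href="http://example.org/about">About</a></p>'
    )
    print(extract_tags(sample))
```

The ordinary link is ignored and only the one marked rel="tag" is picked up, which is the whole trick: a small, agreed piece of markup turns a plain link into data a machine can act on.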

Right!  School libraries?  Where are you in the discussion of these issues?  I have a lot to learn!

‘Low level’ semantic systems are easy to understand.  Today I noticed the ‘semantic’ support available in Feedly – my RSS reader.

The Reuters Open Calais service “is a rapidly growing toolkit of capabilities that allow you to readily incorporate state-of-the-art semantic functionality within your blog, content management system, website or application”. Apparently Calais “doesn’t just make data searchable, it makes knowledge searchable”.
