By Marc Stephenson | January 24, 2025
How to achieve more accurate search results with AI auto-tagging
For many years, organisations (and in particular users) have sought to avoid adding metadata to content. I get it. It’s boring, it’s time-consuming, and it’s difficult – a killer combination that almost guarantees failure.
People say: “You don’t need to add metadata anymore – we have good search engines.”
People have been saying this for years, because technology vendors and search engine providers have been saying it for years. But marketing isn’t always the truth.
Effectively, yes, search engines are pretty good these days – and have been for a while. However, for organisational search, are they good enough? If you want to find a decent HDTV, search is OK because you don’t mind getting 4 million hits. But if you want to find the official company maternity policy, you only want one hit. And you want it to be the right one. First time.

AI tools, such as ChatGPT, are becoming the new search engines, and in theory, they can return useful search results.
But there is an issue: these AI tools work on the principle of returning the most likely results, NOT the most accurate ones (and how could they, when they have no concept of “accuracy”?). There is a solution, though. Meet Artificial Intelligence (AI) for metadata tagging – or AI auto-tagging for short.
How AI auto-tagging can help your organisation
Using Artificial Intelligence for auto-tagging, and then using search for retrieval, is a great way to improve search results.
Retrieval becomes accurate because the (auto-)tagging is accurate. You have replaced the human-driven, boring, time-consuming, difficult tagging task with AI – to me that sounds exactly like what AI should be doing!
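To make the difference concrete, here is a minimal Python sketch comparing plain keyword search with retrieval over tags. Everything in it is an assumption for illustration: the `Document` class, the sample documents and the tag names are hypothetical, and the tags are simply written in by hand where an auto-tagger would supply them.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    body: str
    # In practice these tags would come from the auto-tagger (assumed here).
    tags: set[str] = field(default_factory=set)


def keyword_search(docs, term):
    """Return every document whose body mentions the term (case-insensitive)."""
    term = term.lower()
    return [d for d in docs if term in d.body.lower()]


def tag_search(docs, *required_tags):
    """Return only the documents that carry all of the required tags."""
    wanted = set(required_tags)
    return [d for d in docs if wanted <= d.tags]


# Hypothetical sample content.
docs = [
    Document("Maternity Policy", "Official maternity leave policy.",
             {"policy", "maternity", "official"}),
    Document("HR newsletter", "Reminder: the maternity policy was updated.",
             {"newsletter"}),
    Document("Meeting notes", "Discussed maternity cover for the team.",
             {"meeting-notes"}),
]

keyword_hits = keyword_search(docs, "maternity")               # every mention
tag_hits = tag_search(docs, "policy", "maternity", "official")  # tagged policy only
```

In this toy data set the keyword search matches all three documents, while the tag search returns only the official policy. That is the "one hit, and the right one" behaviour, and it is only as good as the tags themselves.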
Generic context vs specific context
You can also take a further step towards improving search accuracy by giving the auto-tagging direction from the specific context of what is being tagged.
AI tools, and Large Language Models (LLMs) specifically, are very good at using generic context – LLMs are gigantic gatherers of it. The key is to augment this generic context (common sense, if you like) with the domain specifics of what you are tagging. This has implementation implications, such as needing a localised LLM instance to protect your IP, but the best way to capture the specifics of your content is to use a taxonomy or ontology.
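As a rough sketch of what "giving direction" might look like, the Python below constrains an LLM to a taxonomy and then discards anything it returns that is not a preferred term. The taxonomy, the prompt wording and the function names are all hypothetical, and `llm` is just any callable that takes a prompt and returns text. No particular vendor API is assumed, and a stub stands in for the model here.

```python
from typing import Callable

# Hypothetical mini-taxonomy: preferred term -> alternative labels.
TAXONOMY = {
    "Maternity Leave": ["maternity", "parental leave"],
    "Expenses": ["expense claim", "reimbursement"],
    "Information Security": ["infosec", "data protection"],
}


def build_prompt(text: str, taxonomy: dict[str, list[str]]) -> str:
    """Put the domain-specific vocabulary in front of the model."""
    terms = [f"- {term} (also: {', '.join(alts)})" for term, alts in taxonomy.items()]
    return (
        "Tag the document using ONLY preferred terms from this list, one per line:\n"
        + "\n".join(terms)
        + "\n\nDocument:\n"
        + text
    )


def auto_tag(text: str, taxonomy: dict[str, list[str]],
             llm: Callable[[str], str]) -> list[str]:
    """Ask the model for tags, then keep only terms that exist in the taxonomy."""
    reply = llm(build_prompt(text, taxonomy))
    allowed = {term.lower(): term for term in taxonomy}
    tags: list[str] = []
    for line in reply.splitlines():
        candidate = line.strip("-* \t").lower()
        term = allowed.get(candidate)
        if term and term not in tags:
            tags.append(term)
    return tags


def fake_llm(prompt: str) -> str:
    # Stand-in for a real (ideally locally hosted) model call.
    # It returns one valid term and one that is not in the taxonomy.
    return "- Maternity Leave\n- Holiday Planning"


tags = auto_tag("Employees are entitled to paid leave ...", TAXONOMY, fake_llm)
```

The validation step is the design choice that matters: the model supplies the generic understanding, but only terms from your own taxonomy are ever written back as metadata, so an invented tag such as "Holiday Planning" is dropped.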
You can, of course, also use AI to generate a taxonomy or ontology, but that’s a subject for another blog.
In short, AI can be a valuable tool in information management, but it needs human expertise to really leverage that value.