I had a chance (finally) to read Dean Allemang’s recent article about the 2023 KGC Conference,
Woolf at the Door: LLMs and Knowledge Graphs.
First, I loved the Virginia Woolf reference – very, very clever.
However, one point Dean made in this must-read article deserves reiterating, because it is one of the strongest cases to be made for when one should use a knowledge graph versus an LLM.
To me, a knowledge graph is a giant, multi-key index. When I want to remember something in a knowledge graph, if I know the keys, I can retrieve the value associated with those keys at virtually no cost (essentially just the cost of connection). For instance, suppose that I want to find out the square root of five. If I ask an LLM (say, ChatGPT), getting this answer involves the following steps:
- Me: Send a prompt saying “What is the square root of 5?”
- ChatGPT: Do I understand the concept of square root? Yes, I do … it’s a math function.
- ChatGPT: There is a Python function in the Python math library that can be used to compute that function. Retrieve that library.
- ChatGPT: Evaluate the number 5 with the function call to get the value 2.236.
- ChatGPT: Construct a response and send that response back to the client.
This assumes that everything goes right.
Now, there are several different approaches that you can take with a knowledge graph. The first is to have enough awareness to say:
PREFIX math: <http://www.w3.org/2005/xpath-functions/math#>
SELECT ?value WHERE { BIND(math:sqrt(5) AS ?value) }
It is not terribly friendly, mind you, but it is fast – likely several thousand times as fast. Why the discrepancy? Because even in this simple case, the LLM has to go through the cognitive processing steps of figuring out the concept of a mathematical function before it can do anything with it, has to establish a Python context, has to construct the function to be evaluated before evaluating it, then needs to wrap this in some kind of meaningful text, before caching all of this.
This process needs to be undertaken every time, for every conversation.
Suppose I ask ChatGPT who the president of the United States is in 2023. In that case, it has to parse these concepts out, note that I’m asking for information about a specific date, then go off to its guard rails to determine whether in fact it knows anything about the year 2023 (which it doesn’t, as that information falls outside the arbitrary 2021 cutoff date). It would then punt, find threads about events that occurred in 2021, find events about presidents in 2021, and finally come back with knowledge about Joe Biden. It does this very, very fast, mind you, because neural networks can be very fast, but it is still going to take a while to respond with this information, because this network is also very deep and dense (and has to reconstruct things holographically).
RDF would express this same information as:
[
  a :OfficeHolder ;
  OfficeHolder:person Person:JoeBiden ;
  OfficeHolder:office Office:President ;
  OfficeHolder:country Country:UnitedStates ;
  OfficeHolder:startDate "2021-01-20"^^xsd:date ;
] .
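For the snippet above to be loaded as-is, the prefixes would also need declarations along these lines (the namespace URIs here are illustrative assumptions, not part of the example itself):
@prefix :             <http://example.org/ontology#> .
@prefix OfficeHolder: <http://example.org/ontology/OfficeHolder#> .
@prefix Person:       <http://example.org/data/Person#> .
@prefix Office:       <http://example.org/data/Office#> .
@prefix Country:      <http://example.org/data/Country#> .
@prefix xsd:          <http://www.w3.org/2001/XMLSchema#> .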
To query this with the known keys would take microseconds on a typical server in SPARQL:
SELECT ?personName WHERE {
  ?record a :OfficeHolder .
  ?record OfficeHolder:person $Person .
  $Person rdfs:label ?personName .
  ?record OfficeHolder:office $Office .
  $Office rdfs:label ?officeName .
  ?record OfficeHolder:country $Country .
  $Country rdfs:label ?countryName .
  ?record OfficeHolder:startDate ?startDate .
  OPTIONAL {
    ?record OfficeHolder:endDate ?endDate .
  }
  FILTER (($CurrentDate >= ?startDate) && (!bound(?endDate) || $CurrentDate < ?endDate))
}
Each variable beginning with a $ is a parameter (though not all, or even any, of them must be supplied explicitly). Each line reduces the set of items being searched, in essence doing the same kind of search that the neural nets have been trained to do by rough example. It also does it in about a thousandth of the time and a thousandth of the space, because the neural net builds multiple hierarchies (office -> country -> date), (office -> date -> country), (date -> country -> office), etc., compared to the one cluster of information built by the knowledge graph.
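As a concrete illustration (my own sketch, not from Dean’s article), the parameters can also be pinned down inline rather than supplied externally; here $Office, $Country, and $CurrentDate are fixed with VALUES and BIND, using the same assumed vocabulary as the snippets above, and an example date standing in for “today”:
# Prefixes (:, OfficeHolder:, Office:, Country:, rdfs:, xsd:) assumed as declared earlier.
SELECT ?personName WHERE {
  # Supply the "parameters" directly in the query.
  VALUES ($Office $Country) { (Office:President Country:UnitedStates) }
  BIND("2023-06-01"^^xsd:date AS $CurrentDate)   # illustrative "current" date

  ?record a :OfficeHolder ;
          OfficeHolder:person $Person ;
          OfficeHolder:office $Office ;
          OfficeHolder:country $Country ;
          OfficeHolder:startDate ?startDate .
  $Person rdfs:label ?personName .
  OPTIONAL { ?record OfficeHolder:endDate ?endDate . }
  FILTER ($CurrentDate >= ?startDate && (!bound(?endDate) || $CurrentDate < ?endDate))
}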
Getting those keys can be a bit more convoluted, which is why most knowledge graphs also index by various label permutations as text indexes. These searches are still far faster than an LLM can handle (and one reason that LLMs require so much compute power), and consequently, knowledge graphs can be thought of as being much, much, much denser in terms of overall information content.
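For example, assuming a Lucene-backed text index has been configured in Jena (the jena-text extension; the "biden" search term and labels here are illustrative), a label lookup stays a cheap index probe that is then joined back into the graph:
PREFIX text: <http://jena.apache.org/text#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?entity ?label WHERE {
  # Probe the text index for labels matching "biden" (top 10 hits, with scores).
  (?entity ?score) text:query (rdfs:label "biden" 10) .
  # Join the hits back into the graph as ordinary triple patterns.
  ?entity rdfs:label ?label .
}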
This difference becomes much more pronounced when you start getting into inference and reasoning. One of the key differences between LLMs and knowledge graphs is that while both are pretty fair at reasoning, KGs are often built for single-query operations. In contrast, LLMs usually build up a context of information. Ironically, the more people start using SPARQL Update pipelines (which allow for a similar buildup of context), the more I suspect can be done with KGs to make them look and act like LLMs while doing so faster and with a better audit trail.
LLMs have to figure things out. They follow an iterative feedback loop, a langchain, driven by either a human, the model itself, or a combination of the two. This langchain model should be emulatable with SPARQL Update. I’m playing around with this idea on Jena/Fuseki, and the early results are … intriguing. The key is to recognize that you are doing mutations to the database, which makes many DBAs cringe. However, I don’t think there is any way you can get to conversational AI on a knowledge graph without constantly building (and, when necessary, destroying) contextual graphs.
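A minimal sketch of that idea, assuming a per-conversation named graph and an illustrative ctx: vocabulary (neither is from the article): each turn appends triples to the conversation’s context graph, and the graph is dropped when the context is no longer needed.
PREFIX ctx:    <http://example.org/context#>
PREFIX Person: <http://example.org/data/Person#>

# Append one conversational turn to this session's context graph.
INSERT DATA {
  GRAPH <http://example.org/context/session-42> {
    ctx:turn-1 ctx:question "Who is the president of the United States in 2023?" ;
               ctx:answer   Person:JoeBiden .
  }
} ;

# When the conversation ends (or the context should be reset), destroy the graph.
DROP GRAPH <http://example.org/context/session-42>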
Realistically, I think the LLM model is likely wasteful and inefficient, which seems to be confirmed by experimentation and by recent developments pointing to reasonable models that can fit comfortably on a laptop or even a very high-end phone. It is here, in particular, that I see knowledge graphs making the most significant difference.
Ultimately, the question will come down to indexing (knowing) … as it almost invariably does.
Kurt Cagle is the Editor in Chief of The Cagle Report, a former community editor for Data Science Central, and the principal for Semantical LLC, as well as a regular contributing writer for LinkedIn. He has written twenty-four books and hundreds of articles on programming and data interchange standards. He maintains a free Calendly consultation site at https://calendly.com/semantical – if you have a question, want to suggest a story, or want to chat, set up a free consultation appointment with him there.
I use the terms speculative and specified to identify the distinction between an LLM response and a KG response, respectively. LLMs hallucinate because of speculative edge relations; KGs do not, because they use specified edge relations.
That makes a great deal of sense to me.