LLMs, huh, what are they good for?
Absolutely nothing? Or some things.
Over the last year or so, I have been expressing the view that
LLMs are unsuited for lots of things. And that remains true.
But I have also been working through their improvements and uses,
and I have found some things that LLMs are useful for.
Use cases
The technology of large language models is often mistaken for some
form of intelligence. But the term "AI" is not really useful here;
these systems are not a replacement for intelligence. The reality
behind the hype of AI has always been automation. Just as automating
factory processes can be a great improvement, especially for
repetitive tasks, so can the many forms of AI developed over the
years. Here are some examples:
- Parsing and rewriting natural language: One of the things
modern LLMs do well is to take inputs the way people write or speak
them (after processing with speech-to-text) and parse them into
simple statements. Their ability to summarize is quite good, and by
adjusting the randomness parameters in these systems, you can do this
with relatively little change in terminology (or lots, if you like to
mistake misinterpretation for 'creativity'). By keeping track of what
produces what, you can also retain references back to the reasons the
output was produced (cause and effect, through the somewhat nebulous
mechanism of transformers).
- Categorizing linguistic statements: Another thing LLMs are
good at is putting things into simple categories. The more complex and
numerous the categories, the worse they do; the more precisely the
categories are specified, the better they do. It is really up to you
to define the categories in a way that gets the LLM to produce useful
results.
- Comparing natural language statements to each other: One of
the things LLMs seem to do best is compare simple statements of fact
to each other. For example, if I say as an assumption that all cars
made by Joe have black or white exteriors, and I ask whether Jane's
statement "Joe makes rainbow colored cars" is true under that
assumption, an LLM will almost always give me the right answer. This
is very much the sort of parsing that has been done historically, but
LLMs seem to do much better at associating arbitrary linguistic
expressions than older AI language-parsing approaches (a small sketch
of this follows the list).
- Generating different forms of output from specifications:
LLMs are increasingly good at generating structured output containing
linguistic expressions. As a simple example, if you ask one to compare
5 statements to 3 facts and present the output as a table with 3
columns, specifying what those columns should contain, it will do this
quite well most of the time.
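
To make the categorization and comparison cases concrete, here is a
minimal sketch in Python. The call_llm function is a hypothetical
placeholder for whatever model interface you actually use; the point
is the precisely specified categories and the shape of the prompt, not
any particular vendor's API.

def call_llm(prompt: str) -> str:
    """Placeholder: send the prompt to your LLM and return its text reply."""
    raise NotImplementedError("wire this to the model of your choice")

CATEGORIES = ("TRUE", "FALSE", "UNDETERMINED")

def check_claim(claim: str, facts: list[str]) -> str:
    """Ask whether a claim holds under the assumed facts; constrain the answer."""
    fact_lines = "\n".join(f"- {f}" for f in facts)
    prompt = (
        f"Assume the following facts are true:\n{fact_lines}\n\n"
        f'Claim: "{claim}"\n'
        "Answer with exactly one word: TRUE, FALSE, or UNDETERMINED."
    )
    answer = call_llm(prompt).strip().upper()
    # Traditional-programming guard: anything outside the category set is
    # treated as undetermined rather than trusted.
    return answer if answer in CATEGORIES else "UNDETERMINED"

# The example from the text:
# check_claim("Joe makes rainbow colored cars",
#             ["All cars made by Joe have black or white exteriors"])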
Putting these things together
Just as with any other programming approach, you build up programs
by assembling parts. For example, you can provide a list of facts,
tell the LLM to parse human statements into a list of the claims made
in those statements, compare the claims to the facts, and produce a
table listing which claims are true, which are false, and which cannot
be determined from the facts, along with a factual rebuttal of the
claims that are false. It will do this pretty well, quickly,
inexpensively, and reliably.
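
A rough sketch of that kind of assembly, building on the hypothetical
call_llm and check_claim pieces above (the names and prompts are
illustrative assumptions, not any particular product's API):

def extract_claims(statement: str) -> list[str]:
    """LLM step: break a human statement into separate, simple claims."""
    prompt = ("List each separate factual claim made in the following "
              f"statement, one per line, with no extra commentary:\n{statement}")
    return [line.lstrip("- ").strip()
            for line in call_llm(prompt).splitlines() if line.strip()]

def rebut(claim: str, facts: list[str]) -> str:
    """LLM step: a one-sentence factual rebuttal of a false claim."""
    facts_text = "\n".join(f"- {f}" for f in facts)
    prompt = (f"Using only these facts:\n{facts_text}\n"
              f'Write a one-sentence rebuttal of the false claim: "{claim}"')
    return call_llm(prompt).strip()

def fact_check(statement: str, facts: list[str]) -> list[tuple[str, str, str]]:
    """Traditional glue: returns rows of (claim, verdict, rebuttal or empty)."""
    rows = []
    for claim in extract_claims(statement):
        verdict = check_claim(claim, facts)     # TRUE / FALSE / UNDETERMINED
        rebuttal = rebut(claim, facts) if verdict == "FALSE" else ""
        rows.append((claim, verdict, rebuttal))
    return rows

The LLM does the linguistic work; the surrounding traditional code
decides what gets compared to what and what ends up in the table.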
As in programming, where you can express things many different
ways and get similar results, how you phrase requests to LLMs leads to
different, even if similar, results. In traditional programming,
errors accumulate: 1/3 expressed in binary, for example, always
produces an inexact answer. Whether it is high or low in the last bit,
as you do more and more with it the errors accumulate unless
controlled, until the answers come out completely wrong, errors of
kind rather than small errors of amount.
Unlike traditional programs, the same input and program often do
not produce the same outputs with LLMs. Part of the problem of building
up more complex use cases is constraining the expansion of outcomes to
desired subsets. Left uncontrolled, that expansion of outputs leads to
unpredictable results that go far astray, because the overall
'program by prompt' method expands minor errors into huge ones in a
less predictable manner. And because the pivot points at which errors
of amount become errors of kind are unclear, some programmatic process
is most effective in constraining results: reforming them, rejecting a
step and trying again, and other similar methods.
Thus, the most successful attempts I have seen today use an LLM for
some steps and traditional programming between those steps. This mix,
and how to implement it, is fundamental to success in this space for
business purposes. A sketch of this step-and-check pattern follows.
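
Here is a minimal sketch of that pattern. Both llm_step and
looks_valid are hypothetical placeholders you would supply: the first
calls the model, the second is an ordinary deterministic test of its
output.

def constrained_step(llm_step, looks_valid, prompt: str, max_tries: int = 3) -> str:
    """Run one LLM step, keeping only outputs that pass a deterministic check."""
    for _ in range(max_tries):
        output = llm_step(prompt)
        if looks_valid(output):
            return output
        # Reform the request and try again rather than passing a bad result on.
        prompt += "\nYour previous answer was not in the required form. Try again."
    # Fail loudly so the error stays an error of amount, not an error of kind.
    raise ValueError(f"no acceptable output after {max_tries} attempts")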
Rules of thumb
Here are a few rules to thrive by in the LLM space:
- Determine requirements for trusting results: You really
need to think about a trust architecture here. What do you trust, for
what purpose, to what extent, over what time frame, and how do you
limit the harm by architecting that trust? This is one of the
interesting functions of how you put things together. In essence, the
mix of traditional and LLM programming uses the traditional to limit
harm and the LLM to enhance value.
- Verify form and format of results: The traditional methods
are very good at strictly constraining intermediate (and final)
results, but the LLMs should also be used in ways that produce
more predictable results that are more easily checked by those
traditional methods (see the sketch after this list).
- Keep the human in the loop (for now): At the end of the
day, if harm can come from the results, and the potential harm is
sufficient to warrant the cost, a human check of final (and sometimes
intermediate) results makes sense. How do you decide how much human to
put where? That is determined by your trust architecture.
- It's going to do weird things occasionally: Any time you
use a modern LLM, you will, at least occasionally, get a weird
result. Without the traditional programming checks or the human review
process, these results will produce some level of mayhem, particularly
when they hit the traditional systems, which are much more brittle
than the LLMs; the LLMs usually keep running regardless of how bad or
ridiculous the results become.
- You can trick it - and you can trick people too: A lot of
hubbub is made about jailbreaking LLMs, the security threats they
produce, and so-called hallucinations (which are no such thing). But
the same is really true of any system or person. In reality, for the
most part, in a properly architected system, a user trying to trick
the system will only trick themselves. In most sensible applications,
honest people are trying to do an honest job, using LLMs to make them
more efficient and effective. People make mistakes; so do LLMs, and so
do computers without LLMs. If you have a decent trust architecture,
you will manage these issues, limit harm to reasonable levels, and
have a system worthy of the trust placed in it.
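
To make the "verify form and format" and "human in the loop" rules
concrete, here is one more sketch. It assumes, purely for
illustration, that the LLM was asked to reply in JSON with three known
keys; anything that fails to parse or pass the checks goes to a human
review queue instead of onward into the more brittle traditional
systems.

import json

REQUIRED_KEYS = {"claim", "verdict", "rebuttal"}       # assumed reply schema
ALLOWED_VERDICTS = {"TRUE", "FALSE", "UNDETERMINED"}

def accept_or_escalate(raw_reply: str, review_queue: list) -> dict | None:
    """Return a validated record, or queue the raw reply for human review."""
    try:
        record = json.loads(raw_reply)
    except json.JSONDecodeError:
        review_queue.append(raw_reply)        # malformed: human in the loop
        return None
    if (not isinstance(record, dict)
            or set(record) != REQUIRED_KEYS
            or record.get("verdict") not in ALLOWED_VERDICTS):
        review_queue.append(raw_reply)        # weird result: escalate, don't trust
        return None
    return record                             # form and format verified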
Conclusions
LLMs are good for a lot of things, if properly architected in the
context of a trust architecture for the context of their use.
Or, as rewritten by ChatGPT: LLMs can be highly effective when
designed within a well-structured trust architecture tailored to their
intended use.
More information?
If you want more details, join us on our monthly free advisory
call, usually at 0900 Pacific time on the 1st Thursday of the month,
and we will be happy to answer any of your questions.
In summary
LLMs are good for lots of things, if you know how to use them well for those purposes.
Copyright(c)
Fred Cohen, 2025 - All Rights Reserved