I have little doubt…
That somewhere in this world, in these past months, an enterprise IT or digital marketing exec has been called to the corner office or a board room to explain ChatGPT and generative artificial intelligence and large language models.
If in that position – and knowing my audience would have no patience for tech talk – I might steer the discussion to these five points.
1. Yes, this is a big deal. Very real and advancing faster than we can imagine.
You may have seen Tom Friedman’s article in The New York Times. He terms it a “Promethean Moment” that in time will change nearly everything it touches, in ways we can’t yet know. In The Wall Street Journal, Henry Kissinger, Eric Schmidt, and Daniel Huttenlocher asserted that generative AI will “transform human cognitive process as it has not been shaken up since the invention of printing.” There’s every indication that this is a technology – like the internet, like smartphones, like social media – that we didn’t know about last year, won’t live without next year, and can’t yet know how it will change our world.
We can understand this as a combination of AI language recognition (natural language processing), advanced AI pattern synthesis (generative AI), and an enormous quantity of textual training data (a large language model).
It’s the last that makes the big difference.
A language model uses various statistical and probabilistic techniques to determine the probability of a sequence of words. It’s not a new idea. The first papers on the topic date back to 1948, and work that underlies today’s innovation is decades old.
What is new are language models that are Everest-levels of large. In this realm, capacity creates capability. More data leads to ever-higher precision in prediction.
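To make that concrete, here is a minimal, illustrative sketch in Python of the decades-old idea underneath: count which words tend to follow which, then use those counts to predict the next word. (The toy corpus is my own invention; today’s models learn billions of parameters rather than a lookup table, but the prediction task is the same.)

```python
from collections import Counter, defaultdict

# Toy corpus standing in for training data (real models train on hundreds
# of gigabytes of text, not three sentences).
corpus = "the customer is happy . the customer is loyal . the brand is trusted ."

# Count how often each word follows each preceding word (a bigram model).
follows = defaultdict(Counter)
words = corpus.split()
for prev, nxt in zip(words, words[1:]):
    follows[prev][nxt] += 1

def next_word_probs(prev):
    """Probability of each possible next word, given the previous word."""
    counts = follows[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

# After "is", the model assigns equal probability to what it has seen follow "is".
print(next_word_probs("is"))  # {'happy': 0.33..., 'loyal': 0.33..., 'trusted': 0.33...}
```

Scale that counting idea up by many orders of magnitude, and the predictions stop looking like statistics and start looking like fluency.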
Here’s a comparison, drawn recently by experts at Santa Clara University: all of Wikipedia as of late 2022 ran to roughly 57 million pages, requiring about 22 gigabytes of compressed storage. The GPT-3 language model – which is last year’s model – requires about 800 gigabytes. It was trained on text drawn from across the internet, including 570 gigabytes of data from books, web texts, Wikipedia, articles, and other writing. Some 300 billion words were fed into the system.
Very, very, very, very, very big.
When we give such a system a prompt – a properly phrased conversational request – it can respond with what might be described as creativity: it draws on content and styles of expression (remember, part of this is pattern recognition) from the available data and interactions with users. As such, it can develop blogs, write code, summarize spreadsheets, and turn reports into presentations. You can make it work by typing or talking. The information is not always accurate, but the results can be absolutely magical.
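For anyone curious what “giving a prompt” looks like in code, here is a minimal sketch against OpenAI’s Python client as it stood in early 2023. Treat the details (the model name, the message format, the placeholder key) as illustrative assumptions, since the interface keeps evolving.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; keep real keys out of source code

# A "prompt" is just a conversational request, sent as a message.
# (In real use, the spreadsheet contents would be pasted into the message.)
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user",
         "content": "Summarize our Q1 sales spreadsheet in three bullet points."},
    ],
)

print(response["choices"][0]["message"]["content"])
```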
2. No, there won't be one mondo, all-encompassing language model. There will be hundreds, thousands, and in time, millions of language models – large and less so.
In fact, we owe it to our shareholders to begin work – now – on our own brand-specific or domain-specific language models.
Two of our most precious assets are the equity of our brands, and the data, the information, that informs and sustains that equity. We invest a lot of resources to protect the truth of both. They’re behind firewalls for a reason.
A multi-purpose large language model contains a whole lot of everything as of a certain moment in time. It might include bits of our brand truth. It might include last year’s product data, but not this year’s. It might also include comments from someone who claims our logo is a sign of the devil.
For us, the question of a language model’s value is probably less about its size and much more about its ability to provide trustworthy responses to our customers, and trustworthy inputs to our workflows. Siri co-founder Adam Cheyer told an audience in February that the competitive differentiator in all this won’t be size, but results – who has unique, reliable data. Former Gillette Chief Marketing Officer Dick Cantwell noted recently that brands must seek “trustworthy conversations” that can only emerge from trustworthy data and transparent, accountable development of conversational AI messaging.
Yes, we could take slices of our public-facing data and share it with a large language model through what’s called a plug-in. A plug-in is a software component that adds a specific feature to an existing computer program. Several companies have already done so. Others, such as Walmart, don’t want to go there. The very smart folks at OpenAI – who now offer a plug-in for ChatGPT – tell us that sharing data with a large language model offers both “new opportunities as well as significant risks.” For now, we’re in a watch, learn, and test phase. We’re focused on the protection of our data assets.
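Mechanically, a plug-in of this kind is mostly a small web service that the language model is permitted to call. Below is a hypothetical, minimal sketch in Python; the endpoint, the product data, and the Flask framing are my illustrative assumptions, not OpenAI’s actual plug-in specification (which also requires a manifest and an API description).

```python
from flask import Flask, jsonify

app = Flask(__name__)

# A deliberately small slice of public-facing data. The point of the
# plug-in pattern: the model queries this live endpoint instead of
# relying on whatever (possibly stale) data it was trained on.
PRODUCTS = {
    "widget-100": {"price_usd": 19.99, "in_stock": True},
    "widget-200": {"price_usd": 29.99, "in_stock": False},
}

@app.route("/products/<sku>")
def get_product(sku):
    product = PRODUCTS.get(sku)
    if product is None:
        return jsonify({"error": "unknown sku"}), 404
    return jsonify(product)

if __name__ == "__main__":
    app.run(port=5000)
```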
What this means: we will see a proliferation of language models. We’ll see more large language models. We’ll see a rapid growth of brand and domain-specific language models. We’ll see domain- or brand-specific language models (with privacy and access protection) inside large language models. We may see federations of language models that grow to become large language models of industries or across brands (perhaps a health advisory model certified by leading research hospitals?) – all with the capabilities of ChatGPT. Culture warriors are right now developing large language models that reflect their version of the truth. No doubt pornographers will have theirs.
It will be a very, very diverse and complex future.
Wise conversational AI pros such as Brandon Kaplan of Journey and Raj Koneru of Kore.ai see that three general types of language models will emerge:
- A curated, brand-specific or domain-specific model. These will be controlled by the business or the brand, and reflect brand truth. They’ll be used for external and internal audiences. We’ll need these. Envision a pared-down large language model that we could run locally (apart from the cloud), keeping our data safe and protecting our business from potential outages at OpenAI or other LLM providers (see the sketch after this list). There are companies (our friend Veritone is one) that can quickly build domain-specific language models.
- A “tuned” large language model that safely integrates your company’s data. These are increasingly available.
- General-purpose large language models.
We’ll eventually use all three.
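To make the first type concrete, here is a minimal sketch of running a small language model locally with the open-source Hugging Face transformers library. The model named below, distilgpt2, is a small general-purpose stand-in; a real deployment would swap in a model tuned on our own brand data.

```python
from transformers import pipeline

# Load a small text-generation model that runs entirely on local hardware:
# no data leaves the building, and no cloud outage can take it down.
# "distilgpt2" is a stand-in for a brand- or domain-specific model.
generator = pipeline("text-generation", model="distilgpt2")

result = generator("Our customer service policy is", max_new_tokens=40)
print(result[0]["generated_text"])
```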
All this diversity and complexity is leading us to a vision of conversational AI that must work like the web. For the betterment of our company and brands, we want that vision to become a reality. Which is why we’re so invested in the work of the Open Voice Network, the non-profit, vendor-neutral Linux Foundation project. We just published the draft specification for what we describe as “Dialog Events” – it’s the first building block of what we’ll all need to build once and connect to everyone.
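To give a flavor of what “build once and connect to everyone” means in practice: an interoperable dialog event is, at heart, a shared envelope for one conversational turn. The sketch below is a hypothetical illustration in Python; the field names are my own invention for this post, not the published OVON draft specification.

```python
# A hypothetical dialog event: one conversational turn wrapped in a
# common envelope, so any compliant assistant can consume it.
# Field names here are illustrative, not the OVON draft specification.
dialog_event = {
    "id": "evt-0001",                      # unique event identifier
    "speaker": "user",                     # who produced the utterance
    "timestamp": "2023-04-01T09:30:00Z",   # when the turn occurred
    "utterance": {
        "text": "What are your store hours?",
        "language": "en-US",
    },
}

# Any assistant that understands the envelope can route or respond to it,
# regardless of which vendor built it.
print(dialog_event["utterance"]["text"])
```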
3. Yes, our legal, data security, and compliance officers are right to raise questions.
Whether this new generation of conversational AI is good or bad for our company and our society is up to the humans in this room, and the humans developing it.
A December 2021 study from researchers at the British AI research firm DeepMind (a subsidiary of Alphabet) and three prestigious universities identified six broad areas of concern with the use of large language models: 1) output that perpetuates stereotypes and ignores segments of the population; 2) output that includes private (including personally identifiable) or sensitive data; 3) output that is unintentionally false or misleading (imagine a healthcare situation); 4) output that is intentionally misleading, for purposes of political disinformation or fraud; 5) the use of human-like, ever-more personal conversational AI to extract personal information from or manipulate users; and 6) social, economic, and sustainability costs, including job loss.
(Regarding employment: a March 2023 paper from researchers at OpenAI, OpenResearch, and the University of Pennsylvania found that 80% of the U.S. workforce could have at least 10% of their work tasks affected by the introduction of large language models, and 19% of workers could see 50% of their tasks affected.)
More recently, U.S. ethicist Laura Miller noted eleven potential areas of ethical concern with the release of Google’s Bard large language model, ranging from the absence of attribution and the ease of consent to the visibility of privacy guidelines and the need for protections for underage users.
There are also questions about intellectual property – citation/source identification and copyright.
Thus far, most NLP/GenAI/LLM systems do not automatically provide citations or source identification – though Microsoft, for one, has shown how it can be done. And as to the question of ownership: it’s unclear who can copyright or claim ownership of AI-generated works. The requester of the work, or the AI system that produces it? Both? Neither?
Several recent articles point to current U.S. law that assigns copyright protection to work that is the result of original and creative authorship by a human author. It may be that an AI-created work is immediately in the public domain, or owned by – well, to be determined.
A very good path forward is provided by the Open Voice Network (OVON). Our TrustMark Initiative, led by OVON Senior Advisor Oita Coleman, provides a clear, concise summation of the core ethical principles that should guide our use of conversational AI – as well as employee education classes and self-assessment tools.
4. This is a big forward leap in the evolution of automation. For a business like ours, these new conversational AI technologies will lift productivity.
I don’t yet have any idea as to how all of us will behave differently in the age of conversational AI. But I’m pretty sure that we will.
We need to prepare – for now – for conversational AI to be part of our everyday, ever-more-productive life here in the company. Microsoft is now working to incorporate the magic of this into everyday Office and Teams workflows. It’s not as fanciful as a Shakespearean ode to the alma mater in Mandarin, but it is about doing our work in a better-faster-smarter way.
One of the best explanations of the real, day-to-day value of conversational AI / generative AI / large language models (and I can’t, for the moment, remember who said it) is that these technologies will dramatically accelerate the automation evolution.
Even so, this shouldn’t threaten our best creatives or programmers.
Yes, the tools of generative AI and large language models generate new content. But we must remember that the content is a mash-up – topic, data, style, cadence – of what has come before, and what has been captured and trained in the model.
Yes, conversational AI systems will be able to generate new content in seconds. A revised HR manual? Certainly. This quarter’s five-slide managerial review deck? Sure. The film script for a profitably repetitive franchise extension, something like Revenge in Spandex: Superheroes Go to War VI? Why not?
But for something truly original, something based on a deep and unique perspective of a client or business pain point?
Not yet.
5. Voice – the spoken interface – is still important. In fact, it will be even more important than before.
Up until late November 2022, the term conversational AI had largely been synonymous with voicebots such as Amazon Alexa and Apple Siri. Recently, those breakthrough voice assistants have been labeled as “dumb as a rock” and “too awkward.”
Still, a lot of people are using those voice assistants, whether in the home through a smart speaker, or via a smartphone. In fact, according to the 2022 version of Vixen Labs’ study of consumer behavior, nearly two-thirds of U.S. adults use a voice assistant – and more importantly, some 38% of U.S. adults do so daily. That’s a cohort of more than 98 million individuals.
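A quick sanity check on that headcount, assuming roughly 258 million U.S. adults (the approximate 2022 figure; the population estimate is my assumption, not Vixen Labs’):

0.38 × 258 million ≈ 98 million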
Yes, those of you who have played with ChatGPT have done so by typing. But that is changing right now. Users will demand the natural, inclusive interface of voice; faster-smarter-better at home and in the workplace will demand nothing less.
Siri co-founder Cheyer sees the potential for a voice assistant renaissance. “I do think it is about quality,” he told the Financial Times in March 2023. “Fundamentally, this technology will enable that breadth and flexibility and complexity that has not existed with the previous generation of voice assistance.”
So now what?
Let’s talk. About your brand. Your brand’s future.
Let’s talk about trust.