The English language ChatGPT issue

They say you learn something new every day… but it appears that’s not the case with ChatGPT! I hadn’t even thought of this issue until coming across this article recently: apparently the AI phenomenon is replicating the way English dominates language around the globe – and that may mean other languages and cultures are increasingly sidelined in the contribution they make to the development of ChatGPT.

How does ChatGPT work?

Let’s begin by making it very clear that ChatGPT doesn’t understand the meaning of words, or language as such. It generates responses by sifting through patterns in the text it has been trained on, narrowing these down to a response that suits the context of a query. To do this it relies on context clues, stylistic structures, writing forms, linguistic patterns and word frequency.

What this means in reality is that ChatGPT maintains the dominant modes of writing and language – while less common ones are sidelined.
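To see why frequency alone favours dominant phrasing, here is a deliberately tiny sketch. It is not how ChatGPT is built (ChatGPT is a far larger neural network model); it is a toy word-pair counter, and the corpus and function names are invented purely for illustration. It shows only the general principle that a system driven by word frequency will keep picking the most common pattern.

```python
from collections import Counter, defaultdict


def train_bigram_counts(corpus):
    """Count how often each word follows another across the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Invented toy corpus: one phrasing is common, the other is rare.
corpus = [
    "please find attached the report",
    "please find attached the invoice",
    "please find attached the minutes",
    "please find enclosed the report",
]

counts = train_bigram_counts(corpus)
print(most_likely_next(counts, "find"))  # attached
```

The rarer wording (“enclosed”) is in the data, but it never wins: the most frequent pattern is chosen every time, which is the sidelining effect described above in miniature.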

The dominance of English in ChatGPT

We all know that English is the dominant language in the global business world.

And it seems that ChatGPT is following in the same footsteps. The languages most widely used globally, such as English, Spanish and Mandarin, account for the majority of the data in ChatGPT (90%), while less widespread ones, such as Finnish, Greek or Norwegian in Europe, have little influence on the dataset ChatGPT uses to generate its content.

Other limitations of ChatGPT

Despite the huge advances made in AI over recent years, some limitations still remain – and whether they will ever be resolved is anyone’s guess.

  • Thanks to social media, the English language is evolving faster than ever as it absorbs new words from other languages. Interestingly, though, the dataset ChatGPT uses to generate content is capped at 2021, so any words that have emerged or been added to the dictionary since then will not be included. Think about how quickly we absorb new words into our daily communication – the Love Island effect, for example, where occupants of the villa use a new word extensively and it filters through to our own communication in the outside world – and you wonder how quickly ChatGPT will fall behind the times…
  • AI simply cannot reproduce the delicate and complex pragmatic meanings that humans instinctively convey through their language use.
  • There’s even the worrying assertion that ChatGPT lies more readily in languages other than English. When asked to generate articles in English and in Chinese based on inaccurate information, it did so in Chinese every single time – but in English only 1 out of 7 times. That raises the question: can you trust the output produced by ChatGPT in a language other than English?

One thing is clear: when you ask ChatGPT to generate content for you, you should always query where that answer came from and whether the data it is based on is itself trustworthy. And when it comes to producing quality transcription, AI certainly isn’t the answer...

Contact Fiona Shipley for all your transcription needs. Call us on 01737 852 225 or email alex@fionashipley.com.
