---
layout: default # Or a specific 'post' layout if you create one
title: "Language Models and AI: June 2025 Edit"
date: 2025-05-31 19:00:00 +0100 # Optional: time and timezone
categories: thoughts # Optional: categorise your post
---
# Language models and AI: June 2025 Edit
Let me start by saying I am not a massive fan of the term AI. It used to be an okay abbreviation for the scientific discipline of Artificial Intelligence (indeed, a previous director of mine has a PhD in AI from thirty years ago), but it has since evolved into a vague buzzword that doesn’t actually mean much. AI is also not some future end goal or state… there are many applications and implementations of AI already out there in use. Content recommendation algorithms may be the version most people encounter day to day, but “AI” is also finding uses in weather prediction, process optimisation, drug discovery and medical diagnosis.

And then, of course, there’s ChatGPT. ChatGPT is an example of a Large Language Model (LLM). Since ChatGPT (and its creator, OpenAI) was the first to release such an impressive chatbot with widescale adoption, it remains the most popular product and application of AI ever. The name ChatGPT may almost be at risk of being genericised! Whilst they were the first to reach the mass market, they are not the only player in this space. Google’s Gemini (my personal tool of choice), Anthropic’s Claude, xAI’s Grok, and myriad others are now out there and available to use.

You will have noticed that these have been integrated everywhere… in all the enterprise tools I use at work, WhatsApp, even Google Search itself. It’s a bit overkill, if you ask me. But I don’t see this trend slowing down anytime soon. AI is the flavour of the decade so far, just like Blockchain and Quantum Computing before it.
## What is a Language Model?
In essence, an LLM is a word prediction machine. You feed it a sample of text, and it will predict and generate the most likely next word in the chain. How it does this is incredibly complex and computationally intensive, requiring the pretraining of hundreds of billions of parameters, working in high-dimensional vector spaces and multiplying together massive matrices inside objects called Transformers. (That’s the T in the GPT of ChatGPT, by the way: Generative Pre-trained Transformer.) This process is out of scope here, but I strongly encourage you to check out 3Blue1Brown’s amazing series on deep learning, neural networks and language models. Grant’s work is some of the highest quality on the internet in my opinion, and it’s all free. He has even open-sourced his own animation engine for the visuals. Something I want to play around with more in the future…
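To make the “next word prediction” idea a little more concrete, here is a minimal sketch using the small, open GPT-2 model via the Hugging Face transformers library. It is purely illustrative (the prompt and the choice of model are mine, and the big chatbots obviously do a lot more on top), but it shows the core mechanic: the model assigns a probability to every possible next token.

```python
# A minimal sketch of next-token prediction with a small open model (GPT-2).
# Assumes the transformers and torch packages are installed; prompt and
# model choice are illustrative, not what any commercial chatbot uses.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The best thing about language models is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# The scores at the final position are the model's guesses for the next token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id)!r:>12}  p={prob:.3f}")
```

A chatbot is essentially this loop run repeatedly: predict a likely next token, append it to the text, and go again, with much bigger networks and chat-style fine-tuning layered on top.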
## How do I use LLMs?
Let me start by stating that I am not an expert, nor a power user of these models. I do pay for Gemini Pro, which is less than £20 per month and gets me access to Gemini 2.5 Pro, which as of writing sits at or near the top of most leaderboards for general-purpose LLMs. I have personally hitched my wagon to the Google horse given their immense resources, legacy of AI research, and integration with the other Google services I frequently use. That’s not to say that it’s currently the best for everything, but it’s a great general-purpose model for someone who cannot justify paying for multiple AI subscriptions.

Again, there is so much I’ve not really experimented with… fully AI-driven software development (“vibe coding”), image and video generation (more on that below), and AI agents and workflows. However, I do use Gemini on an almost daily basis, for a myriad of tasks in my personal and professional life.

A note of caution: my company has provided us with enterprise licences of Gemini, meaning our chats are not saved and used to further train the models. The same cannot be said for most free/personal versions of these models, so I caution against sharing highly personal and/or confidential information. An exception to this warning is models that you host and run locally, which can therefore be completely disconnected from the internet. However, given the sheer size and resources required to run the largest models, this is not an option for most people. I will talk more about this below.

In my work and personal life, I find LLMs incredibly useful for summarisation, translation, image-to-text descriptions, developing questions/topics where I’m out of the loop, coding help, legal text interpretation, cross-referencing documents, finding synonyms, rhymes and alliterations for captions, and ELI5s. I’m sure there are some things I’ve missed out here.
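For what it’s worth, many of those tasks can also be scripted rather than typed into a chat window. Below is a rough sketch of a summarisation call using Google’s generative AI Python library; the model name, environment variable and prompt are my own assumptions, so treat it as illustrative rather than gospel.

```python
# A hedged sketch of calling Gemini programmatically for a summarisation task.
# Assumes the google-generativeai package is installed and an API key is set
# in the environment; the model name is an assumption and may need updating.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])  # your own API key

model = genai.GenerativeModel("gemini-1.5-pro")  # swap in whichever model you have access to
document = "Paste the text you want summarised here."

response = model.generate_content(
    f"Summarise the following in three bullet points:\n\n{document}"
)
print(response.text)
```

The same privacy caveat applies here as with the chat apps: anything you send through a hosted API leaves your machine, so keep confidential material out of it unless your terms of service say otherwise.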
Some of the things people have built using these models are AMAZING, and I believe these models are here to stay and will become a commodity. I don’t think you need to learn how they work deep down, but for desk-based “knowledge” jobs, knowing when to use them is going to be critical. I personally think they will not replace a massive number of jobs yet, but they can certainly increase productivity and output.
## Open Source AI Models
Most of the models commonly talked about are commercial, proprietary or closed-source. This means that a private company has created them, has perhaps commercialised them, and doesn’t want to share too much special sauce with others. However, much like with traditional software applications, there is a thriving community of open source and source-available language models which can be viewed, modified and downloaded for personal (and even commercial) use. HuggingFace has become the platform of choice for open source machine learning models. There, you can explore, interact with, and test drive models before downloading and running them.

I personally have only tried downloading one LLM for local use: a small version of Meta’s Llama 3.1 model, using Ollama. Even this was 4 GB, and it performs not too badly given its size and the laptop I’m running it on. I initially experimented with it so that I could enter more personal and confidential information (since the model runs locally, is not connected to the internet, and is not hosted by some company). As I was writing this section, I decided to download DeepSeek’s 8B-parameter “thinking” model (~5.5 GB), again using Ollama, and wow, this is impressive. Lots more experimentation to do here, but DeepSeek’s work is genuinely impressive.

Earlier in 2025, Chinese AI startup DeepSeek made some serious waves in the LLM world with their R1 reasoning model, which performs almost on par with the leading commercial Western models. Why the hype? This model is drastically less resource intensive and is open source with an incredibly permissive MIT license (read: you can basically do whatever you want with it, for whatever purpose, including commercial use). Now, there are genuine concerns that have been raised over the filtering and training of a Chinese-developed model (censorship perhaps front of mind for most), however this does not appear to be a problem in my downloaded “distilled” version. It’s genuinely exciting to me to see widely available, open source LLMs go toe-to-toe with commercial ones from tech giants. Who knows how the future in this space will unfold.
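If you want to try this yourself, the workflow really is as simple as installing Ollama, pulling a model, and talking to it. The snippet below uses Ollama’s Python client; the exact model tag is an assumption based on how Ollama names the DeepSeek distills, so check what is actually available before copying it blindly.

```python
# A minimal sketch of chatting with a locally hosted model through Ollama.
# Assumes the Ollama server is running and the model has already been pulled,
# e.g. with `ollama pull deepseek-r1:8b` (this tag is an assumption and may differ).
import ollama

response = ollama.chat(
    model="deepseek-r1:8b",
    messages=[
        {"role": "user", "content": "In two sentences, why might someone run an LLM locally?"},
    ],
)

# Nothing here leaves your machine: the model, the prompt and the answer all stay local.
print(response["message"]["content"])
```

That locality is the whole appeal for me: the privacy caveats from earlier simply don’t apply when the model never touches the internet.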