AI coding assistants are here, but they’re not perfect. Discover the strengths & weaknesses of popular tools and the future of AI-powered coding.
Hey everyone, John here! Today we’re diving into something super interesting that’s shaking up the world of computer programming: AI coding assistants. Imagine you’re trying to build something complicated, like a giant LEGO castle, and you have a helper who can fetch bricks for you, suggest what piece to use next, or even build small sections on their own. That’s kind of what these AI tools aim to do for programmers!
Lila: “Oh, so they’re like super-smart helpers for people who write code, John?”
Exactly, Lila! The author of an article I just read, who has spent months working closely with these tools, describes them as feeling like “junior pair programmers.” Think of them as very bright and eager interns. They can be incredibly helpful, but they’re still learning, so they might get a bit sidetracked or need some careful guidance. He’s tried out six popular AI coding assistants and is generally impressed with the idea, but he also sees a lot of room for them to get even better.
Let’s take a peek at what he found out about each one, and what he thinks is still needed to make them truly amazing.
ChatGPT: The Generalist That Runs Out of Room
First up is OpenAI’s ChatGPT. Many developers start their AI coding journey here because it’s pretty good at understanding all sorts of requests you throw at it. If you’re using it on a Mac, you can even send it an open file you’re working on, and it can send back a list of suggested changes. The article calls this a ‘unified diff’ – a big improvement from the early days of lots of copying and pasting!
Lila: “A ‘unified diff,’ John? What’s that, exactly?”
Great question, Lila! Imagine you’ve written a story, and your friend reads it and suggests some changes. A unified diff is like a special report that only shows you the lines in your story that were added, removed, or changed by your friend. It highlights just the differences, so you don’t have to re-read the whole thing to see what’s new. It’s very handy for programmers to quickly see what the AI is suggesting.
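If you’d like to see one for yourself, here is a small sketch using Python’s built-in difflib module. The two versions of the story and the file names are made up for illustration, and this is not necessarily how ChatGPT produces its diffs. It only shows what the format looks like: removed lines start with a minus sign, added lines start with a plus sign, and unchanged lines start with a space.

```python
import difflib

# Two made-up versions of a short story: before and after a friend's edits.
original = [
    "Once upon a time, a dragon lived in a cave.",
    "The dragon was afraid of mice.",
    "The end.",
]
revised = [
    "Once upon a time, a dragon lived in a castle.",
    "The dragon was afraid of mice.",
    "It made friends with a knight.",
    "The end.",
]


def make_unified_diff(before, after):
    """Return the unified diff between two lists of lines."""
    return list(
        difflib.unified_diff(
            before,
            after,
            fromfile="story_before.txt",  # hypothetical file names
            tofile="story_after.txt",
            lineterm="",  # our lines have no trailing newlines
        )
    )


diff_lines = make_unified_diff(original, revised)
print("\n".join(diff_lines))
```

Notice that the unchanged sentence about the mice still appears, but only as context around the edits.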
But ChatGPT isn’t perfect for coding. If your coding project involves many different files, or if you’re working with a programming language that ChatGPT can’t actually run itself, you’re back to doing a lot of manual copy-pasting. Sometimes, if you give it a really long piece of code, it might just get stuck or run into what are called ‘token limits.’
Lila: “‘Token limits’? Is that like the AI running out of words it’s allowed to use for that task?”
That’s a perfect way to think about it, Lila! These AI models have a sort of ‘attention span’ or ‘working memory’ for each conversation or task, and this is measured in ‘tokens.’ Tokens are like small pieces of words or code. If your request is too long, or the conversation goes on for too many turns, it can go past that limit, and the model may get stuck, cut its answer short, or lose track of what came earlier. The author mentions there are now ‘plugins and extensions’ that are supposed to help with this, but he hasn’t had much luck with them yet.
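To make the idea concrete, here is a very rough sketch of counting tokens against a budget. Everything in it is an assumption for illustration: real models use their own tokenizers that split text differently, and the limit of 50 is a made-up number, not ChatGPT’s actual limit.

```python
import re


def rough_token_count(text):
    """Very rough estimate: each word and each punctuation mark is one token.

    Real tokenizers split text differently; this is only an illustration.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


def fits_in_budget(text, token_limit):
    """Check whether the text stays within a (hypothetical) token limit."""
    return rough_token_count(text) <= token_limit


HYPOTHETICAL_LIMIT = 50  # made-up number for the example

snippet = "def add(a, b): return a + b"
long_code = "\n".join([snippet] * 10)  # the same line repeated ten times

print(rough_token_count(snippet), fits_in_budget(snippet, HYPOTHETICAL_LIMIT))
print(rough_token_count(long_code), fits_in_budget(long_code, HYPOTHETICAL_LIMIT))
```

The short snippet fits comfortably, while the longer code goes over the budget, which is the situation the author runs into with big files.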
Lila: “And what are ‘plugins and extensions,’ John?”
Think of plugins and extensions as little add-on tools that give a main program extra powers or features. It’s like adding a special lens to your phone’s camera to take wide-angle shots, or installing an app that adds new emojis to your keyboard. For these AI assistants, plugins might help them connect to other software or handle bigger chunks of code.
So, for ChatGPT, the author sums it up like this:
- Strengths: It’s generally very smart and understands a wide variety of requests (what he calls ‘broad model quality’). It’s also good at showing you changes for a single file clearly (‘single-file diffs’).
- Limits: It doesn’t really understand the whole big picture of a complex coding project (no real ‘project context’). It can’t run code that’s outside its own system (‘external execution’), and it sometimes struggles with very large amounts of code or long instructions (‘occasional size limits’).
Lila: “What do you mean by ‘project context,’ John? Does it mean the AI doesn’t know what the whole software is supposed to do?”
Exactly! Imagine you’re building that LEGO castle we talked about. ChatGPT might be great at helping you build one perfect tower, but it doesn’t necessarily see the blueprints for the entire castle. So, the tower might be fantastic on its own, but it might not fit perfectly with the drawbridge or the walls you’re planning next.
GitHub Copilot: Inline Speed, Narrow Field of View
Next on the list is GitHub Copilot. The author says its standout feature – or ‘killer feature’ as he puts it – is how smoothly it suggests code right inside popular code-writing programs like Visual Studio and Visual Studio Code. He calls this ‘friction-free inline completion.’
Lila: “‘Inline completion’? Does that mean it tries to finish your sentences for you, but for computer code?”
You’ve nailed it, Lila! As a programmer types a comment explaining what code they want to write next, Copilot will often suggest a whole snippet of code to do the job, and the programmer can accept it just by pressing a key (like the Tab key). It’s like having a super-fast autocomplete that understands programming. And ‘inline’ just means the suggestion appears right there in the editor, at the spot where you’re currently typing. By the way, these code-writing programs are often called IDEs.
Lila: “IDEs? What’s an IDE?”
An IDE stands for Integrated Development Environment. Think of it as a programmer’s ultimate workshop. It’s a software application that bundles together all the essential tools a programmer needs: a special text editor for writing code, tools for testing the code, and tools for finding and fixing errors (called ‘debugging’). It’s like an all-in-one toolkit for software creation.
GitHub Copilot also has a chat feature that can help rewrite code across several files. However, the author feels it’s still best at making suggestions for just one file at a time. Trying to make big changes across many files in a project (what he calls ‘cross-file refactors’) or fundamental changes to how the whole program is structured (‘deep architectural changes’) can still be quite awkward with Copilot.
Lila: “So, ‘cross-file refactors’ means changing code in lots of different places all at once?”
Precisely! Imagine you wrote a big recipe book, and then you decide to change the name of a key ingredient that appears in twenty different recipes. Updating all those recipes accurately would be like a ‘cross-file refactor.’ It can be tricky for an AI to manage all those interconnected changes perfectly.
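Here is the recipe book idea as a tiny sketch. The recipe files and the rename_ingredient helper are made up for illustration, and this is plain text replacement, not how Copilot or any real refactoring tool works. It does show one reason the job is tricky: you have to change every ‘sugar’ without accidentally touching a word like ‘sugarcane.’

```python
import re

# A made-up "recipe book": file names mapped to their contents.
recipes = {
    "pancakes.txt": "Mix flour, sugar and milk.\nAdd more sugar to taste.",
    "cookies.txt": "Cream butter with sugar.\nUse sugarcane syrup as a glaze.",
    "salad.txt": "Toss lettuce with olive oil.",
}


def rename_ingredient(files, old, new):
    """Replace whole-word matches of `old` with `new` in every file.

    Returns the updated files and how many replacements each file received.
    """
    # \b marks a word boundary, so "sugar" does not match inside "sugarcane".
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    updated, counts = {}, {}
    for name, text in files.items():
        updated[name], counts[name] = pattern.subn(new, text)
    return updated, counts


updated, counts = rename_ingredient(recipes, "sugar", "honey")
for name, text in updated.items():
    print(f"{name} ({counts[name]} changes):\n{text}\n")
```

Real code has many more of these traps than a recipe book, because a name might be imported or reused with a different meaning in another file.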
The author thinks that while GitHub Copilot is good, it might be the tool of choice for developers who are just starting to experiment with AI-assisted development, rather than for those who rely on these tools heavily every single day.
So, for GitHub Copilot:
- Strengths: Its autocomplete feature is very smooth and works directly within the programmer’s main coding tools (‘native IDE support’).
- Limits: It struggles with making edits across many files simultaneously, and its understanding (‘context’) is mostly limited to whatever files the programmer has open at that moment.
Cursor: Inline Diff Done Right, at a Price
Alright, let’s talk about Cursor. The author believes this tool really demonstrates how powerful ‘inline diff review’ can be. You give it a prompt (an instruction), and it writes code, often making changes across dozens of files in your project.
Lila: “We talked about ‘diff’ with ChatGPT, showing the changes. So, ‘inline diff review’ means you can see those suggested changes right there in your code editor, John?”
Spot on again, Lila! Instead of getting a separate report, Cursor shows you the proposed changes directly within your code, line by line. This makes it super easy to see what the AI wants to do and then accept or reject those changes. The author thinks Cursor does this better than any other tool.
Now, Cursor is based on another very popular code editor called VS Code. The article mentions it’s a ‘VS Code fork.’
Lila: “A ‘fork’? Like a fork in the road, or the thing you eat with?”
More like a fork in the road! In the software world, a ‘fork’ happens when developers take the original source code of a piece of software (like VS Code) and start developing it independently as a new, separate version. So, Cursor started as a copy of VS Code, but now it’s evolved into its own distinct tool. It’s like taking a photocopy of a famous cookbook, and then you start adding your own recipes and notes into your photocopied version. The original cookbook might get new editions with new recipes, but your photocopied version won’t automatically get those updates.
One downside of this ‘fork’ approach is that Cursor loses some built-in features that VS Code has, like a specific tool for ‘C# debugging,’ due to ‘licensing issues.’
Lila: “‘Licensing issues’? And what’s ‘C# debugging’?”
Good questions! ‘Licensing issues’ refer to the legal rules and permissions about how software code can be used, shared, and modified. Because