Context
After several months away from VS Code and GitHub Copilot, I decided to revisit them to see how much the experience had evolved. This time, I used GitHub Copilot Pro inside VS Code, paired with Claude Sonnet 4.5. I also experimented with GitHub's new code review assistant on the web, and I plan to try Copilot's CLI next.
My goal was to see how well an agentic LLM could implement a complete feature in a real Django project, with me acting only as a reviewer.
Experiment Setup
I picked up an old Python project, Finished Games, a Django-based CMS for cataloguing videogames. It's no longer online, but I can spin it up with Docker, update and export data, and display it through another small app, FG Viewer.
The new feature: add support for gameplay time, pulling information from the local database file of the GOG Galaxy application.
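To give a rough idea of what "pulling information from a local database file" can look like, here is a minimal sketch using Python's standard sqlite3 module. It assumes the local file is an SQLite database. The table name game_times and its columns release_key and minutes_played are hypothetical placeholders and not the actual GOG Galaxy schema, and an in-memory database stands in for the real file so the example runs on its own.

```python
import sqlite3


def read_playtimes(connection: sqlite3.Connection) -> dict[str, int]:
    """Return {release_key: minutes_played} for games with recorded playtime.

    The table and column names are hypothetical, chosen for illustration only.
    """
    rows = connection.execute(
        "SELECT release_key, minutes_played "
        "FROM game_times "
        "WHERE minutes_played > 0"
    )
    return {release_key: minutes for release_key, minutes in rows}


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory database that stands in for the real local file."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE game_times ("
        "release_key TEXT PRIMARY KEY, "
        "minutes_played INTEGER NOT NULL)"
    )
    connection.executemany(
        "INSERT INTO game_times (release_key, minutes_played) VALUES (?, ?)",
        [("gog_123", 95), ("steam_456", 0), ("gog_789", 1240)],
    )
    connection.commit()
    return connection


demo_connection = build_demo_database()
playtimes = read_playtimes(demo_connection)
print(playtimes)
```

Against a real file, the in-memory connection would be swapped for sqlite3.connect() pointing at the database path, ideally working on a copy so the application's own file is never modified.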
I guided Copilot through the process, avoiding direct coding except for minor renames or reordering.
The Project and Task
The project isn't huge, but it's not trivial either. It uses an outdated Django version, and I gave Copilot deliberately vague instructions, often just pointing to a single file or template. From there, it needed to trace dependencies through views, models, and template tags.
Despite the vague prompts, Copilot handled the reasoning chain well. It backtracked correctly from HTML templates to models and applied consistent updates across related files. Even when dealing with legacy and end-of-life code, it avoided incompatible patterns.
Observations
The short version: Copilot performed impressively well.
Across the entire workflow, I faced only one error while reapplying a change, and it looked like a diff issue, not an LLM hallucination. Every other suggestion was valid and aligned with my intent.
The pull request reviewer was especially useful; it flagged potential security issues, suggested ORM optimizations, and even identified “to-dos” I had left intentionally, treating them as possible code smells. It didn't just spot syntax issues; it offered meaningful feedback.
Copilot also launched a Docker shell automatically to run Django migrations (under my supervision). It took some time to confirm task completion, but the interaction between chat and terminal was smooth and reliable.
In the end, I spent far more time curating data mappings than coding or reviewing.
The agent handled the implementation work with minimal intervention. Apart from the improvement suggestions I chose to make along the way, the whole process could have been fully automated.
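As an illustration of why the data mappings took the most time, here is a small sketch of the kind of lookup involved: matching external release keys to catalogue entries and collecting whatever cannot be matched for manual curation. The function name apply_mapping, the key format, and the integer game ids are all hypothetical and are not taken from the Finished Games code.

```python
def apply_mapping(
    playtimes: dict[str, int],
    key_to_game_id: dict[str, int],
) -> tuple[dict[int, int], list[str]]:
    """Aggregate playtime per catalogue game and list unmatched external keys.

    playtimes maps an external release key to minutes played.
    key_to_game_id is the hand-curated mapping to catalogue ids (hypothetical).
    """
    matched: dict[int, int] = {}
    unmatched: list[str] = []
    for release_key, minutes in playtimes.items():
        game_id = key_to_game_id.get(release_key)
        if game_id is None:
            # No curated entry yet: keep the key so it can be reviewed by hand.
            unmatched.append(release_key)
        else:
            # Several releases of the same game add up to one total.
            matched[game_id] = matched.get(game_id, 0) + minutes
    return matched, sorted(unmatched)


matched, unmatched = apply_mapping(
    {"gog_1": 60, "steam_2": 30, "gog_3": 15},
    {"gog_1": 10, "steam_2": 10},
)
print(matched, unmatched)
```

The unmatched list is the part that needs human attention, which fits the observation that curation, not coding, dominated the effort.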
Results
This small experiment showed clear progress in Copilot's ability to understand real project context and act autonomously within it.
You can check the related PRs here: I, II, and this commit.