Kris van Rens looks at the future of systems development and at why developer happiness is an important aspect of software engineering
Artificial intelligence in general and large language models (LLMs) in particular are undeniably changing how we work and write code. Especially for learning, explaining, refactoring, documenting and reviewing code, they turn out to be extremely useful.
For me, however, having a chat-style LLM generate production-grade code is still a mixed bag. The carefully engineered prompt for a complex, constrained task is often orders of magnitude larger than the resulting code, making me question the productivity gains. Sometimes, I find myself iteratively fighting the prompt to get the right code out, only to discover that the model casually forgot to implement one of my earlier requirements. At other times, LLMs generate code featuring invalid constructs: they hallucinate answers, invariably with great confidence. What’s more, given the way LLMs work, the answers can differ completely every time you enter a similar query, or at least depend heavily on the exact prompt.
OpenAI co-founder Andrej Karpathy put it well: “In some sense, hallucination is all LLMs do. They’re dream machines.” This seemingly ‘black magic’ behavior of LLMs sits slightly at odds with my inner tech-driven urge to follow a deterministic process. It might be my utter incompetence at prompt engineering, but from where I’m standing, despite the power of generative AI at our fingertips, we still absolutely need to understand what we’re doing rather than blindly trust the correctness of the code generated by these dream machines. The weird vibe-induced feel and idiosyncrasy of LLMs will probably wear off in the future, but I still like to truly understand the code I produce and am responsible for.
AI in general will probably enable an abstraction shift in future software development, allowing us to design at a higher level of abstraction than we often do today. This might, in turn, diminish the need to write code manually. Yet I fail to see how using generated code in production is going to work well without the correctness guarantees of rigorous testing and formal verification – and that isn’t the reality today.
''An aspect of software engineering where LLMs can make an overall positive difference is interpreting compiler feedback.''
Positive difference
Another application area of LLMs is in-line code completion in an editor or IDE. Even this isn’t an outright success for me. More than once, I’ve been overwhelmed by an LLM-based code completer suggesting a multiline solution for what it thinks I wanted to type. Then, instead of implementing the code idea straight from my imagination, I find myself reading a blob of generated suggestion code, questioning what it does and why. These completions are hit-and-miss and often lead me astray. I’ve been experimenting with embedded development for microcontroller units lately and have found that especially in this context, the LLM-based completion just takes guesses, sometimes even making up non-existent general-purpose IO (GPIO) pin numbers as it goes. I do like the combination of code-completion LLMs with an AI model that predicts editor movements for refactoring: refactors are often batches of similar small operations that such models can forecast well.
An aspect of software engineering where LLMs can make an overall positive difference is interpreting compiler feedback. C++, for example, is notorious for its hard-to-read and often very long compiler errors. The arrival of concepts in C++20 was supposed to bring a drastic improvement here, but I haven’t seen it happen. Perhaps this is still a work in progress, but until then, we’re forced to deal with complex and often long error messages, sometimes even hundreds of lines in length. Because of their ability to interpret and summarize compiler messages, combined with their educational and generative features, LLMs with a large context window are well suited to processing such feedback, making them a great companion tool for C++ developers. And there’s an enormous body of existing C++ code and documentation to learn from, which is a good basis for training an LLM.
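To illustrate what concepts aim to deliver, here’s a minimal sketch; the twice function is a hypothetical example, not taken from any real codebase. With the std::integral constraint in place, calling the template with an unsupported type produces a short ‘constraints not satisfied’ diagnostic at the call site instead of a lengthy instantiation backtrace:

#include <concepts>

// A function template constrained with a C++20 standard-library concept.
template <std::integral T>
T twice(T value) {
    return value + value;
}

int main() {
    twice(21);     // OK: int satisfies std::integral.
    // twice(1.5); // Error: double does not satisfy std::integral;
                   // the compiler reports the failed constraint directly.
}

In practice, though, diagnostics for deeply nested template code remain verbose, which is exactly where an LLM’s ability to summarize helps.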
Other drawbacks of C++ are the ever-increasing language complexity and the compiler’s tendency to fight you rather than help you. Effective use of LLMs to combat these issues might well save the language in the short term: C++ language evolution is slow, but the potential of tooling is tremendous. Given the sheer amount of existing C++ code in use today, the language is here to stay, and any tool that helps developers work with it is welcome.
''To me, writing code is a highly creative, educational and enjoyable activity.''
Developer happiness
Using LLMs for code generation also takes away part of the joy of programming. To me, writing code is a highly creative, educational and enjoyable activity that hones my skills in the process; having a magic box do the work for me kills this experience to some extent – even manually writing the boring bits and the tests has educational value.
Ger Cloudt, a fellow educator in the software development space, asserts in his work on software quality that organizational quality – of which developer happiness is a part – is half the story. According to him, organizational quality is key because it enables design, code and product quality. Sure, clean code and architecture are important, but without the right tools, mindset, culture, education and so on, the development process will eventually grind to a halt.
LLMs undoubtedly help in the tools and education department, but there’s more to programming than just producing code like a robot. Part of the craft of software engineering – as with any craft – is experiencing joy and pride in your work and the results you produce. Consider me weird, but it can bring me immense satisfaction to create beautiful-looking code with my own two hands.