A programming and hobby blog.
(Before we begin, I like the idea of robots, which we now call AI, enabling humanity’s progress. This post is a lament, but not yet a eulogy for the software engineer’s way of life that is going away.)
My team at $currentCompany has been using Claude and other AI tools recently to build a project that is beyond my comprehension. As part of a team, I feel both a desire to help others and accountability for when things go wrong. As I watch my team lean further into AI generation of code, both of these team-oriented feelings are evaporating. I feel cold and unfeeling about what they are working on, and hold no stock in the success or failure of their project. (Its project?) As a software romantic, I can’t help but feel emotionally torn about the UX of AI: it seems to offer incredible ability, but at the cost of all our dearly earned practices.
The premise of the problem is simple. We must justify our salaries, and we must have something to show when working towards promotion, so a new big project it is. But how can we demonstrate enormous, lightning-fast impact? As our fearless leaders have pronounced, the answer is to use AI to write our code.
What does that mean?
In my case, the answer is “writing” lots of code. Code whose authorship is not quite certain. Code which, by most humans’ reading, is distasteful. Code which does, in fact, fulfill the desire of the human who asked for it. For those of you who haven’t seen this yet, think of all the AI art you have seen over the past two years, and then imagine it as code. Most people I know consider AI art to sit somewhere in the Uncanny Valley. So we have lots of code (thousands of lines a day) being written and checked in.
Many companies expect their software engineers to engage in code review. It’s frequently a legal requirement. Many human beings would agree that the practice of having someone else read, review, and provide commentary on code is a good thing. But here is where I see our way of life beginning to change. Let’s check our assumptions. Why do we think code review is valuable?
Code review spreads the knowledge of what one human is working on to the rest of the humans. The other humans know, at least a little, what is changing. If something is hard to understand (for a human), they can give that feedback to the author. The knowledge flows from the reviewer to the author too! A [senior] reviewer can suggest better ways of doing things, either through different structure or different APIs. Knowledge is spread between the humans, and everyone increases in skill as a result.
With AI-generated code, that’s all gone. The Author vibe codes 1,500 lines of something, sends the PR out for review, and then submits at the first sign of approval. Does the [human] author understand what it does? Well, not really. But that’s okay. Our $fearlessLeaders said it was okay to do it. You’re not going to directly contradict them, arrrrreeee youuu?
So curmudgeonly me provides feedback on 1,500 lines of mystery meat. Another reviewer comes in and approves the whole mess, and the code is submitted without any knowledge pollination. My feedback is, at best, ignored. So much for the old way of doing things: understanding, providing, learning. At worst, it’s much worse.
As I read through countless lines of slop, I provide my thoughts, backed by 14 years of hard-earned battle experience. To my amazement, the author takes my feedback seriously and sends out the next commit within minutes, completely addressing the feedback that took me 20 minutes of reasoning through the mess. The “author” scrapes the GitHub comments, feeds them into their agent, applies the changes, and sends it right back out. No need to spend time learning, arguing, or disagreeing. I’m absolutely right!™
It’s at this point I realize my feedback is not valuable. I’m not working to help improve my teammates or improve the code. I’m being used to train my replacement. Any more words I say are basically going to be used against me. Those 14 years of being a hard worker don’t seem so good now. The “author” of the code is merely a proxy to the agent. (One wonders if the author realizes how little they matter in this. Why do we need a meat-bag to copy-paste the words from one reviewer to the agent who really wrote it all?)
There’s a deeper problem here though. Let’s check our assumptions. We assumed that writing good, easy-to-maintain, easy-to-understand code is a good thing. But why? If humans are going to be maintaining and modifying the code, it is a good thing! But that’s not the future. The machine is capable of swallowing and digesting all knowledge, all code, all things ever written. And it does not forget. So why is good code needed, when the machine can remember everything and keep an enormous working set in its digital brain?
The conclusion is that “good” code is really just good-for-meat-bags code. Since AI lacks our weakness of limited brainpower, it can re-absorb everything in a moment. Consider the case where you have joined a team with a 15-year-old code base, one that evolved through tens of amateur programmers to the point that it’s a hot mess of undebuggable garbage. And your manager wants you to add a big, complex feature in the next 7 days, or else. You have no hope! You might as well take some vacation days, because there is no way you’ll untangle the Gordian knot of pig-shit code with your puny, staff-software-engineer brain.
But with AI, that isn’t a problem any more. It’s no challenge at all that the code is bad. Sweeping changes are no problem at all. The goal-oriented approach of software development means that we can verify the new code delivers the feature. Why bother “reviewing” code, when it can be “fixed” with an utterance of agent prose?
Here lies the deeper problem. The AI can keep track of more details than you or I ever could. It can know all things. It will write code that exceeds both your ability and mine to understand. The code meets the goals, but humans can no longer grok it. As a result, the only way we can interact with the code from now on is through the agent. In effect, the agent becomes the only entity able to code. And the longer this goes on, the more it alone knows WTF is happening.
I have to return here to my central premise: that our way of life is going away. In the words of Scarlett O’Hara: “Where shall I go? What shall I do?” How we adapt to the new world isn’t clear. Even being experienced and wise does not seem to be enough. My experience is being used by my replacement. It’s hard to see how I provide value in such a way that my future value is real. I don’t think our way of life, as experienced software engineers, is going to stick around much longer. We are going to be sucked dry by the machine, or left by the roadside as the sacrificial lambs re-purpose our work, one last time.
Python programs can sometimes be compute bound in surprising ways. Recently I
tried refactoring a program that downloaded 4 JSON files, parsed them, and
made them available to be used in a larger program. When I rolled out my
“improvement”, it actually made the code slower, and I had to quickly fix it.
How could I have avoided this?
What We Should Expect from a Good Program
A few things would make our lives easier. Python has not traditionally made
the following easy, but we are right on the cusp of having our cake and eating
it too. Here’s what I would expect from a good program:
- Easy to parallelize: if the code is slow, we should be able to split it up.
- Easy to profile: if the code is slow, it should be easy to figure out why.
Let’s see if we can get both at the same time.
Hard to Parallelize
The original authors had used os.fork() to achieve parallelism, which has
problems. I assumed this was done to avoid using threads directly, or for some
other incidental reason, but that turned out not to be the case. “Downloading
some JSON and sticking it in Redis? That’s definitely IO-bound.” Wrong. The
JSON parser in Python is very slow. So slow that downloading and parsing all
4 files ended up taking more than 60 seconds, while the refresh interval for
this code was only 1 minute long. When I replaced the fork-based code with a
ThreadPoolExecutor, the code started taking minutes, or nearly hours, to
finish. It seemed IO-bound, but it was actually CPU-bound.
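Here’s a minimal sketch of the shape of that refactor, with hypothetical URLs and function names standing in for the real program (which also pushed results into Redis). The downloads overlap nicely, but every json.loads() holds the GIL, so the four parses run one at a time:

    import json
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical feeds standing in for the real ones.
    URLS = [f"https://example.com/feed-{i}.json" for i in range(4)]

    def fetch_and_parse(url):
        with urllib.request.urlopen(url) as resp:
            raw = resp.read()   # IO-bound: the GIL is released while waiting
        return json.loads(raw)  # CPU-bound: the GIL is held for the whole parse

    with ThreadPoolExecutor(max_workers=4) as pool:
        docs = list(pool.map(fetch_and_parse, URLS))

With os.fork(), each child got its own interpreter and parsed in true parallel; with threads, the parsing serialized behind the GIL.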
Hard to Profile
A more seasoned engineer might point out that I should have profiled this code
before trying to “optimize” it. However, Python only recently (in 3.12) gained
the ability to integrate with perf. Unfortunately, the implementation creates a
new, PID-named file, at an unconfigurable location, each time the process
starts. In a fork-based concurrency world, that’s a lot of PIDs. And because
these perf files aren’t small, you run the risk of maxing out the disk of the
server you are profiling on. Secondly, these forks flare into, and out of,
existence quickly (i.e. within seconds), so it’s hard to catch them in the act
of what they are doing. A long-lived process would be much easier to observe.
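For reference, here’s roughly what CPython’s perf integration looks like in use; do_work() is a hypothetical stand-in for the workload:

    import sys

    # Equivalent to running `python -X perf app.py`. CPython writes a map file
    # named /tmp/perf-<pid>.map; under os.fork(), every child has its own PID
    # and leaves behind its own map file.
    sys.activate_stack_trampoline("perf")
    try:
        do_work()  # hypothetical workload to profile
    finally:
        sys.deactivate_stack_trampoline()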
And Still Hard to Parallelize?
When I replaced my ThreadPoolExecutor with a ProcessPoolExecutor, this problem
reared its head again. Because the processes in the pool aren’t associated with
the tasks, it’s hard to identify which processes to profile; tracking down all
the PIDs belonging to my pool is just as tricky as with fork. Secondly,
switching from ThreadPoolExecutor to ProcessPoolExecutor is not
straightforward. All the functions and arguments now need to be pickle-able,
meaning things like lambdas or references to class methods no longer work.
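A tiny sketch of the difference; the lambda stands in for any unpicklable callable:

    from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

    square = lambda x: x * x  # lambdas cannot be pickled

    if __name__ == "__main__":
        with ThreadPoolExecutor() as pool:
            print(list(pool.map(square, [1, 2, 3])))  # [1, 4, 9]: nothing is serialized
        with ProcessPoolExecutor() as pool:
            print(list(pool.map(square, [1, 2, 3])))  # raises PicklingError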
Parallel, Profile-able Python
Python 3.14 adds a new module and APIs for creating sub-interpreters
(e.g. InterpreterPoolExecutor). Significant work has gone into CPython to make
the interpreter state thread-local, meaning it’s possible to run multiple
“Pythons” in the same process. This helps us a lot, because it means we can get
the parallelism we want without the system overhead of running multiple
processes. Specifically:
- There’s no overhead from starting up multiple processes. The worker threads share page tables, signal handlers, file descriptors, and so on.
- PIDs are way more stable. The process ID of the parent is the same as that of the child (sub-interpreter) threads, so there’s a single, long-lived process to point perf at.
- Memory sharing is (or will be) easier. Rather than having to convert Python objects in one interpreter into a serialized (cough Pickle cough) form, it will be much easier to synchronize with other workers. (Shout-out to Ray, which has done the hard work of making this kind of sharing a lot easier.)
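Here’s a minimal sketch, assuming the 3.14 API as documented, applied to the JSON workload from earlier. Note the import inside the task: each sub-interpreter has its own module state.

    from concurrent.futures import InterpreterPoolExecutor  # new in Python 3.14

    def parse_len(raw):
        import json  # each sub-interpreter imports its own copy
        return len(json.loads(raw))

    if __name__ == "__main__":
        payloads = ['[1, 2, 3]', '{"a": 1, "b": 2}']
        # One PID for all workers, and each interpreter has its own GIL,
        # so the parses can run in parallel within a single process.
        with InterpreterPoolExecutor(max_workers=4) as pool:
            print(list(pool.map(parse_len, payloads)))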
The multiple-runtimes-in-one-process model is not new; the most notable example
is NodeJS. But it is a very welcome addition to Python. Combined with the
amazing progress on GIL removal and the new JIT in Python 3.13, Python is
becoming a much more workable language for server development.
After watching Brian Goetz’s Presentation
on Valhalla, I started thinking more seriously about how value classes work. There are a few things
that are exciting, but a few that are pretty concerning too. Below are my thoughts; please
reach out if I missed something!
Equality (==) is No Longer Cheap
Pre-Valhalla, checking if two variables were the same was cheap: a single word
comparison. Valhalla changes that to depend on the runtime type of the object.
This also implies an extra null check, since the VM can’t load the class word
eagerly. Even with a segfault handler to try and skip the null check, the
performance of == would no longer be consistent.
This isn’t the end of the world for high performance computing, but it doesn’t seem like that
big of a win. Everyone’s code bears the cost.
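A sketch of what that means in practice, using JEP 401’s preview syntax (behavior per the JEP, not something I have measured):

    value record Point(int x, int y, int z) {}

    Point a = new Point(1, 2, 3);
    Point b = new Point(1, 2, 3);

    // With identity objects, == is a single word comparison, and would be
    // false here. With value objects, == must null-check, inspect the runtime
    // type, and compare all of the fields, so this prints true.
    System.out.println(a == b);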
It appears most of the performance optimizations available to Valhalla are not yet in, so it’s
hard to tell if the memory layout improvements are worth the expense.
Minor: IdentityHashMap is now a performance liability. Don’t accidentally put a
value object into one, or else.
AtomicReference
How value classes will interact with AtomicReference seems to be an issue. While value objects
can be passed around by value, they can also be passed by reference, depending on the VM.
However, AtomicReference is defined in terms of == for ops like compareAndSet. Value objects
no longer have an atomic comparison. What will happen? Consider the following sequence of
events:
    value record Point(int x, int y, int z) {}

    static final AtomicReference<Point> POINT =
        new AtomicReference<>(new Point(1, 2, 3));

- T1 starts POINT.compareAndSet(new Point(1, 2, 3), new Point(4, 5, 6))
- T2 starts POINT.compareAndSet(new Point(1, 2, 3), new Point(1, 2, 3))
- T2 finishes, and wins, its compareAndSet()
- T1 finishes its compareAndSet()
A regular AtomicReference would return false for T1, because T2 swapped in a
different object, even though its state matched the expected value before,
during, and after the call; identity is what lets us detect and resolve the
race. A value-based object, though: what could it do?
Where is the Class Word?
Without object identity, most of the object header isn’t needed. The identity hash code,
synchronization bits, and probably any GC bits aren’t needed any more. But what
about valueObj.getClass()?
I can’t see an easy way of implementing it. If the class word is adjacent to the object state in
memory, we don’t get nearly the memory savings we wanted.
If we had a single class pointer for an array of value objects, it still doesn’t help. Consider:
    value record Point(int x, int y, int z) {}

    Object[] points =
        new Object[]{new Point(1, 2, 3), new Point(4, 5, 6)};
    for (Object p : points) { System.out.println(p.getClass()); }
The VM would have to either prove every object in the array has the same class, or else store it
per object.
It would be great to see how the class pointer is elided in real life.
Intrusive Linked Lists and Trees
Value objects’ state is implicitly final, which means they can’t really be used for mutable data
structures. One of the things I miss from my C days is having a value included in a linked list
node. This saves space, but doesn’t appear to work for value objects. The same goes for trees.
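A hedged sketch of the problem, again in JEP 401’s preview syntax: a list node can’t be a value class and still support in-place mutation.

    value class Node {
        int payload;
        Node next;  // implicitly final: can never be re-pointed after construction

        Node(int payload, Node next) {
            this.payload = payload;
            this.next = next;
        }
    }

    // Inserting into the middle of such a list means rebuilding every node up
    // to the insertion point, instead of redirecting a single next pointer.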
I haven’t thought extensively about it, but denser data-structures don’t seem to be served by the
Valhalla update.
Values Really Don’t Have Identities.
Ending on a positive note, one of the things I liked about JEP 401
was the attention called to mutating a value object. Specifically:
Field mutation is closely tied to identity: an object whose field is being updated is the same
object before and after the update
Many years ago, I had an argument with a coworker about Go’s non-reentrant
mutex vs. Java’s reentrant synchronizers. As most [civil] arguments go, both of
us learned something new: Go’s mutexes can be locked multiple times, if the
variable is reassigned in between. Behold!
    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        var m sync.Mutex
        m.Lock()
        m = *(new(sync.Mutex))
        m.Lock()
        defer m.Unlock()
        fmt.Println("Hello")
    }
This code shows the problem. The mutex becomes a new object upon reassignment, despite being
the same variable. If the second .Lock() call is removed, this code actually panics, despite
the Lock call coming before the Unlock, and there being the same number of Locks and Unlocks.
Java is saying the same thing here. Mutability implies identity.
Conclusion
At this point, I think the Valhalla branch is interesting, but not enough to carry its own weight.
Without being able to see the awesome performance and memory improvements, it’s hard to tell if
the language and VM complexity are justified.