SmackerNews

AI demands more engineering discipline. Not less

422 points · 213 comments · 2 days ago · BerislavLopac

charitydotwtf.substack.com

ryandvm2 days ago
It is now significantly harder to figure out who understands the systems and is using AI effectively and who doesn't know shit and is just slinging LLM copypasta around. Before 2025, the underperformers/coasters were at least relatively identifiable by the paucity of their contributions. Now all of a sudden every single engineer is filing PRs, code reviews, technical design documents, and every other artifact under the sun with perfect formatting and at least superficial plausibility. This is mostly due to incredible pressure from the C-level for every engineer to be using as much AI as possible, but it's also just a game theory response because it's in every engineer's best interest to be as prolific as possible.
We are absolutely drowning in documentation and code that seems legit and the only recourse is to lean on AI to help process the sheer quantity of it. I have a feeling that the fallout from this phase of the industry is going to be an exotic form of technical debt that is remarkable mostly in its enormity.
trjordan2 days ago
Those are not code problems. They are evaluation problems.
Code becomes precious when it is the only place knowledge lives.
Reading AI code all day is _agonizing_. Just, a horrible way to live, and it melts people's brains at the moment you need them to be the most capable.
Manual programming has this really productive and gratifying feedback loop, where you read the code, write the code, and fix it until it compiles/runs/does what you want. AI code not only does half that for you, but it makes the "click" at the end uninspiring because you're never sure if it's cheated a bit to get to that moment.
Trying to operate with AI-generated code as the only durable artifact of programming is a dead end for the industry. Charity points to (and correctly discards) architecture diagrams/specs as an interesting space to work in. My suspicion is that it's closer to the thing that's hand-written: prompts, markdown plans, and other nudges. Focus on the thing that you, as a human, produce: that's the basis for the core loop of "did the AI follow my instructions", and it's higher-leverage when you go to code review.
By the time you get to the PR, you've probably typed enough to Claude that you can regenerate the code, but the current industry default is to just throw away all those sessions and ship the code. That's backwards!
msteffen2 days ago
I liked this article, and I see a lot of other commenters didn't, so I'll give my take:
When starting on a new codebase, how do you make yourself into a helpful contributor as quickly as possible? I go straight for the humans and their human docs. What problem was the system originally built to solve? What was the original design, and what were its biggest problems? Who is currently using it? If you know these, reading the code is much easier because you can guess why things were done the way they are.
Also, this blog post has gotten popular: https://blog.gpkb.org/posts/just-send-me-the-prompt/
I think Charity is observing a very old problem and expecting the new technology to lead to a new solution of some kind. I doubt she thinks even the current generation of tools are the end of the AI software development story. She's not saying we'll drop design docs right into Claude code and walk away (design docs aren't complete either, that's why when you're ramping up you also have to talk to people, read old tickets and postmortems, etc.)
What she's observing is that, in prod, people don't like infra where it's hard to tell how it got into its current state, and so infra-as-code is what we do now. She's also observing that "it's hard to tell how it got into its current state" is the status quo with codebases, which other people have observed going back to "Programming as Theory Building" and earlier. And she's expecting that, analogous to infra, software development will somehow be done with tools focused on making "how the code got into its current state" clearer.
simonw2 days ago
What happened in 2025 was this: the economics of code production were turned upside down. Instead of being very hard, time-consuming, and expensive to generate code, it became effectively free and instant. Lines of code went from being treasured, reused, cared for and carefully curated, to being disposable and regenerable, practically overnight.
I've been thinking about this a whole lot recently. So much of my intuition about software development is based on 25 years of accumulated experience on how long it will take to write different bits of code.
Should I add validation for this one edge-case which won't break everything but will make a little bit of a mess if someone hits it? If that's an extra couple of hours of code I might skip it. If it's one more prompt, why wouldn't I?
This new feature would be a lot easier to understand if there was a custom API explorer for it. There's no way I could justify investing in that... unless it's just 10 minutes with Codex, and it was: https://tools.simonwillison.net/datasette-extras-explorer#ur... (linked from the release notes https://docs.datasette.io/en/latest/changelog.html#extra-sup...)
That's just on the small scale. There are entire projects that I'd never previously have considered, because I don't need a custom SQLite SELECT query parsing library enough to justify spending a week or more building one. But now... https://github.com/simonw/sqlite-ast
People get VERY upset (and condescending) any time you suggest that being able to produce lines of code faster is a valuable thing. And sure, measuring output through "lines of code" is stupid.
But measuring "lines of verified code that deliver value" isn't stupid at all. That's the thing we can do faster now.
ncruces2 days ago
I just spent a week reviewing this ~200 LoC PR: https://github.com/ncruces/wasm2go/pull/37
It was submitted by a seasoned user, who probably asked a frontier LLM. It still felt… wrong. I didn't understand it, and I wouldn't merge it without understanding it.
I also suspected it was wrong, in a way that would cause issues in the future.
So I reviewed it 4 different ways: (1) try to understand/improve it; (2) do it with better algorithms; (3) avoid it by fixing the issue upstream; (4) rewrite it from scratch probably just to match my brain.
I expected either (2) or (3) would be the answer. (2) didn't work, rather it's the correct answer but I need to redo the project from scratch to use it; (3) I wanted really bad to work, but didn't.
So I got to a blend of (1) and (4). I'm still not entirely convinced, but now I understand the issue/solution. I obviously think my approach is better.
Still, I stripped both of comments, and asked my LLM to review.
The LLM came back and said the original one was clearly better. I explained why not, it then answered I was correct.
If I try it with comments, LLMs say mine is better. Because I found a real issue (one that I pointed at in the original comment thread). But is it saying mine is better because I coerced it to say so?
Elzair2 days ago
I read the article, and it seems she is forgetting the aphorism "all models are wrong". This is a common mistake that people who like "realistic" "simulation" RPGs often make. Any suitably comprehensive model of a thing is just the thing itself. To have a model of a location that includes all the detail of the actual location, you would need a 1:1 scale model, which is just a copy of the location. Any plan (i.e. prompt to a model) sufficiently capable of reliably replicating 100% of the functionality of a system is likely the source code of the system itself.
glouwbug2 days ago
Before 2023 I remember everyone here on HN championed that removing lines of code was the strongest senior metric
workbox2 days ago
I did not enjoy reading this article. The writing was fine, and each individual paragraph was fine, but the whole thing together was meandering and dare I say pointless. It was so many words and yet so little seems to have been said.
youknownothing2 days ago
"We treated code as permanent because the labor to produce it was the bottleneck."
I don't think that's true. We treated code as permanent because we considered code to be the source of truth. Computers don't run documents, computers run code. If the requirements document contradicted the code, then the default was to assume that the requirements document was wrong.
You can't separate code from spec because the code is the spec.
e12e2 days ago
Great article. I'm not sure the author is correct - but I think something is happening to the adage:
A sufficiently detailed specification is runnable code.
In a way I think LLMs will enable the dream of 4gl and "sufficiently smart compilers"[c].
LLMs aren't smart, but they are capable. Especially capable of translation and transformation.
I can certainly see them help move the abstraction horizon at which we work - so that rigid high level descriptions of the desired logic/process along with the process for quality testing - become the relevant curated artifacts - and the generated go/rust/java/python/etc code become incidental and mutable; subject to constant rewriting as part of the deployment of systems.
[c] You know, the ones that take naive C/C++ and produce executables that fully leverage RISC/EPIC platforms to be better than CISC. See also: Intel Itanium
sltr22 hours ago
from the post:
It was reasonable to be skeptical the first time
It's still reasonable to be skeptical. A few weeks ago a post was discussed here on HN [1] that asked:
What would have to be true for us to ‘check English into the repository’ instead of code?
to which I replied:
Code is already the cheapest path to working, correct software. LLMs do not change the calculus because figuring out what to make is the expensive part, not coding it up. Skipping code makes the specification of what to make even more expensive and throws away the tools that keep precision affordable. Programming in English would be more expensive than just using a programming language. [2]
[1] https://annievella.com/posts/finding-comfort-in-the-uncertai...
[2] https://www.slater.dev/2026/05/why-english-will-never-be-a-p...
hibikir2 days ago
We don't even have to go that deep: If anything accelerates our rate of code change, but doesn't lower our incidents per change, we are still stuck in a larger pile of incidents, and that's if the code quality is exactly the same as before.
Without more, better testing, hopefully more invariants stored in type systems that are easy to reason about, and more recording of the reasons why we change things, we get a more unstable system in practice. One where fewer people can work at once.
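To make that arithmetic concrete, here is a minimal sketch. The numbers and the function name `expected_incidents` are invented for illustration, and it assumes incidents scale linearly with the number of changes shipped.

```python
def expected_incidents(changes_per_week: int, incidents_per_change: float) -> float:
    """Expected incidents per week, assuming every change carries the same incident rate."""
    return changes_per_week * incidents_per_change


# Hypothetical team: identical quality per change, but 4x the change rate.
before = expected_incidents(changes_per_week=50, incidents_per_change=0.02)
after = expected_incidents(changes_per_week=200, incidents_per_change=0.02)

print(f"before: {before:.1f} incidents/week, after: {after:.1f} incidents/week")
```

Under these assumed numbers the incident load grows in step with the change rate, which is the point above: faster change only helps if incidents per change go down too.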
cadamsdotcom2 days ago
PLEASE do not rest your killer argument for humans in software on us being the best quality gate
Rather than dismissing humans for quality control, we should take an asymptotic approach, where humans verify less and less as more verifications are automated, but are never out of the loop. Get down to 1% of the things, then 0.1%, then 0.01% and so on.
Automate all the linting you can before the agent is allowed to make a PR, make sure it passes the tests, add custom linting for dumb AI-isms you’re sick of telling the agent not to do - yes, you can lint for that fallback & backcompat code you never asked for, you just have the agent generate a script that walks the AST and flags the problem by line and file, then put that in your pre-commit checks - the agent treats it like just another lint error. Now you never have to review for that thing again.
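As a rough sketch of what such an AST-walking check could look like in Python, using only the standard-library `ast` module: the rule chosen here (flagging `except` blocks whose body is just `pass`) is only a stand-in for whatever AI-ism you want to ban, and the names `SilentFallbackFinder`, `lint_source`, and `main` are made up for this example.

```python
import ast


class SilentFallbackFinder(ast.NodeVisitor):
    """Collects (line, message) pairs for except blocks that swallow errors."""

    def __init__(self) -> None:
        self.problems: list[tuple[int, str]] = []

    def visit_ExceptHandler(self, node: ast.ExceptHandler) -> None:
        # A handler whose whole body is `pass` hides the failure silently.
        if all(isinstance(stmt, ast.Pass) for stmt in node.body):
            self.problems.append((node.lineno, "except block silently swallows the error"))
        self.generic_visit(node)


def lint_source(source: str, filename: str = "<string>") -> list[str]:
    """Return one 'file:line: message' string per flagged handler."""
    finder = SilentFallbackFinder()
    finder.visit(ast.parse(source, filename=filename))
    return [f"{filename}:{line}: {msg}" for line, msg in finder.problems]


def main(paths: list[str]) -> int:
    """Lint each file and return 1 if anything was flagged, else 0."""
    findings: list[str] = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            findings.extend(lint_source(handle.read(), filename=path))
    for finding in findings:
        print(finding)
    return 1 if findings else 0
```

A pre-commit hook could call `sys.exit(main(sys.argv[1:]))` so that any finding fails the commit, and the agent sees it as just another lint error.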
But you still have value!
Even when you automated everything you can think of, there’s still tremendous value in human review. It’s your last chance to fully understand the implementation before it melds with the codebase. You also pick up more antipatterns to add to your automated reviewer (the automated reviewer is just a long prompt with an ever growing list of bullet points)
And the asymptotic nature of QC extends to observability and production. You cannot really ever automate a loop directly from observability to code fixes? Even when the agent presents a fix to an unhandled exception in production - if it was bad data, should you clean it in a backfill? If a key business metric dropped off a cliff because of a bug, should you add an alert once you fix the bug?
ezoe2 days ago
I doubt that laziness, one of the Three Virtues of a Programmer, is still considered a virtue in the AI coding era.
AI coding speed and performance now outperform most humans, but AI still needs a human to command it. Yes, you can let an AI agent manage sub-agents, but a human is still the manager at the top who tells the AI what should be written.
So the human must give the commands and have the final say on when it's done.
Is laziness still a virtue in the AI era?
jreynar1 day ago
Articles like this are exactly why I doubt that SWE jobs are going away. The SWE job of 2026 doesn't look like one from 2020, let alone 1990 so why would anyone believe the false dichotomy that either the SWE jobs of 2026 will remain or all be eliminated? I worked at Google a zillion years ago when the idea of reviewing all code was novel. Before that, when I worked at MS things mostly didn't get reviewed until the end of the project when the stakes were high because code got burned onto a CD and put in a box. The way SWEs spent their time changed radically from 2000 to 2004 and I think for the better since it increased shared understanding and fostered more collaboration.
If AI writes the code and humans spend more time reviewing it, that might not be a bad thing, but when the AI code is good enough, people are going to view thorough reviews as optional. Then the job of a SWE will look very, very different than before since SWEs won't write much code or spend much time reviewing it. The IDE may go the way of the dodo. And maybe the focus will move to setting up the goals and tests that keep the AI coding team on task. Maybe SWEs will spend more time architecting since they're likely to know where projects are heading and won't want AI to rewrite things as goalposts legitimately move. Maybe more will be spent exploring: build it one way and another and another and compare and generate new ideas from the different approaches.
I have no better idea than anyone else, but I'd be heavily against the role going away and in favor of it evolving, like it's done many times before, though perhaps never as rapidly as it is right now.
BerislavLopacOP2 days ago
It's interesting how most of the comments here seem to miss the most important part of the article, which is this:
What happened in 2025 was this: the economics of code production were turned upside down. Instead of being very hard, time-consuming, and expensive to generate code, it became effectively free and instant. Lines of code went from being treasured, reused, cared for and carefully curated, to being disposable and regenerable, practically overnight.
A little bit further on, it is reinforced by this:
I am just barely old enough that my first job title was “System Administrator”. [...] I lived through the shift from handcrafted server pets to immutable infrastructure cattle.
What is happening now is nothing new, we have seen it many times before: a shift in technology which is bringing changes in the ecosystem, required skills and so on. This happened with stocking frames, steam engines [1], automobiles, servers, and now the code. Just like before, many will be - and already are - harmed by this, but ultimately the world will adapt and accept the new paradigm.
[1] There's an infamous screenshot of a tweet being shared around, where someone suggests various names for writing code without AI, and someone else responds with "software engineering". Allow me to add my own contribution to this debate: codejamming.
amatheus2 days ago
I liked the article overall but found it a little wishy-washy in some of the conclusions.
People do not want to wake up every day and log in to Slack and find the buttons and menus all subtly moved around. People do not want financial transactions that complete most of the time. Determinism is not going anywhere, my friends.
Well, I can't reconcile people not wanting things moving around and determinism with the promises of acceleration made by AI. The way I see it either AI makes "massive, discontinuous returns on investment" by way of changing things or we get a sustainable rate of change; these seem like contradicting goals to me.
ManuelKiessling2 days ago
I fully agree with the „AI demands more engineering discipline“ premise.
And I‘ve quickly realized that it’s also much easier to follow that premise.
Not only because agents obviously help with writing documentation, test cases, DX tools, and so on.
But also because it feels so much more rewarding to know that someone — even if it’s just a soulless agent — actually cares to read and use and follow these.
I have always been the guy on the team who would write the tools and documentation, and it’s always been a bit frustrating to know that only half the team would care to read and use and follow them, at best.
jay_kyburz2 days ago
"...never fix a running thing. Replace it.
AI pushes this premise beyond infrastructure and into application code itself. When rewriting is cheap, editing in place becomes risky. Mutation accumulates entropy. Replacement resets it."
I've always found verifying that some code works correctly much harder and more time-consuming than writing it. Replacing big chunks of code means much _more_ verification and validation.
When you see a bug, and you "fix the running thing" you only need to verify what you changed.
sdicker2 days ago
Thanks, great to have the perspective of thoughtful engineers who have been in the trenches for a long time
keybored2 days ago
A few days back I wrote a piece called “AI enthusiasts are in a race against time, AI skeptics are in a race against entropy.”
Guess who the author is.
> The enthusiasts are not wrong. We are starting to see real, non-imaginary, discontinuous leaps in capabilities from teams that lean in hard to working with AI. And this does not feel like a normal technology cycle where you can wait for the dust to settle; teams that sit this out while competitors are hustling could be out of business before the dust settles. That’s a real, existential threat.
It’s not imaginary. It’s real. This time it’s different. And on a higher level, the FOMO is real. It’s not imaginary. It’s even existential.
Why do they all write the same as well? It’s so emphatic.
The tech is cool, but as a thinking, feeling, breathing human who cares about other people, it can be hard to get excited about anything that so many people are this upset about. It’s also hard to get excited about something when so many of the loudest voices are out there talking gleefully about putting everyone permanently out of work, and so many artists and writers and people from developing nations are talking openly about the impact on them.
Hold your desire to jump in and berate me here, I beg you. Like I said, I will deal with the ethics and morality of using AI in my very next post. Be honest, your attention span is no more up for reading a 10,000-word essay than mine is up for writing one. (Can we blame AI for that too?)
More Inevitability Soothsaying. All our feelings are crashing with Existential Threat Reality.
K0balt2 days ago
This has been my position from the beginning of when agentic coding harnesses became genuinely useful.
I now do documentation driven development, and with very few exceptions I am committing code that is better written, better documented, easier to reason about and maintain, with less library overuse than I ever did as a senior lead with a small team, and I’m doing it for 1/4 the price, at 4x the speed.
But it’s not vibe coding. Discipline is critical, as is deep systems understanding.
romaniv2 days ago
>"It’s easy to forget, but for most of 2025, the idea that AI-generated code was slop and might always be slop was not only a reasonable position to hold, it was the default, mainstream position.
That question was answered decisively last November."
It's easy to forget that people said this exact thing about every model after GPT 3.5. This is a standard trick the industry uses to invalidate negative experience with LLMs. 'You are prompting it wrong' becomes 'you are using Gemini, but you should use Claude' which then becomes 'well, all of your criticism is now irrelevant, because everything is fixed in this new version'.
This "discussion" about capabilities is set up to be asymmetrical and basically non-falsifiable.
QuantumNoodle2 days ago
Given how many critical systems software touches these days, I am surprised that software engineers are not licensed. Imagine if civil engineers could just go building shit as minimum viable products? Sure, prototype quickly, but someone should be held liable for the final product.
bilater2 days ago
If you ask a surgeon if you need surgery...
In general most developers are going to find themselves fighting incentives which will color their opinion. AI isn't there yet but if you are going to base your whole world view on a point on a graph and not on the trajectory you are in for a bad time.
SwtCyber1 day ago
It's good to see the hype around "programmers are no longer needed" giving way to a more realistic view. Generating lines of code was never the hardest part of engineering. What's much harder is understanding exactly what we're building, how it integrates with legacy systems, and making sure it doesn't crash the database under load
[deleted]
AndrewKemendo2 days ago
Broadly concur with this, and in fact all of this is going to make doing real engineering easier in my opinion
The author makes the wrong assumption though that the majority of people who are doing engineering want to do even more engineering.
It’s my experience that most technology workers just want a high paycheck and have some kind of association with being in tech and doing cool things
nullorempty2 days ago
AI demands more engineering discipline.
Well, that's a loaded statement. I've yet to see a Claude session where Claude would tell me to hold off and make my prompts more disciplined.
So it does not demand more discipline.
It can, otoh, build better from disciplined prompts, but people, too, build better software from better specs.
schmuhblaster2 days ago
What worries me personally is the dopamine hit I seem to get from watching my ideas get built in front of me. There is a big temptation to just add feature after feature without really checking what the code actually looks like. So yes, more discipline is needed.
kazinator2 days ago
Wrong!
AI demands more of your hours cranking out the same discipline, due to the volume of stuff that needs to be verified.
Normally the term "more discipline" is understood as increased rigor, not simply more work at the same level of rigor.
deaton2 days ago
It demands more and yet it makes it so, so much easier to get by with less. I don't quite understand who is supposed to be the winner here.
SrslyJosh2 days ago
So, using artificial intelligence requires more expertise than not using it?
sandover2 days ago
the ur-text behind this piece is: people just not understanding exponential capability growth. including the author of the piece!
if you could look clearly at the progress from 2020 to 2023, as someone like Gwern did, and from 2023 to 2024 with the invention of reasoning modes, then it was not that hard to understand what would happen in late 2025. Opus 4.5 was not a surprise to anyone who was actually paying attention.
But people (including the author) still mistake the current state for a stable state and future gains for incremental ones. she says
“I am not asserting that all code will eventually be AI-generated to spec, bypassing human understanding”
I AM asserting that, and it’s incredibly easy to do so.
The question of “when” is separate.
[deleted]
steve_adams_862 days ago
There's one thing that hasn't changed much with LLMs, and that's the notion of 'moving the needle'. People who sling slop don't meaningfully accomplish that if you're paying close attention to how your team or org or company actually needs to move. Although, if your team is focused on PRs and LOC, sure, the needle is popped off the gauge by LLMs. But your problem is not LLMs in that case.
I agree that AI demands more engineering discipline, but it also demands more domain knowledge, purpose, and intent. Suddenly we can actually accomplish most of our goals a little faster. I can take on work I couldn't before.
Before I even begin getting disciplined about engineering, I need to ask: does this work actually make sense? Should I do it? If it's done... What do I think will change for my team or organization? Will it have practical results that move us in the right direction?
The better you get at asking that question, the less you'll find yourself prompting and planning and shoving PRs into the chute. It's still somewhat difficult to find important work in many places.
Still, engineering discipline is and always has been critical when going ahead with important work.
My gut feeling is that many of us simply aren't doing important work, and the discipline might be nice but is ultimately irrelevant. The sloppers are doing a faster version of something they always have, and much of it will be lost to time just like our pre-slop work has been.
I find LLMs aren't as helpful when applied to well-thought and intentional work towards very specific goals in complex domains. They're still helpful, but, the deeper you go and the more specific you get, the more they tend to deliver results you can't use. If you're on the rails they can be incredible. Diverging from the track and having exacting requirements, eh, it gets pretty hit or miss and you can spend a lot of time herding a digital cat. This certainly demands a lot more engineering discipline.
kstenerud2 days ago
This has been my experience with AI.
Writing software begins with a solid design that is defensible. If you don't have that, the AI will produce slop.
Once you're happy with the design, you need a solid plan. If you don't have that, the AI will produce slop.
Once you're happy with the plan, you can set the AI loose, but don't get too complacent! Anything that you missed in the previous phases could very well lead to slop (although likely localized).
And then, as your project matures and you gain more understanding of the space, you start to notice deficiencies in your model. This is where AI really shines: design and code changes to adapt to reality.
otabdeveloper42 days ago
Instead of being very hard, time-consuming, and expensive to generate code
Was this article written by AI? It's certainly stupid enough!
socketcluster2 days ago
This is why I built https://saasufy.com/ - Vibe coders shouldn't trust themselves with backend security. Unfortunately, it's extremely difficult to get right. There's a lot to think about:
- Schema validation with appropriate size limits on all relevant fields.
- Authentication.
- Access control.
- Backpressure management and rate limiting in case a (possibly malicious) user tries to perform too many computationally expensive actions in a short time.
- Ensuring that the actions of one user don't throttle another user who is connected to the same process/host, e.g. using async constructs to avoid freezing the main process.
- DDoS mitigation.
- Avoiding race conditions.
- Designing a good database schema, with well chosen indexes, with deterministic IDs/idempotency to avoid double-insertion scenarios. You don't want to be forced to rely on overly complex queries with a lot of joins. This doesn't scale well and is rarely necessary.
- Logging and error handling.
- Avoiding conflicts and accidental overwrite with old data when multiple users are editing different fields of the same resource concurrently.
- Efficient distribution of realtime messages.
- Scalability.
The list goes on and on... And every piece has to be implemented perfectly. This involves a huge number of carefully thought-out decisions.
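To illustrate just one item from the list, deterministic IDs/idempotency, here is a minimal sketch using Python's built-in `sqlite3`. It is not how Saasufy does it. The `orders` table, the function names, and the idea of a client-supplied request ID are all assumptions made for the example.

```python
import hashlib
import sqlite3


def deterministic_id(user_id: str, client_request_id: str) -> str:
    """Derive the same ID every time from the same logical request."""
    return hashlib.sha256(f"{user_id}:{client_request_id}".encode("utf-8")).hexdigest()


def record_order(conn: sqlite3.Connection, user_id: str, client_request_id: str, item: str) -> bool:
    """Insert the order once; return False if this request was already recorded."""
    order_id = deterministic_id(user_id, client_request_id)
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO orders (id, user_id, item) VALUES (?, ?, ?)",
            (order_id, user_id, item),
        )
    return cursor.rowcount == 1


# Hypothetical schema: the primary key is what turns a retry into a no-op.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, user_id TEXT NOT NULL, item TEXT NOT NULL)")

first = record_order(conn, "user-1", "req-42", "widget")  # new request
retry = record_order(conn, "user-1", "req-42", "widget")  # same request sent again
```

Because the ID is derived from the request and not generated fresh on each attempt, a client retry after a timeout hits the primary-key constraint and is ignored instead of creating a second row. A real system would also need to decide what to do when a retry arrives with a different payload.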

news.ycombinator.com/item?id=48570948