68 points · straydusk · 7 hours ago
benshoemaker.us
teecha
kace91
I can’t imagine any other example where people voluntarily opt for a black-box approach.
Imagine taking a picture on autoshot mode and refusing to look at it. If the client doesn’t like it because it’s too bright, tweak the settings and shoot again, but never look at the output.
What is the logic here? Because if you can read code, I can’t imagine poking the result with black box testing being faster.
Are these people just handing off the review process to others? Are they unable to read code and hiding it? Why would you handicap yourself this way?
GalaxyNova
Never thought this would be something people actually take seriously. It really makes me wonder if in 2-3 years there will be so much technical debt that we'll have to throw away entire pieces of software.
gchamonlive
That goes a bit against the article, but it's not reading code in the traditional sense where you are looking for common mistakes we humans tend to make. Instead you are looking for clues in the code to determine where you should improve in the docs and specs you fed into your agent, so the next time you run it chances are it'll produce better code, as the article suggests.
And I think this is good. In time, we are going to be forced to think less technically and more semantically.
oxag3n
Become a CTO, CEO or even a venture investor. "Here's $100K worth of tokens, analyze the market, review various proposals from Agents, invest tokens, maximize profit".
You know why not? Because it will be more obvious it doesn't work as advertised.
Ifkaluva
The answer is clear: I didn’t write the code, I didn’t read it, I have no idea what it does, and that’s why it has a bug.
prewett
Groxx
So basically a return to waterfall design.
Rather than YOLO planning (agile), we go back to YOLO implementation (farming it out to dozens of replaceable peons, but this time they're even worse).
sho_hn
Which is perhaps what they should do, of course. Any transition is a chance to get ahead and redefine yourself.
dougthesnails
andai
I tried doing clean room reimplementations from specs, and just ended up with even worse garbage. Cause it kept all the original garbage and bloated it further!
Giving it a description of what you're actually trying to do works way better. Then it finds the most elegant solution to the problem, both in terms of the code and the UI design.
letstango
So humble. Who is he again?
sho_hn
What it's trying to express is that the (T)PM job should still be safe because they can just team-lead a dozen agents instead of software developers.
Take with a grain of salt when it comes to relevance for "coding", or the future role breakdown in tech organizations.
insin
andai
I haven't used Codex though, so maybe there's something I'm missing about the parallel-ness of it here.
pjmlp
yodsanklai
Recently I picked a smallish task from our backlog. This is some code I'm not familiar with, frontend stuff I wouldn't tackle normally.
Claude wrote something. I tested, it didn't work. I explained the issue. It added a bunch of traces, asked me to collect the logs, figured out a fix, submitted the change.
Got a bunch of linter errors that I don't understand, and that I copied and pasted to Claude. It fixed something, but I still got lint errors, which Claude dismissed as irrelevant, and I realized I wasn't happy with the new behavior.
After 3 days of iteration, my change seems ok, passed the CI, the linters, and automatic review.
At this stage, I have no idea if this is the right way to fix the problem, and if it breaks something, I won't be able to fix it myself as I'm clueless. Also, it could be that a human reviewer tells me it's totally wrong, or asks me questions I won't be able to answer.
Not only was this process not fun at all, but I also didn't learn anything, and I may have introduced technical debt which AI may not be able to fix.
I agree that coding agents can boost efficiency in some cases, but I don't see a shift left of IDEs at this stage.
bigwheels
https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16d...
Is it a nano banana tendency or was it probably intentional?
Xx_crazy420_xX
gtm1260
ottah
We are very far away from this being a settled or agreed upon statement and I really struggle to understand how one vendor making a tool is indicative of an industry practice.
hollowturtle
frank00001
franze
the constant asking drives me crazy
thefz
Also, the generated picture in this post makes me want to kick someone in the nuts. It doesn't explain anything.
jwpapi
9 times out of 10 my AI-generated code is bad before my verification layers; 9 times out of 10 it's good after.
Claude fights through your rules. And if you code in another language you could use other agents to verify code.
This is the challenge now: effectively verifying the code. Whenever I end up with a bad response I ask myself what layers I could set up to stop the AI as early as possible.
Also things like naming, comments, tree traversal, context engineering, even data structures, multi-agenting. I know it sounds like buzzwords, but these are the topics a software engineer really should think about. Everything else is frankly cope.
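To make "layers" concrete, here is a rough Python sketch of a fail-fast verification chain. Everything in it is an assumption for illustration: the names `LayerResult`, `syntax_layer`, `naming_layer`, `docstring_layer` and `verify` are hypothetical, and the checks are toy stand-ins for whatever linters, type checkers, tests or reviewer agents you actually run. The only point is the shape: order the checks cheapest first and stop at the first failure, so the bad output is rejected as early as possible.

```python
import ast
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class LayerResult:
    layer: str        # name of the layer that produced this result
    ok: bool          # True if the layer accepted the code
    detail: str = ""  # human-readable reason when the layer rejects


def syntax_layer(source: str) -> LayerResult:
    """Cheapest check: does the generated code even parse?"""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return LayerResult("syntax", False, str(exc))
    return LayerResult("syntax", True)


def naming_layer(source: str) -> LayerResult:
    """Toy naming rule (an assumption): function names need 3+ characters."""
    tree = ast.parse(source)
    short = sorted(
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and len(node.name) < 3
    )
    if short:
        return LayerResult("naming", False, "too-short names: " + ", ".join(short))
    return LayerResult("naming", True)


def docstring_layer(source: str) -> LayerResult:
    """Toy comment rule (an assumption): every function needs a docstring."""
    tree = ast.parse(source)
    missing = sorted(
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and ast.get_docstring(node) is None
    )
    if missing:
        return LayerResult("docstring", False, "no docstring: " + ", ".join(missing))
    return LayerResult("docstring", True)


# Ordered cheapest first; later layers assume the earlier ones passed.
LAYERS: List[Callable[[str], LayerResult]] = [
    syntax_layer,
    naming_layer,
    docstring_layer,
]


def verify(source: str) -> Optional[LayerResult]:
    """Run layers in order; return the first failure, or None if all pass."""
    for layer in LAYERS:
        result = layer(source)
        if not result.ok:
            return result
    return None
```

In practice the rejected `LayerResult.detail` is what you would paste back to the agent, and a real setup would put tests or a second reviewing agent at the end of `LAYERS`.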
timhh
Something I know very little about is coding. I know there are different languages with pros and cons to each. I know some work across operating systems while others don't but other than that I don't know too much.
For the first time I just started working on my own app in Codex and it feels absolutely amazing and magical. I've not seen the code, would have basically no idea how to read it, but I'm working on a niche application for my job that is custom-tailored to my needs, and if it works I'll be thrilled. Even better is that the process of building just feels so special and awesome.
This really does feel like it is on the precipice of something entirely different. I think back to computers before a GUI interface. I think back to even just computers before mobile touch interfaces. I am sure there are plenty of people who thought some of these things wouldn't work for different reasons, but I think that is the wrong idea. The focus should be on who this will work for and why, and there, I think, are a ton of possibilities.
For reference, I'm a middle school Assistant Principal working on an app to help me with student scheduling.