(Someone deleted a comment about why you'd want a mobile Codex app. This is the answer I wrote.)
Once you've used these coding agents a lot, you develop a pretty intuitive feel for how they work, what they're capable of, what they're good at, and where their weaknesses are. Hopefully, you're already pretty familiar with the code base you're working on. Combining the two, this means you can get quite far essentially "vibe coding" (i.e. not looking at the actual code) on a new branch.
So if you have some idea or some issue you want to fix on the go, you just iterate with the agent for a bit (presumably no more than a couple hours) until the agent outputs an implementation. Here, I do claim there is some "skill" (which is a function of your codebase familiarity, general SWE ability, and facility with AI agents), and if you're good, this implementation will be halfway decent a high percentage of the time. Then when you're back at your desktop, you can review the changes carefully, do some proper testing and debugging, etc. But you've saved a good chunk of time: an initial draft is already waiting for you.
For major new features on my SaaS, this is exactly what I do on my phone/laptop, sometimes over days or weeks. I never look at the code until I get a feeling that it's far enough along; then I hop into the actual code and start making changes manually, or use CC locally to iterate over weeks until it's ready for release. In the early stages of a major new feature/product, it can be counterproductive to closely monitor the AI. Of course, as you said in your comment, this requires very strong knowledge of the code base and a lot of experience with the agents in the first place. But once you can do this sort of workflow, it's very powerful, because you can do it in parallel with other work: just an hour or two per day over a week or two on your phone can get you to a really good first draft, even on a major new feature/product. And of course I'm not saying it's ready for production; that can still take weeks, but that's not really the point.
I was doing exactly this for a while with Claude Code. Very helpful when I'm away from home but can't stop thinking about my project. The remote agent has access to all the docs and instructions in my repo and most of the time gives me a decent draft I just need to polish later.
I unsubscribed from Claude after the performance regressions around the time of the Opus 4.7 update made it unusable. Been using Codex since then, and I've definitely missed being able to make these drafts. So I'm looking forward to trying this out.
I have been doing exactly this by bookmarking Codex and using 'Add to home screen'.
Process (all on my phone):
* Create a new repo on GitHub
* Tell ChatGPT the project and ask for a README and AGENTS file
* Manually upload the files to GitHub
* Go to Codex and tell it to review the code and carry out the steps in the README
* Connect the project to Vercel
* If needed, create a DB
* Ask ChatGPT for the schema and run the SQL
I have done this kind of work for years and now I can create things like this on the way back from a meeting. It's broken my business model by the way.
Here is one of the apps, for mental health, pretty much all done on my phone.
Using AI agents for devops and troubleshooting has been fairly powerful for me.
I have Claude Code with access to an Azure environment (via the CLI) where the app components are deployed, and also to the code base repo. I paste an error message or explain the symptom, and CC works through various configuration checks, network tests, etc. across the Azure resource list, as well as the application logic, and surfaces the root cause of the error precisely. That would easily have been 1-2 days of effort if I had done it all myself (this is an inherited code base); I would have had to learn a few of those skills along the way, or might not have thought of some of the checks on my own. All done in about 45 minutes with basic human-in-the-loop guidance.
Of course, learning it the hard way would have meant deeper understanding and first-hand experience for me. But there is no guarantee I wouldn't have given up midway, frustrated, or that other priorities wouldn't have prevented me from pursuing this in full.
I've been vibe-refactoring a fork of get-shit-done (a skill collection for coding agents) for about the past week. I've had to revisit the same ideas multiple times because the agent doesn't always get it right at first, but it's still so much faster than I could have done the same work myself, and it's already mostly working (I've been dogfooding it for a day or two now). And I have gotten by just bringing up issues I notice from the LLM's implementation comments, rather than actually inspecting the code even once so far.
No, the phone connects to your local device. This isn't "codex web" on mobile. Basically you work through your desktop on your phone. So to be clear, there are security risks (you can wipe your entire desktop from your phone).
You can run Codex Desktop on Linux; it's on the AUR already. Granted, it's just a repacked ASAR from the Windows version, but it still works quite well.
Haven't tested connection to mobile yet but the integration with cloud environments already works.
For now it appears that it talks only to the Codex App. Some users in this thread are saying that apparently the Codex CLI will support it on the next official release.
Not sure about how it works with Codex now, but with Claude you can just start a terminal session of claude code with your code checked out locally on your computer, and then enable remote control which lets you control that session from your phone.
So basically, it is like you are typing on your terminal on your computer from your phone.
I tried Codex web. It kinda sucks and OpenAI doesn't seem to be promoting it? Look elsewhere if you want a Linux VM in the cloud. (I quite like exe.dev and they do have good mobile support.)
I mean I'd love for them to take it further. If you put me on the phone with a talented software engineer I could supervise all sorts of changes. I wish I could do the same thing with my coding agent. Being able to be like, "hey remind me what's in that database table ... got it okay let's rename it to ..."
I'm also completely fine if it gives me hold music while it's working.
When I hear about features like this, at a certain point it looks more like compulsion/addiction instead of a useful thing to do. Like, if I'm at some sort of activity or event maybe I should just be at that thing instead of trying to aim the slop cannon 24/7
Forgetting code exists is by definition not suitable for serious work. However, OP said in the following paragraph that this would be a first draft, and that the code would actually be reviewed and tested properly before being integrated.
At which point it is by definition no longer vibe coding, because you do care about the code! It's just an AI assisted workflow, but now we call all of those vibe coding for some reason. (Naming things is hard!)
If vibe coding means not caring about the code, then a literal translation of the term would be "not caring about coding" coding.
> Forgetting code exists is by definition not suitable for serious work
This is just like everyone who says, “An iPad is not suitable for serious work.”
By which they (and you) generally mean, “What I do is serious work. What you do is unserious work.”
I think I do serious work (I mean, they pay me for it?), and for the past 12 months or so I have only copy/pasted and run whatever code's been generated by AI. Whenever I can, I just let the AI run it itself.
Sad to learn that I’ve been so unserious all this time.
No, not reading the code. I vouch for the correctness because it runs and produces output that is useful. It might not be 100% right but neither was the code I wrote by hand.
I find it funny that software engineers are shocked at the idea that someone would issue a set of instructions to a coder and not look at the code, or only glance at it.
How do you think the world has worked for the past thirty years? AI has just caught up with human skill is all.
What OP said works quite well for a lot of tasks, and if you've set up base instructions on coding style they (Codex, Claude) generate code accordingly.
A key point is that after the "vibe" session you should also have a lot of tests written, so you can easily refactor the code afterwards if there are major aspects you don't like when you get back to your desktop.
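To sketch what those tests buy you: a characterization test pins down the current behavior of an agent-written helper, so a later refactor can be checked against it. The `slugify` function here is a made-up stand-in for whatever the agent actually produced, not anything from the thread.

```python
import re

def slugify(title: str) -> str:
    # Stand-in for an agent-generated helper whose behavior we want to pin down:
    # lowercase, replace runs of non-alphanumerics with "-", trim stray dashes.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def test_slugify_pins_current_behavior():
    # Characterization tests: these encode what the code does today,
    # so a refactor that changes behavior will fail loudly.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  Already--slugged  ") == "already-slugged"

test_slugify_pins_current_behavior()
```

If a refactor breaks one of these assertions, you know the behavior changed rather than just the structure, which is exactly the safety net you want before reshaping code you never read closely.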
Unbelievable. This is the silent de-skilling of this industry.
Imagine saying that you don't need to look at the road or keep your hands on the wheel whilst driving because someone else said that the car can 'drive' itself; therefore, there's no need for anyone (including taxi drivers) to learn how to drive.
Just because a machine can generate plausible looking code does not mean you don't need to look at it or not know how it works or why it doesn't work.
I am not sure I understand the time savings you're describing here. Do you mean you saved the "time to write prompt into the text input box" because you got to do that sooner from your phone rather than write down your idea and do it when you got back to your computer?
Wouldn't you be doing the exact same thing had you been sitting at your computer when you had the idea?
Perhaps the person who wrote that had the mindset of "when I am away from my work, I want to be disconnected and present with the world around me; this update now makes it so that I have an excuse to carry work with me."
Maybe they're in a toxic/abusive work relationship where taking breaks is already difficult and this might lead to justifying working from your phone as "expected"
My question to you is: what is wrong with moving a little slower? Is time to prompt an optimization of a real bottleneck?
You can use STT and include a workflow that automatically extracts the requirements (filters all the um's, ah's, pauses) and it becomes more like an interaction where you act as the Product Owner/Manager and Codex is your Architect/Dev.
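A minimal sketch of that transcript-cleanup step, assuming a plain-text STT transcript; the filler list, regexes, and function name are my own illustration, not any particular product's API:

```python
import re

# Hypothetical post-STT cleanup: strip filler words before handing the
# dictated requirements to the agent.
FILLERS = re.compile(r"\b(?:um+|uh+|ah+|er+)\b[,.]?", re.IGNORECASE)

def clean_transcript(raw: str) -> str:
    text = FILLERS.sub("", raw)                 # drop "um", "uh", "ah", "er"
    text = re.sub(r"\s{2,}", " ", text)         # collapse leftover double spaces
    text = re.sub(r"\s+([,.!?])", r"\1", text)  # no space before punctuation
    return text.strip()

print(clean_transcript("Um, add a, uh, dark mode toggle to the, um, settings page."))
```

In practice you'd chain this after your STT step (or just let the model ignore fillers itself), but a deterministic pass like this keeps the prompt that reaches the agent short and clean.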
At least, that's how I code through my phone. But it does require some forethought in establishing your automated workflows. I'm at the point where my entire dev system has established templates for CI/CD so I can preview work in staging and production is still a manual step (obviously).
Sure, I do that on the computer too. Computers have microphones these days, and STT runs on my macOS as well. What was your point in regard to my comment? I'm not sure I understood you.
Sometimes I get random inspiration for an idea while out on a walk or otherwise away from the computer. It's really nice to be able to throw a couple instructions out there, let your agent run with it, and see what it came up with later. Sometimes I do this 3-5 times before returning to my computer. IMO it's really nice to be able to start from X% done rather than 0% when I finally do sit down to review/iterate on the code.
The numbers don't play out because international Chinese students only make up 5-7% (maybe less) of the undergraduate student body. Self-reported cheating frequencies are much higher.
That's kind of a large number. The honor system is a solidarity thing: there can be 0% cheating because nobody wants to be that person, but if 5% come in and egregiously cheat anyway, it can poison more. Most people don't want to cheat, but they may feel disadvantaged not to.
The style of writing and the inclusion of the word "mandarin" made me assume that you were implying WASPs were not participating in the "high stakes struggles". You still have not explicitly stated your view one way or the other. As you can see from the other comments, almost everyone read an undercurrent of xenophobia in your post. I sense you're a skilled interlocutor; I concede I fell into your trap.
You are aware that Stewart Alsop was writing about the death of WASP elites in 1970 or so. Does that mean you think Princeton exploded in cheating in 1971?
My types? The person I was responding to claims that if I have a problem with someone shoplifting alcohol and condoms from Walgreens, then it's a moral failing on my part. I responded because I found that absurd. For the record, I do not condone managers editing timecards.
This viewpoint is curdling rapidly. The definition of "reasoning" and "intelligence" will be debated for ages by philosophers and cognitive scientists, but whatever type of logical/critical thinking is going on in the heads of software engineers and mathematicians, frontier LLMs can now emulate to a very high degree. For mathematics in particular, examples like the following will become commonplace:
Emulating is the key word here. Putting words in a similar order to how a critical thinker would isn't the same as critical thinking. Have you looked at the output of "reasoning" models? It's funny, for sure, but not impressive or threatening. It exposes the models for the statistical word generators they are.
Add the fact that they totally suck at tasks outside of those spanned by the training data. I know there's a vision of the future where humans are all gig workers generating specialised training data for LLMs, but it doesn't sound much more plausible to me than a future where intellectual progress forever stops at the 2022 level, because everything will be done by LLMs and that's when anything new stopped being thought of.
This happens whenever a disruptive technology is introduced to a field and I will never get over the irony of a software engineer (in a profession whose entire goal is to automate tasks) not noticing the hypocrisy.
"Insanity is doing the same thing over and over again and expecting different results."
There is certainly randomness in model output that the user has to work around, but sending the same prompt with the same context (or, even worse, with added entropy from leaving the previous failed prompt in the context) over and over again, akin to pulling a slot machine lever, is certainly user error and not the way to "hold it".
With this paper by Microsoft and the infamous paper by Apple last year, it seems the tech giants that don't have their own models are getting a bit insecure.