27 Jun 2025

My frustrations with Gemini CLI compared to Claude Code

Agentic AI Gemini Claude

TL;DR: I still prefer Claude Code and will only use Gemini CLI when I am rate-limited on Claude.

Two days ago, Google announced Gemini CLI, its own agentic, CLI-based competitor to Claude Code. To test its capabilities, I pitted the two against each other by having each build a very simple iOS app that tracks what I spend my time on.

The idea for the app was very simple:

I wanted some sort of easy way to track what I spend my time on
I needed it to be on my smartphone since that is always on me
I wanted a CSV export so I could run analysis on it

It’s not the prettiest thing I ever built (see Results), but it does its job in my eyes. I have never built an app for iOS before and am completely unfamiliar with Xcode, including the setup, so this was a fully blind experience for me. I made no significant code changes myself. I only removed some small parts of the code when Gemini or Claude got stuck on something that should not have been there in the first place.

How I set it up

Now that you have the backstory: I wanted to see what could be built from just a very generic set of instructions. I had a chat with the regular online version of Gemini 2.5 Pro to come up with the features I would like and had all of them put into a list. The result matches what I described wanting in my last blog post.

Setup:

Claude with Pro Plan
Gemini CLI as shipped
Environment had two files with a general overview of the idea and todos
I had both tools set up their own Gemini.md / Claude.md at the start and added the instruction to git commit everything after each change
I had both of them break the todos down into their own files so they could work through them on their own later on (the todos are listed at the end)
Then I just said "go" or "continue" until all todos were finished
They ran side by side until I moved on to testing in Xcode

I will now go through some anecdotal evidence of what I went through while working with both AIs.

What I noticed very quickly was how Gemini just wanted to be left alone: after a few initial requests for permission, it went off and did its thing, while I had to interact with Claude frequently throughout. For Gemini, going from completing task 1 to task 13 took a single approval from me, whereas Claude asked for some interaction every one or two tasks.

Luckily, I was still looking at the screen when Gemini notified me that it was switching to 2.5 Flash because 2.5 Pro was taking too long. Otherwise the switch would have gone unnoticed.

Gemini was very quick initially while still using 2.5 Flash and was far ahead of Claude before Claude even started to work on its assignment.

When the dust settled, I had a lot of time to work with the Gemini version of the app. While I was still figuring out how to use Xcode and doing some setup work, I started to run into significant prompting issues. Whenever something did not compile, I gave Gemini the error message from Xcode together with the file name and line number, and for a long time that did not work. I also described a set of buttons: roughly how I wanted them to look, what they should do and where they should be placed. A typical error prompt looked like this:

I am having this bug in the taskviewmodel in line 19 Value of type 'ModelContext' has no member 'registeredObjects'

Sometimes it would fix that issue, but I would then need to copy and paste about 8 new errors it had created into the terminal. At that point it would sometimes get completely lost, even though every error came with a file name, a line number and an error message. And this would just go on and on.

I actually gave up on a feature that the Claude version included perfectly from the start: Today’s Tasks, which shows the last few tasks you did today and how long you worked on each. Even after 10 prompts explaining the feature and where I wanted it to be, Gemini kept implementing a start/stop feature for the tasks. I did not need that, and it added needless clicking to something that should be very quick to use.

Once I finally had the Gemini version working to my satisfaction, I looked at the Claude version for the first time. It already looked way better, even though I had interacted with it less. After I had it remove some parts through prompting, the app crashed completely and I could not start it anymore. When I provided the error log, however, Claude immediately fixed all issues and the app ran perfectly afterwards.

This was especially interesting because of the order in which I worked with the two versions. After fighting with Gemini to get what I wanted, I assumed I would run into a bunch of issues when wrapping up the Claude version as well. Instead, Claude kind of ‘got me’ way better. Maybe this kink gets worked out over the coming days of Gemini CLI, but even for other tasks I would prefer Claude, since I found it easier to interact with. Still, Gemini is a good second choice whenever I run into rate limiting or want to work with NotebookLM on a specific topic.

Results

Metric	Claude	Gemini
Initial speed	40 min	20 min
Cost	$20/month	free
Necessary edits	~5 prompts	~25 prompts
Total time with edits	~1.5 hours	~2.5 hours

[Screenshots: normal app view and analytics view]

Interestingly enough, the UI of the Claude version looks much more refined, even though I had to fiddle with the Gemini UI a lot more. I prompted Gemini several times to add a list of completed tasks, but it just kept trying to add a button to stop a task and then manually start the next one. Claude's analytics view also came out that way on the first try, without any additional input from my side.

Gemini CLI - what I noticed

It asked for permission far less often when making changes, roughly 10x less
It arbitrarily switched from 2.5 Pro to 2.5 Flash without a clear indication or any safeguards. This might have caused the poor performance towards the end
Gemini was significantly faster initially. It finished everything when Claude was about 1/3 of the way through the list
But the time to get everything as I wanted was substantially longer compared to Claude.
Since it was unsure if I had an Apple Developer Account or not, it committed those changes fully commented out, which was nice (since I didn’t have one)

Claude - what I noticed

It assumed I had a developer account. Removing those changes through prompting took a while (I know going back with git would have been possible)
It fully crashed the app at some point because of bad changes made during that deletion
It frequently ignored instructions such as the one to git commit after each change
But it still did a better job in my eyes

Conclusion

I still prefer Claude over Gemini for now
Claude usually gets the gist of what I am trying to convey from less information, while Gemini just does its thing
Gemini's free plan is outrageously generous: up to 1000 requests per day at no cost, plus a large context window
Random model switching in Gemini is a fatal flaw in my eyes. In the middle of a running prompt it goes from 2.5 Pro to 2.5 Flash, without any prompt before or after to acknowledge this in any way

Addendum

In this section, I provide some additional information for the interested reader. Gemini, for example, gives you detailed usage statistics after you quit a session, which I have included here as well.

I also included the ‘generated’ feature set: what the AIs should have implemented in the first place, and in which order. I mainly cared about the few straightforward points I had in mind, but I liked giving the AIs a few more tasks to do, to see whether they would start tripping up because of context issues, bad edits and so on.

Gemini Usage

Taking care of all the todos: 1 & 2

[Screenshots: prompt_todo_1, prompt_todo_2]

Prompting for clean-up

[Screenshot: prompt_clean_up]

Todos

Epic 1: Core Timing & Tracking

This epic covers the fundamental functionality of starting, stopping, and viewing time entries.

CL-1 [MVP]: As a user, I want to input a description for a new task and tap a “Start” button, so that the app begins tracking time for that task.
CL-2 [MVP]: As a user, when I start a new task, I want the previous task to be automatically stopped and saved, so that I have a continuous timeline of my day.
CL-3 [MVP]: As a user, I want to see the currently running task’s name and its live-updating elapsed time on the main screen, so I know what I’m tracking at a glance.
CL-4 [MVP]: As a user, I want to see a simple, chronological list of my past time entries for the day, showing the task name and its final duration.
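
To make the behaviour behind CL-1, CL-2 and CL-4 concrete, here is a minimal, language-agnostic sketch in Python. It is not the Swift code that either tool generated. The names TimeEntry, Tracker, start and finished_on are hypothetical and only illustrate the "starting a new task stops the previous one" rule that Gemini kept replacing with manual start/stop buttons.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import List, Optional


@dataclass
class TimeEntry:
    """One tracked task (hypothetical model, not taken from the real app)."""
    task_name: str
    start_date: datetime
    end_date: Optional[datetime] = None  # None while the task is still running

    def duration_in_seconds(self, now: datetime) -> int:
        # A running task is measured up to "now" (the live timer from CL-3).
        end = self.end_date or now
        return int((end - self.start_date).total_seconds())


@dataclass
class Tracker:
    entries: List[TimeEntry] = field(default_factory=list)

    def current(self) -> Optional[TimeEntry]:
        """Return the running entry, if any."""
        if self.entries and self.entries[-1].end_date is None:
            return self.entries[-1]
        return None

    def start(self, task_name: str, now: datetime) -> TimeEntry:
        """CL-1 and CL-2: starting a task stops and saves the previous one."""
        running = self.current()
        if running is not None:
            running.end_date = now  # no separate "Stop" tap needed
        entry = TimeEntry(task_name, now)
        self.entries.append(entry)
        return entry

    def finished_on(self, day: date) -> List[TimeEntry]:
        """CL-4: finished entries that started on the given day, oldest first."""
        return [
            e for e in self.entries
            if e.end_date is not None and e.start_date.date() == day
        ]
```

With this shape, the only interaction is typing a name and tapping "Start"; the timeline stays continuous because the end of one entry is always the start of the next.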

Epic 2: Data Management & Export

This epic covers how the user’s data is stored, managed, and retrieved.

CL-5 [MVP]: As a user, I want all my time entries to be saved locally on my device, so that my data persists even if I close the app or restart my phone.
CL-6 [MVP]: As a user, I want to tap an “Export” button that generates a CSV file of all my time entries, so that I can analyze my data in a spreadsheet program.
Acceptance Criteria: The CSV file must be shareable via the standard iOS Share Sheet.
Acceptance Criteria: The CSV must contain at least the following columns: task_name, start_date, end_date, duration_in_seconds.
CL-7 [Post-MVP]: As a user, I want to be able to edit the name and start/end times of a past time entry, so I can correct mistakes.
CL-8 [Post-MVP]: As a user, I want to be able to delete a time entry, so I can remove accidental or irrelevant logs.
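
The CSV acceptance criteria under CL-6 can be sketched like this, again in Python rather than the app's Swift. The column names come from the criteria above. The ISO 8601 date format and the helper names export_csv and total_seconds_per_task are my own assumptions, since the todo list does not specify a date format. The second function shows the kind of analysis I had in mind for the export.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Column names as required by the acceptance criteria for CL-6.
COLUMNS = ["task_name", "start_date", "end_date", "duration_in_seconds"]


def export_csv(entries):
    """Build the CSV text from (task_name, start, end) tuples of datetimes.

    Dates are written as ISO 8601 strings, which is an assumption.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for task_name, start, end in entries:
        duration = int((end - start).total_seconds())
        writer.writerow([task_name, start.isoformat(), end.isoformat(), duration])
    return buffer.getvalue()


def total_seconds_per_task(csv_text):
    """Sum duration_in_seconds per task_name from an exported CSV."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["task_name"]] += int(row["duration_in_seconds"])
    return dict(totals)
```

Using the csv module instead of joining strings by hand means task names that contain commas or quotes are escaped correctly, which matters for free-text task descriptions.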

Epic 3: Usability & Enhancements

This epic covers features that improve the user experience and add advanced capabilities.

CL-9 [Post-MVP]: As a user, I want the app to suggest previously used task names as I type, so that I can quickly start common tasks without re-typing.
CL-10 [Future]: As a user, I want to see simple in-app reports, like a bar chart showing total time per task for the week, so I can get quick insights without exporting.
CL-11 [Future]: As a user, I want my data to sync via iCloud, so that my time entries are available and consistent across my iPhone and iPad.
CL-12 [Future]: As a user, I want a Lock Screen Widget to see my currently running timer and its name, so I don’t have to unlock my phone for a status check.
CL-13 [Future]: As a user, I want to add optional notes or tags to a time entry, so I can add more context to my logs.