I totally disagree. Git is not hard. The way people learn git is hard. Most developers learn a couple of commands and believe they know git, but they don’t. Most teachers teach how to use those commands and some more advanced ones, but this does not help anyone understand git. Learning commands sucks. It is like a cargo cult: you just do something similar to what others do and expect the same result, but you don’t understand how it works and why sometimes it does not do what you expect.
To understand git, you don’t need to learn commands. Commands are simple and you can always consult a man page to know how to do something if you understand how it should work. You only need to learn core concepts first, but nobody does. The reference git book is “Pro Git” and it perfectly explains how git works, but you need to start reading from the last chapter, 10 Git Internals. The concepts described there are very simple, but nobody starts learning git with them, almost nobody teaches them in the beginning of classes. That’s why git seems so hard.
Ahhhhh, that’s why! I should’ve known to read from the end, not the beginning lmao. Jokes aside, thanks for the advice, I’ll try it out :)
Authors should write it in the opposite order.
I agree, the teaching is wrong. I always teach it visually. That seems to do the trick
Came here to say the same thing. The git book is an afternoon’s reading. It’s well worth the time - even if you think you know git.
People complain about the UX of the cli tool (perhaps rightly) but it’s honestly little different from the rest of the unix cli experience: ad hoc, arbitrary, inconsistent.
What’s important is a solid mental model and the vocabulary of primitive and compound operations built with it. How you spell it in the cli is just a thing you learn as you go.
I disagree, hard.
I disagree with the general conclusion - I think it’s very easy to understand*: each repo has a graph of commits. Each commit includes the diff and metadata (like parent commits). There is a difference between your repo seeing the state of another repo (fetch) and copying commits from another repo into your repo (merge; pull is just a combination of fetch and merge). Tags are pointers to specific commits, branches are pointers to specific commits that get updated when you add a child commit to this commit. That’s a rather small set of very clear concepts for such a complex problem.
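For what it’s worth, that model fits in a few lines of Python. This is only a toy sketch, not how git is implemented: `ToyRepo`, `Commit`, the `content` field and the id scheme are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple
import hashlib


@dataclass(frozen=True)
class Commit:
    content: str                  # stand-in for whatever the commit records
    parents: Tuple[str, ...]      # ids of the parent commits
    message: str

    @property
    def id(self) -> str:
        # Hypothetical id scheme: a hash over everything the commit contains.
        raw = repr((self.content, self.parents, self.message)).encode()
        return hashlib.sha1(raw).hexdigest()[:8]


@dataclass
class ToyRepo:
    commits: Dict[str, Commit] = field(default_factory=dict)
    branches: Dict[str, str] = field(default_factory=dict)  # name -> commit id
    tags: Dict[str, str] = field(default_factory=dict)      # name -> commit id
    current: str = "main"

    def commit(self, content: str, message: str) -> str:
        tip = self.branches.get(self.current)
        parents = (tip,) if tip else ()
        c = Commit(content, parents, message)
        self.commits[c.id] = c
        self.branches[self.current] = c.id   # the branch pointer moves forward
        return c.id

    def tag(self, name: str) -> None:
        self.tags[name] = self.branches[self.current]  # a tag stays where it is


repo = ToyRepo()
first = repo.commit("files, version 1", "initial")
repo.tag("v1.0")
second = repo.commit("files, version 2", "add feature")
# The tag still points at `first`; the branch "main" now points at `second`.
```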
I also disagree with a lot of the reasoning. Like “If a commit has the same content but a different parent, it’s NOT the same commit” is not an “alien concept”. When I apply the same change to different parents, I end up with different versions. Which would be kinda bad for a Version Control System.
“This in turn means that you need to be comfortable and fluent in a branching many-worlds cosmology” - yes, if you need to handle different versions, you need to switch between them. That’s the complexity of what you’re doing, not the tool. And I like that Git is not trying to hide things that I need to know to understand what’s happening.
“distinguish between changes and snapshots that have the same intent and content but which are completely non-interchangeable and imply entirely different flows of historical events” How do you even end up in a situation like that? Anyway, sounds like you should be able to merge them without conflicts, if they are in fact completely interchangeable?
“The natural mental model is that names denote global identity.” Why should another repo care, which names I use? How would you even synchronize naming across different repos without adding complexity, e.g. if two devs created a branch “experimental” or “playground”. Why on earth should they be treated as the same branch?
“Git uses the cached remote content, but that’s likely out of date” I actually agree that this can lead to some errors and confusion. But automation exists - you can just fetch every x minutes.
“Branches aren’t quite branches, they’re more like little bookmark go-karts.” A dev describing what basically is just a pointer in this way leads to the suspicion that it might not be Git’s mental model that is alien.
“My favorite version of this is when the novice has followed someone’s dodgy advice to set pull.rebase = true” Maybe don’t do stupid stuff you don’t understand? We know what fetch is, we know what merge is. Pull is basically fetch & merge.
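If it helps, here is "pull is basically fetch & merge" as a toy Python sketch. Each repo is just a dict from ref name to commit id, the ids are made-up strings, and the merge step only covers the fast-forward case (a real merge may create a merge commit, which is left out here).

```python
# Toy model: a "repo" is a dict of ref name -> commit id (plain strings).
def fetch(local: dict, remote: dict) -> None:
    """Refresh the local cache of the remote's branches; local branches stay put."""
    for name, commit_id in remote.items():
        local[f"origin/{name}"] = commit_id


def merge_fast_forward(local: dict, branch: str, other: str) -> None:
    """Simplest possible merge: move `branch` to where `other` points."""
    local[branch] = local[other]


def pull(local: dict, remote: dict, branch: str) -> None:
    fetch(local, remote)
    merge_fast_forward(local, branch, f"origin/{branch}")


remote = {"main": "c3"}
local = {"main": "c1", "origin/main": "c1"}   # the cached view is out of date

fetch(local, remote)
# Now local["origin/main"] is "c3", while local["main"] is still "c1".
```

Calling `pull(local, remote, "main")` afterwards moves `local["main"]` to "c3" as well.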
““Pull” presents the illusion that you can just ask Git to make everything okay for you” Just… what? The rest of the sentence doesn’t really fix this error in expectations.
* except the CLI of course, but I can use GUI tools for most tasks
Each commit includes the diff and metadata (like parent commits).
Commits don’t store diffs, so you’re wrong from the start here.
Hence why people say “git is hard”
Yeah, you’re right, technically it’s not a “diff”, it’s the changed files.
I don’t think this technical detail has any consequences for the general mental model of Git though - as evidenced by the fact that I have been using Git for years without knowing this detail, and without any problems.
It’s all the files. Content-addressable storage means that they might not take up any more space. Smart checkout means they might not require disk operations. But it’s the whole tree.
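A small Python sketch of the content-addressable part, assuming git’s default SHA-1 object format and ignoring compression and packfiles. The two snapshots and the `store` dict are hypothetical; the header layout ("blob", the size, a NUL byte) is what git hashes for a file.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Object id git assigns to a file's content (SHA-1 object format)."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical snapshots of two commits: README is unchanged between them.
snapshot_1 = {"README": b"hello\n", "main.py": b"print(1)\n"}
snapshot_2 = {"README": b"hello\n", "main.py": b"print(2)\n"}

store = {}  # object id -> content; identical content is stored only once
for snapshot in (snapshot_1, snapshot_2):
    for content in snapshot.values():
        store[git_blob_id(content)] = content

print(len(store))  # 3 objects for 4 file entries
```

Both snapshots list every file, but the unchanged README resolves to the same id, so it is only stored once.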
One problem, I think, is that git names are kinda bad. A git branch is just a pointer to a commit, it really doesn’t correspond to what we’d naturally think of as a branch in the context of a physical tree or even in a graph.
That’s a bit problematic for explaining git to programming newbies, because grokking pointers is famously one of the stumbling blocks people have, along with recursion. Front-end web developers who never learned C might not really grok pointers due to never really having to deal with them much.
Some other version control systems like mercurial have both a branch in a more intuitive sense (commits have a branch as a bit of metadata), as well as pointers to commits (mercurial, for example, calls them bookmarks).
As an aside, there’s a few version control systems like darcs where instead of the first-class concept being snapshots, it’s diffs. There’s no separate cherrypick command in darcs, it’s just one way you can use the regular commands.
A git branch is just a pointer to a commit, it really doesn’t correspond to what we’d naturally think of as a branch in the context of a physical tree or even in a graph.
But as the article points out, a commit includes all of its ancestors. Therefore pointing to a commit effectively is equivalent to a branch in the context of a tree.
Some other version control systems like mercurial have both a branch in a more intuitive sense (commits have a branch as a bit of metadata), as well as pointers to commits (mercurial, for example, calls them bookmarks).
I mean, git has bookmarks too, they’re called tags.
What happens after you merge a feature branch into main and delete it? What happens to the branch?
Afterwards, what git commands can you run to see what commits were made as part of the feature branch and which were previously on main?
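A rough way to reason about that question, as a Python sketch over a made-up history. It assumes the merge produced a real merge commit whose first parent is the old main and whose second parent is the feature tip; after a fast-forward merge there is no such commit and this information is gone. In git terms the set difference below is roughly what `git log m^1..m^2` lists for a merge commit `m`.

```python
# Toy history: commit id -> list of parent ids (first parent listed first).
# The ids and the shape of the graph are made up for illustration.
parents = {
    "a": [],
    "b": ["a"],          # made on main
    "f1": ["a"],         # feature work
    "f2": ["f1"],
    "m": ["b", "f2"],    # merge commit: first parent main, second parent feature
}


def ancestors(commit: str) -> set:
    """Every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


main_side, feature_side = parents["m"]
only_feature = ancestors(feature_side) - ancestors(main_side)
print(sorted(only_feature))  # ['f1', 'f2']
```

Note that nothing here uses the branch name: once the pointer is deleted, only the shape of the graph remains.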
Mercurial bookmarks correspond to git branches, while mercurial tags correspond to git tags.
Hot take: Git is hard for people who do not know how to read documentation.
The Git book is very easy to read and only takes a couple of hours to read the most significant chapters. That’s how I learnt it myself.
Git is meant for developers, i.e. people who are supposed to be good at looking up online how stuff works.
developers, i.e. people who are supposed to be good at looking up online how stuff works.
How I wish this were true.
Each commit includes the diff
It doesn’t. ☺
In this thread - tons of smart people thinking that the tools we use to replace “make a backup of a file on a server somewhere” should require entire reference books, as if that’s normal.
Saying “it’s a graph of commits” makes no sense to a layperson. Hell, the word “diff” makes no sense. Requiring training to get something right is acceptable, but “using a VCS” is a tiny tiny part of the job, not the whole job. I mean, even most of the commenters on this thread are getting small things wrong (and some are handwaving it away saying “oh that small detail doesn’t matter”).
Look, git is hard. It’s learnable, but it’s hard. The concepts are medium hard to understand, and the way it does things is unique and designed for distributed, asynchronous work - which are usually hard problems to solve.
While I agree 100% with your main point,
"it’s a graph of commits” makes no sense to a layperson
You’re probably putting your standards too low. Every coder should know what a graph is, the basic concept at least. Really if elementary schoolers are supposed to understand multiplication, they can be taught to understand graphs too. Restrict it to “tree” and it’s even easier.
the word “diff” makes no sense
diff is short for difference. And that basically explains it
Saying “it’s a graph of commits” makes no sense to a layperson.
Sure, but git is aimed at programmers, who should have learned graph theory in university. It was part of the very first course I had as an undergraduate many years ago.
Git is definitely hard though for almost all the reasons in the article, perhaps other reasons too. But not understanding what a DAG is shouldn’t be one of them, for the intended target audience.
git gets easier once you get the basic idea that branches are homeomorphic endofunctors mapping submanifolds of a Hilbert space.
(source)
Edit: but to actually have content in this comment, I’m not sure the mental model is the problem. It’s not that alien that a good explanation wouldn’t help, but it took a long time for git to start paying any sort of attention to “human readability.” It was and still is in a way “aggressively technical” and often felt like it purposefully wanted to keep anybody but the most UNIX-bearded kernel hackers from using it. The man pages were rarely helpful unless you already understood git, the options were very unintuitively named, etc etc. And considering Linus’ personality, I’m not exactly surprised.
With a little bit of more thought on how to make it more usable right from the start, I’m not sure it’d have such a reputation as it has now. The reason why I think this endofunctor joke is so funny is that that sort of explanation to “simplify” git wouldn’t have been at all out of place – followed by the UNIX beards scoffing at the poor lusers who didn’t understand their obviously clear description of what git branches are.
Reminds me of the old joke that monads are easy to understand, you just have to realize monads are just monoids in the class of endofunctors.
My favorite version of this is when the novice has followed someone’s dodgy advice to set pull.rebase = true, then they pull a shared branch that they’re collaborating on, into which their coworker has just merged origin/main. Instant Sorcerer’s Apprentice-scale chaos!
Why are you doing that? Don’t do that.
And anyway… it’s trivial to fix. If you still have the commit ID of the tip of the branch before the pull, go back to that. If not, look it up in the reflog. If that’s too much of a hassle, list the commits you only have locally, stash any changes, reset to the origin/the_branch and cherry-pick your commits again and/or apply the stash.
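The reason the reflog route works can be shown with a toy Python model. This is not git’s actual data structure: the `Branch` class, its method names and the commit ids are invented for illustration. The idea is only that every move of a branch pointer is logged, so the previous tip stays findable.

```python
class Branch:
    """Toy branch: a movable pointer that logs every position it has had."""

    def __init__(self, name: str, tip: str):
        self.name = name
        self.tip = tip
        self.reflog = [(tip, "created")]   # newest entry last in this toy

    def move(self, new_tip: str, reason: str) -> None:
        self.tip = new_tip
        self.reflog.append((new_tip, reason))

    def tip_before_last(self, reason: str) -> str:
        """Tip the branch had just before the most recent move with `reason`."""
        for i in range(len(self.reflog) - 1, 0, -1):
            if self.reflog[i][1] == reason:
                return self.reflog[i - 1][0]
        raise LookupError(f"no reflog entry with reason {reason!r}")


branch = Branch("the_branch", "c1")
branch.move("c2", "commit")
branch.move("x9", "pull")                 # the pull that went wrong
good = branch.tip_before_last("pull")     # the tip from before the pull
branch.move(good, "reset")                # go back; the reset is logged too
```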
I really embraced git once I understood that whatever I did locally, it’s most of the time relatively easy to recover from cock-ups. And it’s really difficult to lose work from the moment you’ve added it to a (local) commit or stashed it.
I do understand that git is daunting however, and there is plenty where I think the defaults are bad. Too often I’ve seen merge commits where someone merged the remote of a branch into the local copy of the same branch, or even this on main. And once this stuff gets pushed it’s nigh impossible to go back.
In my rather short (5ish years professionally) career, I needed to rebase once. And it was due to some releasing fuck up: a branch had to be released earlier and hence needed to be rebased on another feature branch scheduled for release.
Otherwise, fetch » pull » merge, all day, every day.
I rebase almost daily. I (almost) never merge the main branch into a feature branch, always rebase. I don’t see the point of polluting the history with this commit (assuming I’m the only dev on this branch). I also almost always do an interactive rebase before actually pushing a branch for the first time, in order to clean up commits. I mostly recreate my commits from scratch before pushing, but even then I sometimes forget to include a change in a commit I’ve just made so I then do an interactive rebase to fold fixup commits into the commits they should’ve been in.
I like merging for actually adding commits from a feature branch to main (or release or …)
I honestly don’t get why folks dislike rebase. I use it constantly, especially to squash commits so that my pull requests are a single commit that can be reverted easily.
It’s also kinda annoying to have a history full of “merge” commits polluting the commit messages and an entwined mix of parallel branches crossing each other at every merge all over the timeline. Rebasing makes things so much cleaner, keeping the branches separate until a proper merge is needed once the branch is ready.
I use rebase when I’m working in a dev branch. If someone else has pushed changes to the main branch, rebasing the dev branch on top of main is a way to do the hard work of resolving merge conflicts up front. Then I can rerun tests and make sure everything still works with changes from the main branch. And finally, when it is time to merge my dev branch to main, it’s a simple fast-forward.
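Here is that workflow as a toy Python sketch, under loud assumptions: history is linear, conflicts are ignored, and the id scheme (a hash of parent plus change) is invented for illustration. It shows two things: after the rebase, main is an ancestor of the dev tip, so the merge can be a fast-forward; and the replayed commits get new ids because their parent changed.

```python
import hashlib

commits = {}  # commit id -> (parent id or None, change)


def make_commit(parent, change: str) -> str:
    # Hypothetical id scheme: the id depends on the parent as well as the change.
    cid = hashlib.sha1(f"{parent}|{change}".encode()).hexdigest()[:8]
    commits[cid] = (parent, change)
    return cid


def is_ancestor(older, newer) -> bool:
    """True if `older` is reachable by walking parents from `newer`."""
    while newer is not None:
        if newer == older:
            return True
        newer = commits[newer][0]
    return False


def rebase(tip, old_base, new_base) -> str:
    """Replay the changes between old_base and tip on top of new_base."""
    changes = []
    while tip != old_base:
        parent, change = commits[tip]
        changes.append(change)
        tip = parent
    for change in reversed(changes):
        new_base = make_commit(new_base, change)
    return new_base


base = make_commit(None, "initial")
dev = make_commit(make_commit(base, "dev 1"), "dev 2")
main = make_commit(base, "hotfix")              # someone else pushed to main

can_ff_before = is_ancestor(main, dev)          # False: the histories diverged
rebased_dev = rebase(dev, base, main)
can_ff_after = is_ancestor(main, rebased_dev)   # True: merging is a fast-forward
```

Since `rebased_dev` is a different id than `dev`, anyone who built on the old `dev` commits is now on a diverged history, which is the risk with rebasing branches that were already pushed.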
Because rebase is fraught with peril, if you also push rebased branches upstream and someone else works off that branch.
If you stick to the rule of only using rebase on local branches that have never been pushed upstream, it’s an awesome tool. If you don’t, you’re eventually going to cause someone to have a bad day.
Yeah, basically anything that rewrites already pushed history and is then (force-)pushed is bound to create problems (unless it’s a solo dev only ever coding on a single device, who uses the remote repo as a mere backup solution).
I might be suffering from Stockholm syndrome here, but my preferred ways of working with git are the cli and the fugitive vim plugin, which is a fairly thin wrapper around the cli. It does take a middle ground approach on hiding the magic and forcing you to learn the magic, which I suppose can be confusing for beginners when you work collaboratively and something happens that forces you to go beyond pull/add/commit/push.
I only stick with these:
- pull
- add
- commit
- push
Easy.
Merge is love merge is life, get the hell out of here with that rebase witchcraft.
In my (admittedly limited) experience, mercurial is much more intuitive than git. I really dislike that git branches are only tags on the heads and completely ephemeral. It favours creating a single clean history instead of preserving what actually happened.
LazyGit is a thing ❤️🙌
Personally it was when I was trying to commit and I got stuck in an authentication loop of git asking for my username or email (even used --global) and it would not work or remember no matter what I tried (was recommended to reinstall mint, yeah no lmao not again).
Ended up installing the unofficial GUI and I’m MUCH happier, but I tell ya, if you bork something in Mint it’s really hard to fix it if you’re not a CLI wizard.
Git GUI wise I can do all the basic stuff I need, and if something breaks then I use the CLI because there’s more documentation on it.
Excellent article