Why Facebook does not use Git – and why most other devs do • DEVCLASS

Why Facebook does not use Git – and why most other devs do • DEVCLASS (devclass.com)
from AnActOfCreation@programming.dev to programming@programming.dev on 18 Jul 2024 03:25
https://programming.dev/post/17013197

Facebook does not use Git due to scale issues with their large monorepo, instead opting for Mercurial.
Mercurial may be a better option for large monorepos, but Git has made improvements to support them better.
Despite some drawbacks, Git usage remains dominant with 93.87% share, due to familiarity, additional tools, and industry trends.

#programming


masterspace@lemmy.ca on 18 Jul 2024 04:14 next collapse

Facebook uses Mercurial, but when people praise their developer tooling it’s not just that. They use their own CLI, which is built on top of Mercurial but further cleans up its errors and commands. It all runs on their own virtual filesystem (EdenFS), they do their dev testing in a customized version of Chromium, they sync code using their own in-house equivalent of GitHub, and all of it connects super nicely into their own customized version of VS Codium.

camr_on@lemmy.world on 18 Jul 2024 04:33 next collapse

Damn that sounds sick

masterspace@lemmy.ca on 18 Jul 2024 04:39 next collapse

The source control was so smooth and pleasant that it convinced me that git isn’t the be-all and end-all, and the general developer focus was super nice, but some of that tooling was pretty janky and poorly documented, and you had no Stack Overflow to fall back on. And some of it (like EdenFS) really felt like the duct tape holding that overloaded monorepo together (complete with all the jankiness of a duct tape solution).

morrowind@lemmy.ml on 18 Jul 2024 05:21 next collapse

And kinda horrifying. If something goes wrong, no Google, it’s straight to IT

computergeek125@lemmy.world on 18 Jul 2024 12:38 collapse

There’s probably specific ticket queues and wiki/doc spaces for each support team.

Problem with an app? Send it to the internal dev/support team. Then if needed it gets routed.

[deleted] on 18 Jul 2024 14:05 collapse

Evotech@lemmy.world on 18 Jul 2024 06:20 collapse

What you can do with 84000 employees

Miaou@jlai.lu on 18 Jul 2024 06:47 collapse

And some good management. Probably not a common opinion around here, but my company is not a tenth of that size, with a hundredth the number of devs, yet different teams still end up copy pasting libraries. Because it’s faster than convincing management DevOps is important.

villainy@lemmy.world on 18 Jul 2024 14:19 next collapse

The inhouse tooling from the massive tech companies is very cool but I always wonder how that impacts transferrable skills. I work in a much smaller shop but intentionally make tech decisions that will give our engineers a highly transferrable skill set. If someone wants to leave it should be easy to bring their knowledge to bear elsewhere.

ByteOnBikes@slrpnk.net on 18 Jul 2024 15:01 next collapse

Speaking from my own experience and a few other seniors I work with, you try to recreate solutions you like at those smaller shops. It may not be identical, but you know what’s possible.

I came into a company that didn’t have a system to manage errors. At my old job, errors would get grouped automatically and work could be prioritized through the groupings. The new company only handled errors when someone happened to see them, passed along by word of mouth.

Immediately went to work setting up a similar system.
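
A minimal sketch of what "grouped automatically" could look like, assuming errors arrive as (type, message) pairs and that stripping numbers from the message is a good enough fingerprint. The function names and sample events are made up for illustration; real error trackers use far richer fingerprints.

```python
import re
from collections import Counter

_DIGITS = re.compile(r"\d+")


def fingerprint(error_type, message):
    # Replace runs of digits so "request 17" and "request 932" share a group.
    return error_type + ": " + _DIGITS.sub("<n>", message)


def group_errors(events):
    """Return (fingerprint, count) pairs, most frequent group first."""
    return Counter(fingerprint(t, m) for t, m in events).most_common()


events = [
    ("TimeoutError", "request 17 timed out"),
    ("KeyError", "missing user 4"),
    ("TimeoutError", "request 932 timed out"),
]
groups = group_errors(events)
print(groups[0])  # ('TimeoutError: request <n> timed out', 2)
```

Sorting by count is the simplest way to prioritize; weighting by affected users or recency would be an equally reasonable choice.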

senkora@lemmy.zip on 18 Jul 2024 16:11 collapse

There’s also a whole industry of ex-Googlers reimplementing Google tooling as SaaS services to sell to other ex-Googlers at other companies.

There’s even a lookup table: github.com/jhuangtw/xg2xg

(some of those are open source projects, some are SaaS services)

lud@lemm.ee on 18 Jul 2024 16:30 next collapse

The inhouse tooling from the massive tech companies is very cool

I agree. I personally know nothing about tooling like this, but I went through the tooling used at Rockstar, for example for GTA V, and it was very cool to see how much they have automated and how they made tools easier to use.

sukhmel@programming.dev on 20 Jul 2024 11:12 collapse

Made easier to use, as in when their codebase was leaked and no one managed to build a game from it?

In-house tools often encourage making a mess that is heavily reliant on those tools or works around their limitations, in my experience.

lud@lemm.ee on 20 Jul 2024 12:42 collapse

People have successfully compiled GTA V if that is what you are saying.

Of course no one would make another game using leaked tools, that would be incredibly stupid.

sukhmel@programming.dev on 20 Jul 2024 14:07 collapse

No, that was what I meant. I thought they hadn’t; it turns out I was wrong.

lud@lemm.ee on 20 Jul 2024 14:15 collapse

Yeah, people successfully compiled and ran the game within a few days of the leak.

I tried myself but I didn’t get it to work. But I’m no developer and I tried doing it in a VM (no way those files touch my real computer) which was annoying so I gave up quite quickly.

sukhmel@programming.dev on 18 Jul 2024 17:32 next collapse

Oh, it does have an impact. And I would expect that to be partly intentional, to keep the devs from hopping away, as they will have a hard time transferring.

On the other hand, onboarding takes longer and costs the company more time and money ¯\_(ツ)_/¯

UFODivebomb@programming.dev on 20 Jul 2024 02:01 collapse

Absolutely does. Source: worked for Amazon.

[deleted] on 19 Jul 2024 07:58 next collapse

Simmy@lemmygrad.ml on 19 Jul 2024 08:00 collapse

They should call it VS Copium.

MajorHavoc@programming.dev on 18 Jul 2024 05:15 next collapse

I’m pleased to report that git has made significant strides, and git submodule can now be easily used to achieve a mono-repo-like level of painful jankiness.

ace@lemmy.ananace.dev on 18 Jul 2024 05:31 next collapse

Mercurial does have a few things going for it, though for most use-cases it’s behind Git in almost all metrics.

I really do like the fact that it keeps a commit number counter, it’s a lot easier to know if “commit 405572” is newer than “commit 405488” after all, instead of Git’s “commit ea43f56” vs “commit ab446f1”. (Though Git does have the describe format, which helps somewhat in this regard. E.g. “0.95b-4204-g1e97859fb” being the 4204th commit after tag 0.95b)
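
A small sketch of reading that describe format, assuming the usual `<tag>-<commits since tag>-g<abbreviated hash>` shape that git prints when HEAD is not exactly on a tag. The second string below is made up for the comparison, and the counts are only comparable when both descriptions share the same tag and line of history.

```python
import re
from typing import NamedTuple


class Described(NamedTuple):
    tag: str            # nearest reachable tag
    commits_since: int  # commits on top of that tag
    short_hash: str     # abbreviated hash, without the "g" prefix


# <tag>-<n>-g<hash>; the tag itself may contain dashes, so match it greedily.
_DESCRIBE = re.compile(r"^(?P<tag>.+)-(?P<n>\d+)-g(?P<hash>[0-9a-f]+)$")


def parse_describe(text: str) -> Described:
    match = _DESCRIBE.match(text.strip())
    if match is None:
        raise ValueError(f"not in <tag>-<n>-g<hash> form: {text!r}")
    return Described(match["tag"], int(match["n"]), match["hash"])


older = parse_describe("0.95b-4204-g1e97859fb")
newer = parse_describe("0.95b-4310-gab446f1")  # hypothetical second build
print(older.commits_since < newer.commits_since)  # True
```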

SkyNTP@lemmy.ml on 18 Jul 2024 07:13 collapse

I suspect rebasing makes sequential commit IDs not really work in practice.

wewbull@feddit.uk on 18 Jul 2024 11:34 collapse

Rebasing updates the commit ids. It’s fine. Commit IDs are only local anyway.

One thing that makes Mercurial better for rebase-based flows is obsolescence markers. The old versions of the commits still exist after a rebase and are marked as being made obsolete by the new commits. This means somebody you’ve shared those old commits with isn’t left in hyperspace when they fetch your new commits. There’s history about what happened being shared.

AnActOfCreation@programming.dev on 18 Jul 2024 12:48 next collapse

Commit IDs are only local anyway.

What do you mean by that?

wewbull@feddit.uk on 19 Jul 2024 18:44 collapse

You and I both clone a repo with ten changes in it. We each make a new commit. Both systems will call it commit 11. If I pull your change into my repo your 11 becomes my 12.

The sequential change IDs are only consistent locally.

AnActOfCreation@programming.dev on 19 Jul 2024 19:13 collapse

Got it! Are they renumbered chronologically? Like if my 11 was created before your 11, would yours be the one that’s renumbered?

wewbull@feddit.uk on 20 Jul 2024 12:32 collapse

No. They are not renumbered. Your 11 is always the same commit. It’s consistent locally (which is what I mean by “local only”) otherwise they’d change under your feet. You just can’t share them with others and expect the same results. You have to use the hash for that.
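
The behaviour described above can be sketched with a toy model, assuming a repo is just a list of hashes in local arrival order and a revision number is a position in that list. `ToyRepo` and the hashes are hypothetical; numbering is 1-based here to match the example in this thread, whereas real Mercurial revision numbers start at 0.

```python
class ToyRepo:
    """Toy model: hashes are global identities, numbers are local positions."""

    def __init__(self, hashes=()):
        self.order = list(hashes)  # changesets in local arrival order

    def clone(self):
        return ToyRepo(self.order)

    def commit(self, changeset_hash):
        self.order.append(changeset_hash)

    def pull(self, other):
        # Only append what we lack; numbers already handed out never move.
        for changeset_hash in other.order:
            if changeset_hash not in self.order:
                self.order.append(changeset_hash)

    def rev(self, changeset_hash):
        return self.order.index(changeset_hash) + 1


upstream = ToyRepo(f"c{i}" for i in range(1, 11))  # ten shared changes
mine, yours = upstream.clone(), upstream.clone()
mine.commit("aaa")
yours.commit("bbb")
print(mine.rev("aaa"), yours.rev("bbb"))  # 11 11
mine.pull(yours)
print(mine.rev("aaa"), mine.rev("bbb"))   # 11 12
```

The hash is what both sides agree on; the number only says when a changeset arrived in one particular clone.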

FizzyOrange@programming.dev on 18 Jul 2024 13:28 collapse

That’s exactly the same in git. The old commits are still there, they just don’t show up in git log because nothing points to them.

aport@programming.dev on 18 Jul 2024 17:16 collapse

Old, unreachable commits will be garbage collected.
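
To make "unreachable" concrete, here is a toy sketch, assuming commits are a dict of parent links and branches are the only references. The commit names are invented, and real git also counts reflog entries as references and waits out a grace period before pruning, which this model ignores.

```python
def reachable(parents, tips):
    """Return every commit reachable from the given tips via parent links."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


# old1/old2 are the pre-rebase commits, new1/new2 their rewritten versions.
parents = {
    "base": [],
    "old1": ["base"],
    "old2": ["old1"],
    "new1": ["base"],
    "new2": ["new1"],
}
branch_tips = {"main": "new2"}  # the branch moved to the rebased commits

unreachable = set(parents) - reachable(parents, branch_tips.values())
print(sorted(unreachable))  # ['old1', 'old2']
```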

FizzyOrange@programming.dev on 18 Jul 2024 21:10 collapse

Does that not happen with Mercurial? If not that seems like a point against it.

aport@programming.dev on 19 Jul 2024 02:51 collapse

I’m confused, the behavior you just said was “exactly the same in git” is now a problem for Mercurial?

FizzyOrange@programming.dev on 19 Jul 2024 09:13 collapse

I thought it was exactly the same based on the description.

wewbull@feddit.uk on 19 Jul 2024 10:49 collapse

No, the old commit is always there, marked as obsolete with the information of what it became. No holes in history. (Assuming you use the obsolescence markers.)
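
A rough sketch of the idea, assuming each marker is simply "old hash was replaced by new hash". This is a toy dict, not Mercurial's storage format or API, and the hashes are placeholders; real markers carry more information than a single successor.

```python
def latest_successor(commit, obsoleted_by):
    """Follow markers from an obsolete commit to its current replacement."""
    seen = set()
    while commit in obsoleted_by:
        if commit in seen:
            raise ValueError("cycle in obsolescence markers")
        seen.add(commit)
        commit = obsoleted_by[commit]
    return commit


# a1 was rebased into b2, which was later amended into c3.
markers = {"a1": "b2", "b2": "c3"}

print(latest_successor("a1", markers))  # c3
print(latest_successor("c3", markers))  # c3 (not obsolete, returned as-is)
```

Someone who still has `a1` checked out can be told where their work should move, instead of finding a commit that silently vanished upstream.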

Evotech@lemmy.world on 18 Jul 2024 06:23 next collapse

And Google uses Piper to do the same thing

dl.acm.org/doi/10.1145/2854146

[deleted] on 18 Jul 2024 06:43 next collapse

0x0@programming.dev on 18 Jul 2024 08:54 next collapse

That’s hardly the VCS’s fault.

Sir_Kevin@lemmy.dbzer0.com on 18 Jul 2024 10:37 collapse

That depends on who you believe their customer to be.

Treczoks@lemmy.world on 18 Jul 2024 06:53 next collapse

What kind of RCS is used always depends on the organisation. We are actually using both Git and SVN, and each makes sense for the department that uses it.

x1gma@lemmy.world on 18 Jul 2024 08:03 collapse

Serious question, why do they use SVN, as in what does SVN do better than Git for the department using it?

ra1d3n@lemm.ee on 18 Jul 2024 08:40 next collapse

The manager likes it.

Malunga@derpzilla.net on 18 Jul 2024 09:17 next collapse

Because we always used it!

Mikina@programming.dev on 18 Jul 2024 09:56 next collapse

While I’m not using it, since we started our small-team hobby project in git and moving away from it would be a bother, there is one use-case of SVN that would save us a lot of headaches.

SVN being centralized means you can lock files. Merging Unity scenes together is a real pain: the tooling mostly doesn’t work properly and you have no way to quickly check that nothing was lost. Usually, with several people working on a scene, we ended up deciding whose work to scrap and have redone. Two people would each have made hundreds or thousands of changes to a scene, the Unity mergetool is wonky at best, and checking that all of those changes merged properly would take longer and be more error-prone than simply copying one person’s work over the other’s.

We resorted to simply asking in chat if anyone has any uncommitted work, but with SVN (or any other centralized VCS, I suppose) we wouldn’t have to bother with that - you simply lock the scene file and be safe.
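
As a sketch of why a central server makes this easy: the server can keep one table of who holds which file and refuse everyone else. `LockRegistry`, the file name, and the user names are hypothetical; this illustrates the idea only and is not how SVN implements locking.

```python
class LockRegistry:
    """Toy central lock table: at most one holder per path."""

    def __init__(self):
        self._holders = {}  # path -> user currently holding the lock

    def lock(self, path, user):
        # setdefault keeps an existing holder, so a second user is refused.
        return self._holders.setdefault(path, user) == user

    def unlock(self, path, user):
        if self._holders.get(path) == user:
            del self._holders[path]
            return True
        return False


locks = LockRegistry()
print(locks.lock("Main.unity", "alice"))    # True
print(locks.lock("Main.unity", "bob"))      # False, alice holds it
print(locks.unlock("Main.unity", "alice"))  # True
print(locks.lock("Main.unity", "bob"))      # True
```

With a fully distributed workflow there is no single table like this to consult, which is why the chat message ends up playing that role.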

x1gma@lemmy.world on 18 Jul 2024 13:25 next collapse

Right, completely forgot that locking exists in SVN, and I guess it definitely makes sense if you’re collaboratively editing unmergeable files.

Thanks!

FizzyOrange@programming.dev on 18 Jul 2024 13:33 collapse

Git LFS does actually support file locking. But in general I find LFS to be hackily pasted onto Git and not very good (as with submodules).

Treczoks@lemmy.world on 18 Jul 2024 18:37 next collapse

SVN has the big advantage of serialized revision numbers, which are essential for our build and release system.

leds@feddit.dk on 19 Jul 2024 21:11 collapse

SVN admin here:

easy to do partial checkouts, no need to clone the entire repo
euuuh…
much simpler for non developers that need version control e.g. engineers

Mikina@programming.dev on 18 Jul 2024 09:59 next collapse

My best VCS experience so far was when working with Plastic SCM. I like how it can track merges, the code review workflow is also nice, and in general it was pretty nice to work with.

Fuck Unity, who paywalled it into unusability, though. Another amazing project that was bought and killed by absurd monetization by Unity, same as Parsec.

computergeek125@lemmy.world on 18 Jul 2024 12:35 collapse

How was Parsec before the acquisition?

I only really have experience after, and it’s the only Unity product I’ve actually found that I like. My only major complaint is that it’s not compatible with the base configuration of Palo Alto, but that’s really more of a Palo Alto problem than a Parsec problem.

Mikina@programming.dev on 18 Jul 2024 15:43 collapse

I still use Parsec for remote, and I don’t have any issue with it, it works great and I like it. However, they also did offer a free SDK (Unity plugin) to integrate remote play into your game natively (just like you can have “Invite to Steam Remote Play” button from Steam SDK), which was exactly what we needed - and Steam Remote was never working without issues for us, in comparison to Parsec which worked amazingly well every time we tried it.

I found numerous mentions of Parsec SDK and how easy it is to integrate, but after Unity bought it, I couldn’t find it anywhere. Only mention was that if you need it, you should contact them.

So I did that, mentioning that we are a small team of students working on an offline, co-op-only, 2-player game in our free time. Since Steam Remote wasn’t working for us and I had great experience with Parsec, I asked what we had to do to get access to the SDK/Unity plugin.

Unity’s answer? Sure, no problem, they will be happy to give us access, with first step being that we pay them 1 000 000$ for it.

Like, wtf? Did they even read the email? How out of touch do you have to be to casually ask a small student team to pay 1 000 000$?

computergeek125@lemmy.world on 18 Jul 2024 16:56 collapse

Okay that’s fair. Their pricing is awful in general, and that’s especially egregious for something that used to be free

mke@lemmy.world on 18 Jul 2024 11:31 next collapse

As far as performance goes, Microsoft did manage to make git work for them later on (…with many contributions upstreamed and homegrown solutions developed—but then, Facebook is the same, isn’t it?).

MonkderDritte@feddit.de on 18 Jul 2024 12:16 next collapse

Split up the monorepo?

FizzyOrange@programming.dev on 18 Jul 2024 13:31 collapse

That brings more problems. Despite the scaling challenges monorepos are clearly the way to go for company code in most cases.

Unfortunately my company heavily uses submodules and it is a complete mess. People duplicating work all over the place, updates in submodules breaking their super-modules because testing becomes intractable. Tons of duplicate submodules because of transitive dependencies. Making cross-repo changes becomes extremely difficult.

bellsDoSing@lemm.ee on 18 Jul 2024 17:00 collapse

But if not for using submodules, how can one share code between (mono-)repos, which rely on the same common “module” / library / etc.? Is it a matter of “not letting submodules usage get out of hand”, sticking to an “upper limit of submodules”, or are submodules to be avoided entirely for monorepos of a certain scale and there’s a better option?

nous@programming.dev on 18 Jul 2024 18:19 collapse

You don’t share code between monorepos, the whole point of a monorepo is you only have one repo where all code goes. Want to share a library, just start using it as it is just in a different directory.

Submodules are a poor way to share code between lots of small separate repos. IMO they should never be used as I have never seen them work well.

If you don’t want a monorepo then have your repos publish code to artifact stores/registries that can be reused by other projects. But IMO that just adds more complexities and problems than having everything in a single repo does.

bellsDoSing@lemm.ee on 18 Jul 2024 20:29 collapse

So AFAIU, if a company had:

frontend
backend
desktop apps
mobile apps

… and all those apps would share some smaller, self developed libraries / components with the frontend and/or backend, then the “no submodules, but one big monorepo” approach would be to just put all those apps into that monorepo as well and simply reference whatever shared code there might be via relative paths, effectively tracking “latest”, or maybe some distinct “stable version folders” (not sure if that’s a thing).

Anyway, I certainly never thought to go that far, because having an app that’s “mostly independent” from a codebase perspective be in its own repo seemed beneficial. But yeah, it seems to me this is a matter of scale and at some point the cost of not having everything in a monorepo would become too great.

Thanks!

FizzyOrange@programming.dev on 19 Jul 2024 20:41 collapse

Yeah exactly that. Conceptually it’s far superior to manyrepos. But it does have downsides:

git will be slower, and it doesn’t really have great support for this way of working. I mean it provides raw commands for partial checkouts… but you’re kind of on your own.
You can’t realistically view a git log --graph any more since there will be just way too many commits. Though tbf you can get to that state without a monorepo if you have a big project and work with numskulls who make 50 commits for a small MR and don’t squash.

Also it’s not really a downside since you should be doing this anyway, but you need to use a build tool that sandboxes dependencies so it can guarantee there are no missing edges in your dependency graph (Bazel, Buck, Pants, Please, Landlock Make, etc.). Otherwise you will be constantly breaking master when things aren’t checked in CI that should be.
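
A toy illustration of what a "missing edge" is, assuming you somehow know both what each target declares and what it actually uses. The target names are invented, and the tools listed above enforce this by sandboxing the build rather than by comparing two dicts like this.

```python
def missing_edges(declared, used):
    """Map each target to the dependencies it uses but never declared."""
    gaps = {}
    for target, deps in used.items():
        undeclared = set(deps) - set(declared.get(target, ()))
        if undeclared:
            gaps[target] = undeclared
    return gaps


declared = {"frontend": {"ui_lib"}, "backend": {"db_lib", "auth_lib"}}
used = {"frontend": {"ui_lib", "auth_lib"}, "backend": {"db_lib"}}

print(missing_edges(declared, used))  # {'frontend': {'auth_lib'}}
```

In this example a change to `auth_lib` would not trigger the `frontend` tests in CI, which is exactly how master ends up broken.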

bellsDoSing@lemm.ee on 20 Jul 2024 10:52 collapse

True, git itself can’t prevent people from creating a mess of a commit graph.

TBH, I’ve never encountered a lot of the build systems mentioned here. But this makes it clearer that one can’t reason about how viable a “one big monorepo only” approach might be by just considering the capabilities of current git, coming from a “manyrepo” mindset. Likely that was the pitfall I fell into coming into this discussion.

Anticorp@lemmy.world on 18 Jul 2024 16:08 next collapse

Because Facebook is a terrible company that can’t even build a functional website. They think they know better than the entire industry, yet can’t get basic features like browser history, link sharing, back buttons, or even comments and zooming working. Fuckin idiots.

collapse_already@lemmy.ml on 19 Jul 2024 03:54 next collapse

I use git daily and still wonder why I had fewer merge issues on a larger team in the 1990s with command line rcs on Solaris. Maybe we were just more disciplined then. I know we were less likely to work on the same file concurrently. I feel like I spend more time fighting the tools than I ever used to. Some of that is because of the dumb decisions that were made on our project a decade or more ago.

EatATaco@lemm.ee on 19 Jul 2024 10:07 collapse

I know we were less likely to work on the same file concurrently.

I mean, isn’t that when merge conflicts happen? Isn’t that your answer?

collapse_already@lemmy.ml on 19 Jul 2024 14:53 collapse

I was trying to say that tools were better about letting us know that another developer was modifying the same file as us, so we would collaborate in advance of creating the conflict.

EatATaco@lemm.ee on 19 Jul 2024 16:29 collapse

I gotcha, I misunderstood

greysemanticist@lemmy.one on 19 Jul 2024 21:07 collapse

Jujutsu is a fresh take on Git: you describe the work you’re about to do with jj new -m 'message'. Do the work. Anything not previously ignored in .gitignore is ready to commit with jj ci. You don’t have to git add anything. No futzing with stashes to switch or refocus work. Need that file back? jj restore FILENAME.

AnActOfCreation@programming.dev on 19 Jul 2024 22:17 collapse

It’s very optimistic to think people will be able to describe what they’re going to do before they do it. I find things rarely go exactly as planned and my commit messages usually include some nuance about my changes that I didn’t anticipate.

greysemanticist@lemmy.one on 20 Jul 2024 15:59 collapse

This is true. But at jj ci you’re plonked into an editor and can change the description.