Tom Butler's programming blog

Git Forked: The decentralised but better connected git ecosystem I'd like to see

In my last post I wrote about GitHub's demise and why I don't think it's a bad thing. I also said that mass migrations to services like GitLab are not the answer to the problem.

Let's go back to the start and look at the pros and cons of a centralised service like GitHub (or hosting on gitlab.com):

The Good

The main benefit of a centralised service like GitHub is that I can quickly and easily post issues or pull requests to any project on the server. Because almost everyone uses GitHub now, I can have a single account and manage everything from a single location.

Before GitHub I needed a different Bugzilla account for every single project I wanted to report a bug to. I'm lazy. There were a lot of bugs I never bothered reporting because I'd spend more time registering, inevitably getting the captcha wrong and checking my email to activate my account than I would finally posting the bug report.

GitHub solved this: I already have an account, I can log in and post a bug. I can also more easily contribute via pull requests. If I start my project on GitHub, others can contribute to it with ease.

It also allows me to see all my notifications about all the projects I contribute to or have posted bugs about in a single place.

The Bad

Got a neat idea for a GitHub organisation for your project? Sorry, that name is already taken. Probably by someone with a single repository with 100 lines of code that hasn't been updated in six years. You'll have to think of a new name, sorry!

If you have a private repository on GitHub, it's not that private: GitHub (and now Microsoft) have access to your code.

The Ugly

Your project is also bound by the rules of the platform. Want to use the word "retard" in your project? Sorry, GitHub have deemed that haram and will remove your repo and force you to change it.

By using GitHub you are bound by their terms and conditions. They can delete your repo on a whim for a minor infraction. Unlikely? Maybe, but we know that WebMConverter never saw it coming: the maintainer woke up one morning to find the repo had been disabled.

We live in an age where what is deemed "offensive" is growing exponentially. An innocent project name or description today may be too naughty for GitHub tomorrow. In 5, 10 or 15 years' time, abandoned projects where the maintainer cannot be contacted may be deleted because that dongle joke you made in the readme now violates the terms of service.

If you rely on centralised platforms like GitHub (or OneDrive, Google Docs, etc.) you have no control over your own data. The company that owns the servers could take them offline one day without any prior notice. This is extremely unlikely to happen, but I'm a firm believer that people should be in control of their own data and projects.

There are also well documented privacy and security issues with centralised services that I won't go into here.

A Solution

I am not proposing this as the only answer but this is what I personally would like to see happen.

We need to move back to decentralised services. Host your project on your own website where you are in control of everything.

Everyone should run their own GitLab (or other service) and track issues on their own server. Each project is then 100% in control of its own data and can choose its own policies, and private code and discussions are truly private.

This brings back the drawbacks I mentioned earlier, but we are already part of the way to solving them. A small push to update the way login systems are handled could see a change, and I think GitLab are in the best position to solve it.

Back to 200 Accounts?

We need to avoid the problems of decentralised services if we're going to go back to them. The main issue is user accounts. I do not want to have to create an account on every server I visit.

Every server should offer OAuth (or similar). I can register an account on my private server, tom@r.je, and use it to log in to git.kde.org or kernel.org or any other website. We'd need a replacement for OAuth, or a fix to it, to do that, but it's not a difficult problem to solve.

Use DNS to configure the authentication: If I try to log in to kde.org with tom@r.je, kde.org does a DNS lookup of, for example, oauth.r.je and takes me through the authentication process on my server. I can then accept the login.
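To make the DNS idea a bit more concrete, here is a minimal Python sketch of the first step: working out which hostname to look up for a given account. The "oauth." prefix is just the example name used above, and the function name auth_host_for is made up for illustration; nothing here is an existing standard. The actual DNS lookup and the OAuth-style redirect would follow once the hostname is known.

```python
def auth_host_for(account: str, prefix: str = "oauth") -> str:
    """Return the hostname to look up for an account such as 'tom@r.je'.

    Assumption: the authentication server lives at '<prefix>.<domain>',
    following the oauth.r.je example. This is a hypothetical convention.
    """
    # Split on the last '@' so the domain is everything after it.
    user, sep, domain = account.rpartition("@")
    if not sep or not user or not domain:
        raise ValueError(f"not a user@domain account: {account!r}")
    # DNS names are case-insensitive, so normalise the domain.
    return f"{prefix}.{domain.lower()}"


# kde.org would run this when I type in my account name:
print(auth_host_for("tom@r.je"))
```

The site I am logging in to never sees my password. It only needs my account name to find my server, and my server does the rest.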

This currently kind of works. The problem is that (at least as I understand it) to accept OAuth logins, a server has to register with the services it wants to accept logins from. That is, if I configure my website to allow logins with a GitHub account, you can log in with a GitHub account, but you can't log in with a Google account unless that is configured separately. And it requires configuration at both ends (on Google and on your local server).

I'd like to see a single protocol based on OAuth that allows anyone to easily set up a server and a client. I can then log in to any other website with my single account. I would be able to set up an authentication server on r.je and then use my r.je account to log in to any website which was running a client. Anyone who had registered with any OAuth server could log into my website using their authentication server.

Are you a Symfony developer who wants to post a bug report on one of my projects? Log in to git.r.je with your @symfony.com account and post a bug. When you log in, you are actually logging in to Symfony.com, which does the validation and then approves my site to see some of your details.

Yes, we'd need some nuance with permissions: By default after logging in to something, the server you've logged into should only have your display name, profile URL (e.g. for me http://git.r.je/tom) and possibly avatar.
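As a rough sketch of that default, the authentication server could filter the profile before handing anything to the site being logged in to. The field names (display_name, profile_url, avatar_url, email) and the shared_profile function are assumptions for illustration, not part of OAuth or GitLab.

```python
# Fields shared with every site by default (hypothetical names).
DEFAULT_FIELDS = ("display_name", "profile_url", "avatar_url")


def shared_profile(profile: dict, extra_granted=()) -> dict:
    """Return only the fields the remote site is allowed to see.

    'extra_granted' holds any additional fields the user has explicitly
    approved for this particular site.
    """
    allowed = set(DEFAULT_FIELDS) | set(extra_granted)
    return {key: value for key, value in profile.items() if key in allowed}


me = {
    "display_name": "Tom",
    "profile_url": "http://git.r.je/tom",
    "email": "tom@r.je",
}
# By default the email address stays on my own server.
print(shared_profile(me))
```

Anything beyond the defaults, such as an email address, would only be passed on if I had explicitly approved it for that site.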

Notifications

Rather than using email for notifications, a notification protocol could be set up. When you log in to a server you allow it to send you notifications (and can revoke the privilege at any time).

When I respond to the issue you posted on my GitLab server, my GitLab server will send a notification to your GitLab server. You can log in to your own GitLab server where your account is located and see notifications about all the projects you have contributed to. Your GitLab server can then send you an email if you configure it to do so.

If I post a bug on git.kde.org I log in to git.r.je and see your response. Clicking on the notification takes me to git.kde.org to see the full thread.

Yes, we'd need some sanity checks here. GitLab servers should not blindly accept arbitrary notifications. If they did we'd log in and see endless notifications about "Enlarge your dongle" and "Cant get it up loaded? Buy Viagra".
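The simplest version of that sanity check is an allow list: my server only accepts notifications from servers I have logged in to and approved, and I can revoke that at any time. Here is a minimal Python sketch. The Inbox class and its method names are invented for illustration, and the issue URL in the usage lines is a made-up example. It also skips a step a real protocol would need, namely proving that a notification really comes from the server it claims to come from.

```python
from dataclasses import dataclass, field


@dataclass
class Inbox:
    """Notifications for one user, held on their own server."""

    allowed_servers: set = field(default_factory=set)
    notifications: list = field(default_factory=list)

    def allow(self, server: str) -> None:
        # Called when I log in to a server and permit notifications.
        self.allowed_servers.add(server.lower())

    def revoke(self, server: str) -> None:
        # I can withdraw the privilege at any time.
        self.allowed_servers.discard(server.lower())

    def receive(self, server: str, message: str, url: str) -> bool:
        """Store the notification if the sender is allowed, else drop it."""
        sender = server.lower()
        if sender not in self.allowed_servers:
            return False
        self.notifications.append((sender, message, url))
        return True


inbox = Inbox()
inbox.allow("git.kde.org")
inbox.receive("git.kde.org", "New reply to your bug", "https://git.kde.org/example/issue/1")
inbox.receive("spam.example", "Enlarge your dongle", "https://spam.example/")
print(len(inbox.notifications))
```

Only the notification from git.kde.org is stored. The one from the server I never approved is dropped before it reaches me.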

Wishful Thinking

This is unlikely to happen, of course. It would involve standardisation across different platforms. However, I think GitLab is in a very strong position to get the ball rolling.

If you self-host GitLab then you can enable OAuth and let people log in with their GitLab or GitHub account. This mostly solves the problem. If GitLab also acted as an authentication server and allowed anyone to log in to any GitLab installation from any other GitLab installation, then we'd really start to see decentralisation work.

Yes, we'd need restrictions, and it should be possible to disable this cross-site login functionality, but for small projects and most people this would be ideal. I don't want people to have to register with my server just to report a bug.

I could sit here and draft a specification or even put together a proof of concept, but unless a medium to big player implements it, it's not going to go anywhere. I'd love to see GitLab try this. They seem to have the kind of philosophy where they wouldn't see decentralised accounts (and not being able to track all their users as easily) as a bad thing.

For now, I'd like to see each project hosting its own GitLab (or similar) installation and allowing logins with GitLab and GitHub accounts. When enough people are doing this we should start to see the rest fall into place.