dist-git's Introduction

DistGit

DistGit (Distribution Git) is Git with additional data storage. It is designed to hold the content of source RPMs and consists of three main components:

  1. Git repositories
  2. Lookaside cache to store source tarballs
  3. Scripts to manage both

Read here for information about the most recent release: https://github.com/release-engineering/dist-git/wiki

How Does It Work

An RPM source package typically contains a spec file and the sources (an upstream tarball plus additional patches). Source tarballs, being binary and potentially large, are not well suited to a Git repository: on each update, Git would produce a huge, meaningless diff. DistGit was introduced to address this with an efficient lookaside cache where the tarballs are stored. The Git repo itself is then left to do what it does best: keep track of changes to the spec file, downstream patches, and an additional text file called sources that contains a link to the source tarball in the lookaside cache.
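As a rough illustration of that split, the sketch below prints a checksum line of the kind a sources file could hold for one tarball. The helper name sources_entry is hypothetical, and the BSD-style "SHA512 (file) = hash" layout is an assumption based on what current Fedora tooling writes; in practice the client tools (rpkg or fedpkg) maintain this file when a tarball is uploaded.

```shell
# Hypothetical helper: print a BSD-style checksum line for one tarball.
# Assumes GNU coreutils (sha512sum with the --tag option).
sources_entry() {
    tarball=$1
    # Run inside the tarball's directory so the entry holds only the file name.
    ( cd "$(dirname "$tarball")" && sha512sum --tag "$(basename "$tarball")" )
}
```

For example, sources_entry ./foo-1.0.tar.gz would print one line starting with "SHA512 (foo-1.0.tar.gz) = ", which is small and diff-friendly enough to live in Git while the tarball itself stays in the lookaside cache.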

Video Tutorial

DistGit video tutorial

User Guide

1. Build and Install the Package:

The project is prepared to be built as an RPM package. You can easily build it on Fedora or CentOS with EPEL7 enabled using a tool called tito. To build the current release, use the following command in the repo directory:

$ tito build --rpm 

To build the package and install the resulting RPM in one step, run as root:

# tito build --rpm -i

2. Configuration:

Enable the lookaside cache by copying the example httpd config and adapting it:

# cd /etc/httpd/conf.d/dist-git/
# cp lookaside-upload.conf.example lookaside-upload.conf
# vim lookaside-upload.conf

The lookaside cache communicates over HTTPS, and clients authenticate with an SSL client certificate. The DistGit service provider needs to issue a client certificate for every user.

3. Users and Groups:

All DistGit users need to:

  1. have SSH access to the server with private key authentication
  2. be in a packager group on the server
  3. be provided with an SSL client certificate to authenticate with the lookaside cache

4. Install DistGit Web Interface:

Install Cgit, the web interface for Git:

# dnf install cgit

And point it to the DistGit repositories:

# echo "scan-path=/var/lib/dist-git/git/" >> /etc/cgitrc

It is useful to comment out the cache-size entry in /etc/cgitrc (or set it to zero) so that each page refresh shows the up-to-date repository state.
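A minimal sketch of the first option, assuming GNU sed and a cgitrc that uses plain name=value lines; the helper name disable_cgit_cache is hypothetical:

```shell
# Hypothetical helper: comment out any active cache-size line in a cgitrc file.
# Assumes GNU sed (for -i). Pass the file path, e.g. /etc/cgitrc, and run as root.
disable_cgit_cache() {
    cgitrc=$1
    # "&" re-inserts the matched text, so "cache-size=1000" becomes "#cache-size=1000".
    sed -i 's/^[[:space:]]*cache-size=/#&/' "$cgitrc"
}
```

Lines that are already commented out are left untouched, so the helper can be run repeatedly.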

The web interface will be available at an address like http://your-server/cgit.

5. Systemd Services:

# systemctl start sshd
# systemctl start httpd
# systemctl start dist-git.socket

6. DistGit client tools:

To interact with a DistGit server, you can use the rpkg or fedpkg command-line tools.

7. Deployment:

You can see examples of Ansible deployment scripts in the Fedora Infrastructure dist-git role and the Copr dist-git role.

Related

  • Source-git: a project started in 2020, intended as a layer on top of dist-git.

Developer Guide

Unit tests

$ pytest -v .

Integration tests

Please see beaker-tests/README.md.

LICENSE

The whole project uses the MIT license. The file upload.cgi uses GPLv1.

dist-git's People

Contributors

asamalik, brandongray, clime, dturecek, frostyx, praiskup, puiterwijk, pypingou, rjhjk, schlupov, stratakis, xsuchy

dist-git's Issues

Cannot build

When I try to build the dist-git package, it fails with:

LC_ALL=C rpkg srpm
Wrote: /tmp/rpkg/dist-git-4-93ku35xc/dist-git.spec
error: Bad source: /tmp/rpkg/dist-git-4-93ku35xc/dist-git-1.10.tar.gz: No such file or directory
Failed to execute command.

@clime do you have an idea why it is failing?

[RFE] Identify more reliably upstream archive changes that have potential security implications

Fedora has long tried to detect mismatches between the upstream archives uploaded by packagers in its build system, and the current state of the same archives published by upstream.

A mismatch can indicate a security event Fedora-side, an upstream silently "fixing" its code or, worse, fixing the result of an intrusion, or some other problem. There is no guarantee whatsoever that the archive currently uploaded is the correct archive for Fedora to use, only that it looked good to the packager at the time.

Historical Fedora change detection mechanisms are centered around storing full-archive hashes.

Unfortunately, those mechanisms are now invalidated by the move of many upstreams to hosting platforms where archives are generated dynamically, on demand, from an SCM state. In such a system there is no guarantee whatsoever that the archive hash will stay constant over time. The hosting platform can upgrade, for example, one of the components used to create archives (tar, gzip, bz2, xz), producing archives with different checksums from the same SCM content. It can change the way it names its archive or the topdir within, and so on.

As a result, continuing to use full-archive hashes as a change detection mechanism results in many false positives, deterring human packagers from actually investigating change events. This is bad since some of those events are indications of security problems upstream or Fedora-side.

Therefore it would be nice to upgrade the mechanism to something that stays reliable in the face of dynamic archive generation, for example:

As long as the spec file is unchanged, only warn people if the hash of the content above the topdir in one of the sourceX files has changed:

  • ignore changes in download URL, or archive name (those can change because of upstream http(s) redirections or changes in centralized Fedora macros outside the spec),
  • ignore archive container checksum changes (those can change because of new tar or gzip upstream),
  • ignore topdir renamings within the archive,
  • only check for changes in the content that rpm will process
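The first three points can be sketched as a content hash computed over the extracted files instead of over the archive container. This is only an illustration of the idea, not an existing dist-git feature: the helper name content_hash is hypothetical, GNU tar and sha256sum are assumed, and the RFE's "content above the topdir" is read here as the files inside the archive's top-level directory. File modes, symlinks, and file names containing newlines are not handled.

```shell
# Hypothetical helper: hash the files inside an archive, ignoring the
# container format, the archive name, and the name of the top directory.
# Assumes GNU tar (for --strip-components) and sha256sum.
content_hash() {
    archive=$1
    workdir=$(mktemp -d) || return 1
    # Drop the leading path component so a renamed topdir does not matter.
    tar -xf "$archive" -C "$workdir" --strip-components=1 || { rm -rf "$workdir"; return 1; }
    # Hash every regular file in a stable order, then hash the resulting list.
    ( cd "$workdir" &&
      find . -type f | LC_ALL=C sort |
      while IFS= read -r f; do sha256sum "$f"; done |
      sha256sum | cut -d' ' -f1 )
    rm -rf "$workdir"
}
```

Two archives of the same tree, one gzip-compressed and one with a renamed top directory, give the same value, while a change to any file's content or relative path gives a different one.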

I needed to patch this downstream

-%if 0%{?fedora} || 0%{?rhel} > 7
-Requires:       (dist-git-selinux if selinux-policy-targeted)
-%else
 Requires:       dist-git-selinux
-%endif

This was needed because the Koji build from master fails with: Dependency tokens must begin with alpha-numeric, '_' or '/': Requires: (dist-git-selinux if selinux-policy-targeted)

Provision error

I tried to bring up a Vagrant machine by simply running vagrant up, but it fails at this provisioning step:

==> distgit: Running provisioner: shell...
    distgit: Running: inline script
    distgit: Created symlink /etc/systemd/system/multi-user.target.wants/httpd.service → /usr/lib/systemd/system/httpd.service.
    distgit: Job for httpd.service failed because the control process exited with error code.
    distgit: See "systemctl  status httpd.service" and "journalctl  -xe" for details.
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.

These two SSL-related directives are not configured properly:

SSLCertificateFile: file '/etc/pki/tls/certs/localhost.crt' does not exist or is empty
SSLCertificateKeyFile: file '/etc/pki/tls/private/localhost.key' does not exist or is empty

httpd -t reports syntax error in /etc/httpd/conf.d/ssl.conf.

fatal: ambiguous argument "": unknown revision or path not in the working tree.

The commit that allows configuring a different name for the main branch does not set a default value. If no default_branch is specified in dist_git.conf, SRC_BRANCH expands to an empty string, which leads to this error:

Creating new module branch 'f37' for 'nikromen/test-that-dist-git-builds-from-forks-work-1679990637.4999745/hello'...
fatal: ambiguous argument '': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: not a valid object name: 'master'
ERROR: Branch nikromen/test-that-dist-git-builds-from-forks-work-1679990637.4999745/hello f37 could not be created
/usr/share/dist-git/mkbranch: line 151: popd: directory stack empty

A check along the lines of: if default_branch is None, then SRC_BRANCH="master", would be handy.
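In shell, such a fallback could look like the sketch below. The function name resolve_src_branch is hypothetical, and the variable names simply mirror this issue; they are not taken from the actual mkbranch script.

```shell
# Hypothetical sketch of the suggested fallback; names mirror the issue text,
# not necessarily the real mkbranch script.
resolve_src_branch() {
    default_branch=$1
    # ${var:-word} expands to "word" when var is unset or empty.
    printf '%s\n' "${default_branch:-master}"
}

SRC_BRANCH=$(resolve_src_branch "")   # empty config value falls back to master
```

With a configured value, for example resolve_src_branch "main", the configured name is used unchanged.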

RFE: (optional) grokmirror support

It would be nice if dist-git supported (optionally) grokmirror out of the box.

If grokmirror is installed and the config enables it, we would add a grokmirror hook to all repos that updates a manifest file every time they get a commit.
Then we would generate an initial manifest and place it at the top level.

Users (or infrastructure backups) could then poll every 5 minutes or so and mirror only the Git content that has changed.

https://github.com/mricon/grokmirror
it is already packaged in Fedora/EPEL.

packaging issues

dist-git-selinux depends on dist-git (but not needed)

Installing dist-git-selinux results in:

  Running scriptlet: dist-git-selinux-1.12-3.el8.noarch    1/1
/usr/sbin/restorecon: SELinux: Could not get canonical path for /var/lib/dist-git/cache restorecon: No such file or directory.
/usr/sbin/restorecon: SELinux: Could not get canonical path for /var/lib/dist-git/cache/lookaside restorecon: No such file or directory.
/usr/sbin/restorecon: SELinux: Could not get canonical path for /var/lib/dist-git/cache/lookaside/pkgs restorecon: No such file or directory.
/usr/sbin/restorecon: SELinux: Could not get canonical path for /var/lib/dist-git/git restorecon: No such file or directory.
/usr/sbin/restorecon: SELinux: Could not get canonical path for /var/lib/dist-git/web restorecon: No such file or directory.

  Verifying        : dist-git-selinux-1.12-3.el8.noarch      

Allow using a single local repository with more remote ones (i.e. Fedora, Copr, etc.)

Ideally, it should be possible to set the rpkg URL per branch and use a single repository for managing packages for, e.g., Fedora and Copr, based on the branch currently in use.

So if I'm on the f20 branch, rpkg build would pick the correct configuration and the build would end up in Fedora's Koji. On a branch such as copr_f20, rpkg build would use Copr to build my package, and so on.

This would make things much easier for maintainers, as there would be no need to remember the correct rpkg alias, which in principle is only configuration.
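One way to picture the proposal is a lookup keyed on the current branch name. The sketch below is purely illustrative: the function name, the copr_ prefix convention, and the returned labels are assumptions taken from the example in this issue, not an existing rpkg feature.

```shell
# Hypothetical mapping from a branch name to a build system label.
build_system_for_branch() {
    case $1 in
        copr_*) printf 'copr\n' ;;     # e.g. copr_f20
        *)      printf 'fedora\n' ;;   # e.g. f20
    esac
}

# In a real checkout the branch name could come from:
#   git rev-parse --abbrev-ref HEAD
```

A client could then select the matching configuration from the result instead of requiring the maintainer to pick an alias by hand.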

Outdated Vagrantfile

I wanted to run my own local instance of dist-git for development purposes and discovered several issues with Vagrantfile:

  • It uses F29 which is now EOL
  • It uses Tito but AFAIK the newest dist-git versions were released via rpkg
  • When I tried to do vagrant up, hoping that I would at least get an outdated Fedora with an old dist-git package, vagrant didn't provision the instance successfully.

This is an issue particularly because beaker-tests seem to be tied to vagrant, so at this moment, we are not able to launch the latest version of dist-git and verify that it works.

Support Git-LFS as an alternative to separate lookaside system

When Dist-Git was first conceived by Fedora Infrastructure years ago, there were no recommended mechanisms for dealing with binary data. Today, we have Git-LFS, which is natively supported by GitLab and GitHub.

For fresh deployments, it would make sense to leverage Git-LFS rather than use the older solutions, since Git-LFS is natively supported in git (well, as close to natively as you can get, really). Unless there's a compelling reason not to, I don't see why fresh Git setups wouldn't use it.

"rpkg srpm && rpkg local" failed on fedora

The system I am using is Fedora 28.
I am using the newest code:
[root@bogon dist-git]# git log -1
commit 772daf8 (HEAD -> master, origin/master, origin/HEAD)
Merge: 7f8dc8e c4ed46a
Author: Michal Novotný [email protected]
Date: Tue Feb 16 03:15:22 2021 +0100

Merge pull request #45 from xsuchy/fullpath

specify full path

When I run "rpkg srpm && rpkg local", I get the following error message:
[root@bogon dist-git]# rpkg srpm && rpkg local
Wrote: /tmp/rpkg/dist-git-1-spg05b0g/dist-git.spec
Wrote: /tmp/rpkg/dist-git-1-spg05b0g/dist-git-1.16.tar.gz
auto-packing: This function is deprecated and will be removed in a future release.
Wrote: /tmp/rpkg/dist-git-1-spg05b0g/dist-git-1.16-1.fc28.src.rpm
Wrote: /tmp/rpkg/dist-git-2-mdxsk0np/dist-git.spec
error: Bad source: /tmp/rpkg/dist-git-2-mdxsk0np/dist-git-1.16.tar.gz: No such file or directory
rpmbuild --define '_sourcedir /tmp/rpkg/dist-git-2-mdxsk0np' --define '_specdir /tmp/rpkg/dist-git-2-mdxsk0np' --define '_builddir /tmp/rpkg/dist-git-2-mdxsk0np' --define '_buildrootdir /tmp/rpkg/dist-git-2-mdxsk0np' --define '_srcrpmdir /tmp/rpkg/dist-git-2-mdxsk0np' --define '_rpmdir /tmp/rpkg/dist-git-2-mdxsk0np' -ba /tmp/rpkg/dist-git-2-mdxsk0np/dist-git.spec | tee /tmp/rpkg/dist-git-2-mdxsk0np/.build-1.16-1.fc28.log

Do you have an idea why it is failing?

systemd dependency

Hi,
since 0.3 it is no longer packageable for RHEL 6.
I would like to see support for RHEL 6/CentOS 6.

best regards
j.

Should pkgs-git-repos-list be replaced with repositories

Should this

echo "project-list=/var/lib/dist-git/git/pkgs-git-repos-list" >> /etc/cgitrc

be replaced with the following?

echo "project-list=/var/lib/dist-git/git/repositories" >> /etc/cgitrc

I ask because, after installing dist-git built from the master branch, I only see two directories, repositories and rpms, under /var/lib/dist-git/git/.

API for searching repos

There are several front ends built on top of rpkg that packagers use to work with dist-git. To be friendlier to packagers and newcomers, it would be cool if those front ends could search dist-git for repositories of interest, rather than requiring users to open a browser and go to pkgs.*.org (for example pkgs.fedoraproject.org/cgit/). It could be done with:

fedpkg search python

As a result, all repositories that contain the keyword python would be shown. This could also support searching by namespace.
An API provided by dist-git for searching repositories would make the feature above possible. I'm not sure whether dist-git currently has any API that could be used for this.
