
citelang's Introduction

CiteLang


Welcome to CiteLang! CiteLang provides methods and graph-based modeling to study software ecosystems. You can use CiteLang for your research, or use a provided tool to generate software graph artifacts, including (but not limited to):

  1. Generate basic software credit trees (citelang graph, badge, or markdown credit)
  2. Give credit accounting for dependencies! (see software-credit.md)
  3. Actions (automation) for the above!

For the examples above, we aren't using DOIs! A manually crafted identifier that a human has to remember to generate, in addition to a publication or release, is too much work for people to reasonably do. As research software engineers we also want to move away from the traditional "be valued like an academic" model. We are getting software metadata and a reference to an identifier via a package manager. This means that when you publish your software, you should publish it to an appropriate package manager.

Getting Started

If you want to use CiteLang as an analysis library, jump into the more detailed ⭐️ Documentation ⭐️ or look specifically at the Python API. As an example analysis, the RSEPedia Software Ecosystem is a completely automated setup that parses and summarizes dependencies across the Research Software Encyclopedia weekly, and it's powered by CiteLang! You can do similar analyses or build your own tools using CiteLang. The sections below give a short summary of the available tools.

Badges

CiteLang Badges can show an entire credit tree for a project:

https://raw.githubusercontent.com/vsoch/citelang/main/docs/assets/img/pypi-citelang.png

or can be generated to be interactive web interfaces as shown here.

https://raw.githubusercontent.com/vsoch/citelang/main/docs/getting_started/img/badge.png

See the badge documentation for more examples of customizing the look, or level of abstraction. You can automatically generate or update a badge for your repository using the provided GitHub Action.

Credit and Graph

If you want to show dependency graphs visually, the Credit command prints a credit tree to the console, or optionally outputs JSON if you want just the data. With the Graph command you can render the same data as different kinds of pretty graphs or data formats (dot, cypher, gexf).

https://raw.githubusercontent.com/vsoch/citelang/main/examples/console/citelang-console-pypi.png

https://raw.githubusercontent.com/vsoch/citelang/main/examples/cypher/graph.png

Contributions

CiteLang has a Contrib command and underlying API that can dig into your git history and look at contributions based on lines. You can read a complete write-up and see examples in this blog post. It is currently being used by the SingularityCE project to say thank you to contributors!

If you want to generate data programmatically, we provide a GitHub action.

Render and Generate

The name CiteLang comes from its original functionality, a "markdown syntax for citations": you can start from a markdown paper that contains some number of CiteLang-formatted references and produce a rendered paper that includes a credit table. This is done with the Render command, or you can output just the table into its own markdown file with Generate. We provide an example here and also provide a GitHub action so you can generate this for your own repository.

Contributors

We use the all-contributors tool to generate a contributors graphic below.

  • Vanessasaurus 💻
  • Dave Trudgian 💻
  • Traceton 💻

License

This code is licensed under the MPL 2.0 LICENSE.

citelang's People

Contributors

dtrudg, github-actions[bot], traceton, vsoch

citelang's Issues

citelang contrib result depends on local REFs due to git log --all

Describe the bug

Because citelang contrib uses git log with --all to find the range of commits to analyze, it may include unrelated commits from any local git REF that has some form of shared history. E.g. if you have a 2nd remote that is a fork of your project, an upstream, or similar, and you have previously checked out a branch from it locally, then git log --all will show commits from it, and it's not possible to filter these out later.

This is because of the meaning of --all which is:

        --all
           Pretend as if all the refs in refs/, along with HEAD, are listed on
           the command line as <commit>.

To Reproduce

  • Clone github.com/sylabs/singularity
  • Add a remote for github.com/apptainer/apptainer & checkout a branch from it
  • Run a citelang analysis between singularity tags and observe that some apptainer commits appear, even though they
    have never been picked or merged into the singularity branches.

Sorry I don't have a specific example right now... working from some rough notes of an analysis I did on my other computer which isn't with me.

Expected behavior

citelang should not use --all with git log and then filter the resulting commit list.

Instead it should ask git log for the specific commit range between --start and --end directly e.g. git log v3.9.1..v3.9.2.
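
A minimal sketch of this suggestion, assuming git is on the PATH. It is not citelang's actual implementation, and the helper names `range_log_args` and `commits_in_range` are hypothetical. The point is that a `start..end` revision range only yields commits reachable from the end ref and not from the start ref, so unrelated local refs cannot leak in.

```python
import subprocess


def range_log_args(start: str, end: str) -> list[str]:
    """Build a git log command limited to the range start..end."""
    # "start..end" selects commits reachable from `end` but not from `start`,
    # so refs from other remotes are never considered (unlike --all).
    return ["git", "log", "--format=%H", f"{start}..{end}"]


def commits_in_range(repo: str, start: str, end: str) -> list[str]:
    """Return commit hashes in start..end for the repository at `repo`."""
    result = subprocess.run(
        range_log_args(start, end),
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. on an unknown tag
    )
    return result.stdout.split()
```

For the example in this issue, the call would be `commits_in_range(".", "v3.9.1", "v3.9.2")` from inside the singularity clone.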

Version that produced the bug

0.0.28

Console table output can have poor contrast

Describe the bug

Because the code is choosing random colors per table column, it can sometimes pick a color that is too close to the terminal background, making it difficult to read.

To Reproduce

Run citelang contrib ... repeatedly, until a poor contrast color combination is chosen.

Expected behavior

A minimum contrast versus the background is used for table columns, or the random colors are picked from those that should be safe (e.g. the basic 1-6 and 9-14 from the 16-color terminal palette, which don't include white or black, to avoid clashes with most common light or dark terminal setups).
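
The second option could look roughly like the sketch below. It is an illustration only, not the code citelang uses: the names `SAFE_COLORS`, `pick_column_colors`, and `colorize` are hypothetical, and it emits plain ANSI 256-color escape sequences rather than going through any table library.

```python
import random
from typing import Optional

# Indices 1-6 and 9-14 of the 16-color terminal palette, i.e. everything
# except the black (0, 8) and white (7, 15) entries.
SAFE_COLORS = list(range(1, 7)) + list(range(9, 15))


def pick_column_colors(n_columns: int, rng: Optional[random.Random] = None) -> list[int]:
    """Pick one palette index per column, using only the safe colors."""
    rng = rng or random.Random()
    if n_columns <= len(SAFE_COLORS):
        # Enough safe colors to give every column a distinct one.
        return rng.sample(SAFE_COLORS, n_columns)
    # More columns than safe colors: allow repeats.
    return [rng.choice(SAFE_COLORS) for _ in range(n_columns)]


def colorize(text: str, color: int) -> str:
    """Wrap text in an ANSI 256-color foreground escape sequence."""
    return f"\033[38;5;{color}m{text}\033[0m"
```

Colors are still random from run to run, but none of them can coincide with a pure black or white terminal background.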

Version that produced the bug

e90f472

[JOSS] paper bibliography

The way you cite urls

howpublished = "\url{https://vsoch.github.io/citelang/getting_started/user-guide.html#github-action}",

looks correct according to a number of LaTeX FAQs. However, the output isn't generating href links, which makes it difficult for readers to visit the cited url.

This may have to be fixed in the JOSS template, but for now I suggest using a simpler url citation scheme to enable href linking.

This issue is part of openjournals/joss-reviews#4352

Lower than expected counts from citelang contrib

Describe the bug

I've been using citelang to analyze contributions for https://github.com/sylabs/singularity between v3.9.0 and 3.10.0 and am seeing some unexpected counts. The 'count' for a contributor is lower than I'd expect given the number of lines changed by their commits.

If I understand correctly, citelang is intending to count authors' contributions on a git blame line by line basis through each commit in the specified range. I haven't had a chance to investigate the underlying cause yet, but I have been able to isolate a smaller example on the singularity repository looking at a small range of v3.9.1...v3.9.2 with a contribution from "Richard Hattersley"...

citelang contrib --start v3.9.1 --end v3.9.2 --filters ~/filters.yaml
Found 18 commits.
                                       ┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┓
                                       ┃ Name               ┃ Count ┃
                                       ┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━┩
                                       │ David Trudgian     │ 264   │
                                       │ Dave Trudgian      │ 2     │
                                       │ Richard Hattersley │ 1     │
                                       └────────────────────┴───────┘

Richard's commit in this range is sylabs/singularity@b871d84 and there are 4 lines changed here in 3 files.

The --detail report shows citelang is only counting the change in CHANGELOG.md, and not the other files:

citelang contrib --start v3.9.1 --end v3.9.2 --filters ~/filters.yaml --detail
Found 18 commits.
Loading cached result /Users/dtrudg/Git_sylabs/singularity-analysis/.contrib/v3.9.1-v3.9.2.json
                             ┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
                             ┃ Name               ┃ Paths                    ┃
                             ┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
                             │ David Trudgian     │ debian/copyright: 63     │
                             │                    │ e2e/plugin/plugin.go: 19 │
                             │                    │ debian/s...              │
                             │ Dave Trudgian      │ CHANGELOG.md: 2          │
                             │ Richard Hattersley │ CHANGELOG.md: 1          │
                             └────────────────────┴──────────────────────────┘

To Reproduce

git clone git@github.com:sylabs/singularity singularity-analysis
cd singularity-analysis
citelang contrib --start v3.9.1 --end v3.9.2 --filters ~/filters.yaml --detail

My ~/filters.yaml is:

ignore_files:
  - go.mod
  - go.sum

Expected behavior

The count for Richard Hattersley is 4... matching the 4 lines that git blame attributes in the files modified in the commit in question:

git blame CHANGELOG.md | grep Hattersley
b871d846ca (Richard Hattersley   2021-11-25 09:29:02 +0000   12) - Correct documentation for sign command r.e. source of key index.

git blame CONTRIBUTORS.md | grep Hattersley
b871d846ca CONTRIBUTORS.md (Richard Hattersley   2021-11-25 09:29:02 +0000  88) - Richard Hattersley 

git blame cmd/internal/cli/sign.go | grep Hattersley
b871d846ca cmd/internal/cli/sign.go           (Richard Hattersley 2021-11-25 09:29:02 +0000  20) 	privKey int // -k encryption key (index from 'key list --secret') specification
b871d846ca cmd/internal/cli/sign.go           (Richard Hattersley 2021-11-25 09:29:02 +0000  71) 	Usage:        "private key to use (index from 'key list --secret')"
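
One way to cross-check the expected count is sketched below, under stated assumptions: the helper tallies lines per author from the text printed by `git blame --line-porcelain <file>`, optionally restricted to one commit. The function name `count_blame_lines` is hypothetical, and this is not how citelang itself computes counts.

```python
import re
from collections import Counter

# Each blamed line starts with a header: "<40-hex sha> <orig line> <final line> ..."
HEADER = re.compile(r"^([0-9a-f]{40}) \d+ \d+")


def count_blame_lines(porcelain: str, commit_prefix: str = "") -> Counter:
    """Count blamed lines per author in `git blame --line-porcelain` output.

    If commit_prefix is given, only lines attributed to commits whose
    hash starts with that prefix are counted.
    """
    counts: Counter = Counter()
    current_sha = None
    for line in porcelain.splitlines():
        if line.startswith("\t"):
            continue  # file content lines are tab-prefixed; skip them
        match = HEADER.match(line)
        if match:
            current_sha = match.group(1)
        elif line.startswith("author ") and current_sha is not None:
            if current_sha.startswith(commit_prefix):
                counts[line[len("author "):]] += 1
    return counts
```

Feeding it the porcelain blame output of each of the three files above with `commit_prefix="b871d846ca"` and summing the results should give the per-author total to compare against the citelang count.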

Version that produced the bug

0.0.28

Create separate contribution UI

We should be able to use the action here (under development) to then:

  1. run on releases
  2. generate the contribution data
  3. update some interface to show contributors
