Comments (19)
Creating and curating catalogs for software tools that aid sustainability (perhaps categorized by domain, programming languages, architectures, and/or functions, e.g., for code testing, documentation)
from meetings.
I would suggest adding to https://elixir-registry.cbs.dtu.dk
We have recently started the first-ever official catalog of software at LBNL, from the Computational Research Division. This was the primary initial task of the Software Engineering Management Committee (SEMC), which I led. See: http://crd.lbl.gov/software/
There is an ongoing effort to provide internal catalog functions as well, such as download tracking and the ability to automatically generate reports for the Innovation Partnerships Office (IPO). Software catalogs are intimately connected to issues of open-source and licensing.
I suppose one question would be whether there should be a single catalog, or different catalogs per domain. The ELIXIR registry seems fairly specific to life sciences, although it could certainly serve as a good example for a broader catalog. Similarly, the LBNL CRD software catalog seems specific to software developed at LBNL, but could also serve as a good example/starting point.
However, to propose an answer to my own question, I think a catalog for tools that aid sustainability should probably be multi/cross-disciplinary, since many of the tools would themselves be multi/cross-disciplinary. So the two existing catalogs mentioned here would likely not be sufficient unless it were possible to extend them significantly. I think a better option may be to use them as a base on which to build.
It is less obvious who would support and maintain a single catalog, but even if there were one, CRD and, I presume, ELIXIR would like to preserve their domain- or organization-specific views of it. We are actually creating a simple Django-based backend for the internal functionality I mentioned, which would probably be a good starting point (or at least something to look at) for something more general. I can try to make that relatively clean and open-source it on GitHub before I come.
I would like to see some information in a catalog that would help filter the entries, so that you could easily see new, popular, etc. software. This would reduce the support burden for a catalog, since out-of-date entries would naturally not be seen.
@dangunter: that sounds like a great starting point!
@gridrebel: I think a release or last updated date could certainly be one element of a catalog, in addition to labels based on functionality, relevant disciplines, etc.
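To make the filtering idea above a bit more concrete, here is a minimal Python sketch of catalog entries carrying a last-updated date, a download count, and labels. Everything in it (the `Entry` class, its field names, and the `recent_entries` and `popular_entries` helpers) is hypothetical and only illustrates the idea; it is not taken from any existing catalog.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Entry:
    """One catalog record (hypothetical fields, for illustration only)."""
    name: str
    last_updated: date
    downloads: int = 0
    labels: frozenset = frozenset()


def recent_entries(entries, today, max_age_days=365, label=None):
    """Return entries updated within max_age_days, newest first.

    If label is given, only entries carrying that label are kept.
    """
    kept = [
        e for e in entries
        if (today - e.last_updated).days <= max_age_days
        and (label is None or label in e.labels)
    ]
    return sorted(kept, key=lambda e: e.last_updated, reverse=True)


def popular_entries(entries, top_n=10):
    """Return the top_n entries by download count, most downloaded first."""
    return sorted(entries, key=lambda e: e.downloads, reverse=True)[:top_n]


if __name__ == "__main__":
    # Made-up sample data.
    catalog = [
        Entry("tool-a", date(2015, 9, 1), 50, frozenset({"testing"})),
        Entry("tool-b", date(2012, 1, 1), 900, frozenset({"testing"})),
        Entry("tool-c", date(2015, 6, 1), 10, frozenset({"documentation"})),
    ]
    today = date(2015, 9, 30)
    print([e.name for e in recent_entries(catalog, today)])
    print([e.name for e in popular_entries(catalog, top_n=2)])
```

With a default view like `recent_entries`, stale entries drop out of sight without anyone having to delete them, which is the reduced support burden described above.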
I wonder if the fact that general web catalogs have mostly disappeared in favor of search engines is relevant.
I think it is; keeping a catalog up to date is very labor intensive.
Apache has projects provide a DOAP description file which they then crawl;
funders could require projects to provide that as part of the annual
reporting process, then crawl it.
http://oss-watch.ac.uk/resources/doap
As it is, projects are required to include the grant number on any
publication (including websites, right?) so one might be able to leverage
that (some set of pages to be crawled?)
Another option is to think about what would incentivize projects to keep
information updated; only thing I can think of is a website for registering
your preferred citation for users of the software.
--James
On Sat, Sep 12, 2015 at 6:04 AM, Daniel S. Katz wrote:
I wonder if the fact that general web catalogs have mostly disappeared in
favor of search engines is relevant.
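As a rough illustration of the crawl-the-DOAP-file idea, here is a minimal Python sketch that reads one DOAP description (RDF/XML) and pulls out a few fields a catalog might index. The sample document, the project in it, and the `parse_doap` helper are made up for this sketch. It assumes the plain RDF/XML layout with a single `Project` element and does not handle other RDF serializations. A real crawler would also need to fetch the files and cope with malformed input.

```python
import xml.etree.ElementTree as ET

# Namespaces used by DOAP files in RDF/XML form.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "doap": "http://usefulinc.com/ns/doap#",
}

# A made-up DOAP file; a crawler would download this from the project.
SAMPLE_DOAP = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://usefulinc.com/ns/doap#">
  <Project>
    <name>Example Solver</name>
    <shortdesc>A made-up package used only for this sketch.</shortdesc>
    <homepage rdf:resource="https://example.org/solver"/>
    <release>
      <Version>
        <revision>1.1.0</revision>
        <created>2014-03-15</created>
      </Version>
    </release>
    <release>
      <Version>
        <revision>1.2.0</revision>
        <created>2015-09-01</created>
      </Version>
    </release>
  </Project>
</rdf:RDF>
"""


def parse_doap(xml_text):
    """Extract name, description, homepage, and latest release from DOAP."""
    root = ET.fromstring(xml_text)
    project = root.find("doap:Project", NS)
    if project is None:
        raise ValueError("no doap:Project element found")

    # The homepage URL is stored in an rdf:resource attribute.
    homepage = project.find("doap:homepage", NS)
    resource_attr = "{%s}resource" % NS["rdf"]
    homepage_url = homepage.get(resource_attr, "") if homepage is not None else ""

    # Collect (created, revision) pairs; ISO dates sort correctly as strings.
    releases = [
        (v.findtext("doap:created", default="", namespaces=NS),
         v.findtext("doap:revision", default="", namespaces=NS))
        for v in project.findall("doap:release/doap:Version", NS)
    ]
    latest_created, latest_revision = max(releases) if releases else ("", "")

    return {
        "name": project.findtext("doap:name", default="", namespaces=NS),
        "shortdesc": project.findtext("doap:shortdesc", default="", namespaces=NS),
        "homepage": homepage_url,
        "latest_revision": latest_revision,
        "latest_created": latest_created,
    }


if __name__ == "__main__":
    print(parse_doap(SAMPLE_DOAP))
```

The point of the sketch is that the project maintains the file, and the catalog only has to read it, so the upkeep shifts from the catalog maintainers to the projects.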
Yeah, @danielskatz makes a good point... and keeping a catalog maintained would not only be labor-intensive but also potentially difficult, since it would require finding the software in the first place!
I like the DOAP suggestion, as it takes much of the work off the catalog and puts it onto the projects themselves. Then, they would just need to submit an appropriate link to the catalog maintainer (or the whole thing could be automated).
I agree that a preferred citation is one strong incentive, although I imagine that potentially having access to a larger user base (and thus more people to cite the work) would be one as well.
However, it just occurred to me that for this discussion I've been thinking more about a catalog for sustainable software itself, rather than tools that would support the development of such...
Although catalogs as a way of indexing "everything" have disappeared, catalogs are still alive and well. Amazon, iTunes, the App Store, etc. etc. This is, to me, the appropriate analogy for a software catalog -- "we" are showing our wares (with an "s"!). This is not new -- groups and labs commonly have webpages for such things, and the LBNL/CRD attempt is merely an attempt to broaden and standardize this effort slightly. What is a little different is the realization that we could leverage this catalog for the purposes of aiding with our parochial needs for tracking and reporting downloads (usage), and also gaining a sense of the adoption of software engineering practices across the entire portfolio.
I edit the Astrophysics Source Code Library (ASCL), and in my lightning talk at WSSSPE, am extending an offer of our infrastructure to anyone who wants to use it to build their own software registry/repository, meaning we will give folks a clone of our infrastructure, and then they can change it however they'd like to suit their discipline/needs. The ASCL is built with open source tools that have really large userbases, so it's not hard to find people with skills using these tools, or to develop them. (The ASCL is a completely volunteer effort so this has been important to us.) We'll even host your site if you'd like at no cost!
You can see a mostly-emptied-out-of-our-stuff clone here. In addition to the functionality you can see as a user, there are administrator tools that let you stage a new entry before publishing it, assign a unique identifier to it, edit existing entries, etc., and some simple reports that are essentially ways of getting info out of the database in different formats. One of the ASCL's reports, for example, allows the main indexing service for astrophysics (Astrophysics Data System, or ADS) to pull updates in their preferred format.
If you want more info, please let me know; thanks!
There is a High Performance Math Software Catalog at http://wotug.org/parallel/nhse/rib/repositories/hpc-netlib/catalog/
It was, however, generated back in 1999.
A 2015-16 NASA software catalog (2nd edition) is available as PDF at https://software.nasa.gov
NASA started publishing its software catalog only in 2014.
Some universities publish catalogs of the software they provide to faculty/staff/students, for example, https://it.stonybrook.edu/services/software-catalog/
Having been involved in the creation and running of three "general" software catalogues, I can confirm that they are a lot of work to keep updated (we had 2 FTE and it wasn't quite enough). A federated approach spreads the burden, and we did try using DOAP for a while but at the time (six years ago), the tooling wasn't good enough for the end users.
One of the benefits of e.g. the iTunes store / Google Play, is that the developers provide their own information using a set template.
I've looked at previous efforts to establish a repository/registry in astronomy, and one of the problems was too much metadata. Everyone loves to have it, but it's hard to keep a lot of metadata up-to-date, and that -- inability to keep the metadata current -- sank more than one of these other efforts. That's why the ASCL is so light. We have no full-time staff, just two part-time volunteers creating entries and vetting submissions. There are lots of things the ASCL can't do, but one of the things it has been able to do is survive!!
Note this NIH workshop report on their efforts to build a software discovery index: http://softwarediscoveryindex.org/report/
Also note the metadata harmonization effort in https://www.mozillascience.org/code-as-as-research-object-new-phase
I have accidentally stumbled into a math software service this afternoon and I like it:
http://www.swmath.org
sctchoi, wow! That's fabulous; thank you!
I think that maintenance is one of the biggest challenges in running such catalogs. To run it efficiently, one has to actively engage with developers and encourage them to keep information up-to-date. The value of swMATH is that it's not a catalogue of the software (like e.g. a related ORMS catalogue), but it's a database of citations of mathematical software, drawing information from https://zbmath.org/.
swMATH allows users to submit new software packages at http://www.swmath.org/contribute/main and to suggest updates to existing entries. For example, I updated the version number for GAP in the summer, and my update was processed relatively fast. On the other hand, swMATH does not track citations of a particular version of the software, but that's a more global problem, since the versioning information in the citation may not be accurate in the first place.
My suggestion is to recommend to other bibliographical databases, for example, MathSciNet, to treat software citations in a similar manner.