Migrate DwC code (dwc issue #16, CLOSED)

tdwg commented on July 25, 2024

Migrate DwC code

from dwc.

Comments (21)

tucotuco commented on July 25, 2024

Let's do this as soon as I make the current release. I am about one day
away from this if the Executive can decide on the dcterms:license issue.

On Wed, Nov 5, 2014 at 5:28 PM, Peter Desmet wrote:

Migrate Darwin Core code to GitHub

1. Migrate SVN repo to git (locally). There should be tools for this.
2. Push all code to GitHub
3. Create release (as an archive)
4. Clean up code to remove irrelevant elements.
5. Document this process (will be useful for other migrations, e.g. GBIF repos)

mdoering commented on July 25, 2024

In the 4th step we should get rid of the following directories in the current SVN trunk (https://code.google.com/p/darwincore/source/browse/trunk/), as they only serve archival purposes, which can be better achieved by using tags:

- archive
- all dated folders
  - 2009-02-12
  - 2009-02-20
  - 2009-04-29
  - 2009-05-25
  - 2009-07-06
  - 2009-09-23
  - 2009-10-08
  - 2009-12-07
  - 2011-10-26
  - 2013-10-22

What about downloads/old?

tucotuco commented on July 25, 2024

We can store the contents of those folders as zipped archives such as DarwinCoreStandard-2013-10-22.zip.
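
A minimal sketch of how the dated folders could be zipped, assuming a local checkout of the SVN trunk. The function name `archive_dated_folders`, the `out_dir` argument, and the idea of matching folders by a `YYYY-MM-DD` pattern are illustrative assumptions, not something the thread specifies; only the `DarwinCoreStandard-<date>.zip` naming comes from the comment above.

```python
from __future__ import annotations

import re
import shutil
from pathlib import Path

# Assumption: release folders are named exactly YYYY-MM-DD, as in the list above.
DATED_FOLDER = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def archive_dated_folders(
    trunk: Path, out_dir: Path, prefix: str = "DarwinCoreStandard"
) -> list[Path]:
    """Zip each dated folder under `trunk` into `out_dir`; return the archive paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    dated = sorted(
        p for p in trunk.iterdir() if p.is_dir() and DATED_FOLDER.match(p.name)
    )
    for folder in dated:
        base = out_dir / f"{prefix}-{folder.name}"
        # shutil.make_archive appends ".zip" itself and returns the final path.
        archives.append(Path(shutil.make_archive(str(base), "zip", root_dir=folder)))
    return archives
```

Non-dated folders such as archive are skipped, so they would still need to be handled separately.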

The files in the downloads/old folder are those that are available on the Google Code site downloads at https://code.google.com/p/darwincore/downloads/list. We want the files as supporting documents, but we don't need that folder structure.

peterdesmet commented on July 25, 2024

@mdoering, @tucotuco, can you create separate issues for what needs to be done with the old files (tagging, releasing, or zipping)?

The cleanest option would be to tag those in the git commit history and create releases (which are zip files).

peterdesmet commented on July 25, 2024

For cleanup stuff, I created #19.

tucotuco commented on July 25, 2024

Working on the release in the branch version/2014-11-08 https://github.com/tdwg/dwc/tree/version/2014-11-08

tucotuco commented on July 25, 2024

Moved the SVN repository in #22. Not much to document in this process: I had the SVN checkout locally, cloned this repo, copied what I wanted to keep into the structure I wanted in a new branch of this repo, pushed the branch, and created a pull request (#22). Now on to issue #19.

peterdesmet commented on July 25, 2024

See my comment in #22. By copying, we lose the SVN history and cannot tag older versions of the standard.

tucotuco commented on July 25, 2024

We only need to tag the zip files. They contain everything for that version.

peterdesmet commented on July 25, 2024

Releases in GitHub work differently. You tag a certain point in your commit timeline and that becomes a release. That means that releases are serial and do not exist as parallel versions in your code. So, to do this cleanly, we would need the complete SVN history.

We could also take a less clean approach and create several releases for the same point in the commit history, each with a different binary zip file attached.

mdoering commented on July 25, 2024

Does the SVN have a single archive file for the standard that changes over time?
Is it this one?
https://code.google.com/p/darwincore/source/list?path=/trunk/archive/darwincore.zip&start=1680

If SVN has the versions in parallel (folders named by release date) in the trunk we should not really tag them all. I fear the cleanest would be to replay the versions in git based on the new structure we give to the repo/files?

tucotuco commented on July 25, 2024

We need to have a repository of the past versions of the standard that people can get to easily, for historical purposes, without having to go through revision histories. To do that, I think we should keep making the darwincore.zip files of releases and accumulating those, as always. We do not need the dated folders in which those versions are unzipped, nor do we need to keep the whole history of SVN changes. To me, the snapshot from 2014-11-08 is our GitHub starting point and all other relevant history is in the accumulated zip files in the versions directory. From here on out we can tag releases, but I would also insist on creating a zip file of the standard at that point to put in the versions directory, with the latest on the TDWG page for the standard (http://www.tdwg.org/standards/450/) for downloading from there.

peterdesmet commented on July 25, 2024

@tucotuco, I think we more or less agree on how to do it in the future. Each time the standard is at a stable release, you tag and release it through GitHub. This will automatically create a zip file (for example: https://github.com/tdwg/prior-standards/releases/tag/website-archive) that can be referenced. I have added a webhook to this repository so we can even have DOIs for those. There is no need to keep those zips as version-named files IN the repository.

The main thing to decide is how to handle previous releases. I see three options:

1. Make sure we have the complete SVN history in this repository. We then manually tag the historical releases (based on date), so they become available through GitHub here: https://github.com/tdwg/dwc/releases. This is more work, but historical and recent releases are handled equally.
2. We cheat and replay the version history manually, by adding and committing the different versions one by one (I think this is what @mdoering proposes). We then tag those commits and we have releases. This has the advantage that historical and recent releases are handled equally, and we don't need to import the whole SVN history. All of this can be tested on a branch.
3. We tag the current version as 2014-11-08, which includes named version files for previous releases. We indicate in the release description that historical releases can be found in a folder. We then remove the folder and use the standard method for creating releases. The disadvantage is that historical and recent releases are not treated equally.

mdoering commented on July 25, 2024

That summarises the options pretty well, Peter. I would be in favor of option 1 or 2. But like I said, I fear that we would create huge accumulating archives by strictly following SVN. This is because:

- the entire standard is redundantly included as a zip archive
- old versions are also part of the trunk

For that reason alone I lean towards option 2. Or maybe there is a solution in between: import the complete SVN, then do the cleanup, and finally replay the proper dwc versions in rdf?

timrobertson100 commented on July 25, 2024

For my understanding: does option 2 mean:

1. svn co ...trunk/"release date"
2. git commit, push, release
3. repeat until all historical releases are done

If so, I'd vote +1 on that.

peterdesmet commented on July 25, 2024

I also prefer option 1 or 2. @timrobertson100, if I understand correctly, option 2 is, more verbosely:

1. Copy all files from trunk locally to somewhere else
2. Empty trunk
3. Populate trunk with all files from first release (https://code.google.com/p/darwincore/source/browse/#svn%2Ftrunk%2F2009-02-12)
4. Commit
5. Push
6. Release (= tag a commit)
7. Start over from step 2, but this time upload files from the next version, until we are at the last version.

@tucotuco, would that be fine with you as well?

@mdoering, if so, is this something you can do? Maybe on a historical-releases branch to practice. I think you can actually do steps 5 and 6 later, since tags (and I assume releases) can be added retroactively: http://git-scm.com/book/en/v2/Git-Basics-Tagging
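
The replay loop above could be written down as a command plan before anyone runs it. This is only a sketch under assumptions: the function name `replay_plan`, the use of the release date as the tag name, and the commit messages are all hypothetical. It only builds the list of commands and does not execute anything; the step that swaps the files in the working tree is left as a comment line because it is manual.

```python
from __future__ import annotations

import shlex


def replay_plan(
    release_dates: list[str], branch: str = "historical-releases"
) -> list[str]:
    """Return the shell commands for replaying releases, oldest first."""
    commands = [f"git checkout -b {shlex.quote(branch)}"]
    # ISO dates (YYYY-MM-DD) sort chronologically when sorted as strings.
    for date in sorted(release_dates):
        commands += [
            f"# replace the working tree with the files of release {date}",
            "git add --all",
            f"git commit -m {shlex.quote('Darwin Core release ' + date)}",
            # Assumption: the annotated tag is simply named after the date.
            f"git tag -a {shlex.quote(date)} -m {shlex.quote('Release ' + date)}",
        ]
    commands.append(f"git push origin {shlex.quote(branch)} --tags")
    return commands


if __name__ == "__main__":
    for line in replay_plan(["2009-02-20", "2009-02-12"]):
        print(line)
```

Here each commit is tagged right away and everything is pushed once at the end. Since tags can also be added to older commits afterwards, the tagging lines could be dropped from the loop and run later against the recorded commit hashes.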

tucotuco commented on July 25, 2024

I'm not convinced yet of the utility of capturing the commit history aside from completeness and consistency. Not a bad thing to have, but with what effort?

If it was just to capture the diffs between the contents of the dated folders (as opposed to the commit history), then I would recommend a slight variation.

1. Make a branch off of master
2. Copy all files from the branch locally to somewhere else
3. Empty the branch
4. Unzip the archive for the first version (darwincore-2009-02-12.zip) into the root of the branch
5. Commit
6. Push
7. Release (= tag a commit)
8. Start over from step 3, but this time unzipping files from the next version (e.g. darwincore-2009-02-20.zip), until we are at the most recent version.

This is tractable and I wouldn't mind this.
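
The "empty the branch, then unzip the next archive" part of that loop is the easiest piece to get wrong by hand, so here is a rough sketch of it. The function name `replace_tree_with_archive` and the `keep` parameter are hypothetical, and it assumes the branch is checked out in a local working directory whose `.git` folder must survive. Committing, pushing, and tagging would still be done with git afterwards.

```python
from __future__ import annotations

import shutil
import zipfile
from pathlib import Path


def replace_tree_with_archive(
    worktree: Path, archive: Path, keep: tuple[str, ...] = (".git",)
) -> None:
    """Empty `worktree` (except the names in `keep`) and unzip `archive` into its root."""
    for entry in worktree.iterdir():
        if entry.name in keep:
            continue  # never delete the repository metadata
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    # Only use this with archives you trust, such as the project's own release zips.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(worktree)
```

Called once per archive in date order, for example `replace_tree_with_archive(Path("dwc"), Path("darwincore-2009-02-12.zip"))`, where the local directory name `dwc` is an assumption. Files that were removed between versions then show up as deletions in the next commit.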

tucotuco commented on July 25, 2024

Testing on new branch version/history starting with legacy pre-standard Darwin Core.

peterdesmet commented on July 25, 2024

@tucotuco, great! I already started a pull request, so it's easy for us to follow along. Don't forget to bring back the recent files (README, CONTRIBUTING, LICENSE) after you have committed the last release.

mdoering commented on July 25, 2024

Using the zip files sounds like a great idea, @tucotuco

tucotuco commented on July 25, 2024

Finished with the migration of the standard in all of its releases.
