git-migration's People

Contributors

akgrant, cdlm, peteruhnak

git-migration's Issues

git-fast-import has wrong parents sometimes

  • Omitting from/merge takes the current commit as the parent, which may not always be desired.
  • Commits are sometimes automatically merged for no apparent reason, or I am not clear about how git-fast-import treats from and merge.
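
For reference, the git fast-import documentation describes `from` as naming the first parent of a commit and each `merge` line as naming an additional parent; when `from` is omitted on an existing branch, the current tip of that branch is assumed. The following is a minimal shell sketch that always writes the parents explicitly. The helper name `emit_commit`, the marks, the branch name, and the committer identity are hypothetical placeholders and not part of git-migration.

```shell
#!/bin/sh
# Hypothetical helper: print one fast-import commit stanza with explicit parents.
# Usage: emit_commit <mark> <message> <first-parent-mark or ""> [merge-parent-marks...]
emit_commit() {
    mark=$1; msg=$2; from=$3
    shift 3
    printf 'commit refs/heads/master\n'
    printf 'mark :%s\n' "$mark"
    # Placeholder identity and timestamp, for illustration only.
    printf 'committer Example Author <author@example.invalid> 1408708253 +0000\n'
    # "data <n>" declares the length of the message in bytes.
    printf 'data %s\n' "$(printf '%s' "$msg" | wc -c | tr -d ' ')"
    printf '%s\n' "$msg"
    # Naming the first parent avoids the implicit "current branch tip" default.
    if [ -n "$from" ]; then printf 'from :%s\n' "$from"; fi
    # Each remaining argument becomes an additional (merge) parent.
    for m in "$@"; do printf 'merge :%s\n' "$m"; done
    printf '\n'
}
```

For instance, `emit_commit 3 'merge B and C' 1 2` prints a stanza whose parents are mark :1 (first parent) and mark :2 (merged parent).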

Opened file limit is being hit on Mac

"1284 mczs and then a file primitive failed. It was due to the limit of 256 open files for a Mac process."
Temporary workaround: sudo launchctl limit maxfiles 65536

There is no reason why the file limit should be hit, since files are read one by one, so perhaps the streams are not being closed properly.

Empty commit messages

In Monticello there can be MCZs with empty commit messages (at least I found some in the GT repo). They are currently imported into git as commits with empty commit messages. This causes issues with some git operations such as rebase, which fails if a commit has an empty message. It can be fixed with something like the following on the git side:

git filter-branch -f --msg-filter '
# Read the whole message from stdin; a plain "read" would keep only its first line.
msg=$(cat)
if [ -n "$msg" ] ; then
    printf "%s\n" "$msg"
else
    echo "The commit message was empty"
fi'

But it could be better to have a default template for missing commit messages.

Deleted method can be preserved

Repository:

MCSmalltalkhubRepository
	owner: 'PavelKrivanek'
	project: 'Tuppu'
	user: ''
	password: ''

Generation of fast import file:

migration := GitMigration on: 'PavelKrivanek/Tuppu'.
migration cacheAllVersions.
migration allAuthors.
migration authors: {'PavelKrivanek' -> #('Pavel Krivanek' '<[email protected]>')}.
migration
	fastImportCodeToDirectory: 'src'
	initialCommit: '5e53cc6'
	to: 'import-tuppu.txt'

fast import:

# git fast-import < ../import-tuppu.txt
git-fast-import statistics:
---------------------------------------------------------------------
Alloc'd objects:       5000
Total objects:          278 (       302 duplicates                  )
      blobs  :          197 (       300 duplicates         36 deltas of        191 attempts)
      trees  :           74 (         2 duplicates         34 deltas of         74 attempts)
      commits:            7 (         0 duplicates          0 deltas of          0 attempts)
      tags   :            0 (         0 duplicates          0 deltas of          0 attempts)
Total branches:           1 (         1 loads     )
      marks:           1024 (         7 unique    )
      atoms:            113
Memory total:          2294 KiB
       pools:          2098 KiB
     objects:           195 KiB
---------------------------------------------------------------------
pack_report: getpagesize()            =       4096
pack_report: core.packedGitWindowSize =   33554432
pack_report: core.packedGitLimit      =  268435456
pack_report: pack_used_ctr            =          8
pack_report: pack_mmap_calls          =          1
pack_report: pack_open_windows        =          1 /          1
pack_report: pack_mapped              =      32607 /      32607
---------------------------------------------------------------------

Then load both packages from the original repository and try to merge them with the sources in the Git-migrated repository. Several methods show up as added:

Tuppu>>#mutex
Tuppu>>#mutex:
Tuppu>>#open
TuppuRepository>>#fileName
...and many others

#isClassDefinition not found in OrderedCollection

I get this exception when trying to move the Cryptography repository to git. It is not the only repository that triggers this error, just an example.

[Screenshot: 2018-09-25 at 10:23:56]

It happens when executing the last step.

migration fastImportCodeToDirectory: 'Cryptography' initialCommit: '785682d219b2dfef1320ee6657211c74fb15ebf0' to: 'Cryptography.txt'.

My entire script is:
migration := GitMigration on: 'http://www.squeaksource.com/Cryptography'.

migration downloadAllVersions.
migration populateCaches.

migration allAuthors.

newAuthors := OrderedCollection new.
migration allAuthors do: [ :each |
	| email |
	email := '<', each, '>'.
	newAuthors add: each -> {each . email} ].
newAuthors.
newAuthors add: ('GeorgeGanea' -> #('George Ganea' '[email protected]')).
migration authors: newAuthors.

migration
	fastImportCodeToDirectory: 'Cryptography'
	initialCommit: '785682d219b2dfef1320ee6657211c74fb15ebf0'
	to: 'Cryptography.txt'.

This happens in
Pharo 7.0
Build information: Pharo-7.0+alpha.build.1261.sha.9ed1473c3fb9c3853ee730e406bfb012c9fa8297 (32 Bit)
Is there anything else I could do to help with this issue?

Bad utf8-encoded commit messages corrupt the import file

When converting the (legacy) Seaside30 Smalltalkhub repository to github, I encountered import errors which I traced to an incorrect declared byte size of the data command that carries the commit message text.

I also discovered that the class GitMigrationCommitInfo already has code to clean a commit message in inlineDataFor:, but that it was unused. Changing the writeCommitPreambleFor: method so that it calls GitMigrationCommitInfo>>inlineDataFor: seems to fix the problem adequately.

I will prepare a PR after I check the result of migrating the Seaside30 repository with those changes.
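
As background, the fast-import `data <count>` command declares the length of the following text in bytes, so a count taken in characters is too small as soon as the message contains multi-byte UTF-8 characters. Whether that is the exact cause in this report is an assumption. A minimal shell sketch of counting bytes follows; the helper name `emit_data` is hypothetical and not part of git-migration.

```shell
#!/bin/sh
# Hypothetical helper: emit a fast-import "data" command for a message,
# declaring its length in bytes rather than in characters.
emit_data() {
    # wc -c counts bytes; tr strips the padding some wc implementations add.
    nbytes=$(printf '%s' "$1" | wc -c | tr -d ' ')
    printf 'data %s\n%s\n' "$nbytes" "$1"
}
```

For example, the message "café" has four characters but five bytes in UTF-8, so its stanza must start with `data 5`.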

Not porting branch history makes diffs useless

Hi, I understand that it is difficult to port branch history across several packages: information is missing as to whether package_P.branch_2 relates to package_Q.branch_4...

Since most MC branches are nameless, there is no hint for resolving such a dilemma.

But I'm convinced that we can do better, and we should do better.

Simple cases first: for a single package, it's perfectly doable (see for example https://github.com/hpi-swa/Squot).
Example of simple MC branches with versions A B C D:

    *-B---*
   /       \
A-*         *-D
   \       /
    *---C-*

will appear as:

A---B---C---D

When we look at what C was good for, we see the diff between the parallel branches B and C, that is, the union of all diffs between A and B (B-A) and all diffs between A and C (C-A). This is exactly the MC reparent option that was proposed recently in Squeak, and which is rather dangerous for these reasons.

If (B-A) is big (it might in fact span several commits), then we completely lose the point of C, and history becomes useless. That's why we should do better if we can. It occurs quite often in VMMaker for example, see OpenSmalltalk/opensmalltalk-vm#305

In the case of a single package P, I would say reproduce P's topology.

For multi-package, I have some ideas (and heuristics), but it will be too long to discuss here.
I will try to think longer about details and gather them into another place.

For now, if we don't want to get into the complexity of multi-package parallel branches, the least we can do is rebase C on B (with automatic conflict resolution using ours, that is, newest wins). At least the diff will be relevant (modulo the conflicts, but conflicts are kind of relevant too, aren't they?).

This may not work so well when classes or methods are renamed in one of the branches, because MC does not track such operations, but in most cases it will lead to a more useful diff than we are getting now.

Also, the commit message should contain meta-information about which MC version is committed, and which special operation (rebase or reparent) took place during translation.
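
The single-package case described above can be sketched as a fast-import stream that keeps the diamond: B and C both name A in `from`, and D names B in `from` and C in `merge`. This is only an illustration under stated assumptions: the function name `emit_diamond`, the branch names, marks, timestamps, and committer identity are hypothetical placeholders, the commits carry no file changes, and nothing here reflects how git-migration currently writes its output.

```shell
#!/bin/sh
# Hypothetical sketch: print a fast-import stream for the diamond A -> (B, C) -> D.
# Each message is a single byte, hence "data 1".
emit_diamond() {
cat <<'EOF'
commit refs/heads/master
mark :1
committer Example Author <author@example.invalid> 1400000000 +0000
data 1
A

commit refs/heads/master
mark :2
committer Example Author <author@example.invalid> 1400000100 +0000
data 1
B
from :1

commit refs/heads/side
mark :3
committer Example Author <author@example.invalid> 1400000200 +0000
data 1
C
from :1

commit refs/heads/master
mark :4
committer Example Author <author@example.invalid> 1400000300 +0000
data 1
D
from :2
merge :3

EOF
}
```

Piping the output into `git fast-import` in an empty repository and then running `git log --graph --oneline master` should show D as a merge of B and C, so that the diff of C is computed against A rather than against B.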

Error in topologicallySort: when MCVersionInfo is incomplete

When migrating the GT repo (Moose/GToolkit), an "#hour was sent to nil" exception is raised in #topologicallySort: because an MCVersionInfo has a nil time attribute.

[Screenshot: hour was sent to nil]

I can bypass this by setting a time of 00:00, but I am not sure why the time is not extracted from the mcz. Can it be missing? The MCVersionInfo looks kind of strange.

Is there anything that could be done in the migration tool?

Commit with null byte when importing the history within git

When importing the history for Moose/GToolkit into git, there is a commit that has a null byte. Because of this it's not possible to push the changes.

Running git fsck shows:

Checking object directories: 100% (256/256), done.
warning in commit 5c31cc85a7b9e629a167998edfde8f4639b54670: nulInCommit: NUL byte in the commit object body
Checking objects: 100% (26884/26884), done.

When pushing:

Counting objects: 26884, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (10393/10393), done.
error: RPC failed; curl 55 SSL_write() returned SYSCALL, errno = 32
fatal: The remote end hung up unexpectedly
Writing objects: 100% (26884/26884), 14.59 MiB | 8.05 MiB/s, done.
Total 26884 (delta 16424), reused 26882 (delta 16422)
fatal: The remote end hung up unexpectedly
Everything up-to-date

The commit in question is:

commit 5c31cc85a7b9e629a167998edfde8f4639b54670
Author: Alexandre Bergel <[email protected]>
Date:   Fri Aug 22 11:50:53 2014 +0000

    Some tab to Class:

[Screenshot: 2018-10-04 at 19:16:24]
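
One possible mitigation, sketched here under the assumption that the NUL byte comes from the original Monticello commit message, is to drop NUL bytes from the message text before it is written to the import file. The filter name `strip_nul` is hypothetical, and where such a step would fit in git-migration is not specified in this report.

```shell
#!/bin/sh
# Hypothetical filter: copy stdin to stdout, dropping every NUL byte.
strip_nul() {
    tr -d '\000'
}
```

The filter reads the message on standard input, so it would have to run before the byte count for the `data` command is computed.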

Package Filter?

Is this currently possible? For example, I sometimes have an St repo with multiple tiny projects. When I'd like to port one to GH, I'd need to be able to specify a package filter (e.g. a block).

I hacked this behavior with:

GitMigration>>versionsByPackage
	| versionsByPackage all |
	versionsByPackage := Dictionary new.
	all := repository versionsWithPackageNames.
	(all select: [ :e | e first beginsWith: 'Val' ])
		do: [ :quad | 
			(versionsByPackage at: quad first ifAbsentPut: [ OrderedCollection new ])
				add: (self cachedVersions at: (quad last withoutSuffix: '.mcz')) ].
	^ versionsByPackage

Infinite loop when exporting in TonesWriter>>#splitMethodSource:

During the export of the Glamour history there seems to be an infinite loop when exporting the mcz Glamour-Examples-tg.36. After half an hour the export had made no progress. Interrupting the execution a few times always ends up in TonesWriter>>#splitMethodSource: for the method a MCMethodDefinition(GLMSTNamedModel>>#nameØY). The strange part is that the selector name is nameØY while the source code of the method is:

named: aString
	^ self named: aString environment: self defaultEnvironment

[Screenshot: user interrupt]

NotFound: [ :wc | cat beginsWith: wc packageName ] not found in Array

When exporting the package version Glamour-Scripting-tg.2 of Glamour, the error NotFound: [ :wc | cat beginsWith: wc packageName ] not found in Array is raised while writing the file.

[Screenshot: NotFound error]

The error happens when exporting the method GLMPresentation>>#display:. This method is present in the mcz and looks fine there. The issue might be related to the fact that it's an extension method.

!GLMPresentation methodsFor: '*glamour-scripting' stamp: ' 25/2/09 11:51'!
display: aBlock
	
	self transformation: aBlock! !
