
Comments (15)

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
The output that I see is different: I get an "ExceptionInInitializerError"
instead. After hacking the source and adding some debugging output, though, I
can confirm that the underlying cause is indeed the {javaJavaIdentifierPart}
category.

It should be possible to recreate the category manually without having to rely
on the JDK 5+ specific category.

Original comment by [email protected] on 28 Apr 2008 at 9:17

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
[deleted comment]

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
I see from SVN that line 30 of com.tonicsystems.jarjar.PackageRemapper is

    private static final Pattern ARRAY_FOR_NAME_PATTERN
        = Pattern.compile("\\[L[\\p{javaJavaIdentifierPart}\\.]+?;");

wouldn't the following accomplish the same?

    private static final Pattern ARRAY_FOR_NAME_PATTERN
        = Pattern.compile("\\[L[A-Za-z0-9$_.]+?;");

This might be overly simplistic.
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/Character.html#isJavaIdentifierPart(char)
states:

 A character may be part of a Java identifier if any of the following are true:

    * it is a letter
    * it is a currency symbol (such as '$')
    * it is a connecting punctuation character (such as '_')
    * it is a digit
    * it is a numeric letter (such as a Roman numeral character)
    * it is a combining mark
    * it is a non-spacing mark
    * isIdentifierIgnorable returns true for the character 

Additionally,
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/Character.html#isIdentifierIgnorable(char)
states:

 The following Unicode characters are ignorable in a Java identifier or a Unicode identifier:

    * ISO control characters that are not whitespace
          o '\u0000' through '\u0008'
          o '\u000E' through '\u001B'
          o '\u007F' through '\u009F' 
    * all characters that have the FORMAT general category value 
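
To see where an ASCII-only character class along the lines of the one proposed
above disagrees with Character.isJavaIdentifierPart, here is a minimal sketch.
The class name IdentifierPartComparison, the ASCII_PART pattern, and the sample
characters are assumptions chosen for illustration; none of this is jarjar code.

```java
import java.util.regex.Pattern;

// Illustrative only: compares a hand-written ASCII class with the JDK's
// own definition of "Java identifier part".
class IdentifierPartComparison {
    // Assumed ASCII-only approximation: letters, digits, '$' and '_'.
    private static final Pattern ASCII_PART = Pattern.compile("[A-Za-z0-9$_]");

    static boolean asciiOnly(char c) {
        return ASCII_PART.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        // 'a', '$', '7', e-acute, NUL (an ignorable control char), space
        char[] samples = { 'a', '$', '7', '\u00e9', '\u0000', ' ' };
        for (int i = 0; i < samples.length; i++) {
            char c = samples[i];
            System.out.println("U+" + Integer.toHexString(c)
                + " ascii=" + asciiOnly(c)
                + " jdk=" + Character.isJavaIdentifierPart(c));
        }
    }
}
```

The two agree on plain ASCII identifiers, but the JDK method also accepts
non-ASCII letters and the ignorable control characters listed above, so the
simplified class is stricter than the original category.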

Original comment by [email protected] on 28 Apr 2008 at 9:47

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
[deleted comment]

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
[deleted comment]

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
I can't speak for the developers but I suspect this approach won't fly for 
legal reasons.

Would it not be better to have a more permissive regex for
ARRAY_FOR_NAME_PATTERN, with a capture group for the javaJavaIdentifierPart
section, and then manually check that Character.isJavaIdentifierPart(c) returns
true for each char in the captured group?
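
A minimal sketch of that idea, assuming the pattern is applied to the whole
string. The class name ArrayForNameCheck, the PERMISSIVE pattern, and the
isArrayForName helper are hypothetical names for illustration, not existing
jarjar code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a permissive regex plus a per-character check,
// avoiding the \p{javaJavaIdentifierPart} category that Java 1.4 lacks.
class ArrayForNameCheck {
    // Permissive: captures everything between "[L" and the closing ';'.
    private static final Pattern PERMISSIVE = Pattern.compile("\\[L([^;]+);");

    static boolean isArrayForName(String value) {
        Matcher m = PERMISSIVE.matcher(value);
        if (!m.matches())
            return false;
        String name = m.group(1);
        // Validate the captured part by hand: dots or identifier parts only.
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c != '.' && !Character.isJavaIdentifierPart(c))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isArrayForName("[Ljava.lang.String;")); // true
        System.out.println(isArrayForName("[Ljava lang;"));        // false
    }
}
```

Only java.util.regex features that are available in Java 1.4 are used here, and
the character check delegates to the JDK rather than copying its tables.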


Original comment by [email protected] on 28 Apr 2008 at 11:32

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
You are probably right with regard to your legal assessment. It seems that the
Java source code may not be redistributed without the express written
permission of Sun. So I am deleting my comment that includes Sun's source.

Original comment by [email protected] on 29 Apr 2008 at 12:50

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
What about using Jakarta Regexp (http://jakarta.apache.org/regexp) instead of
the java.util.regex classes? Jakarta Regexp even includes an analog of the
javaJavaIdentifierPart character category (called javapart in Jakarta Regexp).
The license is compatible, too. And since the java.util.regex package is only
referenced in two classes in the jarjar project, initially integrating Jakarta
Regexp into jarjar should not be difficult.

Regression testing might be a chore, but at least we would end up with a
product that is truly compatible with Java 1.4.

Original comment by [email protected] on 29 Apr 2008 at 1:01

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
I retrofitted my jarjar workspace to use Jakarta Regexp instead of
java.util.regex, and the results are a success. Mostly.

There is a problem. The unit tests will only complete successfully under Java
1.5 because the classes in enumtest.jar were compiled with Java 1.5.
Consequently, if I try to run the test-enum target under Java 1.4, I get an
UnsupportedClassVersionError. The other unit tests do complete successfully.

Again, this solution works in practice for my Java 1.4 projects that use
jarjar. So here is how you can implement it yourself:

- Download jakarta-regexp-1.5.jar to the jarjar lib directory
(http://mirrors.sirium.net/pub/apache/jakarta/regexp/source/jakarta-regexp-1.5.zip).

- Apply the attached diffs to build.xml,
com.tonicsystems.jarjar.PackageRemapper.java, and
com.tonicsystems.jarjar.Wildcard.java.

- Run the ant jar target and see what happens.

Original comment by [email protected] on 29 Apr 2008 at 6:56

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
Here is the patch I referred to in comment 9.

Original comment by [email protected] on 29 Apr 2008 at 6:57

Attachments:

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
Here is the Wildcard.diff with comments added.

@@ -16,18 +16,19 @@

 package com.tonicsystems.jarjar;

-import java.util.regex.Matcher;
-import java.util.regex.Pattern;
 import java.util.ArrayList;
 import java.util.Arrays;

+import org.apache.regexp.RE;
+
 class Wildcard
 {
-    private static Pattern dstar = Pattern.compile("\\*\\*");
-    private static Pattern star  = Pattern.compile("\\*");
-    private static Pattern estar = Pattern.compile("\\+\\??\\)\\Z");
+    private static RE dstar = new RE("\\*\\*");
+    private static RE star  = new RE("\\*");
+    private static RE estar = new RE("\\+\\??\\)\\Z");
+    private static RE nstar = new RE("\\(\\.\\(\\[\\^\\/\\]\\+\\)\\?\\)");

-    private final Pattern pattern;
+    private final RE pattern;
     private final int count;
     private final ArrayList parts = new ArrayList(16); // kept for debugging
     private final String[] strings;
@@ -42,12 +43,28 @@
             throw new IllegalArgumentException("The sequence '***' is invalid in a package pattern");

         String regex = pattern;
-        regex = replaceAllLiteral(dstar, regex, "(.+?)");
+        // Changed to .*? from .+? because apparently java.util.regex is more
+       // permissive. There is a problem with this approach, though because
+       // when we apply the single star replacement, the star in .*? is
+       // changed.
+        regex = replaceAllLiteral(dstar, regex, "(.*?)");
         regex = replaceAllLiteral(star, regex, "([^/]+)");
         regex = replaceAllLiteral(estar, regex, "*)");
-        this.pattern = Pattern.compile("\\A" + regex + "\\Z");
-        this.count = this.pattern.matcher("foo").groupCount();
+        // And because the single star in the double star replacement was
+       // replaced, we need to change it back again.
+        regex = replaceAllLiteral(nstar, regex, "(.*?)");
+        this.pattern = new RE("^" + regex + "$");

+        // We match the raw pattern. Otherwise,
+       // org.apache.regexp.RE.getParenCount returns 0. Source diving indicates
+       // that this is because the paren count is only updated when
+       // org.apache.regexp.RE.match returns true.
+        this.pattern.match(pattern);
+        // We subtract one because, unlike java.util.regex.Matcher.groupCount, the
+       // org.apache.regexp.RE.getParenCount method includes the whole
+       // expression as a group.
+        this.count = this.pattern.getParenCount() - 1;
+        
         // TODO: check for illegal characters
         char[] chars = result.toCharArray();
         int max = 0;
@@ -94,15 +111,14 @@
     }

     public boolean matches(String value) {
-        return getMatcher(value) != null;
+        return pattern.match(value);
     }

     public String replace(String value) {
-        Matcher matcher = getMatcher(value);
-        if (matcher != null) {
+        if (pattern.match(value) && checkIdentifierChars(value, "/")) {
             StringBuffer sb = new StringBuffer();
             for (int i = 0; i < strings.length; i++)
-                sb.append((refs[i] >= 0) ? matcher.group(refs[i]) : strings[i]);
+                sb.append((refs[i] >= 0) ? pattern.getParen(refs[i]) : strings[i]);
             return sb.toString();
         }
         return null;
@@ -108,13 +124,6 @@
         return null;
     }

-    private Matcher getMatcher(String value) {
-        Matcher matcher = pattern.matcher(value);
-        if (matcher.matches() && checkIdentifierChars(value, "/"))
-            return matcher;
-        return null;
-    }
-
     private static boolean checkIdentifierChars(String expr, String extra) {
         for (int i = 0, len = expr.length(); i < len; i++) {
             char c = expr.charAt(i);
@@ -126,9 +135,9 @@
         return true;
     }

-    private static String replaceAllLiteral(Pattern pattern, String value, String replace) {
+    private static String replaceAllLiteral(RE pattern, String value, String replace) {
         replace = replace.replaceAll("([$\\\\])", "\\\\$0");
-        return pattern.matcher(value).replaceAll(replace);
+        return pattern.subst(value, replace);
     }

     public String toString() {
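
The replacement-order problem described in the diff comments can be reproduced
without Jakarta Regexp. The following is a rough sketch only: plain
String.replace (Java 5+) stands in for replaceAllLiteral, the estar step is
left out, and the class name StarOrderDemo and the sample pattern are
assumptions for illustration.

```java
// Illustrative only: shows how the single-star replacement damages the
// "(.*?)" produced by the double-star replacement, and how a third
// literal replacement restores it.
class StarOrderDemo {
    // Double star first, then single star, as in the patched constructor.
    static String naive(String pattern) {
        String regex = pattern.replace("**", "(.*?)");
        // The '*' inside "(.*?)" is replaced here as well.
        return regex.replace("*", "([^/]+)");
    }

    // Undo the damage by mapping the mangled group back to "(.*?)".
    static String patched(String pattern) {
        return naive(pattern).replace("(.([^/]+)?)", "(.*?)");
    }

    public static void main(String[] args) {
        System.out.println(naive("com/**/Foo*"));   // com/(.([^/]+)?)/Foo([^/]+)
        System.out.println(patched("com/**/Foo*")); // com/(.*?)/Foo([^/]+)
    }
}
```

The mangled text "(.([^/]+)?)" is exactly what the nstar expression in the
diff matches, which is why the extra replacement step is needed once the
double-star replacement itself contains a star.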

Original comment by [email protected] on 29 Apr 2008 at 7:13

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
I've just been thinking about this some more. (I know, STOP IT ALREADY ;) )

Might it be desirable to make the regular expression engine user-definable
from the task tag? I know that most developers probably won't care, the impact
of such a change would be quite large for a minimal reward, and the change
would introduce another layer of complexity to the project.

Maybe I just answered my own question.

Original comment by [email protected] on 5 May 2008 at 4:38

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
Does anyone still need 1.4 support? I am considering fixing this bug by moving
jarjar to 1.5-only, which would also allow the use of generics.

Since jarjar is a tool that is only used at build time, I don't see a 1.5
requirement being a big deal. Thoughts?

Original comment by [email protected] on 30 Aug 2008 at 5:27

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
It actually was a problem for me because I was forced by company mandate to use
the IBM 1.4 JDK for development -- even new development. I don't think this is
a common scenario, though, and since this is a build tool, most people will
only start using it in new projects -- projects that should really be using
more modern versions of Java anyway.

So fire away. If I ever find myself forced to start a new project in 1.4 land,
and I need jarjar, I have a version that works for me.

Original comment by [email protected] on 31 Aug 2008 at 1:25

from jarjar.

GoogleCodeExporter avatar GoogleCodeExporter commented on August 14, 2024
Thanks, the latest source now relies on 1.5 features like generics.

Original comment by [email protected] on 13 Sep 2008 at 8:22

  • Changed state: WontFix

from jarjar.
