
enjarify's Introduction

Note: This repository may be out of date. Future development will occur at https://github.com/Storyyeller/enjarify.

Introduction

Enjarify is a tool for translating Dalvik bytecode to equivalent Java bytecode. This allows Java analysis tools to analyze Android applications.

Usage and installation

Enjarify is a pure Python 3 application, so you can just git clone and run it. To run it directly, assuming you are in the top directory of the repository, run

python3 -O -m enjarify.main yourapp.apk

For normal use, you'll probably want to use the wrapper scripts and set it up on your path.

Linux

For convenience, a wrapper shell script, enjarify.sh, is provided. It will try to use PyPy if available, since PyPy is faster than CPython. If you want to be able to call Enjarify from anywhere, you can create a symlink from somewhere on your PATH, such as ~/bin. To do this, assuming you are inside the top level of the repository, run

ln -s "$PWD/enjarify.sh" ~/bin/enjarify

Windows

A wrapper batch script, enjarify.bat, is provided. To be able to call it from anywhere, just add the root directory of the repository to your PATH. The batch script will always invoke python3 as the interpreter. If you want to use PyPy, just edit the script.

Usage

Assuming you set up the script on your path correctly, you can call it from anywhere by just typing enjarify, e.g.

enjarify yourapp.apk

The most basic form of usage is to just specify an apk file or dex file as input. If you specify a multidex apk, Enjarify will automatically translate all of the dex files and output the results in a single combined jar. If you specify a dex file, only that dex file will be translated. For example, assuming you manually extracted the dex files, you could run

enjarify classes2.dex

The default output file is [inputname]-enjarify.jar in the current directory. To specify the filename for the output explicitly, pass the -o or --output option.

enjarify yourapp.apk -o yourapp.jar

By default, Enjarify will refuse to overwrite the output file if it already exists. To overwrite the output, pass the -f or --force option.
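As a rough illustration of the default naming rule described above, the output name can be derived from the input path as sketched below. The helper name default_output_name is hypothetical and is not taken from Enjarify's source.

```python
import os

def default_output_name(input_path):
    """Return '[inputname]-enjarify.jar' for an .apk or .dex input path."""
    # Drop the directory part first, then the file extension.
    base = os.path.basename(input_path)
    stem, _ext = os.path.splitext(base)
    return stem + '-enjarify.jar'

print(default_output_name('downloads/yourapp.apk'))
print(default_output_name('classes2.dex'))
```

The result is a bare file name, which matches the statement that the default output lands in the current directory.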

Why not dex2jar?

Dex2jar is an older tool that also tries to translate Dalvik to Java bytecode. It works reasonably well most of the time, but a lot of obscure features or edge cases will cause it to fail or even silently produce incorrect results. By contrast, Enjarify is designed to work in as many cases as possible, even for code where Dex2jar would fail. Among other things, Enjarify correctly handles Unicode class names, constants used as multiple types, implicit casts, exception handlers jumping into normal control flow, classes that reference too many constants, very long methods, exception handlers after a catchall handler, and static initial values of the wrong type.

Limitations

Enjarify does not currently translate optional metadata such as sourcefile attributes, line numbers, and annotations.

Enjarify tries hard to successfully translate as many classes as possible, but there are some potential cases where it is simply not possible due to limitations in Android, Java, or both. Luckily, this only happens in contrived circumstances, so it shouldn't be a problem in practice.

Performance tips

PyPy is much faster than CPython. To install PyPy, see http://pypy.org/. Make sure you get PyPy3 rather than regular PyPy. The Linux wrapper script will automatically use the command pypy3 if available. On Windows, you'll need to edit the wrapper script yourself.

By default, Enjarify runs optimizations on the bytecode which make it more readable for humans (copy propagation, unused value removal, etc.). If you don't need this, you can speed things up by disabling the optimizations with the --fast option. Note that in the very rare case where a class is too big to fit in a classfile without optimization, Enjarify will automatically retry it with all optimizations enabled, so this option does not affect the number of classes that are successfully translated.
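The retry behaviour described above can be pictured with the following sketch. All names here (ClassfileLimitExceeded, translate_class, fake_translate) are hypothetical stand-ins chosen for illustration, not Enjarify's actual API.

```python
class ClassfileLimitExceeded(Exception):
    """Hypothetical error: the generated class does not fit in a classfile."""

def translate_class(cls, optimize, translate):
    """Translate cls, retrying with optimizations if the result is too big.

    translate(cls, optimize) is a caller-supplied (hypothetical) function
    that raises ClassfileLimitExceeded when the output is too large.
    """
    try:
        return translate(cls, optimize)
    except ClassfileLimitExceeded:
        if optimize:
            raise  # already optimized, nothing more to try
        return translate(cls, True)

def fake_translate(cls, optimize):
    # Stand-in translator: pretend unoptimized output is twice as large.
    size = len(cls) * (1 if optimize else 2)
    if size > 10:
        raise ClassfileLimitExceeded(cls)
    return (cls, optimize)

print(translate_class('small', False, fake_translate))
print(translate_class('bigclass', False, fake_translate))
```

With this structure, skipping optimizations only changes how long translation takes, not which classes end up translated.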

Disclaimer

This is not an official Google product (experimental or otherwise); it is just code that happens to be owned by Google.


enjarify's Issues

No module named 'enjarify'

I used python-3.5.0-embed-amd64.zip to run it. Because it has no 'python3' command, I edited the second line of the 'enjarify.bat' file to 'python -O -m enjarify.main %*'.
The result is below:
...\python-3.5.0-embed-amd64\python.exe: Error while finding spec for 'enjarify.main' (<class 'ImportError'>: No module named 'enjarify')
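Running python -m enjarify.main only works when the directory that contains the enjarify package is on the module search path. A small check along these lines may help narrow the problem down. The function name and the repo_root argument are illustrative and not part of Enjarify.

```python
import importlib
import importlib.util
import sys

def ensure_importable(package, repo_root):
    """Return True if package can be found, adding repo_root to sys.path if needed."""
    if importlib.util.find_spec(package) is not None:
        return True
    # Not found on the current search path: try the repository root.
    sys.path.insert(0, repo_root)
    importlib.invalidate_caches()
    return importlib.util.find_spec(package) is not None
```

For example, ensure_importable('enjarify', repo_root) could be called with repo_root set to wherever the repository was cloned.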


help me with this

When I use this tool, it tells me "IO ERROR, no such file or path". I would like to know where I should put my APK, or whether there is another problem I didn't notice.

missing convenient python object cache generation/distribution (+ possible filesystem cluttering on uninstall)

Hey 😄

I wanted to report a small problem and suggest a possible solution.
The problem is that when the .py files are distributed into the filesystem from a package (e.g. under /usr/lib/enjarify or /usr/share/enjarify) by just copying them over, no Python cache files are created. A problem arises if anyone executes enjarify as root (yes, that shouldn't be the case, but why not): Python will then create those cache files for all Python modules that are imported somewhere within your module. If anyone uninstalls your package afterwards, the filesystem is left in a cluttered state, as the generated files are not tracked and remain.
Additionally there may also be a small performance improvement if python cache object files are distributed.

One possible solution would be to use Python setuptools and create a setup.py that installs your Python module files into e.g. /usr/lib/python3.5/site-packages/enjarify, which will also take care of creating the cache entries if the -O1 parameter is passed when calling setup.py.
You could still have your enjarify.sh script that checks for all your favorite python interpreters.
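Independent of setuptools, the cache files can also be generated at package-build time with the standard compileall module. A minimal sketch, assuming the sources have already been copied to a staging directory (the precompile helper and its argument are hypothetical):

```python
import compileall

def precompile(staging_dir):
    """Byte-compile every .py file under staging_dir; return True on success."""
    # optimize=1 corresponds to running the interpreter with -O.
    return bool(compileall.compile_dir(staging_dir, optimize=1, quiet=1))
```

Because the cache files are then created during packaging, the package manager can track and remove them on uninstall.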

What do you think about this idea? If you need feedback or something like that, feel free... I try to help where I can 😄

cannot translate debug info

dex2jar 0.0.9.x (0.0.9.9-0.0.9.15) can translate debug info from Dalvik, but it's too old. Is enjarify planned to support translating debug info?

PyPy3 Issue

For some reason when using pypy3-2.4.0-win32 and BCV I get this exception:
Traceback (most recent call last):
  File "C:\Users\null.Bytecode-Viewer\enjarify_2\enjarify-master\enjarify\main.py", line 75, in main
    outfile = open(outname, mode=('wb' if args.force else 'xb'))
ValueError: invalid mode: xb

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "H:\Programs\pypy3-2.4.0-win32\lib-python\3\runpy.py", line 161, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "H:\Programs\pypy3-2.4.0-win32\lib-python\3\runpy.py", line 74, in _run_code
    exec(code, run_globals)
  File "C:\Users\null.Bytecode-Viewer\enjarify_2\enjarify-master\enjarify\main.py", line 94, in <module>
    main()
  File "C:\Users\null.Bytecode-Viewer\enjarify_2\enjarify-master\enjarify\main.py", line 76, in main
    except FileExistsError:
NameError: global name 'FileExistsError' is not defined

However with Python 3.4 it works fine, this is on Windows 7.

Is there maybe something I messed up with BCV? If that's the case, sorry for opening a ticket here. I just wasn't sure, because Python 3.4 works perfectly fine whereas PyPy3 doesn't.
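The 'x' open mode and FileExistsError were both introduced in Python 3.3, so an interpreter that implements an older language version rejects them. A possible fallback, shown only as a sketch (the open_output helper is hypothetical and is not Enjarify's code), is to request exclusive creation through os.open:

```python
import os

def open_output(outname, force):
    """Open outname for binary writing; refuse to overwrite unless force is set."""
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, 'O_BINARY', 0)
    # O_EXCL makes creation fail if the file exists; O_TRUNC empties it instead.
    flags |= os.O_TRUNC if force else os.O_EXCL
    fd = os.open(outname, flags, 0o666)
    return os.fdopen(fd, 'wb')
```

When the file already exists and force is false, os.open raises OSError, so a caller would catch OSError rather than FileExistsError.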

ValueError: bytes length not a multiple of item size

1. aeda1299870d0e04d633324db0d7c524

    self.u16s = array.array('H', data)
ValueError: bytes length not a multiple of item size

I fixed it with:

        # Drop trailing bytes so the length is a multiple of the item size.
        # The original line, array.array('H', data), fails on odd lengths.
        self.u16s = array.array('H', data[:len(data) - len(data) % 2])
        assert(self.u16s.itemsize == 2)

        self.u32s = array.array('I', data[:len(data) - len(data) % 4])
        assert(self.u32s.itemsize == 4)

It works, but 133691d92e1127efeee6686b2106dc57 raises another error.

2. 133691d92e1127efeee6686b2106dc57

    results = _FUNC[fmt](*shorts[pos:pos+size])
TypeError: pAAopCCBB() missing 1 required positional argument: 'w2'

    return _descToScalar[desc[0]]
TypeError: 'NoneType' object is not subscriptable

    label = writer.labels[target]
KeyError: 8522

2862 classes translated successfully, 140 classes had errors

So it seems to work, but not perfectly.

PyPy3 speed

I got these results running the latest enjarify with the latest PyPy3 with Python 3.5 support (hashtests.py using range 10):

pypy3 -m enjarify.hashtests  21,44s user 0,11s system 99% cpu 21,552 total

python3.5 -m enjarify.hashtests  73,71s user 0,37s system 99% cpu 1:14,09 total

So there's a ~3.4x speed improvement.

s390x: array index out of range

When running enjarify on s390x it fails with the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3/dist-packages/enjarify/main.py", line 107, in <module>
    main()
  File "/usr/lib/python3/dist-packages/enjarify/main.py", line 97, in main
    translate(data, opts=opts, classes=classes, errors=errors)
  File "/usr/lib/python3/dist-packages/enjarify/main.py", line 27, in translate
    dex = parsedex.DexFile(data)
  File "/usr/lib/python3/dist-packages/enjarify/parsedex.py", line 258, in __init__
    self.classes.append(DexClass(self, defs.off, i))
  File "/usr/lib/python3/dist-packages/enjarify/parsedex.py", line 205, in __init__
    self.name = dex.clsType(words[0])
  File "/usr/lib/python3/dist-packages/enjarify/parsedex.py", line 274, in clsType
    desc = self.type(i)
  File "/usr/lib/python3/dist-packages/enjarify/parsedex.py", line 270, in type
    return self.string(self.u32s[self.type_ids.off//4 + i])
IndexError: array index out of range
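s390x is a big-endian architecture, while DEX data is stored little-endian, and array.array interprets bytes in the machine's native order. Whether that is the cause here is an assumption, but an endian-independent read could be sketched like this (the le_u32s helper is hypothetical):

```python
import array
import sys

def le_u32s(data):
    """Interpret data as little-endian unsigned 32-bit integers."""
    arr = array.array('I', data)
    assert arr.itemsize == 4
    if sys.byteorder == 'big':
        # The bytes were read in big-endian order; swap each item in place.
        arr.byteswap()
    return arr
```

On a little-endian machine the function returns the array unchanged, so the same code path can be used on every platform.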

The dex file used is available here: https://anonscm.debian.org/git/reproducible/diffoscope.git/tree/tests/data/test1.dex

git tag for packaging

Hey,

Any chance to get a git tag to make it easier to package this great tool?
That would be amazing! 😄

If yes, then it would be great if you created a new tag from time to time 😋

cheers
anthraxx

struct.error: 'H' format requires 0 <= number <= 65535

Traceback:

com/rise/app/wallet/application/RisingApp.class Traceback (most recent call last):
  File "~/Programs/enjarify/enjarify/main.py", line 38, in translate
    class_data = writeclass.toClassFile(cls, opts)
  File "~/Programs/enjarify/enjarify/jvm/writeclass.py", line 116, in toClassFile
    pool, rest_stream = classFileAfterPool(cls, opts=opts)
  File "~/Programs/enjarify/enjarify/jvm/writeclass.py", line 100, in classFileAfterPool
    writeMethods(pool, stream, cls.data.methods, opts=opts)
  File "~/Programs/enjarify/enjarify/jvm/writeclass.py", line 65, in writeMethods
    code_attrs = writebytecode.finishCodeAttrs(pool, code_irs, opts=opts)
  File "~/Programs/enjarify/enjarify/jvm/writebytecode.py", line 75, in finishCodeAttrs
    return {irdata.method: writeCodeAttributeTail(pool, irdata, opts=opts) for irdata in code_irs}
  File "~/Programs/enjarify/enjarify/jvm/writebytecode.py", line 75, in <dictcomp>
    return {irdata.method: writeCodeAttributeTail(pool, irdata, opts=opts) for irdata in code_irs}
  File "~/Programs/enjarify/enjarify/jvm/writebytecode.py", line 80, in writeCodeAttributeTail
    bytecode, excepts = jumps.createBytecode(irdata)
  File "~/Programs/enjarify/enjarify/jvm/optimization/jumps.py", line 84, in createBytecode
    packed_excepts.append(struct.pack('>HHHH', s_off, e_off, h_off, c))
struct.error: 'H' format requires 0 <= number <= 65535

Looks like s_off, e_off and h_off can be greater than 65k when a DEX file is large enough.

I tried the following fix and it worked:

    try:
        packed_excepts.append(struct.pack('>HHHH', s_off, e_off, h_off, c))
    except struct.error:
        packed_excepts.append(struct.pack('>IIIH', s_off, e_off, h_off, c))
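For context, each entry in a Java classfile exception table consists of four unsigned 16-bit fields (start_pc, end_pc, handler_pc, catch_type), so an entry is always 8 bytes. The sketch below only illustrates how the two format strings differ in size and when the 16-bit limit is hit. It does not check whether the resulting class files are valid, and the fits_u16 helper is hypothetical.

```python
import struct

# Size in bytes of one packed entry for each format string.
print(struct.calcsize('>HHHH'))
print(struct.calcsize('>IIIH'))

def fits_u16(*values):
    """Return True if every value fits in an unsigned 16-bit field."""
    return all(0 <= v <= 0xFFFF for v in values)

print(fits_u16(0, 65535, 120))
print(fits_u16(70000, 70010, 70020))
```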

realpath: command not found

In enjarify.sh, line 39:
export PYTHONPATH=$(dirname "$(realpath "${BASH_SOURCE[0]}")")

On Mac OS X 10.10.4 there is no realpath command. Please replace it with the readlink command.
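Since Python is already required, one portable option is to let the interpreter resolve the script location rather than relying on an external command. A minimal sketch (the script_dir helper is hypothetical):

```python
import os

def script_dir(script_path):
    """Return the directory of script_path with symlinks resolved."""
    return os.path.dirname(os.path.realpath(script_path))
```

This also follows a symlink such as ~/bin/enjarify back to the repository directory that holds the real script.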

Missing Java Annotations

I was dealing with an app using Retrofit (Java annotations play an important role in that library). Enjarify seemed to work smoothly, but when I opened the output .jar, I couldn't find any of the Java annotations that should have been applied to methods.
Unfortunately I'm not good at Python, so I have no idea why. Is enjarify planned to support Java annotations?

missing tests in tarballs, which can be very handy

Hey, me again... hope I'm not annoying 😉

I have another topic that would be great if you could consider it: Re-adding the tests. I have noticed your .gitattributes changes that add git-export ignores that will keep the tests out of the release tarball.
The tests seem to use 1.7 MB... I guess you wanted to get rid of that "bloat"?

From the packaging and distribution point of view, I find the tests quite useful, as it's always nice and handy if our packaging script can run the upstream tests so we can easily detect possible regressions before we ship the final package to users.

1.7MB doesn't sound like too much nowadays, but I understand if you want to keep that out of your tarball. In that case I can change the build process to pull from a git tag... but first I wanted to ask for your feedback and consideration to re-add them 😄

cheers and thanks for enjarify!

fail cases

Hi,

I used enjarify to convert some commercial apks into jar files. I found it works well in most cases. However, I still find some failing cases, especially ones involving exception statements. Many of these cases also fail with dex2jar.

So I would like to know what the reason is, and what the main advantage of enjarify is. Does it use some unique methods, or does it just handle some special cases?

Thanks for your help! : )

python3 test regression (test2)

Hi,
I'm a package maintainer, and while building enjarify in CI to catch problems early, I noticed that test2 is failing.
Python version: 3.5.2
commit: 2a94b40
log:

==> Starting check()...
running test test1
running test test2
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/build/enjarify-git/src/enjarify-git/enjarify/runtests.py", line 46, in <module>
    executeTest('test{}'.format(i), opts)
  File "/build/enjarify-git/src/enjarify-git/enjarify/runtests.py", line 40, in executeTest
    assert result == expected
AssertionError
==> ERROR: A failure occurred in check().
    Aborting...
==> ERROR: Build failed

No module named enjarify.main

It didn't work. It just shows this log:

Using python3 as Python interperter
./enjarify/enjarify.sh: line 39: realpath: command not found
/usr/local/bin/python3: No module named enjarify.main

struct.error: unpack str size too short for format

I get the following error while using enjarify with pypy3:

struct.error: unpack str size too short for format

Using python3 fixes the issue. Also, the script will not use python because python -c "print(range)" returns "<built-in function range>", not "<class 'range'>".
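The check mentioned above tells Python 2 from Python 3 by how range prints. An equivalent test, sketched here in Python (the function name is illustrative), is to look at sys.version_info:

```python
import sys

def is_python3():
    """Return True when running under a Python 3 interpreter."""
    return sys.version_info[0] == 3

# On Python 3, range is a class, so both checks agree.
print(is_python3(), repr(range))
```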

ACC_SUPER is not set in class file headers

ACC_SUPER may not be used by JVMs (as commented in flags.py), but it is used by the Android dexer. The Dalvik opcode used to call superclass methods depends on whether ACC_SUPER is set in the class file. If ACC_SUPER is not set then the wrong opcode will be used.

enjarify doesn't set the flag, so jars created by enjarify can't be put back through the dexer and run successfully.

dex2jar sets ACC_SUPER in every class that isn't an interface.
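As an illustration of the dex2jar behaviour described above, the flag could be added as follows. The constant values are the class access flags from the JVM specification. The helper name is hypothetical and this is not Enjarify's code.

```python
ACC_INTERFACE = 0x0200
ACC_SUPER = 0x0020

def with_acc_super(access_flags):
    """Set ACC_SUPER on every class that is not an interface."""
    if access_flags & ACC_INTERFACE:
        return access_flags
    return access_flags | ACC_SUPER
```

For a public class (flags 0x0001) this yields 0x0021, while interface flags are returned unchanged.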

Unknown problem

Hi, I get the same error repeatedly with different samples:

Using python3 as Python interperter
Traceback (most recent call last):
  File "/usr/lib/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.4/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/asanchez/Installations/enjarify_project/enjarify/main.py", line 102, in <module>
    main()
  File "/home/asanchez/Installations/enjarify_project/enjarify/main.py", line 92, in main
    translate(data, opts=opts, classes=classes, errors=errors)
  File "/home/asanchez/Installations/enjarify_project/enjarify/main.py", line 27, in translate
    dex = parsedex.DexFile(data)
  File "/home/asanchez/Installations/enjarify_project/enjarify/parsedex.py", line 234, in __init__
    self.u16s = array.array('H', data)
ValueError: string length not a multiple of item size

The last time was with this sample:
https://koodous.com/apks/6a4ec95ce3f6a88786e8beb9ed68e8d7920cd9eedf202b3dcaca01a8e67da459

Thanks for your project!
