
dotnetfile's Introduction

dotnetfile

dotnetfile is a Python library that parses the Common Language Runtime (CLR) header of Windows .NET files. Every Windows .NET assembly contains a CLR header alongside the Portable Executable (PE) header. It stores extensive metadata about the managed part of the file.

dotnetfile is in a way the equivalent of pefile but for .NET samples.

The library provides an easy-to-use API and also contributes new methods to improve file detection: the experimental MemberRef hash, plus the original and a modified version of the TypeRef hash.

The aim of this project is to give malware analysts and threat hunters a tool to easily pull information out of the CLR header. You don't need to be an expert in the CLR header or get lost in its specification to use this library. By using the API, you'll also learn how the header is structured and hopefully get a better understanding of this file type in general.

Installation

dotnetfile requires Python >= 3.7 and pefile.

PyPI

You can easily install dotnetfile with pip:

pip install dotnetfile

Local setup

To install dotnetfile as a module from a local copy of the source, run the provided setup.py file with Python:

python3 setup.py install

Usage

To use dotnetfile, import the module and create an instance of the class DotNetPE with the .NET assembly path as a parameter. A minimal example that prints the number of streams of an assembly is shown below:

# Import class DotNetPE from module dotnetfile
from dotnetfile import DotNetPE

# Define the file path of your assembly
dotnet_file_path = '/Users/<username>/my_dotnet_assembly.exe'

# Create an instance of DotNetPE with the file path as a parameter
dotnet_file = DotNetPE(dotnet_file_path)

# Print out the number of streams of the assembly
print(f'Number of streams: {dotnet_file.get_number_of_streams()}')

You are invited to explore the example scripts: https://github.com/pan-unit42/dotnetfile/blob/main/examples/

Documentation

The full documentation can be found at https://pan-unit42.github.io/dotnetfile/

Authors

This project was started in 2016 with the development of the parser library for internal use at Palo Alto Networks. It was improved/extended with the interface library and open-sourced in 2022 by the following people:

This project is a work in progress. If you find any issues or have any suggestions, please report them on the GitHub project page.


dotnetfile's Issues

Get AssemblyRef versions

Here's code for 'dotnetfile.py' that returns each assembly name along with its version. Use it as is or modify it as needed, but please add something like this. Thanks.

# Table 35
@metatable
class AssemblyRef:

    ....

    def get_assemblyref_names_with_versions(self, deduplicate: bool = False) -> Dict[str, str]:
        """
        Get a mapping of referenced assembly names to their version strings.

        If deduplicate is True, the first version seen for a name is kept;
        otherwise a later row with the same name overwrites the earlier one.
        """
        result = {}

        for table_row in self.dotnetpe.metadata_tables_lookup['AssemblyRef'].table_rows:
            string_address = table_row.string_stream_references['Name']
            if not string_address:
                # Skip rows that have no name reference
                continue
            assembly_name = self.dotnetpe.get_string(string_address)
            if deduplicate and assembly_name in result:
                continue
            # Collect the four version fields of the row
            vs = Struct.AssemblyInfo(table_row.MajorVersion.value, table_row.MinorVersion.value,
                                     table_row.BuildNumber.value, table_row.RevisionNumber.value)
            # Format them as Major.Minor.Build.Revision
            result[assembly_name] = f'{vs.MajorVersion}.{vs.MinorVersion}.{vs.BuildNumber}.{vs.RevisionNumber}'

        return result
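
To try the name-to-version mapping outside the library, here is a minimal standalone sketch. The AssemblyRefRow type and the names_with_versions function are hypothetical stand-ins for illustration, not part of dotnetfile; only the four version fields and the deduplicate behaviour are taken from the snippet above.

```python
from typing import Dict, Iterable, NamedTuple


class AssemblyRefRow(NamedTuple):
    """Hypothetical stand-in for one AssemblyRef table row."""
    name: str
    major: int
    minor: int
    build: int
    revision: int


def names_with_versions(rows: Iterable[AssemblyRefRow],
                        deduplicate: bool = False) -> Dict[str, str]:
    """Map each assembly name to a Major.Minor.Build.Revision string."""
    result: Dict[str, str] = {}
    for row in rows:
        if not row.name:
            # Skip rows without a name, as the snippet above does
            continue
        if deduplicate and row.name in result:
            # Keep the first version seen for this name
            continue
        result[row.name] = f'{row.major}.{row.minor}.{row.build}.{row.revision}'
    return result


rows = [
    AssemblyRefRow('mscorlib', 4, 0, 0, 0),
    AssemblyRefRow('System', 4, 0, 0, 0),
    AssemblyRefRow('mscorlib', 2, 0, 0, 0),
]
print(names_with_versions(rows, deduplicate=True))
```

Because the result is keyed by name, an assembly referenced twice with different versions yields only one entry either way; returning a list of (name, version) pairs would preserve both.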

Support pathlib

The DotNetPE class should accept pathlib path objects in addition to path strings.

PR coming right up.
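
One common way to accept both kinds of input is to normalize the argument with os.fspath before it is used. The sketch below is an assumption about how this could be done, not the content of the PR; normalize_assembly_path is a hypothetical helper name.

```python
import os
from pathlib import Path
from typing import Union


def normalize_assembly_path(path: Union[str, 'os.PathLike[str]']) -> str:
    """Return a plain string path for either a str or a pathlib path object."""
    # os.fspath returns str input unchanged and calls __fspath__ on path objects
    return os.fspath(path)


print(normalize_assembly_path(Path('samples') / 'my_dotnet_assembly.exe'))
print(normalize_assembly_path('my_dotnet_assembly.exe'))
```

A constructor could call such a helper once at the top and keep the rest of its logic string-based.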

Distribution

Hi,

I noticed the installation method here involves directly invoking setup.py which is a pretty antiquated way of managing Python libraries for users.

  1. Is there any intent to upload this to PyPI for an easier pip install? (You should at least claim the name to prevent any supply chain attacks from someone who thinks python3 -m pip install dotnetfile is valid).

  2. Pip supports Git URLs so a user can python3 -m pip install git+https://github.com/pan-unit42/dotnetfile. If the library isn't going to be distributed through PyPI, then this is an easier method to allow users to update and might save some users confusion when it comes to managing their dependencies across multiple Python versions.
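
When juggling several Python versions, it can help to check which interpreter actually has the package installed. A minimal sketch using the standard library's importlib.metadata (available from Python 3.8, so newer than the 3.7 minimum stated in the README); installed_version is a hypothetical helper name.

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Report the state for the interpreter that runs this script
print(f'{sys.executable}: dotnetfile {installed_version("dotnetfile")}')
```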

Incomplete Dotnet Resource Parsing?

This first screenshot is parsing a dotnet resource from a malicious file using YARA to find the offset and size and then dump that location.
[screenshot: image2]

The last cell shows the start of the ICO icon and then the PNG image data.

And here is a screenshot of dotnetfile parsing the same file.
[screenshot: image1]

There appears to still be some dotnet-looking header data between the start of the "data" and the ICO icon.

Is this header structure able to be parsed? The sample in question in both screenshots is:
40cd96e25835eeba956645398ed73a0f0e14563375530fa5f2db3bcf44dd88d7

US Strings Wrong Encoding?

When extracting US strings, the encoding does not match the expected output. File: https://www.virustotal.com/gui/file/ab9cd59d789e6c7841b9d28689743e700d492b5fae1606f184889cc7e6acadcc

dotnetfile output:

ñÜY�}�}ÛTÜ×Ûh�oÛ©Ùh�²ÛÊÜ@�Ì�pÜ4�4�¯Üò�ÖÜúÛ    Ù
e6196fd98b57
íÛ��ÚÛÇ�]Û�Ü�ÜÂÙÜÛ@�,Üt�£Û��ÜÕ�ÕÛ�Ù    ÜãÜ
d4bd11ffd15f9756710a
Ãܵ�ßÛ ÜÍÜüÙ§Ù�ÙZÛëÛ$�ÖÙ�ÙÁ�­ÙÃ�'�è�×ÛÒÙ�Ü÷Û)Û²ÛÙ�Ü
89cdbd2d

dnlib output:

ﱉۨې﯐ﳣ﮿ڈﮖﶴڈﯪﱲݼݸﲗ۴۴ﲼۍﳬﯞﴅ
e6196fd98b57
ﭕ؄ﯾݨ﮻ﰓﰤﵢﯼݼﱫڣﮱ؀ﰻۯﯯ﴿ﰅﱆ
d4bd11ffd15f9756710a
ﱦڠﭙﱀﱵ�ﶵﴮﯩﭓٛ�ﴈ٥﷊ݦٽݔ﮿�ﰵﯡﭝﯪﴀﰙ
89cdbd2d

dnspy output:

<Module>.smethod_0("ﱉۨې﯐ﳣ﮿ڈﮖﶴڈﯪﱲݼݸﲗ۴۴ﲼۍﳬﯞﴅ");
<Module>.smethod_1("e6196fd98b57");
<Module>.smethod_0("ﭕ؄ﯾݨ﮻ﰓﰤﵢﯼݼﱫڣﮱ؀ﰻۯﯯ﴿ﰅﱆ");
<Module>.smethod_1("d4bd11ffd15f9756710a");
<Module>.smethod_0("ﱦڠﭙﱀﱵ�ﶵﴮﯩﭓٛ�ﴈ٥﷊ݦٽݔ﮿�ﰵﯡﭝﯪﴀﰙ");
<Module>.smethod_1("89cdbd2d");
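
The issue does not identify the cause of the mismatch. For reference, ECMA-335 describes each #US heap entry as a compressed length prefix, followed by UTF-16 string data and one trailing flag byte. The sketch below decodes a single entry under that layout; the function names are hypothetical and this is not dotnetfile's implementation.

```python
from typing import Tuple


def read_compressed_uint(data: bytes, offset: int) -> Tuple[int, int]:
    """Decode an ECMA-335 compressed unsigned integer; return (value, size)."""
    first = data[offset]
    if (first & 0x80) == 0:
        # 1-byte form: 0xxxxxxx
        return first, 1
    if (first & 0xC0) == 0x80:
        # 2-byte form: 10xxxxxx xxxxxxxx
        return ((first & 0x3F) << 8) | data[offset + 1], 2
    if (first & 0xE0) == 0xC0:
        # 4-byte form: 110xxxxx followed by three more bytes
        value = ((first & 0x1F) << 24) | (data[offset + 1] << 16) \
            | (data[offset + 2] << 8) | data[offset + 3]
        return value, 4
    raise ValueError('invalid compressed integer prefix')


def read_us_string(heap: bytes, offset: int) -> str:
    """Decode one #US heap entry that starts at the given offset."""
    length, prefix_size = read_compressed_uint(heap, offset)
    if length == 0:
        return ''
    start = offset + prefix_size
    # The length counts bytes and includes the trailing flag byte
    raw = heap[start:start + length - 1]
    return raw.decode('utf-16-le', errors='replace')


# Toy heap: the empty entry at offset 0, then one two-character string
text = '\ufc49\u06e8'
payload = text.encode('utf-16-le')
heap = b'\x00' + bytes([len(payload) + 1]) + payload + b'\x01'
print(read_us_string(heap, 1) == text)
```

Decoding the same bytes with a single-byte codec, or starting at an offset that is off by one, produces mojibake, which may or may not be what happens in the output reported above.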
