thalium / symless
License: MIT License
Maybe add GitHub Actions to automatically test the plugins (not sure).
Add a custom Dockerfile with protected secrets, or another method, to upload the IDA software with protection (IDA license key) in GitHub Actions.
Also, maybe create a beta branch to test experimental features?
Thanks for the plugins anyway!
In pre-analysis mode, symless will name fields and create repeatable comments based on the binary's base address (and probably many more things are based on it).
Once the database is created, rebasing it will create many inconsistencies (e.g. method names in vtable structures show the wrong address). This is especially annoying for repeatable comments which should automatically link to functions (for instance, repeatable comments that link to implementation candidates for vtable methods).
Not a bug per se, but maybe it would be nice to have a feature (like a command line option) to tell symless to automatically rebase binaries to 0x0 during pre-analysis?
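A minimal sketch of what such an option could look like. The `--rebase-zero` flag and both helper functions are hypothetical, not existing symless options. Inside IDA, the computed delta would presumably be handed to the rebasing API (`ida_segment.rebase_program`), which is not called here.

```python
import argparse


def build_parser():
    """Build a parser holding only the hypothetical rebasing option."""
    parser = argparse.ArgumentParser(description="pre-analysis options (sketch)")
    # Hypothetical flag: symless does not necessarily expose it.
    parser.add_argument(
        "--rebase-zero",
        action="store_true",
        help="rebase the binary to 0x0 before pre-analysis",
    )
    return parser


def rebase_delta(image_base, rebase_zero):
    """Return the signed delta that moves image_base to 0, or 0 if disabled."""
    return -image_base if rebase_zero else 0


args = build_parser().parse_args(["--rebase-zero"])
delta = rebase_delta(0x140000000, args.rebase_zero)
```

Names and comments generated after the rebase would then be offsets from 0x0, so a later rebase by the analyst only adds a constant to them.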
Right now the only typed structure fields are:
__vftable;
pointers in structures.
We could use IDA type information while propagating to automatically type more fields.
For example, if we created a struc_A with a field field_c from the following assignment:
A.field_c = "some_string";
We can guess that the type of field_c is char*, and automatically type it.
Or if we have the following call:
fct(A.field_c);
And knowing the prototype of fct to be:
void fct(char* str);
We can guess the same.
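The two inference rules above can be sketched as follows. This is only an illustration: the `Field` class, the `PROTOTYPES` table and both helpers are hypothetical names, and the real plugin would read prototypes from IDA type information instead of a dictionary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Field:
    name: str
    type: Optional[str] = None  # None means "not typed yet"


# Hypothetical prototype table: function name -> list of parameter types.
PROTOTYPES = {"fct": ["char*"]}


def type_from_assignment(fields, field_name, value):
    """A string literal stored into a field suggests a char* field."""
    fld = fields[field_name]
    if isinstance(value, str) and fld.type is None:
        fld.type = "char*"


def type_from_call(fields, callee, args):
    """args[i] is the name of the field passed as parameter i, or None."""
    params = PROTOTYPES.get(callee, [])
    for index, field_name in enumerate(args):
        if field_name is None or index >= len(params):
            continue
        fld = fields[field_name]
        if fld.type is None:  # never overwrite an already known type
            fld.type = params[index]


struc_a = {"field_c": Field("field_c"), "field_10": Field("field_10")}
type_from_call(struc_a, "fct", ["field_c"])
type_from_assignment(struc_a, "field_10", "some_string")
```

Both rules only fill in fields that are still untyped, so a type set by the user or by an earlier rule is left alone.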
This focuses more on the plugin version.
We have the following structure:
struct A {
struct B* field_0;
};
Let's say we use our plugin to propagate struct A from register rcx, and we reach this instruction:
mov rdx, [rcx]
rdx now contains a pointer to struct B. We should take this into account and also propagate struct B in rdx.
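The behaviour asked for here can be sketched on a toy register state. The `STRUCTS` table and `propagate_load` are hypothetical names used for illustration, not symless internals.

```python
# Hypothetical layout table: struct name -> {field offset: field type}.
STRUCTS = {"A": {0: "B*"}, "B": {}}


def propagate_load(state, dst, base, disp=0):
    """Model `mov dst, [base + disp]` on a {register: struct name} state."""
    owner = state.get(base)  # read before dst is cleared, in case dst == base
    state.pop(dst, None)     # dst is overwritten whatever was loaded
    if owner is None:
        return
    field_type = STRUCTS[owner].get(disp)
    if field_type and field_type.endswith("*") and field_type[:-1] in STRUCTS:
        # The loaded value points to a known structure: track it in dst.
        state[dst] = field_type[:-1]


state = {"rcx": "A"}
propagate_load(state, "rdx", "rcx")  # mov rdx, [rcx]
```

After the load, `state` tracks struct A in rcx and struct B in rdx, so later accesses through rdx can be attributed to B.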
Structure propagation is applied from an entry point (malloc, ctor) and only goes down into callees. If the entry point function returns the propagated object, it could be interesting to propagate after calls to the function (in the function's callers).
Two cases have to be distinguished:
The entry point is a ctor returning *this. We do not want to propagate in the ctor callers, as they can be ctors of derived classes, and propagating a base class into a derived ctor is wrong.
Hooks are very limited and described by JSON. Replace them with Python functions, with examples on some syscalls like socket/files/registry/etc.
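One possible shape for Python-based hooks is a registry filled by a decorator. Everything below (the `hook` decorator, `HOOKS`, `apply_hook` and the returned dictionaries) is a hypothetical design sketch, not the current symless hook format.

```python
HOOKS = {}  # imported function name -> Python handler


def hook(name):
    """Register the decorated function as the hook for `name`."""
    def register(func):
        HOOKS[name] = func
        return func
    return register


@hook("socket")
def on_socket(call_args):
    # Name the parameters and type the returned value.
    return {"params": ["af", "type", "protocol"], "ret": "SOCKET"}


@hook("RegOpenKeyExW")
def on_reg_open_key(call_args):
    # The last parameter is an output: it receives the opened key handle.
    return {"out": {len(call_args) - 1: "HKEY*"}}


def apply_hook(name, call_args):
    """Run the hook registered for `name`, or return None if there is none."""
    handler = HOOKS.get(name)
    return handler(call_args) if handler else None


result = apply_hook("socket", ["ecx", "edx", "r8d"])
```

Compared to a JSON description, a handler can inspect the call arguments and compute its answer, as the registry example does with the index of the output parameter.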
If IID_xyz is used, then import the xyz vtables so that they can be used.
Reset members when moving on to vtables.
Create an object for the members field that contains: type/name/value/size.
Handle conflicts on type/name/value/size.
It seems that it does not support IDA 7.7. When I use it with IDA 7.7, IDA crashes.
Right now the information forwarded during a function call depends on the callee's calling convention: only the register / stack parameters are forwarded to the callee.
The only advantage of knowing a function's cc is that it tells us whether it is worth propagating into a callee, by checking whether any interesting info is present in its parameters. This is required for performance's sake, but could be avoided by improving how the propagation is done.
Do not consider the callee's cc when propagating information in a call:
The calling convention is only required when setting a function's type. For that we need to track which registers / stack offsets are used in the function without being set before, and guess the function's cc from that. This should be easy for register parameters, less so for stack parameters.
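The "used without being set before" idea can be sketched for register parameters as below. This is a simplified illustration: it scans instructions linearly and ignores control flow, the `(reads, writes)` instruction format is hypothetical, and the register order is the Microsoft x64 one (rcx, rdx, r8, r9).

```python
MS_X64_PARAM_REGS = ["rcx", "rdx", "r8", "r9"]


def used_before_set(instructions):
    """instructions: ordered list of (reads, writes) register-name tuples.
    Returns the registers read before any write to them, in first-use order."""
    written, inputs = set(), []
    for reads, writes in instructions:
        for reg in reads:
            if reg not in written and reg not in inputs:
                inputs.append(reg)
        written.update(writes)
    return inputs


def guess_param_count(instructions, param_regs=MS_X64_PARAM_REGS):
    """Guess the parameter count from the highest-ranked input register."""
    inputs = set(used_before_set(instructions))
    count = 0
    for index, reg in enumerate(param_regs):
        if reg in inputs:
            count = index + 1
    return count


body = [
    (("rcx",), ("rax",)),       # mov rax, [rcx]
    (("rax", "r8"), ("rdx",)),  # lea rdx, [rax + r8]
    (("rdx",), ()),             # cmp rdx, 0
]
```

Here rdx is written before it is read, so it is not an input by itself, but the use of r8 still implies at least three register parameters.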
Each structure is propagated individually, from its entry point.
First find all entry points for propagation (allocations + ctors), then only propagate once in each concerned function, and inject every structure at its entry point.
Re-iterate the step for each new entry point found (virtual functions, ...), and keep injecting every structure at its entry point to save structures interactions info.
Investigate the benefits of using the __cppobj attribute on the created C++ classes.
Structures are identified from 2 entry points:
This misses some structures. Another way might be to search for register displacements in every function, each one meaning that a structure offset is accessed. From there, using reverse propagation, we can find every structure creation / entry point (whether the structure comes from the stack, from parameters, is returned by a function...). We can then propagate every structure present in the binary. The only problem is that we do not have the information needed to merge duplicates, which leads to creating the same structure multiple times.
Finding a heuristic to identify duplicates from this state would be interesting.
Think about the incremental mode: flag every decision taken on conflicts, to know whether it comes from the user or from symless.
What is the priority between the user and symless? Does it overwrite the old conflicts, etc.?
Keep symless stable when it is applied twice on the same database.
Automagically type the output of functions like QueryInterface or f(x, IID, z, out).
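A rough sketch of the idea, assuming call arguments are available as symbol names and that an `IID_xyz` symbol identifies the interface `xyz`. Both helper functions are hypothetical.

```python
IID_PREFIX = "IID_"


def interface_from_iid(symbol):
    """Return the interface name for an IID_xyz symbol, else None."""
    if symbol.startswith(IID_PREFIX) and len(symbol) > len(IID_PREFIX):
        return symbol[len(IID_PREFIX):]
    return None


def out_param_type(call_args):
    """Guess the type of the output parameter from the IID passed in the call."""
    for arg in call_args:
        iface = interface_from_iid(arg)
        if iface is not None:
            # The output receives an interface pointer, hence the double pointer.
            return f"{iface} **"
    return None


guessed = out_param_type(["x", "IID_IUnknown", "z", "out"])
```

The value written through `out` could then be propagated as a pointer to that interface, with the matching vtable imported.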
Is it possible to avoid this duplicate?
self.sid = -1 # set by context_t
self.sid_ida = -1 # set when add_struc is called
Add usage of regression tests to the README.
Make regression tests more robust and more communicative about errors when applying symless or dumping information.
Run regression tests only per branch, or per commit?
Don't run make apply and make dump when it has already been done for a given branch/commit.
I copied the symless directory and the symless.py to /opt/idapro-8.4/plugins, and then when I launch IDA Pro I get the following error, and there is no "symless" entry in the "Edit" -> "Plugins" menu.
/opt/idapro-8.4/plugins/symless.py: undefined function __plugins__symless.PLUGIN_ENTRY
It would be better to have the model keep the conflict candidates: if you run another pass on a new function, they can be used to spot duplicates / to make sure the new function does not create structures that have already been identified. Basically, drop the type replacement and keep everything.
And existing.from_structure should not be removed: for example, if the analysis / the plugin is run on a database someone has already worked on, it can be used to recover vtables defined by the user, among other things.
In our implementation register size is not taken into account. rax, eax, ax, ah and al are all considered to be the same register.
For example:
mov ax, 1h
will set the value of rax to 1. This is not right, only the lower 16 bits should be affected.
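The expected behaviour can be sketched with a small sub-register table. This is an illustrative model limited to the rax family, with hypothetical names. It follows the x86-64 rule that 8-bit and 16-bit writes leave the other bits untouched, while 32-bit writes zero the upper half of the 64-bit register.

```python
MASK64 = (1 << 64) - 1

# Sub-register name -> (64-bit parent, bit offset, width in bits).
SUBREGS = {
    "rax": ("rax", 0, 64),
    "eax": ("rax", 0, 32),
    "ax": ("rax", 0, 16),
    "al": ("rax", 0, 8),
    "ah": ("rax", 8, 8),
}


def write_reg(regs, name, value):
    """Write `value` into a (sub-)register, updating the 64-bit parent."""
    parent, shift, width = SUBREGS[name]
    mask = ((1 << width) - 1) << shift
    if width == 32:
        # 32-bit writes zero-extend into the full 64-bit register.
        regs[parent] = value & 0xFFFFFFFF
    else:
        old = regs.get(parent, 0)
        regs[parent] = (old & ~mask & MASK64) | ((value << shift) & mask)


regs = {"rax": 0x1122334455667788}
write_reg(regs, "ax", 1)  # mov ax, 1h
after_ax = regs["rax"]
write_reg(regs, "eax", 2)  # mov eax, 2
after_eax = regs["rax"]
```

With this model, `mov ax, 1h` only changes the low 16 bits of rax, whereas `mov eax, 2` replaces the whole register.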
Register size should be taken into account:
movzx is used)
User
Sometimes, to find where a structure is used, I look for an access to that structure (like [RCX+4243]). Here I run an immediate search with the value 0x4243.
This opens a new window with every occurrence of that value found by IDA. The idea would be to trigger the analysis in "batch" mode on all the search results.
Developer
OK, and would that be from this instruction onward, or would you ideally want it to go back up as well ^^?
User
Can Symless propagate upward?
:)
2022-05-30 03:16:19,537 - symless - CRITICAL generation.py:00350 - generate_structs - Name of the structure struc_?MakeAndInitialize@?$ProcessLocalStorageData@UProcessLocalData@details_abi@wil@@@details_abi@wil@@CAJPEBG$$QEAV?$unique_any_t@V?$mutex_t@V?$unique_storage@U?$resource_policy@PEAXP6AXPEAX@_E$1?CloseHandle@details@wil@@YAX0@ZU?$integral_constant@_K$0A@@wistd@@PEAXPEAX$0A@$$T@details@wil@@@details@wil@@Uerr_returncode_policy@3@@wil@@@3@PEAPEAV123@@Z is not correct
2022-05-30 03:16:19,538 - symless - CRITICAL generation.py:00350 - generate_structs - Name of the structure struc_?MakeAndInitialize@?$ProcessLocalStorageData@VFeatureStateData@details_abi@wil@@@details_abi@wil@@CAJPEBG$$QEAV?$unique_any_t@V?$mutex_t@V?$unique_storage@U?$resource_policy@PEAXP6AXPEAX@_E$1?CloseHandle@details@wil@@YAX0@ZU?$integral_constant@_K$0A@@wistd@@PEAXPEAX$0A@$$T@details@wil@@@details@wil@@Uerr_returncode_policy@3@@wil@@@3@PEAPEAV123@@Z is not correct
2022-05-30 03:16:19,545 - symless - CRITICAL generation.py:00350 - generate_structs - Name of the structure struc_??0?$_Hash@V?$_Uset_traits@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@V?$_Uhash_compare@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@U?$hash@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@@std@@U?$equal_to@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@@4@@std@@V?$allocator@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@@4@$0A@@std@@@std@@IEAA@AEBV?$_Uhash_compare@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@U?$hash@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@@std@@U?$equal_to@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@@4@@1@AEBV?$allocator@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@
@@1@@Z is not correct
2022-05-30 03:16:19,545 - symless - CRITICAL generation.py:00350 - generate_structs - Name of the structure struc_??0?$_Hash@V?$_Umap_traits@$$CBV?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@U?$pair@$$CBV?$vector@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@V?$allocator@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@@2@@std@@$$CBV12@@2@V?$_Uhash_compare@$$CBV?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@Ucase_insensitive_hash_t@details@PublisherProtection@@Ucase_insensitive_equal_to_t@45@@2@V?$allocator@U?$pair@$$CBV?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@U?$pair@$$CBV?$vector@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@V?$allocator@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@@2@@std@@$$CBV12@@2@@std@@@2@$0A@@std@@@std@@IEAA@AEBV?$_Uhash_compare@$$CBV?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@Ucase_insensitive_hash_t@details@PublisherProtection@@Ucase_insensitive_equal_to_t@45@@1@AEBV?$allocator@U?$pair@$$CBV?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@U?$pair@$$CBV?$vector@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@V?$allocator@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@@2@@std@@$$CBV12@@2@@std@@@1@@Z is not correct
2022-05-30 03:16:19,546 - symless - CRITICAL generation.py:00350 - generate_structs - Name of the structure struc_??$emplace@AEBU?$pair@$$CBV?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@U?$pair@$$CBV?$vector@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@V?$allocator@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@@2@@std@@$$CBV12@@2@@std@@@?$_Hash@V?$_Umap_traits@$$CBV?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@U?$pair@$$CBV?$vector@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@V?$allocator@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@@2@@std@@$$CBV12@@2@V?$_Uhash_compare@$$CBV?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@Ucase_insensitive_hash_t@details@PublisherProtection@@Ucase_insensitive_equal_to_t@45@@2@V?$allocator@U?$pair@$$CBV?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@U?$pair@$$CBV?$vector@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@V?$allocator@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@@2@@std@@$$CBV12@@2@@std@@@2@$0A@@std@@@std@@QEAA?AU?$pair@V?$_List_iterator@V?$_List_val@U?$_List_simple_types@U?$pair@$$CBV?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@U?$pair@$$CBV?$vector@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@V?$allocator@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@@2@@std@@$$CBV12@@2@@std@@@std@@@std@@@std@@_N@1@AEBU?$pair@$$CBV?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@U?$pair@$$CBV?$vector@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@V?$allocator@V?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@@2@@std@@$$CBV12@@2@@1@@Z is not correct
2022-05-30 03:16:19,546 - symless - CRITICAL generation.py:00350 - generate_structs - Name of the structure struc_??$emplace@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@@?$_Hash@V?$_Uset_traits@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@V?$_Uhash_compare@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@U?$hash@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@@std@@U?$equal_to@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@@4@@std@@V?$allocator@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@@4@$0A@@std@@@std@@QEAA?AU?$pair@V?$_List_const_iterator@V?$_List_val@U?$_List_simple_types@V?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@@std@@@std@@@std@@_N@1@$$QEAV?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHBITMAP__@@P6AHPEAX@Z$1?DeleteObject@@YAH0@ZU?$integral_constant@_K$0A@@wistd@@PEAU1@PEAU1@$0A@$$T@details@wil@@@details@wil@@@wil@@@Z is not correct
These names do not seem to be demangled correctly by idaapi, which makes the resulting structure names invalid.
Nevertheless, in Hex-Rays the names are demangled.
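Until demangling is fixed, one possible workaround is to fall back to a sanitized name when the candidate is not usable. The sketch below is an assumption-heavy illustration: the allowed character set and the `MAX_LEN` limit are guesses, not IDA's documented rules, and `safe_struct_name` is a hypothetical helper.

```python
import hashlib
import re

MAX_LEN = 255  # assumed limit, to be checked against IDA's actual one


def safe_struct_name(raw, max_len=MAX_LEN):
    """Return `raw` if it looks valid, else a sanitized, truncated variant."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", raw)
    if cleaned == raw and len(raw) <= max_len:
        return raw
    # A short digest of the original keeps distinct mangled names distinct.
    digest = hashlib.sha1(raw.encode("utf-8")).hexdigest()[:8]
    keep = max_len - len(digest) - 1
    return f"{cleaned[:keep]}_{digest}"


fallback = safe_struct_name("struc_?MakeAndInitialize@?$X@@Z")
```

The digest suffix avoids collisions between two long mangled names that share the same prefix once truncated.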
Rename GUID_43826d1e_e718_42ee_bc55_a1e261c37bfe to GUID_symbol_name.
Let's say we have two classes A & B, with B inheriting A. Symless has propagated A & B into the same function: A::A() (A constructor).
After conflict resolution, A members should be applied to the corresponding offset operands in A::A(). This will create xrefs on members of A, but the corresponding members in B will not have their xrefs. This does not allow us to know where these B members are used.
A solution would be to add custom xrefs on those B members. After this, every structure field should have at least one xref, or the field is a padding field.
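The set of custom xrefs to add can be computed as sketched below. This assumes single inheritance with the base class A laid out at offset 0 of B, and uses plain dictionaries of offsets to instruction addresses. `missing_derived_xrefs` is a hypothetical helper and the addresses are made up for the example.

```python
def missing_derived_xrefs(base_xrefs, derived_xrefs):
    """Both arguments map a member offset to the set of instruction addresses
    referencing it. Returns the xrefs B is missing compared to A."""
    to_add = {}
    for offset, refs in base_xrefs.items():
        new_refs = refs - derived_xrefs.get(offset, set())
        if new_refs:
            to_add[offset] = new_refs
    return to_add


# Members of A referenced from A::A(); B only has an xref of its own at 0x10.
a_xrefs = {0x0: {0x1400010A0}, 0x8: {0x1400010A7, 0x1400010B2}}
b_xrefs = {0x8: {0x1400010A7}, 0x10: {0x140002000}}
custom = missing_derived_xrefs(a_xrefs, b_xrefs)
```

Each entry of `custom` is an offset in B together with the instruction addresses that should receive a custom xref to the B member at that offset.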