libyal / dtfabric
Tooling for data type and structure management
License: Apache License 2.0
The name libyal was initially a pun on the naming theme of the various library projects. Now it serves to provide an overview of the available projects in a single location and as a home for scripts that help maintain the projects. For more information see:
* Project documentation: https://github.com/libyal/libyal/wiki/Home
* Overview of available projects: https://github.com/libyal/libyal/wiki/Overview
Windows Recycler INFO2 files contain strings that have a predefined storage size but are terminated by an end-of-string character. Unused bytes after the terminator can contain remnant data.
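A minimal sketch of reading such a field, assuming a single-byte NUL terminator; the helper name and encoding are illustrative, not the INFO2 specification:

```python
def read_fixed_size_string(buffer: bytes, encoding: str = "ascii") -> str:
    """Reads a string from a fixed-size field, stopping at the first
    end-of-string (NUL) character; bytes after it are remnant data."""
    terminator_index = buffer.find(b"\x00")
    if terminator_index >= 0:
        buffer = buffer[:terminator_index]
    return buffer.decode(encoding)

# A 12-byte field holding "C:\file" followed by 4 remnant bytes.
field = b"C:\\file\x00ABCD"
print(read_fixed_size_string(field))  # C:\file
```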
Complete first version of definitions and runtime
add support for structure member conditions
add support for sections / member groups in structures
Some data formats, such as SQLite and LevelDB, use varints to efficiently serialize integer values.
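As a sketch of what such support would decode, here is an unsigned little-endian base-128 varint reader of the kind LevelDB and protobuf use (note SQLite uses a different, big-endian varint encoding):

```python
def decode_varint(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decodes an unsigned little-endian base-128 varint.

    Each byte contributes 7 value bits; the high bit signals continuation.
    Returns a tuple of (value, number of bytes consumed).
    """
    value = 0
    shift = 0
    for index in range(offset, len(data)):
        byte = data[index]
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, index - offset + 1
        shift += 7
    raise ValueError("truncated varint")

print(decode_varint(b"\xac\x02"))  # (300, 2)
```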
On s390x, the testCompositeMapByteStream test fails, apparently because the code is not "big-endian proof". See the relevant snippet below:
======================================================================
FAIL: testCompositeMapByteStream (runtime.data_maps.StructureMapTest)
Tests the _CompositeMapByteStream function.
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/autopkgtest.pzI7mw/build.2X4/src/tests/runtime/data_maps.py", line 1284, in testCompositeMapByteStream
self.assertEqual(instance_block_header.property_value_offsets, (1, 2, 3))
AssertionError: Tuples differ: (16777216, 33554432, 50331648) != (1, 2, 3)
First differing element 0:
16777216
1
- (16777216, 33554432, 50331648)
+ (1, 2, 3)
----------------------------------------------------------------------
Ran 187 tests in 0.470s
FAILED (failures=1)
This is happening in Ubuntu's autopkgtest environment. The full log can be found here: https://autopkgtest.ubuntu.com/results/autopkgtest-kinetic/kinetic/s390x/d/dtfabric/20220811_075338_a3ecd@/log.gz.
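The failing values are consistent with a byte-order mix-up: 16777216 is 0x01000000, which is what the little-endian encoding of 1 looks like when read big-endian. This standalone struct snippet (not dtfabric code) reproduces the observed values:

```python
import struct

# Pack the expected values (1, 2, 3) as little-endian uint32.
data = struct.pack("<III", 1, 2, 3)

print(struct.unpack("<III", data))  # (1, 2, 3)
print(struct.unpack(">III", data))  # (16777216, 33554432, 50331648)
```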
add support for 24-bit and 48-bit integers
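A possible sketch of such support: struct.unpack has no 3-byte or 6-byte format characters, but int.from_bytes handles arbitrary widths (the helper names here are illustrative):

```python
def read_uint24(data: bytes, byte_order: str = "little") -> int:
    """Reads a 24-bit unsigned integer, a width struct.unpack lacks."""
    return int.from_bytes(data[:3], byte_order)

def read_uint48(data: bytes, byte_order: str = "little") -> int:
    """Reads a 48-bit unsigned integer."""
    return int.from_bytes(data[:6], byte_order)

print(read_uint24(b"\x01\x00\x00"))              # 1
print(read_uint48(b"\xff\xff\xff\xff\xff\xff"))  # 281474976710655
```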
Consider adding support to define remaining data size
name: mru_value_string_and_shell_item
type: structure
attributes:
byte_order: little-endian
members:
- name: string
type: string
encoding: utf-16-le
element_data_type: wchar16
elements_terminator: "\x00\x00"
- name: shell_item
type: stream
element_data_type: byte
elements_data_size: <<remaining data size>>
Maybe use a format data type instead of a structure? Or do this via context()? The current method would be: read the string, determine its byte size using the context, and use that as the byte offset of the remaining data.
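The approach above can be sketched in plain Python, outside dtfabric: find the UTF-16-LE terminator, decode the string, and treat everything after it as the remaining (shell item) data. The helper name is illustrative:

```python
def split_string_and_remainder(data: bytes) -> tuple[str, bytes]:
    """Splits a buffer into a UTF-16-LE string terminated by b"\x00\x00"
    and the remaining data (here: the shell item bytes)."""
    # Scan 2-byte code units for the terminator.
    for offset in range(0, len(data) - 1, 2):
        if data[offset:offset + 2] == b"\x00\x00":
            string = data[:offset].decode("utf-16-le")
            return string, data[offset + 2:]
    raise ValueError("missing string terminator")

data = "abc".encode("utf-16-le") + b"\x00\x00" + b"\x13\x37"
string, shell_item_data = split_string_and_remainder(data)
print(string)  # abc
```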
strip elements terminator from strings
https://github.com/libyal/dtfabric/blob/master/dtfabric/runtime/fabric.py#L18 specifies that the expected type for yaml_definition is str, but the argument is then passed to a BytesIO object.
Please clarify whether this method is intended to take a str or bytes argument.
add support for alignment padding
From: log2timeline/plaso#1917
Traceback (most recent call last):
File "/bin/log2timeline.py", line 68, in <module>
if not Main():
File "/bin/log2timeline.py", line 54, in Main
tool.ExtractEventsFromSources()
File "/usr/lib/python2.7/site-packages/plaso/cli/log2timeline_tool.py", line 368, in ExtractEventsFromSources
scan_context = self.ScanSource(self._source_path)
File "/usr/lib/python2.7/site-packages/plaso/cli/storage_media_tool.py", line 1099, in ScanSource
self._source_scanner.Scan(scan_context)
File "/usr/lib/python2.7/site-packages/dfvfs/helpers/source_scanner.py", line 565, in Scan
self._ScanNode(scan_context, scan_node, auto_recurse=auto_recurse)
File "/usr/lib/python2.7/site-packages/dfvfs/helpers/source_scanner.py", line 366, in _ScanNode
scan_node.path_spec, resolver_context=self._resolver_context)
File "/usr/lib/python2.7/site-packages/dfvfs/resolver/resolver.py", line 55, in OpenFileEntry
path_spec_object, resolver_context=resolver_context)
File "/usr/lib/python2.7/site-packages/dfvfs/resolver/resolver.py", line 158, in OpenFileSystem
resolver_helper = cls._GetResolverHelper(path_spec_object.type_indicator)
File "/usr/lib/python2.7/site-packages/dfvfs/resolver/resolver.py", line 35, in _GetResolverHelper
from dfvfs.resolver_helpers import manager
File "/usr/lib/python2.7/site-packages/dfvfs/resolver_helpers/__init__.py", line 10, in <module>
from dfvfs.resolver_helpers import cpio_resolver_helper
File "/usr/lib/python2.7/site-packages/dfvfs/resolver_helpers/cpio_resolver_helper.py", line 10, in <module>
from dfvfs.vfs import cpio_file_system
File "/usr/lib/python2.7/site-packages/dfvfs/vfs/cpio_file_system.py", line 6, in <module>
from dfvfs.lib import cpio
File "/usr/lib/python2.7/site-packages/dfvfs/lib/cpio.py", line 43, in <module>
class CPIOArchiveFile(data_format.DataFormat):
File "/usr/lib/python2.7/site-packages/dfvfs/lib/cpio.py", line 59, in CPIOArchiveFile
'cpio_binary_big_endian_file_entry')
File "/usr/lib/python2.7/site-packages/dtfabric/runtime/data_maps.py", line 1811, in CreateDataTypeMap
return DataTypeMapFactory.CreateDataTypeMapByType(data_type_definition)
File "/usr/lib/python2.7/site-packages/dtfabric/runtime/data_maps.py", line 1829, in CreateDataTypeMapByType
return data_type_map_class(data_type_definition)
File "/usr/lib/python2.7/site-packages/dtfabric/runtime/data_maps.py", line 1342, in __init__
data_type_definition, self._data_type_map_cache)
File "/usr/lib/python2.7/site-packages/dtfabric/runtime/data_maps.py", line 1542, in _GetMemberDataTypeMaps
member_definition)
File "/usr/lib/python2.7/site-packages/dtfabric/runtime/data_maps.py", line 1829, in CreateDataTypeMapByType
return data_type_map_class(data_type_definition)
File "/usr/lib/python2.7/site-packages/dtfabric/runtime/data_maps.py", line 217, in __init__
self._operation = self._GetByteStreamOperation()
File "/usr/lib/python2.7/site-packages/dtfabric/runtime/data_maps.py", line 150, in _GetByteStreamOperation
return byte_operations.StructOperation(format_string)
File "/usr/lib/python2.7/site-packages/dtfabric/runtime/byte_operations.py", line 55, in __init__
'with error: {0!s}').format(exception))
dtfabric.errors.FormatError: Unable to create struct object from data type definition with error: Struct() argument 1 must be string, not unicode
Some formats might require data transforms, e.g. XOR. dtFabric might not be the place to implement this, but keeping this issue open to track ideas.
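As an idea of what such a transform could look like, a repeating-key XOR in plain Python (not a dtFabric API):

```python
def xor_transform(data: bytes, key: bytes) -> bytes:
    """Applies a repeating-key XOR transform; XOR is its own inverse,
    so applying it twice with the same key restores the original data."""
    return bytes(
        byte ^ key[index % len(key)] for index, byte in enumerate(data))

obfuscated = xor_transform(b"\x01\x02\x03\x04", b"\xff")
print(obfuscated)                        # b'\xfe\xfd\xfc\xfb'
print(xor_transform(obfuscated, b"\xff"))  # b'\x01\x02\x03\x04'
```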
name: keychain_database_schema
type: structure
attributes:
byte_order: big-endian
members:
- name: size
data_type: uint32
- name: number_of_tables
data_type: uint32
- name: table_offsets
type: sequence
element_data_type: uint32
number_of_elements: keychain_database_schema.number_of_tables
Currently the byte order does not appear to be propagated to table_offsets.
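The expected behavior, sketched with the struct module (not dtfabric code): since the structure declares big-endian, each uint32 element of table_offsets should be read with a big-endian format as well:

```python
import struct

# Three big-endian uint32 table offsets, as the schema declares.
byte_stream = struct.pack(">III", 0x34, 0x58, 0x7C)

number_of_tables = 3
# Elements should inherit the structure's big-endian byte order.
table_offsets = struct.unpack(f">{number_of_tables:d}I", byte_stream)
print(table_offsets)  # (52, 88, 124)
```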
Handling data files in Python is not optimal, and having large YAML strings in Python files is not optimal either. Per https://codereview.appspot.com/328230043/ provide an easy way to deploy dtfabric YAML files in Python.
When building packages with 20200621 on openSUSE on ppc64 (big-endian) I get these errors:
[ 54s]
[ 54s] ======================================================================
[ 54s] ERROR: testMapByteStreamWithSequenceWithExpression (runtime.data_maps.StructureMapTest)
[ 54s] Tests the MapByteStream function with a sequence with expression.
[ 54s] ----------------------------------------------------------------------
[ 54s] Traceback (most recent call last):
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1651, in _CompositeMapByteStream
[ 54s] byte_stream, byte_offset=byte_offset, context=subcontext)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1940, in MapByteStream
[ 54s] return self._map_byte_stream(byte_stream, **kwargs)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1819, in _LinearMapByteStream
[ 54s] self._CheckByteStreamSize(byte_stream, byte_offset, members_data_size)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 170, in _CheckByteStreamSize
[ 54s] data_type_size, byte_stream_size))
[ 54s] dtfabric.errors.ByteStreamTooSmallError: Byte stream too small requested: 12 available: 452
[ 54s]
[ 54s] During handling of the above exception, another exception occurred:
[ 54s]
[ 54s] Traceback (most recent call last):
[ 54s] File "./dtfabric/runtime/data_maps.py", line 973, in _CompositeMapByteStream
[ 54s] byte_stream, byte_offset=byte_offset, context=subcontext)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1940, in MapByteStream
[ 54s] return self._map_byte_stream(byte_stream, **kwargs)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1658, in _CompositeMapByteStream
[ 54s] raise errors.ByteStreamTooSmallError(exception)
[ 54s] dtfabric.errors.ByteStreamTooSmallError: Byte stream too small requested: 12 available: 452
[ 54s]
[ 54s] During handling of the above exception, another exception occurred:
[ 54s]
[ 54s] Traceback (most recent call last):
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1651, in _CompositeMapByteStream
[ 54s] byte_stream, byte_offset=byte_offset, context=subcontext)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1146, in MapByteStream
[ 54s] return self._map_byte_stream(byte_stream, **kwargs)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 989, in _CompositeMapByteStream
[ 54s] raise errors.ByteStreamTooSmallError(exception)
[ 54s] dtfabric.errors.ByteStreamTooSmallError: Byte stream too small requested: 12 available: 452
[ 54s]
[ 54s] During handling of the above exception, another exception occurred:
[ 54s]
[ 54s] Traceback (most recent call last):
[ 54s] File "/home/abuild/rpmbuild/BUILD/dtfabric-20200621/tests/runtime/data_maps.py", line 1388, in testMapByteStreamWithSequenceWithExpression
[ 54s] sphere = data_type_map.MapByteStream(byte_stream)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1940, in MapByteStream
[ 54s] return self._map_byte_stream(byte_stream, **kwargs)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1658, in _CompositeMapByteStream
[ 54s] raise errors.ByteStreamTooSmallError(exception)
[ 54s] dtfabric.errors.ByteStreamTooSmallError: Byte stream too small requested: 12 available: 452
[ 54s]
[ 54s] ======================================================================
[ 54s] ERROR: testMapByteStreamWithSequenceWithExpression2 (runtime.data_maps.StructureMapTest)
[ 54s] Tests the MapByteStream function with a sequence with expression.
[ 54s] ----------------------------------------------------------------------
[ 54s] Traceback (most recent call last):
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1651, in _CompositeMapByteStream
[ 54s] byte_stream, byte_offset=byte_offset, context=subcontext)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1146, in MapByteStream
[ 54s] return self._map_byte_stream(byte_stream, **kwargs)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1004, in _CompositeMapByteStream
[ 54s] raise errors.ByteStreamTooSmallError(error_string)
[ 54s] dtfabric.errors.ByteStreamTooSmallError: Unable to read: data from byte stream at offset: 260 with error: missing element: 255
[ 54s]
[ 54s] During handling of the above exception, another exception occurred:
[ 54s]
[ 54s] Traceback (most recent call last):
[ 54s] File "/home/abuild/rpmbuild/BUILD/dtfabric-20200621/tests/runtime/data_maps.py", line 1474, in testMapByteStreamWithSequenceWithExpression2
[ 54s] extension_block = data_type_map.MapByteStream(byte_stream)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1940, in MapByteStream
[ 54s] return self._map_byte_stream(byte_stream, **kwargs)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1658, in _CompositeMapByteStream
[ 54s] raise errors.ByteStreamTooSmallError(exception)
[ 54s] dtfabric.errors.ByteStreamTooSmallError: Unable to read: data from byte stream at offset: 260 with error: missing element: 255
[ 54s]
[ 54s] ======================================================================
[ 54s] ERROR: testMapByteStreamWithSequenceWithValues (runtime.data_maps.StructureMapTest)
[ 54s] Tests the MapByteStream function with a sequence with values.
[ 54s] ----------------------------------------------------------------------
[ 54s] Traceback (most recent call last):
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1835, in _LinearMapByteStream
[ 54s] '{0!s}'.format(value) for value in supported_values])))
[ 54s] dtfabric.errors.MappingError: Value: 33554432 not in supported values: 2, 3
[ 54s]
[ 54s] During handling of the above exception, another exception occurred:
[ 54s]
[ 54s] Traceback (most recent call last):
[ 54s] File "/home/abuild/rpmbuild/BUILD/dtfabric-20200621/tests/runtime/data_maps.py", line 1343, in testMapByteStreamWithSequenceWithValues
[ 54s] structure_with_values = data_type_map.MapByteStream(byte_stream)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1940, in MapByteStream
[ 54s] return self._map_byte_stream(byte_stream, **kwargs)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1846, in _LinearMapByteStream
[ 54s] raise errors.MappingError(error_string)
[ 54s] dtfabric.errors.MappingError: Unable to read: structure_with_values from byte stream at offset: 0 with error: Value: 33554432 not in supported values: 2, 3
[ 54s]
[ 54s] ======================================================================
[ 54s] ERROR: testMapByteStreamWithStream (runtime.data_maps.StructureMapTest)
[ 54s] Tests the MapByteStream function with a stream.
[ 54s] ----------------------------------------------------------------------
[ 54s] Traceback (most recent call last):
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1651, in _CompositeMapByteStream
[ 54s] byte_stream, byte_offset=byte_offset, context=subcontext)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1242, in MapByteStream
[ 54s] self._CheckByteStreamSize(byte_stream, byte_offset, elements_data_size)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 170, in _CheckByteStreamSize
[ 54s] data_type_size, byte_stream_size))
[ 54s] dtfabric.errors.ByteStreamTooSmallError: Byte stream too small requested: 67174396 available: 260
[ 54s]
[ 54s] During handling of the above exception, another exception occurred:
[ 54s]
[ 54s] Traceback (most recent call last):
[ 54s] File "/home/abuild/rpmbuild/BUILD/dtfabric-20200621/tests/runtime/data_maps.py", line 1505, in testMapByteStreamWithStream
[ 54s] extension_block = data_type_map.MapByteStream(byte_stream)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1940, in MapByteStream
[ 54s] return self._map_byte_stream(byte_stream, **kwargs)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1658, in _CompositeMapByteStream
[ 54s] raise errors.ByteStreamTooSmallError(exception)
[ 54s] dtfabric.errors.ByteStreamTooSmallError: Byte stream too small requested: 67174396 available: 260
[ 54s]
[ 54s] ======================================================================
[ 54s] ERROR: testMapByteStreamWithString (runtime.data_maps.StructureMapTest)
[ 54s] Tests the MapByteStream function with a string.
[ 54s] ----------------------------------------------------------------------
[ 54s] Traceback (most recent call last):
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1651, in _CompositeMapByteStream
[ 54s] byte_stream, byte_offset=byte_offset, context=subcontext)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1431, in MapByteStream
[ 54s] byte_stream, byte_offset=byte_offset, **kwargs)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1242, in MapByteStream
[ 54s] self._CheckByteStreamSize(byte_stream, byte_offset, elements_data_size)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 170, in _CheckByteStreamSize
[ 54s] data_type_size, byte_stream_size))
[ 54s] dtfabric.errors.ByteStreamTooSmallError: Byte stream too small requested: 4096 available: 18
[ 54s]
[ 54s] During handling of the above exception, another exception occurred:
[ 54s]
[ 54s] Traceback (most recent call last):
[ 54s] File "/home/abuild/rpmbuild/BUILD/dtfabric-20200621/tests/runtime/data_maps.py", line 1533, in testMapByteStreamWithString
[ 54s] utf16_string = data_type_map.MapByteStream(byte_stream)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1940, in MapByteStream
[ 54s] return self._map_byte_stream(byte_stream, **kwargs)
[ 54s] File "./dtfabric/runtime/data_maps.py", line 1658, in _CompositeMapByteStream
[ 54s] raise errors.ByteStreamTooSmallError(exception)
[ 54s] dtfabric.errors.ByteStreamTooSmallError: Byte stream too small requested: 4096 available: 18
[ 54s]
[ 54s] ======================================================================
[ 54s] FAIL: testReadFrom (runtime.byte_operations.StructOperationTest)
[ 54s] Tests the ReadFrom function.
[ 54s] ----------------------------------------------------------------------
[ 54s] Traceback (most recent call last):
[ 54s] File "/home/abuild/rpmbuild/BUILD/dtfabric-20200621/tests/runtime/byte_operations.py", line 33, in testReadFrom
[ 54s] self.assertEqual(value, tuple([0x78563412]))
[ 54s] AssertionError: Tuples differ: (305419896,) != (2018915346,)
[ 54s]
[ 54s] First differing element 0:
[ 54s] 305419896
[ 54s] 2018915346
[ 54s]
[ 54s] - (305419896,)
[ 54s] + (2018915346,)
[ 54s]
[ 54s] ======================================================================
[ 54s] FAIL: testWriteTo (runtime.byte_operations.StructOperationTest)
[ 54s] Tests the WriteTo function.
[ 54s] ----------------------------------------------------------------------
[ 54s] Traceback (most recent call last):
[ 54s] File "/home/abuild/rpmbuild/BUILD/dtfabric-20200621/tests/runtime/byte_operations.py", line 46, in testWriteTo
[ 54s] self.assertEqual(byte_stream, b'\x12\x34\x56\x78')
[ 54s] AssertionError: b'xV4\x12' != b'\x124Vx'
[ 54s]
[ 54s] ======================================================================
[ 54s] FAIL: testGetSizeHint (runtime.data_maps.StructureMapTest)
[ 54s] Tests the GetSizeHint function with a string.
[ 54s] ----------------------------------------------------------------------
[ 54s] Traceback (most recent call last):
[ 54s] File "/home/abuild/rpmbuild/BUILD/dtfabric-20200621/tests/runtime/data_maps.py", line 1619, in testGetSizeHint
[ 54s] self.assertEqual(size_hint, 18)
[ 54s] AssertionError: 4098 != 18
[ 54s]
[ 54s] ======================================================================
[ 54s] FAIL: testMapByteStreamWithPadding (runtime.data_maps.StructureMapTest)
[ 54s] Tests the MapByteStream function with padding.
[ 54s] ----------------------------------------------------------------------
[ 54s] Traceback (most recent call last):
[ 54s] File "/home/abuild/rpmbuild/BUILD/dtfabric-20200621/tests/runtime/data_maps.py", line 1439, in testMapByteStreamWithPadding
[ 54s] self.assertEqual(structure.data_size, 256)
[ 54s] AssertionError: 1 != 256
[ 54s]
[ 54s] ======================================================================
[ 54s] FAIL: testMapByteStreamWithSequenceWithCondition (runtime.data_maps.StructureMapTest)
[ 54s] Tests the MapByteStream function with a sequence with condition.
[ 54s] ----------------------------------------------------------------------
[ 54s] Traceback (most recent call last):
[ 54s] File "/home/abuild/rpmbuild/BUILD/dtfabric-20200621/tests/runtime/data_maps.py", line 1305, in testMapByteStreamWithSequenceWithCondition
[ 54s] self.assertEqual(structure_with_condition.flags, 0x8001)
[ 54s] AssertionError: 384 != 32769
[ 54s]
[ 54s] ----------------------------------------------------------------------
[ 54s] Ran 161 tests in 3.617s
[ 54s]
[ 54s] FAILED (failures=5, errors=5)
[ 54s] Using Python version 3.6.13 (default, Feb 19 2021, 18:59:35) [GCC]
A complete build log with all details on package versions, exact steps, etc. is available.
Currently a structure with padding must have a size that is aligned with the padding; add a flag to support optional padding?
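The padding size calculation itself is small; a sketch of the arithmetic (the function name is illustrative):

```python
def alignment_padding_size(data_size: int, alignment: int) -> int:
    """Number of padding bytes needed to align data_size to alignment.

    Python's modulo of a negative number is non-negative, so -size %
    alignment yields the distance to the next multiple of alignment.
    """
    return -data_size % alignment

print(alignment_padding_size(5, 4))  # 3
print(alignment_padding_size(8, 4))  # 0
```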
add support for a maximum stream and string size, to speed up incremental reads of strings; this additional information also improves support for size hints
Consider adding support for sequences with a variable element size? Or add a higher-level (abstract) data type for this? For now, dtFabric will error on a sequence with a variable-size element data type.
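One possible shape of such a sequence, sketched in plain Python: elements that each carry a 1-byte size prefix (the layout and helper name are assumptions for illustration):

```python
def read_size_prefixed_sequence(data: bytes) -> list[bytes]:
    """Reads a sequence whose elements each start with a 1-byte size,
    one possible form of a variable-element-size sequence."""
    elements = []
    offset = 0
    while offset < len(data):
        element_size = data[offset]
        offset += 1
        elements.append(data[offset:offset + element_size])
        offset += element_size
    return elements

print(read_size_prefixed_sequence(b"\x02ab\x03cde"))  # [b'ab', b'cde']
```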
Define a layout type for BSM structure with token format
add data type size check e.g. limit integer to 1, 2, 4, 8 bytes?
add support for structure member units e.g. number of blocks
add support for structure member validation, e.g. size == copy_of_size for evt (use validation rules?)
"string arrays" are used by multiple formats; add a data type definition to support them
Per: https://codereview.appspot.com/347790043/diff/20001/plaso/parsers/data_formats.py there is a need to expose data_type_definition.name in DataTypeMap
Some formats have a base type that can be read without specific support for identifying values, e.g. LNK data blocks (https://github.com/libyal/liblnk/blob/main/documentation/Windows%20Shortcut%20File%20(LNK)%20format.asciidoc#6-extra-data), versus formats that require a known token type, e.g. BSM tokens (https://github.com/libyal/dtformats/blob/main/documentation/Basic%20Security%20Module%20(BSM)%20event%20auditing%20file%20format.asciidoc#1-overview).
To consider: replace the skipUnlessHasTestFile decorator with unittest.SkipTest.