dkovar / analyzemft


License: Other

Python 100.00%

analyzemft's Issues

Make -o optional if -b specified

bodyfile output occurs in CSV output module. Need to untangle that.

Additional rules

M - modified, B - birth, A - accessed:

If M < B, the file was likely copied (copy detected at B)
If M and B < A, likely a volume file move
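
The two rules above could be sketched as a small classifier. This is a minimal sketch, assuming M, B, and A are comparable timestamps (for example `datetime` objects) and reading the second rule as "M and B are both earlier than A". The function name `classify_mba` is hypothetical and not part of analyzeMFT.

```python
from datetime import datetime

def classify_mba(m, b, a):
    """Return a list of hints for one file's M/B/A timestamps."""
    hints = []
    # Rule 1: modified before birth suggests the file was copied here at B.
    if m < b:
        hints.append("likely file copy (detected at B)")
    # Rule 2 (assumed reading): M and B are both earlier than A.
    if m < a and b < a:
        hints.append("possible volume file move")
    return hints
```

Both rules are heuristics, so the output is best treated as a note for the analyst rather than a verdict.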

Datarun oddity

Investigate records that generate "datarun oddity" errors. Dataruns with len 0 or > 6

Files not marked as deleted in bodyfile output

I noticed there is no designation in the bodyfile output that a file is marked as deleted vs. active. Also, it appears that on deleted files, sometimes the recovered parent path is inaccurate.

Some files are reporting a file size of 0. Validated data shows file sizes greater than 0. Here is a link with some additional information in regards to MFT file sizes: https://twitter.com/keydet89/status/504684996473675777

Thanks!

Investigate update sequence numbers

http://msdn.microsoft.com/en-us/library/bb470124(v=vs.85).aspx

Google MFT update sequence number

Filename bug

The mft has problems dealing w/ filenames with periods in them. For example, "adobe/reader 9.0" is reported as "adobe/reader~~1.0" and "9.3.0" becomes "932E79D~~1.0". I tried looking at the unicode hack, but couldn't come up w/ any obvious solutions.

Link in README is broken

The link in README to http://data.linux-ntfs.org/ntfsdoc.pdf does not point to anything. You likely want to fix this.

Incorrect file sizes reported

In "decodeFNAttribute()", the file size is derived from
"d['real_fsize'] = struct.unpack("<q",s[48:56])[0]".

If I add "int(record['fn',0]['real_fsize'])" to the output (e.g., as seen in "mft_to_body()"), the resulting file sizes are incorrect. I've validated that the file sizes show up correctly in two other tools.
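
One possible cause, offered as an assumption rather than a confirmed diagnosis: the sizes stored in $FILE_NAME are frequently stale or zero, so other tools tend to take the size from the unnamed $DATA attribute header instead. The sketch below assumes the standard NTFS attribute header layout (non-resident flag at byte 8, resident content length at bytes 16-20, non-resident real size at bytes 48-56) and a Python 3 `bytes` input. `data_attr_size` is a hypothetical helper, not an analyzeMFT function.

```python
import struct

def data_attr_size(attr):
    """Return the logical size recorded in a raw $DATA attribute header."""
    non_resident = attr[8]  # 0 = resident, 1 = non-resident
    if non_resident:
        # Non-resident: real (logical) size is a 64-bit value at offset 48.
        return struct.unpack("<Q", attr[48:56])[0]
    # Resident: content length is a 32-bit value at offset 16.
    return struct.unpack("<I", attr[16:20])[0]
```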

Unable to handle $FILE_NAME attributes stored in $ATTRIBUTE_LIST

https://github.com/dfirlabs/ntfs-specimens/blob/master/generate-specimens-behavior.bat#L282

The ntfs_file_name_list.vhd image contains an MFT entry with $FILE_NAME attributes stored in an $ATTRIBUTE_LIST. Rough outline of the file system hierarchy.

testdir1
testdir1\testfile1
testdir10
testdir10\hardlink9
testdir11
testdir11\hardlink10
testdir12
testdir12\hardlink11
testdir13
testdir13\hardlink12
testdir14
testdir14\hardlink13
testdir15
testdir15\hardlink14
testdir16
testdir16\hardlink15
testdir2
testdir2\hardlink1
testdir3
testdir3\hardlink2
testdir4
testdir4\hardlink3
testdir5
testdir5\hardlink4
testdir6
testdir6\hardlink5
testdir7
testdir7\hardlink6
testdir8
testdir8\hardlink7
testdir9
testdir9\hardlink8

The MFT entry based on the full file system

istat -o 128 ntfs_file_name_list.vhd 38

MFT Entry Header Values:
Entry: 38        Sequence: 1
$LogFile Sequence Number: 1081061
Allocated File
Links: 16

$STANDARD_INFORMATION Attribute Values:
Flags: Archive
Owner ID: 0
Security ID: 264  ()
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.276667100 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: testfile1
Parent MFT Entry: 37 	Sequence: 1
Allocated Size: 0   	Actual Size: 0
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.138610600 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink1
Parent MFT Entry: 39 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.145124900 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink15
Parent MFT Entry: 55 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.276667100 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink2
Parent MFT Entry: 40 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.145124900 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink3
Parent MFT Entry: 41 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.160776400 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink4
Parent MFT Entry: 42 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.176391600 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink5
Parent MFT Entry: 43 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.176391600 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink6
Parent MFT Entry: 44 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.192022000 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink7
Parent MFT Entry: 45 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.207641500 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink8
Parent MFT Entry: 47 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.207641500 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink9
Parent MFT Entry: 48 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.223263600 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink10
Parent MFT Entry: 49 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.223263600 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink11
Parent MFT Entry: 50 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.238888400 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink12
Parent MFT Entry: 52 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.245402800 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink13
Parent MFT Entry: 53 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.245402800 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$FILE_NAME Attribute Values:
Flags: Archive
Name: hardlink14
Parent MFT Entry: 54 	Sequence: 1
Allocated Size: 16   	Actual Size: 14
Created:	2020-02-01 10:48:46.138610600 (CET)
File Modified:	2020-02-01 10:48:46.138610600 (CET)
MFT Modified:	2020-02-01 10:48:46.261038600 (CET)
Accessed:	2020-02-01 10:48:46.138610600 (CET)

$ATTRIBUTE_LIST Attribute Values:
Type: 16-0 	MFT Entry: 38 	VCN: 0
Type: 48-2 	MFT Entry: 38 	VCN: 0
Type: 48-3 	MFT Entry: 38 	VCN: 0
Type: 48-0 	MFT Entry: 46 	VCN: 0
Type: 48-1 	MFT Entry: 46 	VCN: 0
Type: 48-2 	MFT Entry: 46 	VCN: 0
Type: 48-3 	MFT Entry: 46 	VCN: 0
Type: 48-4 	MFT Entry: 46 	VCN: 0
Type: 48-0 	MFT Entry: 51 	VCN: 0
Type: 48-1 	MFT Entry: 51 	VCN: 0
Type: 48-2 	MFT Entry: 51 	VCN: 0
Type: 48-3 	MFT Entry: 51 	VCN: 0
Type: 48-0 	MFT Entry: 56 	VCN: 0
Type: 48-1 	MFT Entry: 56 	VCN: 0
Type: 48-2 	MFT Entry: 56 	VCN: 0
Type: 48-3 	MFT Entry: 56 	VCN: 0
Type: 48-19 	MFT Entry: 38 	VCN: 0
Type: 128-5 	MFT Entry: 46 	VCN: 0

Attributes: 
Type: $STANDARD_INFORMATION (16-0)   Name: N/A   Resident   size: 72
Type: $ATTRIBUTE_LIST (32-12)   Name: N/A   Non-Resident   size: 576  init_size: 576
55 
Type: $FILE_NAME (48-2)   Name: N/A   Resident   size: 84
Type: $FILE_NAME (48-3)   Name: N/A   Resident   size: 84
Type: $FILE_NAME (48-19)   Name: N/A   Resident   size: 86
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 84
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 84
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 84
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 84
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 84
Type: $DATA (128-20)   Name: N/A   Resident   size: 14
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 84
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 84
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 84
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 86
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 86
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 86
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 86
Type: $FILE_NAME (48-0)   Name: N/A   Resident   size: 8

analyzeMFT.py is unable to reconstruct the $ATTRIBUTE_LIST, which should be possible based on the base record file reference.

analyzeMFT.py -f MFT.bin -b bodyfile --bodyfull

grep hardlink bodyfile
0|/testdir1/hardlink15|0|0|0|0|0|1580550526|1580550526|1580550526|1580550526
0|/testdir3/hardlink6|0|0|0|0|14|1580550526|1580550526|1580550526|1580550526
0|/testdir8/hardlink10|0|0|0|0|14|1580550526|1580550526|1580550526|1580550526
0|/testdir12/hardlink14|0|0|0|0|14|1580550526|1580550526|1580550526|1580550526
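
The approach suggested above, grouping extension records under their base record, could be sketched as follows. This assumes the standard MFT entry header layout, where the 8-byte base record file reference sits at offset 32 (low 48 bits are the entry number, high 16 bits the sequence number) and is zero for base records. The function names are hypothetical and not part of analyzeMFT.

```python
import struct
from collections import defaultdict

def base_record_number(raw_record):
    """Return the base MFT entry number, or None if this is a base record."""
    file_ref = struct.unpack_from("<Q", raw_record, 32)[0]
    if file_ref == 0:
        return None
    return file_ref & 0xFFFFFFFFFFFF  # low 48 bits: MFT entry number

def group_extensions(records):
    """Map base entry number -> list of extension entry numbers.

    records is an iterable of (entry_number, raw_record_bytes).
    """
    groups = defaultdict(list)
    for number, raw in records:
        base = base_record_number(raw)
        if base is not None:
            groups[base].append(number)
    return dict(groups)
```

For the specimen above, entries 46, 51, and 56 would be expected to map to base entry 38, so their $FILE_NAME attributes could be merged into that record before paths are built.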

get_folder_path error (no attribute decode)

[user23@system23 analyzeMFT]$ sudo ./analyzeMFT.py -f mft.raw -o mftanalyzed.csv

Traceback (most recent call last):
  File "/home/user23/repo/analyzeMFT/./analyzeMFT.py", line 12, in <module>
    session.process_mft_file()
  File "/home/user23/repo/analyzeMFT/analyzemft/mftsession.py", line 189, in process_mft_file
    self.build_filepaths()
  File "/home/user23/repo/analyzeMFT/analyzemft/mftsession.py", line 309, in build_filepaths
    self.gen_filepaths()
  File "/home/user23/repo/analyzeMFT/analyzemft/mftsession.py", line 357, in gen_filepaths
    self.get_folder_path(i)
  File "/home/user23/repo/analyzeMFT/analyzemft/mftsession.py", line 342, in get_folder_path
    self.mft[seqnum]['filename'] = parentpath + self.path_sep + self.mft[seqnum]['name'].decode()
AttributeError: 'str' object has no attribute 'decode'. Did you mean: 'encode'?

No idea what went wrong :(
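
The traceback suggests Python 2 era code running on Python 3, where the name is already a `str` and has no `.decode()`. A minimal sketch of a tolerant conversion, assuming UTF-8 is an acceptable fallback encoding; `to_text` is a hypothetical helper, not an analyzeMFT function:

```python
def to_text(value, encoding="utf-8"):
    """Return value as str whether it arrives as bytes or str."""
    if isinstance(value, bytes):
        return value.decode(encoding, errors="replace")
    return value
```

The failing line could then call `to_text(self.mft[seqnum]['name'])` instead of `.decode()`.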

Any chance to get a dual licensing?

Hi, I am an MSc student working on a thesis related to NTFS file recovery. I'm planning to write open source software (GPLv3) in Python that will apply some data recovery techniques. When I found your project I was hoping to add its functionality to my application, in order to focus on the subject of my thesis, which is directory tree reconstruction from MFT entries on damaged drives.

However, I can't. That's because the Common Public License is not compatible with the GPL, so I would breach your license in case I used a part of analyzeMFT as a library for my GPL software. Is there any chance that you and the few contributors may consider licensing the code also under the GPL?

Thank you very much in advance, and don't worry if the answer is no. I'll understand.

Fix bitparse

Phantasm @musichatred
@dckovar pls fix bitparse.py. 2's complement (ref +=1) should be outside the loop, gave me some heartburn today. Otherwise, thx for code!
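
A minimal sketch of the point being made, not the actual bitparse.py code: when a little-endian two's-complement field is decoded by hand, each byte is inverted inside the loop and the +1 is applied once afterwards. The function name is hypothetical.

```python
def parse_signed_le(data):
    """Decode little-endian two's-complement bytes by hand."""
    negative = bool(data[-1] & 0x80)  # sign bit lives in the last byte
    value = 0
    for i, byte in enumerate(data):
        if negative:
            byte ^= 0xFF              # invert each byte inside the loop
        value |= byte << (8 * i)
    if negative:
        value = -(value + 1)          # add 1 once, outside the loop
    return value
```

On Python 3 the built-in `int.from_bytes(data, "little", signed=True)` gives the same result and avoids the manual loop.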

Nanosecond zero detection failure with test case.

Test nanosecond zero detection. It currently fails to detect the nanoseconds being zero on the 508 c:\windows\system32\dllhost\svchost.exe file

--bodystd failing at building MFT stage

Attempting to parse MFT using the following command:

analyzeMFT.py --bodyfull --bodystd -b OUTPUT -f MFT -p

Error:

Traceback (most recent call last):
  File "/usr/local/bin/analyzeMFT.py", line 12, in <module>
    session.process_mft_file()
  File "/usr/local/lib/python2.7/dist-packages/analyzemft/mftsession.py", line 187, in process_mft_file
    self.do_output(record)
  File "/usr/local/lib/python2.7/dist-packages/analyzemft/mftsession.py", line 212, in do_output
    self.file_body.write(mft.mft_to_body(record, self.options.bodyfull, self.options.bodystd))
  File "/usr/local/lib/python2.7/dist-packages/analyzemft/mft.py", line 364, in mft_to_body
    int(record['si']['atime'].unixtime), # was str ....
KeyError: 'si'

Thanks in advance.

How to use it?

I git cloned and installed analyzeMFT, but I don't know how to test it, for example, on my D:\. Can you give an example (for Windows) of how to analyze the MFT of an existing drive?

Thanks.

Data run calculations are wrong?

Hi,

I contacted the Sleuthkit developer's list about this a while back, but didn't get a response. Sometimes analyzeMFT and Sleuthkit calculate different data runs for the same file. In these cases, Sleuthkit looks to be correct and analyzeMFT's results are off (I can also confirm by checking those data runs on disk). However analyzeMFT's method complies with all of the NTFS documentation I can find on data runs whereas Sleuthkit somehow gets different numbers.

I hope that you might be able to shed some light on this, as it appears to be a bug in analyzeMFT. Here's an example from the email I sent.

I've attached an odd example of a raw MFT entry (of a zip file) from my clean disk image. I also included the hex dump which includes my math and notes. Github is not allowing me to upload non-picture files, so I'll try to include them in my follow up.

I'm perplexed as to how TSK is parsing the data runs.

The data run snippet is :

31 01 4c 6c 05
21 03 71 01
31 16 be 31 fd
03 00 94 15
01 31
6f 9a 7c ff 31 27 04 bc 0d 31 4f 71 44 01 00 f5 80 00 00 00 00 80 00
00(End)

But TSK is interpreting the data runs as

31 01 4c 6c 05
21 03 71 01
31 16 be 31 fd
03 00 94 15 01
31 6f 9a 7c ff
31 27 04 bc 0d
31 4f 71 44 01
00 (End)

TSK seems to be right, but I don't understand what it's doing.

My analysis by hand (which is the same as what analyzeMFT gives me and consistent with all the NTFS documentation I could find) gives me the following runs. The first three are normal — I get the same result as TSK. The last few are divergent.

31 01 4c 6c 05 (normal)
len 0x01 offset 0x056c4c ==355404 Cluster Address == 355404

21 03 71 01 (normal)
len 0x03 offset 0x0171 == 369 Cluster Address == 355404 + 369 == 355773

31 16 be 31 fd (normal)
len 0x16 (22) offset 0xfd31be == -183874 Cluster Address == 171899

Here's where I'm confused:

03 00 94 15 (sparse)
The header gives me a 0 byte offset field and a 3 byte length field.
0 byte offset field means a sparse data run (so these runs don't take up disk space and return 0s when read)
3 byte length field gives me a length of 0x159400 == 1414144

01 31 (sparse)
0 byte offset field
1 byte length field == length 0x31

6f 9a 7c ff 31 27 04 bc 0d 31 4f 71 44 01 00 f5 80 00 00 00 00 80 00
Something is clearly wrong here.

TSK gives me something more reasonable:

[Len: 1, Addr: 355404],
[Len: 3, Addr: 355773],
[Len: 22, Addr: 171899],
[Len: 39, Addr: 242959],
[Len: 111, Addr: 209321],
[Len: 39, Addr: 1109421],
[Len: 79, Addr: 1192478],

The first three runs are the same, but the rest are different. TSK seems to interpret the runs like this:

31 01 4c 6c 05
21 03 71 01
31 16 be 31 fd
03 00 94 15 01
31 6f 9a 7c ff
31 27 04 bc 0d
31 4f 71 44 01
00 (End)

This only makes sense to me if the fourth line were 31 27 94 15 01 instead of 03 00 94 15 01. Then TSK's numbers and parsing check out with the raw run list. I believe that TSK is correct, but I don't understand how it is parsing the data runs here.
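
For reference, the documented run-list format can be sketched as below. It reproduces the hand analysis above (and therefore the analyzeMFT result), not TSK's. One possible explanation for the difference, offered as an assumption: if the bytes `03 00` sit in the last two bytes of a 512-byte sector of the raw entry, they would be the update sequence number placeholder, and applying the fixup array before parsing would restore the original bytes (`31 27` would match TSK's numbers). `decode_dataruns` is a hypothetical name.

```python
def decode_dataruns(buf):
    """Decode an NTFS run list into (length, cluster) tuples.

    cluster is None for sparse runs (offset field size of 0).
    """
    runs = []
    pos = 0
    lcn = 0
    while pos < len(buf) and buf[pos] != 0:
        header = buf[pos]
        len_size = header & 0x0F   # low nibble: size of the length field
        off_size = header >> 4     # high nibble: size of the offset field
        pos += 1
        length = int.from_bytes(buf[pos:pos + len_size], "little")
        pos += len_size
        if off_size:
            delta = int.from_bytes(buf[pos:pos + off_size], "little", signed=True)
            pos += off_size
            lcn += delta           # offsets are relative to the previous run
            runs.append((length, lcn))
        else:
            runs.append((length, None))
    return runs
```

Feeding it the first three runs from the snippet yields the same three (length, address) pairs that both tools agree on.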

Update anomaly detection

Update anomaly detection to ONLY compare $StandardInfo and $Filename creation timestamps (it currently flags any timestamp anomaly between the two types of timestamps) -- there are too many reasons for the others to be off. But there is ONLY 1 time the creation timestamp is modified in both -- when the file is created. If the two are different -- it is really weird. But we should have it ONLY focus on "CREATION" time to limit the massive amount of information.
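
A minimal sketch of the narrower check, assuming the two creation timestamps have already been parsed into comparable values; the function name and argument names are hypothetical:

```python
def creation_anomaly(si_crtime, fn_crtime):
    """Flag a record only when the two creation timestamps differ."""
    if si_crtime is None or fn_crtime is None:
        return False  # one attribute is missing, nothing to compare
    return si_crtime != fn_crtime
```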

Dealing with large MFTs

The new version (v2.0) starts running into memory problems when trying to deal with large (>400000 entry) MFTs. I've run it on a couple MFTs the old version could easily handle and I get MemoryErrors.

I suspect this is because the new version tries storing every record in a single giant dictionary (although it's possible there's a memory leak somewhere that I'm not seeing, but I don't think that's the case). I'm not sure there's an easy fix. Use NumPy? Don't store the data at all (like the previous version)?
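
One way to avoid holding every parsed record in memory is to stream the file one record at a time. This is a minimal sketch, assuming fixed 1024-byte records (common, but not guaranteed for every volume); `iter_mft_records` is a hypothetical name.

```python
def iter_mft_records(path, record_size=1024):
    """Yield (record_number, raw_bytes) one record at a time."""
    with open(path, "rb") as mft_file:
        number = 0
        while True:
            raw = mft_file.read(record_size)
            if len(raw) < record_size:
                break  # end of file or truncated trailing record
            yield number, raw
            number += 1
```

Path reconstruction still needs each record's name and parent, so a first pass could keep only those two small values per record and a second pass could produce the full output.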

Consider adding plugins to look for specific things

Consider implementing a mechanism to look for specific things, such as .exe files in ProgramData.

Fix issues with deleted files

See this blog:

http://az4n6.blogspot.com/2015/09/whos-your-master-mft-parsers-reviewed.html

How to install it?

When I install it ('python setup.py install') on Windows 10 with Python 3.8.3 from a Command Prompt (as Administrator), I get errors.
The error message is below.
Please help me understand how to use it.

$python setup.py install
running install
running build
running build_py
running build_scripts
running install_lib
copying build\lib\analyzemft\bitparse.py -> C:\Program Files\Python38\Lib\site-packages\analyzemft
copying build\lib\analyzemft\mft.py -> C:\Program Files\Python38\Lib\site-packages\analyzemft
copying build\lib\analyzemft\mftsession.py -> C:\Program Files\Python38\Lib\site-packages\analyzemft
copying build\lib\analyzemft\mftutils.py -> C:\Program Files\Python38\Lib\site-packages\analyzemft
copying build\lib\analyzemft\__init__.py -> C:\Program Files\Python38\Lib\site-packages\analyzemft
byte-compiling C:\Program Files\Python38\Lib\site-packages\analyzemft\bitparse.py to bitparse.cpython-38.pyc
byte-compiling C:\Program Files\Python38\Lib\site-packages\analyzemft\mft.py to mft.cpython-38.pyc
  File "C:\Program Files\Python38\Lib\site-packages\analyzemft\mft.py", line 45
    print '-->Record number: %d\n\tMagic: %s Attribute offset: %d Flags: %s Size:%d' % (
          ^
SyntaxError: invalid syntax

byte-compiling C:\Program Files\Python38\Lib\site-packages\analyzemft\mftsession.py to mftsession.cpython-38.pyc
  File "C:\Program Files\Python38\Lib\site-packages\analyzemft\mftsession.py", line 122
    print "-f <filename> required."
          ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("-f <filename> required.")?

byte-compiling C:\Program Files\Python38\Lib\site-packages\analyzemft\mftutils.py to mftutils.cpython-38.pyc
  File "C:\Program Files\Python38\Lib\site-packages\analyzemft\mftutils.py", line 52
    print "%s%s%s" % (sep.join("%02x" % ord(c) for c in line),
          ^
SyntaxError: invalid syntax

byte-compiling C:\Program Files\Python38\Lib\site-packages\analyzemft\__init__.py to __init__.cpython-38.pyc
running install_scripts
copying build\scripts-3.8\analyzeMFT.py -> C:\Program Files\Python38\Scripts
running install_egg_info
Removing C:\Program Files\Python38\Lib\site-packages\analyzeMFT-2.0.19-py3.8.egg-info
Writing C:\Program Files\Python38\Lib\site-packages\analyzeMFT-2.0.19-py3.8.egg-info

Missing full paths to files in bodyfile format

It would be nice to have the full paths for files included in the bodyfile output.

python2

hi,

python2 and python3 may be installed together...

diff --git a/analyzeMFT.py b/analyzeMFT.py
index dceaae7..0d21be5 100755
--- a/analyzeMFT.py
+++ b/analyzeMFT.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python2
 
 try:
     from analyzemft import mftsession

# pip2 install analyzeMFT
Collecting analyzeMFT
  Using cached analyzeMFT-2.0.19.tar.gz
Installing collected packages: analyzeMFT
  Running setup.py install for analyzeMFT ... done
Successfully installed analyzeMFT-2.0.19

# pip3 install analyzeMFT
Collecting analyzeMFT
  Using cached analyzeMFT-2.0.19.tar.gz
Installing collected packages: analyzeMFT
  Running setup.py install for analyzeMFT ... done
Successfully installed analyzeMFT-2.0.19

# pip2 check
No broken requirements found.

# pip3 check
No broken requirements found.

# python2 /usr/bin/analyzeMFT.py 
-f <filename> required.

# python3 /usr/bin/analyzeMFT.py 
Traceback (most recent call last):
  File "/usr/bin/analyzeMFT.py", line 6, in <module>
    from .analyzemft import mftsession
ModuleNotFoundError: No module named '__main__.analyzemft'; '__main__' is not a package

python setup.py warning

/usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'install_requires'
warnings.warn(msg)

analyzeMFT doesn't catch all nanoseconds = 0

Use case is:

It currently fails to detect the nanoseconds being zero on the 508 c:\windows\system32\dllhost\svchost.exe file

See nano-zero folder on local dev:

Python datetime doesn't handle nanoseconds, only microseconds.

http://stackoverflow.com/questions/15649942/python-convert-epoch-time-with-nanoseconds-to-human-readable
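
Because `datetime` stops at microseconds, a zero check is safer on the raw value. The sketch below assumes the timestamp is available as the raw 64-bit FILETIME integer (100-nanosecond ticks since 1601-01-01 UTC); the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)

def split_filetime(filetime):
    """Return (datetime truncated to microseconds, sub-second 100 ns ticks)."""
    seconds, ticks = divmod(filetime, 10_000_000)
    dt = EPOCH_1601 + timedelta(seconds=seconds, microseconds=ticks // 10)
    return dt, ticks

def has_zero_subseconds(filetime):
    """True when the sub-second part of the raw FILETIME is exactly zero."""
    return filetime % 10_000_000 == 0
```

Testing the integer directly avoids the case where a tick count of 1 to 9 rounds down to zero microseconds and looks like a zeroed timestamp.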

Corrupt MFT

I have tried running this on many different files, and always get "Corrupt MFT Record" and similar results. Any idea why?

Thanks in advance.

MFT entry: fix up values make incorrect assumption

https://github.com/dkovar/analyzeMFT/blob/master/analyzemft/mft.py#L474

    record['seq_attr1'] = raw_record[50:52]  # Sequence attribute for sector 1
    record['seq_attr2'] = raw_record[52:54]  # Sequence attribuet for sector 2

For MFT entries where the fixup value offset is 42 (for now seen in NTFS versions before 3.0) this should be:

    record['seq_attr1'] = raw_record[44:46]  # Sequence attribute for sector 1
    record['seq_attr2'] = raw_record[46:48]  # Sequence attribuet for sector 2

Also you likely want to fix the typo in attribuet

Also see: https://github.com/libyal/libfsntfs/blob/master/documentation/New%20Technologies%20File%20System%20(NTFS).asciidoc#mft-entry-header
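
A minimal sketch of reading the fixup values from the offsets stored in the entry header instead of hard-coded slices. It assumes the standard header layout, with the 16-bit fixup array offset at byte 4 and the 16-bit entry count (including the update sequence number itself) at byte 6. `read_fixup_array` is a hypothetical name.

```python
import struct

def read_fixup_array(raw_record):
    """Return (usn, [replacement bytes per sector]) using the header offsets."""
    usa_offset, usa_count = struct.unpack_from("<HH", raw_record, 4)
    usn = raw_record[usa_offset:usa_offset + 2]
    entries = [
        raw_record[usa_offset + 2 * i:usa_offset + 2 * i + 2]
        for i in range(1, usa_count)
    ]
    return usn, entries
```

With an offset of 48 this returns the same bytes as the current `[50:52]` and `[52:54]` slices; with an offset of 42 it returns `[44:46]` and `[46:48]`.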

dataruns offset in attribute header is from the start of attribute

Hi,

I'm working on a similar project and I noticed you use the number 64 for your data runs offset.
You can get this number (not sure if it is always the same) from the attribute header, and it is the offset from the beginning of the attribute.

d['run_off'] = struct.unpack("<H",s[32:34])[0] # == 64

(d['ndataruns'],d['dataruns'],d['drunerror']) = unpack_dataruns(s[64:])

# can change to:
offset = d['run_off']

(d['ndataruns'],d['dataruns'],d['drunerror']) = unpack_dataruns(s[offset:])

I guess if a non-resident attribute has a name (does that ever happen?), that number would not be 64, and thus it is better to use the offset from the header. Maybe that's why you were getting the data run oddity (l > 6)?
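
Named non-resident attributes do occur (for example an $INDEX_ALLOCATION attribute named $I30), and in that case the name sits between the header and the run list. A minimal sketch of slicing the run list by the header values, assuming the standard layout (total attribute length at bytes 4-8, run offset at bytes 32-34); `datarun_bytes` is a hypothetical helper.

```python
import struct

def datarun_bytes(attr):
    """Return the run-list bytes of a raw non-resident attribute."""
    attr_len = struct.unpack_from("<I", attr, 4)[0]   # total attribute length
    run_off = struct.unpack_from("<H", attr, 32)[0]   # offset from attribute start
    return attr[run_off:attr_len]
```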

Strange values for alen and others

I'm getting some strange values (large negative numbers) for alen and ssize (read from the MFT by the decodeATRHeader function in mft.py).
They're currently being read in as doubles. I think they might need to be read in as (unsigned?) longs.
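
To illustrate the difference, the sketch below unpacks the same eight bytes three ways with Python's `struct` module. The value 4096 is an arbitrary example, and which header fields `alen` and `ssize` map to is not assumed here.

```python
import struct

raw = struct.pack("<Q", 4096)              # 8-byte field as stored on disk

as_double = struct.unpack("<d", raw)[0]    # reinterprets the bits as a float
as_signed = struct.unpack("<q", raw)[0]    # signed 64-bit integer
as_unsigned = struct.unpack("<Q", raw)[0]  # unsigned 64-bit integer
```

Only the two integer formats recover 4096; the double format produces an unrelated floating-point value from the same bytes.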

Investigate flawed records

0 Unknown Inactive File 0 NoParent NoParent NoFNRecord
17 Good Inactive File 18 5 5 /bootex.log
0 Unknown Inactive File 0 NoParent NoParent NoFNRecord
0 Unknown Inactive File 0 NoParent NoParent NoFNRecord
0 Zero Inactive File 0 NoParent NoParent NoFNRecord
0 Zero Inactive File 0 NoParent NoParent NoFNRecord
0 Zero Inactive File 0 NoParent NoParent NoFNRecord
0 Zero Inactive File 0 NoParent NoParent NoFNRecord

What is a Good but inactive record?
What is an Unknown record?
What is a Zero record?

What is in these records?

Feature Request: Add parameter to prepend file entries with user-inputted volume letter

Thought it would be a good idea if we could prepend all file entries with a user-inputted volume letter parameter, to yield entries like "C:...", "D:...", etc.

This is useful in situations where you are extracting and processing MFT's from multiple logical system volumes.

Trouble getting excel friendly

Not sure how this is supposed to work, but I've been trying to get an excel friendly timestamp. Here's what I've tried:

python analyzeMFT.py -l -p -f ~/Forensics/cases/macbook/mft.bin -e -c /home/Forensics/cases/macbook/excel.csv

python analyzeMFT.py -l -p -f ~/Forensics/cases/macbook/mft.bin -e /home/Forensics/cases/macbook/excel.csv

python analyzeMFT.py -l -p -f ~/Forensics/cases/macbook/mft.bin -e -o /home/Forensics/cases/macbook/excel.csv

Almost all of which give me:
2014-08-01|12:31:44.735449|TZ|

Is there something I'm missing? Thank you.

Option to suppress csv header

I use this tool a lot and I plug it into other scripts which do not require the header. Would it be possible to have an option to suppress it?

Many thanks

AnalyzeMFT did not see alternate data streams

Hello,

I was using analyzeMFT to parse out an MFT file. If I open the MFT in a hex editor I was able to see that the file '\system32_challenge\calc.exe' definitely has an ADS named 'hidden.txt'. However, analyzeMFT did not report on this at all. As I understand it, 'hidden.txt' should have appeared under the column 'filename 2', correct?

I read as much as I could about analyzeMFT to make sure I was using it correctly and reading the output right.

I used another tool called 'NTFSwalk' which also parses the output of the MFT file to csv format, except the output is much uglier than that of analyzeMFT. However, this tool did report the ADS hidden.txt.

I just wanted to let you know about my experience and results. Please let me know if you would like me to send you the MFT file in question so you can look at it.

Need $STD_INFO timestamps in bodyfile format

I would like to request that $STD_INFO timestamps are also added to output in bodyfile format.

DB Friendly mods

after Line # 267: SIAttributeSizeNT = 48
add delimiter = "|" # CSV definable delimiter because the data contains commas

modified line # 636
FROM: if ( p == 0 ) or ( p == 5 ):
TO: if ( p == 0 ) or ( p >4 ): # probably unnecessary but I was looking for a spurious generated "1" when 4 filenames

modified Line # 663-677
Make headers DB friendly: Replace spaces with underscores, replace objectionable symbols (# & /), prefix FN1...4 to make headers unique.

after line # 716: # If this goes above four FN attributes ...
add: # Limit to 4
add: if MFTR['fncnt'] > 4: MFTR['fncnt'] = 4

after line # 723: # Pad out the remaining FN columns
add: tmpBuffer = [] # Initialize to empty. *** This was the cause of the spurious "1"

modified line # 890:
FROM: output_file = csv.writer(open(options.output, 'wb'), dialect=csv.excel,quoting=1)
TO: output_file = csv.writer(open(options.output, 'wb'), delimiter = delimiter, quoting=1) # Defined delimiter

The above allowed me to successfully import a little under 1/2 million records into SQLite.

Installation Failure

Python 3.4.3 Installed

I run the install and receive the following ....

c:\IS\installers\analyzeMFT\analyzeMFT-master>python setup.py install
running install
running build
running build_py
creating build
creating build\lib
creating build\lib\analyzemft
copying analyzemft\bitparse.py -> build\lib\analyzemft
copying analyzemft\mft.py -> build\lib\analyzemft
copying analyzemft\mftsession.py -> build\lib\analyzemft
copying analyzemft\mftutils.py -> build\lib\analyzemft
copying analyzemft\__init__.py -> build\lib\analyzemft
running build_scripts
creating build\scripts-3.4
copying and adjusting analyzeMFT.py -> build\scripts-3.4
running install_lib
creating C:\Python34\Lib\site-packages\analyzemft
copying build\lib\analyzemft\bitparse.py -> C:\Python34\Lib\site-packages\analyz
emft
copying build\lib\analyzemft\mft.py -> C:\Python34\Lib\site-packages\analyzemft
copying build\lib\analyzemft\mftsession.py -> C:\Python34\Lib\site-packages\anal
yzemft
copying build\lib\analyzemft\mftutils.py -> C:\Python34\Lib\site-packages\analyz
emft
copying build\lib\analyzemft\__init__.py -> C:\Python34\Lib\site-packages\analyz
emft
byte-compiling C:\Python34\Lib\site-packages\analyzemft\bitparse.py to bitparse.
cpython-34.pyc
byte-compiling C:\Python34\Lib\site-packages\analyzemft\mft.py to mft.cpython-34
.pyc
File "C:\Python34\Lib\site-packages\analyzemft\mft.py", line 52
print '-->Record number: %d\n\tMagic: %s Attribute offset: %d Flags: %s Size
:%d' % (record_number, record['magic'],

^
SyntaxError: invalid syntax

byte-compiling C:\Python34\Lib\site-packages\analyzemft\mftsession.py to mftsess
ion.cpython-34.pyc
File "C:\Python34\Lib\site-packages\analyzemft\mftsession.py", line 100
print "-f required."
^
SyntaxError: Missing parentheses in call to 'print'

byte-compiling C:\Python34\Lib\site-packages\analyzemft\mftutils.py to mftutils.
cpython-34.pyc
File "C:\Python34\Lib\site-packages\analyzemft\mftutils.py", line 52
print "%s%s%s" % ( sep.join( "%02x" % ord(c) for c in line ),
^
SyntaxError: invalid syntax

byte-compiling C:\Python34\Lib\site-packages\analyzemft\__init__.py to __init__.
cpython-34.pyc
running install_scripts
copying build\scripts-3.4\analyzeMFT.py -> C:\Python34\Scripts
running install_egg_info
Writing C:\Python34\Lib\site-packages\analyzeMFT-2.0.15-py3.4.egg-info

Python error occurred in add_note function in mft.py file.

Hello,
First, thank you for this tool.

I have one question :D
While I was trying to analyze an MFT file (C:, Windows 2003 R2) on Windows 7, a Python error occurred as follows.

The error is "NameError: global name 'mft_record' is not defined".

This error occurred at line 398 of the "mft.py" file.

So I changed "mft_record" to "record" in the add_note function, and the error no longer occurs.

But I don't know whether this is the right fix.

Please help me.

..........................................
D:\MFT\analyzeMFT_2.0.4\analyzeMFT-2.0.4>python analyzeMFT.py -f $MFT_C -o test
Traceback (most recent call last):
  File "analyzeMFT.py", line 13, in <module>
    session.process_mft_file()
  File "D:\MFT\analyzeMFT_2.0.4\analyzeMFT-2.0.4\analyzemft\mftsession.py", line 134, in process_mft_file
    record = mft.parse_record(raw_record, self.options)
  File "D:\MFT\analyzeMFT_2.0.4\analyzeMFT-2.0.4\analyzemft\mft.py", line 91, in parse_record
    FNrecord = decodeFNAttribute(raw_record[read_ptr+ATRrecord['soff']:], options.localtz, record)
  File "D:\MFT\analyzeMFT_2.0.4\analyzeMFT-2.0.4\analyzemft\mft.py", line 564, in decodeFNAttribute
    add_note(record, 'Filename - chars converted to hex')
  File "D:\MFT\analyzeMFT_2.0.4\analyzeMFT-2.0.4\analyzemft\mft.py", line 398, in add_note
    record['notes'] = "%s | %s |" % (mft_record['notes'], s)
NameError: global name 'mft_record' is not defined
..........................................
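
The traceback shows `add_note` receiving `record` as its parameter while the failing line reads `mft_record`, so the rename described above looks consistent with the function's signature. A minimal sketch of how the function might read after that change; the handling of an empty or missing `notes` field is an assumption, since the rest of the original function is not shown here:

```python
def add_note(record, s):
    """Append note s to record['notes'], creating the field if needed."""
    existing = record.get('notes', '')
    if existing:
        record['notes'] = "%s | %s |" % (existing, s)
    else:
        record['notes'] = "%s" % s
```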

analyzeMFT.py version 2.0.5: "KeyError: 'soff'"

Hello there, I'm trying to make use of this tool, but I keep encountering the error:

helpdesk@helpdesk-desktop:~/research/mft_test_1$ analyzeMFT.py -f ./mft2.bin -o ./mft2_analyzed.csv
Traceback (most recent call last):
  File "/usr/local/bin/analyzeMFT.py", line 13, in <module>
    session.process_mft_file()
  File "/usr/local/lib/python2.7/dist-packages/analyzemft/mftsession.py", line 164, in process_mft_file
    self.build_filepaths()
  File "/usr/local/lib/python2.7/dist-packages/analyzemft/mftsession.py", line 238, in build_filepaths
    record = mft.parse_record(raw_record, self.options)
  File "/usr/local/lib/python2.7/dist-packages/analyzemft/mft.py", line 118, in parse_record
    DataRecord = decodeDataRecord(raw_record[read_ptr+ATRrecord['soff']:])
KeyError: 'soff'

I've tried on 3 different ntfs extracted MFT's, using the command:

icat -v -o 62 /dev/sdh 0 > mft2.bin

on an NTFS formatted drive with mmls output of:

 Slot    Start        End          Length       Description

00: Meta 0000000000 0000000000 0000000001 Primary Table (#0)
01: ----- 0000000000 0000000061 0000000062 Unallocated
02: 00:00 0000000062 0000501455 0000501394 NTFS (0x07)
03: ----- 0000501456 0000501759 0000000304 Unallocated

I've uploaded the MFT that I'm trying to analyze here:
https://mega.co.nz/#!XYUmmIZT!fJsKXgBe4feejBSe1MH_LjO9FeFy8jD8tp4vc9Uw3U4

Please let me know if my problem has to do with how I'm saving the MFT or if there is a bug in the program.

Thank you for all your hard work!