1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
|
#------------------------------------------------------------------------------
# $File: mathematica,v 1.15 2022/10/31 13:22:26 christos Exp $
# mathematica: file(1) magic for mathematica files
# "H. Nanosecond" <aldomel@ix.netcom.com>
# Mathematica a multi-purpose math program
# versions 2.2 and 3.0
#mathematica .mb
0 string \064\024\012\000\035\000\000\000 Mathematica version 2 notebook
!:ext mb
0 string \064\024\011\000\035\000\000\000 Mathematica version 2 notebook
!:ext mb
# .ma
# multiple possibilities:
0 string (*^\n\n::[\011frontEndVersion\ =\ Mathematica notebook
#>41 string >\0 %s
!:ext mb
#0 string (*^\n\n::[\011palette Mathematica notebook version 2.x
#0 string (*^\n\n::[\011Information Mathematica notebook version 2.x
#>675 string >\0 %s #doesn't work well
# there may be 'cr' instead of 'nl' in some does this matter?
# generic:
0 string (*^\r\r::[\011 Mathematica notebook version 2.x
!:ext mb
0 string (*^\r\n\r\n::[\011 Mathematica notebook version 2.x
!:ext mb
0 string (*^\015 Mathematica notebook version 2.x
!:ext mb
0 string (*^\n\r\n\r::[\011 Mathematica notebook version 2.x
!:ext mb
0 string (*^\r::[\011 Mathematica notebook version 2.x
!:ext mb
0 string (*^\r\n::[\011 Mathematica notebook version 2.x
!:ext mb
0 string (*^\n\n::[\011 Mathematica notebook version 2.x
!:ext mb
0 string (*^\n::[\011 Mathematica notebook version 2.x
!:ext mb
# Mathematica .mx files
#0 string (*This\ is\ a\ Mathematica\ binary\ dump\ file.\ It\ can\ be\ loaded\ with\ Get.*) Mathematica binary file
0 string (*This\ is\ a\ Mathematica\ binary\ Mathematica binary file
#>71 string \000\010\010\010\010\000\000\000\000\000\000\010\100\010\000\000\000
# >71... is optional
>88 string >\0 from %s
# Mathematica files PBF:
# 115 115 101 120 102 106 000 001 000 000 000 203 000 001 000
0 string MMAPBF\000\001\000\000\000\203\000\001\000 Mathematica PBF (fonts I think)
# .ml files These are menu resources I think
# these start with "[0-9][0-9][0-9]\ A~[0-9][0-9][0-9]\
# how to put that into a magic rule?
4 string \ A~ MAthematica .ml file
# .nb files
#too long 0 string (***********************************************************************\n\n\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ Mathematica-Compatible Notebook Mathematica 3.0 notebook
0 string (*********************** Mathematica 3.0 notebook
# other (* matches it is a comment start in these langs
# GRR: Too weak; also matches other languages e.g. ML
#0 string (* Mathematica, or Pascal, Modula-2 or 3 code text
#########################
# MatLab v5
# URL: http://fileformats.archiveteam.org/wiki/MAT
# Reference: https://www.mathworks.com/help/pdf_doc/matlab/matfile_format.pdf
# first 116 bytes of header contain text in human-readable form
0 string MATLAB Matlab v
#>11 string/T x \b, at 11 "%.105s"
#!:mime application/octet-stream
!:mime application/x-matlab-data
!:ext mat
# https://de.mathworks.com/help/matlab/import_export/mat-file-versions.html
# level of the MAT-file like: 5.0 7.0 or maybe 7.3
#>7 string x LEVEL "%.3s"
>7 ubyte =0x35 \b5 mat-file
>7 ubyte !0x35
>>7 string x \b%.3s mat-file
>126 short 0x494d (big endian)
>>124 beshort x version %#04x
>126 short 0x4d49 (little endian)
# 0x0100 for level 5.0 and 0x0200 for level 7.0
>>124 leshort x version %#04x
# test again so that default clause works
>126 short x
# created by MATLAB include Platform sometimes without leading comma (0x2C) or missing
# like: GLNX86 PCWIN PCWIN64 SOL2 Windows\0407 nt posix
>>20 search/2 Platform:\040 \b, platform
>>>&0 string x %-0.2s
>>>&2 ubyte !0x2C \b%c
>>>>&0 ubyte !0x2C \b%c
>>>>>&0 ubyte !0x2C \b%c
>>>>>>&0 ubyte !0x2C \b%c
>>>>>>>&0 ubyte !0x2C \b%c
>>>>>>>>&0 ubyte !0x2C \b%c
>>>>>>>>>&0 ubyte !0x2C \b%c
# examples without Platform tag like one_by_zero_char.mat
>>20 default x
>>>11 string x "%s"
# created by MATLAB include time like: Fri Feb 20 15:26:59 2009
>34 search/9/c created\040on:\040 \b, created
>>&0 string x %-.24s
# MatLab v4
# From: Joerg Jenderek
# check for valid imaginary flag of Matlab matrix version 4
13 ushort 0
# check for valid ASCII matrix name
>20 ubyte >0x1F
# skip PreviousEntries.dat with "invalid high" name \304P\344@\001
>>20 ubyte <0304
# skip some Netwfw*.dat and $I3KREPH.dat by checking for non zero number of rows
>>>4 ulong !0
# skip some CD-ROM filesystem like test-hfs.iso by looking for valid big endian type flag
>>>>0 ubelong&0xFFffFF00 0x00000300
>>>>>0 use matlab4
# no example for 8-bit and 16-bit integers matrix
>>>>0 ubelong&0xFFffFF00 0x00000400
>>>>>0 use matlab4
# branch for Little-Endian variant of Matlab MATrix version 4
# skip big endian variant by looking for valid low lttle endian type flag
>>>>0 ulelong <53
# skip tokens.dat and some Netwfw*.dat by check for valid imaginary flag value of MAT version 4
>>>>>12 ulelong <2
# no misidentified little endian MATrix example with "short" matrix name
>>>>>>16 ulelong <3
>>>>>>>0 use \^matlab4
# little endian MATrix with "long" matrix name or some misidentified samples
>>>>>>16 ulelong >2
# skip TileCacheLogo-*.dat with invalid 2nd character \001 of matrix name with length 96
>>>>>>>21 ubyte >0x1F
>>>>>>>>0 use \^matlab4
# display information of Matlab v4 mat-file
0 name matlab4 Matlab v4 mat-file
#!:mime application/octet-stream
!:mime application/x-matlab-data
!:ext mat
# 20-byte header with 5 long integers that contains information describing certain attributes of the Matrix
# type flag decimal MOPT; maximal 4052=FD4h; maximal 52=34h for little endian
#>0 ubelong x \b, type flag %u
#>0 ubelong x (%#x)
# M: 0~little endian 1~Big Endian 2~VAX D-float 3~VAX G-float 4~Cray
#>0 ubelong/1000 x \b, M=%u
>0 ubelong/1000 0 (little endian)
>0 ubelong/1000 1 (big endian)
>0 ubelong/1000 2 (VAX D-float)
>0 ubelong/1000 3 (VAX G-float)
>0 ubelong/1000 4 (Cray)
# namlen; the length of the matrix name
#>16 ubelong x \b, name length %u
#>(16.L+19) ubyte x \b, TERMINATING NAME CHARACTER=%#x
# nul terminated matrix name like: fit_params testmatrix testsparsecomplex teststringarray
#>20 string x \b, MATRIX NAME="%s"
#>21 ubyte x \b, MAYBE 2ND CHAR=%c
>16 pstring/L x %s
# T indicates the matrix type: 0~numeric 1~text 2~sparse
#>0 ubelong%10 x \b, T=%u
>0 ubelong%10 0 \b, numeric
>0 ubelong%10 1 \b, text
>0 ubelong%10 2 \b, sparse
# mrows; number of rows in the matrix like: 1 3 8
>4 ubelong x \b, rows %u
# ncols; number of columns in the matrix like: 1 3 4 5 9 43
>8 ubelong x \b, columns %u
# imagf; imaginary flag; 1~matrix has an imaginary part 0~only real data
>12 ubelong !0 \b, imaginary (%u)
# real; Real part of the matrix consists of mrows * ncols numbers
|