.TH MIDISTATS 1 "11 February 2024"
.SH NAME
\fBmidistats\fP \- program to summarize the statistical properties of a midi file
.SH SYNOPSIS
midistats \fIinfile\fP

.SH DESCRIPTION
\fImidistats\fP analyzes the contents of a midi file and outputs key
information and various statistical measures. Each line of output
starts with the name of the variable or variable array and the
associated values. The output is interpreted by the user interface
midiexplorer.tcl. Both programs are still being improved. Here
is an explanation of some of the output.
.PP
ntrks indicates the number of tracks in the midi file.
.PP
ppqn is the number of midi pulses per quarter note.
.PP
keysig is the key signature, followed by a major/minor flag, the number
of sharps (positive) or flats (negative) in the key, and the beat number
where the key signature was found.
.PP
trk is followed by the track number for which the following information
applies.
.PP
program is followed by the channel number and the General Midi Program
number.
.PP
trkinfo is an array of 19 numbers which indicate the statistical properties
of the track of interest. The following data is given:
the channel number,
the first program assigned to this channel,
the number of notes for this channel counting any chord as one note,
the total number of notes for this channel,
the sum of the MIDI pitches for all the notes,
the sum of the note durations in MIDI pulse units,
the number of control parameter messages,
the number of pressure messages,
the number of distinct rhythm patterns for each channel,
the number of pulses the channel was inactive,
the minimum pitch value,
the maximum pitch value,
the minimum note length in pulses,
the maximum note length in pulses,
the number of gaps in the channel,
the entropy of the pitch class histogram for that channel,
the number of notes whose pitch was the same as the previous note,
the number of notes whose pitch changed by less than 4 semitones,
and the number of notes whose pitch changed by 4 or more semitones.
(In the event of a chord, the maximum pitches are compared.)
.PP
After processing all the individual tracks, the following information
applies to the entire midi file.
.PP
npulses is the length of the longest midi track in midi pulse units.
.PP
tempocmds specifies the number of times the tempo is changed in this
file.
.PP
pitchbends specifies the total number of pitchbends in this file.
.PP
pitchbendin c n specifies the number of pitchbends n in channel c.
.PP
progs is a list of all the midi programs addressed.
.PP
progsact is the amount of activity for each of the above midi programs.
The activity is the sum of the note durations in midi pulse units.
.PP
progcolor is a 17-dimensional vector where each component maps into
a specific group of MIDI programs. These groups include keyboard
instruments, brass instruments, wind instruments, and so on. More information
can be found in the midiexplorer documentation.
.PP
drums is a list of all the percussion instruments (channel 9) that were
used.
.PP
drumhits indicates the number of notes for each of the above percussion
instruments.
.PP
pitches is a histogram for the 12 pitch classes (C, C#, D, ..., B)
that occur in the midi file.
.PP
key indicates the key of the music, the number of sharps (positive) or
flats (negative) in the key signature, and a measure of the confidence
in this key signature. The key was estimated from the above pitch histogram
by convolving with Craig Sapp's model. The peak of rmaj or rmin (below)
indicates the key.  A correlation less than 0.4 indicates that the pitch
histogram does not follow the histogram of a major or minor scale.
(It may be the result of a mixture of two key signatures.)
.PP
rmaj is the set of cross correlation coefficients with Craig Sapp's major
key model for each of the 12 keys (C, C#, D, ..., B).
.PP
rmin is the set of cross correlation coefficients with Craig Sapp's minor
key model for each of the 12 keys (C, C#, D, ..., B).
.PP
pitchact is a histogram similar to pitches but weighted by the length of
the notes.
.PP
chanvol indicates the value of the control volume commands in the
midi file for each of the 16 channels. The maximum value is 127.
It scales the loudness of the notes (velocity) by its value.
.PP
chnact returns the amount of note activity in each channel.
.PP
trkact returns the number of notes in each track.
.PP
totalrhythmpatterns is the total number of bar rhythm patterns for
all channels except the percussion channel.
.PP
collisions. Midistats counts the bar rhythm patterns using a hashing
function. Presently collisions are ignored so occasionally two
distinct rhythm patterns are counted as one.
.PP
Midistats prints a number of arrays which may be useful in
determining whether the music in the track is a melody line or
chordal rhythmic support. These arrays indicate the properties
for each of the 16 channels. (The percussion channel 9 contains
zeros.) If the same channel occurs in several tracks, these
numbers are the totals for all tracks containing that channel.
Here is a description of these properties.
.PP
nnotes:  the total number of notes in each channel
.br
nzeros:  the number of notes whose previous note was the same pitch
.br
nsteps:  the number of notes whose pitch difference with the previous
note was less than 4 semitones.
.br
njumps:  the number of notes whose pitch difference with the previous
note was 4 or more semitones.
.br
rpats: the number of rhythm patterns for each channel. This is a
duplication of data printed previously.
.br
pavg: the average pitch of all the notes for each channel.
.PP
In addition, midistats may return other codes that describe
other characteristics. They include

unquantized - the note onsets are not quantized
.br
triplets - 3 notes played in the time of 2 notes are present
.br
qnotes - the rhythm is basically simple
.br
clean_quantization - the note onsets are quantized into 1/4, 1/8, 1/16 time units.
.br
dithered_quantization - small variations in the quantized note onsets.
.br
Lyrics - lyrics are present in the meta data
.br
programcmd - there may be multiple program changes in a midi channel



.SH Advanced Percussion Analysis Tools

.PP
The MIDI file devotes channel 9 to the percussion instruments
and over 60 percussion instruments are defined in the MIDI
standard. Though there is a lot of diversity in the percussion
track, for most MIDI files only the first 10 or so percussion
instruments are important in defining the character of the track. The
program Midiexplorer has various tools for exposing the percussion
channel which are described in the documentation. The goal
here is to find the essential characteristics of the percussion
track that distinguish one MIDI file from another. This is attempted
in the program midistats.  Here is a short description.


.br

A number of experimental tools for analyzing the percussion channel
(track) were introduced into midistats and are accessible through
the runtime arguments. When these tools are used in a script which
runs through a collection of midi files, you can build a database
of percussion descriptors.

.SH OPTIONS
.PP
-corestats
.br
outputs a line with 5 numbers separated by tabs, e.g.
.br
1       8       384     4057    375
.br
It returns the number of tracks, the number of channels, the
number of divisions per quarter note beat (ppqn),
the number of note onsets in the midi file, and the maximum
number of quarter note beats in the midi file.
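.PP
When -corestats is run over a collection of files, each output line can
be split into named fields. The following is a minimal Python sketch, not
part of midistats. The names CoreStats and parse_corestats are
hypothetical, and the field names are assumptions based on the
description above.
.PP
.nf
```python
from typing import NamedTuple


class CoreStats(NamedTuple):
    ntrks: int       # number of tracks
    nchannels: int   # number of channels
    ppqn: int        # divisions per quarter note beat
    nonsets: int     # note onsets in the midi file
    nbeats: int      # maximum number of quarter note beats


def parse_corestats(line: str) -> CoreStats:
    """Split one line of -corestats output into its five integers."""
    fields = [int(tok) for tok in line.split()]
    if len(fields) != 5:
        raise ValueError(f"expected 5 numbers, got {len(fields)}")
    return CoreStats(*fields)


# The sample line shown above.
stats = parse_corestats("1       8       384     4057    375")
print(stats.ppqn, stats.nbeats)
```
.fi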


.PP
-pulseanalysis
.br
counts the number of note onsets as a function of their onset time
relative to a beat, grouping them into 12 intervals and returns
the result as a discrete probability density function. Generally,
the distribution consists of a couple of peaks corresponding
to quarter notes or eighth notes. If the distribution is flat,
it indicates that the times of the note occurrences have not been
quantized into beats and fractions. Here is a sample output.
.br
0.349,0.000,0.000,0.160,0.000,0.000,0.298,0.000,0.000,0.191,0.000,0.000
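.PP
As a rough illustration, the peaks in such a line can be picked out with
a few lines of Python. This is only a sketch: the function pulse_peaks
and its threshold of 0.05 are hypothetical, and reading bin i as the
fraction i/12 of a beat assumes the 12 intervals are evenly spaced from
the start of the beat.
.PP
.nf
```python
def pulse_peaks(line, threshold=0.05):
    """Return (bin index, probability) pairs above a threshold."""
    probs = [float(tok) for tok in line.split(",")]
    return [(i, p) for i, p in enumerate(probs) if p > threshold]


# The sample line shown above.
sample = "0.349,0.000,0.000,0.160,0.000,0.000,0.298,0.000,0.000,0.191,0.000,0.000"
for index, prob in pulse_peaks(sample):
    # index / 12 is the assumed position of the bin within the beat
    print(index, index / 12, prob)
```
.fi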

.PP
-panal
.br
Counts the number of note onsets for each percussion instrument. The first
number is the code (pitch) of the instrument, the second number is the
number of occurrences, e.g.
.br
35 337  37 16   38 432  39 208  40 231  42 1088 46 384  49 42   54 1104 57 5    70 1040 85 16
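.PP
A script that collects percussion descriptors might turn this line into
a table of counts. The sketch below is one possible way to do that in
Python. The function name parse_panal is hypothetical, and the code
assumes the output is a single line of whitespace separated code and
count pairs, as in the sample.
.PP
.nf
```python
def parse_panal(line):
    """Map percussion code -> onset count from one line of -panal output."""
    numbers = [int(tok) for tok in line.split()]
    if len(numbers) % 2:
        raise ValueError("expected code/count pairs")
    return dict(zip(numbers[0::2], numbers[1::2]))


# The sample line shown above.
sample = ("35 337  37 16   38 432  39 208  40 231  42 1088 "
          "46 384  49 42   54 1104 57 5    70 1040 85 16")
counts = parse_panal(sample)

# The three percussion codes with the most onsets.
busiest = sorted(counts, key=counts.get, reverse=True)[:3]
print(busiest)
```
.fi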

.PP
-ppatfor n
.br
where n is the code number of the percussion instrument. Each beat
is represented by a 4 bit number where the position of the on-bit
indicates the time in the beat when the drum onset occurs. The bits
are ordered from left to right (higher order bits to lower order
bits). This is the order of bits that you would expect in a
time series.
Thus 0 indicates that there was no note onset in that beat, 1 indicates
a note onset at the end of the beat, 4 indicates a note onset
in the middle of the beat, and so on. The function returns a string
of numbers ranging from 0 to 7 indicating the presence of note onsets
for the selected percussion instrument for the sequence of beats
in the midi file. Here is a truncated sample of the output.
.br

0 0 0 0 0 0 0 0 1 0 0 4 1 0 0 4 1 0 0 4 1 0 0 4 1 0 0 4 1 0 0 4 1 4 4 0
1 0 0 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 5 0 1 0 0 0
1 0 5 0 1 0 5 0 1 etc. 

.br
One can see a repeating 4 beat pattern.

.PP
-ppat
.br
midistats attempts to find two percussion instruments in the midi file
which come closest to acting as the bass drum and snare drum.
If it is unsuccessful, it returns a message reporting the failure. Otherwise,
it encodes the position of these drum onsets in an 8-bit byte for each
quarter note beat in the midi file. The lower (right) 4 bits encode the
bass drum and the higher (left) 4 bits encode the snare drum in the
same manner as described above for -ppatfor.
.br
0 0 0 0 0 0 0 0 0 0 33 145 33 145 33 145 33 145 33 145 33 145 33 145
.br
33 145 33 145 33 145 33 145 33 145 33 145 33 145 33 145 33 145 33 145
.br
33 145 33 145 33 145 33 145 33 145 33 etc.


.PP
-ppathist
.br
computes and displays the histogram of the values that would appear
when running -ppat, e.g.
.br
bass 35 337
.br
snare 38 432
.br
1 (0.1) 64  32 (2.0) 8  33 (2.1) 136  144 (9.0) 8  145 (9.1) 136
.br
The bass percussion code, the number of onsets, and the snare
percussion code and the number of onsets are given in the
first two lines. In the next line the number of occurrences of
each value in the -ppat listing is given. The number in parentheses
splits the two 4-bit values with a period. Thus 33 = (2*16 + 1).
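.PP
The split into two 4-bit values can be reproduced with a shift and a
mask. The following Python sketch is only an illustration and is not
taken from the midistats source. The name split_ppat is hypothetical.
It follows the layout described for -ppat, with the snare drum in the
higher 4 bits and the bass drum in the lower 4 bits.
.PP
.nf
```python
def split_ppat(value):
    """Split one -ppat byte into (snare nibble, bass nibble)."""
    if not 0 <= value <= 255:
        raise ValueError("a -ppat value must fit in 8 bits")
    return value >> 4, value & 0x0F


# Reproduce the labels from the sample histogram line above.
for value in (1, 32, 33, 144, 145):
    snare, bass = split_ppat(value)
    print(f"{value} ({snare}.{bass})")
```
.fi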

.PP
-pitchclass
.br
Returns the pitch class distribution for the entire midi file.

.PP
-nseqfor n
.br
Note sequence for channel n. This option produces a string of bytes
indicating the presence of a note in a time unit corresponding to
an eighth note. Thus each quarter note beat is represented by two
bytes. The pitch class is represented by the line number on the
staff, where 0 is C. Thus the notes on a scale are represented
by 7 numbers, and sharps and flats are ignored. The line number is
then converted to a bit position in the byte, so that the pitch
classes are represented by the numbers 1, 2, 4, 8, and so on. A chord
consisting of two note onsets would set two of the corresponding
bits. If we were to represent the full chromatic scale consisting
of 12 pitches, then we would require two-byte integers or
twice as much memory.
.br
Though the pitch resolution is not sufficient to distinguish
major or minor chords, it should be sufficient to identify some
repeating patterns.
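.PP
The encoding can be sketched in Python as follows. This is an
illustration under stated assumptions, not the midistats implementation.
The table STAFF_STEP and the function encode_onsets are hypothetical, and
folding each sharp onto the natural note below it is an assumption, since
the text above only says that sharps and flats are ignored.
.PP
.nf
```python
# Staff step for each of the 12 chromatic pitch classes, with 0 = C.
# Folding each sharp onto the natural below it is an assumption.
STAFF_STEP = [0, 0, 1, 1, 2, 3, 3, 4, 4, 5, 5, 6]


def encode_onsets(midi_pitches):
    """Pack the note onsets of one eighth note time unit into one byte."""
    byte = 0
    for pitch in midi_pitches:
        byte |= 1 << STAFF_STEP[pitch % 12]
    return byte


print(encode_onsets([60]))          # middle C alone
print(encode_onsets([60, 64, 67]))  # C, E and G starting together
```
.fi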
.PP
-nseq
.br
Same as above except it is applied to all channels except the
percussion channel.
.br
.PP
-nseqtokens
.br
Returns the number of distinct sequence elements for each channel.
The channel number and number of distinct elements separated by
a comma is returned in a tab separated list for all active channels
except the percussion channel. Here is an example.
.br
2,3	3,4	4,11	5,6	6,3	7,3	8,6	9,3	11,2	12,1
.br
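.PP
A line of this form can be read back into a table with a short Python
sketch such as the one below. The function name parse_nseqtokens is
hypothetical. The sample string is written with spaces rather than tabs,
which makes no difference because split() accepts any whitespace.
.PP
.nf
```python
def parse_nseqtokens(line):
    """Map channel number -> number of distinct sequence elements."""
    result = {}
    for item in line.split():
        channel, count = item.split(",")
        result[int(channel)] = int(count)
    return result


# The sample line shown above.
sample = "2,3 3,4 4,11 5,6 6,3 7,3 8,6 9,3 11,2 12,1"
tokens = parse_nseqtokens(sample)

# Channel with the largest number of distinct sequence elements.
print(max(tokens, key=tokens.get))
```
.fi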

.PP
-ver
.br
prints the version number.


.SH AUTHOR
Seymour Shlien <fy733@ncf.ca>