summaryrefslogtreecommitdiffstats
path: root/Documentation/driver-api/mtd
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--Documentation/driver-api/mtd/index.rst12
-rw-r--r--Documentation/driver-api/mtd/intel-spi.rst90
-rw-r--r--Documentation/driver-api/mtd/nand_ecc.rst763
-rw-r--r--Documentation/driver-api/mtd/spi-nor.rst66
-rw-r--r--Documentation/driver-api/mtdnand.rst1009
5 files changed, 1940 insertions, 0 deletions
diff --git a/Documentation/driver-api/mtd/index.rst b/Documentation/driver-api/mtd/index.rst
new file mode 100644
index 000000000..436ba5a85
--- /dev/null
+++ b/Documentation/driver-api/mtd/index.rst
@@ -0,0 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Memory Technology Device (MTD)
+==============================
+
+.. toctree::
+ :maxdepth: 1
+
+ intel-spi
+ nand_ecc
+ spi-nor
diff --git a/Documentation/driver-api/mtd/intel-spi.rst b/Documentation/driver-api/mtd/intel-spi.rst
new file mode 100644
index 000000000..0e6d9cd53
--- /dev/null
+++ b/Documentation/driver-api/mtd/intel-spi.rst
@@ -0,0 +1,90 @@
+==============================
+Upgrading BIOS using intel-spi
+==============================
+
+Many Intel CPUs like Baytrail and Braswell include SPI serial flash host
+controller which is used to hold BIOS and other platform specific data.
+Since contents of the SPI serial flash is crucial for machine to function,
+it is typically protected by different hardware protection mechanisms to
+avoid accidental (or on purpose) overwrite of the content.
+
+Not all manufacturers protect the SPI serial flash, mainly because it
+allows upgrading the BIOS image directly from an OS.
+
+The intel-spi driver makes it possible to read and write the SPI serial
+flash, if certain protection bits are not set and locked. If it finds
+any of them set, the whole MTD device is made read-only to prevent
+partial overwrites. By default the driver exposes SPI serial flash
+contents as read-only but it can be changed from kernel command line,
+passing "intel-spi.writeable=1".
+
+Please keep in mind that overwriting the BIOS image on SPI serial flash
+might render the machine unbootable and requires special equipment like
+Dediprog to revive. You have been warned!
+
+Below are the steps how to upgrade MinnowBoard MAX BIOS directly from
+Linux.
+
+ 1) Download and extract the latest Minnowboard MAX BIOS SPI image
+ [1]. At the time writing this the latest image is v92.
+
+ 2) Install mtd-utils package [2]. We need this in order to erase the SPI
+ serial flash. Distros like Debian and Fedora have this prepackaged with
+ name "mtd-utils".
+
+ 3) Add "intel-spi.writeable=1" to the kernel command line and reboot
+ the board (you can also reload the driver passing "writeable=1" as
+ module parameter to modprobe).
+
+ 4) Once the board is up and running again, find the right MTD partition
+ (it is named as "BIOS")::
+
+ # cat /proc/mtd
+ dev: size erasesize name
+ mtd0: 00800000 00001000 "BIOS"
+
+ So here it will be /dev/mtd0 but it may vary.
+
+ 5) Make backup of the existing image first::
+
+ # dd if=/dev/mtd0ro of=bios.bak
+ 16384+0 records in
+ 16384+0 records out
+ 8388608 bytes (8.4 MB) copied, 10.0269 s, 837 kB/s
+
+ 6) Verify the backup:
+
+ # sha1sum /dev/mtd0ro bios.bak
+ fdbb011920572ca6c991377c4b418a0502668b73 /dev/mtd0ro
+ fdbb011920572ca6c991377c4b418a0502668b73 bios.bak
+
+ The SHA1 sums must match. Otherwise do not continue any further!
+
+ 7) Erase the SPI serial flash. After this step, do not reboot the
+ board! Otherwise it will not start anymore::
+
+ # flash_erase /dev/mtd0 0 0
+ Erasing 4 Kibyte @ 7ff000 -- 100 % complete
+
+ 8) Once completed without errors you can write the new BIOS image:
+
+ # dd if=MNW2MAX1.X64.0092.R01.1605221712.bin of=/dev/mtd0
+
+ 9) Verify that the new content of the SPI serial flash matches the new
+ BIOS image::
+
+ # sha1sum /dev/mtd0ro MNW2MAX1.X64.0092.R01.1605221712.bin
+ 9b4df9e4be2057fceec3a5529ec3d950836c87a2 /dev/mtd0ro
+ 9b4df9e4be2057fceec3a5529ec3d950836c87a2 MNW2MAX1.X64.0092.R01.1605221712.bin
+
+ The SHA1 sums should match.
+
+ 10) Now you can reboot your board and observe the new BIOS starting up
+ properly.
+
+References
+----------
+
+[1] https://firmware.intel.com/sites/default/files/MinnowBoard%2EMAX_%2EX64%2E92%2ER01%2Ezip
+
+[2] http://www.linux-mtd.infradead.org/
diff --git a/Documentation/driver-api/mtd/nand_ecc.rst b/Documentation/driver-api/mtd/nand_ecc.rst
new file mode 100644
index 000000000..e8d3c53a5
--- /dev/null
+++ b/Documentation/driver-api/mtd/nand_ecc.rst
@@ -0,0 +1,763 @@
+==========================
+NAND Error-correction Code
+==========================
+
+Introduction
+============
+
+Having looked at the linux mtd/nand driver and more specific at nand_ecc.c
+I felt there was room for optimisation. I bashed the code for a few hours
+performing tricks like table lookup removing superfluous code etc.
+After that the speed was increased by 35-40%.
+Still I was not too happy as I felt there was additional room for improvement.
+
+Bad! I was hooked.
+I decided to annotate my steps in this file. Perhaps it is useful to someone
+or someone learns something from it.
+
+
+The problem
+===========
+
+NAND flash (at least SLC one) typically has sectors of 256 bytes.
+However NAND flash is not extremely reliable so some error detection
+(and sometimes correction) is needed.
+
+This is done by means of a Hamming code. I'll try to explain it in
+laymans terms (and apologies to all the pro's in the field in case I do
+not use the right terminology, my coding theory class was almost 30
+years ago, and I must admit it was not one of my favourites).
+
+As I said before the ecc calculation is performed on sectors of 256
+bytes. This is done by calculating several parity bits over the rows and
+columns. The parity used is even parity which means that the parity bit = 1
+if the data over which the parity is calculated is 1 and the parity bit = 0
+if the data over which the parity is calculated is 0. So the total
+number of bits over the data over which the parity is calculated + the
+parity bit is even. (see wikipedia if you can't follow this).
+Parity is often calculated by means of an exclusive or operation,
+sometimes also referred to as xor. In C the operator for xor is ^
+
+Back to ecc.
+Let's give a small figure:
+
+========= ==== ==== ==== ==== ==== ==== ==== ==== === === === === ====
+byte 0: bit7 bit6 bit5 bit4 bit3 bit2 bit1 bit0 rp0 rp2 rp4 ... rp14
+byte 1: bit7 bit6 bit5 bit4 bit3 bit2 bit1 bit0 rp1 rp2 rp4 ... rp14
+byte 2: bit7 bit6 bit5 bit4 bit3 bit2 bit1 bit0 rp0 rp3 rp4 ... rp14
+byte 3: bit7 bit6 bit5 bit4 bit3 bit2 bit1 bit0 rp1 rp3 rp4 ... rp14
+byte 4: bit7 bit6 bit5 bit4 bit3 bit2 bit1 bit0 rp0 rp2 rp5 ... rp14
+...
+byte 254: bit7 bit6 bit5 bit4 bit3 bit2 bit1 bit0 rp0 rp3 rp5 ... rp15
+byte 255: bit7 bit6 bit5 bit4 bit3 bit2 bit1 bit0 rp1 rp3 rp5 ... rp15
+ cp1 cp0 cp1 cp0 cp1 cp0 cp1 cp0
+ cp3 cp3 cp2 cp2 cp3 cp3 cp2 cp2
+ cp5 cp5 cp5 cp5 cp4 cp4 cp4 cp4
+========= ==== ==== ==== ==== ==== ==== ==== ==== === === === === ====
+
+This figure represents a sector of 256 bytes.
+cp is my abbreviation for column parity, rp for row parity.
+
+Let's start to explain column parity.
+
+- cp0 is the parity that belongs to all bit0, bit2, bit4, bit6.
+
+ so the sum of all bit0, bit2, bit4 and bit6 values + cp0 itself is even.
+
+Similarly cp1 is the sum of all bit1, bit3, bit5 and bit7.
+
+- cp2 is the parity over bit0, bit1, bit4 and bit5
+- cp3 is the parity over bit2, bit3, bit6 and bit7.
+- cp4 is the parity over bit0, bit1, bit2 and bit3.
+- cp5 is the parity over bit4, bit5, bit6 and bit7.
+
+Note that each of cp0 .. cp5 is exactly one bit.
+
+Row parity actually works almost the same.
+
+- rp0 is the parity of all even bytes (0, 2, 4, 6, ... 252, 254)
+- rp1 is the parity of all odd bytes (1, 3, 5, 7, ..., 253, 255)
+- rp2 is the parity of all bytes 0, 1, 4, 5, 8, 9, ...
+ (so handle two bytes, then skip 2 bytes).
+- rp3 is covers the half rp2 does not cover (bytes 2, 3, 6, 7, 10, 11, ...)
+- for rp4 the rule is cover 4 bytes, skip 4 bytes, cover 4 bytes, skip 4 etc.
+
+ so rp4 calculates parity over bytes 0, 1, 2, 3, 8, 9, 10, 11, 16, ...)
+- and rp5 covers the other half, so bytes 4, 5, 6, 7, 12, 13, 14, 15, 20, ..
+
+The story now becomes quite boring. I guess you get the idea.
+
+- rp6 covers 8 bytes then skips 8 etc
+- rp7 skips 8 bytes then covers 8 etc
+- rp8 covers 16 bytes then skips 16 etc
+- rp9 skips 16 bytes then covers 16 etc
+- rp10 covers 32 bytes then skips 32 etc
+- rp11 skips 32 bytes then covers 32 etc
+- rp12 covers 64 bytes then skips 64 etc
+- rp13 skips 64 bytes then covers 64 etc
+- rp14 covers 128 bytes then skips 128
+- rp15 skips 128 bytes then covers 128
+
+In the end the parity bits are grouped together in three bytes as
+follows:
+
+===== ===== ===== ===== ===== ===== ===== ===== =====
+ECC Bit 7 Bit 6 Bit 5 Bit 4 Bit 3 Bit 2 Bit 1 Bit 0
+===== ===== ===== ===== ===== ===== ===== ===== =====
+ECC 0 rp07 rp06 rp05 rp04 rp03 rp02 rp01 rp00
+ECC 1 rp15 rp14 rp13 rp12 rp11 rp10 rp09 rp08
+ECC 2 cp5 cp4 cp3 cp2 cp1 cp0 1 1
+===== ===== ===== ===== ===== ===== ===== ===== =====
+
+I detected after writing this that ST application note AN1823
+(http://www.st.com/stonline/) gives a much
+nicer picture.(but they use line parity as term where I use row parity)
+Oh well, I'm graphically challenged, so suffer with me for a moment :-)
+
+And I could not reuse the ST picture anyway for copyright reasons.
+
+
+Attempt 0
+=========
+
+Implementing the parity calculation is pretty simple.
+In C pseudocode::
+
+ for (i = 0; i < 256; i++)
+ {
+ if (i & 0x01)
+ rp1 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp1;
+ else
+ rp0 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp0;
+ if (i & 0x02)
+ rp3 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp3;
+ else
+ rp2 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp2;
+ if (i & 0x04)
+ rp5 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp5;
+ else
+ rp4 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp4;
+ if (i & 0x08)
+ rp7 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp7;
+ else
+ rp6 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp6;
+ if (i & 0x10)
+ rp9 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp9;
+ else
+ rp8 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp8;
+ if (i & 0x20)
+ rp11 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp11;
+ else
+ rp10 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp10;
+ if (i & 0x40)
+ rp13 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp13;
+ else
+ rp12 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp12;
+ if (i & 0x80)
+ rp15 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp15;
+ else
+ rp14 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ bit3 ^ bit2 ^ bit1 ^ bit0 ^ rp14;
+ cp0 = bit6 ^ bit4 ^ bit2 ^ bit0 ^ cp0;
+ cp1 = bit7 ^ bit5 ^ bit3 ^ bit1 ^ cp1;
+ cp2 = bit5 ^ bit4 ^ bit1 ^ bit0 ^ cp2;
+ cp3 = bit7 ^ bit6 ^ bit3 ^ bit2 ^ cp3
+ cp4 = bit3 ^ bit2 ^ bit1 ^ bit0 ^ cp4
+ cp5 = bit7 ^ bit6 ^ bit5 ^ bit4 ^ cp5
+ }
+
+
+Analysis 0
+==========
+
+C does have bitwise operators but not really operators to do the above
+efficiently (and most hardware has no such instructions either).
+Therefore without implementing this it was clear that the code above was
+not going to bring me a Nobel prize :-)
+
+Fortunately the exclusive or operation is commutative, so we can combine
+the values in any order. So instead of calculating all the bits
+individually, let us try to rearrange things.
+For the column parity this is easy. We can just xor the bytes and in the
+end filter out the relevant bits. This is pretty nice as it will bring
+all cp calculation out of the for loop.
+
+Similarly we can first xor the bytes for the various rows.
+This leads to:
+
+
+Attempt 1
+=========
+
+::
+
+ const char parity[256] = {
+ 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0,
+ 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1,
+ 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1,
+ 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0,
+ 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1,
+ 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0,
+ 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0,
+ 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1,
+ 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1,
+ 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0,
+ 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0,
+ 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1,
+ 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0,
+ 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1,
+ 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1,
+ 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0
+ };
+
+ void ecc1(const unsigned char *buf, unsigned char *code)
+ {
+ int i;
+ const unsigned char *bp = buf;
+ unsigned char cur;
+ unsigned char rp0, rp1, rp2, rp3, rp4, rp5, rp6, rp7;
+ unsigned char rp8, rp9, rp10, rp11, rp12, rp13, rp14, rp15;
+ unsigned char par;
+
+ par = 0;
+ rp0 = 0; rp1 = 0; rp2 = 0; rp3 = 0;
+ rp4 = 0; rp5 = 0; rp6 = 0; rp7 = 0;
+ rp8 = 0; rp9 = 0; rp10 = 0; rp11 = 0;
+ rp12 = 0; rp13 = 0; rp14 = 0; rp15 = 0;
+
+ for (i = 0; i < 256; i++)
+ {
+ cur = *bp++;
+ par ^= cur;
+ if (i & 0x01) rp1 ^= cur; else rp0 ^= cur;
+ if (i & 0x02) rp3 ^= cur; else rp2 ^= cur;
+ if (i & 0x04) rp5 ^= cur; else rp4 ^= cur;
+ if (i & 0x08) rp7 ^= cur; else rp6 ^= cur;
+ if (i & 0x10) rp9 ^= cur; else rp8 ^= cur;
+ if (i & 0x20) rp11 ^= cur; else rp10 ^= cur;
+ if (i & 0x40) rp13 ^= cur; else rp12 ^= cur;
+ if (i & 0x80) rp15 ^= cur; else rp14 ^= cur;
+ }
+ code[0] =
+ (parity[rp7] << 7) |
+ (parity[rp6] << 6) |
+ (parity[rp5] << 5) |
+ (parity[rp4] << 4) |
+ (parity[rp3] << 3) |
+ (parity[rp2] << 2) |
+ (parity[rp1] << 1) |
+ (parity[rp0]);
+ code[1] =
+ (parity[rp15] << 7) |
+ (parity[rp14] << 6) |
+ (parity[rp13] << 5) |
+ (parity[rp12] << 4) |
+ (parity[rp11] << 3) |
+ (parity[rp10] << 2) |
+ (parity[rp9] << 1) |
+ (parity[rp8]);
+ code[2] =
+ (parity[par & 0xf0] << 7) |
+ (parity[par & 0x0f] << 6) |
+ (parity[par & 0xcc] << 5) |
+ (parity[par & 0x33] << 4) |
+ (parity[par & 0xaa] << 3) |
+ (parity[par & 0x55] << 2);
+ code[0] = ~code[0];
+ code[1] = ~code[1];
+ code[2] = ~code[2];
+ }
+
+Still pretty straightforward. The last three invert statements are there to
+give a checksum of 0xff 0xff 0xff for an empty flash. In an empty flash
+all data is 0xff, so the checksum then matches.
+
+I also introduced the parity lookup. I expected this to be the fastest
+way to calculate the parity, but I will investigate alternatives later
+on.
+
+
+Analysis 1
+==========
+
+The code works, but is not terribly efficient. On my system it took
+almost 4 times as much time as the linux driver code. But hey, if it was
+*that* easy this would have been done long before.
+No pain. no gain.
+
+Fortunately there is plenty of room for improvement.
+
+In step 1 we moved from bit-wise calculation to byte-wise calculation.
+However in C we can also use the unsigned long data type and virtually
+every modern microprocessor supports 32 bit operations, so why not try
+to write our code in such a way that we process data in 32 bit chunks.
+
+Of course this means some modification as the row parity is byte by
+byte. A quick analysis:
+for the column parity we use the par variable. When extending to 32 bits
+we can in the end easily calculate rp0 and rp1 from it.
+(because par now consists of 4 bytes, contributing to rp1, rp0, rp1, rp0
+respectively, from MSB to LSB)
+also rp2 and rp3 can be easily retrieved from par as rp3 covers the
+first two MSBs and rp2 covers the last two LSBs.
+
+Note that of course now the loop is executed only 64 times (256/4).
+And note that care must taken wrt byte ordering. The way bytes are
+ordered in a long is machine dependent, and might affect us.
+Anyway, if there is an issue: this code is developed on x86 (to be
+precise: a DELL PC with a D920 Intel CPU)
+
+And of course the performance might depend on alignment, but I expect
+that the I/O buffers in the nand driver are aligned properly (and
+otherwise that should be fixed to get maximum performance).
+
+Let's give it a try...
+
+
+Attempt 2
+=========
+
+::
+
+ extern const char parity[256];
+
+ void ecc2(const unsigned char *buf, unsigned char *code)
+ {
+ int i;
+ const unsigned long *bp = (unsigned long *)buf;
+ unsigned long cur;
+ unsigned long rp0, rp1, rp2, rp3, rp4, rp5, rp6, rp7;
+ unsigned long rp8, rp9, rp10, rp11, rp12, rp13, rp14, rp15;
+ unsigned long par;
+
+ par = 0;
+ rp0 = 0; rp1 = 0; rp2 = 0; rp3 = 0;
+ rp4 = 0; rp5 = 0; rp6 = 0; rp7 = 0;
+ rp8 = 0; rp9 = 0; rp10 = 0; rp11 = 0;
+ rp12 = 0; rp13 = 0; rp14 = 0; rp15 = 0;
+
+ for (i = 0; i < 64; i++)
+ {
+ cur = *bp++;
+ par ^= cur;
+ if (i & 0x01) rp5 ^= cur; else rp4 ^= cur;
+ if (i & 0x02) rp7 ^= cur; else rp6 ^= cur;
+ if (i & 0x04) rp9 ^= cur; else rp8 ^= cur;
+ if (i & 0x08) rp11 ^= cur; else rp10 ^= cur;
+ if (i & 0x10) rp13 ^= cur; else rp12 ^= cur;
+ if (i & 0x20) rp15 ^= cur; else rp14 ^= cur;
+ }
+ /*
+ we need to adapt the code generation for the fact that rp vars are now
+ long; also the column parity calculation needs to be changed.
+ we'll bring rp4 to 15 back to single byte entities by shifting and
+ xoring
+ */
+ rp4 ^= (rp4 >> 16); rp4 ^= (rp4 >> 8); rp4 &= 0xff;
+ rp5 ^= (rp5 >> 16); rp5 ^= (rp5 >> 8); rp5 &= 0xff;
+ rp6 ^= (rp6 >> 16); rp6 ^= (rp6 >> 8); rp6 &= 0xff;
+ rp7 ^= (rp7 >> 16); rp7 ^= (rp7 >> 8); rp7 &= 0xff;
+ rp8 ^= (rp8 >> 16); rp8 ^= (rp8 >> 8); rp8 &= 0xff;
+ rp9 ^= (rp9 >> 16); rp9 ^= (rp9 >> 8); rp9 &= 0xff;
+ rp10 ^= (rp10 >> 16); rp10 ^= (rp10 >> 8); rp10 &= 0xff;
+ rp11 ^= (rp11 >> 16); rp11 ^= (rp11 >> 8); rp11 &= 0xff;
+ rp12 ^= (rp12 >> 16); rp12 ^= (rp12 >> 8); rp12 &= 0xff;
+ rp13 ^= (rp13 >> 16); rp13 ^= (rp13 >> 8); rp13 &= 0xff;
+ rp14 ^= (rp14 >> 16); rp14 ^= (rp14 >> 8); rp14 &= 0xff;
+ rp15 ^= (rp15 >> 16); rp15 ^= (rp15 >> 8); rp15 &= 0xff;
+ rp3 = (par >> 16); rp3 ^= (rp3 >> 8); rp3 &= 0xff;
+ rp2 = par & 0xffff; rp2 ^= (rp2 >> 8); rp2 &= 0xff;
+ par ^= (par >> 16);
+ rp1 = (par >> 8); rp1 &= 0xff;
+ rp0 = (par & 0xff);
+ par ^= (par >> 8); par &= 0xff;
+
+ code[0] =
+ (parity[rp7] << 7) |
+ (parity[rp6] << 6) |
+ (parity[rp5] << 5) |
+ (parity[rp4] << 4) |
+ (parity[rp3] << 3) |
+ (parity[rp2] << 2) |
+ (parity[rp1] << 1) |
+ (parity[rp0]);
+ code[1] =
+ (parity[rp15] << 7) |
+ (parity[rp14] << 6) |
+ (parity[rp13] << 5) |
+ (parity[rp12] << 4) |
+ (parity[rp11] << 3) |
+ (parity[rp10] << 2) |
+ (parity[rp9] << 1) |
+ (parity[rp8]);
+ code[2] =
+ (parity[par & 0xf0] << 7) |
+ (parity[par & 0x0f] << 6) |
+ (parity[par & 0xcc] << 5) |
+ (parity[par & 0x33] << 4) |
+ (parity[par & 0xaa] << 3) |
+ (parity[par & 0x55] << 2);
+ code[0] = ~code[0];
+ code[1] = ~code[1];
+ code[2] = ~code[2];
+ }
+
+The parity array is not shown any more. Note also that for these
+examples I kinda deviated from my regular programming style by allowing
+multiple statements on a line, not using { } in then and else blocks
+with only a single statement and by using operators like ^=
+
+
+Analysis 2
+==========
+
+The code (of course) works, and hurray: we are a little bit faster than
+the linux driver code (about 15%). But wait, don't cheer too quickly.
+There is more to be gained.
+If we look at e.g. rp14 and rp15 we see that we either xor our data with
+rp14 or with rp15. However we also have par which goes over all data.
+This means there is no need to calculate rp14 as it can be calculated from
+rp15 through rp14 = par ^ rp15, because par = rp14 ^ rp15;
+(or if desired we can avoid calculating rp15 and calculate it from
+rp14). That is why some places refer to inverse parity.
+Of course the same thing holds for rp4/5, rp6/7, rp8/9, rp10/11 and rp12/13.
+Effectively this means we can eliminate the else clause from the if
+statements. Also we can optimise the calculation in the end a little bit
+by going from long to byte first. Actually we can even avoid the table
+lookups
+
+Attempt 3
+=========
+
+Odd replaced::
+
+ if (i & 0x01) rp5 ^= cur; else rp4 ^= cur;
+ if (i & 0x02) rp7 ^= cur; else rp6 ^= cur;
+ if (i & 0x04) rp9 ^= cur; else rp8 ^= cur;
+ if (i & 0x08) rp11 ^= cur; else rp10 ^= cur;
+ if (i & 0x10) rp13 ^= cur; else rp12 ^= cur;
+ if (i & 0x20) rp15 ^= cur; else rp14 ^= cur;
+
+with::
+
+ if (i & 0x01) rp5 ^= cur;
+ if (i & 0x02) rp7 ^= cur;
+ if (i & 0x04) rp9 ^= cur;
+ if (i & 0x08) rp11 ^= cur;
+ if (i & 0x10) rp13 ^= cur;
+ if (i & 0x20) rp15 ^= cur;
+
+and outside the loop added::
+
+ rp4 = par ^ rp5;
+ rp6 = par ^ rp7;
+ rp8 = par ^ rp9;
+ rp10 = par ^ rp11;
+ rp12 = par ^ rp13;
+ rp14 = par ^ rp15;
+
+And after that the code takes about 30% more time, although the number of
+statements is reduced. This is also reflected in the assembly code.
+
+
+Analysis 3
+==========
+
+Very weird. Guess it has to do with caching or instruction parallellism
+or so. I also tried on an eeePC (Celeron, clocked at 900 Mhz). Interesting
+observation was that this one is only 30% slower (according to time)
+executing the code as my 3Ghz D920 processor.
+
+Well, it was expected not to be easy so maybe instead move to a
+different track: let's move back to the code from attempt2 and do some
+loop unrolling. This will eliminate a few if statements. I'll try
+different amounts of unrolling to see what works best.
+
+
+Attempt 4
+=========
+
+Unrolled the loop 1, 2, 3 and 4 times.
+For 4 the code starts with::
+
+ for (i = 0; i < 4; i++)
+ {
+ cur = *bp++;
+ par ^= cur;
+ rp4 ^= cur;
+ rp6 ^= cur;
+ rp8 ^= cur;
+ rp10 ^= cur;
+ if (i & 0x1) rp13 ^= cur; else rp12 ^= cur;
+ if (i & 0x2) rp15 ^= cur; else rp14 ^= cur;
+ cur = *bp++;
+ par ^= cur;
+ rp5 ^= cur;
+ rp6 ^= cur;
+ ...
+
+
+Analysis 4
+==========
+
+Unrolling once gains about 15%
+
+Unrolling twice keeps the gain at about 15%
+
+Unrolling three times gives a gain of 30% compared to attempt 2.
+
+Unrolling four times gives a marginal improvement compared to unrolling
+three times.
+
+I decided to proceed with a four time unrolled loop anyway. It was my gut
+feeling that in the next steps I would obtain additional gain from it.
+
+The next step was triggered by the fact that par contains the xor of all
+bytes and rp4 and rp5 each contain the xor of half of the bytes.
+So in effect par = rp4 ^ rp5. But as xor is commutative we can also say
+that rp5 = par ^ rp4. So no need to keep both rp4 and rp5 around. We can
+eliminate rp5 (or rp4, but I already foresaw another optimisation).
+The same holds for rp6/7, rp8/9, rp10/11 rp12/13 and rp14/15.
+
+
+Attempt 5
+=========
+
+Effectively so all odd digit rp assignments in the loop were removed.
+This included the else clause of the if statements.
+Of course after the loop we need to correct things by adding code like::
+
+ rp5 = par ^ rp4;
+
+Also the initial assignments (rp5 = 0; etc) could be removed.
+Along the line I also removed the initialisation of rp0/1/2/3.
+
+
+Analysis 5
+==========
+
+Measurements showed this was a good move. The run-time roughly halved
+compared with attempt 4 with 4 times unrolled, and we only require 1/3rd
+of the processor time compared to the current code in the linux kernel.
+
+However, still I thought there was more. I didn't like all the if
+statements. Why not keep a running parity and only keep the last if
+statement. Time for yet another version!
+
+
+Attempt 6
+=========
+
+THe code within the for loop was changed to::
+
+ for (i = 0; i < 4; i++)
+ {
+ cur = *bp++; tmppar = cur; rp4 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp6 ^= tmppar;
+ cur = *bp++; tmppar ^= cur; rp4 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp8 ^= tmppar;
+
+ cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp6 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp6 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp4 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp10 ^= tmppar;
+
+ cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp6 ^= cur; rp8 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp6 ^= cur; rp8 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp8 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp8 ^= cur;
+
+ cur = *bp++; tmppar ^= cur; rp4 ^= cur; rp6 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp6 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp4 ^= cur;
+ cur = *bp++; tmppar ^= cur;
+
+ par ^= tmppar;
+ if ((i & 0x1) == 0) rp12 ^= tmppar;
+ if ((i & 0x2) == 0) rp14 ^= tmppar;
+ }
+
+As you can see tmppar is used to accumulate the parity within a for
+iteration. In the last 3 statements is added to par and, if needed,
+to rp12 and rp14.
+
+While making the changes I also found that I could exploit that tmppar
+contains the running parity for this iteration. So instead of having:
+rp4 ^= cur; rp6 ^= cur;
+I removed the rp6 ^= cur; statement and did rp6 ^= tmppar; on next
+statement. A similar change was done for rp8 and rp10
+
+
+Analysis 6
+==========
+
+Measuring this code again showed big gain. When executing the original
+linux code 1 million times, this took about 1 second on my system.
+(using time to measure the performance). After this iteration I was back
+to 0.075 sec. Actually I had to decide to start measuring over 10
+million iterations in order not to lose too much accuracy. This one
+definitely seemed to be the jackpot!
+
+There is a little bit more room for improvement though. There are three
+places with statements::
+
+ rp4 ^= cur; rp6 ^= cur;
+
+It seems more efficient to also maintain a variable rp4_6 in the while
+loop; This eliminates 3 statements per loop. Of course after the loop we
+need to correct by adding::
+
+ rp4 ^= rp4_6;
+ rp6 ^= rp4_6
+
+Furthermore there are 4 sequential assignments to rp8. This can be
+encoded slightly more efficiently by saving tmppar before those 4 lines
+and later do rp8 = rp8 ^ tmppar ^ notrp8;
+(where notrp8 is the value of rp8 before those 4 lines).
+Again a use of the commutative property of xor.
+Time for a new test!
+
+
+Attempt 7
+=========
+
+The new code now looks like::
+
+ for (i = 0; i < 4; i++)
+ {
+ cur = *bp++; tmppar = cur; rp4 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp6 ^= tmppar;
+ cur = *bp++; tmppar ^= cur; rp4 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp8 ^= tmppar;
+
+ cur = *bp++; tmppar ^= cur; rp4_6 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp6 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp4 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp10 ^= tmppar;
+
+ notrp8 = tmppar;
+ cur = *bp++; tmppar ^= cur; rp4_6 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp6 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp4 ^= cur;
+ cur = *bp++; tmppar ^= cur;
+ rp8 = rp8 ^ tmppar ^ notrp8;
+
+ cur = *bp++; tmppar ^= cur; rp4_6 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp6 ^= cur;
+ cur = *bp++; tmppar ^= cur; rp4 ^= cur;
+ cur = *bp++; tmppar ^= cur;
+
+ par ^= tmppar;
+ if ((i & 0x1) == 0) rp12 ^= tmppar;
+ if ((i & 0x2) == 0) rp14 ^= tmppar;
+ }
+ rp4 ^= rp4_6;
+ rp6 ^= rp4_6;
+
+
+Not a big change, but every penny counts :-)
+
+
+Analysis 7
+==========
+
+Actually this made things worse. Not very much, but I don't want to move
+into the wrong direction. Maybe something to investigate later. Could
+have to do with caching again.
+
+Guess that is what there is to win within the loop. Maybe unrolling one
+more time will help. I'll keep the optimisations from 7 for now.
+
+
+Attempt 8
+=========
+
+Unrolled the loop one more time.
+
+
+Analysis 8
+==========
+
+This makes things worse. Let's stick with attempt 6 and continue from there.
+Although it seems that the code within the loop cannot be optimised
+further there is still room to optimize the generation of the ecc codes.
+We can simply calculate the total parity. If this is 0 then rp4 = rp5
+etc. If the parity is 1, then rp4 = !rp5;
+
+But if rp4 = rp5 we do not need rp5 etc. We can just write the even bits
+in the result byte and then do something like::
+
+ code[0] |= (code[0] << 1);
+
+Lets test this.
+
+
+Attempt 9
+=========
+
+Changed the code but again this slightly degrades performance. Tried all
+kind of other things, like having dedicated parity arrays to avoid the
+shift after parity[rp7] << 7; No gain.
+Change the lookup using the parity array by using shift operators (e.g.
+replace parity[rp7] << 7 with::
+
+ rp7 ^= (rp7 << 4);
+ rp7 ^= (rp7 << 2);
+ rp7 ^= (rp7 << 1);
+ rp7 &= 0x80;
+
+No gain.
+
+The only marginal change was inverting the parity bits, so we can remove
+the last three invert statements.
+
+Ah well, pity this does not deliver more. Then again 10 million
+iterations using the linux driver code takes between 13 and 13.5
+seconds, whereas my code now takes about 0.73 seconds for those 10
+million iterations. So basically I've improved the performance by a
+factor 18 on my system. Not that bad. Of course on different hardware
+you will get different results. No warranties!
+
+But of course there is no such thing as a free lunch. The codesize almost
+tripled (from 562 bytes to 1434 bytes). Then again, it is not that much.
+
+
+Correcting errors
+=================
+
+For correcting errors I again used the ST application note as a starter,
+but I also peeked at the existing code.
+
+The algorithm itself is pretty straightforward. Just xor the given and
+the calculated ecc. If all bytes are 0 there is no problem. If 11 bits
+are 1 we have one correctable bit error. If there is 1 bit 1, we have an
+error in the given ecc code.
+
+It proved to be fastest to do some table lookups. Performance gain
+introduced by this is about a factor 2 on my system when a repair had to
+be done, and 1% or so if no repair had to be done.
+
+Code size increased from 330 bytes to 686 bytes for this function.
+(gcc 4.2, -O3)
+
+
+Conclusion
+==========
+
+The gain when calculating the ecc is tremendous. Om my development hardware
+a speedup of a factor of 18 for ecc calculation was achieved. On a test on an
+embedded system with a MIPS core a factor 7 was obtained.
+
+On a test with a Linksys NSLU2 (ARMv5TE processor) the speedup was a factor
+5 (big endian mode, gcc 4.1.2, -O3)
+
+For correction not much gain could be obtained (as bitflips are rare). Then
+again there are also much less cycles spent there.
+
+It seems there is not much more gain possible in this, at least when
+programmed in C. Of course it might be possible to squeeze something more
+out of it with an assembler program, but due to pipeline behaviour etc
+this is very tricky (at least for intel hw).
+
+Author: Frans Meulenbroeks
+
+Copyright (C) 2008 Koninklijke Philips Electronics NV.
diff --git a/Documentation/driver-api/mtd/spi-nor.rst b/Documentation/driver-api/mtd/spi-nor.rst
new file mode 100644
index 000000000..1f0437676
--- /dev/null
+++ b/Documentation/driver-api/mtd/spi-nor.rst
@@ -0,0 +1,66 @@
+=================
+SPI NOR framework
+=================
+
+Part I - Why do we need this framework?
+---------------------------------------
+
+SPI bus controllers (drivers/spi/) only deal with streams of bytes; the bus
+controller operates agnostic of the specific device attached. However, some
+controllers (such as Freescale's QuadSPI controller) cannot easily handle
+arbitrary streams of bytes, but rather are designed specifically for SPI NOR.
+
+In particular, Freescale's QuadSPI controller must know the NOR commands to
+find the right LUT sequence. Unfortunately, the SPI subsystem has no notion of
+opcodes, addresses, or data payloads; a SPI controller simply knows to send or
+receive bytes (Tx and Rx). Therefore, we must define a new layering scheme under
+which the controller driver is aware of the opcodes, addressing, and other
+details of the SPI NOR protocol.
+
+Part II - How does the framework work?
+--------------------------------------
+
+This framework just adds a new layer between the MTD and the SPI bus driver.
+With this new layer, the SPI NOR controller driver does not depend on the
+m25p80 code anymore.
+
+Before this framework, the layer is like::
+
+ MTD
+ ------------------------
+ m25p80
+ ------------------------
+ SPI bus driver
+ ------------------------
+ SPI NOR chip
+
+ After this framework, the layer is like:
+ MTD
+ ------------------------
+ SPI NOR framework
+ ------------------------
+ m25p80
+ ------------------------
+ SPI bus driver
+ ------------------------
+ SPI NOR chip
+
+ With the SPI NOR controller driver (Freescale QuadSPI), it looks like:
+ MTD
+ ------------------------
+ SPI NOR framework
+ ------------------------
+ fsl-quadSPI
+ ------------------------
+ SPI NOR chip
+
+Part III - How can drivers use the framework?
+---------------------------------------------
+
+The main API is spi_nor_scan(). Before you call the hook, a driver should
+initialize the necessary fields for spi_nor{}. Please see
+drivers/mtd/spi-nor/spi-nor.c for detail. Please also refer to spi-fsl-qspi.c
+when you want to write a new driver for a SPI NOR controller.
+Another API is spi_nor_restore(), this is used to restore the status of SPI
+flash chip such as addressing mode. Call it whenever detach the driver from
+device or reboot the system.
diff --git a/Documentation/driver-api/mtdnand.rst b/Documentation/driver-api/mtdnand.rst
new file mode 100644
index 000000000..0bf8d6ec3
--- /dev/null
+++ b/Documentation/driver-api/mtdnand.rst
@@ -0,0 +1,1009 @@
+=====================================
+MTD NAND Driver Programming Interface
+=====================================
+
+:Author: Thomas Gleixner
+
+Introduction
+============
+
+The generic NAND driver supports almost all NAND and AG-AND based chips
+and connects them to the Memory Technology Devices (MTD) subsystem of
+the Linux Kernel.
+
+This documentation is provided for developers who want to implement
+board drivers or filesystem drivers suitable for NAND devices.
+
+Known Bugs And Assumptions
+==========================
+
+None.
+
+Documentation hints
+===================
+
+The function and structure docs are autogenerated. Each function and
+struct member has a short description which is marked with an [XXX]
+identifier. The following chapters explain the meaning of those
+identifiers.
+
+Function identifiers [XXX]
+--------------------------
+
+The functions are marked with [XXX] identifiers in the short comment.
+The identifiers explain the usage and scope of the functions. Following
+identifiers are used:
+
+- [MTD Interface]
+
+ These functions provide the interface to the MTD kernel API. They are
+ not replaceable and provide functionality which is complete hardware
+ independent.
+
+- [NAND Interface]
+
+ These functions are exported and provide the interface to the NAND
+ kernel API.
+
+- [GENERIC]
+
+ Generic functions are not replaceable and provide functionality which
+ is complete hardware independent.
+
+- [DEFAULT]
+
+ Default functions provide hardware related functionality which is
+ suitable for most of the implementations. These functions can be
+ replaced by the board driver if necessary. Those functions are called
+ via pointers in the NAND chip description structure. The board driver
+ can set the functions which should be replaced by board dependent
+ functions before calling nand_scan(). If the function pointer is
+ NULL on entry to nand_scan() then the pointer is set to the default
+ function which is suitable for the detected chip type.
+
+Struct member identifiers [XXX]
+-------------------------------
+
+The struct members are marked with [XXX] identifiers in the comment. The
+identifiers explain the usage and scope of the members. Following
+identifiers are used:
+
+- [INTERN]
+
+ These members are for NAND driver internal use only and must not be
+ modified. Most of these values are calculated from the chip geometry
+ information which is evaluated during nand_scan().
+
+- [REPLACEABLE]
+
+ Replaceable members hold hardware related functions which can be
+ provided by the board driver. The board driver can set the functions
+ which should be replaced by board dependent functions before calling
+ nand_scan(). If the function pointer is NULL on entry to
+ nand_scan() then the pointer is set to the default function which is
+ suitable for the detected chip type.
+
+- [BOARDSPECIFIC]
+
+ Board specific members hold hardware related information which must
+ be provided by the board driver. The board driver must set the
+ function pointers and datafields before calling nand_scan().
+
+- [OPTIONAL]
+
+ Optional members can hold information relevant for the board driver.
+ The generic NAND driver code does not use this information.
+
+Basic board driver
+==================
+
+For most boards it will be sufficient to provide just the basic
+functions and fill out some really board dependent members in the nand
+chip description structure.
+
+Basic defines
+-------------
+
+At least you have to provide a nand_chip structure and a storage for
+the ioremap'ed chip address. You can allocate the nand_chip structure
+using kmalloc or you can allocate it statically. The NAND chip structure
+embeds an mtd structure which will be registered to the MTD subsystem.
+You can extract a pointer to the mtd structure from a nand_chip pointer
+using the nand_to_mtd() helper.
+
+Kmalloc based example
+
+::
+
+ static struct mtd_info *board_mtd;
+ static void __iomem *baseaddr;
+
+
+Static example
+
+::
+
+ static struct nand_chip board_chip;
+ static void __iomem *baseaddr;
+
+
+Partition defines
+-----------------
+
+If you want to divide your device into partitions, then define a
+partitioning scheme suitable to your board.
+
+::
+
+ #define NUM_PARTITIONS 2
+ static struct mtd_partition partition_info[] = {
+ { .name = "Flash partition 1",
+ .offset = 0,
+ .size = 8 * 1024 * 1024 },
+ { .name = "Flash partition 2",
+ .offset = MTDPART_OFS_NEXT,
+ .size = MTDPART_SIZ_FULL },
+ };
+
+
+Hardware control function
+-------------------------
+
+The hardware control function provides access to the control pins of the
+NAND chip(s). The access can be done by GPIO pins or by address lines.
+If you use address lines, make sure that the timing requirements are
+met.
+
+*GPIO based example*
+
+::
+
+ static void board_hwcontrol(struct mtd_info *mtd, int cmd)
+ {
+ switch(cmd){
+ case NAND_CTL_SETCLE: /* Set CLE pin high */ break;
+ case NAND_CTL_CLRCLE: /* Set CLE pin low */ break;
+ case NAND_CTL_SETALE: /* Set ALE pin high */ break;
+ case NAND_CTL_CLRALE: /* Set ALE pin low */ break;
+ case NAND_CTL_SETNCE: /* Set nCE pin low */ break;
+ case NAND_CTL_CLRNCE: /* Set nCE pin high */ break;
+ }
+ }
+
+
+*Address lines based example.* It's assumed that the nCE pin is driven
+by a chip select decoder.
+
+::
+
+ static void board_hwcontrol(struct mtd_info *mtd, int cmd)
+ {
+ struct nand_chip *this = mtd_to_nand(mtd);
+ switch(cmd){
+ case NAND_CTL_SETCLE: this->legacy.IO_ADDR_W |= CLE_ADRR_BIT; break;
+ case NAND_CTL_CLRCLE: this->legacy.IO_ADDR_W &= ~CLE_ADRR_BIT; break;
+ case NAND_CTL_SETALE: this->legacy.IO_ADDR_W |= ALE_ADRR_BIT; break;
+ case NAND_CTL_CLRALE: this->legacy.IO_ADDR_W &= ~ALE_ADRR_BIT; break;
+ }
+ }
+
+
+Device ready function
+---------------------
+
+If the hardware interface has the ready busy pin of the NAND chip
+connected to a GPIO or other accessible I/O pin, this function is used
+to read back the state of the pin. The function has no arguments and
+should return 0, if the device is busy (R/B pin is low) and 1, if the
+device is ready (R/B pin is high). If the hardware interface does not
+give access to the ready busy pin, then the function must not be defined
+and the function pointer this->legacy.dev_ready is set to NULL.
+
+Init function
+-------------
+
+The init function allocates memory and sets up all the board specific
+parameters and function pointers. When everything is set up nand_scan()
+is called. This function tries to detect and identify then chip. If a
+chip is found all the internal data fields are initialized accordingly.
+The structure(s) have to be zeroed out first and then filled with the
+necessary information about the device.
+
+::
+
+ static int __init board_init (void)
+ {
+ struct nand_chip *this;
+ int err = 0;
+
+ /* Allocate memory for MTD device structure and private data */
+ this = kzalloc(sizeof(struct nand_chip), GFP_KERNEL);
+ if (!this) {
+ printk ("Unable to allocate NAND MTD device structure.\n");
+ err = -ENOMEM;
+ goto out;
+ }
+
+ board_mtd = nand_to_mtd(this);
+
+ /* map physical address */
+ baseaddr = ioremap(CHIP_PHYSICAL_ADDRESS, 1024);
+ if (!baseaddr) {
+ printk("Ioremap to access NAND chip failed\n");
+ err = -EIO;
+ goto out_mtd;
+ }
+
+ /* Set address of NAND IO lines */
+ this->legacy.IO_ADDR_R = baseaddr;
+ this->legacy.IO_ADDR_W = baseaddr;
+ /* Reference hardware control function */
+ this->hwcontrol = board_hwcontrol;
+ /* Set command delay time, see datasheet for correct value */
+ this->legacy.chip_delay = CHIP_DEPENDEND_COMMAND_DELAY;
+ /* Assign the device ready function, if available */
+ this->legacy.dev_ready = board_dev_ready;
+ this->eccmode = NAND_ECC_SOFT;
+
+ /* Scan to find existence of the device */
+ if (nand_scan (this, 1)) {
+ err = -ENXIO;
+ goto out_ior;
+ }
+
+ add_mtd_partitions(board_mtd, partition_info, NUM_PARTITIONS);
+ goto out;
+
+ out_ior:
+ iounmap(baseaddr);
+ out_mtd:
+ kfree (this);
+ out:
+ return err;
+ }
+ module_init(board_init);
+
+
+Exit function
+-------------
+
+The exit function is only necessary if the driver is compiled as a
+module. It releases all resources which are held by the chip driver and
+unregisters the partitions in the MTD layer.
+
+::
+
+ #ifdef MODULE
+ static void __exit board_cleanup (void)
+ {
+ /* Unregister device */
+ WARN_ON(mtd_device_unregister(board_mtd));
+ /* Release resources */
+ nand_cleanup(mtd_to_nand(board_mtd));
+
+ /* unmap physical address */
+ iounmap(baseaddr);
+
+ /* Free the MTD device structure */
+ kfree (mtd_to_nand(board_mtd));
+ }
+ module_exit(board_cleanup);
+ #endif
+
+
+Advanced board driver functions
+===============================
+
+This chapter describes the advanced functionality of the NAND driver.
+For a list of functions which can be overridden by the board driver see
+the documentation of the nand_chip structure.
+
+Multiple chip control
+---------------------
+
+The nand driver can control chip arrays. Therefore the board driver must
+provide an own select_chip function. This function must (de)select the
+requested chip. The function pointer in the nand_chip structure must be
+set before calling nand_scan(). The maxchip parameter of nand_scan()
+defines the maximum number of chips to scan for. Make sure that the
+select_chip function can handle the requested number of chips.
+
+The nand driver concatenates the chips to one virtual chip and provides
+this virtual chip to the MTD layer.
+
+*Note: The driver can only handle linear chip arrays of equally sized
+chips. There is no support for parallel arrays which extend the
+buswidth.*
+
+*GPIO based example*
+
+::
+
+ static void board_select_chip (struct mtd_info *mtd, int chip)
+ {
+ /* Deselect all chips, set all nCE pins high */
+ GPIO(BOARD_NAND_NCE) |= 0xff;
+ if (chip >= 0)
+ GPIO(BOARD_NAND_NCE) &= ~ (1 << chip);
+ }
+
+
+*Address lines based example.* Its assumed that the nCE pins are
+connected to an address decoder.
+
+::
+
+ static void board_select_chip (struct mtd_info *mtd, int chip)
+ {
+ struct nand_chip *this = mtd_to_nand(mtd);
+
+ /* Deselect all chips */
+ this->legacy.IO_ADDR_R &= ~BOARD_NAND_ADDR_MASK;
+ this->legacy.IO_ADDR_W &= ~BOARD_NAND_ADDR_MASK;
+ switch (chip) {
+ case 0:
+ this->legacy.IO_ADDR_R |= BOARD_NAND_ADDR_CHIP0;
+ this->legacy.IO_ADDR_W |= BOARD_NAND_ADDR_CHIP0;
+ break;
+ ....
+ case n:
+ this->legacy.IO_ADDR_R |= BOARD_NAND_ADDR_CHIPn;
+ this->legacy.IO_ADDR_W |= BOARD_NAND_ADDR_CHIPn;
+ break;
+ }
+ }
+
+
+Hardware ECC support
+--------------------
+
+Functions and constants
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The nand driver supports three different types of hardware ECC.
+
+- NAND_ECC_HW3_256
+
+ Hardware ECC generator providing 3 bytes ECC per 256 byte.
+
+- NAND_ECC_HW3_512
+
+ Hardware ECC generator providing 3 bytes ECC per 512 byte.
+
+- NAND_ECC_HW6_512
+
+ Hardware ECC generator providing 6 bytes ECC per 512 byte.
+
+- NAND_ECC_HW8_512
+
+ Hardware ECC generator providing 8 bytes ECC per 512 byte.
+
+If your hardware generator has a different functionality add it at the
+appropriate place in nand_base.c
+
+The board driver must provide following functions:
+
+- enable_hwecc
+
+ This function is called before reading / writing to the chip. Reset
+ or initialize the hardware generator in this function. The function
+ is called with an argument which let you distinguish between read and
+ write operations.
+
+- calculate_ecc
+
+ This function is called after read / write from / to the chip.
+ Transfer the ECC from the hardware to the buffer. If the option
+ NAND_HWECC_SYNDROME is set then the function is only called on
+ write. See below.
+
+- correct_data
+
+ In case of an ECC error this function is called for error detection
+ and correction. Return 1 respectively 2 in case the error can be
+ corrected. If the error is not correctable return -1. If your
+ hardware generator matches the default algorithm of the nand_ecc
+ software generator then use the correction function provided by
+ nand_ecc instead of implementing duplicated code.
+
+Hardware ECC with syndrome calculation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Many hardware ECC implementations provide Reed-Solomon codes and
+calculate an error syndrome on read. The syndrome must be converted to a
+standard Reed-Solomon syndrome before calling the error correction code
+in the generic Reed-Solomon library.
+
+The ECC bytes must be placed immediately after the data bytes in order
+to make the syndrome generator work. This is contrary to the usual
+layout used by software ECC. The separation of data and out of band area
+is not longer possible. The nand driver code handles this layout and the
+remaining free bytes in the oob area are managed by the autoplacement
+code. Provide a matching oob-layout in this case. See rts_from4.c and
+diskonchip.c for implementation reference. In those cases we must also
+use bad block tables on FLASH, because the ECC layout is interfering
+with the bad block marker positions. See bad block table support for
+details.
+
+Bad block table support
+-----------------------
+
+Most NAND chips mark the bad blocks at a defined position in the spare
+area. Those blocks must not be erased under any circumstances as the bad
+block information would be lost. It is possible to check the bad block
+mark each time when the blocks are accessed by reading the spare area of
+the first page in the block. This is time consuming so a bad block table
+is used.
+
+The nand driver supports various types of bad block tables.
+
+- Per device
+
+ The bad block table contains all bad block information of the device
+ which can consist of multiple chips.
+
+- Per chip
+
+ A bad block table is used per chip and contains the bad block
+ information for this particular chip.
+
+- Fixed offset
+
+ The bad block table is located at a fixed offset in the chip
+ (device). This applies to various DiskOnChip devices.
+
+- Automatic placed
+
+ The bad block table is automatically placed and detected either at
+ the end or at the beginning of a chip (device)
+
+- Mirrored tables
+
+ The bad block table is mirrored on the chip (device) to allow updates
+ of the bad block table without data loss.
+
+nand_scan() calls the function nand_default_bbt().
+nand_default_bbt() selects appropriate default bad block table
+descriptors depending on the chip information which was retrieved by
+nand_scan().
+
+The standard policy is scanning the device for bad blocks and build a
+ram based bad block table which allows faster access than always
+checking the bad block information on the flash chip itself.
+
+Flash based tables
+~~~~~~~~~~~~~~~~~~
+
+It may be desired or necessary to keep a bad block table in FLASH. For
+AG-AND chips this is mandatory, as they have no factory marked bad
+blocks. They have factory marked good blocks. The marker pattern is
+erased when the block is erased to be reused. So in case of powerloss
+before writing the pattern back to the chip this block would be lost and
+added to the bad blocks. Therefore we scan the chip(s) when we detect
+them the first time for good blocks and store this information in a bad
+block table before erasing any of the blocks.
+
+The blocks in which the tables are stored are protected against
+accidental access by marking them bad in the memory bad block table. The
+bad block table management functions are allowed to circumvent this
+protection.
+
+The simplest way to activate the FLASH based bad block table support is
+to set the option NAND_BBT_USE_FLASH in the bbt_option field of the
+nand chip structure before calling nand_scan(). For AG-AND chips is
+this done by default. This activates the default FLASH based bad block
+table functionality of the NAND driver. The default bad block table
+options are
+
+- Store bad block table per chip
+
+- Use 2 bits per block
+
+- Automatic placement at the end of the chip
+
+- Use mirrored tables with version numbers
+
+- Reserve 4 blocks at the end of the chip
+
+User defined tables
+~~~~~~~~~~~~~~~~~~~
+
+User defined tables are created by filling out a nand_bbt_descr
+structure and storing the pointer in the nand_chip structure member
+bbt_td before calling nand_scan(). If a mirror table is necessary a
+second structure must be created and a pointer to this structure must be
+stored in bbt_md inside the nand_chip structure. If the bbt_md member
+is set to NULL then only the main table is used and no scan for the
+mirrored table is performed.
+
+The most important field in the nand_bbt_descr structure is the
+options field. The options define most of the table properties. Use the
+predefined constants from rawnand.h to define the options.
+
+- Number of bits per block
+
+ The supported number of bits is 1, 2, 4, 8.
+
+- Table per chip
+
+ Setting the constant NAND_BBT_PERCHIP selects that a bad block
+ table is managed for each chip in a chip array. If this option is not
+ set then a per device bad block table is used.
+
+- Table location is absolute
+
+ Use the option constant NAND_BBT_ABSPAGE and define the absolute
+ page number where the bad block table starts in the field pages. If
+ you have selected bad block tables per chip and you have a multi chip
+ array then the start page must be given for each chip in the chip
+ array. Note: there is no scan for a table ident pattern performed, so
+ the fields pattern, veroffs, offs, len can be left uninitialized
+
+- Table location is automatically detected
+
+ The table can either be located in the first or the last good blocks
+ of the chip (device). Set NAND_BBT_LASTBLOCK to place the bad block
+ table at the end of the chip (device). The bad block tables are
+ marked and identified by a pattern which is stored in the spare area
+ of the first page in the block which holds the bad block table. Store
+ a pointer to the pattern in the pattern field. Further the length of
+ the pattern has to be stored in len and the offset in the spare area
+ must be given in the offs member of the nand_bbt_descr structure.
+ For mirrored bad block tables different patterns are mandatory.
+
+- Table creation
+
+ Set the option NAND_BBT_CREATE to enable the table creation if no
+ table can be found during the scan. Usually this is done only once if
+ a new chip is found.
+
+- Table write support
+
+ Set the option NAND_BBT_WRITE to enable the table write support.
+ This allows the update of the bad block table(s) in case a block has
+ to be marked bad due to wear. The MTD interface function
+ block_markbad is calling the update function of the bad block table.
+ If the write support is enabled then the table is updated on FLASH.
+
+ Note: Write support should only be enabled for mirrored tables with
+ version control.
+
+- Table version control
+
+ Set the option NAND_BBT_VERSION to enable the table version
+ control. It's highly recommended to enable this for mirrored tables
+ with write support. It makes sure that the risk of losing the bad
+ block table information is reduced to the loss of the information
+ about the one worn out block which should be marked bad. The version
+ is stored in 4 consecutive bytes in the spare area of the device. The
+ position of the version number is defined by the member veroffs in
+ the bad block table descriptor.
+
+- Save block contents on write
+
+ In case that the block which holds the bad block table does contain
+ other useful information, set the option NAND_BBT_SAVECONTENT. When
+ the bad block table is written then the whole block is read the bad
+ block table is updated and the block is erased and everything is
+ written back. If this option is not set only the bad block table is
+ written and everything else in the block is ignored and erased.
+
+- Number of reserved blocks
+
+ For automatic placement some blocks must be reserved for bad block
+ table storage. The number of reserved blocks is defined in the
+ maxblocks member of the bad block table description structure.
+ Reserving 4 blocks for mirrored tables should be a reasonable number.
+ This also limits the number of blocks which are scanned for the bad
+ block table ident pattern.
+
+Spare area (auto)placement
+--------------------------
+
+The nand driver implements different possibilities for placement of
+filesystem data in the spare area,
+
+- Placement defined by fs driver
+
+- Automatic placement
+
+The default placement function is automatic placement. The nand driver
+has built in default placement schemes for the various chiptypes. If due
+to hardware ECC functionality the default placement does not fit then
+the board driver can provide a own placement scheme.
+
+File system drivers can provide a own placement scheme which is used
+instead of the default placement scheme.
+
+Placement schemes are defined by a nand_oobinfo structure
+
+::
+
+ struct nand_oobinfo {
+ int useecc;
+ int eccbytes;
+ int eccpos[24];
+ int oobfree[8][2];
+ };
+
+
+- useecc
+
+ The useecc member controls the ecc and placement function. The header
+ file include/mtd/mtd-abi.h contains constants to select ecc and
+ placement. MTD_NANDECC_OFF switches off the ecc complete. This is
+ not recommended and available for testing and diagnosis only.
+ MTD_NANDECC_PLACE selects caller defined placement,
+ MTD_NANDECC_AUTOPLACE selects automatic placement.
+
+- eccbytes
+
+ The eccbytes member defines the number of ecc bytes per page.
+
+- eccpos
+
+ The eccpos array holds the byte offsets in the spare area where the
+ ecc codes are placed.
+
+- oobfree
+
+ The oobfree array defines the areas in the spare area which can be
+ used for automatic placement. The information is given in the format
+ {offset, size}. offset defines the start of the usable area, size the
+ length in bytes. More than one area can be defined. The list is
+ terminated by an {0, 0} entry.
+
+Placement defined by fs driver
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The calling function provides a pointer to a nand_oobinfo structure
+which defines the ecc placement. For writes the caller must provide a
+spare area buffer along with the data buffer. The spare area buffer size
+is (number of pages) \* (size of spare area). For reads the buffer size
+is (number of pages) \* ((size of spare area) + (number of ecc steps per
+page) \* sizeof (int)). The driver stores the result of the ecc check
+for each tuple in the spare buffer. The storage sequence is::
+
+ <spare data page 0><ecc result 0>...<ecc result n>
+
+ ...
+
+ <spare data page n><ecc result 0>...<ecc result n>
+
+This is a legacy mode used by YAFFS1.
+
+If the spare area buffer is NULL then only the ECC placement is done
+according to the given scheme in the nand_oobinfo structure.
+
+Automatic placement
+~~~~~~~~~~~~~~~~~~~
+
+Automatic placement uses the built in defaults to place the ecc bytes in
+the spare area. If filesystem data have to be stored / read into the
+spare area then the calling function must provide a buffer. The buffer
+size per page is determined by the oobfree array in the nand_oobinfo
+structure.
+
+If the spare area buffer is NULL then only the ECC placement is done
+according to the default builtin scheme.
+
+Spare area autoplacement default schemes
+----------------------------------------
+
+256 byte pagesize
+~~~~~~~~~~~~~~~~~
+
+======== ================== ===================================================
+Offset Content Comment
+======== ================== ===================================================
+0x00 ECC byte 0 Error correction code byte 0
+0x01 ECC byte 1 Error correction code byte 1
+0x02 ECC byte 2 Error correction code byte 2
+0x03 Autoplace 0
+0x04 Autoplace 1
+0x05 Bad block marker If any bit in this byte is zero, then this
+ block is bad. This applies only to the first
+ page in a block. In the remaining pages this
+ byte is reserved
+0x06 Autoplace 2
+0x07 Autoplace 3
+======== ================== ===================================================
+
+512 byte pagesize
+~~~~~~~~~~~~~~~~~
+
+
+============= ================== ==============================================
+Offset Content Comment
+============= ================== ==============================================
+0x00 ECC byte 0 Error correction code byte 0 of the lower
+ 256 Byte data in this page
+0x01 ECC byte 1 Error correction code byte 1 of the lower
+ 256 Bytes of data in this page
+0x02 ECC byte 2 Error correction code byte 2 of the lower
+ 256 Bytes of data in this page
+0x03 ECC byte 3 Error correction code byte 0 of the upper
+ 256 Bytes of data in this page
+0x04 reserved reserved
+0x05 Bad block marker If any bit in this byte is zero, then this
+ block is bad. This applies only to the first
+ page in a block. In the remaining pages this
+ byte is reserved
+0x06 ECC byte 4 Error correction code byte 1 of the upper
+ 256 Bytes of data in this page
+0x07 ECC byte 5 Error correction code byte 2 of the upper
+ 256 Bytes of data in this page
+0x08 - 0x0F Autoplace 0 - 7
+============= ================== ==============================================
+
+2048 byte pagesize
+~~~~~~~~~~~~~~~~~~
+
+=========== ================== ================================================
+Offset Content Comment
+=========== ================== ================================================
+0x00 Bad block marker If any bit in this byte is zero, then this block
+ is bad. This applies only to the first page in a
+ block. In the remaining pages this byte is
+ reserved
+0x01 Reserved Reserved
+0x02-0x27 Autoplace 0 - 37
+0x28 ECC byte 0 Error correction code byte 0 of the first
+ 256 Byte data in this page
+0x29 ECC byte 1 Error correction code byte 1 of the first
+ 256 Bytes of data in this page
+0x2A ECC byte 2 Error correction code byte 2 of the first
+ 256 Bytes data in this page
+0x2B ECC byte 3 Error correction code byte 0 of the second
+ 256 Bytes of data in this page
+0x2C ECC byte 4 Error correction code byte 1 of the second
+ 256 Bytes of data in this page
+0x2D ECC byte 5 Error correction code byte 2 of the second
+ 256 Bytes of data in this page
+0x2E ECC byte 6 Error correction code byte 0 of the third
+ 256 Bytes of data in this page
+0x2F ECC byte 7 Error correction code byte 1 of the third
+ 256 Bytes of data in this page
+0x30 ECC byte 8 Error correction code byte 2 of the third
+ 256 Bytes of data in this page
+0x31 ECC byte 9 Error correction code byte 0 of the fourth
+ 256 Bytes of data in this page
+0x32 ECC byte 10 Error correction code byte 1 of the fourth
+ 256 Bytes of data in this page
+0x33 ECC byte 11 Error correction code byte 2 of the fourth
+ 256 Bytes of data in this page
+0x34 ECC byte 12 Error correction code byte 0 of the fifth
+ 256 Bytes of data in this page
+0x35 ECC byte 13 Error correction code byte 1 of the fifth
+ 256 Bytes of data in this page
+0x36 ECC byte 14 Error correction code byte 2 of the fifth
+ 256 Bytes of data in this page
+0x37 ECC byte 15 Error correction code byte 0 of the sixth
+ 256 Bytes of data in this page
+0x38 ECC byte 16 Error correction code byte 1 of the sixth
+ 256 Bytes of data in this page
+0x39 ECC byte 17 Error correction code byte 2 of the sixth
+ 256 Bytes of data in this page
+0x3A ECC byte 18 Error correction code byte 0 of the seventh
+ 256 Bytes of data in this page
+0x3B ECC byte 19 Error correction code byte 1 of the seventh
+ 256 Bytes of data in this page
+0x3C ECC byte 20 Error correction code byte 2 of the seventh
+ 256 Bytes of data in this page
+0x3D ECC byte 21 Error correction code byte 0 of the eighth
+ 256 Bytes of data in this page
+0x3E ECC byte 22 Error correction code byte 1 of the eighth
+ 256 Bytes of data in this page
+0x3F ECC byte 23 Error correction code byte 2 of the eighth
+ 256 Bytes of data in this page
+=========== ================== ================================================
+
+Filesystem support
+==================
+
+The NAND driver provides all necessary functions for a filesystem via
+the MTD interface.
+
+Filesystems must be aware of the NAND peculiarities and restrictions.
+One major restrictions of NAND Flash is, that you cannot write as often
+as you want to a page. The consecutive writes to a page, before erasing
+it again, are restricted to 1-3 writes, depending on the manufacturers
+specifications. This applies similar to the spare area.
+
+Therefore NAND aware filesystems must either write in page size chunks
+or hold a writebuffer to collect smaller writes until they sum up to
+pagesize. Available NAND aware filesystems: JFFS2, YAFFS.
+
+The spare area usage to store filesystem data is controlled by the spare
+area placement functionality which is described in one of the earlier
+chapters.
+
+Tools
+=====
+
+The MTD project provides a couple of helpful tools to handle NAND Flash.
+
+- flasherase, flasheraseall: Erase and format FLASH partitions
+
+- nandwrite: write filesystem images to NAND FLASH
+
+- nanddump: dump the contents of a NAND FLASH partitions
+
+These tools are aware of the NAND restrictions. Please use those tools
+instead of complaining about errors which are caused by non NAND aware
+access methods.
+
+Constants
+=========
+
+This chapter describes the constants which might be relevant for a
+driver developer.
+
+Chip option constants
+---------------------
+
+Constants for chip id table
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+These constants are defined in rawnand.h. They are OR-ed together to
+describe the chip functionality::
+
+ /* Buswitdh is 16 bit */
+ #define NAND_BUSWIDTH_16 0x00000002
+ /* Device supports partial programming without padding */
+ #define NAND_NO_PADDING 0x00000004
+ /* Chip has cache program function */
+ #define NAND_CACHEPRG 0x00000008
+ /* Chip has copy back function */
+ #define NAND_COPYBACK 0x00000010
+ /* AND Chip which has 4 banks and a confusing page / block
+ * assignment. See Renesas datasheet for further information */
+ #define NAND_IS_AND 0x00000020
+ /* Chip has a array of 4 pages which can be read without
+ * additional ready /busy waits */
+ #define NAND_4PAGE_ARRAY 0x00000040
+
+
+Constants for runtime options
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+These constants are defined in rawnand.h. They are OR-ed together to
+describe the functionality::
+
+ /* The hw ecc generator provides a syndrome instead a ecc value on read
+ * This can only work if we have the ecc bytes directly behind the
+ * data bytes. Applies for DOC and AG-AND Renesas HW Reed Solomon generators */
+ #define NAND_HWECC_SYNDROME 0x00020000
+
+
+ECC selection constants
+-----------------------
+
+Use these constants to select the ECC algorithm::
+
+ /* No ECC. Usage is not recommended ! */
+ #define NAND_ECC_NONE 0
+ /* Software ECC 3 byte ECC per 256 Byte data */
+ #define NAND_ECC_SOFT 1
+ /* Hardware ECC 3 byte ECC per 256 Byte data */
+ #define NAND_ECC_HW3_256 2
+ /* Hardware ECC 3 byte ECC per 512 Byte data */
+ #define NAND_ECC_HW3_512 3
+ /* Hardware ECC 6 byte ECC per 512 Byte data */
+ #define NAND_ECC_HW6_512 4
+ /* Hardware ECC 8 byte ECC per 512 Byte data */
+ #define NAND_ECC_HW8_512 6
+
+
+Hardware control related constants
+----------------------------------
+
+These constants describe the requested hardware access function when the
+boardspecific hardware control function is called::
+
+ /* Select the chip by setting nCE to low */
+ #define NAND_CTL_SETNCE 1
+ /* Deselect the chip by setting nCE to high */
+ #define NAND_CTL_CLRNCE 2
+ /* Select the command latch by setting CLE to high */
+ #define NAND_CTL_SETCLE 3
+ /* Deselect the command latch by setting CLE to low */
+ #define NAND_CTL_CLRCLE 4
+ /* Select the address latch by setting ALE to high */
+ #define NAND_CTL_SETALE 5
+ /* Deselect the address latch by setting ALE to low */
+ #define NAND_CTL_CLRALE 6
+ /* Set write protection by setting WP to high. Not used! */
+ #define NAND_CTL_SETWP 7
+ /* Clear write protection by setting WP to low. Not used! */
+ #define NAND_CTL_CLRWP 8
+
+
+Bad block table related constants
+---------------------------------
+
+These constants describe the options used for bad block table
+descriptors::
+
+ /* Options for the bad block table descriptors */
+
+ /* The number of bits used per block in the bbt on the device */
+ #define NAND_BBT_NRBITS_MSK 0x0000000F
+ #define NAND_BBT_1BIT 0x00000001
+ #define NAND_BBT_2BIT 0x00000002
+ #define NAND_BBT_4BIT 0x00000004
+ #define NAND_BBT_8BIT 0x00000008
+ /* The bad block table is in the last good block of the device */
+ #define NAND_BBT_LASTBLOCK 0x00000010
+ /* The bbt is at the given page, else we must scan for the bbt */
+ #define NAND_BBT_ABSPAGE 0x00000020
+ /* bbt is stored per chip on multichip devices */
+ #define NAND_BBT_PERCHIP 0x00000080
+ /* bbt has a version counter at offset veroffs */
+ #define NAND_BBT_VERSION 0x00000100
+ /* Create a bbt if none axists */
+ #define NAND_BBT_CREATE 0x00000200
+ /* Write bbt if necessary */
+ #define NAND_BBT_WRITE 0x00001000
+ /* Read and write back block contents when writing bbt */
+ #define NAND_BBT_SAVECONTENT 0x00002000
+
+
+Structures
+==========
+
+This chapter contains the autogenerated documentation of the structures
+which are used in the NAND driver and might be relevant for a driver
+developer. Each struct member has a short description which is marked
+with an [XXX] identifier. See the chapter "Documentation hints" for an
+explanation.
+
+.. kernel-doc:: include/linux/mtd/rawnand.h
+ :internal:
+
+Public Functions Provided
+=========================
+
+This chapter contains the autogenerated documentation of the NAND kernel
+API functions which are exported. Each function has a short description
+which is marked with an [XXX] identifier. See the chapter "Documentation
+hints" for an explanation.
+
+.. kernel-doc:: drivers/mtd/nand/raw/nand_base.c
+ :export:
+
+.. kernel-doc:: drivers/mtd/nand/raw/nand_ecc.c
+ :export:
+
+Internal Functions Provided
+===========================
+
+This chapter contains the autogenerated documentation of the NAND driver
+internal functions. Each function has a short description which is
+marked with an [XXX] identifier. See the chapter "Documentation hints"
+for an explanation. The functions marked with [DEFAULT] might be
+relevant for a board driver developer.
+
+.. kernel-doc:: drivers/mtd/nand/raw/nand_base.c
+ :internal:
+
+.. kernel-doc:: drivers/mtd/nand/raw/nand_bbt.c
+ :internal:
+
+Credits
+=======
+
+The following people have contributed to the NAND driver:
+
+1. Steven J. Hill\ sjhill@realitydiluted.com
+
+2. David Woodhouse\ dwmw2@infradead.org
+
+3. Thomas Gleixner\ tglx@linutronix.de
+
+A lot of users have provided bugfixes, improvements and helping hands
+for testing. Thanks a lot.
+
+The following people have contributed to this document:
+
+1. Thomas Gleixner\ tglx@linutronix.de