From 464df1d5e5ab1322e2dd0a7796939fff1aeefa9a Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sun, 7 Apr 2024 17:49:25 +0200 Subject: Adding upstream version 1.47.0. Signed-off-by: Daniel Baumann --- ext2ed/doc/ext2ed-design.sgml | 3459 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 3459 insertions(+) create mode 100644 ext2ed/doc/ext2ed-design.sgml (limited to 'ext2ed/doc/ext2ed-design.sgml') diff --git a/ext2ed/doc/ext2ed-design.sgml b/ext2ed/doc/ext2ed-design.sgml new file mode 100644 index 0000000..b2cab37 --- /dev/null +++ b/ext2ed/doc/ext2ed-design.sgml @@ -0,0 +1,3459 @@ + + +
+ + + +EXT2ED - The Extended-2 filesystem editor - Design and implementation + +Programmed by Gadi Oxman, with the guidance of Avner Lottem + +v0.1, August 3 1995 + + + + +About EXT2ED documentation + + +The EXT2ED documentation consists of three parts: + + + + + + The ext2 filesystem overview. + + + + + + The EXT2ED user's guide. + + + + + + The EXT2ED design and implementation. + + + + + + + + +This document is not the user's guide. If you just intend to use EXT2ED, you +may not want to read it. + + + +However, if you intend to browse and modify the source code, this document is +for you. + + + +In any case, if you intend to read this article, I strongly suggest that you +be familiar with the material presented in the other two articles as well. + + + + + +Preface + + +In this document I will try to explain how EXT2ED is constructed. +At the time of writing, the initial version is finished and ready +for distribution; it is fully functional. However, this was not always the +case. + + + +At first, I didn't know much about Unix, much less about Unix filesystems, +and even less about Linux and the extended-2 filesystem. While working +on this project, I gradually acquired knowledge about all of the above +subjects. I can think of two ways in which I could have approached my project: + + + + + + The "Engineer" way + +Learn the subject thoroughly before I get to the programming itself. +Then, I could easily see the entire picture and select the best +course of action, taking all the factors into account. + + + + + + The "Explorer - Progressive" way. + +Jump immediately into the cold water - Start programming and +learning the material in parallel. + + + + + + + + +I guess that the above dilemma is typical and appears all through science and +technology. + + + +However, I didn't have the luxury of choice when I started my project - +Linux is a relatively new (and great!) operating system. 
The extended-2 +filesystem is even newer - Its first release lies somewhere in 1993 - Only +two years had passed when I started working on my project. + + + +The situation I found myself in at the beginning was that I didn't have a fully +detailed document which describes the ext2 filesystem. In fact, I didn't +have any ext2 document at all. When I asked Avner about documentation, he +suggested two references: + + + + + + A general Unix book - THE DESIGN OF THE UNIX OPERATING SYSTEM, by +Maurice J. Bach. + + + + + + The kernel sources. + + + + + +I read the relevant parts of the book before I started my project - It is a +bit old now, but the principles are still the same. However, I needed +more than just the principles. + + + +The kernel sources are a rare bonus! You don't get the full +sources of the operating system every day. There is so much that can be learned from +them, and it is the ultimate source - The exact answer to how the kernel +works is there, with all the fine details. In the first week I started to +look at random at the relevant parts of the sources. However, it is difficult +to understand the global picture from direct reading of over one hundred +pages of sources. Then, I started to do some programming. I didn't know +yet what I was looking for, and I started to work on the project like a kid +who starts to build a large puzzle. + + + +However, this was exactly the interesting part! It is frustrating to know +it all in advance - I think that the discovery itself, bit by bit, is the +key to true learning and understanding. + + + +Now, in this document, I am trying to present the subject. Even though I +developed EXT2ED progressively, I now can see the entire subject much +more clearly than I did before, and thus I do have the option of presenting it +only in the "engineer" way. However, I will not do that. + + + +My presentation will be mixed - Sometimes I will present a subject with an +incremental perspective, and sometimes from a "top down" view. 
I'll leave +you to decide if my presentation choice was wise :-) + + + +In addition, you'll notice that the sections tend to get shorter as we get +closer to the end. The reason is simply that I started to feel that I was +repeating myself so I decided to present only the new ideas. + + + + + +Getting started ... + + +Getting started is almost always the most difficult task. Once you get +started, things start "running" ... + + + +Before the actual programming + + +From my talks with Avner, I understood that Linux, like any other Unix +system, provides access to the entire disk as though it were a general +file - Accessing the device. It is surely a nice idea. Avner suggested two +courses of action: + + + + + + Opening the device like a regular file in the user space. + + + + + + Constructing a device driver which will run in the kernel space and +provide hooks for the user space program. The advantage is that it +will be a part of the kernel, and would be able to use the ext2 +kernel functions to do some of the work. + + + + + +I chose the first way. I think that the basic reason was simplicity - Learning +the ext2 filesystem was complicated enough, and adding to it the task of +learning how to program in the kernel space was too much. I still don't know +how to program a device driver, and this is perhaps the bad part, but +considering the project in retrospect, I think that the first way is +superior to the second; ironically, because of the very reason I chose it - +Simplicity. EXT2ED can now run entirely in the user space (which I think is +a point in favor, because it doesn't require the user to recompile the +kernel), and the entire hard work is mine, which fitted nicely into the +learning experience - I didn't use other code to do the job (aside from +looking at the sources, of-course). + + + + + +Jumping into the cold water + + +I knew almost nothing about the structure of the ext2 filesystem. 
+Reading the sources was not enough - I needed to experiment. However, a tool +for experiments in the ext2 filesystem was exactly my project! - Kind of a +paradox. + + + +I started immediately with constructing a simple hex editor - It would +open the device as a regular file, provide means of moving inside the +filesystem with a simple offset method, and just show a + hex dump of the contents at this point. Programming this was trivially +simple of-course. At this point, the user-interface didn't matter to me - I +wanted a fast way to interact. As a result, I chose a simple command line +parser. Of course, there were no windows at this point. + + + +A hex editor is nice, but is not enough. It indeed enabled me to see each part +of the filesystem, but the format of the viewed data was difficult to +analyze. I wanted to see the data in a more intuitive way. + + + +At this point in time, the most helpful file in the sources was the ext2 +main include file - /usr/include/linux/ext2_fs.h. Among its contents +there were various structures which I assumed were disk images - They appear +exactly like that on the disk. + + + +I wanted a quick way to get going. I didn't have the patience to learn +how each of the structures is used in the code. Rather, I wanted to see them in action, +so that I could explore the connections between them - Test my assumptions, +and reach other assumptions. + + + +So after the hex editor, EXT2ED progressed into a tool which has some +elements of a compiler. I programmed EXT2ED to dynamically read the kernel +ext2 main include file at run time, and process the information. The goal +was to impose a structure definition on the current offset in the +filesystem. EXT2ED would then display the structure as a list of its +variable names and contents, instead of a meaningless hex dump. 
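The idea of imposing a structure definition on raw bytes can be sketched as follows. This is only a minimal illustration and not the EXT2ED code: the names field_desc, read_le and show_fields are hypothetical, and the sketch assumes that every field is a little-endian unsigned integer which fits in an unsigned long.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical description of one field: its name, byte offset and length. */
struct field_desc {
	const char *name;
	size_t offset;
	size_t length;
};

/* Decode a little-endian unsigned integer of `length` bytes (assumption). */
unsigned long read_le (const unsigned char *data, size_t length)
{
	unsigned long value = 0;
	size_t i;

	assert (length <= sizeof (unsigned long));
	for (i = 0; i < length; i++)
		value |= (unsigned long) data [i] << (8 * i);
	return value;
}

/* Print every field of `block` as "name = value", one field per line. */
void show_fields (FILE *out, const unsigned char *block,
                  const struct field_desc *fields, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		fprintf (out, "%s = %lu\n", fields [i].name,
		         read_le (block + fields [i].offset, fields [i].length));
}
```

Given a table of fields, a single call to show_fields replaces the hex dump with one "name = value" line per field; the table itself is what a parser of the include file would have to produce.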
+ + + +The format of the include file is not very complicated - The structures +are mostly flat - They didn't contain a lot of recursive structure; Only a +global structure definition, and some variables. There were cases of +structures inside structures, which I treated in a somewhat non-elegant way - I +made all the structures flat, and expanded the arrays. As a result, the parser +was very simple. After all, this was not an exercise in compiling, and I +wanted to quickly get some results. + + + +To handle the task, I constructed the struct_descriptor structure. +Each struct_descriptor instance contained information which is needed +in order to format a block of data according to the C structure contained in +the kernel source. The information contained: + + + + + + The descriptor name, used to refer to the structure in EXT2ED. + + + + + + The name of each variable. + + + + + + The relative offset of each variable in the data block. + + + + + + The length, in bytes, of each variable. + + + + + +Since I didn't want to limit the number of structures, I chose a simple +doubly linked list to store the information. One variable contained the +current structure type - A pointer to the relevant +struct_descriptor. + + + +Now EXT2ED contained basically four command line operations: + + + + + + setdevice + +Used to open a device for reading only. Write access was postponed +to a very advanced stage in the project, simply because I didn't +know a thing of the filesystem structure, and I believed that +making actual changes would do nothing but damage :-) + + + + + + setoffset + +Used to move in the device. + + + + + + settype + +Used to impose a structure definition on the current place. + + + + + + show + +Used to display the data. It displayed the data in a simple hex dump +if there was no type set, or in a nice formatted way - As a list of +the variable contents, if there was. 
+ + + + + + + + +Command line analysis was primitive back then - A simple switch, as far as +I can remember - Nothing like the current flow control, but it was enough +at the time. + + + +In the end, I had something to start working with. It knew how to format many +structures - None of which I understood - and provided me, without too much +work, something to start with. + + + + + + + +Starting to explore + + +With the above tool in my pocket, I started to explore the ext2 filesystem +structure. From the brief reading in Bach's book, I became familiar with some +basic concepts - The superblock, for example. It seems that the +superblock is an important part of the filesystem. I decided to start +exploring with that. + + + +I realized that the superblock should be at a fixed location in the +filesystem - Probably near the beginning. There can be no other way - +The kernel should start at some place to find it. A brief look at +the kernel sources revealed that the superblock is signed by a special +signature - A magic number - EXT2_SUPER_MAGIC (0xEF53 - EF probably +stands for Extended Filesystem). I quickly found the superblock at the +fixed offset 1024 in the filesystem - The s_magic variable in the +superblock was set exactly to the above value. + + + +It seems that starting with the superblock was a good bet - Just from +the list of variables, one can learn a lot. I didn't understand all of them +at the time, but it seemed that the following keywords were repeating themselves +in various variables: + + + + + + block + + + + + + inode + + + + + + group + + + + + +At this point, I started to explore the block groups. I will not detail here +the technical design of the ext2 filesystem. I have written a special +article which explains just that, in the "engineering" way. Please refer to it +if you feel that you are lacking knowledge in the structure of the ext2 +filesystem. + + + +I explored the filesystem in this way for some time, along with reading +the sources. 
This led naturally to the next step. + + + + + +Object specific commands + + +What became clear was that the above way of exploring was not powerful +enough - I found myself doing various calculations manually in order to pass +between related structures. I needed to replace some tasks with an automated +procedure. + + + +In addition, it also became clear that (of-course) each key object in the +filesystem has its special place in regard to the overall ext2 filesystem +design, and needs fine-tuned handling. It is at this point that the +structure definitions came to life - They became object +definitions, making EXT2ED object oriented. + + + +The actual meaning of the breathtaking words above is that each structure +now had a list of private commands, which ended up in +calling special fine-tuned C functions. This approach was +found to be very powerful and is the heart of EXT2ED even now. + + + +In order to implement the above concepts, I added the structure +struct_commands. The role of this structure is to group together a +set of commands, which can be later assigned to a specific type. Each +structure had: + + + + + + A list of command names. + + + + + + A list of pointers to functions, which binds each command to its +special fine-tuned C function. + + + + + +In order to relate a list of commands to a type definition, each +struct_descriptor structure (explained earlier) was given a private +struct_commands structure. 
+ + + +The current definitions of struct_commands and of +struct_descriptor follow: + + +typedef void (*PF) (char *); + +struct struct_commands { + int last_command; + char *names [MAX_COMMANDS_NUM]; + char *descriptions [MAX_COMMANDS_NUM]; + PF callback [MAX_COMMANDS_NUM]; +}; + +struct struct_descriptor { + unsigned long length; + unsigned char name [60]; + unsigned short fields_num; + unsigned char field_names [MAX_FIELDS][80]; + unsigned short field_lengths [MAX_FIELDS]; + unsigned short field_positions [MAX_FIELDS]; + struct struct_commands type_commands; + struct struct_descriptor *prev,*next; +}; + + + + + + + + +Program flow control + + +Obviously the above approach led to a major redesign of EXT2ED. The +main engine of the resulting design is basically the same even now. + + + +I redesigned the program flow control. Up to then, I had analyzed the user command +line with the simple switch method. Now I used the far superior callback +method. + + + +I divided the available user commands into two groups: + + + + + + General commands. + + + + + + Type specific commands. + + + + + +As a result, at each point in time, the user was able to enter a +general command, selectable from a list of general commands which was +always available, or a type specific command, selectable from a list of +commands which changed in time according to the current type that the +user was editing. The special type specific command "knew" how to +handle the object in the best possible way - It was "fine tuned" for the +object's place in the ext2 filesystem design. + + + +In order to implement the above idea, I constructed a global variable of +type struct_commands, which contained the general commands. +The type specific commands were accessible through the struct +descriptors, as explained earlier. + + + +The program flow was now done according to the following algorithm: + + + + + + Ask the user for a command line. 
+ + + + + + Analyze the user command - Separate it into command and +arguments. + + + + + + Trace the list of known objects to match the command name to a type. +If the type is found, call the callback function, with the arguments +as a parameter. Then go back to step (1). + + + + + + If the command is not type specific, try to find it in the general +commands, and call it if found. Go back to step (1). + + + + + + If the command is not found, issue a short error message, and return +to step (1). + + + + + +Note the order of the above steps. In particular, note that a command +is first assumed to be a type-specific command and only if this fails, a +general command is searched. The "side-effect" (main effect, actually) +is that when we have two commands with the same name - One that is a +type specific command, and one that is a general command, the dispatching +algorithm will call the type specific command. This allows +overriding of a command to provide fine-tuned operation. +For example, the show command is overridden nearly everywhere, +to accommodate for the different ways in which different objects are displayed, +in order to provide an intuitive fine-tuned display. + + + +The above is done in the dispatch function, in main.c. Since +it is a very important function in EXT2ED, and it is relatively short, I will +list it entirely here. Note that a redesign was made since then - Another +level was added between the two described, but I'll elaborate more on this +later. However, the basic structure follows the explanation described above. + + +int dispatch (char *command_line) + +{ + int i,found=0; + char command [80]; + + parse_word (command_line,command); + + if (strcmp (command,"quit")==0) return (1); + + /* 1. 
Search for type specific commands FIRST - Allows overriding of a general command */ + + if (current_type != NULL) + for (i=0;i<=current_type->type_commands.last_command && !found;i++) { + if (strcmp (command,current_type->type_commands.names [i])==0) { + (*current_type->type_commands.callback [i]) (command_line); + found=1; + } + } + + /* 2. Now search for ext2 filesystem general commands */ + + if (!found) + for (i=0;i<=ext2_commands.last_command && !found;i++) { + if (strcmp (command,ext2_commands.names [i])==0) { + (*ext2_commands.callback [i]) (command_line); + found=1; + } + } + + + /* 3. If not found, search the general commands */ + + if (!found) + for (i=0;i<=general_commands.last_command && !found;i++) { + if (strcmp (command,general_commands.names [i])==0) { + (*general_commands.callback [i]) (command_line); + found=1; + } + } + + if (!found) { + wprintw (command_win,"Error: Unknown command\n"); + refresh_command_win (); + } + + return (0); +} + + + + + + + +Source files in EXT2ED + + +The project was getting large enough to be split into several source +files. I split the source as much as I could into self-contained +source files. The source files consist of the following blocks: + + + + + + Main include file - ext2ed.h + +This file contains the definitions of the various structures, +variables and functions used in EXT2ED. It is included by all source +files in EXT2ED. + + + + + + + Main block - main.c + +main.c handles the upper level of the program flow control. +It contains the parser and the dispatcher. Its task is +to ask the user for a required action, and to pass control to other +lower level functions in order to do the actual job. + + + + + + + Initialization - init.c + +The init source is responsible for the various initialization +actions which need to be done through the program. For example, +auto detection of an ext2 filesystem when selecting a device and +initialization of the filesystem-specific structures described +earlier. 
+ + + + + + + Disk activity - disk.c + +disk.c handles the lower level interaction with the +device. All disk activity is passed through this file - The various +functions throughout the source code request disk actions from the +functions in this file. In this way, for example, we can easily block +write access to the device. + + + + + + + Display output activity - win.c + +In a similar way to disk.c, the user-interface functions and +most of the interaction with the ncurses library are done +here. Nothing will be actually written to a specific window without +calling a function from this file. + + + + + + + Commands available through dispatching - *_com.c + +The above file name is generic - Each file which ends with +_com.c contains a group of related commands which can be +called through the dispatching function. + +Each object typically has its own file. A separate file is also +available for the general commands. + + + + + +The entire list of source files available at this time is: + + + + + + blockbitmap_com.c + + + + + + dir_com.c + + + + + + disk.c + + + + + + ext2_com.c + + + + + + file_com.c + + + + + + general_com.c + + + + + + group_com.c + + + + + + init.c + + + + + + inode_com.c + + + + + + inodebitmap_com.c + + + + + + main.c + + + + + + super_com.c + + + + + + win.c + + + + + + + + + + +User interface + + +The user interface is text-based only and is based on the following +libraries: + + + + + + + + + The ncurses library, developed by Zeyd Ben-Halim. + + + + + + The GNU readline library. + + + + + + + + +The user interaction is command line based - The user enters a command +line, which consists of a command and of arguments. This fits +nicely with the program flow control described earlier - The command +is used by dispatch to select the right function, and the +arguments are interpreted by the function itself. + + + +The ncurses library + + +The ncurses library enables me to divide the screen into "windows". 
+The main advantage is that I treat the "window" in a virtual way, asking +the ncurses library to "write to a window". However, the ncurses +library internally buffers the requests, and nothing is actually passed to the +terminal until an explicit refresh is requested. When the refresh request is +made, ncurses compares the current terminal state (as known at the last time +that a refresh was done) with the new state to be shown, and passes to the +terminal the minimal information required to update the display. As a +result, the display output is optimized behind the scenes by the +ncurses library, while I can still treat it in a virtual way. + + + +There are two basic concepts in the ncurses library: + + + + + + A window. + + + + + + A pad. + + + + + +A window can be no bigger than the actual terminal size. A pad, however, is +not limited in its size. + + + +The user screen is divided by EXT2ED into three windows and one pad: + + + + + + Title window. + + + + + + Status window. + + + + + + Main display pad. + + + + + + Command window. + + + + + + + + +The title window is static - It just displays the current version +of EXT2ED. + + + +The user interaction is done in the command window. The user enters a +command line, feedback is usually displayed there, and then relevant +data is usually displayed in the main display and in the status window. + + + +The main display uses a pad instead of a window because +the amount of information which is written to it is not known in advance. +Therefore, the user treats the main display as a "window" into a bigger +display and can scroll vertically using the pgdn and pgup +commands. Although the pad mechanism enables me to use horizontal +scrolling, I have not utilized this. + + + +When I need to show something to the user, I use the ncurses wprintw +command. Then an explicit refresh command is required. As explained before, +the refresh commands are piped through win.c. 
For example, to update +the command window, refresh_command_win () is used. + + + + + +The readline library + + +Avner suggested that I integrate the GNU readline library into my project. +The readline library is designed specifically for programs which use +a command line interface. It provides a nice package of command line editing +tools - Inserting, deleting words, and the whole package of editing tools +which are normally available in the bash shell (Refer to the readline +documentation for details). In addition, I utilized the history +feature of the readline library - The entered commands are saved in a +command history, and can be recalled later by whatever means that the +readline package provides. Command completion is also supported - When the +user enters a partial command name, EXT2ED will provide the readline library +with the possible completions. + + + + + + + +Possible support of other filesystems + + +The entire ext2 layer is provided through specific objects. Given another +set of objects, support of other filesystems can be provided using the same +dispatching mechanism. In order to prepare the ground for this option, I +added yet another layer to the two-layer structure presented earlier. EXT2ED +commands now consist of three layers: + + + + + + The general commands. + + + + + + The ext2 general commands. + + + + + + The ext2 object specific commands. + + + + + +The general commands are provided by the general_com.c source file, +and are always available. The two other levels are not present when EXT2ED +loads - They are dynamically added by init.c when EXT2ED detects an +ext2 filesystem on the device. + + + +The abstraction levels presented above help to extend EXT2ED to fully +support a new filesystem, with its own specific type commands. + + + +Even without any source code modification, the user is free to add structure +definitions in a separate file (specified in the configuration file), +which will be added to the list of available objects. 
The added objects will +consist only of variables, of-course, and will be used through the more +primitive setoffset and settype commands. + + + + + +On the implementation of the various commands + + +This section points out some typical programming style that I used in many +places in the code. + + + +The explicit use of the dispatch function + + +The various commands are reached by the user through the dispatch +function. This is not surprising. What may be surprising, at least at +first glance, is that you'll find the dispatch call in many of my +own functions! + + + +I am in fact using my own implemented functions to construct higher +level operations. I am heavily using the fact that the dispatching mechanism +is object oriented and that the overriding principle takes place and +selects the proper function to call when several commands with the same name +are accessible. + + + +Sometimes, however, I call the explicit command directly, without passing +through dispatch. This is typically done when I want to bypass the +overriding effect. + + + + +This is used, for example, in the interaction between the global cd command +and the dir object specific cd command. You will see there that in order +to implement the "entire" cd command, the type specific cd command uses either +a dispatching mechanism to call itself recursively if a relative path is +used, or a direct call of the general cd handling function if an explicit path +is used. + + + + + + +Passing information between handling functions + + +Typically, every source code file which handles one object type has a global +structure specifically designed for it which is used by most of the +functions in that file. This is used to pass information between the various +functions there, and to physically provide the link to other related +objects, typically for initialization use. 
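The pattern of a global per-object structure can be sketched as follows. This is only a hedged illustration: every name here (demo_file_info, demo_current_type, demo_inode_file, demo_file_show) and every field is hypothetical and is not taken from the EXT2ED sources. The sketch only shows a handler of one object filling the global structure of a related object before switching the current type, and a handler of the second object refusing to work when that initialization never happened.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical global state for a "file" object (fields are assumptions). */
struct demo_file_info {
	int valid;                   /* set once the parent handler has run */
	unsigned long inode_offset;  /* link back to the related object */
	unsigned long file_length;
};

struct demo_file_info demo_file_info;
const char *demo_current_type = "inode";

/* Handler on the parent object: store the link first, then switch type. */
void demo_inode_file (unsigned long inode_offset, unsigned long file_length)
{
	demo_file_info.inode_offset = inode_offset;
	demo_file_info.file_length = file_length;
	demo_file_info.valid = 1;
	demo_current_type = "file";
}

/* Handler on the file object: returns -1 if the state was never set up. */
int demo_file_show (void)
{
	if (!demo_file_info.valid)
		return -1;
	return 0;
}
```

Calling demo_file_show before demo_inode_file fails, which mirrors why changing the type by hand, without going through the handler that fills the structure, cannot work.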
+ + + + +For example, in order to edit a file, information about the +inode is needed - The file command is available only when editing an +inode. When the file command is issued, the handling function (found, +according to the source division outlined above, in inode_com.c) will +store the necessary information about the inode in a specific structure +of type struct_file_info which will be available for use by the file_com.c +functions. Only then will it set the type to file. This is also the reason +that directly setting the object type to file through a settype +command will fail - The above data structure will not be initialized +properly because the user never was at the inode of the file. + + + + + + +A very simplified overview of a typical command handling function + + +This is a very simplified overview. Detailed information will follow +where appropriate. + + + +The prototype of a typical handling function + + + + + + + + I chose a unified naming convention for the various object +specific commands. It is perhaps best shown with an example: + +The prototype of the handling function of the command next of +the type file is: + + + extern void type_file___next (char *command_line); + + + + +For other types and commands, the words file and next +should be replaced accordingly. + + + + + + + The ext2 general commands syntax is similar. For example, the ext2 +general command super results in calling: + + + extern void type_ext2___super (char *command_line); + + + +Those functions are available in ext2_com.c. + + + + + + The general commands syntax is even simpler - The name of the +handling function is exactly the name of the command. Those +functions are available in general_com.c. + + + + + + + + + + +"Typical" algorithm + + +This section can't of-course provide meaningful information - Each +command is handled differently, but the following frame is typical: + + + + + + Parse command line arguments and analyze them. 
Return with an error +message if the syntax is wrong. + + + + + + "Act accordingly", perhaps making use of the global variable available +to this type. + + + + + + Use some dispatch / direct calls in order to pass control to +other lower-level user commands. + + + + + + Sometimes dispatch to the object's show command to +display the resulting data to the user. + + + + + +I told you it is meaningless :-) + + + + + + + + + +Initialization overview + + +In this section I will discuss some aspects of the various initialization +routines available in the source file init.c. + + + +Upon startup + + +Follows the function main, appearing of-course in main.c: + + + +int main (void) + +{ + if (!init ()) return (0); /* Perform some initial initialization */ + /* Quit if failed */ + + parser (); /* Get and parse user commands */ + + prepare_to_close (); /* Do some cleanup */ + printf ("Quitting ...\n"); + return (1); /* And quit */ +} + + + + + +The two initialization functions, which are called by main, are: + + + + + + init + + + + + + prepare_to_close + + + + + + + + +The init function + + +init is called from main upon startup. It initializes the +following tasks / subsystems: + + + + + + Processing of the user configuration file, by using the +process_configuration_file function. Failing to complete the +configuration file processing is considered a fatal error, +and EXT2ED is aborted. I did it this way because the configuration +file has some sensitive user options like write access behavior, and +I wanted to be sure that the user is aware of them. + + + + + + Registration of the general commands through the use of +the add_general_commands function. + + + + + + Reset of the object memory rotating lifo structure. + + + + + + Reset of the device parameters and of the current type. + + + + + + Initialization of the windows subsystem - The interface between the +ncurses library and EXT2ED, through the use of the init_windows +function, available in win.c. 
+ + + + + + Initialization of the interface between the readline library and +EXT2ED, through init_readline. + + + + + + Initialization of the signals subsystem, through +init_signals. + + + + + + Disabling write access. Write access needs to be explicitly enabled +using a user command, to prevent accidental user mistakes. + + + + + +When init is finished, it dispatches the help command in order +to show the available commands to the user. Note that the ext2 layer is not +added yet; it will be added if and when EXT2ED detects an ext2 +filesystem on a device. + + + + + +The prepare_to_close function + + +The prepare_to_close function reverses some of the actions done +earlier in EXT2ED and frees the dynamically allocated memory. +Specifically, it: + + + + + + Closes the open device, if any. + + + + + + Removes the first level - Removing the general commands, through +the use of free_user_commands, with a pointer to the +general_commands structure as a parameter. + + + + + + Removes the second level - Removing the ext2 general +commands, in much the same way. + + + + + + Removes the third level - Removing the objects and the object +specific commands, by using free_struct_descriptors. + + + + + + Closes the window subsystem, and detaches EXT2ED from the ncurses +library, through the use of the close_windows function, +available in win.c. + + + + + + + + + + + + +Registration of commands + + +Addition of a user command is done through the add_user_command +function. The prototype is: + + +void add_user_command (struct struct_commands *ptr,char *name,char +*description,PF callback); + + +The function receives a pointer to a structure of type +struct_commands, a desired name for the command which will be used by +the user to identify the command, a short description which is utilized by the +help subsystem, and a pointer to a C function which will be called if +dispatch decides that this command was requested. 
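A minimal sketch of how such a registration function might look is given here. It is not the actual implementation, and it rests on several assumptions: last_command is taken to hold the index of the last registered command (-1 for an empty table, which is consistent with the i<=last_command loops in dispatch), MAX_COMMANDS_NUM is set to 100, and the name and description are copied to the heap. The helper copy_string and the callback demo_callback are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_COMMANDS_NUM 100	/* assumed value */

typedef void (*PF) (char *);

struct struct_commands {
	int last_command;
	char *names [MAX_COMMANDS_NUM];
	char *descriptions [MAX_COMMANDS_NUM];
	PF callback [MAX_COMMANDS_NUM];
};

/* Hypothetical helper: return a heap-allocated copy of a string. */
static char *copy_string (const char *src)
{
	char *dst = malloc (strlen (src) + 1);

	if (dst != NULL) strcpy (dst, src);
	return dst;
}

/* Append one command to the table; silently ignore it if there is no room. */
void add_user_command (struct struct_commands *ptr, char *name,
                       char *description, PF callback)
{
	int num = ptr->last_command + 1;
	char *name_copy, *description_copy;

	if (num >= MAX_COMMANDS_NUM) return;

	name_copy = copy_string (name);
	description_copy = copy_string (description);
	if (name_copy == NULL || description_copy == NULL) {
		free (name_copy);
		free (description_copy);
		return;
	}

	ptr->names [num] = name_copy;
	ptr->descriptions [num] = description_copy;
	ptr->callback [num] = callback;
	ptr->last_command = num;
}

/* Hypothetical callback used for illustration: counts its invocations. */
int demo_calls = 0;

void demo_callback (char *command_line)
{
	(void) command_line;
	demo_calls++;
}
```

With a table whose last_command was set to -1, each call appends one entry, and the stored callback can later be invoked by index, which is what the dispatching loops rely on.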
+ + + +add_user_command is a low level function used in the three +levels to add user commands. For example, addition of the ext2 +general commands is done by: + + +void add_ext2_general_commands (void) + +{ + add_user_command (&ext2_commands,"super","Moves to the superblock of the filesystem",type_ext2___super); + add_user_command (&ext2_commands,"group","Moves to the first group descriptor",type_ext2___group); + add_user_command (&ext2_commands,"cd","Moves to the directory specified",type_ext2___cd); +} + + + + + + + +Registration of objects + + +Registration of objects is based, as explained earlier, on the "compilation" +of an external user file, which has a syntax similar to the C language +struct keyword. The primitive parser I have implemented detects the +definition of structures, and calls some lower level functions to actually +register the newly detected object. The parser's prototype is: + + +int set_struct_descriptors (char *file_name) + + +It opens the given file name, and calls, when appropriate: + + + + + + add_new_descriptor + + + + + + add_new_variable + + + + + +add_new_descriptor is a low level function which adds a new descriptor +to the doubly linked list of the available objects. It will then call +fill_type_commands, which will add specific commands to the object, +if the object is known. + + + +add_new_variable will add a new variable of the requested length to the +specified descriptor. + + + + + +Initialization upon specification of a device + + +When the general command setdevice is used to open a device, an +initialization sequence takes place, which is intended to answer two +questions: + + + + + + Are we dealing with an ext2 filesystem? + + + + + + What are the basic filesystem parameters, such as its total size and +its block size? + + + + + +These questions are answered by the set_file_system_info function, possibly +using some help from the user, through the configuration file.
+The answers are placed in the file_system_info structure, which is of +type struct_file_system_info: + + +struct struct_file_system_info { + unsigned long file_system_size; + unsigned long super_block_offset; + unsigned long first_group_desc_offset; + unsigned long groups_count; + unsigned long inodes_per_block; + unsigned long blocks_per_group; /* The name is misleading; beware */ + unsigned long no_blocks_in_group; + unsigned short block_size; + struct ext2_super_block super_block; +}; + + + + + +Autodetection of an ext2 filesystem is usually recommended. However, on a damaged +filesystem I can't guarantee success. That's where the user comes in - he can +override the auto detection procedure and force an ext2 filesystem, by +selecting the proper options in the configuration file. + + + +If auto detection succeeds, the second question above is automatically +answered - I get all the information I need from the filesystem itself. In +any case, default parameters can be supplied in the configuration file and +the user can select the required behavior. + + + +If we decide to treat the filesystem as an ext2 filesystem, registration of +the ext2 specific objects is done at this point, by calling the +set_struct_descriptors function outlined earlier, with the name of the file +which describes the ext2 objects; that file is essentially based on the main +include file of the ext2 sources. At this point, EXT2ED can be fully used by the user. + + + +If we do not register the ext2 specific objects, the user can still provide +object definitions in a separate file, and will be able to use EXT2ED in a +limited form, but more sophisticated than a simple hex editor. + + + + + + + +main.c + + +As described earlier, main.c is used as a front end to the entire +program. main.c contains the following elements: + + + +The main routine + + +The main routine was displayed above. Its task is to pass control to +the initialization routines and to the parser.
+ + + + + +The parser + + +The parser consists of the following functions: + + + + + + The parser function, which reads the command line from the +user and saves it in readline's history buffer and in the internal +last-command buffer. + + + + + + The parse_word function, which receives a string and parses +the first word from it, ignoring whitespace, and returns a pointer +to the rest of the string. + + + + + + The complete_command function, which is used by the readline +library for command completion. It scans the available commands at +this point and determines the possible completions. + + + + + + + + + + +The dispatcher + + +The dispatcher was already explained in the flow control section. +Its task is to pass control to the proper command +handling function, based on the command named in the command line. + + + + + +The self-sanity control + + +This is not fully implemented. + + + +The general idea was to provide a control system which will supervise the +internal work of EXT2ED. Since I am pretty sure that bugs exist, I have +double checked myself in a few instances, and issued an internal +error warning if I reached the conclusion that something is not logical. +The internal error is reported by the function internal_error, +available in main.c. + + + +The self sanity check is compiled only if the compile time option +DEBUG is selected. + + + + + + + +The windows interface + + +Screen handling and interfacing to the ncurses library is done in +win.c. + + + +Initialization + + +Opening of the windows is done in init_windows. In +close_windows, we just close our windows. The various window lengths, +with the exception of the show pad, are defined in the main header file. +The rest of the display will be used by the show pad. + + + + + +Display output + + +Each actual refresh of the terminal screen is done by using the +appropriate refresh function from this file: refresh_title_win, +refresh_show_win, refresh_show_pad and +refresh_command_win.
+ + + +With the exception of the show pad, each function simply calls the +ncurses refresh command. In order to provide scrolling in +the show pad, some information about its status is constantly updated +by the various functions which display output in it. refresh_show_pad +passes this information to ncurses so that the correct part of the pad +is actually copied to the display. + + + +The above information is saved in a global variable of type struct +struct_pad_info: + + + + + +struct struct_pad_info { + int display_lines,display_cols; + int line,col; + int max_line,max_col; + int disable_output; +}; + + + + + + + +Screen redraw + + +The redraw_all function will just reopen the windows. This action is +necessary if the display gets garbled for some reason. + + + + + + + +The disk interface + + +All the disk activity with regard to the filesystem passes through the file +disk.c. It is done that way to provide additional levels of safety +concerning the disk access. This way, global decisions concerning the disk +can be easily accomplished. The benefits of this isolation will become even +clearer in the next sections. + + + +Low level functions + + +Read requests are ultimately handled by low_read and write requests +are handled by low_write. They just receive the length of the data +block, the offset in the filesystem and a pointer to the buffer and pass the +request to the fread or fwrite standard library functions. + + + + + +Mounted filesystems + + +EXT2ED design assumes that the edited filesystem is not mounted. Even if +a reasonably simple way to handle mounted filesystems exists, it is +probably too complicated :-) + + + +Write access to a mounted filesystem will be denied. Read access can be +allowed by using a configuration file option. The mount status is determined +by reading the file /etc/mtab. + + + + + +Write access + + +Write access is the most sensitive part in the program. This program is +intended for editing filesystems.
It is obvious that a small mistake +in this regard can make the filesystem unusable. + + + +The following safety measures are, of course, in addition to the general Unix +permission protection - The user can always disable write access on the +device file itself. + + + +Considering the user, the following safety measures were taken: + + + + + + The filesystem is never opened with write access enabled. +Rather, the user must explicitly request to enable write access. + + + + + + The user can disable write access entirely by using a +configuration file option. + + + + + + Changes are never done automatically - Whenever the user makes +changes, they are done in memory. An explicit writedata +command should be issued to make the changes active on the disk. + + + + + +Considering myself, I tried to protect against my bugs by: + + + + + + Opening the device in read-only mode until a write request is +issued by the user. + + + + + + Limiting actual filesystem access to two functions only - +low_read for reading, and low_write for writing. Those +functions were programmed carefully, and I added the self +sanity checks there. In addition, this is the only place in which I +need to check the user options described above - There can be no +place in which I can "forget" to check them. + +Note that the disabling of write access through the configuration file +is double checked here only as a self-sanity check, and only if +DEBUG is selected. The request to enable writing should have been refused +and write access is always disabled at startup, so reaching a write +here while the user has write access disabled through the +configuration file clearly indicates that I have a bug somewhere. + + + + + + + + +The following safety measure can provide protection against both user +mistakes and my own bugs: + + + + + + I added a logging option, which logs every actual write +access to the disk in the lowest level - In low_write itself.
+ +The logging has nothing to do with the current type and the various +other higher level operations of EXT2ED - It is simply a hex dump of +both the original contents which will be overwritten +and the newly written data. + +In that case, even if the user makes a mistake, the original data +can be retrieved. + +Even if I have a bug somewhere which causes incorrect data to be +written to the disk, the logging option will still log exactly the +original contents at the place where data was incorrectly overwritten. +(This assumes, of course, that low_write and the logging +itself work correctly. I have done my best to verify that this is +indeed the case). + +The logging option is implemented in the log_changes +function. + + + + + + + + + + +Reading / Writing objects + + +Usually (not always), the current object data is available in the +global variable type_data, which is of the type: + + +struct struct_type_data { + long offset_in_block; + + union union_type_data { + char buffer [EXT2_MAX_BLOCK_SIZE]; + struct ext2_acl_header t_ext2_acl_header; + struct ext2_acl_entry t_ext2_acl_entry; + struct ext2_old_group_desc t_ext2_old_group_desc; + struct ext2_group_desc t_ext2_group_desc; + struct ext2_inode t_ext2_inode; + struct ext2_super_block t_ext2_super_block; + struct ext2_dir_entry t_ext2_dir_entry; + } u; +}; + + +The above union enables me, in the program, to treat the data as raw data or +as a meaningful filesystem object. + + + +The reading and writing, if done to this global variable, are done through +the functions load_type_data and write_type_data, available in +disk.c. + + + + + + + +The general commands + + +The general commands are handled in the file general_com.c. + + + +The help system + + +The help command is handled by the function help. The algorithm is as +follows: + + + + + + + + + Check the command line arguments. If there is an argument, pass +control to the detailed_help function, in order to provide +help on the specific command.
+ + + + + + If general help was requested, display a list of the available +commands at this point. The three levels are displayed in reverse +order - First the commands which are specific to the current type +(if a current type is defined), then the ext2 general commands (if +we decided that the filesystem should be treated like an ext2 +filesystem), then the general commands. + + + + + + Display information about EXT2ED - Current version, general +information about the project, etc. + + + + + + + + + + +The setdevice command + + +The setdevice command results in calling the set_device +function. The algorithm is: + + + + + + + + + Parse the command line argument. If it isn't available, report the +error and return. + + + + + + Close the currently open device, if there is one. + + + + + + Open the new device in read-only mode. Update the global variables +device_name and device_handle. + + + + + + Disable write access. + + + + + + Empty the object memory. + + + + + + Unregister the ext2 general commands, using +free_user_commands. + + + + + + Unregister the current objects, using free_struct_descriptors. + + + + + + Call set_file_system_info to auto-detect an ext2 filesystem +and set the basic filesystem values. + + + + + + Add the alternate descriptors, supplied by the user. + + + + + + Set the device offset to the filesystem start by dispatching +setoffset 0. + + + + + + Show the newly available commands by dispatching the help +command. + + + + + + + + + + +Basic maneuvering + + +Basic maneuvering is done using the setoffset and the settype +user commands. + + + +set_offset accepts some alternative forms of specifying the new +offset. They all ultimately lead to changing the device_offset +global variable and seeking to the new position. set_offset also +calls load_type_data to read one block, starting at the new position, into +the type_data global variable.
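The seek-then-load step can be pictured with the following minimal sketch. It is an illustration under stated assumptions, not the code of set_offset or load_type_data: the names sketch_set_offset, sketch_device_offset and sketch_buffer are hypothetical stand-ins, the block size is fixed at an assumed 1024 bytes, and parsing of the alternative offset forms is left out entirely.

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 1024           /* assumed block size for the sketch */

long sketch_device_offset;               /* stands in for device_offset */
unsigned char sketch_buffer [SKETCH_BLOCK_SIZE]; /* stands in for the raw view of type_data */

/*
 * Seek to the requested offset and read up to one block from there.
 * Returns the number of bytes actually read, or -1 if the seek failed.
 */
long sketch_set_offset (FILE *device,long offset)
{
	size_t got;

	if (offset < 0 || fseek (device,offset,SEEK_SET) != 0) return -1;

	memset (sketch_buffer,0,sizeof sketch_buffer);   /* short reads leave zeros behind */
	got=fread (sketch_buffer,1,sizeof sketch_buffer,device);
	sketch_device_offset=offset;                     /* remember where the buffer came from */
	return (long) got;
}
```

Because the buffer is always refilled right after a successful seek, any later formatting code can assume that the in-memory data and the remembered offset describe the same place on the device.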
+ + + +set_type will point the global variable current_type to the +correct entry in the doubly linked list of the known objects. If the +requested type is hex or none, current_type will be +initialized to NULL. set_type will also dispatch show, +so that the object data will be re-formatted in the new format. + + + +When editing an ext2 filesystem, it is not intended that those commands will +be used directly, and it is usually not required. My implementation of the +ext2 layer, on the other hand, uses these lower level commands on countless +occasions. + + + + + +The display functions + + +The general command version of show is handled by the show +function. This command is overridden by various objects to provide a display +which is better suited to the object. + + + +The general show command will format the data in type_data according +to the structure definition of the current type and show it on the show +pad. If there is no current type, the data will be shown as a simple hex +dump; otherwise, the list of variables, along with their values, will be shown. + + + +A call to show_info is also made - show_info will provide +general statistics on the show_window, such as the current +block, current type, current offset and current page. + + + +The pgup and pgdn general commands just update the +show_pad_info global variable - We just decrement or increment +show_pad_info.line by the number of lines on the screen - +show_pad_info.display_lines, which was initialized in +init_windows. + + + + + +Changing data + + +Data change is done in memory only. An update to the disk requires an +explicit writedata command. The write_data +function simply calls the write_type_data function, outlined earlier. + + + +The set command is used for changing the data.
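The edit-in-memory, write-on-request policy can be sketched as follows. This is a toy model written for illustration, assuming a 16 byte "device" held in an array; sketch_editor, sketch_load, sketch_set and sketch_write_data are hypothetical names and do not correspond to functions in the EXT2ED source.

```c
#include <string.h>

#define SKETCH_SIZE 16                   /* tiny "device" for the illustration */

struct sketch_editor {
	unsigned char disk [SKETCH_SIZE];    /* stands in for the device contents */
	unsigned char memory [SKETCH_SIZE];  /* stands in for the in-memory object data */
	int write_access;                    /* 0 until the user explicitly enables writing */
};

/* Bring the device contents into memory. */
void sketch_load (struct sketch_editor *ed)
{
	memcpy (ed->memory,ed->disk,SKETCH_SIZE);
}

/* Change one byte in memory only; the device is not touched. */
int sketch_set (struct sketch_editor *ed,int pos,unsigned char value)
{
	if (pos < 0 || pos >= SKETCH_SIZE) return 0;
	ed->memory [pos]=value;
	return 1;
}

/* Commit the in-memory copy, but only if write access was enabled. */
int sketch_write_data (struct sketch_editor *ed)
{
	if (!ed->write_access) return 0;     /* refuse: writing was never enabled */
	memcpy (ed->disk,ed->memory,SKETCH_SIZE);
	return 1;
}
```

With this split, a mistaken set costs nothing until a write is requested, and the write itself is refused unless the access flag was raised beforehand.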
+ + + +If there is no current type, control is passed to the hex_set function, +which treats the data as a block of bytes and uses the +type_data.offset_in_block variable to write the new text or hex string +to the correct place in the block. + + + +If a current type is defined, the requested variable is searched for in the +current object, and the desired new value is entered. + + + +The enablewrite command just sets the global variable +write_access to 1 and re-opens the filesystem in read-write +mode, if possible. + + + +If the current type is NULL, a hex-mode is assumed - The next and +prev commands will just update type_data.offset_in_block. + + + +If the current type is not NULL, the next and prev commands +are usually overridden anyway. If they are not overridden, it will be assumed +that the user is editing an array of such objects, and they will just pass +to the next / prev element by dispatching to setoffset using the +setoffset type + / - X syntax. + + + + + + + +The ext2 general commands + + +The ext2 general commands are contained in the ext2_general_commands +global variable (which is of type struct struct_commands). + + + +The handling functions are implemented in the source file ext2_com.c. +I will include the entire source code since it is relatively short. + + + +The super command + + +The super command just "brings the user" to the main superblock and sets the +type to ext2_super_block. The implementation is trivial: + + + + + +void type_ext2___super (char *command_line) + +{ + char buffer [80]; + + super_info.copy_num=0; + sprintf (buffer,"setoffset %ld",file_system_info.super_block_offset);dispatch (buffer); + sprintf (buffer,"settype ext2_super_block");dispatch (buffer); +} + + +It involves only setting the copy_num variable to indicate the main +copy, dispatching a setoffset command to reach the superblock, and +dispatching a settype to enable the superblock specific commands.
+This last command will also call the show command of the +ext2_super_block type, through dispatching at the general command +settype. + + + + + +The group command + + +The group command will bring the user to the specified group descriptor in +the main copy of the group descriptors. The type will be set to +ext2_group_desc: + + +void type_ext2___group (char *command_line) + +{ + long group_num=0; + char *ptr,buffer [80]; + + ptr=parse_word (command_line,buffer); + if (*ptr!=0) { + ptr=parse_word (ptr,buffer); + group_num=atol (buffer); + } + + group_info.copy_num=0;group_info.group_num=0; + sprintf (buffer,"setoffset %ld",file_system_info.first_group_desc_offset);dispatch (buffer); + sprintf (buffer,"settype ext2_group_desc");dispatch (buffer); + sprintf (buffer,"entry %ld",group_num);dispatch (buffer); +} + + +The implementation is as trivial as the super implementation. Note +the use of the entry command, which is a command of the +ext2_group_desc object, to pass to the correct group descriptor. + + + + + +The cd command + + +The cd command performs the usual cd function. The path to the global +cd command is a path from /. + + + +This is one of the best examples of the power of the object oriented +design and of the dispatching mechanism. The operation is complicated, yet the +implementation is surprisingly short! + + + + + +void type_ext2___cd (char *command_line) + +{ + char temp [80],buffer [80],*ptr; + + ptr=parse_word (command_line,buffer); + if (*ptr==0) { + wprintw (command_win,"Error - No argument specified\n"); + refresh_command_win ();return; + } + ptr=parse_word (ptr,buffer); + + if (buffer [0] != '/') { + wprintw (command_win,"Error - Use a full pathname (begin with '/')\n"); + refresh_command_win ();return; + } + + dispatch ("super");dispatch ("group");dispatch ("inode"); + dispatch ("next");dispatch ("dir"); + if (buffer [1] != 0) { + sprintf (temp,"cd %s",buffer+1);dispatch (temp); + } +} + + + + + +Note the number of the dispatch calls! 
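To make the idea of commands built from other commands concrete, here is a minimal dispatcher sketch. It is not the EXT2ED dispatcher: the three-slot sketch_levels array, the search order (current type first, then ext2 general, then general, which is an assumption chosen to be consistent with the overriding described in this document), and every sketch_ name are hypothetical. The trace buffer exists only so that the order of the calls can be observed.

```c
#include <stdio.h>
#include <string.h>

typedef void (*PF) (char *command_line);

struct sketch_command { const char *name; PF callback; };
struct sketch_level { const struct sketch_command *commands; int count; };

/* Assumed search order: 0 = current type, 1 = ext2 general, 2 = general. */
struct sketch_level sketch_levels [3];
char sketch_trace [128];                 /* records which handlers ran */

/* Run the first handler whose name matches the first word; return 1 if one was found. */
int sketch_dispatch (const char *command_line)
{
	char line [80],word [80];
	int level,i;

	strncpy (line,command_line,sizeof line-1);line [sizeof line-1]=0;
	if (sscanf (line,"%79s",word) != 1) return 0;    /* empty command line */

	for (level=0;level<3;level++)
		for (i=0;i<sketch_levels [level].count;i++)
			if (strcmp (sketch_levels [level].commands [i].name,word)==0) {
				sketch_levels [level].commands [i].callback (line);
				return 1;
			}
	return 0;                                        /* unknown command */
}

static void sketch_note (const char *text)
{
	strncat (sketch_trace,text,sizeof sketch_trace-strlen (sketch_trace)-1);
}

void sketch_general_show (char *command_line) { (void) command_line;sketch_note ("general-show;"); }
void sketch_type_show (char *command_line) { (void) command_line;sketch_note ("type-show;"); }

/* A higher level command expressed purely through another command. */
void sketch_super (char *command_line)
{
	(void) command_line;
	sketch_note ("super;");
	sketch_dispatch ("show");            /* whichever show is visible right now */
}

const struct sketch_command sketch_general [] = { { "show",sketch_general_show } };
const struct sketch_command sketch_ext2 [] = { { "super",sketch_super } };
const struct sketch_command sketch_type [] = { { "show",sketch_type_show } };
```

The point of the sketch is that sketch_super never names a particular show handler. Installing a type specific table in slot 0 changes what the nested dispatch reaches, without touching the caller.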
+ + + +super is used to get to the superblock, and group to get to the +first group descriptor. inode brings us to the first inode - The bad +blocks inode. A next command is used to pass to the root directory inode, +a dir command "enters" the directory, and then we let the object +specific cd command take us from there (The object is dir, so +that dispatch will call the cd command of the dir type). +Note that following a symbolic link could bring us back to the root directory, +so the innocent calls above handle such a recursive case nicely! + + + +I feel that the above is intuitive - I was expressing myself "in the +language" of the ext2 filesystem - (Go to the inode, etc), and the code was +written exactly in this spirit! + + + +I can write more at this point, but I guess I am already a bit carried +away with the self compliments :-) + + + + + + + +The superblock + + +This section details the handling of the superblock. + + + +The superblock variables + + +The superblock object is ext2_super_block. The definition is just +taken from the kernel ext2 main include file - /usr/include/linux/ext2_fs.h. + + + +Those lines of source are copyrighted by Remy Card - The author of the +ext2 filesystem, and by Linus Torvalds - The first author of the Linux +operating system. Please cross reference the section Acknowledgments for the +full copyright.
+ + + + + + + +struct ext2_super_block { + __u32 s_inodes_count; /* Inodes count */ + __u32 s_blocks_count; /* Blocks count */ + __u32 s_r_blocks_count; /* Reserved blocks count */ + __u32 s_free_blocks_count; /* Free blocks count */ + __u32 s_free_inodes_count; /* Free inodes count */ + __u32 s_first_data_block; /* First Data Block */ + __u32 s_log_block_size; /* Block size */ + __s32 s_log_frag_size; /* Fragment size */ + __u32 s_blocks_per_group; /* # Blocks per group */ + __u32 s_frags_per_group; /* # Fragments per group */ + __u32 s_inodes_per_group; /* # Inodes per group */ + __u32 s_mtime; /* Mount time */ + __u32 s_wtime; /* Write time */ + __u16 s_mnt_count; /* Mount count */ + __s16 s_max_mnt_count; /* Maximal mount count */ + __u16 s_magic; /* Magic signature */ + __u16 s_state; /* File system state */ + __u16 s_errors; /* Behavior when detecting errors */ + __u16 s_pad; + __u32 s_lastcheck; /* time of last check */ + __u32 s_checkinterval; /* max. time between checks */ + __u32 s_creator_os; /* OS */ + __u32 s_rev_level; /* Revision level */ + __u16 s_def_resuid; /* Default uid for reserved blocks */ + __u16 s_def_resgid; /* Default gid for reserved blocks */ + __u32 s_reserved[0]; /* Padding to the end of the block */ + __u32 s_reserved[1]; /* Padding to the end of the block */ + . + . + . + __u32 s_reserved[234]; /* Padding to the end of the block */ +}; + + + + + +Note that I expanded the array due to my primitive parser +implementation. The various fields are described in the technical +document. + + + + + +The superblock commands + + +This section explains the commands available in the ext2_super_block +type. They all appear in super_com.c + + + +The show command + + +The show command is overridden here in order to provide more +information than just the list of variables. A show command will end +up in calling type_super_block___show. + + + +The first thing that we do is calling the general show command in +order to display the list of variables. 
+ + + +We then add some interpretation to the various lines to make the data +somewhat more intuitive (Expansion of the time variables and the creator +operating system code, for example). + + + +We also display the backup copy number of the superblock in the status +window. This copy number is saved in the super_info global variable - +super_info.copy_num. Currently, this is the only variable there ... +but this type of internal variable saving is typical through my +implementation. + + + + + +The backup copies handling commands + + +The current copy number is available in super_info.copy_num. It +was initialized in the ext2 command super, and is used by the various +superblock routines. + + + +The gocopy routine will pass to another copy of the superblock. The +new device offset will be computed with the aid of the variables in the +file_system_info structure. Then the routine will dispatch to +the setoffset and the show routines. + + + +The setactivecopy routine will just save the current superblock data +in a temporary variable of type ext2_super_block, and will dispatch +gocopy 0 to pass to the main superblock. Then it will place the saved +data in place of the actual data. + + + +The above two commands can be used if the main superblock is corrupted. + + + + + + + + + +The group descriptors + + +The group descriptors handling mechanism allows the user to take a tour in +the group descriptors table, stopping at each point, and examining the +relevant inode table, block allocation map or inode allocation map through +dispatching to the relevant objects. + + + +Some information about the group descriptors is available in the global +variable group_info, which is of type struct_group_info: + + + + + +struct struct_group_info { + unsigned long copy_num; + unsigned long group_num; +}; + + + + + +group_num is the index of the current descriptor in the table. + + + +copy_num is the number of the current backup copy. 
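As an illustration of how these two indices could locate a descriptor on disk, here is a small sketch. It rests on assumptions that the text does not state: that backup copy number N of the table sits at the same relative position inside block group N, and that a descriptor occupies 32 bytes (three 32 bit fields, four 16 bit fields and three reserved 32 bit words). The function name and its parameters are hypothetical; the first three parameters merely echo fields of file_system_info.

```c
#define SKETCH_GROUP_DESC_SIZE 32L       /* assumed on-disk size of one descriptor */

/*
 * Byte offset of descriptor group_num inside backup copy copy_num.
 * Assumption: copy N of the table starts blocks_per_group*block_size*N
 * bytes after the main copy.
 */
long sketch_group_desc_offset (long first_group_desc_offset,long blocks_per_group,
                               long block_size,long copy_num,long group_num)
{
	long copy_start=first_group_desc_offset+copy_num*blocks_per_group*block_size;

	return copy_start+group_num*SKETCH_GROUP_DESC_SIZE;
}
```

For example, with 1024 byte blocks, 8192 blocks per group and the main table at byte 2048, entry 3 of the main copy would sit at byte 2144 under these assumptions.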
+ + + +The group descriptor's variables + + + + +struct ext2_group_desc +{ + __u32 bg_block_bitmap; /* Blocks bitmap block */ + __u32 bg_inode_bitmap; /* Inodes bitmap block */ + __u32 bg_inode_table; /* Inodes table block */ + __u16 bg_free_blocks_count; /* Free blocks count */ + __u16 bg_free_inodes_count; /* Free inodes count */ + __u16 bg_used_dirs_count; /* Directories count */ + __u16 bg_pad; + __u32 bg_reserved[3]; +}; + + + + + +The first three variables are used to provide the links to the +blockbitmap, inodebitmap and inode objects. + + + + + +Movement in the table + + +Movement in the group descriptors table is done using the next, prev and +entry commands. Note that the first two commands override the +general commands of the same name. The next and prev commands just +call the entry function to do the job. I will show next, +for example: + + + + + +void type_ext2_group_desc___next (char *command_line) + +{ + long entry_offset=1; + char *ptr,buffer [80]; + + ptr=parse_word (command_line,buffer); + if (*ptr!=0) { + ptr=parse_word (ptr,buffer); + entry_offset=atol (buffer); + } + + sprintf (buffer,"entry %ld",group_info.group_num+entry_offset); + dispatch (buffer); +} + + +The entry function is also simple - It just calculates the offset +using the information in group_info and in file_system_info, +and uses the usual setoffset / show pair. + + + + + +The show command + + +As usual, the show command is overridden. The implementation is +similar to the superblock's show implementation - We just call the general +show command, and add some information in the status window - The contents of +the group_info structure. + + + + + +Moving between backup copies + + +This is done exactly like the superblock case. Please refer to the explanation +there. + + + + + +Links to the available friends + + +From a group descriptor, one typically wants to reach an inode, or +one of the allocation bitmaps. This is done using the inode, +blockbitmap or inodebitmap commands.
The implementation is again trivial +- Get the necessary information from the group descriptor, initialize the +structures of the next type, and issue the setoffset / settype pair. + + + +For example, here is the implementation of the blockbitmap command: + + + + + +void type_ext2_group_desc___blockbitmap (char *command_line) + +{ + long block_bitmap_offset; + char buffer [80]; + + block_bitmap_info.entry_num=0; + block_bitmap_info.group_num=group_info.group_num; + + block_bitmap_offset=type_data.u.t_ext2_group_desc.bg_block_bitmap; + sprintf (buffer,"setoffset block %ld",block_bitmap_offset);dispatch (buffer); + sprintf (buffer,"settype block_bitmap");dispatch (buffer); +} + + + + + + + + + +The inode table + + +The inode handling enables the user to move in the inode table, edit the +various attributes of the inode, and follow to the next stage - A file or a +directory. + + + +The inode variables + + + + +struct ext2_inode { + __u16 i_mode; /* File mode */ + __u16 i_uid; /* Owner Uid */ + __u32 i_size; /* Size in bytes */ + __u32 i_atime; /* Access time */ + __u32 i_ctime; /* Creation time */ + __u32 i_mtime; /* Modification time */ + __u32 i_dtime; /* Deletion Time */ + __u16 i_gid; /* Group Id */ + __u16 i_links_count; /* Links count */ + __u32 i_blocks; /* Blocks count */ + __u32 i_flags; /* File flags */ + union { + struct { + __u32 l_i_reserved1; + } linux1; + struct { + __u32 h_i_translator; + } hurd1; + } osd1; /* OS dependent 1 */ + __u32 i_block[EXT2_N_BLOCKS]; /* Pointers to blocks */ + __u32 i_version; /* File version (for NFS) */ + __u32 i_file_acl; /* File ACL */ + __u32 i_size_high; /* High 32bits of size */ + __u32 i_faddr; /* Fragment address */ + union { + struct { + __u8 l_i_frag; /* Fragment number */ + __u8 l_i_fsize; /* Fragment size */ + __u16 i_pad1; + __u32 l_i_reserved2[2]; + } linux2; + struct { + __u8 h_i_frag; /* Fragment number */ + __u8 h_i_fsize; /* Fragment size */ + __u16 h_i_mode_high; + __u16 h_i_uid_high; + __u16 h_i_gid_high; 
+ __u32 h_i_author; + } hurd2; + } osd2; /* OS dependent 2 */ +}; + + + + + +The above is the original source code definition. We can see that the inode +supports operating system specific structures. In addition to the +expansion of the arrays, I have "flattened" the inode to support only +the Linux declaration. It seemed that this one occasion of multiple +variable aliases didn't justify the complication of generally supporting +aliases. In any case, the above system specific variables are not used +internally by EXT2ED, and the user is free to change the definition in +ext2.descriptors to accommodate his needs. + + + + + +The handling functions + + +The user interface to movement is the usual next / prev / +entry interface. There is really nothing special in those functions - The +size of the inode is fixed, the total number of inodes is known from the +superblock information, and the current entry can be figured out from the +device offset and the inode table start offset, which is known from the +corresponding group descriptor. Those functions are a bit older than some +other implementations of next and prev, and they do not save +information in a special structure. Rather, they recompute it when +necessary. + + + +The show command is overridden here, and provides a lot of additional +information about the inode - Its type, interpretation of the permissions, +special ext2 attributes (Immutable file, for example), and a lot more. +Again, the general show is called first, and then the additional +information is written. + + + + + +Accessing files and directories + + +From the inode, a file or a directory can typically be reached. +In order to treat a file, for example, its inode needs to be constantly +accessed. To satisfy that need, when editing a file or a directory, the +inode is still saved in memory - type_data is not overwritten.
+Rather, the following takes place: + + + + + + An internal global structure which is used by the types file +and dir handling functions is initialized by calling the +appropriate function. + + + + + + The type is changed accordingly. + + + + + +The result is that a settype ext2_inode is the only action necessary +to return to the inode - We actually never left it. + + + +Follows the implementation of the inode's file command: + + + + + +void type_ext2_inode___file (char *command_line) + +{ + char buffer [80]; + + if (!S_ISREG (type_data.u.t_ext2_inode.i_mode)) { + wprintw (command_win,"Error - Inode type is not file\n"); + refresh_command_win (); return; + } + + if (!init_file_info ()) { + wprintw (command_win,"Error - Unable to show file\n"); + refresh_command_win ();return; + } + + sprintf (buffer,"settype file");dispatch (buffer); +} + + + + + +As we can see - We just call init_file_info to get the necessary +information from the inode, and set the type to file. The next call +to show, will dispatch to the file's show implementation. + + + + + + + +Viewing a file + + +There isn't an ext2 kernel structure which corresponds to a file - A file is +just a series of blocks which are determined by its inode. 
As explained in +the last section, the inode is never actually left - The type is changed to +file - A type which contains no variables, and a special structure is +initialized: + + + + + +struct struct_file_info { + + struct ext2_inodes *inode_ptr; + + long inode_offset; + long global_block_num,global_block_offset; + long block_num,blocks_count; + long file_offset,file_length; + long level; + unsigned char buffer [EXT2_MAX_BLOCK_SIZE]; + long offset_in_block; + + int display; + /* The following is used if the file is a directory */ + + long dir_entry_num,dir_entries_count; + long dir_entry_offset; +}; + + + + + +The inode_ptr will just point to the inode in type_data, which +is not overwritten while the user is editing the file, as the +setoffset command is not internally used. The buffer +will contain the current viewed block of the file. The other variables +contain information about the current place in the file. For example, +global_block_num just contains the current block number. + + + +The general idea is that the above data structure will provide the file +handling functions all the accurate information which is needed to accomplish +their task. + + + +The global structure of the above type, file_info, is initialized by +init_file_info in file_com.c, which is called by the +type_ext2_inode___file function when the user requests to watch the +file. It is updated as necessary to provide accurate information as long as +the file is edited. + + + +Returning to the file's inode + + +Concerning the method I used to handle files, the above task is trivial: + + +void type_file___inode (char *command_line) + +{ + dispatch ("settype ext2_inode"); +} + + + + + + + +File movement + + +EXT2ED keeps track of the current position in the file. Movement inside the +current block is done using next, prev and offset - They just change +file_info.offset_in_block. + + + +Movement between blocks is done using nextblock, prevblock and block. 
+To accomplish this, the direct blocks, indirect blocks, etc, need to be +traced. This is done by file_block_to_global_block, which accepts a +file's internal block number, and converts it to the actual filesystem block +number. + + + + + +long file_block_to_global_block (long file_block,struct struct_file_info *file_info_ptr) + +{ + long last_direct,last_indirect,last_dindirect; + long f_indirect,s_indirect; + + last_direct=EXT2_NDIR_BLOCKS-1; + last_indirect=last_direct+file_system_info.block_size/4; + last_dindirect=last_indirect+(file_system_info.block_size/4) \ + *(file_system_info.block_size/4); + + if (file_block <= last_direct) { + file_info_ptr->level=0; + return (file_info_ptr->inode_ptr->i_block [file_block]); + } + + if (file_block <= last_indirect) { + file_info_ptr->level=1; + file_block=file_block-last_direct-1; + return (return_indirect (file_info_ptr->inode_ptr-> \ + i_block [EXT2_IND_BLOCK],file_block)); + } + + if (file_block <= last_dindirect) { + file_info_ptr->level=2; + file_block=file_block-last_indirect-1; + return (return_dindirect (file_info_ptr->inode_ptr-> \ + i_block [EXT2_DIND_BLOCK],file_block)); + } + + file_info_ptr->level=3; + file_block=file_block-last_dindirect-1; + return (return_tindirect (file_info_ptr->inode_ptr-> \ + i_block [EXT2_TIND_BLOCK],file_block)); +} + + +last_direct, last_indirect, etc, contain the last internal block number +which is accessed by each method - If the requested block is not greater than +last_direct, for example, it is a direct block. + + + +If the block is a direct block, its number is just taken from the inode. +A non-direct block is handled by return_indirect, return_dindirect and +return_tindirect, which correspond to indirect, double-indirect and +triple-indirect. Each of the above functions is constructed using the lower +level functions. 
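To make the ranges concrete, here is a minimal sketch which only reports the level of indirection for a given file block, using the same bounds as file_block_to_global_block. The names block_level and NDIR_BLOCKS are invented for this illustration and do not appear in EXT2ED; the value 12 is that of EXT2_NDIR_BLOCKS, and block numbers are assumed to be 4 bytes wide, as in the code above.

```c
/* Hypothetical helper, not part of EXT2ED: returns 0 for a direct block,
   1, 2 or 3 for an indirect, double-indirect or triple-indirect block. */

#define NDIR_BLOCKS 12L	/* same value as EXT2_NDIR_BLOCKS */

int block_level (long file_block,long block_size)

{
	long ptrs=block_size/4;	/* block numbers held by one block */
	long last_direct=NDIR_BLOCKS-1;
	long last_indirect=last_direct+ptrs;
	long last_dindirect=last_indirect+ptrs*ptrs;

	if (file_block <= last_direct) return (0);
	if (file_block <= last_indirect) return (1);
	if (file_block <= last_dindirect) return (2);
	return (3);
}
```

With 1024 byte blocks, file blocks 0 to 11 are direct, 12 to 267 are indirect, 268 to 65803 are double-indirect, and anything above is triple-indirect. The block numbers behind levels 1 to 3 are then looked up by the return_* functions mentioned above.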
For example, return_dindirect is constructed as +follows: + + + + + +long return_dindirect (long table_block,long block_num) + +{ + long f_indirect; + + f_indirect=block_num/(file_system_info.block_size/4); + f_indirect=return_indirect (table_block,f_indirect); + return (return_indirect (f_indirect,block_num%(file_system_info.block_size/4))); +} + + + + + + + +Object memory + + +The remember command is overridden here and in the dir type - +We just remember the inode of the file. It is just simpler to implement, and +doesn't seem like a big limitation. + + + + + +Changing data + + +The set command is overridden, and provides the same functionality +as the general set command with no type declared. The +writedata command is overridden so that we'll write the edited block +(file_info.buffer) and not type_data (Which contains the inode). + + + + + + + +Directories + + +A directory is just a file which is formatted according to a special format. +As such, EXT2ED handles directories and files in much the same way. Specifically, the +same variable of type struct_file_info which is used in the +file, is used here. + + + +The dir type uses all the variables in the above structure, as +opposed to the file type, which didn't use the last ones. + + + +The search_dir_entries function + + +The entire situation is similar to that which was described in the +file type, with one main change: + + + +The main function in dir_com.c is search_dir_entries. This +function will "run" over all the entries in the directory, and will +call a client's function each time. The client's function is supplied as an +argument, and will check the current entry for a match, based on its own +criterion. It will then signal search_dir_entries whether to +ABORT the search, whether it FOUND the entry it was looking +for, or that the entry is still not found, and we should CONTINUE +searching. 
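The callback scheme can be sketched on a toy array of names. Everything in this sketch (toy_entry, toy_search, wanted_name, action_match, action_keep_going, and the numeric values of the three status codes) is invented for the illustration; it only mimics the control flow described above and works on an in-memory array rather than on directory blocks.

```c
#include <string.h>

enum { ABORT,CONTINUE,FOUND };	/* illustrative values only */

struct toy_entry { const char *name; };

/* Calls action on each entry in turn and stops as soon as action returns
   something other than CONTINUE. Returns the index at which the scan
   stopped; if every entry was visited, returns the last index and sets
   *status to ABORT. */
long toy_search (const struct toy_entry *entries,long count,
		 int (*action) (const struct toy_entry *entry),int *status)

{
	long i;

	for (i=0;i<count;i++) {
		*status=action (&entries [i]);
		if (*status!=CONTINUE) return (i);
	}

	*status=ABORT;
	return (count-1);
}

/* The client keeps its search criterion in a global variable. */
static const char *wanted_name;

int action_match (const struct toy_entry *entry)

{
	return (strcmp (entry->name,wanted_name)==0 ? FOUND : CONTINUE);
}

int action_keep_going (const struct toy_entry *entry)

{
	(void) entry;	/* every entry is skipped */
	return (CONTINUE);
}
```

The search loop knows nothing about the criterion; each client only decides, entry by entry, whether the scan should go on. The real interface in EXT2ED passes a struct_file_info around instead of an array index; its declaration is quoted next.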
Follows the declaration: + + +struct struct_file_info search_dir_entries \ + (int (*action) (struct struct_file_info *info),int *status) + +/* + This routine runs on all directory entries in the current directory. + For each entry, action is called. The return code of action is one of + the following: + + ABORT - Current dir entry is returned. + CONTINUE - Continue searching. + FOUND - Current dir entry is returned. + + If the last entry is reached, it is returned, along with an ABORT status. + + status is updated to the returned code of action. +*/ + + + + + +With the above tool in hand, many operations are simple to perform - Here is +the way I counted the entries in the current directory: + + + + + +long count_dir_entries (void) + +{ + int status; + + return (search_dir_entries (&action_count,&status).dir_entry_num); +} + +int action_count (struct struct_file_info *info) + +{ + return (CONTINUE); +} + + +It will just CONTINUE until the last entry. The returned structure +(of type struct_file_info) will have its number in the +dir_entry_num field, and this is exactly the required number! + + + + + +The cd command + + +The cd command accepts a relative path, and moves there ... +The implementation is of course a bit more complicated: + + + + + + The path is checked to see whether it is an absolute path (from /). +If it is, we let the general cd do the job by calling +type_ext2___cd directly. + + + + + + The path is divided into the nearest path and the rest of the path. +For example, cd 1/2/3/4 is divided into 1 and into +2/3/4. + + + + + + It is the first part of the path that we need to search for in the +current directory. We search for it using search_dir_entries, +which accepts the action_name function as the user defined +function. + + + + + + search_dir_entries will scan all the entries and will call +our action_name function for each entry. 
In +action_name, the required name will be checked against the +name of the current entry, and FOUND will be returned when a +match occurs. + + + + + + If the required entry is found, we dispatch a remember +command to insert the current inode into the object memory. +This is required to easily support symbolic links - If we +find later that the inode pointed to by the entry is actually a +symbolic link, we'll need to return to this point, and the above +inode doesn't have (and can't have, because of hard links) the +information necessary to "move back". + + + + + + We then dispatch a followinode command to reach the inode +pointed to by the required entry. This command will automatically +change the type to ext2_inode - We are now at an inode, and +all the inode commands are available. + + + + + + We check the inode's type to see if it is a directory. If it is, we +dispatch a dir command to "enter the directory", and +recursively call ourselves (The type is dir again) by +dispatching a cd command, with the rest of the path as an +argument. + + + + + + If the inode's type is a symbolic link (only fast symbolic links are +implemented so far; I guess this is the typical case), we note +the path it is pointing at, the saved inode is recalled, we dispatch +dir to get back to the original directory, and we call +ourselves again with the link path/rest of the path argument. + + + + + + In any other case, we just stop at the resulting inode. + + + + + + + + + + + + +The block and inode allocation bitmaps + + +The block allocation bitmap is reached from the corresponding group descriptor. +The group descriptor handling functions will save the necessary information +into a structure of the struct_block_bitmap_info type: + + + + + +struct struct_block_bitmap_info { + unsigned long entry_num; + unsigned long group_num; +}; + + + + + +The show command is overridden, and will show the block as a series of +bits, each bit corresponding to a block. 
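Finding the bit which belongs to a given entry comes down to a byte index and a bit mask. The following is only a sketch under the assumption that the bitmap is held as an array of bytes and that bit 0 is the least significant bit of byte 0, which is how ext2 lays out its bitmaps; the function names are invented here and are not the ones used in EXT2ED.

```c
/* Hypothetical helpers, not taken from the EXT2ED sources. */

/* Returns 1 if the entry is marked as allocated, 0 otherwise. */
int bitmap_test (const unsigned char *bitmap,unsigned long entry_num)

{
	return ((bitmap [entry_num/8] >> (entry_num%8)) & 1);
}

/* Sets the bit of the given entry. */
void bitmap_allocate (unsigned char *bitmap,unsigned long entry_num)

{
	bitmap [entry_num/8] |= (unsigned char) (1u << (entry_num%8));
}

/* Clears the bit of the given entry. */
void bitmap_deallocate (unsigned char *bitmap,unsigned long entry_num)

{
	bitmap [entry_num/8] &= (unsigned char) ~(1u << (entry_num%8));
}
```

Entry 9, for example, lives in byte 1 under the mask 0x02.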
The main variable is the +entry_num variable, declared above, which is just the current block +number in this block group. The current entry is highlighted, and the +next, prev and entry commands just change the above variable. + + + +The allocate and deallocate commands change the specified bits. Nothing +special about them - They just contain code which converts between bit and +byte locations. + + + +The inode allocation bitmap is treated in much the same fashion, with +the same commands available. + + + + + +Filesystem size limitation + + +While an ext2 filesystem has a size limit of 4 TB, EXT2ED currently +can't handle filesystems which are bigger than 2 GB. + + + +This limitation results from my usage of 32 bit long variables and +of the fseek library call, whose long offset can't address positions up to 4 TB. + + + +By looking in the ext2 library source code by Theodore Ts'o, +I discovered the llseek system call which can seek to a +64 bit unsigned long long offset. Correcting the situation is not +difficult in concept - I need to change long into unsigned long long where +appropriate and modify disk.c to use the llseek system call. + + + +However, fixing the above limitation involves making changes in many places +in the code and will obviously make the entire code less stable. For that +reason, I chose to release EXT2ED as it is now and to postpone the above fix +to the next release. + + + + + +Conclusion + + +Had I known in advance the structure of the ext2 filesystem, I feel that +the resulting design would have been quite different from the presented +design above. + + + +EXT2ED now has two levels of abstraction - A general filesystem, and an +ext2 filesystem, and the surface is more or less prepared for additions +of other filesystems. Had I approached the design in the "engineering" way, +I guess that the first level above would not have existed. + + + + + +Copyright + + +EXT2ED is Copyright (C) 1995 Gadi Oxman. + + + +EXT2ED is hereby placed under the GPL - GNU General Public License. 
You are free and +welcome to copy, view and modify the sources. My only wish is that my +copyright presented above will be left and that a list of the bug fixes, +added features, etc, will be provided. + + + +The entire EXT2ED project is based, of course, on the kernel sources. The +ext2.descriptors distributed with EXT2ED is a slightly modified +version of the main ext2 include file, /usr/include/linux/ext2_fs.h. Follows +the original copyright: + + + + + +/* + * linux/include/linux/ext2_fs.h + * + * Copyright (C) 1992, 1993, 1994, 1995 + * Remy Card (card@masi.ibp.fr) + * Laboratoire MASI - Institut Blaise Pascal + * Universite Pierre et Marie Curie (Paris VI) + * + * from + * + * linux/include/linux/minix_fs.h + * + * Copyright (C) 1991, 1992 Linus Torvalds + */ + + + + + + + + +Acknowledgments + + +EXT2ED was constructed as a student project in the software +laboratory of the faculty of electrical-engineering in the +Technion - Israel's institute of technology. + + + +First, I would like to thank Avner Lottem and Doctor Ilana +David for their interest and assistance in this project. + + + +I would also like to thank the following people, who were involved in the +design and implementation of the ext2 filesystem kernel code and support +utilities: + + + + + + Remy Card + +Who designed, implemented and maintains the ext2 filesystem kernel +code, and some of the ext2 utilities. Remy Card is also the +author of several helpful slides concerning the ext2 filesystem. +Specifically, he is the author of File Management in the Linux +Kernel and of The Second Extended File System - Current +State, Future Development. + + + + + + + Wayne Davison + +Who designed the ext2 filesystem. + + + + + + Stephen Tweedie + +Who helped design the ext2 filesystem kernel code and wrote the +slides Optimizations in File Systems. 
+ + + + + + Theodore Ts'o + +Who is the author of several ext2 utilities and of the ext2 library +libext2fs (which I didn't use, simply because I didn't know +it existed when I started to work on my project). + + + + + + + + +Lastly, I would like to thank, of course, Linus Torvalds and the +Linux community for providing all of us with such a great operating +system. + + + +Please contact me with bug reports, suggestions, or just about +anything concerning EXT2ED. + + + +Enjoy, + + + +Gadi Oxman <tgud@tochnapc2.technion.ac.il> + + + +Haifa, August 95 + + + + +