The JPEG Sheriff

Introduction (initial draft)

  1. introduction

    The JPEG Sheriff, as the name hopefully indicates, is intended to preserve order, specifically order among JPEG files. To that end it has two functions. The first is generating verification lists. The second is reading those lists and verifying the files stored on one's disk for both completeness and correctness.

  2. preparation

    After downloading the Sheriff you can extract it into any directory. Having done so you are ready to run it. The Sheriff does not need any elaborate install procedure. Nor does it need any DLLs, OCXs, run-time libraries or any other auxiliary program. Neither does it do anything to your system. It does not modify the registry, nor does it create or update INI files or anything of the kind. It is self-contained.

    That having been said, we will now start Notepad. So much for the hype. For this walkthrough we will need Notepad occasionally to view the results of certain actions. Another program we will need is WinFile or the Explorer, both to drag files from and to move, copy and rename test files. Once these preparations are complete you are ready to run the Sheriff and have it perform its first task.

  3. generating a list

    To keep things simple the Sheriff uses (old style) drag and drop. Simply put, this means you should drag a file from the directory the files are stored in onto the Sheriff window and drop it. The Sheriff will retrieve the complete path, strip the filename from it and place the directory path into the listbox just below the Open menu. At this point you can select the Perform filesearch action from the Open menu. The Sheriff will then search out all files whose name matches the *.JPG* pattern and list those in the grid below the directory listbox.
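
    The two steps above (strip the filename, then search for *.JPG*) can be sketched roughly as follows. This is only an illustration in Python, not the Sheriff's own code: the function names directory_of and find_jpeg_candidates are made up for this walkthrough, and the case-insensitive match is an assumption based on how Windows treats filenames.

```python
import fnmatch
import os


def directory_of(dropped_path):
    """Strip the filename from a dropped file's full path (hypothetical helper)."""
    return os.path.dirname(os.path.abspath(dropped_path))


def find_jpeg_candidates(directory, pattern="*.JPG*"):
    """Return the sorted names of regular files in `directory` matching `pattern`.

    Assumption: the match ignores case, so both names are upper-cased first.
    """
    names = []
    for entry in os.scandir(directory):
        if entry.is_file() and fnmatch.fnmatchcase(entry.name.upper(), pattern.upper()):
            names.append(entry.name)
    return sorted(names)
```

    Under these assumptions a file called 'verynice.jpg.old' would still be found, while 'verynice.tst' would not.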

    For each file whose name matches the pattern it will first calculate three check figures: a 32-bit CRC, a 32-bit checksum and a 32-bit checksum over the first 32 KB of the file. Only then will it try to interpret the file as a JFIF file (JPEG File Interchange Format). If successful it retrieves the more interesting parameters of the JPEG image stored in it.
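
    To give an idea of what the three check figures involve, here is a minimal sketch in Python. It rests on assumptions this text does not confirm: that the CRC is the common CRC-32 as computed by zlib, and that a "checksum" is the plain sum of all byte values modulo 2^32. The name check_figures is invented for the example.

```python
import zlib

HEAD_BYTES = 32 * 1024  # the "first 32 KB" window


def check_figures(path, chunk_size=64 * 1024):
    """Return (crc32, checksum, head_checksum) for the file at `path`.

    Assumptions: CRC-32 as in zlib; checksum = sum of byte values mod 2**32.
    """
    crc = 0
    total = 0
    head_total = 0
    seen = 0  # number of bytes processed so far
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
            total = (total + sum(chunk)) & 0xFFFFFFFF
            if seen < HEAD_BYTES:
                # Only the part of this chunk inside the first 32 KB counts.
                head_total = (head_total + sum(chunk[:HEAD_BYTES - seen])) & 0xFFFFFFFF
            seen += len(chunk)
    return crc & 0xFFFFFFFF, total, head_total
```

    Reading the file in chunks keeps memory use flat for large images, and all three figures come out of a single pass over the data.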

    When it has examined all files it will list them in the grid. Initially it will display the filename, the dimensions of the image, the filesize and the CRC (Cyclic Redundancy Check). The error column will list '-' to indicate it has not been checked. The comment or description column will be empty. To keep the window size small it will not show the density the image was scanned at. To view the density you can put the mouse cursor over a filename when the window is the active window and a box will pop up telling you what was stored in the JFIF file.
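
    For the curious, the dimensions and the density can be dug out of a JFIF file along the following lines. This is a simplified sketch and not the Sheriff's parser: read_jfif_info is a hypothetical name, only the JFIF APP0 segment and the first start-of-frame segment are examined, and the fallback values (0 x 0 and a density of 1 x 1 unknown units) are chosen to match what this walkthrough reports for a file that is not a JPEG.

```python
import struct

# Start-of-frame markers; 0xC4, 0xC8 and 0xCC are not frame headers.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}
UNITS = {0: "unknown units", 1: "dots per inch", 2: "dots per cm"}


def read_jfif_info(data):
    """Return width, height, density and units found in JPEG bytes."""
    info = {"width": 0, "height": 0, "density": (1, 1), "units": "unknown units"}
    if data[:2] != b"\xff\xd8":          # no SOI marker: not a JPEG
        return info
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xFF:               # fill byte before a marker
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            pos += 2                     # standalone marker, no length field
            continue
        if marker in (0xD9, 0xDA):       # end of image or start of scan
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:                   # corrupt segment length
            break
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE0 and payload[:5] == b"JFIF\x00" and len(payload) >= 12:
            # identifier (5), version (2), then units, Xdensity, Ydensity
            units, xdens, ydens = struct.unpack(">BHH", payload[7:12])
            info["density"] = (xdens, ydens)
            info["units"] = UNITS.get(units, "unknown units")
        elif marker in SOF_MARKERS and len(payload) >= 5:
            # precision (1), then height and width
            height, width = struct.unpack(">HH", payload[1:5])
            info["width"], info["height"] = width, height
            break
        pos += 2 + length
    return info
```

    Calling read_jfif_info(open(name, "rb").read()) on a file that does not start with the JPEG start-of-image marker simply returns the fallback values.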

    It has also filled in the boxes in the status bar. Since we have not actually done anything yet it will only list the number of files in the grid. This count will also appear in the box marked 'other'. There are as of now no files in a verification list (the box marked 'list'), since we have not read a list yet. Neither are there any bad files or files that are not on the list. The box marked 'miss' stands for missing files.

    Since we are interested in generating a list we will do so. For this simply select CRC checklist from the Generate menu. A Save As window will pop up showing the directory you started the program from. For now we will simply provide test as the filename and press Save. You must of course have write access to that directory; if not, you can change to a directory you do have write access to.

    That is all it takes to generate a verification list. In the directory you will find a file called 'test.CRC' which you can view in Notepad or type out at a DOS prompt.

  4. using a list

    To keep things simple we will use the list we just generated. Since it is a full Sheriff list we need only to select Read CRC list from the Open menu. The window that appears will now be called 'Select a CRC-32 list' to indicate the kind of list it will read. It will again show the directory the Sheriff was started from and should also list the generated 'test.CRC' file. You can read it by double clicking on that name.

    Several things will now have occurred. The title of the Sheriff's window will show the filename of the list as well as the type of checklist (CRC). Also the boxes in the status bar will have been updated. At this point there is a list so the count of recognized files in the list will be displayed. The box for bad files will still show 0 and the boxes for missing and other will show the same number.

    That is nice, but not enough. We want to know if all the files in our directory are present and accounted for. In order to verify this select Mark bad files from the Marking menu. Again things have happened. First the error column in the grid will now be empty, indicating that no file has an error in it. In the status bar the boxes marked bad, miss(ing) and other will all show a zero, indicating that no bad files were found, nor were any found missing. In short, all is well and our collection is both complete and correct.

    So that part of it works. But now for some action. To keep things simple, indeed a word I like, we will just rename one of the files in the directory. Let us say that the file named 'verynice.jpg', for example, will be renamed to 'verynice.tst'. This way we can easily put things back the way they were. Since the files have been changed the first thing to do is to select 'Perform filesearch' from the Open menu.

    As is usual, things have changed. The first thing you should notice is that the file now called 'verynice.tst' is no longer listed. True, it is still a valid JFIF file but the Sheriff will only recognize files whose name contains '.jpg'. There may be arbitrary characters both before and after it, but it does insist on finding '.jpg' somewhere in the name. Well, actually this is a lie but let us call it a white lie and proceed. Another thing that has changed is that the box marked count will show a number one less than before and also one less than the box marked list shows. The box marked bad will show 0 and the box marked other will show the count. Other than that the error column will once again be filled with '-' to indicate the files have not been verified. Yet.

    So let us do so. Even though we have already read a list we will do so again just for good measure and to get some practice. Once done we only need to select 'Mark bad files' from the Marking menu. Again the error column will be empty. But this time the box marked miss will show a one. Thus the Sheriff is of the opinion that we are missing one file. Good, in this case. Let's see which one it is.

    We simply select Wanted list to clipboard from the Generate menu. In order to display the list we will fire up Notepad again and select Paste from the Edit menu, though Ctrl-V or Shift-Ins will work too. And presto, as if by magic the list will appear. Since we are still working with the default settings the list will show the filename, filesize, CRC-32 and description columns. The latter will of course be empty since we have not provided any descriptions yet.

    There is just one line listing the imaginary or actual 'verynice.jpg' file followed by the filesize and CRC as stored, by ourselves, in the test.CRC list. The list itself will be followed by the count of files, to wit one, as well as the total of filesizes listed. The nice thing about having this list on the clipboard is that it also can be pasted in mail messages and other things capable of handling text information on the clipboard.
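
    As an illustration of what such a report amounts to, the sketch below formats a wanted list with the four default columns followed by the file count and the total of the filesizes. The exact layout the Sheriff produces is not specified here, so the column widths, the wording of the summary line and the name wanted_list_text are all assumptions. Putting the text on the clipboard is left out.

```python
def wanted_list_text(wanted):
    """Format (name, size, crc, description) tuples as a plain-text report.

    Hypothetical layout: one line per file, then a summary line.
    """
    lines = []
    for name, size, crc, description in wanted:
        # CRC shown as eight hexadecimal digits; an empty description adds nothing.
        lines.append(f"{name:<20} {size:>10} {crc:08X} {description}".rstrip())
    total = sum(size for _name, size, _crc, _description in wanted)
    lines.append(f"{len(wanted)} file(s), {total} bytes")
    return "\n".join(lines)
```

    Calling wanted_list_text([("verynice.jpg", 12345, 0xCBF43926, "")]) yields one line for the file and one summary line; the size and CRC used here are made-up values.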

  5. action

    In order to demonstrate the bad file detection it is probably best to (simply) copy the file we generated as checklist to the directory the JPEG files are stored in. Since we want a bad file found we rename it from 'test.CRC' to 'verynice.jpg'. We also, as Hamp was kind enough to point out, need to select 'Perform filesearch' from the Open menu once more since again things have changed on the disk. We then select 'Mark bad files' from the Marking menu.

    Now let us look at what it shows. First of all we notice that the dimension is 0 x 0. Quite correct since a verification list is most certainly not a JPEG image. We can also see that both filesize and CRC are known, so it does indeed generate CRCs for non-JPEG images. That information might come in handy if you want to verify non-JPEG files. Also the error column will now show 'size' for the undoubtedly bad file, unless you were fortunate enough to have picked a JPEG file of exactly the same size as the list we generated, in which case the error column will show 'CRC' to indicate that the calculated CRC does not match the one stated in the verification list. If you put the mouse cursor over the filename the box will show a density of 1 x 1 unknown units. This happens to be the JFIF default to which this information always defaults.

    Finally, though the count box will again be equal to the list count, the box marked bad will now show one, as will the box marked miss. The box marked other will show zero. We can of course generate a list of bad files to the clipboard by selecting that option from the Generate menu. If we then paste the list into Notepad we will get the same list as before. This is because the file that previously was missing is now present but found to be incorrect.
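
    The marking logic as observed in this walkthrough can be summarised in a short sketch. It is a reconstruction from the behaviour described above and not the Sheriff's actual code: the size is compared first and the CRC only when the sizes agree, a file counts as missing as long as no correct copy of it is on disk, and names are compared without regard to case (an assumption). The name mark_bad_files and the dictionary layout are invented for the example.

```python
def mark_bad_files(on_disk, in_list):
    """Compare {name: (size, crc)} found on disk with the same mapping from a list.

    Returns (errors, wanted, counts): the error-column text per listed file on
    disk, the listed names lacking a correct copy, and the status bar counts.
    """
    listed = {name.lower(): figures for name, figures in in_list.items()}
    errors = {}
    other = []          # files on disk that the list does not mention
    satisfied = set()   # listed names for which a correct file was found
    for name, (size, crc) in on_disk.items():
        key = name.lower()
        if key not in listed:
            other.append(name)
            continue
        want_size, want_crc = listed[key]
        if size != want_size:
            errors[name] = "size"
        elif crc != want_crc:
            errors[name] = "CRC"
        else:
            errors[name] = ""
            satisfied.add(key)
    wanted = sorted(name for name in in_list if name.lower() not in satisfied)
    counts = {
        "count": len(on_disk),
        "list": len(listed),
        "bad": sum(1 for text in errors.values() if text),
        "miss": len(wanted),
        "other": len(other),
    }
    return errors, wanted, counts
```

    With the fake 'verynice.jpg' in place this gives bad = 1, miss = 1 and other = 0, which is exactly the situation described above.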

    Now for the fun part. Since we are not interested in bad JPEGs we decide to take action. First we select what punishment to administer. Let us take drastic action. The default action is to 'Rename marked to *.BAD*' which would only rename verynice.jpg to verynice.BAD. Not enough. The second action is to 'Add '.BAD' to marked names'. This would result in a file named verynice.jpg.BAD. Again, not enough, we want blood. So we select the last option 'Delete marked bad files'. Yes, that is more like it. At this point we have laid down the law the JPEG Sheriff is to uphold. So we let it do its thing. We select 'Execute selected action on marked files'. To give us a chance to reconsider our stated position towards the criminal image in question, the Sheriff now shows a Confirmation box and asks us to either select Ok or to cop out using Cancel. Since we are adamant, it being a copy anyway, we give the Ok signal. The Sheriff, resigned to the facts of life, goes out and upholds the law.
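
    The three punishments and the confirmation step boil down to something like the sketch below. The action keywords "rename", "append" and "delete", the function names and the confirm callback are all made up for illustration. How the Sheriff treats a name with more than one dot under the first action is not stated here; this sketch replaces only the last extension.

```python
import os


def target_name(name, action):
    """Return the new name for a marked file, or None when the action deletes it."""
    if action == "rename":      # 'Rename marked to *.BAD*'
        stem, _extension = os.path.splitext(name)
        return stem + ".BAD"
    if action == "append":      # 'Add '.BAD' to marked names'
        return name + ".BAD"
    if action == "delete":      # 'Delete marked bad files'
        return None
    raise ValueError(f"unknown action: {action!r}")


def execute_action(directory, marked, action, confirm):
    """Apply `action` to every marked file, but only if `confirm()` returns True."""
    if not marked or not confirm():
        return []               # nothing marked, or the user chose Cancel
    done = []
    for name in marked:
        path = os.path.join(directory, name)
        new_name = target_name(name, action)
        if new_name is None:
            os.remove(path)
        else:
            os.rename(path, os.path.join(directory, new_name))
        done.append((name, new_name))
    return done
```

    Here target_name("verynice.jpg", "rename") gives 'verynice.BAD' and target_name("verynice.jpg", "append") gives 'verynice.jpg.BAD', matching the two milder options above.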

    To verify that the job was done to our satisfaction we will list the directory, where we will find, as we had expected, that the file is no more. It has indeed been eradicated. But what has not happened is the update of the list and counts. If we want to effectuate that as well, we will need to select 'Perform filesearch' from the Open menu once more. In the words of the Sheriff: "executing the execution is bad enough, but filling out the paperwork... Well, you just have to draw the line somewhere!"

    Well ... actually that is by now another white lie, but I liked that line too much to take it out.