ZenRecover
A friend of mine once ended up with a corrupt hard disk in his Creative Zen™ mp3 player, to the point of not being able to boot the device anymore. This particular family of mp3 players does not work as standard USB mass storage devices and uses a proprietary filesystem, so automated file recovery using standard tools was not an option.
With the help of quetzalcoatl, who had already analyzed the filesystem layout at great length, I wrote a Python script to walk the filesystem structure and recover the user's files and directories. To use it, one needs to access the actual filesystem, which usually means opening the broken mp3 player and connecting the internal hard disk to a notebook or other IDE interface. This filesystem is known to be used in the Creative Nomad™ JukeBox 3, Creative Zen™, and possibly other mp3 players of the same brand.
Usage
    python zenrecover.py
    Usage: zenrecover.py [-o OFFSET] DISK_OR_IMAGE SECTION OUTPUT_DIR

      DISK_OR_IMAGE is the disk containing the filesystem, or an image thereof
      OFFSET is the offset at which the filesystem starts (in bytes, default 20M)
      SECTION is the section of the filesystem to recover: "archives" or "songs"
      OUTPUT_DIR is the directory in which to place the recovered files
That's it. If the filesystem is not severely damaged, the tool should be able to recover all of your files.
Otherwise you will need to understand the filesystem layout explained below, examine the disk with a hex editor, and possibly do some tracing and debugging of ZenRecover itself, to figure out what's going wrong with the automated process. What I'm saying is that the tool will only work on a coherent or slightly damaged filesystem; otherwise your best bet is to take it as a starting point.
Download
zenrecover.py (size: 7.9K; license: GPL; last updated on 8 Oct 2008)
LRU.py (size: 2.3K; license: GPL; last updated on 8 Oct 2008)
Library written by Josiah Carlson, used by ZenRecover.
Filesystem layout
The hard disk of the aforementioned mp3 players is partitioned into two distinct filesystems: one for the operating system, formatted in MiniFS, and one for the user data, formatted in CFS. ZenRecover only understands CFS. CFS usually begins at offset 20MiB from the start of the disk, so this is the default offset used by ZenRecover. If the tool fails to find the root inode on your disk, the filesystem might be placed at a different offset.
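If the default offset does not work, one possible way to guess the right one is to scan an image for the inode magic number and use the inode's self-reference field (both described in the inode table further down this page) to work backwards to the start of CFS. The following is only a sketch in Python 3, not how ZenRecover itself works; the function name `guess_cfs_starts` is made up for illustration, and it assumes that inodes sit on sector boundaries and that the whole image fits in memory (or is memory-mapped).

```python
import struct

INODE_MAGIC = bytes.fromhex("be3bd90a")  # inode magic as it appears on disk
SECTOR_SIZE = 512
CLUSTER_SIZE = 16 * SECTOR_SIZE          # 0x2000 bytes


def guess_cfs_starts(image):
    """Yield (inode_position, guessed_cfs_start) for each plausible inode.

    `image` is a bytes-like object with a find() method (bytes or mmap).
    """
    pos = image.find(INODE_MAGIC)
    while pos != -1:
        if pos % SECTOR_SIZE == 0 and pos + 8 <= len(image):
            # The self-reference at offset 4 is a PDP-endian int32:
            # two little-endian 16-bit words, high word first.
            high, low = struct.unpack_from("<HH", image, pos + 4)
            cluster = (high << 16) | low
            # Invert offset = (cluster + 1) * 0x2000 to find where CFS begins.
            start = pos - (cluster + 1) * CLUSTER_SIZE
            if start >= 0:
                yield pos, start
        pos = image.find(INODE_MAGIC, pos + 1)
```

On a real disk, many inodes should agree on the same start value, so the most frequent guess is the one worth passing to `-o`. Stray hits in file data will produce outliers.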
CFS works like a traditional UNIX inode-based filesystem, such as Linux ext2. It appears to be based on Dominic Giampaolo's BFS.
CFS, as found on said mp3 players, assumes a disk sector of 512B. The fundamental unit in CFS is the cluster, made of 16 contiguous sectors, or 0x2000 bytes. As already said, CFS usually starts at sector 0xA000.
The first cluster is numbered -1 (take note of this!) and is filled with 0xFF. Next comes cluster 0, filled with 0x00. Here is the relationship between cluster number and offset from the start of CFS (not counting the first 20MiB of the disk), most useful when examining the filesystem in a hex editor:
    offset = (cluster + 1) × 0x2000
    cluster = offset / 0x2000 − 1
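The same arithmetic as a small Python 3 sketch, handy for jumping around in a hex editor. The function and constant names are mine, not taken from ZenRecover.

```python
SECTOR_SIZE = 512                       # bytes per sector, as CFS assumes
CLUSTER_SIZE = 16 * SECTOR_SIZE         # 0x2000 bytes
DEFAULT_CFS_START = 20 * 1024 * 1024    # 20 MiB, i.e. sector 0xA000


def cluster_to_offset(cluster):
    """Byte offset of a cluster, relative to the start of CFS."""
    return (cluster + 1) * CLUSTER_SIZE


def offset_to_cluster(offset):
    """Number of the cluster that contains a CFS-relative byte offset."""
    return offset // CLUSTER_SIZE - 1


def cluster_to_disk_offset(cluster, cfs_start=DEFAULT_CFS_START):
    """Absolute byte offset of a cluster on the disk or image."""
    return cfs_start + cluster_to_offset(cluster)
```

For example, `cluster_to_offset(-1)` is 0 (the 0xFF-filled cluster) and `cluster_to_offset(1)` is 0x4000.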
Cluster 1 (the third cluster from the start of CFS) should contain some volume information, probably including the location of the root inode. The cluster usage bitmap for the entire disk should begin at cluster 2; its size supposedly varies with the size of the hard disk. I say "should" because in my case both were corrupted, so ZenRecover does not expect to find any useful information there.
At the cluster immediately following the cluster usage bitmap, we find the root inode. A CFS inode has the following (incomplete) structure:
| offset | type | description |
|---|---|---|
| 0 | int32 | magic number BE 3B D9 0A |
| 4 | int32 | self-reference, ie. "You should have found me in cluster x" |
| 0x20 | int32[12] | cluster numbers of the first 12 data clusters (each = -1 if unused) |
| 0x58 | int32 | cluster number of the second class data cluster chain (see below) |
| 0x64 | int32 | cluster number of the third class data cluster chain (see below) |
| 0x78 | int32 | serial number, set to -1 in the root inode |
| 0x7C | int32 | number of metadata records |
| 0x80 | – | start of metadata |
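As an illustration, the fields in this table could be read like this in Python 3. This is a sketch under my own naming (`Inode`, `parse_inode`), covering only the fields listed above; the int32 decoding follows the PDP-endian byte order explained in the next paragraph.

```python
import struct
from collections import namedtuple

INODE_MAGIC = bytes.fromhex("be3bd90a")  # magic number as it appears on disk

Inode = namedtuple(
    "Inode", "cluster direct second_chain third_chain serial n_metadata")


def _i32(buf, offset):
    """Read a signed PDP-endian int32 (two LE 16-bit words, high word first)."""
    high, low = struct.unpack_from("<HH", buf, offset)
    value = (high << 16) | low
    return value - 0x100000000 if value & 0x80000000 else value


def parse_inode(buf, expected_cluster=None):
    """Parse the known inode fields from the raw bytes of an inode cluster."""
    if bytes(buf[:4]) != INODE_MAGIC:
        raise ValueError("bad inode magic")
    cluster = _i32(buf, 4)
    if expected_cluster is not None and cluster != expected_cluster:
        raise ValueError("inode self-reference does not match its location")
    direct = [_i32(buf, 0x20 + 4 * i) for i in range(12)]
    return Inode(cluster, direct, _i32(buf, 0x58), _i32(buf, 0x64),
                 _i32(buf, 0x78), _i32(buf, 0x7C))
```

Checking the self-reference against the cluster the inode was read from is a cheap way to reject file data that merely happens to start with the magic bytes.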
All ints are stored PDP-endian. That is, int16 are stored little-endian, but int32 are stored in a strange way: 0x11223344 becomes 22 11 44 33 on disk. Strings appear to be NUL-terminated UCS-2LE. Bitmaps are arrays of int32, so they follow the same byte-swapping:
    uint32 #0, bit #0 = bitmap bit #0
    uint32 #0, bit #31 = bitmap bit #31
    uint32 #1, bit #0 = bitmap bit #32
…and so on, so that the order of bits for every int32 in a bitmap is:
23 <- 16 | 31 <- 24 | 7 <- 0 | 15 <- 8
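To make the byte order concrete, here is a short Python 3 sketch that decodes and encodes PDP-endian int32 values and looks up a single bit in an on-disk bitmap. The helper names are mine.

```python
import struct


def pdp_to_int32(raw, signed=True):
    """Decode 4 on-disk bytes: two little-endian 16-bit words, high word first."""
    high, low = struct.unpack("<HH", raw)
    value = (high << 16) | low
    if signed and value >= 0x80000000:
        value -= 0x100000000
    return value


def int32_to_pdp(value):
    """Encode an int32 into its 4-byte on-disk form."""
    value &= 0xFFFFFFFF
    return struct.pack("<HH", value >> 16, value & 0xFFFF)


def bitmap_bit(bitmap, n):
    """Return bit n (0 or 1) of a bitmap stored as PDP-endian uint32 words."""
    start = 4 * (n // 32)
    word = pdp_to_int32(bitmap[start:start + 4], signed=False)
    return (word >> (n % 32)) & 1
```

With these, `pdp_to_int32(bytes.fromhex("22114433"))` gives 0x11223344, and the on-disk inode magic BE 3B D9 0A decodes to 0x3BBE0AD9.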
Files seem to have metadata, directories don't. Data clusters, which are where the actual file data (or directory entries) lie, are referenced in a three-tier structure:
    inode
     \_ up to 12 data clusters (some of which might be set to -1)
     \_ second class chain cluster (seems to always be allocated)
     |   \_ up to 2048 clusters of data (some might be set to -1)
     \_ third class chain cluster (seems to always be allocated)
         \_ up to 2048 clusters of pointers (some might be set to -1)
             \_ up to 2048² clusters of data (some might be set to -1)
Metadata are a sequence of variable-length tagged records:
| offset | type | description |
|---|---|---|
| 0 | int16 | magic = 3 |
| 2 | int16 | length of this record |
| 4 | string[2] | tag (NUL-terminated UCS-2LE, 2 chars = 6 bytes) |
| 10 | byte[length] | data |
Here are some useful metadata records (other metadata seem to come from the ID3 tags of the songs):
| tag | type | description |
|---|---|---|
| "07" | string | filename with extension |
| "0=" | string | backslash-separated original path of the file, before it was copied over to the mp3 player |
| "0>" | int32 | file size |
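Here is a sketch, in Python 3, of how the metadata area (starting at offset 0x80 of the inode) might be walked to pull out these records. The names are mine. One detail is an assumption: I read the record table literally, so the length field counts only the data bytes that follow the 10-byte header. If records turn out to be measured differently, set `length_includes_header=True`.

```python
import struct

RECORD_MAGIC = 3
HEADER_SIZE = 10   # magic (2) + length (2) + tag (2 chars + NUL, 6 bytes)


def parse_metadata(buf, count, length_includes_header=False):
    """Return a list of (tag, data) pairs from `count` metadata records."""
    records = []
    pos = 0
    for _ in range(count):
        magic, length = struct.unpack_from("<HH", buf, pos)
        if magic != RECORD_MAGIC:
            raise ValueError("bad metadata record magic at offset %d" % pos)
        tag = bytes(buf[pos + 4:pos + 8]).decode("utf-16-le")
        # ASSUMPTION: by default, length covers the data only.
        size = length - HEADER_SIZE if length_includes_header else length
        data = bytes(buf[pos + HEADER_SIZE:pos + HEADER_SIZE + size])
        records.append((tag, data))
        pos += HEADER_SIZE + size
    return records


def as_string(data):
    """Decode a NUL-terminated UCS-2LE string (decoded here as UTF-16LE)."""
    return data.decode("utf-16-le", errors="replace").split("\x00", 1)[0]


def as_int32(data):
    """Decode an unsigned PDP-endian int32, e.g. the file size in "0>"."""
    high, low = struct.unpack_from("<HH", data)
    return (high << 16) | low
```

Typical use would be `dict(parse_metadata(inode_bytes[0x80:], n_metadata))`, then `as_string(records["07"])` for the filename and `as_int32(records["0>"])` for the file size.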
Directories are just like ordinary files, except that they have no metadata and that their actual data is an array of directory entries. Each directory entry points to the inode of a specific file or subdirectory. Directories in CFS seem to be allocated 8 contiguous data clusters at a time. Every block of 8 data clusters has this layout:
| offset | type | description |
|---|---|---|
| 8 | int32 | number of allocated entries* |
| 16 | – | a 204 byte array usage bitmap, providing for 1632 bits |
| 220 | – | exactly 1632 dir entries, 40 bytes each |
| – | – | 36 null bytes of padding at the end |
*: the number of allocated entries is only set in the first data cluster, ie. in the first block, and contains the number of children of the directory. This number allows for a simple consistency check against the block bitmaps.
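The numbers in this layout add up exactly to 8 clusters, which the following Python 3 sketch spells out, together with a helper that lists the used slots of a block. Names are mine, and two things are assumptions: that a set bit means "slot in use", and that bitmap bit n corresponds to directory entry n.

```python
import struct

BLOCK_SIZE = 8 * 0x2000                        # 8 contiguous data clusters
COUNT_OFFSET = 8                               # number of allocated entries
BITMAP_OFFSET = 16
BITMAP_SIZE = 204                              # bytes, i.e. 51 uint32 words
ENTRIES_OFFSET = BITMAP_OFFSET + BITMAP_SIZE   # 220
ENTRY_SIZE = 40
ENTRIES_PER_BLOCK = BITMAP_SIZE * 8            # 1632
PADDING = BLOCK_SIZE - ENTRIES_OFFSET - ENTRIES_PER_BLOCK * ENTRY_SIZE  # 36


def used_slots(block):
    """Indices of the entries marked as used in the block's bitmap."""
    slots = []
    for n in range(ENTRIES_PER_BLOCK):
        # Bitmap words are PDP-endian uint32: high 16-bit word stored first.
        high, low = struct.unpack_from("<HH", block, BITMAP_OFFSET + 4 * (n // 32))
        word = (high << 16) | low
        if (word >> (n % 32)) & 1:      # ASSUMPTION: 1 means "in use"
            slots.append(n)
    return slots


def entry_bytes(block, slot):
    """Raw 40 bytes of the directory entry in the given slot."""
    start = ENTRIES_OFFSET + slot * ENTRY_SIZE
    return block[start:start + ENTRY_SIZE]
```

Summing `len(used_slots(block))` over all blocks of a directory and comparing the total with the count stored at offset 8 of the first block gives the consistency check mentioned above.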
Every directory entry is 40 bytes long and has the following layout:
| offset | type | description |
|---|---|---|
| 0 | int32 | cluster number of the inode |
| 4 | int16 | length of the full filename |
| 8 | string | filename, truncated to 15 characters if longer |
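A single entry could be decoded like this in Python 3. This is a sketch with a made-up function name. Bytes 6 and 7 are not described in the table, so they are ignored, and I return the length field as-is because its unit is not documented here.

```python
import struct


def parse_dir_entry(entry):
    """Return (inode_cluster, full_name_length, short_name) for a 40-byte entry."""
    if len(entry) != 40:
        raise ValueError("a directory entry is exactly 40 bytes long")
    # Offset 0: PDP-endian int32 (two little-endian words, high word first).
    high, low = struct.unpack_from("<HH", entry, 0)
    inode_cluster = (high << 16) | low
    if inode_cluster & 0x80000000:
        inode_cluster -= 0x100000000
    # Offset 4: int16, length of the full (untruncated) filename.
    (full_length,) = struct.unpack_from("<H", entry, 4)
    # Offset 8: up to 15 characters plus NUL, 2 bytes each (32 bytes).
    raw_name = bytes(entry[8:40])
    name = raw_name.decode("utf-16-le", errors="replace").split("\x00", 1)[0]
    return inode_cluster, full_length, name
```

The returned cluster number is where the child's inode should be found, so its self-reference field can be checked against it.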
So the whole filename is only contained in the metadata, which only files possess. This would imply that directories are limited to 15 character names. This is not a problem in practice, because the mp3 players we examined had this directory structure:
| directory | contents |
|---|---|
| /songs | all the songs, without any subdirectory |
| /archives | all the files uploaded as data, without any subdirectory |
| /playlists | |
| /recordings | |
| /system | |
That's all! Hope you find it useful.
Disclaimer
The information provided in this web site was inferred from limited factual evidence and may be inaccurate or out-of-date. The author cannot be held responsible for any damage, including (but not limited to) loss or corruption of data, physical damage whether consequential or inconsequential, business loss, or personal injury, arising as a direct or indirect result of consulting this web site.
The product names used in this web site are for identification purposes only. All trademarks and registered trademarks are the property of their respective owners.
This site is not affiliated with Creative Technology Ltd.