Advertisement
Advertisement


How many files can I put in a directory?


Question

Does it matter how many files I keep in a single directory? If so, how many files in a directory is too many, and what are the impacts of having too many files? (This is on a Linux server.)

Background: I have a photo album website, and every image uploaded is renamed to an 8-hex-digit id (say, a58f375c.jpg). This is to avoid filename conflicts (if lots of "IMG0001.JPG" files are uploaded, for example). The original filename and any useful metadata is stored in a database. Right now, I have somewhere around 1500 files in the images directory. This makes listing the files in the directory (through FTP or SSH client) take a few seconds. But I can't see that it has any effect other than that. In particular, there doesn't seem to be any impact on how quickly an image file is served to the user.

I've thought about reducing the number of images by making 16 subdirectories: 0-9 and a-f. Then I'd move the images into the subdirectories based on what the first hex digit of the filename was. But I'm not sure that there's any reason to do so except for the occasional listing of the directory through FTP/SSH.

2016/02/28
1
567
2/28/2016 6:04:23 AM

Accepted Answer

FAT32:

  • Maximum number of files: 268,173,300
  • Maximum number of files per directory: 216 - 1 (65,535)
  • Maximum file size: 2 GiB - 1 without LFS, 4 GiB - 1 with

NTFS:

  • Maximum number of files: 232 - 1 (4,294,967,295)
  • Maximum file size
    • Implementation: 244 - 26 bytes (16 TiB - 64 KiB)
    • Theoretical: 264 - 26 bytes (16 EiB - 64 KiB)
  • Maximum volume size
    • Implementation: 232 - 1 clusters (256 TiB - 64 KiB)
    • Theoretical: 264 - 1 clusters (1 YiB - 64 KiB)

ext2:

  • Maximum number of files: 1018
  • Maximum number of files per directory: ~1.3 × 1020 (performance issues past 10,000)
  • Maximum file size
    • 16 GiB (block size of 1 KiB)
    • 256 GiB (block size of 2 KiB)
    • 2 TiB (block size of 4 KiB)
    • 2 TiB (block size of 8 KiB)
  • Maximum volume size
    • 4 TiB (block size of 1 KiB)
    • 8 TiB (block size of 2 KiB)
    • 16 TiB (block size of 4 KiB)
    • 32 TiB (block size of 8 KiB)

ext3:

  • Maximum number of files: min(volumeSize / 213, numberOfBlocks)
  • Maximum file size: same as ext2
  • Maximum volume size: same as ext2

ext4:

  • Maximum number of files: 232 - 1 (4,294,967,295)
  • Maximum number of files per directory: unlimited
  • Maximum file size: 244 - 1 bytes (16 TiB - 1)
  • Maximum volume size: 248 - 1 bytes (256 TiB - 1)
2018/11/20
744
11/20/2018 7:19:54 AM

I have had over 8 million files in a single ext3 directory. libc readdir() which is used by find, ls and most of the other methods discussed in this thread to list large directories.

The reason ls and find are slow in this case is that readdir() only reads 32K of directory entries at a time, so on slow disks it will require many many reads to list a directory. There is a solution to this speed problem. I wrote a pretty detailed article about it at: http://www.olark.com/spw/2011/08/you-can-list-a-directory-with-8-million-files-but-not-with-ls/

The key take away is: use getdents() directly -- http://www.kernel.org/doc/man-pages/online/pages/man2/getdents.2.html rather than anything that's based on libc readdir() so you can specify the buffer size when reading directory entries from disk.

2016/06/09

I have a directory with 88,914 files in it. Like yourself this is used for storing thumbnails and on a Linux server.

Listed files via FTP or a php function is slow yes, but there is also a performance hit on displaying the file. e.g. www.website.com/thumbdir/gh3hg4h2b4h234b3h2.jpg has a wait time of 200-400 ms. As a comparison on another site I have with a around 100 files in a directory the image is displayed after just ~40ms of waiting.

I've given this answer as most people have just written how directory search functions will perform, which you won't be using on a thumb folder - just statically displaying files, but will be interested in performance of how the files can actually be used.

2012/07/07

It depends a bit on the specific filesystem in use on the Linux server. Nowadays the default is ext3 with dir_index, which makes searching large directories very fast.

So speed shouldn't be an issue, other than the one you already noted, which is that listings will take longer.

There is a limit to the total number of files in one directory. I seem to remember it definitely working up to 32000 files.

2009/01/21

Keep in mind that on Linux if you have a directory with too many files, the shell may not be able to expand wildcards. I have this issue with a photo album hosted on Linux. It stores all the resized images in a single directory. While the file system can handle many files, the shell can't. Example:

-shell-3.00$ ls A*
-shell: /bin/ls: Argument list too long

or

-shell-3.00$ chmod 644 *jpg
-shell: /bin/chmod: Argument list too long
2009/01/21

I'm working on a similar problem right now. We have a hierarchichal directory structure and use image ids as filenames. For example, an image with id=1234567 is placed in

..../45/67/1234567_<...>.jpg

using last 4 digits to determine where the file goes.

With a few thousand images, you could use a one-level hierarchy. Our sysadmin suggested no more than couple of thousand files in any given directory (ext3) for efficiency / backup / whatever other reasons he had in mind.

2009/01/21

Source: https://stackoverflow.com/questions/466521
Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Email: [email protected]