I’ve run the following test: I created a folder containing 15,000 files of 400 bytes each, using this batch script:
@ECHO off
SET times=15000
FOR /L %%i IN (1,1,%times%) DO (
fsutil file createnew filename%%i.txt 400
)
Then I copy it on my Windows computer using this command:
robocopy LargeNumberOfFiles\ LargeNumberOfFiles2\
After it completed, I could see that the transfer rate was 915,810 bytes/sec, which is less than 1 MB/s. It took several seconds to copy about 7 MB. This is very slow.
I’ve tried the same with a folder containing a single 50 MB file, and the reported transfer rate was 1,219,512,195 bytes/sec (yes, over 1 GB/s); the copy was practically instantaneous.
Why does copying a large number of files take so much time and so many resources on a Windows filesystem?
Please note that I’ve tried the same thing on a Linux system running on the same computer in a virtual machine (VMware Player) with an ext3 filesystem.
I used the cp command and the copy was practically instantaneous!
Please also note the following:
- No antivirus is involved.
- I’ve tested this behaviour on multiple Windows computers (always NTFS) and I always get comparable results (transfer rate under 1 MB/s, an average of 7-8 seconds to copy 7 MB).
- I’ve tested on multiple Linux ext3 systems and the copy is always practically instantaneous for that amount of data (15,000 files of 400 bytes).
- The question is about understanding what makes a Windows filesystem so much slower than, for instance, a Linux one at copying a large number of files.
Your 15,000 files take at least 60,000 HD transactions to complete (see the rough math after this list):
- Read the Source file
- Allocate the New Destination file descriptor
- Write the Destination file
- Update the Destination file descriptor
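As a rough back-of-the-envelope illustration (the ~100 random IOPS figure for a 7,200 RPM desktop drive is an assumption on my part), here is the worst-case math in batch form:

REM Worst-case estimate: ~4 disk operations per file copy (assumed)
SET /A ops=15000*4
ECHO Total operations: %ops%
REM Assuming ~100 random IOPS for a 7,200 RPM SATA disk (an assumption)
SET /A seconds=ops/100
ECHO Worst case with no caching: %seconds% seconds

Real runs finish far faster than that worst case because NTFS batches metadata updates in memory and the drive’s write cache absorbs many of the writes, but the per-file overhead still dominates when the files are tiny.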
You’re probably using a consumer SATA drive too (just guessing). If it lacks Native Command Queuing, or has only a shallow queue, the drive has to thrash through all 60,000 operations more or less individually. High-end SCSI/SAS drives and arrays increase performance by queuing up several operations in nearby locations and reordering them before executing.
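One practical mitigation, assuming a robocopy version that supports multithreaded copying (Windows 7 / Server 2008 R2 and later), is to keep several copies in flight at once so the OS and drive have more requests to coalesce and reorder:

robocopy LargeNumberOfFiles\ LargeNumberOfFiles2\ /MT:32

This doesn’t reduce the number of operations; it just keeps the I/O queue fuller, so the gains vary by hardware and are largest on drives that can reorder requests.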
Your single-file copy took at least 4 operations (probably a few more; it depends): the same per-file cost as before, but paid only once. It may be more than 4 if the file is fragmented on either the read or the write side.
Throughput is measured not only in bytes per second, but also in IOPS (I/O operations per second). You are being limited by the latter of the two.
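You can watch this limit directly while the copy runs, for example with the built-in typeperf tool (counter name as on an English-language Windows install; this is just one way to observe it):

typeperf "\PhysicalDisk(_Total)\Disk Transfers/sec" -si 1 -sc 15

During the 15,000-file copy, the transfers/sec counter sits near the disk’s small-I/O ceiling while bytes/sec stays low; during the single 50 MB copy, the opposite happens.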