Q: CreateFile performance (No Answer, 2 Comments)
Question  
Subject: CreateFile performance
Category: Computers > Programming
Asked by: stormin-ga
List Price: $100.00
Posted: 07 Jul 2003 09:16 PDT
Expires: 16 Jul 2003 11:58 PDT
Question ID: 226062
Hi,

We have a situation in which we need to be able to open, read and
close about 500 data files in at most 5 seconds (preferably less) on a
relatively new machine (e.g. P4, 2.4GHz, IDE drive) with Win2K or XP
using NTFS.

Here are the relevant facts:

1) The files are organized in a hierarchical way:

\dir\subdir1\file1.dat
\dir\subdir1\file2.dat
.
.
.
\dir\subdir2\file3.dat
\dir\subdir2\file4.dat
.
.
.
\dir\subdir3\file5.dat
\dir\subdir3\file6.dat

etc.

2) We use VC++ .NET

3) The files are typically about 0.5-1.0 MB in size, but we only read
the first 4K of each file using the CreateFile, ReadFile and
CloseHandle Win32 API calls.
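
For reference, a minimal sketch of that open/read/close sequence looks
roughly like this (simplified, not our production code; the helper name
and buffer handling are just placeholders):

#include <windows.h>

// Open a file, read its first 4K into pBuffer, and close it again.
bool ReadHeader(const char *szPath, char *pBuffer, DWORD &rdwBytesRead)
{
  HANDLE hFile = CreateFile(szPath,
                            GENERIC_READ,
                            0,                   // no sharing
                            NULL,
                            OPEN_EXISTING,
                            FILE_ATTRIBUTE_NORMAL |
                            FILE_FLAG_SEQUENTIAL_SCAN,
                            NULL);
  if (hFile == INVALID_HANDLE_VALUE)
    return false;

  BOOL bOK = ReadFile(hFile, pBuffer, 4096, &rdwBytesRead, NULL);
  CloseHandle(hFile);
  return bOK != FALSE;
}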

4) In our profiling tests, we have observed that 88% of the time is
actually spent in CreateFile.  In an effort to narrow down the
problem, we wrote the following test code that just opens and closes
all the files (i.e. we don't bother reading, since it contributes
negligibly.):


#include <windows.h>
#include <string>
using namespace std;

// Recursively walks strSearchPath, opening and closing every file found.
void TraverseTree(string &strSearchPath, 
                  int& rnCount)
{
  string strCurrentPath;
  string strExtension;
  string strSOPUID;
  HANDLE hFile;
  HANDLE hFind;
  WIN32_FIND_DATA find_data;

  string strWildcardPath = strSearchPath + "\\*.*";

  hFind = FindFirstFile(strWildcardPath.c_str(), &find_data);

  BOOL bMoreFiles = TRUE;
	
  if (hFind == INVALID_HANDLE_VALUE)
    bMoreFiles = FALSE;

  while (bMoreFiles)
  {
    strCurrentPath = strSearchPath + "\\" +
                     string(find_data.cFileName);

    if (strcmp(find_data.cFileName, ".") != 0 && 
        strcmp(find_data.cFileName, "..") != 0)
    {
      if (find_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
      {
        TraverseTree(strCurrentPath, rnCount);
      }
      else
      {
        rnCount++;

        hFile = CreateFile(strCurrentPath.c_str(), 
                           GENERIC_READ, 
                           0, 
                           NULL, 
                           OPEN_EXISTING, 
                           FILE_ATTRIBUTE_NORMAL |
                           FILE_FLAG_SEQUENTIAL_SCAN,
                           NULL);

        CloseHandle(hFile);
      }
    }

    bMoreFiles = FindNextFile(hFind, &find_data);
  }

  FindClose(hFind);
}

5) For 614 data files, the above code takes about 35 seconds to run on
a Dell P4 2.4 GHz w/ a 70GB IDE drive with XP Pro using NTFS.  There
were 5 subdirectories in the hierarchy, with each subdirectory
containing an average of 100+ files.

6) We tried another test in which we put all 614 files into a single
directory and executed the following code:

hFind = FindFirstFile(szSearchPath, &file_data);

int nNumFiles = 0;
int nNumFailed = 0;
BOOL bMoreFiles = (hFind != INVALID_HANDLE_VALUE);
	
while (bMoreFiles)
{
  if (strcmp(file_data.cFileName, ".") != 0 &&
      strcmp(file_data.cFileName, "..") != 0)
  {
    strcpy(szPathname, szRootPath);
    strcat(szPathname, file_data.cFileName);

    nNumFiles++;

    hFile = CreateFile(szPathname, 
                       GENERIC_READ, 
                       0, 
                       NULL, 
                       OPEN_EXISTING, 
                       FILE_ATTRIBUTE_NORMAL |
                       FILE_FLAG_SEQUENTIAL_SCAN,
                       NULL);

    if (hFile == INVALID_HANDLE_VALUE)
      nNumFailed++;
    else
      CloseHandle(hFile);
  }

  bMoreFiles = FindNextFile(hFind, &file_data);
}

FindClose(hFind);

7) Strangely, this took about 7 seconds on the same machine--a
five-fold improvement.  (The machine was rebooted between tests to
ensure that nothing was being cached.  After each reboot, time was
allotted for the disk activity to settle down.)

8) To eliminate the possibility that the performance discrepancy had
anything to do with the recursive nature of the first test, we did
another test in which we enumerated all the paths of all 614 files
organized in the manner described in 1), stuck the paths into an
array, then simply iterated through the array, opening and closing the
files.  Even in that case, in which no recursion was involved in the
opening of the files, it still took 35 seconds.
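
In outline, that array-based test looked roughly like this (a
simplified sketch, not the exact code we ran):

// vPaths was filled beforehand by enumerating the whole tree; the
// timed loop below does no directory traversal at all.
vector<string> vPaths;

for (size_t i = 0; i < vPaths.size(); ++i)
{
  HANDLE hFile = CreateFile(vPaths[i].c_str(),
                            GENERIC_READ,
                            0,
                            NULL,
                            OPEN_EXISTING,
                            FILE_ATTRIBUTE_NORMAL |
                            FILE_FLAG_SEQUENTIAL_SCAN,
                            NULL);
  if (hFile != INVALID_HANDLE_VALUE)
    CloseHandle(hFile);
}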

9) All of the above leads me to believe that there's something about
the fact that the files are organized in subdirectories (as opposed to
all being in a single directory) that slows things down.

So, my questions are:

a) Why the discrepancy between the two tests?  

b) How does NTFS actually organize files on the disk physically?  Are
directories just a logical concept, or do they have some physical
meaning in terms of how the files are organized on the disk?  Some
URLs to papers on this subject would be useful.

c) Does the rate of file opening (35 seconds / 614 files = 50-60
milliseconds per file) seem unusually slow in the hierarchical case?
Is it about right in the non-hierarchical case (7 seconds / 614 files
= about 10 milliseconds per file)?

d) Besides getting better hardware, what else can we do to optimize
this process in software?  Are we using the optimal flags in
CreateFile?

e) The pathnames that we actually use are quite long--approaching 200
characters.  Does this make a difference?

Request for Question Clarification by mathtalk-ga on 09 Jul 2003 10:29 PDT
Hi, stormin-ga:

Are you compiling to "native" (unmanaged) code or to managed code
(using IJW features of the compiler)?

regards, mathtalk-ga

Clarification of Question by stormin-ga on 09 Jul 2003 10:47 PDT
Hi mathtalk-ga,

It's unmanaged (native) code.  

BTW, we also did another test in which we copied the same 614 files
over 10 subdirectories (i.e. about 60 files per directory), expecting
that it would take longer than 35 seconds, but in fact, it only took
about 13 seconds.  Our current theory is that perhaps in that original
directory (the one that took 35 seconds), the files were spread out on
the disk, resulting in longer seek times, whereas in the other cases,
when we copied them, they resided in one place on the disk.  But who
really knows...just a guess. :)  I'm also wondering whether we're
approaching the theoretical limit of how fast we can actually open
files.  I mean, if the average seek time on an IDE drive is about
8-10ms, can we really expect anything better than 100 files/sec?

stormin-ga

Request for Question Clarification by mathtalk-ga on 09 Jul 2003 16:20 PDT
Hi, stormin-ga:

I think your working hypothesis is sound.  Perhaps using a RAMDISK
would be beneficial?  If the files "arrive" asynchronously, they could
be placed or copied into the RAMDISK storage.

regards, mathtalk-ga

Clarification of Question by stormin-ga on 10 Jul 2003 08:15 PDT
Hi mathtalk-ga,

Hmmm...the problem with a RAMDisk, though, is that it doesn't persist,
so we'd have to have some intelligent scheme to load the RAMDisk at
bootup.  It is not uncommon for our application to have several dozen
GB of data, so it would not be possible to preload the RAM disk with
the data.

stormin-ga

Request for Question Clarification by mathtalk-ga on 10 Jul 2003 09:09 PDT
Hi, stormin-ga:

Forgive my pressing the point here.  You have said that your
application has many gigabytes of data, but for the task at hand only
the first 4k of each of several hundred files are of interest.  Indeed
you said that the 500 files involved ranged in size from 0.5 to 1
Mbyte.

It's true that some startup/shutdown code is needed to persist a
RAMDisk across reboots but this is a standard feature for commercial
products.  I'm not sure how price sensitive this project is, but the
commercial RAMDisk programs I've seen are typically < $100.

Microsoft has (deliberately?) limited their "sample" RAMDisk driver to
32 Mbytes, but there is a freeware offering out there that goes to
2 Gbytes.  I didn't think you'd want to devote that much space, but
precopying either the whole files to the RAMDisk _or_ just the first
4K or so of each would allow for exceptional speed.  Depending on the
approach taken, either 500 Mbytes or 2 Mbytes would suffice for the
files you described.
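
As a rough illustration only (the R: drive letter, destination layout
and helper name below are assumptions, not a tested recipe), precopying
just the 4K headers might look something like:

#include <windows.h>

// Copy the first 4K of one source file to a file on the RAMDisk.
// szDst would be something like "R:\\cache\\file1.dat" (hypothetical).
bool CopyHeaderToRamDisk(const char *szSrc, const char *szDst)
{
  char buf[4096];
  DWORD dwRead = 0, dwWritten = 0;

  HANDLE hSrc = CreateFile(szSrc, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
  if (hSrc == INVALID_HANDLE_VALUE)
    return false;

  HANDLE hDst = CreateFile(szDst, GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
  if (hDst == INVALID_HANDLE_VALUE)
  {
    CloseHandle(hSrc);
    return false;
  }

  BOOL bOK = ReadFile(hSrc, buf, sizeof(buf), &dwRead, NULL) &&
             WriteFile(hDst, buf, dwRead, &dwWritten, NULL);

  CloseHandle(hSrc);
  CloseHandle(hDst);
  return bOK != FALSE;
}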

regards, mathtalk-ga

P.S.  There is a known "gotcha" with using Microsoft's RAMDisk sample
driver with an NTFS only system under WinXP, but I can point you to a
workaround for this.

Clarification of Question by stormin-ga on 10 Jul 2003 11:28 PDT
Hi mathtalk-ga,

Thanks for taking the time to work on this problem.

Having never actually used a RAMDisk, I'm not familiar with how they
work.  Is it actually possible to just load the first 4K of each file
into memory like that?  I should probably clarify how our app works. 
In our hierarchy of data files, each directory (which contains
subdirectories, which contain the files) represents a set of data that
the user would view during a given session.  We don't know ahead of
time which set of data the user will choose to view.  So, it would
seem to me that we would have to traverse all our directories and load
the first 4K (to complicate matters, it isn't always just the first
4K--sometimes it's more, and the only way we can tell is by parsing
the data) of all our files into the RAM disk.  Since it's not uncommon
for us to have tens of thousands of data files, it would seem that it
would take some time to preload the RAM disk with the 4K segments at
bootup, and I'm not sure how feasible it is.

We've been doing some more thinking about this, and we're considering
a hybrid between the approach we're taking now and the approach we
took in the last version of the software, in which we cached some of
the necessary information from that 4K header in a database.  Database
fetches are obviously fast.
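
Conceptually, the idea is just to pay the parse cost once and look the
fields up from the cache afterwards.  A sketch of the lookup side,
with an in-memory map standing in for the database (the HeaderInfo
fields are made-up placeholders):

#include <map>
#include <string>
using namespace std;

// Cache of the fields we need from each 4K header, keyed by pathname,
// so a session can start without re-opening every file on disk.
struct HeaderInfo
{
  string strSOPUID;      // hypothetical parsed fields
  string strStudyDate;
};

map<string, HeaderInfo> g_headerCache;

bool LookupHeader(const string &strPath, HeaderInfo &rInfo)
{
  map<string, HeaderInfo>::const_iterator it = g_headerCache.find(strPath);
  if (it == g_headerCache.end())
    return false;            // not cached yet; fall back to reading the file
  rInfo = it->second;
  return true;
}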

With respect to your comment about IDE drives, is it fair to say that
the average seek time represents the theoretical minimum time between
file opens?

Having said that though, we are still interested in answers to the
questions we posed for future reference.

stormin-ga
Answer  
There is no answer at this time.

Comments  
Subject: Re: CreateFile performance
From: mathtalk-ga on 10 Jul 2003 09:53 PDT
 
Hi, stormin-ga:

From a quick survey of current IDE hard disks, the best average seek
time (read) seems to be around 9 milliseconds.  The spec for the WD
"Special Edition" Caviar 40 Gbyte drive is 8.9 milliseconds.  You might
compare the spec for your current drive to see if this is much of an
improvement.

500 seeks for read * 10 milliseconds = 5 seconds

regards, mathtalk-ga
Subject: Re: CreateFile performance
From: chandru_kan-ga on 11 Jul 2003 02:17 PDT
 
I don't think it's because of the subdirectory structure.
Move all the files up one level in the directory tree, or to the root
itself, and then try again.  If the subdirectory structure is the
cause, it should execute in less than 7 seconds.

Also, you're not checking the return status of the CreateFile function.
Are you sure that it opens all the files successfully in the
subdirectory case?

The length of the pathname is limited to MAX_PATH characters; just
check whether yours are exceeding MAX_PATH.
