
ArcGIS Desktop Discussion Forums

ArcGIS Desktop - Geoprocessing Scripting (Python, JavaScript, VB) forum

Subject Repro and workaround: Arc crashes when generating hundreds of rasters in a directory 
Author Jason Roberts 
Date Jul 28, 2006 
Message I work with satellite data and frequently store hundreds of rasters in a single directory. I run Python scripts from geoprocessing models in ArcCatalog. Occasionally ArcCatalog crashes and the Microsoft “apology” dialog comes up. After I click Don’t Send, a second error appears saying that memory could not be read. After I click OK on this, the ArcCatalog window goes away.

After extensive experimentation, I can reproduce and work around this problem. I assert that it is a bug in Arc and will try to report it through the appropriate channel. (My ESRI contract only allows one person at my organization to report bugs. This person is difficult to get ahold of, so if anyone out there has a contact at Arc that I could use, please let me know!)

The problem relates to how Arc stores raster information. When Arc stores a raster in a directory, most of the information goes into a subdirectory named after the raster. But some goes into a subdirectory called info. All the rasters in the directory share the same info subdirectory.

Here is how the directory structure would look if three rasters were stored in a directory C:\MyRasters:

C:\MyRasters
C:\MyRasters\raster1
C:\MyRasters\raster2
C:\MyRasters\raster3
C:\MyRasters\info

info contains a file called arc.dir and a series of files called arcXXXX.YYY, where XXXX is a decimal number beginning with 0000, and YYY is an extension such as dat or nit. arc.dir contains one binary record for each arcXXXX.YYY file. (I am sure others have documented the details of these data structures; I’m not going to go into it further.)

When you store a floating-point raster in C:\MyRasters, two entries are added to arc.dir and four arcXXXX.YYY files are created. If it is the first raster to be stored, the files will be arc0000.nit, arc0000.dat, arc0001.nit and arc0001.dat.

When you store an integer raster, three entries are added to arc.dir and seven arcXXXX.YYY files are created. If it is the first raster, they will be arc0000.nit, arc0000.dat, arc0001.nit, arc0001.dat, arc0002.nit, arc0002.dat and arc0002r.001.
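The file naming described above can be sketched in a few lines. This is only an illustration of the pattern as I have described it: the function name and the first_index parameter are hypothetical, and the assumption that later rasters continue the numbering at the next free index is mine, not something confirmed by ESRI documentation. It is written for Python 3 and does not touch the geoprocessor.

```python
# Entries added to arc.dir per raster type, as observed in this post.
ENTRIES_PER_RASTER = {"FLOAT": 2, "INTEGER": 3}

def info_files_for_raster(data_type, first_index=0):
    """Return the info file names expected for one new raster.

    first_index is the first unused XXXX number (0 for an empty directory).
    Assumption: numbering simply continues from first_index.
    """
    entries = ENTRIES_PER_RASTER[data_type]
    names = []
    for n in range(first_index, first_index + entries):
        names.append("arc%04d.nit" % n)
        names.append("arc%04d.dat" % n)
    if data_type == "INTEGER":
        # Integer rasters get one extra file tied to their third entry.
        names.append("arc%04dr.001" % (first_index + entries - 1))
    return names

print(info_files_for_raster("FLOAT"))
print(info_files_for_raster("INTEGER"))
```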

The crash occurs when the raster you are adding causes the total number of entries in arc.dir to be evenly divisible by 1024. For example, the following scenarios will cause the crash:

1. Storing 512 floating point rasters (512*2 = 1024)

2. Storing 509 floating point and then 2 integer rasters (509*2 + 2*3 = 1024)

3. Storing 1024 integer rasters (1024*3 = 3072)

These scenarios will not cause crashes:

4. Storing 511 floating point rasters and then 1 integer raster (511*2 + 1*3 = 1025)

5. Storing 511 floating point rasters, then 1 integer raster, and then 600 floating point rasters (511*2 + 1*3 + 600*2 = 2225)
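The arithmetic behind the five scenarios can be checked without Arc. The sketch below only simulates the entry count in arc.dir (2 per floating point raster, 3 per integer raster) and reports the first raster whose addition lands on a multiple of 1024. The function name first_crash and the (data type, count) batch format are made up for this illustration.

```python
ENTRIES_PER_RASTER = {"FLOAT": 2, "INTEGER": 3}

def first_crash(batches, block=1024):
    """Simulate storing rasters in an empty directory.

    batches is a list of (data_type, count) pairs stored in order.
    Returns (raster_number, total_entries) for the first raster that brings
    the arc.dir entry count to a multiple of block, or None if none does.
    """
    entries = 0
    stored = 0
    for data_type, count in batches:
        for _ in range(count):
            entries += ENTRIES_PER_RASTER[data_type]
            stored += 1
            if entries % block == 0:
                return stored, entries
    return None

scenarios = [
    [("FLOAT", 512)],
    [("FLOAT", 509), ("INTEGER", 2)],
    [("INTEGER", 1024)],
    [("FLOAT", 511), ("INTEGER", 1)],
    [("FLOAT", 511), ("INTEGER", 1), ("FLOAT", 600)],
]
for number, batches in enumerate(scenarios, start=1):
    print(number, first_crash(batches))
```

The first three scenarios return a hit at 1024, 1024 and 3072 entries; the last two return None because the running total steps over every multiple of 1024.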

This last scenario is remarkable, at least to me, because I had been led to believe that Arc could store no more than 1023 rasters in a single directory before it crashed. In fact, the crashing is governed by the behavior described above. So long as you never cause the entries in arc.dir to hit a multiple of 1024, you can store as many rasters as you want in a directory. Of course, the more you have, the more you will suffer from a memory leak (I will open a separate email thread about this).
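One possible way to stay off the multiples of 1024 is to slip in a throwaway raster of the other type whenever the next raster would land on one. The sketch below only plans the sequence and does not call the geoprocessor. The function name, the "filler" label and the whole filler idea are assumptions for illustration; I have not tested whether leaving or deleting filler rasters has side effects in Arc.

```python
ENTRIES_PER_RASTER = {"FLOAT": 2, "INTEGER": 3}

def plan_with_fillers(data_types, existing_entries=0, block=1024):
    """Plan a storage order that never lands on a multiple of block.

    data_types is the ordered list of raster types you want to store.
    Returns (plan, final_entries), where plan is a list of
    ("raster", type) and ("filler", type) steps.
    """
    plan = []
    entries = existing_entries
    for data_type in data_types:
        if (entries + ENTRIES_PER_RASTER[data_type]) % block == 0:
            # Storing a raster of the other type first shifts the count
            # by 3 instead of 2 (or 2 instead of 3), which skips the multiple.
            filler = "INTEGER" if data_type == "FLOAT" else "FLOAT"
            plan.append(("filler", filler))
            entries += ENTRIES_PER_RASTER[filler]
        plan.append(("raster", data_type))
        entries += ENTRIES_PER_RASTER[data_type]
    return plan, entries

plan, entries = plan_with_fillers(["FLOAT"] * 512)
print(len(plan), plan[511], entries)
```

For 512 floating point rasters the plan inserts one integer filler before the 512th raster, so the count goes 1022, 1025, 1027 instead of 1022, 1024.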

You can tell if your next raster is going to crash Arc with the following test:

1. Look for the highest numbered arcXXXX.YYY file in the info directory.

2. If you’re going to store a floating point raster, add 2 to the XXXX number. If you’re going to store an integer raster, add 3.

3. If 1 + your result in step 2 equals a multiple of 1024, Arc will crash when you add the raster. For example, if your result in step 2 is 1023, 2047, 3071 or 4095, Arc will crash.
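The three steps above can be sketched as a standalone check against an existing info directory. This is a minimal sketch, not the attached script: the function names are hypothetical, it assumes the file names always carry a four-digit number right after "arc", and it treats an empty info directory as having a highest number of -1. It is written for Python 3.

```python
import os
import re

# Matches arcXXXX.YYY and arcXXXXr.001, but not arc.dir.
ARC_FILE = re.compile(r"^arc(\d{4})", re.IGNORECASE)

def highest_info_number(info_dir):
    """Step 1: return the highest XXXX among arcXXXX.* files, or -1 if none."""
    numbers = []
    for name in os.listdir(info_dir):
        match = ARC_FILE.match(name)
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers) if numbers else -1

def will_crash(highest_number, data_type, block=1024):
    """Steps 2 and 3: add 2 (FLOAT) or 3 (INTEGER), then test 1 + result."""
    step2 = highest_number + {"FLOAT": 2, "INTEGER": 3}[data_type]
    return (step2 + 1) % block == 0

# Usage, with a hypothetical path:
# print(will_crash(highest_info_number(r"C:\MyRasters\info"), "FLOAT"))
```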

The attached ZIP file contains a Python script and a toolbox tool that allow you to generate rasters and see what happens. The script implements the predictive test above and reports when Arc will crash. I tested the scenarios above with Arc 9.1 SP1, and it worked and crashed as I described.

The same script appears below.

If anyone knows anybody on the Arc dev or test teams, please pass this along to them…

Jason
 
 
"""RasterGenerator.py - generates rasters in an output directory, for the purpose of reproducing ArcGIS crashes."""

import os
import sys
import win32com.client

# Helper functions.

def log_returned_geoprocessor_messages(gp):
    # Echo each geoprocessor message to stdout and re-log it at its original severity.
    for i in range(gp.MessageCount):
        sev = gp.GetSeverity(i)
        msg = gp.GetMessage(i)
        print msg
        if sev == 0:
            gp.AddMessage(msg)
        elif sev == 1:
            gp.AddWarning(msg)
        elif sev == 2:
            gp.AddError(msg)

def log_message(gp, msg):
    print msg
    gp.AddMessage(msg)

def log_warning(gp, msg):
    print msg
    gp.AddWarning(msg)

def log_error_and_exit(gp, msg):
    gp.AddError(msg)
    sys.exit(msg)

# Create the geoprocessor object.

gp = win32com.client.Dispatch("esriGeoprocessing.GPDispatch.1")

# Check out the spatial analyst extension.

try:
    gp.CheckOutExtension("spatial")
except Exception, e:
    log_error_and_exit(gp, "Failed to obtain a license for the spatial extension. gp.CheckOutExtension reported: " + str(e))

# Validate and parse the input parameters.

if len(sys.argv) < 4:
    log_error_and_exit(gp, "You must specify the input parameters: output directory, list of raster data types (FLOAT or INTEGER), and number of rasters to generate.")

output_dir = sys.argv[1]
raster_data_types = sys.argv[2]
num_rasters = sys.argv[3]

log_message(gp, "Output Directory = " + output_dir)
log_message(gp, "Raster Data Types = " + raster_data_types)
log_message(gp, "Number of Rasters = " + num_rasters)

if not os.path.isdir(output_dir) or not gp.Exists(output_dir):
    log_error_and_exit(gp, "The output directory \"" + output_dir + "\" does not appear to exist.")

raster_data_types_list = []
for data_type in raster_data_types.split(";"):
    if data_type.upper() != "FLOAT" and data_type.upper() != "INTEGER":
        log_error_and_exit(gp, "\"" + data_type + "\" is an invalid raster data type. The valid data types are \"FLOAT\" and \"INTEGER\".")
    raster_data_types_list.append(data_type.upper())

num_rasters_list = []
for num_str in num_rasters.split(";"):
    try:
        num = int(num_str)
    except ValueError:
        log_error_and_exit(gp, "\"" + num_str + "\" is an invalid number of rasters. It must be a positive integer.")
    if num < 1:
        log_error_and_exit(gp, "\"" + num_str + "\" is an invalid number of rasters. It must be a positive integer.")
    num_rasters_list.append(num)

if len(raster_data_types_list) != len(num_rasters_list):
    log_error_and_exit(gp, "The list of raster data types must be the same length as the list of numbers of rasters.")

if len(raster_data_types_list) == 0:
    log_error_and_exit(gp, "Please specify at least one item in the raster data types list and the number of rasters list.")

# Verify there are no rasters in the output directory.

try:
    gp.Workspace = output_dir
except Exception, e:
    log_error_and_exit(gp, "Could not set the geoprocessor's workspace to \"" + output_dir + "\". The gp.Workspace property failed when set to that directory: " + str(e))

try:
    existing_rasters = gp.ListRasters("*")
except Exception, e:
    log_error_and_exit(gp, "Could not check for existing rasters in the directory \"" + output_dir + "\". The gp.ListRasters function failed: " + str(e))

existing_rasters.Reset()
r = existing_rasters.Next()
if r is not None and len(r.strip()) > 0:
    log_error_and_exit(gp, "Rasters already exist in the directory \"" + output_dir + "\". Please specify an empty directory.")

# Create the rasters.    

rasters_generated = 0
predicted_info_directory_entries = 0

for i in range(len(raster_data_types_list)):
    data_type = raster_data_types_list[i]
    for j in range(num_rasters_list[i]):

        # Print a warning if the next raster will cause Arc to crash.
        
        if data_type == "FLOAT":
            predicted_info_directory_entries = predicted_info_directory_entries + 2
        else:
            predicted_info_directory_entries = predicted_info_directory_entries + 3

        if predicted_info_directory_entries % 1024 == 0:
            log_warning(gp, "***** The next raster will cause Arc to crash! *****")

        # Generate the raster

        raster = "%s\\test%04i" % (output_dir, rasters_generated + 1)
        
        if data_type == "FLOAT":
            gp.CreateConstantRaster_sa(raster, 1.1, "FLOAT", 1, "0 0 10 10")
        else:
            gp.CreateConstantRaster_sa(raster, 1, "INTEGER", 1, "0 0 10 10")

        log_returned_geoprocessor_messages(gp)

        rasters_generated = rasters_generated + 1            

        # Print a message if we were supposed to crash but we didn't.

        if predicted_info_directory_entries % 1024 == 0:
            log_warning(gp, "***** Arc should have crashed. If you're seeing this message, it did not crash, and my algorithm for predicting its crashes is not correct. *****")

log_message(gp, str(rasters_generated) + " rasters generated successfully.")
 
Attachment: RasterGenerator.zip
 
Subject Re: Repro and workaround: Arc crashes when generating hundreds of rasters in a directory 
Author Jason Roberts 
Date Aug 18, 2006 
Message ESRI informs me that this is now bug NIM003862. 
   
Subject Re: Repro and workaround: Arc crashes when generating hundreds of rasters in a directory 
Author Jason Roberts 
Date Jun 04, 2009 
Message In case anyone is monitoring this...

I finally heard back from someone at ESRI about this problem. It was Ralf Gottschalk, Development Technical Lead - SDK Unit. He said:

"NIM003862 – Your 1024 raster issue. Unfortunately this issue lies deep within legacy code, and is something that we are not going to fix at this time for fear it will break something else. Eventually we will be moving away from this code and are in the process of slowly doing so now. I’m not sure on the details how you hit this bug, if you were using the GRID format or some other format. Ultimately, behind the scenes when you work with rasters, we converted them to GRID (the Legacy part). So, you might have hit this issue using any raster format in the current framework. At 9.4 this should be changing and we will be doing native IO. ArcGIS will not convert formats behind the scenes which will avoid the 1024 issue, however, you will still hit the issue when using GRID until we remove the legacy code."

Jason