ArcGIS Desktop Discussion Forums

ArcGIS Desktop - Geoprocessing Scripting (Python, JavaScript, VB) forum

Subsampling of a dataset   Mark Andersen Feb 04, 2010
Re: Subsampling of a dataset   William Huber Feb 05, 2010
Subject Subsampling of a dataset 
Author Mark Andersen 
Date Feb 04, 2010 
Message Hi all,

I have a feature class containing about 12,000 points. I would like to put together a VBA script that randomly selects a subset of the features in the feature class, and places these into a new feature class. If I only had to do this once, I could use something like Hawth's "Create Random Selection" tool. The sticking point is that I will want to subsample the dataset a large number of times to create a large number of subsample datasets (e.g. 1,000). The number of samples to be included in each subsample dataset could be anywhere from 2 to 1,500. In the end, for this to be usable for the modeling software I'm using, I'll want a single table that contains all of these subsamples, with a field identifying which subsample group each record belongs to, similar to this:

PointID  SubsampleID  AttributeA  AttributeB
765      1            37          5.4
8,839    1            24          2.5
3,483    100          64          2.1

The biggest question I have is whether there's some built-in function to randomly select points from a feature class.

If not, I thought I could cycle through, randomly getting a record using a cursor and query filter and adding that feature to an output feature class after checking that it wasn't already there. I have outlined some code to do this below, but thought I would first check whether anyone has already put together something similar that I could use.

Public Sub SubsampleFC()

    'usual junk
    Dim pmxdoc As IMxDocument
    Set pmxdoc = ThisDocument
    Dim pmap As IMap
    Set pmap = pmxdoc.FocusMap
    Dim pFLSamples As IFeatureLayer
    Dim pFCSamples As IFeatureClass
    Set pFLSamples = pmap.Layer(0)
    Set pFCSamples = pFLSamples.FeatureClass
    Dim pCursor As IFeatureCursor
    Dim pFilter As IQueryFilter
    Dim pFeature As IFeature
    Dim intSamples As Integer
    Dim intReplicates As Integer
    Dim intFeatureCount As Integer
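    'FeatureCount takes a query filter argument; Nothing counts every feature
    intFeatureCount = pFCSamples.FeatureCount(Nothing)
    'InputBox returns a string, so convert explicitly
    intReplicates = CInt(InputBox("Enter the number of replicates to be drawn", "Number of Replicates"))
    intSamples = CInt(InputBox("Enter the number of samples to draw per replicate", "Number of Samples"))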
    'loop through this to create a new FC for each replicate
    Dim X As Long, Y As Long
    For X = 1 To intReplicates
        'dim a recordID variable so we can check whether it's already been selected
        Dim intRecordID As Integer
        'create a new feature class to hold the random samples
        'loop through intSamples number of times to keep adding points to a replicate FC
        For Y = 1 To intSamples
            'randomly select a record and add it to the current output replicate feature class
        Next Y
        'calculate the replicateID field to X for all records in the output
    Next X
    'merge the replicate feature classes into a single feature class

End Sub
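
The outline above can also be sketched outside ArcObjects. The following is a minimal Python sketch of the sampling logic only, not a working ArcGIS script: it assumes the attribute rows have already been read into a list of dictionaries, and the function name, the field values, and the fixed sample size are all hypothetical stand-ins. It writes every subsample into one long table with a SubsampleID column, which avoids creating and merging one feature class per replicate.

```python
import random


def build_subsample_table(records, n_replicates, sample_size, seed=None):
    """Return one row per drawn record, tagged with its SubsampleID.

    `records` is a list of dicts keyed by field name; it stands in for the
    rows of the feature class (hypothetical, not an ArcGIS object).
    """
    rng = random.Random(seed)
    rows = []
    for subsample_id in range(1, n_replicates + 1):
        # sample() draws without replacement, so a record cannot appear twice
        # in the same subsample and no "already there" check is needed.
        for rec in rng.sample(records, sample_size):
            rows.append({**rec, "SubsampleID": subsample_id})
    return rows


# Stand-in data: 12,000 points with made-up attribute values.
points = [
    {"PointID": i, "AttributeA": i % 100, "AttributeB": (i % 50) / 10}
    for i in range(1, 12001)
]
table = build_subsample_table(points, n_replicates=1000, sample_size=5, seed=42)
```

The same record can still turn up in different subsamples, since each replicate is drawn independently from the full set. To vary the subsample size between 2 and 1,500, `sample_size` could be passed per replicate instead of once.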
Subject Re: Subsampling of a dataset 
Author William Huber 
Date Feb 05, 2010 
Message "The biggest question I have is whether there's some built-in function to randomly select points from a feature class."

Yes: the rnd() function in the Field Calculator produces pseudo-random values in a numeric field. Compare that to any threshold (or range of thresholds) you like to obtain subsamples. 
  --Bill Huber
Quantitative Decisions
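
The threshold idea can be illustrated with a short Python sketch. It assumes the random field holds uniform values in [0, 1), as VBA's Rnd returns; the function names are hypothetical. Comparing each value to a threshold such as 0.1 keeps roughly 10% of the records, so the subsample size varies from draw to draw. The second function is an addition not mentioned in the reply: sorting by the random value and taking the first n records gives a subsample of exactly n.

```python
import random


def threshold_subsample(point_ids, fraction, seed=None):
    """Keep each record whose random value falls below `fraction`."""
    rng = random.Random(seed)
    # One uniform value in [0, 1) per record, like a field filled by rnd().
    values = {pid: rng.random() for pid in point_ids}
    return [pid for pid, value in values.items() if value < fraction]


def fixed_size_subsample(point_ids, n, seed=None):
    """Order records by a random key and keep the first n (exact size)."""
    rng = random.Random(seed)
    shuffled = sorted(point_ids, key=lambda pid: rng.random())
    return shuffled[:n]


ids = list(range(1, 12001))
about_ten_percent = threshold_subsample(ids, 0.1, seed=1)
exactly_1500 = fixed_size_subsample(ids, 1500, seed=1)
```

Recalculating the random field (or changing the seed here) produces a fresh subsample each time, which is how repeated draws would be generated with this approach.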