ArcPy: Convert CSV to Shapefile – Complete Guide
Converting CSV files to shapefiles is a fundamental task in GIS workflows. ArcPy provides powerful tools to automate this conversion process, making it efficient to transform tabular data with coordinate information into spatial datasets.
Overview
This guide covers multiple methods to convert CSV files containing coordinate data into ESRI shapefiles using Python’s ArcPy library. Whether you’re working with latitude/longitude coordinates, state plane coordinates, or other projected coordinate systems, these techniques will help you create accurate spatial datasets.
Prerequisites
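- ArcGIS Pro installed (the examples use Python 3 features such as f-strings; ArcGIS Desktop ships with Python 2.7, so the code would need adapting there)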
- Python environment with ArcPy library
- CSV file with coordinate columns (X/Y, Longitude/Latitude, etc.)
- Basic understanding of coordinate systems and projections
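Before running any ArcPy tool, it can help to confirm which coordinate columns a CSV actually contains. The following is a minimal sketch using only the standard library; the helper names `read_csv_header` and `find_coordinate_fields`, and the candidate column names, are assumptions made for illustration and are not part of ArcPy.

```python
import csv

def read_csv_header(csv_path, encoding="utf-8-sig"):
    """Return the column names from the first row of a CSV file.

    The utf-8-sig codec drops a leading byte-order mark, which otherwise
    ends up glued to the first column name.
    """
    with open(csv_path, newline="", encoding=encoding) as handle:
        return next(csv.reader(handle), [])

def find_coordinate_fields(header,
                           x_candidates=("x", "lon", "longitude", "easting"),
                           y_candidates=("y", "lat", "latitude", "northing")):
    """Pick the first header name matching each candidate list (case-insensitive)."""
    lowered = {name.strip().lower(): name for name in header}
    x_field = next((lowered[c] for c in x_candidates if c in lowered), None)
    y_field = next((lowered[c] for c in y_candidates if c in lowered), None)
    return x_field, y_field
```

The returned names keep their original spelling, so they can be passed on as the `x_field` and `y_field` arguments used throughout this guide.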
Method 1: Using MakeXYEventLayer and CopyFeatures
This is the most common approach for converting CSV files with coordinate data to shapefiles.
import arcpy
import os

def csv_to_shapefile_basic(csv_file, output_shapefile, x_field, y_field, spatial_reference=4326):
    """
    Convert CSV to shapefile using MakeXYEventLayer

    Parameters:
        csv_file (str): Path to input CSV file
        output_shapefile (str): Path to output shapefile
        x_field (str): Name of X/longitude column
        y_field (str): Name of Y/latitude column
        spatial_reference (int): EPSG code for coordinate system
    """
    try:
        # Set workspace
        arcpy.env.workspace = os.path.dirname(output_shapefile)
        # Create XY event layer
        event_layer = "temp_event_layer"
        arcpy.management.MakeXYEventLayer(
            table=csv_file,
            in_x_field=x_field,
            in_y_field=y_field,
            out_layer=event_layer,
            spatial_reference=arcpy.SpatialReference(spatial_reference)
        )
        # Copy features to shapefile
        arcpy.management.CopyFeatures(event_layer, output_shapefile)
        print(f"Successfully converted {csv_file} to {output_shapefile}")
    except arcpy.ExecuteError:
        print(f"ArcPy Error: {arcpy.GetMessages(2)}")
    except Exception as e:
        print(f"Python Error: {str(e)}")

# Example usage
csv_file = r"C:\data\points.csv"
output_shapefile = r"C:\output\points.shp"
csv_to_shapefile_basic(csv_file, output_shapefile, "longitude", "latitude")
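The example usage above hard-codes its paths. One way to reuse the function from a command prompt is a small `argparse` wrapper. This is a sketch only: the option names (`--x-field`, `--y-field`, `--wkid`) are assumptions chosen for illustration, and the converter is passed in as an argument so the wrapper itself does not import ArcPy.

```python
import argparse

def build_parser():
    """Build a command-line parser for a CSV-to-shapefile script."""
    parser = argparse.ArgumentParser(description="Convert a CSV of points to a shapefile.")
    parser.add_argument("csv_file", help="Path to the input CSV file")
    parser.add_argument("output_shapefile", help="Path to the output .shp file")
    parser.add_argument("--x-field", default="longitude", help="Name of the X/longitude column")
    parser.add_argument("--y-field", default="latitude", help="Name of the Y/latitude column")
    parser.add_argument("--wkid", type=int, default=4326, help="Coordinate system code")
    return parser

def main(convert, argv=None):
    """Parse arguments and hand them to a converter such as csv_to_shapefile_basic."""
    args = build_parser().parse_args(argv)
    return convert(args.csv_file, args.output_shapefile,
                   args.x_field, args.y_field, args.wkid)
```

In a script that also defines `csv_to_shapefile_basic`, calling `main(csv_to_shapefile_basic)` under an `if __name__ == "__main__":` guard would read the arguments from the command line.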
Method 2: Advanced Conversion with Data Validation
This enhanced version adds input checks, coordinate validation, and error handling for production environments. It uses pandas for the validation step.
import arcpy
import pandas as pd
import os

def csv_to_shapefile_advanced(csv_file, output_shapefile, x_field, y_field,
                              spatial_reference=4326, validate_coords=True):
    """
    Advanced CSV to shapefile conversion with validation
    """
    temp_csv = None  # Set only when a cleaned copy is written
    event_layer = "temp_event_layer"
    try:
        # Validate input file
        if not os.path.exists(csv_file):
            raise FileNotFoundError(f"CSV file not found: {csv_file}")
        # Read CSV to validate structure
        df = pd.read_csv(csv_file)
        # Check if coordinate fields exist
        if x_field not in df.columns:
            raise ValueError(f"X field '{x_field}' not found in CSV")
        if y_field not in df.columns:
            raise ValueError(f"Y field '{y_field}' not found in CSV")
        # Validate coordinate data
        if validate_coords:
            # Remove rows with null coordinates
            initial_count = len(df)
            df_clean = df.dropna(subset=[x_field, y_field])
            if len(df_clean) != initial_count:
                print(f"Warning: Removed {initial_count - len(df_clean)} rows with null coordinates")
            # Check coordinate ranges for geographic coordinates
            if spatial_reference == 4326:  # WGS84
                x_valid = df_clean[x_field].between(-180, 180)
                y_valid = df_clean[y_field].between(-90, 90)
                invalid_coords = ~(x_valid & y_valid)
                if invalid_coords.any():
                    print(f"Warning: Found {invalid_coords.sum()} rows with invalid coordinates")
                    df_clean = df_clean[~invalid_coords]
            # Save cleaned CSV temporarily, next to the input file
            temp_csv = os.path.splitext(csv_file)[0] + "_temp.csv"
            df_clean.to_csv(temp_csv, index=False)
            csv_file = temp_csv
        # Set environment
        arcpy.env.overwriteOutput = True
        arcpy.env.workspace = os.path.dirname(output_shapefile)
        # Create spatial reference object
        sr = arcpy.SpatialReference(spatial_reference)
        # Create XY event layer
        arcpy.management.MakeXYEventLayer(
            table=csv_file,
            in_x_field=x_field,
            in_y_field=y_field,
            out_layer=event_layer,
            spatial_reference=sr
        )
        # Copy features to permanent shapefile
        arcpy.management.CopyFeatures(event_layer, output_shapefile)
        # Get feature count
        result = arcpy.management.GetCount(output_shapefile)
        feature_count = int(result.getOutput(0))
        print(f"Successfully created shapefile with {feature_count} features")
        print(f"Output: {output_shapefile}")
        return output_shapefile
    except arcpy.ExecuteError:
        print(f"ArcPy Error: {arcpy.GetMessages(2)}")
        return None
    except Exception as e:
        print(f"Error: {str(e)}")
        return None
    finally:
        # Release the event layer, then remove the temporary CSV,
        # so cleanup also happens when the conversion fails
        if arcpy.Exists(event_layer):
            arcpy.management.Delete(event_layer)
        if temp_csv and os.path.exists(temp_csv):
            os.remove(temp_csv)

# Example usage with validation
result = csv_to_shapefile_advanced(
    csv_file=r"C:\data\survey_points.csv",
    output_shapefile=r"C:\output\survey_points.shp",
    x_field="lon",
    y_field="lat",
    spatial_reference=4326,
    validate_coords=True
)
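Method 2 relies on pandas and writes its cleaned copy next to the input file. Where pandas is unavailable or the input folder is read-only, the same filtering idea can be sketched with the standard library and a temporary file. The function name `write_clean_copy` is hypothetical, and the range check assumes WGS 84 longitude/latitude in degrees.

```python
import csv
import os
import tempfile

def write_clean_copy(csv_path, x_field, y_field):
    """Copy rows with numeric, in-range WGS 84 coordinates to a temporary CSV.

    Returns (temp_path, kept, dropped). The caller is responsible for
    deleting temp_path once the conversion has finished.
    """
    fd, temp_path = tempfile.mkstemp(suffix=".csv", prefix="clean_")
    kept = dropped = 0
    with open(csv_path, newline="", encoding="utf-8-sig") as src, \
            os.fdopen(fd, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames or [],
                                extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            try:
                x, y = float(row[x_field]), float(row[y_field])
            except (TypeError, ValueError, KeyError):
                dropped += 1  # missing or non-numeric coordinate
                continue
            if -180 <= x <= 180 and -90 <= y <= 90:
                writer.writerow(row)
                kept += 1
            else:
                dropped += 1  # outside the valid degree ranges
    return temp_path, kept, dropped
```

The returned path can be passed to `MakeXYEventLayer` as the table and removed with `os.remove` afterwards.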
Method 3: Batch Processing Multiple CSV Files
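Process multiple CSV files in a directory that share the same structure. This function calls csv_to_shapefile_advanced from Method 2, so that function must be defined or imported first.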
import arcpy
import os
import glob

def batch_csv_to_shapefile(input_directory, output_directory, x_field, y_field,
                           spatial_reference=4326, file_pattern="*.csv"):
    """
    Convert multiple CSV files to shapefiles
    """
    # Create output directory if it doesn't exist
    os.makedirs(output_directory, exist_ok=True)
    # Find all CSV files matching pattern
    csv_files = glob.glob(os.path.join(input_directory, file_pattern))
    if not csv_files:
        print(f"No CSV files found in {input_directory}")
        return
    successful_conversions = 0
    failed_conversions = []
    for csv_file in csv_files:
        try:
            # Generate output shapefile name
            base_name = os.path.splitext(os.path.basename(csv_file))[0]
            output_shapefile = os.path.join(output_directory, f"{base_name}.shp")
            # Convert CSV to shapefile
            result = csv_to_shapefile_advanced(
                csv_file, output_shapefile, x_field, y_field, spatial_reference
            )
            if result:
                successful_conversions += 1
            else:
                failed_conversions.append(csv_file)
        except Exception as e:
            print(f"Failed to process {csv_file}: {str(e)}")
            failed_conversions.append(csv_file)
    # Summary report
    print("\nBatch Processing Summary:")
    print(f"Total files processed: {len(csv_files)}")
    print(f"Successful conversions: {successful_conversions}")
    print(f"Failed conversions: {len(failed_conversions)}")
    if failed_conversions:
        print("\nFailed files:")
        for failed_file in failed_conversions:
            print(f"  - {failed_file}")

# Example usage
batch_csv_to_shapefile(
    input_directory=r"C:\data\csv_files",
    output_directory=r"C:\output\shapefiles",
    x_field="longitude",
    y_field="latitude",
    spatial_reference=4326
)
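The batch function derives each output name directly from the CSV file name. If the input names contain spaces or other special characters, or if two inputs reduce to the same name, it may be worth planning the output paths up front. The sketch below is one possible approach; `plan_outputs` is a hypothetical helper, and the rule of allowing only letters, digits, and underscores is a conservative assumption rather than a documented ArcPy requirement.

```python
import os
import re

def plan_outputs(csv_files, output_directory):
    """Map each CSV path to a unique .shp path inside output_directory."""
    plan = {}
    used = set()
    for csv_file in sorted(csv_files):
        stem = os.path.splitext(os.path.basename(csv_file))[0]
        # Replace anything other than letters, digits, and underscores
        safe = re.sub(r"[^A-Za-z0-9_]", "_", stem) or "layer"
        name, counter = safe, 1
        # Append a numeric suffix until the name is unused (case-insensitive)
        while name.lower() in used:
            counter += 1
            name = f"{safe}_{counter}"
        used.add(name.lower())
        plan[csv_file] = os.path.join(output_directory, name + ".shp")
    return plan
```

The resulting dictionary can drive the conversion loop in place of the inline `base_name` logic.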
Method 4: Converting with Attribute Mapping and Field Types
Handle field types and attribute mapping during conversion.
import arcpy
import os

def csv_to_shapefile_with_fields(csv_file, output_shapefile, x_field, y_field,
                                 field_mapping=None, spatial_reference=4326):
    """
    Convert CSV to shapefile with custom field mapping

    field_mapping example:
    {
        'csv_field_name': {'shapefile_name': 'FIELD_NAME', 'type': 'TEXT', 'length': 50},
        'population': {'shapefile_name': 'POP', 'type': 'LONG'},
        'area_km2': {'shapefile_name': 'AREA_KM2', 'type': 'DOUBLE'}
    }
    """
    try:
        # Set environment
        arcpy.env.overwriteOutput = True
        # Use the output folder as the workspace so the temporary
        # feature class below has a defined location
        arcpy.env.workspace = os.path.dirname(output_shapefile)
        # Create XY event layer
        event_layer = "temp_event_layer"
        arcpy.management.MakeXYEventLayer(
            table=csv_file,
            in_x_field=x_field,
            in_y_field=y_field,
            out_layer=event_layer,
            spatial_reference=arcpy.SpatialReference(spatial_reference)
        )
        if field_mapping:
            # Create feature class with custom fields; keep the full path
            # returned by the tool for the later steps
            temp_fc = arcpy.management.CreateFeatureclass(
                out_path=arcpy.env.workspace,
                out_name="temp_fc.shp",
                geometry_type="POINT",
                spatial_reference=arcpy.SpatialReference(spatial_reference)
            ).getOutput(0)
            # Add custom fields
            for csv_field, field_info in field_mapping.items():
                field_name = field_info['shapefile_name']
                field_type = field_info['type']
                field_length = field_info.get('length', None)
                arcpy.management.AddField(
                    in_table=temp_fc,
                    field_name=field_name,
                    field_type=field_type,
                    field_length=field_length
                )
            # Copy features with field mapping
            field_mappings = arcpy.FieldMappings()
            field_mappings.addTable(event_layer)
            field_mappings.addTable(temp_fc)
            # Configure field mappings
            for csv_field, field_info in field_mapping.items():
                shapefile_field = field_info['shapefile_name']
                # Find and configure field map
                field_map_index = field_mappings.findFieldMapIndex(csv_field)
                if field_map_index != -1:
                    field_map = field_mappings.getFieldMap(field_map_index)
                    output_field = field_map.outputField
                    output_field.name = shapefile_field
                    field_map.outputField = output_field
                    field_mappings.replaceFieldMap(field_map_index, field_map)
            arcpy.management.Append(
                inputs=event_layer,
                target=temp_fc,
                schema_type="NO_TEST",
                field_mapping=field_mappings
            )
            # Copy to final shapefile
            arcpy.management.CopyFeatures(temp_fc, output_shapefile)
            # Clean up
            arcpy.management.Delete(temp_fc)
        else:
            # Simple copy without field mapping
            arcpy.management.CopyFeatures(event_layer, output_shapefile)
        print(f"Successfully created shapefile: {output_shapefile}")
    except arcpy.ExecuteError:
        print(f"ArcPy Error: {arcpy.GetMessages(2)}")
    except Exception as e:
        print(f"Error: {str(e)}")

# Example with field mapping
field_mapping = {
    'site_name': {'shapefile_name': 'SITE_NAME', 'type': 'TEXT', 'length': 50},
    'elevation': {'shapefile_name': 'ELEVATION', 'type': 'DOUBLE'},
    'date_collected': {'shapefile_name': 'DATE_COL', 'type': 'TEXT', 'length': 20}
}
csv_to_shapefile_with_fields(
    csv_file=r"C:\data\field_data.csv",
    output_shapefile=r"C:\output\field_data.shp",
    x_field="longitude",
    y_field="latitude",
    field_mapping=field_mapping,
    spatial_reference=4326
)
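Because a mistake in the `field_mapping` dictionary only surfaces once ArcPy starts adding fields, a quick check beforehand can save a failed run. The following sketch validates the dictionary structure used in this method. `validate_field_mapping` is a hypothetical helper, and the set of field types it accepts is an assumption about what a shapefile's attribute table can store; adjust it to match your ArcGIS version.

```python
# Field types assumed to be storable in a shapefile attribute table
SHAPEFILE_FIELD_TYPES = {"TEXT", "SHORT", "LONG", "FLOAT", "DOUBLE", "DATE"}

def validate_field_mapping(field_mapping, max_name_length=10):
    """Return a list of problems found in a field_mapping dictionary."""
    problems = []
    seen = set()
    for csv_field, info in field_mapping.items():
        name = info.get("shapefile_name", "")
        field_type = str(info.get("type", "")).upper()
        if not name:
            problems.append(f"{csv_field}: missing 'shapefile_name'")
        elif len(name) > max_name_length:
            problems.append(f"{csv_field}: '{name}' exceeds {max_name_length} characters")
        if name and name.upper() in seen:
            problems.append(f"{csv_field}: duplicate output name '{name}'")
        seen.add(name.upper())
        if field_type not in SHAPEFILE_FIELD_TYPES:
            problems.append(f"{csv_field}: unsupported type '{info.get('type')}'")
    return problems
```

An empty list means no problems were detected by these particular checks; it does not guarantee that the conversion will succeed.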
Common Coordinate Systems
When converting CSV to shapefile, specify the appropriate coordinate system:
- 4326: WGS 84 (Geographic, degrees)
- 3857: Web Mercator (Projected, meters)
- 4269: NAD 83 (Geographic, degrees)
- 32610: UTM Zone 10N (Projected, meters)
- 2154: RGF93 / Lambert-93 (Projected, meters; France)
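When the coordinate system of a CSV is undocumented, the value ranges give a rough hint about whether the numbers are degrees or projected units. The sketch below is a heuristic only: the function names are hypothetical, and small projected coordinates near a false origin would also pass the degree test, so the result should be confirmed against the data source.

```python
def looks_geographic(x_values, y_values):
    """True when every pair fits longitude/latitude ranges in degrees."""
    pairs = list(zip(x_values, y_values))
    return bool(pairs) and all(
        -180 <= x <= 180 and -90 <= y <= 90 for x, y in pairs
    )

def looks_swapped(x_values, y_values):
    """True when the data fits latitude/longitude order instead of X/Y order.

    Requires at least one Y value that cannot be a latitude, so ambiguous
    data (both columns within -90..90) is not flagged.
    """
    pairs = list(zip(x_values, y_values))
    fits_swapped = all(-90 <= x <= 90 and -180 <= y <= 180 for x, y in pairs)
    return bool(pairs) and fits_swapped and any(abs(y) > 90 for _, y in pairs)
```

Values that fail `looks_geographic` are more likely to belong to a projected system such as a UTM zone, where coordinates are typically in the hundreds of thousands or millions of meters.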
Best Practices
Data Preparation
- Ensure CSV files have proper headers
- Remove or handle null coordinate values
- Validate coordinate ranges for your coordinate system
- Use consistent field names across datasets
Performance Optimization
- Set arcpy.env.overwriteOutput = True so existing outputs are overwritten instead of causing an error
- Use appropriate workspace settings
- Clean up temporary layers and feature classes
- Process large datasets in batches
Error Handling
- Always wrap ArcPy operations in try-except blocks
- Check for file existence before processing
- Validate coordinate field names and data types
- Provide meaningful error messages
Troubleshooting
Common Issues
“Field does not exist” Error
- Verify field names match exactly (case-sensitive)
- Check for special characters or spaces in field names
- Ensure CSV file has proper headers
Invalid Coordinates
- Check coordinate format (decimal degrees vs. degrees-minutes-seconds)
- Verify coordinate system matches data format
- Look for missing or null coordinate values
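If the coordinates turn out to be stored as degrees-minutes-seconds text, they need converting to decimal degrees before the XY event layer can use them. The following is a minimal sketch of that conversion; the function names are hypothetical, and the parser assumes each value ends with a hemisphere letter (N, S, E, or W).

```python
import re

def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees, minutes, seconds and a hemisphere letter to decimal degrees."""
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range [0, 60)")
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    # South and west are negative by convention
    return -value if hemisphere.upper() in ("S", "W") else value

def parse_dms(text):
    """Parse a string such as 37°46'30"N and return decimal degrees."""
    cleaned = text.strip().upper()
    hemi_match = re.search(r"([NSEW])$", cleaned)
    if hemi_match is None:
        raise ValueError(f"No hemisphere letter at the end of: {text!r}")
    parts = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", cleaned)]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"Expected 1 to 3 numeric parts in: {text!r}")
    return dms_to_decimal(*parts, hemisphere=hemi_match.group(1))
```

The converted values can be written to new columns in the CSV, which then serve as the X and Y fields.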
Shapefile Creation Fails
- Ensure output directory exists and is writable
- Check for file locks on existing shapefiles
- Verify field names comply with shapefile limitations (10 characters max)
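Long CSV column names get cut to fit the 10-character limit, which can leave two columns with the same name. A helper like the one below can propose unique, compliant names in advance. It is a sketch under assumptions: `shapefile_field_names` is hypothetical, and the rules it applies (letters, digits, and underscores only, starting with a letter) are conservative conventions rather than an exact statement of the format's rules.

```python
import re

def shapefile_field_names(csv_fields, max_length=10):
    """Map CSV column names to unique names of at most max_length characters."""
    mapping = {}
    used = set()
    for original in csv_fields:
        # Replace characters other than letters, digits, and underscores
        base = re.sub(r"[^A-Za-z0-9_]", "_", original.strip())
        if not base or not base[0].isalpha():
            base = "F_" + base  # make the name start with a letter
        candidate = base[:max_length]
        suffix = 1
        # On a clash, shorten the base and append a number
        while candidate.upper() in used:
            tail = str(suffix)
            candidate = base[:max_length - len(tail)] + tail
            suffix += 1
        used.add(candidate.upper())
        mapping[original] = candidate
    return mapping
```

The resulting dictionary can be used to rename CSV columns before conversion, or as the basis for the `shapefile_name` entries in Method 4.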
Performance Tips
- Use pandas for data validation and cleaning before ArcPy processing
- Process data in chunks for very large CSV files
- Consider using File Geodatabase format for better performance with large datasets
- Enable spatial indexing for frequently queried shapefiles
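Chunked processing of a very large CSV can be sketched with the standard library alone. The generator below reads a fixed number of rows at a time; the name `iter_csv_chunks` and the default chunk size are assumptions for illustration. Each chunk could be written to its own temporary CSV and converted separately.

```python
import csv
from itertools import islice

def iter_csv_chunks(csv_path, chunk_size=50000):
    """Yield (header, rows) tuples with at most chunk_size rows each."""
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            return  # empty file: nothing to yield
        while True:
            rows = list(islice(reader, chunk_size))
            if not rows:
                break
            yield header, rows
```

Only one chunk is held in memory at a time, so memory use depends on the chunk size and not on the size of the file.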
Converting CSV files to shapefiles with ArcPy provides a robust, scriptable solution for spatial data workflows. The methods presented here offer flexibility for different use cases, from simple one-time conversions to complex batch processing scenarios with custom field mapping and validation.
Choose the appropriate method based on your specific requirements: use the basic method for simple conversions, the advanced method for production environments requiring validation, the batch processing approach for handling multiple files efficiently, or the field mapping method when output field names and types need to be controlled.