Geo-Rails Part 9: The PostGIS spatial_ref_sys Table and You

2012-02-05 georails

When you create a spatial database using PostGIS, you may notice that PostGIS automatically installs a table called "spatial_ref_sys". This is a standard table for spatial databases, as required by the Open Geospatial Consortium's specification. It defines which SRIDs are allowed in your geometries, and provides information about the corresponding coordinate systems.

In this article, we'll take a brief look at the spatial_ref_sys table and how you can use it in your application. We'll cover:

What's useful about the spatial_ref_sys table
Where the spatial_ref_sys data comes from, and how you can populate your own custom data.
Accessing spatial_ref_sys data from Ruby using RGeo's SRSDatabase

This is part 9 of my continuing series of articles on geospatial programming in Ruby and Rails. For a list of the other installments, please visit http://daniel-azuma.com/articles/georails.

So what's in the `spatial_ref_sys` table anyway?

The spatial_ref_sys table is defined by an OGC specification entitled Simple Feature Access Part 2: SQL Option. This is a companion to the Simple Features specification we covered in earlier articles. It takes the standard data types and operations and specifies how such data should appear in an SQL-based relational database. Many of the "ST_*" functions that we've used when interacting with PostGIS are defined in this specification.

We recall from part 4 that when you're working with spatial data, every coordinate references a coordinate system that defines what the coordinate means---whether it is latitude and longitude, or feet from your front door, or light-years from Alpha Centauri. Those coordinate systems are generally represented by a numeric ID reference known as the SRID. Spatial_ref_sys is merely a table of known coordinate systems keyed by their SRID. According to the spec:

Every Geometry Column is associated with a Spatial Reference System. The Spatial Reference System identifies the coordinate system for all geometric objects stored in the column, and gives meaning to the numeric coordinate values for any geometric object stored in the column. ... The Spatial Reference System Identifier (SRID) constitutes a unique integer key for a Spatial Reference System within a database.

How is this useful? Generally, spatial_ref_sys will provide an actual definition of the coordinate system for each SRID. This theoretically provides enough information to correctly interpret every piece of coordinate data in your database, and even---where possible---to convert the data into whatever coordinate system you need.

According to the spec, all implementations of spatial_ref_sys include these columns:

srid: The numeric SRID. This should be the table's primary key.
auth_name: An authority name as a string. This is set if this coordinate system is specified by an outside authority such as EPSG.
auth_srid: The numeric ID of the coordinate system in the above authority's catalog.
srtext: The Well-Known-Text (WKT) representation of the coordinate system (as we described in part 4).

If you are using PostGIS, you'll notice spatial_ref_sys has one more non-standard but very useful column:

proj4text: The Proj4 representation of the coordinate system.

Where does the data come from and what does it do?

In many cases, a spatial database will prepopulate spatial_ref_sys for you with a standard set of EPSG data. If you are using PostGIS, this is handled by the spatial_ref_sys.sql script, which gets run when you create a spatial database. If you are using Rails and follow the steps in part 2, the activerecord-postgis-adapter will do this for you automatically when you create your database.

The EPSG spatial reference database is such a universal standard that in most cases it is probably best to use one of its coordinate systems and its corresponding SRID. This will maximize the chances that your SRIDs will match those used by any external data sources you will interact with---meaning your data will be easily portable. However, you may run into a case where you need to define your own coordinate system, perhaps because you're using an unusual map projection. In such cases, you can add rows to the spatial_ref_sys table. You can also delete SRID rows that you don't need. It is, after all, merely a table in your database.

The main caveat is that most spatial databases (including PostGIS) will establish a foreign key constraint between SRIDs and this table. This means that your data can't just choose any SRID it wants. The SRID has to exist---and by "exist", we simply mean it has to be present in the spatial_ref_sys table. So you just need to make sure that you don't delete any SRIDs that you need, and you add any that aren't provided by default.

Some databases will provide additional tools that leverage the spatial_ref_sys information. PostGIS, for example, provides the SQL function ST_Transform(), which lets you transform a geometry from one coordinate system to another. When you call it, you provide the SRID of the desired coordinate system. PostGIS then looks up both the original and the target SRIDs in spatial_ref_sys, and uses the coordinate system information there to figure out how to compute the transformation.

Accessing the `spatial_ref_sys` from Ruby

The RGeo library provides a convenient way to look up coordinate systems from the spatial_ref_sys table. If you're already using PostGIS and activerecord-postgis-adapter, it's quite easy:

srs_database = RGeo::CoordSys::SRSDatabase::ActiveRecordTable.new
entry = srs_database.get(4326)
entry.identifier  # => 4326
entry.name        # => "WGS 84"
entry.proj4.to_s  # => " +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0"

You can now, for example, create a factory using the SRID (from entry.identifier) and the Proj4 object (from entry.proj4). As we saw in part 4, you generally need to provide Proj4 information if you are going to be converting data between coordinate systems.

As a convenient shorthand, some factories let you pass in an srs_database object when you construct a factory. The factory constructor will then go through the process of looking up the coordinate system definition and extracting the Proj4 specification. For example:

srs_database = RGeo::CoordSys::SRSDatabase::ActiveRecordTable.new
my_factory = RGeo::Geos.factory(:srs_database => srs_database, :srid => 3785)
my_factory.proj4.to_s  # => " +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +units=m +k=1.0 +nadgrids=@null +no_defs"

Other SRSDatabase sources

The spatial_ref_sys table is not the only source of coordinate system information. The Proj4 library itself installs a bunch of data files that contain most of the EPSG codes and other information. There are also online services that you can use to look up coordinate systems; one particularly important one is spatialreference.org.

The SRSDatabase mechanism in RGeo supports a common interface for coordinate system databases. In the section above, we looked up EPSG 4326 from the spatial_ref_sys table using the ActiveRecordTable class. Alternately, we can use the SrOrg class to look up information from spatialreference.org:

srs_database = RGeo::CoordSys::SRSDatabase::SrOrg.new('EPSG')
entry = srs_database.get(4326)
# ...

You can choose an appropriate source of coordinate system information based on the requirements of your application. In many cases, however, I recommend using the spatial_ref_sys table because it is convenient and readily available. See the RDocs for RGeo for more information on connecting to different SRSDatabase sources.

Where to go from here

In this article, we covered spatial_ref_sys, one of the standard tables that is included in most spatial databases. For more information on this table and how it is implemented in the PostGIS database, see the PostGIS documentation online. The official specification of the table is available in the OGS Simple Features for SQL spec.

RGeo provides basic convenience tools for accessing coordinate system information from the spatial_ref_sys table. We covered a few basic examples in this article. For detailed information, see the RGeo::CoordSys::SRSDatabase module in the RGeo documentation.

Observant readers may notice that PostGIS includes one more automatic table, called geometry_columns. I may cover this table in a later article that digs deeper into PostGIS, but if you're interested now, it's described in the PostGIS documentation.

This is part 9 of my continuing series of articles on geospatial programming in Ruby and Rails. For a list of the other installments, please visit http://daniel-azuma.com/articles/georails.

So what's in the spatial_ref_sys table anyway?

Where does the data come from and what does it do?

Accessing the spatial_ref_sys from Ruby

Other SRSDatabase sources

Where to go from here

Dialogue & Discussion

So what's in the `spatial_ref_sys` table anyway?

Accessing the `spatial_ref_sys` from Ruby