kottke.org. home of fine hypertext products since 1998.

๐Ÿ”  ๐Ÿ’€  ๐Ÿ“ธ  ๐Ÿ˜ญ  ๐Ÿ•ณ๏ธ  ๐Ÿค   ๐ŸŽฌ  ๐Ÿฅ”

The Bear with Its Own ZIP Code

Today I learned that ZIP Codes do not strictly represent geographic areas but rather “address groups or delivery routes”.

Despite the geographic derivation of most ZIP Codes, the codes themselves do not represent geographic regions; in general, they correspond to address groups or delivery routes. As a consequence, ZIP Code “areas” can overlap, be subsets of each other, or be artificial constructs with no geographic area (such as 095 for mail to the Navy, which is not geographically fixed). In similar fashion, in areas without regular postal routes (rural route areas) or no mail delivery (undeveloped areas), ZIP Codes are not assigned or are based on sparse delivery routes, and hence the boundary between ZIP Code areas is undefined.

The White House has its own ZIP Code (20500), as does the shoe floor of Saks Fifth Avenue in NYC (10022-SHOE). US mail to Santa Claus gets sent to the town of North Pole, Alaska (99705), but in Canada, Santa gets his own postal code (H0H 0H0). And Smokey Bear has his own ZIP Code (20252) because he gets so much mail.

That makes ZIP Codes an unreliable unit for geospatial data analysis:

Even though there are different place associations that probably mean more to you as an individual, such as a neighborhood, street, or the block you live on, the zip code is, in many organizations, the geographic unit of choice. It is used to make major decisions for marketing, opening or closing stores, providing services, and making decisions that can have a massive financial impact.

The problem is that zip codes are not a good representation of real human behavior, and when used in data analysis, often mask real, underlying insights, and may ultimately lead to bad outcomes. To understand why this is, we first need to understand a little more about the zip code itself.

For instance, in Miami’s 33139 ZIP Code the difference between the highest median income (as measured in much more granular US Census Block Groups) and lowest median income is over $240,000. So you can imagine it would be difficult to know or even assume anything in general about those residents based on their ZIP Code alone.
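
If you want to check how much a single ZIP Code hides, one rough approach is to group Census Block Group figures by ZIP Code and look at the gap between the highest and lowest median income. The sketch below assumes you already have records mapping each block group to a ZIP Code; the ZIP labels, block group IDs, income values, and the function name `income_spread_by_zip` are all invented for illustration and are not real Census data.

```python
from collections import defaultdict

# Hypothetical (zip_code, block_group_id, median_income) records.
# Every value here is made up for illustration; none of it is real Census data.
block_groups = [
    ("ZIP_A", "bg1", 30_000),
    ("ZIP_A", "bg2", 95_000),
    ("ZIP_A", "bg3", 270_000),
    ("ZIP_B", "bg4", 52_000),
    ("ZIP_B", "bg5", 58_000),
]


def income_spread_by_zip(records):
    """Return {zip_code: highest minus lowest block-group median income}."""
    incomes = defaultdict(list)
    for zip_code, _block_group_id, median_income in records:
        incomes[zip_code].append(median_income)
    return {z: max(values) - min(values) for z, values in incomes.items()}


spreads = income_spread_by_zip(block_groups)

# List ZIP Codes from widest to narrowest spread.
for zip_code, spread in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{zip_code}: ${spread:,} between highest and lowest block group")
```

A ZIP Code with a wide spread is one where a single average or median for the whole code would say very little about any particular resident.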