Let the People Define the Regions

Michael Cain

This off-the-cuff piece is inspired by, in order, people here saying "When you write about X, it sounds like a foreign country," for values of X like the South, the Midwest, or the Northeast; James Hanley's definition of the Midwest and the amazing set of "But wait!" comments it got; and Mike Schilling's (entirely proper) insistence that mathematics is a part of being well-rounded. I have occasionally floated the hypothesis that multi-state regions are defined by the pattern of migration between states, on the theory that people generally prefer to move within a region where they are comfortable. My thought was originally spurred by playing with this nifty visualization tool. The Census Bureau provides summary spreadsheets for state-to-state migration flows here. Caveats if you're going to play with them: the 2009 data is in a different layout than the 2011 and 2012 data, and the 2010 file is corrupted in some fashion and won't open. You might also enjoy fooling with Forbes' county-level interactive application based on IRS data.

With an appropriate distance measure defined based on the number of people who move between states [1], it's possible to partition the states into clusters such that states within a cluster are all "close" to one another and not so close to states in other clusters. I used a hierarchical partitioning scheme that builds clusters from the bottom up, using the average distance between cluster members to decide which clusters to merge. The seven-region partition shown here even makes some sense: there's a Northeast, a West, a Midwest split in two parts, a Southeast, a Mid-Atlantic, and a "Greater Texas" group. Compared with James' definition of the Midwest, migration patterns say Kentucky should be included with Illinois, Indiana, and Ohio; Kansas and Missouri are Texas-centric, not Midwestern.
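
For anyone who wants to try this at home, here is a minimal Python sketch of bottom-up clustering with average distance as the merge rule. It is not the code behind the map. The distance function (one over one plus the total number of movers in both directions) is purely an assumption for illustration, since the actual measure isn't spelled out here, and the "states" A through E and their flow counts are made-up toy values, not Census data. The names migration_distance and average_linkage are hypothetical.

```python
from itertools import combinations


def migration_distance(flows, a, b):
    """Assumed distance: more movers in either direction means 'closer'.

    This formula is an illustrative guess, not the measure used for the map.
    """
    total = flows.get((a, b), 0) + flows.get((b, a), 0)
    return 1.0 / (1.0 + total)


def average_linkage(flows, states, n_clusters):
    """Merge clusters bottom-up until n_clusters remain.

    At each step, the two clusters with the smallest average pairwise
    distance between their members are merged.
    """
    clusters = [frozenset([s]) for s in states]

    def avg_dist(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(migration_distance(flows, a, b) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg_dist(*p))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


# Toy, made-up flows keyed by (origin, destination); NOT real Census numbers.
toy_flows = {
    ("A", "B"): 900, ("B", "A"): 800,
    ("C", "D"): 700, ("D", "C"): 600,
    ("A", "C"): 20, ("B", "D"): 10,
    ("E", "A"): 50, ("E", "B"): 40,
}

regions = average_linkage(toy_flows, ["A", "B", "C", "D", "E"], 3)
for region in regions:
    print(sorted(region))
```

With the toy numbers, A and B pair up first, then C and D, leaving E on its own. Asking for two regions instead of three would attach E to the A/B group. To use real data, fill the flows dictionary from one of the Census spreadsheets and ask for seven clusters; a sensible distance would probably also need to account for state population, which this sketch ignores.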


[1] Which pairs of states are closest by this measure? Minnesota/North Dakota, California/Nevada, Massachusetts/New Hampshire, Colorado/Wyoming, and Kansas/Missouri are the five closest pairs. This generally coincides with our — or at least my — intuition about things.