Analyzing the Bührer dataset

Which data from the available Bührer dataset actually made it onto one of the maps? A mosaic plot, produced with the vcd package for the open-source statistical software R, gives a quick overview of the relevant factors.

Mosaic plot of the Bührer dataset

The plot essentially shows areas proportional to the number of persons, arranged by emigration status (left) and map number (top). Within a given combination, the successive red, black and grey blocks denote Named, Married and Descendant persons respectively (see The methodology – preparing genealogical data for maps for explanations). These three categories make up roughly 4,500 persons of the original dataset; the remainder is not shown. The small circles denote combinations that did not occur in the dataset.
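
As a rough illustration of what the tile areas encode, the following Python sketch computes the share of the total area that each combination would receive. The records are invented placeholders rather than values from the Bührer dataset, and the actual plot was produced with R's vcd package, not with this code.

```python
from collections import Counter

# Hypothetical records: (emigration status, map number, person category).
# The labels echo those used in the text; the rows themselves are invented.
records = [
    ("Emigrated to US", 1, "Named"),
    ("Emigrated to US", 1, "Married"),
    ("Assumed US", 2, "Named"),
    ("Assumed US", 2, "Descendant"),
]


def tile_shares(rows):
    """Return each combination's share of all rows (its relative tile area)."""
    counts = Counter(rows)
    total = sum(counts.values())
    return {combo: n / total for combo, n in counts.items()}


shares = tile_shares(records)
```

A combination that never occurs simply has no entry in `shares`, which corresponds to the small circles in the plot.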

A few observations:

  • Only a small fraction of the persons in the dataset actually show up on maps 1 and 2. This comes as no surprise, given the large number of, for example, Swiss-based Bührers, “Assumed US” persons (known descendants of emigrants with no place information) and “Undetermined” persons whose location could be neither determined nor inferred.
  • The number of Bührers emigrating in the generations prior to 1880 (map 1) is significantly larger than the number of emigrating spouses from Switzerland, reflecting the fact that most married only once they were overseas. A look at the category “Third country emigrated to US” indicates that a substantial part of the Bührers – at least for the first generation – preferred to marry other emigrants.
  • There is very little Bührer emigration in the generations born after 1880 (map 2) – almost all Bührers in that period are American-born.

The plot was featured in a short presentation (R User Meetup Mosaic plot Thomas Roth 20160803, which includes the R code) at a Zurich R User Group Meetup.

Plotting the map’s family tree

The family tree (Family Tree of Emigrated Buehrers), which is very wide at 13 A3 pages, shows all persons who emigrated to the United States, including their ancestors and their immediate relatives. Persons are aligned horizontally by generation, with the oldest generations at the top. Squares denote males, circles females and triangles marriages. Persons drawn with black line symbols are shown on the map, whereas those with grey line symbols are not. Otherwise the symbology follows that of the map, i.e. line styling indicates category and colours show common male ancestors.

Detail of the family tree of emigrated Bührers

Data for the family tree was prepared in the project’s PostgreSQL database by stripping irrelevant persons and families. The family tree was drawn in yEd in “Family Tree” mode and styled via the Properties Mapper.

GIS data used in the emigration map project

Data for the emigration map project was – except for the genealogical data – all from public sources. Interestingly enough there is also a relative wealth of sources with relevant historical GIS data.

  • Genealogical data from the Swiss Buehrer Web Site as of October 2011. For practical reasons (notably less georeferencing work), all persons not linked to the main family tree were removed, as were substantial irrelevant side lines such as the Finney emigration from Ireland.
  • Digital Elevation Model (DEM, 30” resolution) from the U.S. Geological Survey. It provided the base for the smoothed hillshade layer, which looks fairly simple but proved to be the most difficult to produce.
    The four 30” DEM tiles delimited by longitude/latitude (W140N90, W140N40, W100N90 and W100N40) that cover the United States were merged into a single DEM file which was subsequently reprojected and downsampled, all in QGIS.
    In GRASS a moving average was applied to the DEM which was exported as a GeoTIFF. From there, gdaldem was used to generate a hillshade GeoTIFF that received final blurring in Photoshop. Anything but easy.
  • Physical features (1:10 million scale) such as ocean, coastline, land, rivers and lakes from Natural Earth
  • US counties from the US Census Bureau
  • Historical state boundaries (filtered as of 1870) from the Atlas of Historical County Boundaries Project
  • Historical US railroads (1870) from the Railroads and the Making of Modern America Project
  • Historical US census data (1870) from the National Historical Geographic Information System (NHGIS) of the Minnesota Population Center. The custom download included, for example, the number of Swiss-born citizens per county and other interesting data not yet used in the project.
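
The smoothing step applied to the DEM can be pictured with a small example. The following is a minimal pure-Python sketch of a moving average over an elevation grid. It only illustrates the idea and is not the GRASS operation used in the project; the window size, the edge handling and the sample elevations are all assumptions.

```python
def moving_average(grid, radius=1):
    """Smooth a 2D grid by averaging each cell with its neighbours.

    Cells near the border average only over the neighbours that exist.
    """
    rows, cols = len(grid), len(grid[0])
    smoothed = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(sum(window) / len(window))
        smoothed.append(out_row)
    return smoothed


# Invented elevations (metres) with a single sharp peak in the centre.
dem = [
    [100, 100, 100],
    [100, 190, 100],
    [100, 100, 100],
]
smooth_dem = moving_average(dem)
```

Flattening sharp peaks in this way before computing the hillshade removes fine relief detail, which is what gives the final layer its smooth look.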


Software used for the family tree/GIS mapping project

The MacFamilyTree software from Synium was used to import, modify, consolidate and analyse the genealogical data. It was also used for the normalization, completion and geocoding of places. Except for MacFamilyTree, all software mentioned here is open source.

Data was exported from MacFamilyTree’s underlying SQLite database as an SQL import script with the help of the SQLite Database Browser and subsequently imported into a PostgreSQL database with the PostGIS extension, which adds support for geographic objects. Unfortunately there seems to be no high-quality GEDCOM-based parser/importer for SQL databases. Data handling and SQL scripting were done using pgAdmin3.

The very flat data structure from MacFamilyTree was subsequently transformed into a more intuitive data model (“person”, “family”, “place”, “person_event” etc.) that served as a base for the extensive coded analysis and transformation logic in PostgreSQL’s procedural language PL/pgSQL.
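
To make the shape of such a data model concrete, here is a minimal Python sketch using dataclasses. Only the four entity names come from the project; every field name and the `events_of` helper are hypothetical and do not reflect the actual PostgreSQL schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Place:
    place_id: int
    name: str
    lat: Optional[float] = None  # assumed to be filled in by geocoding
    lon: Optional[float] = None


@dataclass
class Person:
    person_id: int
    surname: str
    family_id: Optional[int] = None  # family the person was born into


@dataclass
class Family:
    family_id: int
    husband_id: Optional[int] = None
    wife_id: Optional[int] = None


@dataclass
class PersonEvent:
    person_id: int
    kind: str  # e.g. "birth", "marriage", "death"
    place_id: Optional[int] = None
    year: Optional[int] = None


def events_of(person, events):
    """Return all events recorded for one person (hypothetical helper)."""
    return [e for e in events if e.person_id == person.person_id]


person = Person(1, "Bührer")
events = [PersonEvent(1, "birth", place_id=10, year=1840), PersonEvent(2, "birth")]
own_events = events_of(person, events)
```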

All logic (and some data patching) was applied in roughly 40 sequential scripts per object. This repeatable processing proved to be a key success factor, given the large number of methodological, coding and data errors encountered along the way that forced reprocessing.

Screenshot of the QGIS project for the emigration map

All mapping and layout was done in QGIS, with key features for the project becoming available only in QGIS 2.2. Data came either from PostGIS layers in PostgreSQL or from shapefiles from various sources. The original approach of creating a raw map that would receive its finish in a vector-based editor was abandoned in favour of end-to-end map production in QGIS. This reflects the growing maturity of QGIS on the one hand and, on the other, the difficulty of processing the enormous number of paths in its vector-based output in other programs.

The map – generalization challenges to preserve map readability

Symbology and family references in square brackets are explained in the legend.

Map detail showing ranking-driven labels in the main immigration area around Fulton county

Styling of the map was inspired by the Schweizer Weltatlas.

  • Migration paths from/to overseas were displayed as point decorations with arrows, with label position and orientation calculated. To minimize clutter in northwestern Ohio, all arrows were offset by a fixed distance and aligned on a circle, grouped by destination and origin. Internal migration paths, in contrast, were real lines. All migration paths made extensive use of data-defined properties to control colour (emigration generation), dashing (person scope), line width (number of persons) and label styling.
  • The number of distinct Bührer persons per county and common ancestry is indicated by the circle size. Each distinct common male ancestor has a different colour, graded by presumed emigration period from red (early emigrants) to blue (late emigrants).
    Counties with Bührers of different ancestry have an additional transparent circle with a bold outline to indicate the total number of Bührers. A pie-chart representation was neither possible, given the restrictions of overlay charts in QGIS, nor practical, given the large number of Bührer persons evidenced e.g. in Fulton County.
  • A custom label ranking for places was calculated to prevent label clutter, above all in Ohio, and to ensure that place names representing the largest Bührer populations prevail.
  • Certain label information, such as place names in red marking the first migration evidence in a region or labels for first-time migration paths between regions, was forced to always be displayed. Label positioning in general, and the label text of migration paths in particular, was extensively tweaked by hand to optimize the map.
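
The label ranking described in the list above could be sketched roughly as follows in Python. This is an illustration under assumptions, not the project's calculation: the field names, the rule that forced labels always rank first, and the person counts are all invented.

```python
from dataclasses import dataclass


@dataclass
class Place:
    name: str
    persons: int  # Bührer persons evidenced at this place (invented counts)
    forced: bool = False  # e.g. first migration evidence in a region


def label_ranking(places):
    """Order places so forced labels come first, then larger populations."""
    return sorted(places, key=lambda p: (not p.forced, -p.persons, p.name))


places = [
    Place("Archbold", 120),
    Place("Place A", 5, forced=True),  # hypothetical place names
    Place("Place B", 40),
]
ranked = [p.name for p in label_ranking(places)]
```

A labelling engine that places labels in this order and drops those that collide would keep the forced and the most populous place names, which is the behaviour the ranking is meant to achieve.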

Improving content and readability is probably best explained by comparing the final result with an earlier one-map version in QGIS 1.8 where point decorations with arrows were not yet supported.

Emigration map - first version with QGIS 1.8
An earlier version of the map project done with QGIS 1.8. The “missile attack” immigration paths obscure most of the map. Note the GIS data artifacts, e.g. around the Great Lakes, from before the switch to the superior Natural Earth GIS data.

The methodology – preparing genealogical data for maps

The production of the map showing the emigration of Bührers from Switzerland to the United States relied on the following, largely self-developed sequential steps:

  • Normalization, completion and geocoding of places used for family and person events
  • Identification and categorization of in-scope persons, notably persons born as Bührer or with name variants such as Buehrer (“Named”), spouses (“Married”) and their children/grandchildren (“Descendant”) who have different family names
  • Constructing a sequence of geocoded events for a person’s life, also considering childbirth for women. In case geocoded events lacked dates a natural sequence was assumed, i.e. birth followed by marriage, childbirth, death and burial.
  • Determination of an emigration/residence status relative to Switzerland, the US or third countries. Of particular interest were those that emigrated to the US as well as confirmed or assumed US residents
  • Determination of a common male ancestor for all Bührers that emigrated or have lived in the US and the generation relative to him
  • Deriving a family status for emigrants, i.e. whether emigrants emigrated as single, with their spouse or family
  • Assignment of persons to a time period (generations prior/beyond 1880) based on known birth years, ensuring a consistent assignment of couples and siblings to the same period
  • Construction of migration path segments following the sequence of geocoded events
  • Aggregation of migration paths per county and time period, including aggregated indicators such as the category of in-scope person with Bührer prevailing and the minimal generation involved
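
Two of the steps listed above, building the event sequence and deriving migration path segments from it, can be sketched in Python as follows. This is a simplified illustration under assumptions: the `Event` fields are hypothetical, events are sorted by year only when every event is dated, and otherwise the natural order is used. The project's actual PL/pgSQL logic is not shown here and may handle mixed cases differently.

```python
from dataclasses import dataclass
from typing import Optional

# Natural order assumed when geocoded events lack dates.
NATURAL_ORDER = ["birth", "marriage", "childbirth", "death", "burial"]


@dataclass
class Event:
    kind: str
    place: str
    year: Optional[int] = None


def order_events(events):
    """Sort by year when every event is dated, else by the natural order."""
    rank = {kind: i for i, kind in enumerate(NATURAL_ORDER)}
    if all(e.year is not None for e in events):
        return sorted(events, key=lambda e: (e.year, rank[e.kind]))
    return sorted(events, key=lambda e: (rank[e.kind], e.year or 0))


def path_segments(events):
    """Return (origin, destination) pairs wherever the place changes."""
    ordered = order_events(events)
    return [
        (a.place, b.place)
        for a, b in zip(ordered, ordered[1:])
        if a.place != b.place
    ]


# Invented person: born in Bibern, married and died in Archbold.
events = [
    Event("death", "Archbold"),
    Event("birth", "Bibern", 1840),
    Event("marriage", "Archbold"),
]
segments = path_segments(events)
```

Consecutive events at the same place produce no segment, so a person who never moved contributes nothing to the migration paths.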

Geocoding places in MacFamilyTree

Sample PostgreSQL script (categorization of in-scope persons)

Mapping the Emigration of the Bührers from Switzerland to the United States

In the second half of the 19th century many farmers emigrated from the rural communities of the Swiss canton of Schaffhausen, mainly due to a local agricultural crisis and better economic prospects elsewhere. There were emigration peaks around 1850, 1870 and in the 1880s, with a large majority of emigrants heading to the United States. From 1868 to 1890 the equivalent of more than 10% of the canton’s population left, a large number by Swiss standards at the time.

Among the emigrants were many Bührers from the small hamlets of Bibern, Hofen and Opfertshofen as well as from Herblingen and Stetten in the Reiat region. Having originally immigrated from southern Germany, the Swiss Bührers by and large remained confined to that tiny corner of Switzerland, which greatly facilitates genealogical research.

Emigration of Buehrer to the United States – Map 1

The maps show the Bührers’ emigration to the United States up to the 1880s (Map 1) and the subsequent internal migration up to the present (Map 2). Almost all of them initially immigrated to a small area in northwestern Ohio around the city of Archbold in Fulton County; some of them migrated further west to Kansas, Texas or the Pacific Northwest.

In order to provide the necessary historical context, Map 1 shows historical state boundaries, population density and the railway network (the main mode of transportation for Swiss emigrants) as of 1870, the heyday of the Bührer emigration.

Emigration of Buehrer to the United States – Map 2

A legend explains the symbology.

Analysis, visualisation and map-making with genealogical data