Chapter 3 Usage

oepsData is centered around two functions: load_oeps_dictionary, which loads a basic data dictionary; and load_oeps, which directly loads OEPS data. We expect that most users will start by calling load_oeps_dictionary to look at what data is available at their desired analysis scale, followed by calling load_oeps to actually load the data.

3.1 load_oeps_dictionary

load_oeps_dictionary takes one argument:

  • scale One of “tract”, “zcta”, “county”, or “state”

It returns the data dictionary (stored as a data.frame).

# Load the data dictionary for everything available at the state level
data_dictionary <- load_oeps_dictionary(scale="state")

If you are working in RStudio, we recommend browsing the dictionary through the View command:

View(data_dictionary)

Here in the docs we can preview it directly:

data_dictionary <- load_oeps_dictionary(scale="state")
data_dictionary
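Because the dictionary is an ordinary data.frame, it can also be searched with base R. The sketch below defines a small helper, search_dictionary, which is hypothetical and not part of oepsData. It makes no assumptions about the dictionary's column names: it keeps any row in which at least one column contains the search term.

```r
# Hypothetical helper (not part of oepsData): keep the rows of a data.frame
# in which any column contains `pattern`, ignoring case.
search_dictionary <- function(dictionary, pattern) {
  # One logical vector per column, TRUE where that column's value matches
  matches_by_column <- lapply(dictionary, function(column) {
    grepl(pattern, as.character(column), ignore.case = TRUE)
  })
  # A row is kept if it matched in at least one column
  keep <- Reduce(`|`, matches_by_column)
  dictionary[keep, , drop = FALSE]
}

# Usage with the dictionary loaded above:
# search_dictionary(data_dictionary, "school")
```

Searching every column is slower than filtering on a single known column, but it works without knowing in advance how the dictionary is laid out.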

3.2 load_oeps

We might find that we’re interested in the 1990 state data. We can load that data and its geometries using load_oeps, which accepts the following arguments:

  • scale The scale of analysis. One of “tract”, “zcta”, “county”, or “state”

  • year The release year for the data. One of 1980, 1990, 2000, 2010, or 2018.

  • themes The theme to pull data for. One of “Geography”, “Social”, “Environment”, “Economic”, “Policy”, “Composite”, or “All”. Defaults to “All”.

  • states A string or vector of strings specifying which states to pull data for, either as FIPS codes or names. Ignored when scale is “zcta”. Defaults to None.

  • counties A string or vector of strings specifying which counties to pull data for, either as FIPS codes or names. Ignored when scale is “zcta”, and must be specified alongside states. Defaults to None.

  • tidy Boolean specifying whether to return data in tidy format; defaults to FALSE.

  • geometry Boolean specifying whether to pull geometries for the dataset. Defaults to FALSE.

  • cache Boolean specifying whether to use cached geometries. Defaults to TRUE. See A note on caching for more information.


states_1990 <- load_oeps(scale="state",
                         year=1990,
                         geometry=TRUE)

head(data.frame(states_1990))

This lets us work with the data however we like. For instance, we can make a simple map:

library(tmap)
#> Breaking News: tmap 3.x is retiring. Please test v4, e.g. with
#> remotes::install_github('r-tmap/tmap')
library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE

# reproject to a better display CRS
states_1990 <- st_transform(states_1990, "ESRI:102004")

tm_shape(states_1990) + 
  tm_fill("NoHsP", style="jenks") +
  tm_borders(alpha=0.05) +
  tm_layout(main.title = "Population over 25 without a high school degree")

See Examples for many more demonstrations of how you can use this function.
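The filtering arguments described above can be combined in a single call. The following is a minimal sketch rather than a tested recipe: it assumes that county-level data in the “Social” theme exists for 2010, which the data dictionary can confirm. The champaign_args list and the requireNamespace guard are illustrative choices, not part of oepsData.

```r
# Filtering arguments, gathered in a named list so they are easy to inspect and reuse
champaign_args <- list(
  scale = "county",
  year = 2010,
  themes = "Social",
  states = "Illinois",
  counties = "Champaign"
)

# Only attempt the download when oepsData is installed
if (requireNamespace("oepsData", quietly = TRUE)) {
  champaign_2010 <- do.call(oepsData::load_oeps, champaign_args)
  head(champaign_2010)
}
```

Passing the same values directly, as in load_oeps(scale="county", year=2010, ...), is equivalent; the list form is simply convenient when the same filter is reused across several years.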

3.2.1 A note on caching

oepsData pulls its data from online repositories, primarily GitHub. This can cause problems for users on slow internet connections, for whom larger datasets can take a long time to load, and for users who expect to need the package while entirely offline.

To help minimize these issues, oepsData caches, or saves a local copy of, data loaded by load_oeps on its first load. Any later usage of the dataset will be pulled from the local cache.

Additionally, oepsData offers a few commands that help maintain the cache:

  • cache_geometries and cache_oeps_tables pre-cache all geometries and all tables, respectively, overwriting any existing cache content in the process.
  • clear_cache deletes all cached data.
  • cache_dir returns the directory of the oepsData cache.

Users who want to avoid using cached data and instead download data fresh every time can set cache=FALSE when calling load_oeps.

3.3 state_to_fips

This is a helper function that takes a given state’s name or abbreviation and returns the state’s FIPS code.

state_to_fips("Illinois")
#> [1] "17"
state_to_fips("IL")
#> [1] "17"

3.4 county_to_fips

This is a helper function that takes a county’s name and the FIPS code of its state and returns the county-level FIPS code.

county_to_fips("Champaign", 17)
#> [1] "17019"
county_to_fips("Champaign", state_to_fips("IL"))
#> [1] "17019"
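Both helpers return FIPS codes as character strings rather than numbers. A short base R sketch, independent of oepsData, shows why this matters: a county-level code is a two-digit state code followed by a three-digit county code, and leading zeros are lost if either part is stored as a number. The variable names here are illustrative only.

```r
# A county FIPS code is a two-digit state code followed by a three-digit county code
county_fips <- "17019"
state_part  <- substr(county_fips, 1, 2)  # "17"
county_part <- substr(county_fips, 3, 5)  # "019"

# Numeric codes lose leading zeros; sprintf() restores the fixed width.
# California's state code is 6 as a number but "06" as a FIPS string.
state_code <- sprintf("%02d", 6L)         # "06"

# Rebuild the county-level code from its two parts
rebuilt <- paste0(state_part, county_part)
```

Keeping codes as strings throughout avoids mismatches when joining OEPS data to other tables keyed on FIPS.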