Incorporating costs in conservation planning

Niche: use of cost data in conservation plans

  • As with benefits, variability in costs can be large
    • Including spatial variability and variability across scope and scale of project
    • Understanding this variability can improve planning outcomes

Key cites
Babcock et al. 1997: https://doi.org/10.2307/3147171

  • Describes relative efficiency of management rules under different joint distributions of costs and benefits
  • Alternative targeting instruments considered incl. cost-targeting, benefit-targeting, and ratio-targeting (cost per benefit targeting)
  • Relative variability of benefits and costs, and correlation between the two, determine effects of sub-optimal targeting

Naidoo et al. 2006: https://doi.org/10.1016/j.tree.2006.10.003

  • Types of costs: acquisition, management, transaction (and opportunity, damage costs)
    • (Can be continuous or one-off)
  • Often based on non-monetary proxies
    • Most often area
    • Sometimes weighted, but often in arbitrary ways
  • Efficiency gains from incorporating costs

Gap

  • Past studies of culverts have focused on benefits and used simplified cost models
  • Past studies of conservation costs have focused on land acquisition costs rather than restoration efforts
    • Unique features of culvert improvement in PNW: upstream land access model, lots of streams/roads, large variation in slope and stream size
  • Timely b/c Washington culvert case

Research approach

  • Examine variability in cost levels and drivers of costs across culvert projects in PNW using statistical model
  • Compare levels and variability of costs to (possibly several) benefit measures
  • Apply model to extant culverts to compare cost/benefit distributions over…
    • Space: where are high benefit, low cost culverts?
    • Observed projects vs. all culverts: what kind of decision rule is distribution of projects consistent with?

RQ1: How much variability is there in costs for culvert improvements?

  • Over space?
  • For observed projects vs. potential projects?
  • Relative to variability in benefits? (And implications for planning rules/future research)

RQ2: What are drivers of culvert improvement costs?

  • Economic drivers: economies of scale, transaction costs
  • Geophysical drivers: stream features, terrain features
  • And are these drivers also drivers of benefits (i.e. upstream habitat quality for target species)?

Data description

The unit of observation in our data is a culvert worksite. These data include all unique worksites associated with a culvert action in the PNSHP data between 2001 and 2015. Each worksite is associated with a project, a set of geographic coordinates, and the number of culverts at the site. Projects are associated with a year, a reporting source, and a unique cost. We also calculate the number of culverts associated with the project (methods found here). Note that a project may be associated with multiple related worksites, though 70% of worksites are uniquely identified to a project.

Dependent and explanatory variables included in the empirical model are described below, including a brief justification for inclusion. A more in-depth exploration of these variables can be found in this report.

Descriptive statistics (n = 1,234)
| Variable | Mean | Std. dev. | Number of levels |
|---|---|---|---|
| Cost per culvert ($USD2019) | 82,600 | 93,800 | |
| Number of worksites (count) | 1.48 | 0.958 | |
| Distance between worksites (m) | 507 | 1,360 | |
| Stream slope (%) | 0.0458 | 0.0406 | |
| Bankfull width (m) | 7.56 | 5.41 | |
| Terrain slope (deg) | 27.3 | 12 | |
| Housing density (units per sq. km) | 5.96 | 24.7 | |
| Construction employment (jobs) | 3,010 | 5,140 | |
| Ag/forestry employment (jobs) | 778 | 534 | |
| Distance to urban area (m) | 45,200 | 31,800 | |
| Density of construction equipment wholesalers (workers per sq. km) | 0.0216 | 0.032 | |
| Density of brick, stone, and related wholesalers (workers per sq. km) | 0.00276 | 0.00721 | |
| Density of durable metals wholesalers (workers per sq. km) | 0.0049 | 0.00939 | |
| Density of sand and gravel sales yards (firms per sq. km) | 0.0000244 | 0.00004 | |
| Private land, individual or company owner (proportion of land within 500m radius) | 0.447 | 0.436 | |
| Private land, managed by industry (proportion of land within 500m radius) | 0.203 | 0.356 | |
| Private land, managed by non-industrial owner (proportion of land within 500m radius) | 0.0349 | 0.169 | |
| Paved road | | | 2 |
| Road speed class | | | 6 |
| Land cover class | | | 6 |
| Basin | | | 9 |
| Year | | | 15 |
| Reporting source | | | 6 |

Cost estimates

Cost per culvert ($USD, 2019) is our primary dependent variable. It is constructed by dividing the reported project cost by the number of culverts associated with the project, and can therefore be interpreted as the project's average cost at each worksite.
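As a minimal sketch of this construction, the per-culvert cost divides each project's reported cost by the total number of culverts across its worksites, so every worksite in a project carries the same value. The field names (`project_cost`, `n_culverts`) and the records are hypothetical, not PNSHP columns or data.

```python
from collections import defaultdict

# Hypothetical worksite records; field names and values are illustrative only.
worksites = [
    {"worksite": "A1", "project": "P1", "n_culverts": 1, "project_cost": 120_000.0},
    {"worksite": "A2", "project": "P1", "n_culverts": 2, "project_cost": 120_000.0},
    {"worksite": "B1", "project": "P2", "n_culverts": 1, "project_cost": 75_000.0},
]


def cost_per_culvert(rows):
    """Return {worksite: project cost / number of culverts in the whole project}."""
    culverts = defaultdict(int)
    for r in rows:
        culverts[r["project"]] += r["n_culverts"]
    return {r["worksite"]: r["project_cost"] / culverts[r["project"]] for r in rows}


per_culvert = cost_per_culvert(worksites)
print(per_culvert)
```

Both worksites of the hypothetical project P1 receive the same average cost, which is why the variable reads as a project-level average observed at the worksite.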

Stream hydrological features

  • Stream slope (% grade): slope of stream at road crossing can require more expensive crossing design; identified via COMID matching with NHDPlus attributes.
  • Bankfull width (m): bankfull width is the preferred measure of stream width at road crossing, accounting for potential width during high-water events; identified via COMID matching with NHDPlus attributes.
    Note that these two variables are interacted to capture design guidelines that indicate increased complexity in culvert designs associated with streams that are both particularly wide and steep.

Road features

  • Road paved (indicator): modification of a paved road is more expensive; may also proxy for higher traffic volumes; measured via HERE road data for nearest object.
  • Road speed class (categorical): wider roads with more traffic are expected to be more expensive; measured via HERE road data for nearest object; classes range from 2 (fastest) to 7 (slowest).
Value key for HERE road variables
| Classification | Value | Description |
|---|---|---|
| Speed category | 1 | > 130 kph / > 80 mph |
| | 2 | 101-130 kph / 65-80 mph |
| | 3 | 91-100 kph / 55-64 mph |
| | 4 | 71-90 kph / 41-54 mph |
| | 5 | 51-70 kph / 31-40 mph |
| | 6 | 31-50 kph / 21-30 mph |
| | 7 | 11-30 kph / 6-20 mph |
| | 8 | < 11 kph / < 6 mph |

Terrain features

  • Terrain slope (degrees): steeper terrain is expected to require more expensive projects; measured as the slope recorded for the catchment the stream is associated with (via the NHDPlus Selected Attributes data release), as opposed to the stream slope.

  • Land cover (categorical): different land covers may be associated with more expensive projects (e.g. less accessible sites in forest, difficult soils in wetlands, etc.); identified by overlaying worksite coordinates on the NLCD land cover layer for the nearest available year; here we use the broader NLCD Group definition rather than the more detailed classification (see below).

  • Elevation (m): mean elevation in meters in NHDPlus catchment.

Group Value Classification Description
Barren 31 Barren land (rock/sand/clay) Areas of bedrock, desert pavement, scarps, talus, slides, volcanic material, glacial debris, sand dunes, strip mines, gravel pits and other accumulations of earthen material. Generally, vegetation accounts for less than 15% of total cover.
Developed 21 Developed, open space Areas with a mixture of some constructed materials, but mostly vegetation in the form of lawn grasses. Impervious surfaces account for less than 20% of total cover. These areas most commonly include large-lot single-family housing units, parks, golf courses, and vegetation planted in developed settings for recreation, erosion control, or aesthetic purposes.
Developed 22 Developed, low intensity Areas with a mixture of constructed materials and vegetation. Impervious surfaces account for 20% to 49% of total cover. These areas most commonly include single-family housing units.
Developed 23 Developed, medium intensity Areas with a mixture of constructed materials and vegetation. Impervious surfaces account for 50% to 79% of the total cover. These areas most commonly include single-family housing units.
Developed 24 Developed, high intensity Highly developed areas where people reside or work in high numbers. Examples include apartment complexes, row houses and commercial/industrial. Impervious surfaces account for 80% to 100% of the total cover.
Forest 41 Deciduous forest Areas dominated by trees generally greater than 5 meters tall, and greater than 20% of total vegetation cover. More than 75% of the tree species shed foliage simultaneously in response to seasonal change.
Forest 42 Evergreen forest Areas dominated by trees generally greater than 5 meters tall, and greater than 20% of total vegetation cover. More than 75% of the tree species maintain their leaves all year. Canopy is never without green foliage.
Forest 43 Mixed forest Areas dominated by trees generally greater than 5 meters tall, and greater than 20% of total vegetation cover. Neither deciduous nor evergreen species are greater than 75% of total tree cover.
Herbaceous 71 Grassland/herbaceous Areas dominated by graminoid or herbaceous vegetation, generally greater than 80% of total vegetation. These areas are not subject to intensive management such as tilling, but can be utilized for grazing.
Planted-cultivated 81 Pasture/hay Areas of grasses, legumes, or grass-legume mixtures planted for livestock grazing or the production of seed or hay crops, typically on a perennial cycle. Pasture/hay vegetation accounts for greater than 20% of total vegetation.
Planted-cultivated 82 Cultivated crops Areas used for the production of annual crops, such as corn, soybeans, vegetables, tobacco, and cotton, and also perennial woody crops such as orchards and vineyards. Crop vegetation accounts for greater than 20% of total vegetation. This class also includes all land being actively tilled.
Shrubland 52 Shrub/scrub Areas dominated by shrubs; less than 5 meters tall with shrub canopy typically greater than 20% of total vegetation. This class includes true shrubs, young trees in an early successional stage or trees stunted from environmental conditions.
Water 11 Open water Areas of open water, generally with less than 25% cover of vegetation or soil.
Water 12 Perennial ice/snow Areas characterized by a perennial cover of ice and/or snow, generally greater than 25% of total cover.
Wetlands 90 Woody wetlands Areas where forest or shrubland vegetation accounts for greater than 20% of vegetative cover and the soil or substrate is periodically saturated with or covered with water.
Wetlands 95 Emergent herbaceous wetlands Areas where perennial herbaceous vegetation accounts for greater than 80% of vegetative cover and the soil or substrate is periodically saturated with or covered with water.

Economic, social, and built-environment features

  • Housing density (units per sq. km): more parcels near the worksite introduce complexities related to site access, available areas for staging, etc.; measured for the immediate catchment of the stream, identified via matching with NHDPlus attribute data.
  • Employment in ag/forestry (jobs in county): availability of skilled labor may reduce project costs; employment data from county the worksite is located in via County Business Patterns data.
  • Employment in construction (jobs in county): see above.
  • Distance to urban area (m): as a measure of access to labor and equipment, Euclidean distance to the nearest Census-designated urban area, defined as a contiguous area with at least 50,000 residents.
  • Density of wholesalers (workers/firms per sq. km): Using firm point data from Homeland Infrastructure Foundation-Level Data (HIFLD), we calculate a kernel density field (1km resolution grid/100km search radius) to measure access to equipment and materials required for common culvert replacement designs, including…
    • Construction equipment (workers), representing heavy machinery required for digging, moving materials, etc.,
    • Brick, stone, and related (workers), representing concrete, pavement, and structural materials required to build replacement culverts,
    • Durable metals (workers), representing sheet metal used for piping, rebar, and other structural metals materials, and
    • Sand and gravel sales yards (firms), representing suppliers for fill commonly used to build roads back up after construction.
  • Private land (proportion within 500m radius): we use BLM’s 2019 Surface Jurisdiction Map to calculate the proportion of land within a 500m radius of each worksite that is privately managed, distinguishing between land managed by individuals or companies (residential or commercial land), industry (typically forestry in the region), and non-industrial (including private conservation) entities. Note that the baseline group (i.e., 0 in all three proportions) represents a pooled group of all other land ownership types, which includes all publicly-owned and managed land (i.e., national and state forests and parks, U.S. Fish and Wildlife land, BLM land, etc.).
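The kernel density measure of wholesaler access described above could be sketched as follows for a single grid point. This is only an illustration under stated assumptions: the report does not name the kernel or normalization used, so the quartic kernel, the function name `quartic_kernel_density`, and the firm coordinates and weights below are all hypothetical. Only the 100km search radius comes from the text.

```python
import math


def quartic_kernel_density(grid_xy, firms, radius_m):
    """Density at one grid point from weighted firm points.

    firms: iterable of (x, y, weight) in projected metres, where weight is
    the number of workers (or 1 per firm for a firm-count density).
    Assumes a quartic kernel, which the source does not specify.
    Returns weight per square metre.
    """
    gx, gy = grid_xy
    total = 0.0
    for x, y, w in firms:
        d = math.hypot(x - gx, y - gy)
        if d < radius_m:  # firms outside the search radius contribute nothing
            total += w * (3.0 / math.pi) * (1.0 - (d / radius_m) ** 2) ** 2
    return total / radius_m ** 2


# Hypothetical firms: one at the grid point, one 50 km away, one outside the radius.
firms = [(0.0, 0.0, 10.0), (50_000.0, 0.0, 4.0), (150_000.0, 0.0, 100.0)]
density_m2 = quartic_kernel_density((0.0, 0.0), firms, radius_m=100_000.0)
density_km2 = density_m2 * 1e6  # convert to workers per sq. km
print(density_km2)
```

Evaluating this function at the centre of every 1km grid cell would give a density surface of the kind described, from which each worksite takes the value of the cell it falls in.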

Scale and scope features

  • Number of worksites associated with project (count): addressing multiple culverts under the same project may provide scale benefits, but might also increase complexity; measured via PNSHP database.
  • Distance between project worksites (m): more dispersed worksites under a single project may increase project costs due to increased transportation costs (and time); measured as the total Euclidean distance between worksites for multi-worksite projects. This variable is interacted with the number of worksites to allow flexible coordination/scale effects.
  • Action type (categorical): PNSHP distinguishes between culvert removals and culvert installations, in addition to culvert improvements (the dominant category); we expect removals to be cheapest, followed by improvements and installations; dummies are included when a project includes one or more culverts flagged as either removals or installations.

Fixed effects

  • Year: the year the project was completed
  • Basin: the basin (HUC6) where the worksite is located
  • Reporting source: the reporting source for the project

Simple correlations across variables

Here we present two measures of correlation among the potential continuous explanatory variables. The figure below provides Pearson’s correlation coefficients for each pair of continuous explanatory variables included in the initial models, alongside the dependent variable. Also included is the Variance Inflation Factor (VIF) for each variable, calculated when all presented variables are included in a simple log-linear model with cost per culvert as the dependent variable. Because of the previously mentioned high correlation between housing density and population density, we include only housing density.

Distance between worksites and the number of culverts are positively correlated, and both are negatively correlated with average project costs. We would expect distance to increase costs but the number of culverts to decrease costs (due to economies of scale), all else equal. Disentangling these effects should be possible with multiple regression.

Stream slope is negatively correlated with bankfull width, which means wider streams tend to be less steep. Stream slope is also positively correlated with terrain slope, as mentioned earlier. None of the three are strongly correlated with costs.

Finally, the two employment variables are positively correlated, and construction employment is positively correlated with housing density. Ag/forestry employment is weakly positively correlated with measures that indicate more rugged terrain such as stream and terrain slope. Housing density and to a lesser degree construction employment are positively correlated with costs.

No variables have particularly large VIFs, suggesting little potential for error-inflating multicollinearity. (A large VIF indicates that the variable is strongly correlated with the other variables in the model, leading to inflated standard errors and limiting the model’s usefulness for prediction or inference.)
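To make the two diagnostics concrete, the sketch below computes Pearson's correlation and a VIF for the simplest case of two predictors, where the R-squared from regressing one predictor on the other equals their squared correlation. The helper names and the data values are illustrative assumptions, not the study data. With more predictors, the R-squared would instead come from regressing each variable on all of the others.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); with two predictors, R^2 is the squared correlation."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Illustrative values only (not the study data).
stream_slope = [0.01, 0.02, 0.05, 0.08, 0.12]
terrain_slope = [10.0, 18.0, 25.0, 30.0, 41.0]

r = pearson(stream_slope, terrain_slope)
vif = vif_two_predictors(stream_slope, terrain_slope)
print(f"r = {r:.3f}, VIF = {vif:.2f}")
```

A VIF of 1 means a predictor is uncorrelated with the others; the value grows without bound as the correlation approaches plus or minus one.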

Estimation

We estimate log-linear models via OLS, with the average project cost (cost per culvert) as the dependent variable and the worksite as the unit of observation. Stream slope and bankfull width are interacted in this specification: recommendations in culvert engineering reports indicate that more expensive culvert designs are particularly necessary when both of these variables are extreme, and an interaction term can capture this effect.
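The structure of this specification can be illustrated with a stripped-down sketch: log cost regressed on stream slope, bankfull width, and their product. Everything here is an assumption for illustration. The data are synthetic and noise-free, the "true" coefficients are made up, and the small normal-equations solver stands in for whatever statistical software produced the reported estimates. The sketch omits the other covariates, the fixed effects, and the clustered HC3 standard errors.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic worksites: (stream slope, bankfull width in m). Not study data.
sites = [(0.01, 3.0), (0.02, 12.0), (0.05, 5.0), (0.08, 2.0), (0.10, 9.0), (0.03, 7.0)]
true_b = [10.0, -4.0, -0.003, 1.0]  # intercept, slope, width, slope x width (made up)

# Design matrix with an interaction column, and costs generated without noise.
X = [[1.0, s, w, s * w] for s, w in sites]
cost = [math.exp(sum(b * x for b, x in zip(true_b, row))) for row in X]

# Log-linear model: regress log(cost) on the design matrix.
b_hat = ols(X, [math.log(c) for c in cost])
print([round(b, 4) for b in b_hat])
```

Because the synthetic costs contain no noise, the fit recovers the made-up coefficients, which confirms that the interaction column is what lets the slope effect vary with width.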

In addition to the fully specified model (mod_full), we present nine alternative models that include different fixed effects configurations or only sub-samples of the data focused on basin or reporting source criteria. For basins, we provide results estimated on a “core” group representing the five most frequently represented basins (Washington Coastal, Puget Sound, Southern Oregon Coastal, Northern Oregon Coastal, Willamette). Finally, we also estimate the model on only projects reported by OWRI and WA RCO, to examine how the two sources that report the most projects influence the overall results.

The resulting coefficients for the fixed effects and categorical variables, when exponentiated, can be interpreted as the ratio of average costs for that group relative to those of the base group. Results for continuous variables are presented as the exponentiated average marginal effect of a one standard deviation change, which can be interpreted as the ratio of costs relative to a worksite with a value one standard deviation lower for that variable.
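A short sketch of both transformations, using two mod_full point estimates from the coefficient table below and the housing density standard deviation from the descriptive statistics. The function names are ours. For variables that enter through an interaction, the marginal effect passed to `sd_cost_ratio` would need to include the interaction term.

```python
import math


def cost_ratio(coef):
    """Ratio of expected costs relative to the base group for a dummy or fixed effect."""
    return math.exp(coef)


def sd_cost_ratio(marginal_effect, std_dev):
    """Ratio of expected costs for a one-standard-deviation increase in a variable."""
    return math.exp(marginal_effect * std_dev)


developed = cost_ratio(0.309)            # Land cover: Developed (mod_full)
housing = sd_cost_ratio(0.000463, 24.7)  # Housing density coefficient and s.d.

print(f"Developed vs. base land cover: {developed:.3f}")
print(f"Housing density, +1 s.d.: {housing:.3f}")
```

A ratio of about 1.36 for developed land cover reads as expected costs roughly 36% higher than in the base land cover group, all else equal.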

Coefficient estimates

Cost models
Alternative fixed effects
Sub-samples
Term mod_full mod_nofe mod_nofe_nobasin mod_nofe_nosource mod_nofe_noyear mod_nofe_onlybasin mod_nofe_onlysource mod_nofe_onlyyear mod_basins_core mod_sources_core
Intercept 9.89***
(0.321)
10.7***
(0.29)
9.94***
(0.318)
10.3***
(0.329)
10.1***
(0.305)
10.6***
(0.311)
10.2***
(0.293)
10.2***
(0.328)
10***
(0.34)
10.1***
(0.366)
Stream slope -3.77*
(1.99)
-4.23**
(2.14)
-2.8
(2)
-4.65**
(2.08)
-3.6*
(2.02)
-4.38**
(2.12)
-2.55
(2.04)
-4.63**
(2.08)
-6.36**
(2.52)
-2.29
(3.37)
Bankfull width -0.00316
(0.00674)
-0.0008
(0.00773)
-0.00133
(0.00671)
-0.00445
(0.00695)
-0.0014
(0.00689)
-0.00327
(0.0071)
0.00102
(0.00701)
-0.00306
(0.00734)
-0.004
(0.00777)
0.00218
(0.00882)
Stream slope X bankfull width 1.01**
(0.444)
1.22***
(0.462)
0.985**
(0.448)
1.23***
(0.457)
0.939**
(0.446)
1.16**
(0.462)
0.868*
(0.449)
1.32***
(0.461)
1.71***
(0.594)
1.06
(0.764)
Road paved (dummy) 0.211*
(0.121)
0.268**
(0.125)
0.193
(0.128)
0.228*
(0.12)
0.278**
(0.121)
0.307**
(0.12)
0.249*
(0.128)
0.217*
(0.125)
0.147
(0.124)
0.235
(0.167)
Road speed class: 3 0.59*
(0.338)
0.537
(0.335)
0.59*
(0.34)
0.712**
(0.34)
0.517
(0.332)
0.601*
(0.339)
0.538
(0.337)
0.623*
(0.334)
0.802**
(0.367)
0.606
(0.489)
Road speed class: 4 0.454
(0.308)
0.515*
(0.311)
0.441
(0.291)
0.449
(0.341)
0.378
(0.31)
0.409
(0.341)
0.388
(0.289)
0.497
(0.326)
0.582
(0.36)
0.377
(0.388)
Road speed class: 5 0.38***
(0.142)
0.392***
(0.142)
0.406***
(0.148)
0.44***
(0.138)
0.34**
(0.141)
0.408***
(0.137)
0.375**
(0.147)
0.415***
(0.145)
0.471***
(0.145)
0.272
(0.181)
Road speed class: 6 0.237*
(0.143)
0.39***
(0.145)
0.27*
(0.145)
0.249*
(0.14)
0.19
(0.143)
0.2
(0.141)
0.232
(0.147)
0.379***
(0.143)
0.355**
(0.166)
0.401**
(0.191)
Road speed class: 7 0.259**
(0.102)
0.396***
(0.106)
0.303***
(0.102)
0.303***
(0.104)
0.276***
(0.103)
0.337***
(0.105)
0.321***
(0.102)
0.367***
(0.105)
0.33***
(0.115)
0.268**
(0.128)
Terrain slope 0.00797**
(0.00399)
0.0052
(0.00389)
0.00596
(0.00374)
0.00911**
(0.00415)
0.00901**
(0.0039)
0.0106**
(0.00413)
0.00606
(0.00373)
0.0059
(0.00382)
0.00865*
(0.00448)
-0.000371
(0.00546)
Elevation -0.0000916
(0.000298)
-0.000096
(0.000195)
-0.000247
(0.000197)
0.0000319
(0.000295)
-0.000168
(0.000292)
-0.0000455
(0.000287)
-0.000228
(0.000194)
-0.000123
(0.000196)
-0.000266
(0.000351)
-0.00012
(0.000274)
Land cover: Developed 0.309***
(0.0817)
0.361***
(0.0907)
0.256***
(0.0809)
0.344***
(0.0857)
0.319***
(0.0848)
0.398***
(0.091)
0.264***
(0.0844)
0.299***
(0.086)
0.302***
(0.0972)
0.32***
(0.111)
Land cover: Herbaceous -0.0662
(0.148)
-0.0976
(0.185)
-0.0914
(0.156)
-0.0382
(0.156)
-0.0905
(0.151)
-0.0422
(0.163)
-0.122
(0.158)
-0.108
(0.177)
-0.167
(0.196)
-0.0511
(0.227)
Land cover: Planted-cultivated 0.345**
(0.162)
0.347**
(0.173)
0.356**
(0.158)
0.337**
(0.166)
0.305*
(0.172)
0.307*
(0.172)
0.325*
(0.169)
0.356**
(0.168)
0.367**
(0.172)
0.361*
(0.192)
Land cover: Shrubland 0.178
(0.142)
0.285*
(0.149)
0.152
(0.143)
0.219
(0.147)
0.171
(0.14)
0.228
(0.146)
0.148
(0.142)
0.256*
(0.148)
0.247
(0.187)
0.16
(0.219)
Land cover: Wetlands 0.0652
(0.156)
0.124
(0.162)
0.0298
(0.165)
0.0683
(0.157)
0.0535
(0.145)
0.107
(0.147)
0.0208
(0.158)
0.043
(0.171)
-0.0514
(0.182)
0.134
(0.19)
Housing density 0.000463
(0.0019)
0.00129
(0.00186)
0.000661
(0.00187)
0.000714
(0.002)
0.000514
(0.00181)
0.000892
(0.00197)
0.00059
(0.00177)
0.00133
(0.00193)
0.00106
(0.002)
0.000332
(0.00407)
Ag/forestry employment 0.000128
(0.0000956)
0.0000428
(0.0000873)
0.0000429
(0.0000851)
0.000197**
(0.0000962)
0.000138
(0.0000929)
0.000201**
(0.0000951)
0.0000635
(0.0000848)
0.0000617
(0.0000874)
0.0000691
(0.0001)
0.0000864
(0.000119)
Construction employment -0.00000881
(0.0000088)
0.00000667
(0.00000913)
0.00000492
(0.00000769)
-0.0000184*
(0.0000101)
-0.00000466
(0.00000792)
-0.000014
(0.00000908)
0.00000754
(0.00000792)
0.00000276
(0.00000855)
0.00000189
(0.00000914)
0.00000532
(0.0000141)
Distance to urban area -0.00000062
(0.00000203)
0.00000146
(0.00000182)
0.000000322
(0.00000166)
-0.00000201
(0.00000207)
-0.00000114
(0.000002)
-0.00000209
(0.00000211)
0.0000000525
(0.00000165)
0.00000119
(0.00000172)
0.00000191
(0.00000235)
-0.000000632
(0.00000231)
Density (employee-weighted) of construction equipment suppliers -12.1**
(5.68)
9.2
(5.67)
-4.62
(5.39)
-6.17
(5.71)
-11.9**
(5.48)
-4.69
(5.55)
-4.32
(5.2)
6.75
(5.49)
11.3*
(6.71)
7.18
(7.5)
Density (employee-weighted) of brick, concrete, and related materials suppliers -88.5***
(29.4)
54.6*
(28.9)
-36.8
(25.9)
-57.6*
(30.4)
-87.3***
(28.9)
-45.3
(30.1)
-38.5
(25.5)
40.6
(27.1)
62.1*
(32.6)
50.2
(40.7)
Density (employee-weighted) of metal materials suppliers 96.9***
(34.2)
-59.5*
(34.3)
43.7
(31.1)
58.7*
(35.2)
94.4***
(33.4)
43.6
(34.8)
44
(30.3)
-41.5
(32.5)
-67*
(39.4)
-55.5
(47.6)
Density of sand and gravel sales yards -3330*
(1940)
-566
(1810)
-2270
(1730)
-2380
(2070)
-3280*
(1920)
-1910
(2060)
-2460
(1720)
-649
(1840)
-788
(2200)
1400
(2800)
Private land, individual or company (% 500m buffer) -0.0202
(0.119)
-0.269**
(0.119)
0.0314
(0.118)
-0.357***
(0.114)
0.0191
(0.12)
-0.373***
(0.115)
0.0707
(0.12)
-0.245**
(0.116)
-0.197
(0.124)
-0.0327
(0.185)
Private land, managed by industry (% 500m buffer) -0.486***
(0.157)
-0.931***
(0.151)
-0.477***
(0.15)
-0.739***
(0.158)
-0.436***
(0.159)
-0.737***
(0.162)
-0.428***
(0.151)
-0.923***
(0.146)
-0.88***
(0.155)
-0.616***
(0.182)
Private land, managed by non-industrial owner (% 500m buffer) 0.658**
(0.323)
-0.22
(0.309)
0.561*
(0.321)
0.394
(0.324)
0.531*
(0.315)
0.18
(0.316)
0.439
(0.313)
0.0398
(0.322)
0.0586
(0.337)
0.236
(0.402)
Number of worksites -0.426***
(0.112)
-0.527***
(0.122)
-0.484***
(0.121)
-0.426***
(0.11)
-0.448***
(0.116)
-0.447***
(0.114)
-0.515***
(0.123)
-0.493***
(0.12)
-0.467***
(0.124)
-0.52***
(0.145)
Distance between worksites -0.0000314
(0.000123)
-0.0000484
(0.000127)
-0.0000344
(0.00014)
-0.0000345
(0.000124)
-0.0000369
(0.000106)
-0.0000482
(0.000107)
-0.0000413
(0.000125)
-0.0000271
(0.000145)
-0.00000757
(0.000138)
-0.0000565
(0.000151)
Number of worksites X distance 0.0000487*
(0.000029)
0.0000623*
(0.0000326)
0.0000581*
(0.0000351)
0.0000492*
(0.0000282)
0.0000519*
(0.0000272)
0.0000528**
(0.0000264)
0.0000626*
(0.0000345)
0.0000568*
(0.0000339)
0.0000433
(0.0000347)
0.0000636*
(0.0000385)
Culvert installation (dummy) 0.21
(0.149)
0.746***
(0.14)
0.174
(0.15)
0.5***
(0.145)
0.42***
(0.144)
0.695***
(0.139)
0.411***
(0.143)
0.539***
(0.149)
0.508***
(0.19)
0.186
(0.657)
Culvert removal (dummy) 0.0356
(0.117)
-0.0403
(0.116)
0.0493
(0.121)
-0.0703
(0.113)
0.132
(0.114)
0.00382
(0.112)
0.158
(0.118)
-0.119
(0.117)
-0.205
(0.125)
0.122
(0.141)
Project source: BLM 0.905***
(0.144)
0.919***
(0.134)
0.969***
(0.151)
0.971***
(0.14)
Project source: HABITAT WORK SCHEDULE 1.18***
(0.322)
1.61***
(0.258)
1.11***
(0.308)
1.62***
(0.255)
Project source: REO 0.921***
(0.116)
0.953***
(0.11)
0.967***
(0.113)
0.996***
(0.107)
Project source: SRFBD 1.01***
(0.24)
1.3***
(0.225)
0.827***
(0.218)
1.18***
(0.199)
Project source: WA RCO 0.392*
(0.232)
0.869***
(0.171)
0.3
(0.202)
0.858***
(0.146)
Year: 2002 0.191
(0.151)
0.235
(0.149)
0.378**
(0.151)
0.429***
(0.15)
0.479***
(0.154)
0.363**
(0.18)
Year: 2003 0.233
(0.157)
0.271*
(0.159)
0.486***
(0.156)
0.557***
(0.158)
0.58***
(0.162)
0.473**
(0.197)
Year: 2004 0.643***
(0.144)
0.704***
(0.147)
0.798***
(0.146)
0.887***
(0.151)
0.889***
(0.164)
0.628***
(0.183)
Year: 2005 0.298*
(0.161)
0.326**
(0.165)
0.441***
(0.164)
0.517***
(0.17)
0.542***
(0.172)
0.27
(0.203)
Year: 2006 0.384**
(0.183)
0.397**
(0.192)
0.523***
(0.18)
0.569***
(0.187)
0.6***
(0.193)
0.407*
(0.236)
Year: 2007 0.53***
(0.17)
0.592***
(0.169)
0.726***
(0.169)
0.88***
(0.171)
0.685***
(0.18)
0.694***
(0.209)
Year: 2008 0.00304
(0.293)
-0.0467
(0.29)
0.48*
(0.283)
0.587**
(0.284)
0.596*
(0.333)
0.615*
(0.368)
Year: 2009 0.519*
(0.286)
0.462
(0.281)
1.01***
(0.301)
1.02***
(0.277)
0.993***
(0.327)
0.512
(0.448)
Year: 2010 0.187
(0.249)
0.174
(0.243)
0.495*
(0.254)
0.723***
(0.235)
0.63*
(0.35)
0.457
(0.365)
Year: 2011 0.177
(0.272)
0.178
(0.285)
0.614**
(0.292)
0.824***
(0.29)
0.87**
(0.361)
0.358
(0.31)
Year: 2012 -0.101
(0.421)
-0.0491
(0.421)
0.176
(0.459)
0.389
(0.464)
0.384
(0.54)
0.58
(0.502)
Year: 2013 0.003
(0.291)
0.0585
(0.297)
0.116
(0.296)
0.154
(0.31)
0.218
(0.317)
-0.128
(0.358)
Year: 2014 0.152
(0.293)
0.216
(0.278)
0.214
(0.291)
0.278
(0.28)
0.399
(0.307)
0.204
(0.324)
Year: 2015 0.825**
(0.32)
0.887**
(0.359)
1.44***
(0.315)
1.6***
(0.379)
1.62***
(0.39)
Basin: JOHN DAY -0.0324
(0.417)
0.0886
(0.41)
0.173
(0.413)
0.34
(0.401)
Basin: LOWER COLUMBIA 0.55**
(0.231)
0.338
(0.226)
0.646***
(0.233)
0.459**
(0.225)
Basin: MIDDLE COLUMBIA 0.307
(0.295)
0.222
(0.256)
0.449
(0.29)
0.375
(0.263)
Basin: NORTHERN OREGON COASTAL -0.151
(0.172)
-0.411**
(0.169)
-0.17
(0.177)
-0.412**
(0.178)
Basin: PUGET SOUND 1.19***
(0.342)
1.37***
(0.332)
1.26***
(0.313)
1.4***
(0.319)
Basin: UPPER COLUMBIA 0.072
(0.304)
0.213
(0.282)
0.269
(0.299)
0.4
(0.276)
Basin: WASHINGTON COASTAL 0.622**
(0.242)
0.632***
(0.202)
0.745***
(0.235)
0.773***
(0.186)
Basin: WILLAMETTE 0.112
(0.183)
-0.19
(0.173)
0.147
(0.182)
-0.128
(0.177)
Adj. R2 0.409 0.286 0.396 0.357 0.391 0.328 0.374 0.324 0.34 0.274
AIC 3477.7 3686.2 3498.2 3577.7 3502 3619.2 3529.1 3632.2 3218.5 2300.2
BIC 3790 3860.3 3769.6 3864.4 3742.6 3834.2 3728.8 3877.9 3458.3 2519.2
N 1235 1235 1235 1235 1235 1235 1235 1235 1091 779
* p < 0.1, ** p < 0.05, *** p < 0.01; Heteroskedasticity-consistent clustered standard errors in parentheses (HC3, clustered at project-level)

Model fit discussion

The full model has an adjusted R-squared of 0.409, indicating a decent model fit. The version of the model with no fixed effects has an adjusted R-squared of 0.286, indicating that the explanatory variables on their own explain a substantial share of the variability in costs. Removing or isolating each fixed effect category shows how much variation each explains relative to the others. Reporting source accounts for the most variation: adjusted R-squared falls to 0.357 when it is removed and is 0.374 when it is the only fixed effect included. Basin and year contribute smaller and roughly similar amounts (0.396 and 0.391 when each is removed; 0.328 and 0.324 when each is included on its own).

We can also compare AIC and BIC. The model with only reporting source fixed effects is preferred on the basis of BIC, which penalizes the number of estimated parameters more heavily, while the full model is preferred on the basis of AIC. We use the full model as the preferred model in what follows.
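The comparison can be reproduced directly from the AIC and BIC rows of the coefficient table. The sketch below is restricted to the eight models fit on the full sample, since information criteria are not comparable across models fit to different sets of observations (the two sub-sample models).

```python
# (AIC, BIC) for the eight full-sample models, copied from the coefficient table.
fit = {
    "mod_full": (3477.7, 3790.0),
    "mod_nofe": (3686.2, 3860.3),
    "mod_nofe_nobasin": (3498.2, 3769.6),
    "mod_nofe_nosource": (3577.7, 3864.4),
    "mod_nofe_noyear": (3502.0, 3742.6),
    "mod_nofe_onlybasin": (3619.2, 3834.2),
    "mod_nofe_onlysource": (3529.1, 3728.8),
    "mod_nofe_onlyyear": (3632.2, 3877.9),
}

# Lower values are preferred under both criteria.
best_aic = min(fit, key=lambda name: fit[name][0])
best_bic = min(fit, key=lambda name: fit[name][1])

print("Preferred by AIC:", best_aic)
print("Preferred by BIC:", best_bic)
```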

When the model is fit only on culverts in the “core” basins or sources, adjusted R-squared falls substantially (to 0.34 and 0.274, respectively). This suggests that information from the additional sources and from outside the core basins improves the fit of the model.

Model visualizations

Slope and bankfull width interaction effect

To demonstrate the nuance in how bankfull width and slope jointly influence costs, we present predicted cost contours across both variables.

Worksites are distributed along a convex curve in slope-bankfull width space, with few projects at both high slope and high width. This pattern mirrors the cost contours over the same space. Comparing the observed projects to other culverts in the Washington or Oregon inventories will show whether this relationship exists for all culverts or whether projects were selected along the cost curve. That is, would projects that did not occur fall in the upper-right of this space?

Number of worksites and total distance interaction effect

We can repeat the exercise for the number of worksites and total distance between worksites to examine the trade-off between economies of scale from grouping multiple worksites under one project and increased coordination costs, as proxied by total distance between sites. The contours of this cost surface can be interpreted as the distance limit at which adding a worksite to a project switches from being associated with economies of scale to diseconomies of scale.

For example, consider a project with one worksite (indicated by X) that could expand to include a second worksite located 5km away (indicated by an *). Because adding the second worksite would keep the total distance between worksites below the distance at which the cost contour (indicated by the solid line, with dashed lines indicating a 95% c.i.) crosses two worksites, expanding the project would be associated with economies of scale.

That these contours tend to be quite steep initially indicates strong potential for economies of scale when opportunities to group nearby worksites under a single project arise. As the number of worksites increases, these contours flatten, indicating that the maximum distance for an efficient additional worksite drops quickly.
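The one-to-two worksite example above can be checked against the mod_full point estimates for the three scale terms. This is a rough sketch under stated assumptions: it uses point estimates only (the distance coefficient is not statistically distinguishable from zero), it assumes a single-worksite project has a total distance of zero, and the contours in the figure may have been computed differently. Function names are ours.

```python
import math

# mod_full point estimates from the coefficient table (log cost per culvert).
B_N = -0.426       # number of worksites
B_D = -0.0000314   # distance between worksites (m)
B_ND = 0.0000487   # number of worksites x distance


def scale_terms(n_worksites, distance_m):
    """Contribution of the scale variables to predicted log cost per culvert."""
    return B_N * n_worksites + B_D * distance_m + B_ND * n_worksites * distance_m


def expansion_ratio(n0, d0, n1, d1):
    """Predicted cost-per-culvert ratio after moving from (n0, d0) to (n1, d1)."""
    return math.exp(scale_terms(n1, d1) - scale_terms(n0, d0))


# One worksite (zero total distance) expanding to two worksites 5 km apart.
ratio = expansion_ratio(1, 0.0, 2, 5_000.0)

# Total distance at which adding the second worksite leaves predicted cost unchanged.
break_even_m = -B_N / (B_D + 2 * B_ND)

print(f"cost ratio: {ratio:.3f}, break-even distance: {break_even_m:.0f} m")
```

Under these assumptions the ratio is below one at 5km and the break-even distance is somewhat farther out, which is consistent with the expansion being associated with economies of scale.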

Continuous variables

To display how costs vary across the continuous explanatory variables, we plot average marginal effects. These are scaled to a standard deviation change for each variable. Average marginal effects are calculated with other variables held at their means, which is most relevant for slope and bankfull width for which the model allows an interaction effect. The resulting estimate is exponentiated so that it can be interpreted as the expected ratio of costs resulting from a standard deviation change in the continuous variable.
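For the interacted variables, the calculation described above can be written out explicitly. The notation is ours rather than the report's: $c$ is cost per culvert, $s$ is stream slope, $w$ is bankfull width, $\bar{w}$ is mean bankfull width, and $\sigma_s$ is the standard deviation of stream slope.

```latex
\frac{\partial \ln c}{\partial s} = \beta_s + \beta_{sw}\, w ,
\qquad
\text{cost ratio for a one-s.d. increase in } s
  = \exp\!\left[ \left( \beta_s + \beta_{sw}\, \bar{w} \right) \sigma_s \right]
```

The analogous expression for bankfull width swaps the roles of $s$ and $w$. For variables without an interaction, the bracketed term reduces to the coefficient itself.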

A standard deviation increase in the number of worksites is associated with 46% lower average costs in the preferred model when other variables, most importantly distance between worksites, are held at their means. On the other hand, a standard deviation increase in distance between worksites is associated with average costs 13% higher, though this marginal effect is not statistically different from zero. This conflict is at the core of the managerial tradeoffs to consider when grouping worksites under a single project, as described in the preceding section. That the negative association between costs and the number of worksites exceeds the positive association with total distance for the mean project suggests that there are unexploited opportunities for economies of scale.

The average marginal effects for all population features are insignificant in the preferred model. However, in some of the alternative models, housing density and distance to urban area have a slight positive association with costs. The housing density effect might be evidence of increased access costs due to negotiating with multiple landowners, while the distance association may be evidence of increased costs due to lack of access to materials or labor. We are in the process of gathering improved proxies for each of these potential mechanisms.

One reason the population features may be insignificant is that we measure access to labor and materials in alternative ways in the model. Worksites closer to construction equipment and brick/concrete firms, and to a lesser extent gravel/sand sales yards, are associated with significantly lower costs, while those closer to metal suppliers have much higher costs.

Private land ownership influences costs in an interesting way. Note that for these variables, the implicit baseline is worksites with no privately-managed land within a 500m buffer. Such culverts are most often owned by government agencies. Compared to this baseline, culverts surrounded by land managed by a non-industrial private owner (e.g., non-profit conservation groups) are associated with higher costs, while culverts surrounded by industrially-managed land (i.e., land managed for forestry) are associated with lower costs. This key result suggests efficiencies associated with barrier improvements conducted by large forest landowners.

Bankfull width and stream slope both have positive associations with costs in the preferred models. A standard deviation increase in either is associated with about fifty percent higher average costs when the other is held at its mean. As seen in the earlier figure and raw coefficients, this effect is driven largely by the interaction term in the model.

Finally, a standard deviation increase in the terrain slope of the catchment in which the worksite is located is associated with 19% higher costs. That is, culverts in hillier or more mountainous areas are more expensive to improve. On the other hand, higher elevations are associated with lower costs, though this relationship is not statistically significant.

Fixed effects and categorical variables

We can examine the fixed effect estimates to compare relative expected costs across groups. We do this by exponentiating the point estimates. The resulting value can be interpreted as the ratio of costs in a given group relative to a base group.
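
The conversion from a fixed effect estimate to a cost ratio might look like the sketch below, which also carries an approximate 95% interval through the same transformation. The coefficient and standard error are hypothetical (the coefficient is chosen to correspond to a 66% cost difference), and `group_cost_ratio` is a helper name introduced only for this example.

```python
import math

def group_cost_ratio(coef, se, z=1.96):
    """Return (ratio, lower, upper): the cost ratio of a group relative to the
    base group, with an approximate interval, from a log-cost fixed effect."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)

# Hypothetical estimate: a group whose costs are 66% above the base group.
coef = math.log(1.66)
se = 0.15  # hypothetical standard error on the log scale

ratio, lower, upper = group_cost_ratio(coef, se)
print(f"ratio {ratio:.2f} (approx. 95% interval {lower:.2f} to {upper:.2f})")
```

A ratio of 1 means no difference from the base group, so an interval that includes 1 corresponds to an effect that is not statistically distinguishable from zero on the log scale.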

Year effects

For the preferred model, project average costs are as much as twice 2001 levels between 2002 and 2007, when other factors are controlled for. In the years that follow, costs return to around 2001 levels. In 2015, the last year of the sample, costs are nearly two-and-a-half times 2001 average costs, though this effect is weakly estimated and based on only a limited number of observations from that year.

Reporting source effects

BLM and REO have tightly estimated positive associations with costs, around two-and-a-half times the costs observed from OWRI. OWRI projects have the lowest average costs, followed by WA RCO projects, which are associated with 66% higher costs. The coefficients for projects reported by Habitat Work Schedule and SRFBD are estimated with less precision, but have larger point estimates than those for BLM and REO.

Basin effects

After controlling for other factors with the preferred model, projects in the Puget Sound and Lower Columbia basins have the highest average costs, followed by the Washington Coastal and Middle Columbia basins. In general, the pattern seems to be that projects in Oregon basins have lower average costs, with the Willamette and both Oregon Coastal basins coming in with the lowest costs.

Land cover effects

Cultivated cropland and developed land covers have positive associations with costs, relative to worksites found in areas with forest land cover. Shrubland also has a positive association, though the relationship is not statistically significant. The remaining land covers, wetlands and herbaceous, do not appear to have different costs than the forest land cover baseline.

Road feature effects

There appears to be a positive association between road speed class and project costs, where the largest, most heavily trafficked roads are associated with higher worksite costs. The point estimates for the largest speed class have large standard errors, likely due to lower representation of these roads in the sample.

Scope and scale effects

There is some evidence from the models that installations are more expensive than improvements (the baseline) and removals, though this association largely washes out when the full suite of fixed effects is included. This may indicate that distinctions between these categories in the data are loosely defined.

Residual and predicted value maps

In this section, we map both the residuals and predicted values for each in-sample worksite. Patterns may reveal areas where costs are higher, or where missing explanatory variables may be affecting costs over a specific region. They also demonstrate how these models could be used to project costs for the full inventories of fish passage barriers documented by state agencies.

By looking at the map of residuals, we can examine where the model performs better or worse. Few obvious patterns emerge. There seems to be some clustering of positive residuals in the Portland area, suggesting that costs are underestimated for that region. Otherwise, the residuals appear to be fairly well distributed. We will examine the spatial concentration of the residuals and consider spatial lag specifications in a future report.
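
For readers less used to log-scale residuals, the sketch below shows why a positive residual signals an underestimate. It assumes the residual is defined as observed minus predicted log cost, which is consistent with the interpretation above; the dollar figures are hypothetical.

```python
import math

def log_residual(observed_cost, predicted_cost):
    """Residual on the log scale: log(observed) - log(predicted)."""
    return math.log(observed_cost) - math.log(predicted_cost)

# Hypothetical worksite: the model predicts $80,000 but $120,000 was observed.
res = log_residual(120000, 80000)

# Exponentiating the absolute residual gives the multiplicative miss.
factor = math.exp(abs(res))

direction = "underestimated" if res > 0 else "overestimated"
print(f"residual {res:.3f}: cost {direction} by a factor of {factor:.2f}")
```

Because the residual is a difference in logs, its size reflects the ratio of observed to predicted cost rather than the dollar gap, so worksites of very different scale can be compared on the same map.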

The map of predicted values highlights areas where physical conditions (i.e. road, stream, and terrain features) have the largest impact on costs. By holding year, basin, reporting source, and project scale effects constant, we can identify where the remaining variables in the model predict higher or lower costs. Here, we see higher costs particularly in the Southern Oregon Coastal basin. At the mouth of the Columbia and around the Portland area, there appear to be clusters of lower cost projects. The John Day basin projects in Eastern Oregon also appear to have lower costs than other areas.

This map shows the standard errors on the cost predictions for each worksite in the sample. Darker points indicate increased uncertainty. Uncertainty appears to be the highest in the John Day and Puget Sound regions. This suggests that cost estimates are the least reliable for these areas, and could be improved with more observations from these regions.

Finally, we present a number of summary statistics by basin for the predictions and metrics presented above.

Prediction characteristics across basins
| Basin | Mean Predicted Value | Predicted Value Coefficient of Variation | Mean Prediction Standard Error (log scale) | Mean Absolute Residual (log scale) | N |
| --- | --- | --- | --- | --- | --- |
| NORTHERN OREGON COASTAL | 40,487 | 0.441 | 0.444 | 0.828 | 221 |
| JOHN DAY | 46,460 | 0.193 | 0.492 | 0.648 | 29 |
| WILLAMETTE | 58,278 | 0.529 | 0.453 | 0.818 | 281 |
| UPPER COLUMBIA | 64,016 | 0.547 | 0.510 | 0.578 | 40 |
| SOUTHERN OREGON COASTAL | 66,862 | 0.290 | 0.447 | 0.648 | 458 |
| MIDDLE COLUMBIA | 75,273 | 0.343 | 0.521 | 0.657 | 16 |
| LOWER COLUMBIA | 80,150 | 0.452 | 0.464 | 0.794 | 59 |
| WASHINGTON COASTAL | 96,876 | 0.288 | 0.480 | 0.725 | 61 |
| PUGET SOUND | 105,422 | 0.413 | 0.526 | 0.750 | 71 |

The mean predicted value by basin indicates basins where culverts are more or less expensive based on landscape conditions (with fixed effects and project effects held constant). Puget Sound culvert worksites are the most expensive, while Northern Oregon Coastal worksites are the least expensive. The predicted value coefficient of variation shows, on a consistent scale, where costs vary the most within a basin. Costs vary the most within the Upper Columbia basin, and the least in the John Day basin. Finally, the mean prediction standard error shows where model uncertainty is highest. The Middle and Upper Columbia and Puget Sound basins have the largest prediction standard errors, while the Western Oregon basins (Northern and Southern Oregon Coastal, and the Willamette) have the smallest, likely because these basins are relatively well-represented in the sample.
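
The basin comparisons can be checked directly against the summary table. The sketch below copies the table's values and ranks basins on each metric; the column indices and the `ranked` helper are introduced only for this example.

```python
# Columns: basin, mean predicted value, predicted value CV,
# mean prediction SE (log scale), mean absolute residual (log scale), N
ROWS = [
    ("NORTHERN OREGON COASTAL", 40487, 0.441, 0.444, 0.828, 221),
    ("JOHN DAY", 46460, 0.193, 0.492, 0.648, 29),
    ("WILLAMETTE", 58278, 0.529, 0.453, 0.818, 281),
    ("UPPER COLUMBIA", 64016, 0.547, 0.510, 0.578, 40),
    ("SOUTHERN OREGON COASTAL", 66862, 0.290, 0.447, 0.648, 458),
    ("MIDDLE COLUMBIA", 75273, 0.343, 0.521, 0.657, 16),
    ("LOWER COLUMBIA", 80150, 0.452, 0.464, 0.794, 59),
    ("WASHINGTON COASTAL", 96876, 0.288, 0.480, 0.725, 61),
    ("PUGET SOUND", 105422, 0.413, 0.526, 0.750, 71),
]
MEAN, CV, SE = 1, 2, 3  # column positions within each row

def ranked(rows, col):
    """Basin names sorted from smallest to largest value in the given column."""
    return [row[0] for row in sorted(rows, key=lambda row: row[col])]

by_mean = ranked(ROWS, MEAN)
by_cv = ranked(ROWS, CV)
by_se = ranked(ROWS, SE)

print("lowest / highest mean predicted value:", by_mean[0], "/", by_mean[-1])
print("lowest / highest coefficient of variation:", by_cv[0], "/", by_cv[-1])
print("three smallest prediction SEs:", by_se[:3])
print("three largest prediction SEs:", by_se[-3:])
```

Ranking on the coefficient of variation rather than the raw standard deviation keeps the comparison of within-basin variability from being dominated by basins with higher cost levels.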

Benefit - cost visualizations

As a simple examination of what kind of decision rule might be in play for determining which culverts were selected, we plot total upstream length (in km), a proxy for benefits, against observed cost per culvert. Under a cost-targeting rule, all worksites would lie to the left of a cost threshold, while under a benefit-targeting rule all would lie above a benefit threshold. Under a benefit-cost ratio rule, all worksites would lie above an upward-sloping line.
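
The three targeting rules can be written as simple checks on (cost, benefit) pairs. This is a minimal sketch with hypothetical projects and thresholds, not the analysis behind the plots; `consistent_rules` is a name introduced only for this example.

```python
def consistent_rules(projects, max_cost, min_benefit, min_ratio):
    """Report which targeting rules every observed project satisfies.

    projects: list of (cost, benefit) pairs.
    """
    return {
        # Cost-targeting: every project lies left of a cost threshold.
        "cost-targeting": all(cost <= max_cost for cost, _ in projects),
        # Benefit-targeting: every project lies above a benefit threshold.
        "benefit-targeting": all(benefit >= min_benefit for _, benefit in projects),
        # Ratio-targeting: every project lies above the line benefit = min_ratio * cost.
        "ratio-targeting": all(benefit >= min_ratio * cost for cost, benefit in projects),
    }

# Hypothetical (cost in dollars, upstream length in km) pairs.
projects = [(20000, 3.0), (60000, 2.5), (150000, 4.0), (300000, 2.2)]

rules = consistent_rules(
    projects,
    max_cost=100000,           # hypothetical cost threshold
    min_benefit=2.0,           # hypothetical benefit threshold (km)
    min_ratio=2.0 / 100000,    # hypothetical: 2 km per $100,000
)
print(rules)
```

In this made-up example only the benefit-targeting rule holds for every project. In practice the thresholds are unknown, so the question is whether any threshold or line separates the observed projects from the projects that were not attempted.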

It should be acknowledged that there are severe limitations to this application. Most obviously, without including data on un-attempted projects we are left assuming the total possible project space is “dense” (i.e., at any point in cost-benefit space where we don’t observe a project, we assume a project there would be possible). This assumption can be addressed by incorporating data from state agency culvert inventories. Second, our benefits measure is rough at best, and alternative measures that account for upstream/downstream barriers as well as habitat conditions and species ranges may more realistically describe the decision space. Finally, in looking across projects from the full study area, we may miss heterogeneity in targeting rules across jurisdictions or regions. We partially address this final point by also examining the distributions of subsets of worksites by region and year.

We find no strong evidence that projects are selected on a cost or benefit-cost ratio basis. The observed projects do appear to follow, somewhat, a benefit-targeting pattern with a fairly low benefit cut-off. For the Lower Columbia, Upper Columbia, Washington Coastal, and Puget Sound basins, there appears to be a slight upward tilt in the higher-cost region, indicating that benefit-cost targeting may be more frequent for higher-cost projects.

Adding culverts where no project is observed, with costs predicted via the above models, could reveal where observed projects sit relative to the universe of potential projects, which should provide more clarity to the above analysis.

Conclusions

Key findings

  1. The stream features slope and bankfull width are associated with higher average costs, especially when both are high.
  2. Paved roads are more expensive to improve, while other road variables have little discernible effect.
  3. There is some evidence of economies of scale, in that worksites belonging to projects with more worksites tend to have lower average costs. This effect is countered by a positive association between costs and total distance between worksites.
  4. John Day basin worksites have the lowest variance in costs between worksites, while Northern Oregon Coastal worksites have the lowest costs overall. Costs are highest in the Puget Sound and Washington Coastal basins. Cost variance is also particularly high in the Upper Columbia and Willamette basins.

Next steps

  1. Forecast costs/benefits for culvert inventories from Oregon and Washington
  2. Compare results of OLS estimates to estimates of models with spatial lags, and machine learning methods (boosted regression trees)
  3. Analyze how outcomes (cost/benefit targeting patterns, cost levels/variation, model uncertainty) vary across culvert ownership and jurisdictions

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] nhdplusTools_0.3.16     lmtest_0.9-37           zoo_1.8-8              
 [4] sandwich_2.5-1          gifski_0.8.6            ggtext_0.1.1           
 [7] gganimate_1.0.7         ggcorrplot_0.1.3        mctest_1.3.1           
[10] readxl_1.3.1            ggeffects_1.0.1         margins_0.3.23         
[13] broom_0.7.0             kableExtra_1.1.0        knitr_1.29             
[16] scales_1.1.1            here_0.1                janitor_2.0.1          
[19] forcats_0.5.0           stringr_1.4.0           dplyr_1.0.1            
[22] purrr_0.3.4             readr_1.3.1             tidyr_1.1.1            
[25] tibble_3.0.4            tidyverse_1.3.0         MASS_7.3-51.6          
[28] equatiomatic_0.1.0      ggthemes_4.2.0          searchable_0.3.3.1     
[31] raster_3.3-13           sp_1.4-4                osmdata_0.1.3          
[34] sf_0.9-5                rnaturalearthdata_0.1.0 rnaturalearth_0.1.0    
[37] ggmap_3.0.0.902         ggplot2_3.3.2           workflowr_1.6.2        

loaded via a namespace (and not attached):
 [1] colorspace_1.4-1    selectr_0.4-2       rjson_0.2.20       
 [4] ellipsis_0.3.1      class_7.3-17        rgdal_1.5-16       
 [7] sjlabelled_1.1.6    rprojroot_1.3-2     snakecase_0.11.0   
[10] markdown_1.1        fs_1.4.2            gridtext_0.1.4     
[13] rstudioapi_0.11     farver_2.0.3        fansi_0.4.1        
[16] lubridate_1.7.9     xml2_1.3.2          codetools_0.2-16   
[19] jsonlite_1.7.1      dbplyr_1.4.4        rgeos_0.5-3        
[22] png_0.1-7           compiler_4.0.2      httr_1.4.2         
[25] backports_1.1.8     assertthat_0.2.1    cli_2.1.0          
[28] later_1.1.0.1       tweenr_1.0.1        htmltools_0.5.0    
[31] prettyunits_1.1.1   tools_4.0.2         igraph_1.2.5       
[34] gtable_0.3.0        glue_1.4.2          reshape2_1.4.4     
[37] RANN_2.6.1          Rcpp_1.0.5          cellranger_1.1.0   
[40] vctrs_0.3.4         insight_0.12.0      xfun_0.16          
[43] ps_1.3.3            rvest_0.3.6         lifecycle_0.2.0    
[46] hms_0.5.3           promises_1.1.1      RColorBrewer_1.1-2 
[49] yaml_2.2.1          curl_4.3            stringi_1.4.6      
[52] highr_0.8           e1071_1.7-3         RgoogleMaps_1.4.5.3
[55] rlang_0.4.8         pkgconfig_2.0.3     bitops_1.0-6       
[58] evaluate_0.14       lattice_0.20-41     prediction_0.3.14  
[61] labeling_0.3        tidyselect_1.1.0    plyr_1.8.6         
[64] magrittr_1.5        R6_2.4.1            generics_0.0.2     
[67] DBI_1.1.0           pillar_1.4.6        haven_2.3.1        
[70] withr_2.2.0         units_0.6-7         modelr_0.1.8       
[73] crayon_1.3.4        KernSmooth_2.23-17  rmarkdown_2.3      
[76] jpeg_0.1-8.1        progress_1.2.2      isoband_0.2.2      
[79] grid_4.0.2          data.table_1.13.0   blob_1.2.1         
[82] git2r_0.27.1        reprex_0.3.0        digest_0.6.25      
[85] classInt_0.4-3      webshot_0.5.2       httpuv_1.5.4       
[88] munsell_0.5.0       viridisLite_0.3.0