# Tell me how much energy your building uses, and I will tell you who you are

Mathieu Bouville, PhD

I wanted to answer questions like: are older buildings less efficient? what about smaller buildings? what is the part played by climate? As you will see, my success in answering them varies. And the lack of an answer to some questions can turn out to be more interesting than a clear answer.

I use data from the Building Performance Database, a collaboration between Lawrence Berkeley National Laboratory and the U.S. Department of Energy. It covers close to a million buildings, both commercial and residential, all in the U.S.

Energy consumption is measured as energy use intensity (EUI), i.e. the energy consumption per year divided by the floor area of the building. EUI can be handled in one of two ways. Site EUI looks only at how much energy was consumed on site (this is for instance the number on the electricity bill). Source EUI, on the other hand, accounts for how much energy was needed upstream, e.g. the energy of the coal burned to produce the electricity used in a building (see for instance here for more details). Source EUI is generally preferred, so in what follows 'energy use' and 'EUI' will both refer by default to the source energy use intensity, generally the median or the average (arithmetic mean).
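The two definitions can be put side by side in a minimal sketch. The building and the site-to-source factors below are made up for illustration; they are assumptions, not values taken from the database.

```python
# Hypothetical site-to-source factors, for illustration only (not from the database).
SOURCE_FACTORS = {"electricity": 3.0, "natural_gas": 1.05}


def site_eui(site_kbtu_by_fuel, floor_area_ft2):
    """Site EUI: yearly energy consumed on site divided by floor area."""
    return sum(site_kbtu_by_fuel.values()) / floor_area_ft2


def source_eui(site_kbtu_by_fuel, floor_area_ft2, factors=SOURCE_FACTORS):
    """Source EUI: each fuel's site energy is first scaled by its upstream factor."""
    upstream = sum(kbtu * factors[fuel] for fuel, kbtu in site_kbtu_by_fuel.items())
    return upstream / floor_area_ft2


# Made-up building: yearly consumption in kBtu per fuel, floor area in ft2.
building = {"electricity": 500_000.0, "natural_gas": 300_000.0}
area = 10_000.0

print(f"site EUI:   {site_eui(building, area):.1f} kBtu/ft2/year")
print(f"source EUI: {source_eui(building, area):.1f} kBtu/ft2/year")
```

The more electricity a building uses, the wider the gap between the two numbers, since electricity carries the largest upstream factor in this sketch.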

## 1. Are older commercial buildings less efficient? Well, it is complicated.

One could expect older buildings to be less efficient, because insulation standards have improved over time.

### The energy spike of recent commercial buildings

The light blue diamonds in Fig. 1.1 show the energy consumption of commercial buildings as a function of the decade when they were built. Out of close to 130 000 commercial buildings, 84 000 have energy use data, 34 600 a building date (including bogus ones, see Appendix 1) and 26 750 have both. Of these, 20 000 were built between 1901 and 2010 and are included in Fig. 1.1.

The past two decades stand out: the energy consumption is much higher after 1990. Together these two decades make up 60% of the buildings plotted as the light blue diamonds in Fig. 1.1. One can thus wonder whether 1990–2010 looks worse because of sampling.

Figure 1.1: Average source energy use intensity of commercial buildings by decade of construction.

The distribution for buildings built since 1995 is bimodal: most buildings consume less than 200 kBtu/​ft2/year but there is a cluster above 700. The latter is from the "food service" buildings (mostly restaurants/​cafeterias), which have a median source EUI of nearly 900 kBtu/​ft2/year (see Section 6.1 for more on these). The dark blue squares of Fig. 1.1 exclude restaurants ('commercial EX restaurants'), which leaves 15 550 buildings. Clearly the recent increase in energy consumption comes from restaurants — what looks like a time effect is in fact a sector effect, with a correlation with time.
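How a change in the mix of sectors can pass for a time effect can be sketched with two groups whose own energy use never changes. The per-group averages and the restaurant shares per decade below are hypothetical round numbers, chosen only to mimic the situation described above.

```python
def overall_mean(share_restaurants, eui_restaurants=900.0, eui_other=200.0):
    """Average EUI of a mix of two groups whose own averages stay constant."""
    return share_restaurants * eui_restaurants + (1 - share_restaurants) * eui_other


# Hypothetical share of restaurants among the buildings of each decade.
shares = {"1970s": 0.02, "1980s": 0.03, "1990s": 0.20, "2000s": 0.40}

for decade, share in shares.items():
    print(decade, round(overall_mean(share)))
```

The overall average climbs decade after decade although neither group became any less efficient: only the composition of the sample changed.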

### Time evolution of energy use

The dark blue squares in Fig. 1.2 (the same as in Fig. 1.1, only zoomed in) show a decrease of the mean energy use over the first half of the 20th century for commercial buildings EX restaurants. This was followed by an upward trend from the second world war up to 1990, after which data play a bit of yoyo.

In each box of Fig. 1.2, the middle line is the median source EUI and the top and bottom lines are quartiles (minima and maxima are not plotted as whiskers because they are ridiculously extreme, and outliers cannot be plotted for lack of raw data). The trend is similar to the average, except that there is no increase in the 2000 decade.

Figure 1.2: Box plot (median and quartiles) and average (as line) of source energy use intensity of commercial buildings excluding restaurants by decade of construction.

After 1990, the average is near or even above the top quartile. (In general it is between the median and the upper quartile, as is typical with skewed distributions: the EUI cannot be negative but it can be quite high.) Out of 400 buildings categorized as "grocery stores and food markets" that have a construction date, 300 were reportedly built in 2000. Since these have an average source EUI of 750 kBtu/ft2/year, they artificially increase the average for 2000–2010 (the median, being more robust, is less affected). The open symbol for the 2000–2010 decade in Fig. 1.2, which shows the energy use without grocery stores and food markets, confirms the trend from the median: more recent buildings consume less energy.
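The different sensitivity of the mean and of the median to a cluster of high values can be seen on a toy sample. The EUI values below are invented; only the 750 kBtu/ft2/year of the added cluster echoes the figure quoted above.

```python
from statistics import mean, median

# Invented EUIs (kBtu/ft2/year) for a handful of ordinary buildings.
typical = [150, 160, 170, 180, 190, 200, 210]

# The same sample plus a small cluster of high consumers.
with_cluster = typical + [750, 750, 750]

print("without cluster:", mean(typical), median(typical))
print("with cluster:   ", mean(with_cluster), median(with_cluster))
```

Adding three high values nearly doubles the mean but barely moves the median, which is why the two statistics can tell different stories for the same decade.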

The source energy use intensity of commercial buildings has three phases: slight decrease for building dates up to the second world war, followed by four decades of increase and a decrease for buildings constructed over the last two decades.

### Residential and commercial buildings move in opposite directions

Figure 1.3 shows source energy consumption of residential buildings by decade (see Appendix 2 for what is included). Most of the recent construction is clustered in the single year 2000, so data from 2000–2010 may be spurious and are not used. The expected trend of lower energy use in more modern constructions is confirmed, at least up to 1990.

It is difficult to identify the specific type of building (if any) that may be responsible for this recent upturn. Indeed three quarters of residential buildings are "single family" and the two available subcategories of "attached" and "detached" together make up only 5% of them (the lion's share being "uncategorized").

Figure 1.3: Box plot (median and quartiles) for source energy use intensity of residential buildings by decade of construction.

The trend for residential buildings since the war is the opposite of commercial buildings, which exhibit an increase up to the eighties followed by a twenty-year reduction of energy use, as shown in Fig. 1.4. Since buildings constructed before 1950 make up only 9% of residential buildings (14% of commercial ones), old buildings are clustered into just two ranges in Fig. 1.4 (the dashed lines indicate that the increments are then more than ten years).

Figure 1.4: Average source energy use intensity of residential (red, left axis) and commercial (blue, right) buildings by decade of construction.

While a third of the residential buildings date back to the seventies and another 22% to the eighties, those built since 1990 account for only 9% of the total (against 47% for commercial buildings), and this even includes the suspicious 2000 cluster. Either residential (but not commercial) construction activity has recently dropped or the database has a delay in including buildings — a delay which may not be random. This said, the systematically opposite trend between residential and commercial buildings is too clear to be explained away by the available data not being completely random.

## 2. Are smaller buildings less efficient? Yes.

A smaller building has a greater ratio of wall and roof area to floor area, which means greater heat losses (but number of floors and shape —e.g. compact vs. L-shaped— should also be taken into account). Moreover a house which is twice as big does not necessarily have twice as many ovens or freezers. Consequently, one can expect smaller buildings to be less efficient.
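The geometric argument can be made concrete with a deliberately crude model: a single-storey building with a square footprint, a flat roof and a fixed wall height. All three are assumptions for the sake of the sketch, as is the 10 ft wall height.

```python
import math


def envelope_to_floor_ratio(floor_area_ft2, wall_height_ft=10.0):
    """Wall-plus-roof area over floor area for a one-storey square building."""
    side = math.sqrt(floor_area_ft2)      # square footprint (assumption)
    walls = 4 * side * wall_height_ft     # four identical walls
    roof = floor_area_ft2                 # flat roof, same area as the floor
    return (walls + roof) / floor_area_ft2


for area in (2_500, 10_000, 40_000):
    print(f"{area:>6} ft2 -> ratio {envelope_to_floor_ratio(area):.2f}")
```

In this model the ratio is 1 + 4h/√A, so it falls smoothly as the floor area A grows; nothing in it produces a threshold at any particular size.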

### Commercial buildings

Figure 2.1 shows the energy consumption of commercial buildings (N = 84 200) based on their size. Commercial buildings less than 5000 ft2 (465 m2) in floor area indeed have a noticeably higher energy consumption. With larger ones, size does not seem to make a dramatic difference. Physics says that smaller buildings should leak more heat, but it does not set a threshold for this effect, and certainly does not justify such a discontinuity.

Figure 2.1: Average source energy use intensity of commercial buildings against floor area.

The EUI among commercial buildings of less than 5000 ft2 is bimodal, with modes around 250 and 850 kBtu/​ft2/year. The latter comes (again) from the restaurants (see Fig. 6.2). So, as was the case with Fig. 1.1, this is not a size effect but rather an effect of economic sector. When food service is factored out (dark blue squares in Fig. 2.1), the actual size effect is quite small.

Figure 2.2 shows a decrease of the average source energy use intensity for larger commercial buildings EX restaurants. With the median, there is more noise than trend though.

Figure 2.2: Box plot (median and quartiles) and average (squares) of source energy use intensity of commercial buildings EX restaurants against their floor area.

### Residential buildings

Figure 2.3 shows the source energy use intensity of residential buildings (N ≈ 700 000) based on their size (see Appendix 3 for what is included). The few of them (1%) larger than 5000 ft2 (465 m2) have been lumped with those between 2000 and 5000 ft2. This is also why commercial and residential buildings cannot be plotted together: their sizes are on different scales.

As with commercial buildings (Figs. 2.1 and 2.2), the energy consumption of residences decreases with size, except that the largest ones are noticeably less efficient: the 0.1% of residential buildings larger than 20 000 ft2 (1850 m2) have an average source EUI of 124 kBtu/ft2/year. Most of them are apartment buildings of course, not huge houses. There are too few of them to filter them (on most criteria the vast majority of them are of unknown type).

Figure 2.3: Average source energy use intensity of residential buildings against their floor area.

## 3. Do buildings with more occupants or longer operating hours consume more energy? Yes.

### Occupant density

There are occupancy data for 20 000 commercial buildings, but only 2000 residential buildings, so the latter will not be used. The data, as they are provided, are broken down into eleven ranges: ten of width 2 (0 to 2 people per 1000 square feet, 2 to 4, …, 18 to 20) and one for more than 20. Unfortunately this choice of increment yields very unevenly spread data: together, the eight ranges above 6 people per 1000 square feet make up only 3% of the data. So they were added to the 4–6 range, which brings the eleven original ranges down to just three; the merged range (4+ people per 1000 ft2, made of nine original ranges) still amounts to less than 11% of the total.

Figure 3.1 shows that the energy use increases with occupant density. As is now becoming common, this effect comes in part from food-service buildings. But for once the effect does not vanish entirely; it is only somewhat weaker. There is a genuine increase: fuller buildings use more energy.

Figure 3.1: Average source energy use intensity of commercial buildings against occupant density.
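The merging of the sparse occupant-density ranges can be sketched as follows. The counts are invented (per 1000 buildings), chosen only so that the eight ranges above 6 people per 1000 ft2 add up to 3% and the merged range stays under 11%; the helper `merge_tail` is hypothetical as well.

```python
def merge_tail(labels, counts, first_merged):
    """Merge every range from index `first_merged` onward into one open-ended range."""
    merged_label = labels[first_merged].split("-")[0] + "+"
    new_labels = labels[:first_merged] + [merged_label]
    new_counts = counts[:first_merged] + [sum(counts[first_merged:])]
    return new_labels, new_counts


# Eleven original ranges (people per 1000 ft2) with invented counts per 1000 buildings.
labels = ["0-2", "2-4", "4-6", "6-8", "8-10", "10-12",
          "12-14", "14-16", "16-18", "18-20", "20+"]
counts = [600, 295, 75, 10, 6, 4, 3, 2, 2, 1, 2]

new_labels, new_counts = merge_tail(labels, counts, first_merged=2)
for label, count in zip(new_labels, new_counts):
    print(f"{label:>4}: {count / sum(new_counts):.1%}")
```

Even after merging nine ranges into one, the last bin remains by far the smallest, which is a reminder of how skewed the occupancy data are.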

### Operating hours

Figure 3.2 shows a box plot for the source EUI of commercial buildings as a function of the number of weekly operating hours, by eight-hour increments. The last range corresponds mostly to buildings operating 24/7. The 0–8 range stands out because of its high quartile, but with just 19 buildings open so seldom this result is not meaningful. From 8 to 72 hours the energy use increases steadily with the number of operating hours. Then it stabilizes up to 104 hours. For longer openings the standard deviations get large (but buildings running between 104 and 160 hours a week together make up less than 10% of the total).

Figure 3.2: Box plot (median and quartiles) of source energy use intensity of commercial buildings against the number of operating hours per week.

Figure 3.3 divides the average EUI by the number of operating hours. (Due to lack of better data, the middle of the range is used; since the error would be large for short hours these are not included). This 'intensity intensity' decreases up to 104 hours, i.e. the longer the building is operated the more energy it consumes (Fig. 3.2), but the increase is small enough for the energy used per operating hour to go down (light blue line in Fig. 3.3). From 104 to 120 hours the intensity increases markedly, then falls again. Buildings operated 24/7 (6% of the total) again consume more per hour.
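The division by operating hours, and the reason short hours are left out, can be sketched as below. The two EUI values are invented; only the use of the middle of each eight-hour range follows the text.

```python
def eui_per_operating_hour(eui, low_hours, high_hours):
    """Average EUI divided by the midpoint of the weekly operating-hours range."""
    midpoint = (low_hours + high_hours) / 2
    return eui / midpoint


def midpoint_relative_error(low_hours, high_hours):
    """Half-width of the range relative to its midpoint."""
    midpoint = (low_hours + high_hours) / 2
    return (high_hours - low_hours) / 2 / midpoint


# Invented average EUIs (kBtu/ft2/year) for two operating-hours ranges.
print(eui_per_operating_hour(220.0, 40, 48))
print(eui_per_operating_hour(300.0, 96, 104))

# Uncertainty from not knowing where in the range a building actually sits.
print(midpoint_relative_error(0, 8))
print(midpoint_relative_error(96, 104))
```

In this toy case the total energy use is higher for the longer hours while the energy per operating hour is lower. The midpoint is a poor stand-in for the 0–8 range, where the true value could be off by as much as the midpoint itself, and a reasonable one for 96–104.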

Figure 3.3: Average source energy use intensity per operating hour of commercial buildings against the number of operating hours per week.

The thousand buildings open 112 to 152 hours a week have a bimodal distribution for source EUI, with modes around 210 and 450 kBtu/ft2/year. And the culprit is… not restaurants (they do not open nearly this much) but their cousins "grocery stores and food markets". When these are removed, the 650 remaining buildings operated 112 to 152 hours have a unimodal EUI.

Looking at commercial buildings EX grocery stores and food markets, there are data for 13 800 buildings (down from 14 450). The dark blue squares of Fig. 3.3 show that the bump is gone: it is not an effect of operating hours but of type of business. The EUI is roughly constant above 72 hours of weekly operations. (The low energy use of the 152–160 hour range comes from a meager 13 buildings and can probably be safely ignored.) Commercial buildings EX grocery stores and food markets consume less energy per operating hour up to about 100 hours, after which energy use stabilizes.

## 4. Do buildings in cold climates consume more energy? Strangely, no.

### Cold climates use less energy but it is hot climates that use less energy

A quarter of the 130 000 commercial buildings have no climate data. There are eight climates (including one, subarctic, without data), and most have subtypes; of these fourteen subtypes, five make up less than 5% of the data combined. I therefore use only five climates ("very hot" added to "hot" and "very cold" to "cold"). The less-numerous ones now make up 7% (cold) to 10% (hot) of the buildings.

The light blue bars in Fig. 4.1 show that climate explains much less of the source EUI than other criteria. Moreover, the greater energy use is not in extreme climates but in "warm" places (the likes of El Paso, TX; San Francisco, CA; Memphis, TN). There is no significant difference between humid (70% of buildings) and dry (20%) climates: both are at 275–280 kBtu/ft2/year. The marine climate (10%) has a slightly lower energy intensity of 260 kBtu/ft2/year.

Figure 4.1: Average source (light blue, left axis) and site (dark blue, right) energy use intensities of commercial buildings by climate.

### Why?

Part of the reason is probably that commercial buildings use energy for more than heating and cooling. And a motor or an oven uses as much energy in cold and hot climates. So only part of the energy use depends on climate, which should dampen the effect. Residential buildings should thus show a stronger impact of climate. Unfortunately, out of close to 750 000 of them in the database, 87% are in a warm and dry climate (perhaps because 87% of them are from California); consequently, not much can be done with residential buildings and climate.

Another partial explanation is that buildings in the colder climates are better insulated (as indicated by their median Energy Star rating of 77, against 61 for hot climates). This reduces the need for heating, but without removing it completely.

Or it could be good old artifacts. For instance the median size of commercial buildings in cold climates is 18 000 ft2 against 38 000 ft2 in hot climates, with 10.7% against 18.6% above 200 000 ft2. Also, half of the commercial buildings in hot climates have been built since 2000, against only a quarter in cold climates (medians of 2000 and 1974). And 60% of buildings in cold climates have an occupant density under 1 person per 1000 square feet against a quarter in hot climates, businesses in cold climates are open longer hours, etc. Climate is plainly not the only difference between them.

### Conversion efficiency as signature

Commercial buildings in cold climates (mostly in Minnesota, Vermont and Wisconsin) seem to have a lower energy use, which is certainly counter-intuitive. But note that the light bars in Fig. 4.1 give the source energy use intensity. Using the site energy instead (dark bars), the cold climates no longer stand out: it is the hot climates (mainly from Florida, Texas and Arizona) that do, for their low energy use.

Figure 4.2 shows that the ratio of site to source (conversion efficiency) is 34% in the hot climates and 47% in the cold ones (with climates in between aligning nicely in between). This is indicative of a difference in the source of energy: burning wood or other fuels for heating is more efficient than burning coal to produce electricity to run air-conditioning. Unlike in Fig. 4.1, Fig. 4.2 shows a clear pattern: the conversion efficiency is a signature for the type of energy used.

Figure 4.2: Energy conversion efficiency (site EUI over source EUI) of commercial buildings by climate.
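Why the site-to-source ratio acts as a signature of the energy mix can be sketched with two energy carriers. The site-to-source factors below (3 for electricity, 1 for fuel burned on site) are round-number assumptions of mine, not values from the database, so the shares printed only illustrate the mechanism.

```python
# Hypothetical site-to-source factors (assumptions, not values from the database).
ELECTRICITY_FACTOR = 3.0
FUEL_FACTOR = 1.0


def conversion_efficiency(electricity_share):
    """Site EUI over source EUI when `electricity_share` of site energy is electric."""
    source_per_site = (electricity_share * ELECTRICITY_FACTOR
                       + (1 - electricity_share) * FUEL_FACTOR)
    return 1 / source_per_site


def implied_electricity_share(efficiency):
    """Invert conversion_efficiency: the electric share that would give this ratio."""
    return (1 / efficiency - FUEL_FACTOR) / (ELECTRICITY_FACTOR - FUEL_FACTOR)


# Ratios quoted in the text for hot and cold climates.
for climate, ratio in {"hot": 0.34, "cold": 0.47}.items():
    share = implied_electricity_share(ratio)
    print(f"{climate}: ratio {ratio:.0%} -> electric share of about {share:.0%}")
```

Under these assumed factors, a lower ratio simply means a larger share of electricity in what the building consumes on site, which is consistent with air-conditioning dominating in hot climates and fuel-based heating in cold ones.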

## 5. Does better insulation reduce energy consumption? Yes.

### Aberrant results

The availability of insulation data is limited. So much so that strange results can be found.

Out of 1550 commercial buildings with wall insulation data, those with the better insulation (R between 10 and 20, no units provided) use more energy than those with less insulation (0–10): 369 kBtu/​ft2/year against 312. This may be reverse causation (better insulation where a lot of heating is needed) but it can also be an artifact (e.g. it turns out that 21% of the better-insulated buildings are open 24/7 against less than 10% of the less insulated).

Residential buildings have four kinds of windows: single-pane, single-pane with storm windows, double-pane and triple-pane. The last of these is the best, as it should be. But with a median source EUI of 65 kBtu/ft2/year, triple glazing is worse than residential buildings as a whole (median of 60 kBtu/ft2/year). Again the sample is not representative.

### Energy Star rating

Even though the category exists in the database, LEED data are not in fact available. And Energy Star data being provided for only two thousand residential buildings, I will focus on commercial. 65 250 of them have an Energy Star rating (score between 0 and 100), with a median of 64 and half between 35 and 83.

Figure 5.1 shows a box plot of source EUI as a function of the Energy Star rating. There is a correlation, as there should be. Buildings rated below 10 for instance have atrocious energy use. Note that, while 5.8% of the 65 000 commercial buildings with a rating are rated 5 or less, these make up less than 2.2% of the 32 500 commercial buildings with energy use data.

Figure 5.1: Box plot (median and quartiles) and average (as diamonds) of the source energy use intensity of commercial buildings against their Energy Star rating.

However, the improvement is not monotonic: there is no dramatic amelioration of the median between about 10 and 70, and the median for 40–60 seems in fact worse than between 20 and 40. The average (diamonds) is flat between 20 and 60. Overall, a rating below 20 is bad, above 60 or 70 is good, and everything in between is just some no man's land with no clear order. This said, all outliers save one have ratings either of 1 or above 75.

Note, however, that the lowest four ranges make up only between 2.5% and 4.5% of the data each. A linear regression with the four ranges 0–30, 30–60, 60–80 and 80–100 gives a drop of nearly 100 kBtu/ft2/year of the average source EUI when upgrading from one range to the next (R² > 0.99). So the Energy Star rating, generally speaking, correlates with the EUI. But it seems that the scale used is far more refined than what it can actually measure.
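A regression over four coarse ranges amounts to fitting a line through four points. The sketch below uses the index of the range as the x value and invented averages that fall by roughly 100 kBtu/ft2/year per step; the actual averages are not reproduced here.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, with R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Index of the rating range: 0-30, 30-60, 60-80, 80-100.
range_index = [0, 1, 2, 3]
# Invented average source EUIs (kBtu/ft2/year), for illustration only.
average_eui = [400.0, 300.0, 205.0, 100.0]

slope, intercept, r_squared = linear_fit(range_index, average_eui)
print(f"slope: {slope:.1f} per range, R2 = {r_squared:.4f}")
```

With only four points, a high R² is easy to reach; it says that the coarse ranges are ordered, not that the fine-grained 0–100 scale is.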

### Roof type

Figure 5.2 shows a box plot for the type of roof of residential buildings. 78% of roofs are of an unknown type, one type is NA and four types never occur. This still leaves more than 150 000 residential buildings with data (against only 3300 commercial buildings).

The figure is ordered from best to worst: "slate or tile shingles", "wood shingles/shakes/other wood", "built-up" (flat roof), "other or combination", "shingles" and "asphalt/fiberglass shingles". The last two are clearly the worst choices; of course, the point of shingles is not that they insulate well, but that they are cheap.

This said, a more expensive house is more likely to have a better roof, but also better insulation of the walls, double- or triple-​glazing, etc. So causation is fuzzier than correlation.

Figure 5.2: Box plot (median and quartiles) of the source energy use intensity of residential buildings against the type of roof.

## 6. Commercial buildings by category

### "Food service" (i.e. restaurants)

The category of commercial buildings labeled "food service" in the database has nearly ten thousand buildings. Of these, 20% are classified as "restaurant or cafeteria" and the rest is mostly "uncategorized" (the three remaining subtypes of "bakery", "fast food" and "other" together have a measly 150 buildings). Therefore I often use the English 'restaurants' instead of the mumbo-​jumbo "food service".

Figure 6.1 shows clearly that these buildings (in purple) consume more energy (median of nearly 900 kBtu/​ft2/year) than other commercial buildings (yellow, median of 172 kBtu/​ft2/year). A kitchen obviously consumes more energy than an office or a warehouse. Thus the overall source EUI distribution of commercial buildings is bimodal.

Figure 6.1: Histogram and median of the source energy use intensity (in kBtu/ft2/​year) of food-service buildings (purple) and of other commercial buildings (yellow). Source: bpd.lbl.gov

Figure 6.2 shows clearly that restaurants (in purple) are doubly different from other commercial buildings (yellow). They were mostly built after 1995 (the median year restaurants were constructed is 2004 with a quartile in 2000). They are smaller than other commercial buildings. This is why they explain the high energy use of recent buildings and of small buildings.

Figure 6.2: Floor area (in square feet) against year of construction for food-service buildings (purple) and other commercial buildings (yellow). The colored crosshairs give medians. Source: bpd.lbl.gov

### Retail

The size distribution of the 21 350 retail buildings is bimodal, with modes at 14 000 ft2 (which is also the median) and at 92 000–98 000 ft2 (perhaps due to a regulatory threshold at 100 000 ft2). The 5700 buildings between 12 000 and 14 000 ft2 (a quarter of the whole) show a peculiar pattern: only 25 have operating hours data and as many have occupant density data (the exact same 25 buildings), but 5600 have an Energy Star rating, and 99% are recorded as being built in the year 1900. If this pattern makes any sense to you, do not hesitate to let me know.

The database has subcategories for "big box (> 50k sf)" and "small box (< 50k sf)" among others, but as usual most are in fact "uncategorized". I will call 'big retail' the 5400 buildings larger than 80 000 ft2 (7400 m2) and 'small retail' the 14 000 buildings smaller than 30 000 ft2 (2800 m2). (About 2000 buildings of intermediate size are thus excluded so that I do not have to pick a precise cut-off.)

Small stores have a median source EUI of nearly 300 kBtu/​ft2/year, more than twice as much as the 130 kBtu/​ft2/year of the big ones. Part of the explanation is the size itself: smaller buildings are less efficient, but not to such an extent. Another bit of explanation is that big retail has a median Energy Star rating of 80, against 61 for the small buildings. However, big stores open 97 hours a week (median) against only 80 for the small ones, which should slightly favor the latter.

Figure 6.3 shows that big retail (in yellow) is much less dispersed than small retail (in purple) both in terms of energy use and operating hours. This greater clustering occurs despite there being 1440 big stores and only 950 small ones with relevant data. One can notice that the opening hours of big retail are bimodal: 72–78 and 96–102 hours a week (the mode of small retail is around 85).

Figure 6.3: Source energy use intensity against weekly operating hours for small retail (purple) and large retail (yellow). The colored crosshairs give medians. Source: bpd.lbl.gov

## Appendices: What data are included in the analyses

In order to avoid cluttering the article with methodological details (especially what is included or not in the analysis), these are gathered here. The need arises in part because the data have some drawbacks.

• The raw data are not available, only predigested ones (e.g. average for a range of values).
• The data are in the most backward units they could find (Btu, square feet). I did not change them because (i) a chart with ticks at 0, 93 and 186 m2 looks worse than 0, 1000 and 2000 ft2 and (ii) I am more interested in comparisons than in the absolute numbers, so rescaling would not change much.
• Many technical data (type of heating, insulation) are so incomplete as to be useless; e.g. out of about 700 000 residential buildings, 112 have insulation data (R values) for the walls.
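For readers who think in metric units, the conversion is a constant factor, which is why it does not affect any of the comparisons. The sketch below uses the standard definitions of the Btu and of the foot; the function names are mine.

```python
KBTU_TO_KWH = 0.29307107   # 1 kBtu expressed in kWh
FT2_TO_M2 = 0.09290304     # 1 square foot expressed in square meters (exact)


def ft2_to_m2(area_ft2):
    """Convert a floor area from square feet to square meters."""
    return area_ft2 * FT2_TO_M2


def kbtu_per_ft2_to_kwh_per_m2(eui):
    """Convert an EUI from kBtu/ft2/year to kWh/m2/year."""
    return eui * KBTU_TO_KWH / FT2_TO_M2


print(round(ft2_to_m2(1000)), round(ft2_to_m2(2000)))   # tick labels in m2
print(round(ft2_to_m2(5000)))                            # size threshold in m2
print(round(kbtu_per_ft2_to_kwh_per_m2(100), 1))         # 100 kBtu/ft2/year in kWh/m2/year
```

One kBtu/ft2/year is a little over 3 kWh/m2/year, so every EUI in the article can be rescaled by the same factor without changing any ranking.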

### 1. Construction date of commercial buildings

Figure A.1 shows that there were (supposedly) over fifteen times as many commercial buildings constructed between 1900 and 1905 as in the seventeenth, eighteenth and nineteenth centuries combined (the first bar is 1650–1900). The year 1900 itself is vastly overrepresented, so it is likely treated as a sort of NA. 1901–1910 is thus used as a decade instead of 1900–1910. Moreover, the mean energy consumption of 1900 is twice as high as that of the next few decades (the median is even higher), so these commercial buildings without a construction year are particularly inefficient.

Also, the 2010 decade has a tenth of the commercial buildings of the previous one, so this decade is ignored too (its energy consumption was pretty close to the 2000 decade anyway).

Figure A.1: Histogram and median of the construction date of commercial buildings (a bar represents five years). Source: bpd.lbl.gov


### 2. Construction date of residential buildings

One of the categories of residential buildings is buildings with "5+ units". These are not very numerous but (i) their median source EUI is twice as much as the overall median for residential buildings and (ii) many of them were supposedly built in 1895, 1905, 1915, etc. Apparently these buildings spontaneously sprout every ten years.

Also "mixed use" accounts for only 464 buildings with energy consumption data. This is both marginal and too imprecise (does it genuinely belong with residential buildings?). In Fig. 1.4, 'residential' thus means 'residential (not mixed-use) excluding 5+ units'.


### 3. Floor area of residential buildings

The larger residential buildings are of course rarer, but only up to 50 000 ft2 (4 650 m2). There are 360 of them between 40 000 and 50 000 ft2, but 1600 in 50 000–​60 000 ft2. This discontinuity comes wholly from the "multifamily" category (12 400 with relevant data, i.e. 1.7% of 700 000). Given this highly suspicious discontinuity, shown in Fig. A.2, these are not used.

The category having a discontinuity is not necessarily a problem. Indeed, being larger than 50 000 ft2 could be part of the definition, with smaller buildings categorized elsewhere showing the inverse jump (as with "big box" and "small box" retail stores). But this discontinuity having an impact on the overall distribution of residential buildings is highly suspicious.

Figure A.2: Histogram of floor area (in square feet) of multifamily residential buildings. Source: bpd.lbl.gov

What the histogram does not show is that 14.5% of them are between 200 000 ft2 and 14 500 000 ft2 (with no detailed break-down provided). Moreover, all multifamily buildings with data on both floor area and source EUI are "uncategorized", so we cannot know in which of the two categories of condominium and "town home" they belong.

Again "mixed use" is not used. So, in Fig. 2.3, 'residential' means 'residential (not mixed-use) excluding multifamily'.
