3.0 FREIGHT MOVEMENT BY AIR

3.1 Introduction

Air cargo is a key part of overall freight transport in terms of its dollar value, time sensitivity, and reliance on other shipment modes. This report outlines a method to integrate U.S. Census Bureau value data with the Department of Transportation (DOT), Office of Airline Information (OAI) weight data in order to develop two datasets: one containing commodity, commodity value, and weight of air shipments by O-D for domestic shipments, and one containing commodity, value, weight, and origin-port of entry/exit-destination for international shipments.

The aviation component of the provisional commodity O-D estimates combines OAI data on the weight of shipments for the U.S. airline industry with Census/Customs (hereafter Census) data on commodity type, value, and weight for imports and exports by air, and the FAF2 domestic aviation value and weight data. The major reasons for using the OAI data are that they allow a port-of-entry/exit to be estimated and that they are considered the definitive source for tons of U.S. air freight shipped. While the Census data do provide port-of-entry/exit identities, these are based on the port in which a shipment clears customs rather than the first port after/before crossing the border. The main reason for using the Census data is the availability of information on commodity type and value. The major contribution of the FAF2 domestic aviation data is to capture commodity type and value differences between the international and domestic data.

This report specifies the process for combining the OAI, Census, and FAF2 data and the methodologies for estimating the port-of-entry/exit and for forecasting data for months that have not yet been reported.

3.2 Air Freight Data Sources

3.2.1 OAI Data

The Office of Airline Information (OAI-BTS/RITA), part of the U.S. DOT Research and Innovative Technology Administration, publishes the Form 41 T-100 and T-100 (f) traffic data monthly on both a market and segment basis. The T-100 data contain information on the weight of air freight and mail by carrier, origin airport, and destination airport, as well as additional identifying and operational information. The OAI data are considered the definitive source of tons shipped for the U.S. airline industry.

OAI shipments are defined differently than FAF2 shipments in that OAI shipments use an airport basis (from airport origin to airport destination) rather than an establishment basis. In OAI market data, airport O-D refers to tons enplaned by a specific carrier at the origin airport and deplaned by the carrier at the destination airport. The T-100 market data will exclude the port-of-entry/exit whenever the port is an intermediate stop for the shipment. O-D for each record on the segment component of the T-100 data refers to a non-stop leg and reports tons transported rather than tons enplaned. The T-100 segment data will include the port-of-entry/exit for international shipments, but will exclude the ultimate O-D when a shipment has multiple stops. Combining the market and segment data to add ports-of-entry/exit is one of the main objectives of this project.

The T-100 data covering freight shipments by U.S. carriers are publicly available approximately 60 days after the end-of-month, and the T-100 (f) data covering foreign carriers are publicly available approximately six months after the end-of-month. The data can be found at http://www.transtats.bts.gov/ (the T-100 (f) data are included in the versions having all carriers).

The OAI data also differ from FAF2 in two other respects: they lack information on the value and on the commodity composition of shipments. In order to provide this information for FAF2 international air shipments, U.S. Customs data on commodity type and value are combined with the OAI data.

The coverage of the OAI data may be summarized with a few aggregate statistics. In 2003, freight data were recorded for almost 1,500 airports worldwide. About 600 of these were international airports, that is, airports engaged in shipments between the U.S. and other countries. About 200 of these international airports were located in the U.S. and its territories. The OAI T-100 data cover large certificated U.S. commercial carriers; since 10/2002, commuter and small certificated carriers are covered as well, although these account for only a negligible amount of international air shipments. The T-100 (f) covers foreign carriers serving the U.S. Included in these carriers are parcel, courier, and express carriers, which are treated as a separate mode in FAF2. In 2003, the T-100 and T-100 (f) showed 244 air carriers shipping freight in the U.S., and 188 carriers shipping freight between the U.S. and other countries (119 of these were foreign carriers).

Like FAF2, the public version of the T-100 data excludes in-transit shipments from the market data and foreign-to-foreign shipments from the segment data; however, see the Further Notes on the Data below for a qualification. The T-100 data do not include private or illegal shipments of freight, and passenger baggage is not counted as freight.

3.2.2 Census Foreign Trade Data

The Census Bureau Foreign Trade Division (FTD) (http://www.census.gov/foreign-trade/reference/products/index.html) publishes two monthly paid subscription series that largely satisfy the need for international air data. The data are collected by the U.S. Customs Service and published as:

1) U.S. Exports of Merchandise – Monthly – DVD ROM (information on the value, quantity, method of transportation, and shipping weights for 9,000 export commodities, 240 trading partners, and 45 Districts);
2) U.S. Imports of Merchandise – Monthly – DVD ROM (data on more than 17,000 commodities for 240 trading partners and 45 Districts; the data provide value, quantity, method of transportation, shipping weights, import charges, duties, and much more).

Shipments are for all merchandise between foreign countries and U.S. Customs Territories (50 states, District of Columbia, Puerto Rico, the U.S. Virgin Islands, and U.S. Foreign Trade Zones [FTZs]). The objective is to capture the physical movement of merchandise between foreign countries and the U.S. The data include government and non-government shipments and do not depend on the shipment being part of a commercial transaction.

A shipment's O-D on the Census data is based on Customs Districts and on where the shipment is processed by the Customs Service. For FAF2 purposes, a Customs District may include more than one state, and a state may have more than one Customs District. The Export data satisfy the need for mode-destination-port of origin-tonnage-dollar value, but lack port-of-exit data. The Import data satisfy the need for mode-origin-destination-tonnage-dollar value, but define port-of-entry as the port in which the shipment clears customs rather than the first port after crossing the border.

Commodities are reported using the 10-digit HS (Schedule B for exports), which can be translated to SCTG using a crosswalk provided by FHWA. Export values are reported free-alongside-ship (FAS). Import values are available either by customs-import-value (CIV), which excludes duties, freight, insurance, and other costs of importation, or by customs-insurance-freight (CIF), which adds freight and insurance to the CIV. For FAF2 the CIF values are used to better reflect the shipment's value at the border.

The data are available approximately three months after the end-of-month. Export data are recorded in the month in which the shipment leaves the country, corresponding to the FAF2 definition.
However, import data are recorded in the month in which the shipment clears customs and may therefore not correspond to the month the shipment was transported into the country, due to time spent in bonded warehouses or FTZs. Like FAF2, the Census data exclude in-transit shipments.

Although the Census Bureau data provide vital information for the FAF2 project, there is also a substantial on-going cost to subscribe to the dataset, currently $2,700/year for both Imports and Exports. It may therefore be useful to consider, for future use, a related subset of data that is available on-line for $75 for a one-month subscription at http://www.usatradeonline.gov/usatrade.nsf?Open&mc=F9000.

Appendix A compares the dimensions of the data sources used to produce the provisional estimates.

3.2.3 Further Notes on the Data

3.3 Combining Census and OAI Data

3.3.1 Cross-Walks for Commodity and Geographic Information

Combining the OAI and Census data into a FAF2 dataset requires reconciling the different levels of detail at which commodity and geographic identifying information are collected and stored. In the case of commodity types and values, the OAI data are at a more general level than is required by FAF2, a topic that is covered in the sections below on estimation. This section covers the cross-walks used to reconcile differences between the commodity types on the Census and FAF2 datasets and the geographic information on all three datasets.

Several cross-walks were already available from FHWA. Commodity cross-walks from the HS codes used in the Census Foreign Trade files to the SCTG codes used in FAF2 are available at http://ops.fhwa.dot.gov/freight/freight_analysis/faf/faf2_tech_document.htm. Cross-walks between countries and foreign trade regions are available upon request from FHWA (contact Michael Sprung at Michael.Sprung@fhwa.dot.gov). A third cross-walk from U.S. counties to FAF regions was also provided by FHWA.

The cross-walks to be developed are translations between different levels of specificity for geographic information between the OAI data and Census/FAF2. The OAI geography is based on airports, the most specific level of detail, and is used as the link between the other two. Each airport is assigned to both a Customs District and a FAF2 region so that the relevant (dis)aggregation can be accomplished.

The first cross-walk developed for FAF2 International Aviation is from U.S. airports to counties, which is used in combination with the existing cross-walk from counties to FAF2 regions.
The matching process requires two supplemental files: the Master Coordinates File (MCF) from OAI, available at http://www.transtats.bts.gov/Tables.asp?DB_ID=595&DB_Name=Aviation%20Support%20Tables&DB_Short_Name=Aviation%20Support%20Tables, and the county subdivision file from Census, available at http://www.census.gov/geo/www/gazetteer/places2k.html. Two other sources also proved useful when the assignment of an airport to a county was unresolved from the first round of processing: Mapquest ® at http://www.mapquest.com/maps/ and the National Association of Counties website at http://www.naco.org/Template.cfm?Section=Data_and_Demographics&Template=/cffiles/counties/city_srch.cfm.

Both the MCF and the county subdivision file have information on the state and on latitude and longitude. Within each state, each airport is matched to its two closest county subdivisions. Two subdivisions are matched because an airport may be near the border of its actual county and closer to the geographic center of another county. When the two closest subdivisions were in the same county, the airport was assigned to that county. When the two closest subdivisions were in different counties, the airport city name from the MCF was used to determine the county using either Mapquest ® or the National Association of Counties website.

The second cross-walk developed for FAF2 International Aviation is from U.S. airports to U.S. Customs Districts. As noted above, the MCF provides information on airports in the form of airport name, state, city name, and latitude and longitude. In order to assign airports to Customs Districts, a hierarchical matching method is used. Matching airports to Customs Districts is more complicated because Customs Districts are less uniform than counties: a state may have multiple Customs Districts or no named District or Sub-District, and a Customs District may span more than one state.
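
As an illustration of the airport-to-county rule described earlier in this section (match each airport to its two closest county subdivisions within the same state, and accept the county only when both agree), the following is a minimal sketch. The record layout, the function names, and the use of great-circle distance are assumptions made for the example; the source does not state which distance measure was used.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (assumed distance measure)."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def match_airport_to_county(airport, subdivisions):
    """Return the county when the two nearest same-state subdivisions agree.

    Returns None when they disagree (or fewer than two candidates exist),
    which marks the airport for the manual city-name lookup step.
    Both arguments use hypothetical dict records with keys
    'state', 'lat', 'lon' (and 'county' for subdivisions).
    """
    same_state = [s for s in subdivisions if s["state"] == airport["state"]]
    ranked = sorted(
        same_state,
        key=lambda s: haversine_km(airport["lat"], airport["lon"], s["lat"], s["lon"]),
    )
    nearest_two = ranked[:2]
    counties = {s["county"] for s in nearest_two}
    if len(nearest_two) == 2 and len(counties) == 1:
        return counties.pop()
    return None  # unresolved: resolve by airport city name
```

Returning None rather than guessing keeps the unresolved cases separate, mirroring the second round of processing in which the airport city name is looked up by hand.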
While Customs Sub-Districts consist of places in the usual geographic sense of cities or regions, they may also be airports and business places (e.g., FedEx processing centers).

Matching Customs Sub-Districts to Airports
The matching process resulted in each U.S. airport being assigned to a Customs Sub-District.

3.3.2 Estimating Flows by Weight

Estimation for domestic and international data is substantially different, with the estimation for domestic data being straightforward. Estimation of domestic data consists of calculating growth rates from OAI domestic market data by FAF origin region between the CFS year and the provisional year required for FAF. These growth rates are then applied to the individual commodity weights from 2002 FAF by origin region to obtain the estimates.

The OAI market data for international shipments are missing the port-of-entry/exit, while the Census foreign trade data are missing the port-of-exit for exports. Also, the port-of-entry for imports does not necessarily correspond to the FAF definition of port-of-entry. This section outlines a procedure for reconciling the differences between the two datasets and assigning a port-of-entry/exit to the OAI market data based on the OAI segment data. The guiding philosophy behind the algorithm is to impose aggregate efficiency by minimizing the distance transported at each step. The specification of the algorithm is based on a port-of-exit. The extension to a port-of-entry is straightforward.

Notation: T = tons shipped.
The result of this algorithm will be a dataset with shipment weights by origin-port of exit-destination. Matching is done at the carrier level to preserve the correspondence between market and segment data in the OAI datasets.

Appendix B, Table B1, provides round-by-round results of the estimation. Round 1 corresponds to step a), round 2 to step c), round 3 to step f), and round 4 to step h). For imports (exports), rounds 2 and 3 assign a different port-of-entry (exit) than the domestic destination (origin). These two rounds accounted for about 17% of imports and 13% of exports. A large majority of the data, more than 70% for both imports and exports, is assigned its port-of-entry/exit in the first round, where it is equal to the original port-of-entry (exit) for imports (exports).

3.3.3 Estimating Commodity Composition and Value

3.3.3.1 International Routes

Commodity composition and value are available from Census for exports at the domestic origin, and for imports at the domestic destination, based on the Census geographic definitions. The Census information is used to estimate commodity composition and value for the OAI data. The corresponding domestic origin for exports and domestic destination for imports from the OAI data will be referred to as the matching ports.

The first step in the estimation process is to determine whether it is reasonable to combine the two data sources. Evidence that combining the data is reasonable is given by the high correlation for tonnage values at the matching ports between the two data sources (see the bottom of Table B2 in Appendix B). Two caveats to this estimation need to be noted. The first is that the OAI data are about 20% larger than the Census data for both imports and exports. Although there are several differences between the data sources, the strongest explanation is that the difference is due to the OAI data including more in-transit shipments than the Census data. The OAI data are based on carrier reporting, with market routes defined by enplanement and deplanement of the cargo. An in-transit shipment that switches carriers in the U.S., or that is transferred from one plane to another by the same carrier, would appear as an import/export on the OAI data. However, the same shipment would be more likely to be designated as in-transit in Customs' reporting to Census. Additional evidence that the differences are due to in-transit shipments is that the Customs Districts with the largest differences are also likely transshipment ports: New York, Miami, and Anchorage. This is the second caveat, because it affects whether it is reasonable to apply Census information to the OAI data on a district-by-district basis. The second caveat is addressed in the estimation process.

The estimation philosophy is to assign the Census commodity distribution and value-per-ton by commodity (prices) to the OAI data for each matching port while keeping the aggregate commodity distribution and prices equal to those for the Census data. Because of the differences in tonnage for some key ports, a straightforward port-by-port application would result in large differences at the aggregate level. The first step in the port-by-port estimation is to rescale Census exports/imports by the ratio of the respective OAI-to-Census aggregates. The approach taken here then has two parts. For the share of a matching port's tons that is on both the OAI and rescaled Census data, the distribution and prices are taken directly from the Census data for that port. The remainder can be either excess Census tons or excess OAI tons. The commodities and values for matching ports with excess Census tons are then aggregated to define residual commodity shares and prices. The residual commodity shares and prices are then applied to those Customs Districts with excess OAI tons.
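
The two-part allocation just described can be sketched as follows. This is an illustration under stated assumptions, not the report's implementation: the function name and the dictionary layout are hypothetical, and the way excess Census tons are weighted when forming the residual shares and prices (each port's excess tons spread over that port's own commodity shares and valued at that port's prices) is an assumption chosen so that aggregate commodity tons and value match the rescaled Census totals.

```python
def allocate_commodities(oai_tons, census_tons, census_value):
    """Assign Census commodity shares and prices to OAI tons by matching port.

    oai_tons:     {port: total OAI tons}
    census_tons:  {port: {commodity: tons}}
    census_value: {port: {commodity: dollars}}
    Returns ({port: {commodity: (tons, value)}}, scale_factor).
    """
    ports = sorted(set(oai_tons) | set(census_tons))
    total_census = sum(sum(c.values()) for c in census_tons.values())
    sigma = sum(oai_tons.values()) / total_census  # OAI-to-Census aggregate ratio

    result = {p: {} for p in ports}
    excess_oai = {}
    resid_tons, resid_value = {}, {}
    for p in ports:
        port_census = census_tons.get(p, {})
        port_total = sum(port_census.values())
        scaled_total = sigma * port_total            # rescaled Census tons
        oai = oai_tons.get(p, 0.0)
        common = min(oai, scaled_total)              # tons present on both sources
        excess_census = scaled_total - common
        excess_oai[p] = oai - common
        for c, t in port_census.items():
            if t <= 0:
                continue
            share = t / port_total                   # port-level Census share
            price = census_value[p][c] / t           # port-level Census price
            result[p][c] = (common * share, common * share * price)
            # Assumed weighting for the residual pool.
            resid_tons[c] = resid_tons.get(c, 0.0) + excess_census * share
            resid_value[c] = resid_value.get(c, 0.0) + excess_census * share * price

    total_resid = sum(resid_tons.values())
    for p in ports:
        if excess_oai[p] <= 0 or total_resid <= 0:
            continue
        for c, rt in resid_tons.items():
            if rt <= 0:
                continue
            tons = excess_oai[p] * rt / total_resid  # residual share
            value = tons * resid_value[c] / rt       # residual price
            old_t, old_v = result[p].get(c, (0.0, 0.0))
            result[p][c] = (old_t + tons, old_v + value)
    return result, sigma
```

Because total OAI tons equal total rescaled Census tons by construction, the excess OAI tons and the excess Census tons sum to the same amount, which is why the residual pool can be redistributed without changing the aggregate commodity distribution.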
The result is an OAI-based dataset that reflects the Census commodity distribution and prices at the aggregate level and also captures a large share of Census port-level differences in commodities and prices.

More formally, the estimation algorithm can be written in terms of exports as follows:

T = tons shipped,
α = a 33 × 1 vector of commodity shares,
p = a 33 × 1 vector of commodity prices,
σ = the export scale factor = (aggregate OAI export tons) / (aggregate Census export tons).
The resulting FAF commodity shares and prices are then applied at the airport level before aggregating to create tons and value at the FAF regional level.

3.3.3.2 Domestic Routes

The only information available on commodity distribution and prices for domestic routes is the 2002 CFS survey, which is also the basis for the 2002 FAF database, and this is used as the base for estimation (footnote 5). In order to more accurately reflect the values for non-survey years, the commodity price information from the CFS is updated using the commodity price data from Census on exports. Exports are used for two reasons: exports more closely resemble domestic production than imports do and, for 2002, commodity shares and prices of exports are more highly correlated with CFS commodity shares and prices. In particular, commodity prices are calculated at the national level for exports for the provisional estimation year, and then the ratio of the commodity price in the provisional year to the 2002 level is used to inflate/deflate domestic commodity prices obtained from 2002 FAF data.

Domestic air freight commodity shares are unchanged at the individual route level, or when aggregated by origin FAF region, since individual commodity weights are estimated to grow at the same rate as the weight of OAI shipments from the origin FAF region. Note, however, that commodity shares for destination regions can change, since they receive shipments from more than one origin region.

3.4 Forecasting Air Freight Data for the Remainder of the Year

The first release of current year estimates is to be made available in December of the current year, so forecasts are required for weight, value, and the commodity distribution for unreported months (footnote 6). For both Census data and domestic shipments by domestic carriers on the OAI data, the missing data consist of the fourth quarter of the most recent year.
Foreign carriers on the OAI data (who have minimal domestic shipments) and domestic carriers' international shipments will require the third and fourth quarters to be forecast.

The specific forecast techniques are selected based on historical evidence, but the basic approach is to first forecast tons shipped based on the OAI data, and then to forecast values and the commodity distribution based on the Census data. The forecasts use the most recent data on annual changes to update the available data from the fourth quarter of the previous year. Using the available data from the fourth quarter helps to retain the seasonal pattern for routes, commodity distribution, and relative prices for the fourth quarter of the current year. The results of the forecasts are then used to supplement the available data so that the methods described above for estimating air freight flows can be applied.

The specific technique will be part of the broad class called time-series techniques. The general alternatives to time-series techniques are model-based techniques, which hypothesize relations between variables and estimate a model based on those relations. The problem with model-based techniques for FAF2 is that the use of variables outside the database (e.g., fuel prices) restricts how the forecast data can be used for independent study (e.g., how does the price of fuel affect congestion). In the case of using fuel prices to help forecast missing data, the effect of fuel prices would be pre-determined by the forecast model rather than reflecting actual conditions. Time-series techniques, in contrast, use only the past histories of the variables of concern to forecast the future.

3.4.1 Forecasting the OAI Data for Tons Shipped

Two significant events have changed the characteristics of the OAI data and limited the efficacy of using the history of the series prior to 2002. The first is the 9/11/01 terrorist attacks, which had a profound, direct effect on aviation. The second is the carrier coverage of the T-100 data, which expanded in 10/2002 to include small certificated, commuter, and all-cargo carriers. Carriers that began full reporting in 2002 will be referred to as new-reporters, while those who fully reported prior to 2002 will be referred to as prior-reporters. The primary impact of this change is on domestic tons shipped, because international operations were already reported prior to 10/2002. Of particular significance, domestic operations of Federal Express were not publicly reported prior to 10/2002.

Given these events, the historical period used as a base for forecasting is restricted to 2002 and later. The growth rates that are the basis of the forecasts are also restricted to depend only on information from the previous year, to allow for an evolving trend following September 11. The limited availability of data reduces the number of parameters that can be estimated and the ability to apply standard statistical tests. For these reasons, simple techniques that depend on only one estimated parameter were considered.

The techniques examined consist of using data on annual growth rates between the previous and current calendar year and then applying these growth rates to the missing quarter(s) from the previous calendar year. For example, one of the forecasts for domestic carriers uses the growth rate from the third quarter of the previous year to the third quarter of the current year. The forecast for the fourth quarter of the current year is obtained by applying this growth rate to the level of tons enplaned in the fourth quarter of the previous year. Annual growth rates are used to avoid seasonal effects, which may also have changed since September 11.
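
The one-parameter growth-rate forecast described above can be sketched in a few lines. The function names, the quarter-keyed dictionaries, and the tonnage figures are hypothetical and serve only to illustrate the arithmetic for the latest-quarter and year-to-date variants.

```python
def latest_quarter_forecast(prev_year, curr_year, latest_quarter=3, target_quarter=4):
    """Forecast a missing quarter using same-quarter annual growth.

    prev_year / curr_year map quarter number -> tons enplaned.
    """
    growth = curr_year[latest_quarter] / prev_year[latest_quarter]
    return growth * prev_year[target_quarter]


def year_to_date_forecast(prev_year, curr_year, latest_quarter=3, target_quarter=4):
    """Same forecast, but with growth measured over all reported quarters."""
    quarters = range(1, latest_quarter + 1)
    growth = sum(curr_year[q] for q in quarters) / sum(prev_year[q] for q in quarters)
    return growth * prev_year[target_quarter]


# Illustrative (made-up) tons enplaned by quarter.
tons_prev = {1: 100.0, 2: 110.0, 3: 120.0, 4: 150.0}
tons_curr = {1: 105.0, 2: 121.0, 3: 126.0}

q4_latest = latest_quarter_forecast(tons_prev, tons_curr)  # growth 126/120 applied to 150
q4_ytd = year_to_date_forecast(tons_prev, tons_curr)       # growth 352/330 applied to 150
```

Because the growth rate multiplies the previous year's fourth-quarter level, the forecast inherits that quarter's seasonal pattern without any separate seasonal adjustment.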

The forecasts considered differ along three dimensions: whether the time period used to calculate the growth rates is the year-to-date (YTD) or the most recent completed quarter (depending on availability) relative to the same period in the previous year, whether to forecast domestic and international routes separately or in combination, and whether to forecast prior- and new-reporting carriers separately or in combination.

Another potential problem with the most recent data may be carriers who are late in reporting their data to OAI. To correct for the missing carrier effect, an adjustment is made to the data for the most recent year. The adjustment is based on the assumption that late-reporting carriers grew at the same rate as those who reported on time. Adjusted growth rates are calculated for each month after January, with the growth rate for each month based on aggregate enplaned tons from the subset of carriers who reported in both the current and previous month. The adjusted growth rate is then consecutively applied to each month after January, subject to a constraint that the adjusted aggregate enplaned tons is greater than aggregate enplaned tons obtained directly from the data (since the adjustment is to account for late reporters).

3.4.1.1 Mathematical Specification of the Adjusted Tons and Forecasts (footnote 7)

Let m/q = m [definition incomplete in source]
Let h = n, r, a, u index carrier subsets.
Let t = y indicate the current year.
Let i = b, d, s, c index route groupings.
Let j = n, p, s, c, f index carrier groups.
Let k = 1, 2.

Calculation of Adjusted Growth Rates

[equations missing from source]

Forecasts

General Level Forecasts for All Carriers and Regions

Separate Regions (Domestic and International): [equation missing from source]
Combined Regions: [equation missing from source]

There are eight forecasts for the fourth quarter of the current year, since each of the following four groupings is computed with both year-to-date and latest-quarter growth rates:

Separate Regions (Domestic and International), Separate Carrier Groups
Combined Regions, Separate Carrier Groups
Separate Regions, Combined Carrier Groups
Combined Regions, Combined Carrier Groups

For purposes of illustration, the numerical results of the forecasts over the period 2002 to 2006 are given in Tables A-2 (third quarter forecasts for foreign carriers) and A-3 (fourth quarter forecasts for all carriers) in Appendix A. Results from Table A-2 are presented for completeness but do not enter into the selection process. Three summary measures are given for each forecast in both levels and percentage terms: average error, standard deviation, and absolute error. The selection decision will be based on measures of percentage error because of the large change in levels with the addition of new-reporting carriers. Due to the small sample size, the forecast is selected based on a subjective evaluation of these measures rather than using formal statistical tests.

As an aid to reading Table A-2, the best measure for each group of four forecasts varying by time-period and regional grouping is highlighted. The measures in Table A-2 clearly indicate using a forecast based on growth rates calculated using the latest available quarter rather than year-to-date, because all of the best measures under percentage error fall in this category. Selecting between forecasts based on separate or combined regional groups and separate or combined carrier groups is less clear. However, the evidence slightly favors basing the forecasts on growth rates calculated using separate regional groups and combined carrier groups.
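
The three summary measures used to compare candidate forecasts (average error, standard deviation, and absolute error, each in levels and in percentage terms) can be computed as in the sketch below. The function name, the sign convention (forecast minus actual), the use of the sample standard deviation, and the numbers are assumptions made for illustration; they are not taken from the Appendix tables.

```python
from statistics import mean, stdev


def forecast_error_summary(actual, forecast):
    """Summarize historical forecast errors in levels and in percent.

    actual and forecast are equal-length sequences of tons, one entry per
    historical year. Errors are defined as forecast minus actual (assumed).
    """
    errors = [f - a for a, f in zip(actual, forecast)]
    pct_errors = [100.0 * (f - a) / a for a, f in zip(actual, forecast)]
    return {
        "avg_error": mean(errors),
        "std_error": stdev(errors),           # sample standard deviation (assumed)
        "avg_abs_error": mean(abs(e) for e in errors),
        "avg_pct_error": mean(pct_errors),
        "std_pct_error": stdev(pct_errors),
        "avg_abs_pct_error": mean(abs(e) for e in pct_errors),
    }


# Made-up historical values for three years.
summary = forecast_error_summary(
    actual=[100.0, 200.0, 400.0],
    forecast=[110.0, 190.0, 400.0],
)
```

Comparing the percentage measures across candidates, rather than the level measures, avoids having the results dominated by years after the reporting expansion, when tonnage levels are much larger.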

Therefore, the selected forecast is the one based on latest-quarter growth rates calculated for separate regional groups and combined carrier groups. The growth rates for each group in this forecast will then be applied to the most recently available fourth quarter (and third quarter for foreign carriers) segment and market data from OAI at the individual carrier and route level. Thus, for the aviation components of the 2006 provisional estimates, the missing fourth quarter for 2006 is obtained by applying the latest-quarter growth rates between 2005 and 2006 to the fourth-quarter 2005 data.

3.4.2 Forecasting the Commodity Distribution and Price

The commodity distribution and price are forecast based on historical data from Census. Exports and imports are forecast separately for international shipments, and the export distribution and price are then used to forecast domestic shipments. As above, due to the disruptions to the aviation industry, only data for 2002 and later are used as a basis for the forecasts.

3.4.2.1 International Shipments

Census data on imports and exports are the only timely source for the value and commodity distribution of air shipments. Historical data from Census are used to forecast the price (value divided by weight) and commodity shares for the fourth quarter of the most recent year. The forecasts are then combined with the aggregate weight forecast from the OAI data, and the techniques outlined above for estimating the value and commodity distribution are then applied.

Forecasts use available information to generate estimates of unavailable information. Evaluation of forecast techniques is based on applying the technique to generate "historical forecasts," which can then be used to calculate errors based on the known values and summarized based on the forecast criterion. The basic philosophy is that the historical forecasts should be generated using only information that would have been available to a forecaster under the same production conditions as future forecasts will be generated.
For FAF2 purposes, a forecast for the fourth quarter of 2007, for example, uses only information that would have been available in December 2007 (Census data up to the third quarter of 2007).

The evaluation criterion used here is the average squared error for the forecasts of prices and the standard deviation of the forecast errors for the commodity distributions. The reason for the different criteria is that the price forecast errors are not necessarily mean zero, and the standard deviation would fail to incorporate undesired bias effects.

Three forecasts of prices are considered: (i) year-to-date (YTD) third quarter prices are used for the fourth quarter, (ii) annual increases in individual commodity prices based on YTD in the current year and the same period in the previous year, and (iii) the annual increase in the price of aggregated commodities based on YTD in the current and previous year is applied to all individual commodities. The second and third forecasts of the price increase are then applied to the individual fourth quarter prices from the previous year. The first forecast will not include seasonal effects on prices, while the second and third forecasts will include seasonal effects. The final forecasts of commodity prices may use different techniques for different commodities, since seasonal effects may be important for some commodities but not others, and because forecasts for the individual commodities are independent.

3.4.2.2 Domestic Shipments

The export commodity distribution and prices used for the latest provisional year are based on the implied forecast of these values for exports given above.

3.4.3 Conclusion

The aviation portion of the provisional commodity O-D data requires estimating key components that are missing from the available data and forecasting a portion of the data to provide timely information for analysis.
The techniques outlined above provide a reasonable approach to filling the data gaps that will provide useful information to users of the database.

5 An adjustment is made to the CFS data to account for observations that have been rounded to zero for either value or tonnage. National level prices are calculated for each commodity based only on observations for which both value and tonnage are greater than zero. The national level price is then used to calculate the missing tonnage (value) by dividing the non-zero value by the price (multiplying the non-zero tonnage by the price).
6 For revised releases, the most recently available data may be used, avoiding the need to use forecasts.
7 Note: The specification is geared toward the usual situation where data are available through September for domestic carriers and June for foreign carriers. In the event the forecast is implemented when fewer or more months are available, modifications would be required. For growth rates, the latest quarter would refer to the latest available three months. For example, if data are available only through August for domestic carriers, then the latest quarter would be June through August, and growth rates would be calculated relative to the same period in the previous year. On the other hand, the base to which the growth rates are applied to generate forecasts consists of the unavailable months. So if data are available only through August, the growth rates are multiplied by tons shipped in the September to December period of the previous year.
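
The window rule in note 7 can be illustrated with a short sketch. The function names and the month-keyed dictionaries are hypothetical, and the tonnage figures used in any call are up to the reader; only the choice of months follows the note (growth from the latest three available months, applied to the unavailable months of the previous year).

```python
def forecast_windows(last_available_month):
    """Return (growth_window, base_months) as lists of month numbers 1-12."""
    if not 3 <= last_available_month <= 11:
        raise ValueError("need at least three reported months and one missing month")
    growth_window = list(range(last_available_month - 2, last_available_month + 1))
    base_months = list(range(last_available_month + 1, 13))
    return growth_window, base_months


def forecast_remaining_tons(prev_year, curr_year, last_available_month):
    """Forecast total tons for the unreported months of the current year.

    prev_year / curr_year map month number -> tons shipped.
    """
    growth_window, base_months = forecast_windows(last_available_month)
    growth = sum(curr_year[m] for m in growth_window) / sum(prev_year[m] for m in growth_window)
    return growth * sum(prev_year[m] for m in base_months)
```

With data through August, `forecast_windows(8)` selects June through August for the growth rate and September through December as the base, matching the example in the note.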
United States Department of Transportation - Federal Highway Administration