Carbon Cycle Data Assimilation System (CCDAS)
Schematic of the CARBONES Carbon Cycle Data Assimilation System
A CCDAS has been built by coupling a prognostic model of the terrestrial carbon cycle to an optimization system, in order to estimate some of the model's process parameters from various data sources (flux measurements, carbon inventory data, satellite products). The assimilation procedure consists of minimizing a misfit function that measures the mismatch between i) the model outputs, which depend on the parameters sought, and the various observation streams, and ii) the a priori knowledge of these parameters and their optimized values. The assimilation framework requires that the model quantities (which depend on the parameters to optimize) can be mapped to the various data sources, and that the error statistics (uncertainty) of each are known a priori. Given the uncertainties on the prior values of the parameters and on each observation source, a CCDAS allows the uncertainties on the optimised model parameters to be derived. These uncertainties can finally be translated into uncertainties on the assimilated data and on other model diagnostics. In CARBONES, because the model is fully prognostic, the knowledge about the current terrestrial carbon cycle gained during the parameter optimization can be applied to predict its future evolution (with attached uncertainty on the predicted quantities).
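The misfit function described above can be illustrated with a minimal sketch. It assumes the usual quadratic (Gaussian) form with an observation term and a prior term, which the text does not spell out. The names `misfit`, `posterior_covariance`, `toy_model` and all numerical values are hypothetical, and the linear toy model merely stands in for the terrestrial model.

```python
import numpy as np

def misfit(x, x_prior, b_inv, y_obs, model, r_inv):
    """Quadratic misfit: observation term plus prior (background) term.

    b_inv and r_inv are the inverse error covariance matrices of the
    prior parameters and of the observations (assumed known a priori).
    """
    d_obs = model(x) - y_obs   # model-minus-observation mismatch
    d_par = x - x_prior        # departure from the prior parameter values
    return 0.5 * (d_obs @ r_inv @ d_obs) + 0.5 * (d_par @ b_inv @ d_par)

def posterior_covariance(jacobian, b_inv, r_inv):
    """Parameter error covariance after optimization, valid for a linear
    model with Gaussian errors (inverse of the misfit Hessian)."""
    hessian = jacobian.T @ r_inv @ jacobian + b_inv
    return np.linalg.inv(hessian)

# Toy setup (all values hypothetical): 2 parameters, 3 observations.
jac = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
toy_model = lambda x: jac @ x
x_prior = np.array([1.0, 2.0])
b_inv = np.diag([1.0 / 0.5**2, 1.0 / 0.5**2])  # prior std dev 0.5
r_inv = np.eye(3) / 0.1**2                     # observation std dev 0.1
y_obs = toy_model(np.array([1.2, 1.9]))        # synthetic observations

post_cov = posterior_covariance(jac, b_inv, r_inv)
```

In this toy case the diagonal of `post_cov` is smaller than the prior variances, which is the sense in which the observations reduce the parameter uncertainty.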
ORCHIDEE is a global process-oriented Terrestrial Biosphere Model [Krinner et al., 2005]. It calculates carbon, water and energy fluxes between land surfaces and the atmosphere. The water and energy component computes major biophysical variables (albedo, roughness height, soil humidity) and solves the energy and hydrological balances at a half-hourly time step. The carbon module is called on a daily basis; it calculates a prognostic LAI and allocates carbon to leaves, roots, sapwood, reproductive structures and carbohydrate reserves. A turnover rate is applied to the biomass pools and produces litterfall. Litter is decomposed and enters three soil organic carbon pools with different residence times (active, slow and passive), following the CENTURY model [Parton et al., 1988]. The water and carbon modules are linked through photosynthesis, which is based on the work of Farquhar et al. [1980] for C3 plants and Collatz et al. [1992] for C4 plants. The stomatal conductance is based on Ball et al. [1987]. The carbon module first calculates LAI; the water and energy module then calculates GPP and stomatal conductance. The carbon module calculates growth and maintenance respiration [Ruimy et al., 1996], Net Primary Production (NPP), heterotrophic respiration and Net Ecosystem Exchange (NEE). ORCHIDEE is the land surface component of the IPSL-CM5 Earth System Model (Dufresne et al., submitted), currently used for CMIP5 simulations.
The ORCHIDEE model is distributed under an open-source licence named CECILL.
Ball, J.T., Woodrow, I.E. & Berry, J.A. (1987). A model predicting stomatal conductance and its contribution to the control of photosynthesis. Prog. Photosynthesis Res. Proc. Int. Congress 7th, Providence. 10-15 Aug 1986. Vol4. Kluwer, Boston, 221-224
Collatz, G. J., M. Ribas-Carbo, and J. A. Berry (1992), Coupled Photosynthesis-Stomatal Conductance Model for Leaves of C4 Plants, Australian Journal of Plant Physiology, 19(5), 519-538.
Dufresne, J.-L., et al. (2011), Climate change projections using the IPSL-CM5 Earth System Model: from CMIP3 to CMIP5, submitted to Clim. Dynam.
Farquhar, G. D., S. V. Caemmerer, and J. A. Berry (1980), A Biochemical-Model of Photosynthetic Co2 Assimilation in Leaves of C-3 Species, Planta, 149(1), 78-90.
Krinner, G., et al. (2005), A dynamic global vegetation model for studies of the coupled atmosphere-biosphere system, Global Biogeochemical Cycles, 19(1).
Parton, W. J., J. W. B. Stewart, and C. V. Cole (1988), Dynamics of C, N, P and S in Grassland Soils - a Model, Biogeochemistry, 5(1), 109-131.
Ruimy, A., G. Dedieu, and B. Saugier (1996), TURC: A diagnostic model of continental gross primary productivity and net primary productivity, Global Biogeochemical Cycles, 10(2), 269-285.
The ocean plays a crucial role, as it takes up about a quarter to a third of anthropogenic emissions, with significant year-to-year variations (Sabine et al., 2004). For the CARBONES CCDAS, we have developed a statistical model to estimate air-sea fluxes from satellite data, in-situ measurements and model outputs. The fluxes depend strongly on the sea surface CO2 partial pressure, which is estimated in a first step by the OCVR system described below.
Importance of the ocean surface carbon dioxide partial pressure (pCO2sw)
The air-sea CO2 flux is typically controlled by two terms, as expressed by the formula F = (k α) · ΔpCO2, where k is the piston velocity, α is the solubility (Weiss, 1974) and ΔpCO2 is the difference between the pCO2 in surface seawater and that in the overlying air. ΔpCO2 represents the thermodynamic driving potential for the exchange flux at the sea-air interface. Uncertainties in the air-sea CO2 flux come not only from the gas exchange coefficient but also from ΔpCO2. These uncertainties mainly come from poorly constrained estimates of the sea surface pCO2. Indeed, the seasonal and geographical variation of surface water pCO2 (from 150 to 750 µatm) is much greater than that of the atmosphere, which varies by about 20 µatm around a value of roughly 370 µatm in remote uncontaminated marine air (Feely et al., 2001).
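As a small worked example of the bulk formula, the sketch below evaluates F = k α ΔpCO2 for one ocean point. The function name and the numerical values are hypothetical and chosen only for easy arithmetic; units are left generic, since the text does not fix them. With ΔpCO2 defined as seawater minus air, a negative flux corresponds to ocean uptake.

```python
def air_sea_co2_flux(k, alpha, pco2_sw, pco2_air):
    """Bulk air-sea CO2 flux F = k * alpha * (pCO2_sw - pCO2_air).

    k        : piston (gas transfer) velocity
    alpha    : CO2 solubility
    pco2_sw  : pCO2 in surface seawater
    pco2_air : pCO2 in the overlying air
    The result is positive for outgassing and negative for ocean uptake.
    """
    delta_pco2 = pco2_sw - pco2_air
    return k * alpha * delta_pco2

# Hypothetical values: undersaturated surface water takes up CO2.
flux = air_sea_co2_flux(k=2.0, alpha=0.5, pco2_sw=350.0, pco2_air=370.0)
```

Because the flux is linear in ΔpCO2, an error of a few µatm in the seawater pCO2 translates directly into a flux error, which is why the text stresses the estimation of pCO2sw.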
OCVR-System: an innovative tool to improve pCO2sw estimation
Figure 1: OCVR architecture used for a global ocean pCO2sw reanalysis from 1989 to 2009 at 2° resolution.
Ocean pCO2 time series are one of the most valuable tools for observing trends in carbon fluxes. These analyses are limited by the coverage of measurements (less than 5% at 2° and monthly resolution over the last 20 years). The rapid development of satellite measurements, which provide very large volumes of data at high temporal (weekly) and spatial (less than 1°) resolution, offers a way around this limitation. However, the fluxes can only be obtained with indirect methods based either on numerical modelling or on robust algorithms using observable drivers. The OCVR system belongs to the latter family.
OCVR is a neural network framework developed by CLIMMOD within the CARBONES-EU FP7 project (see Fig. 1). As input variables, it uses satellite observations (Surface Chlorophyll, Sea Surface Temperature...), in-situ measurements and model outputs (Temperature, Salinity, Mixed Layer Depth...) that control the surface ocean pCO2 to first order. A variational data assimilation scheme efficiently incorporates new sets of pCO2 observations (trend and seasonal adjustments) and takes into account extreme events such as El Niño. The system then uses a supplied atmospheric CO2 concentration to calculate the air-sea flux according to a selectable exchange parameterization (e.g. Wanninkhof 1992, Nightingale 2000, Takahashi 2009). The results obtained with the OCVR system are illustrated as global maps (see Fig. 2) and ocean time series (see Fig. 3).
Figure 2: Global pCO2sw maps from OCVR. January simulations for 1990, 2000 and 2009.
Feely, R.A., C.L. Sabine, T. Takahashi, and R. Wanninkhof 2001, Uptake and storage of carbon dioxide in the oceans: The global CO2 survey. Oceanography, 14(4), 18-32.
Nightingale, P.D., et al. 2000. In situ evaluation of air-sea gas exchange parameterizations using novel conservative and volatile tracers. Glob. Biogeochem Cycles, 14, 373-387.
Sabine, C.L., et al. 2004. The oceanic sink for anthropogenic CO2. Science, 305 (5682), 367-371.
Takahashi, T., et al. 2009, Corrigendum to "Climatological mean and decadal change in surface ocean pCO2, and net sea-air CO2 flux over the global oceans" Deep Sea Res. II, 56, 554-577.
Wanninkhof, R., 1992. Relationship between wind speed and gas exchange. J. Geophys. Res. 97, 7373-7382.
The transport model is an essential component of the CCDAS that relates the surface fluxes to the atmospheric CO2 concentrations.
It follows the configuration of the LMDz General Circulation Model (Hourdin et al., 2006) as implemented in the coupled climate simulations for the IPCC AR4 report (IPCC, 2007). The dynamical part of the GCM is based on a finite-difference formulation of the primitive equations solved on a 3D Eulerian grid with a typical horizontal resolution of 3.75° (longitude) x 2.5° (latitude) and 19 sigma-pressure layers up to 3 hPa. This corresponds to a vertical resolution of about 300-500 m in the planetary boundary layer (first level at 70 m height) and of about 2 km at the tropopause (with 7-9 levels located in the stratosphere). The calculated winds (u and v) are relaxed towards ECMWF analyzed meteorology with a relaxation time of 2.5 h (nudging) in order to account realistically for large-scale advection (Hourdin et al., 2000). The advection of tracers is calculated with the finite-volume, second-order scheme described by Hourdin and Armengaud (1999). Deep convection is parameterized according to the scheme of Tiedtke (1989), and the turbulent mixing in the planetary boundary layer is based on a local second-order closure formalism.
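The nudging of the winds can be pictured as a Newtonian relaxation term added to the wind tendency. The sketch below applies one explicit relaxation step to a single wind value. The explicit Euler discretisation, the function name and the sample values are assumptions for illustration only; the text does not say how LMDz discretises this term.

```python
def nudge_wind(u_model, u_analysis, dt_hours, tau_hours=2.5):
    """One explicit relaxation step towards the analysed wind:
    u <- u + (dt / tau) * (u_analysis - u).

    tau_hours is the relaxation time (2.5 h in the text); dt_hours is a
    hypothetical model time step and should not exceed tau_hours.
    """
    return u_model + (dt_hours / tau_hours) * (u_analysis - u_model)

# Hypothetical example: a 10 m/s model wind pulled towards a 20 m/s analysis.
u_next = nudge_wind(10.0, 20.0, dt_hours=0.5)
```

A shorter relaxation time ties the model more tightly to the analysed meteorology, while a longer one leaves more freedom to the model's own dynamics.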
Representation of the vertical and horizontal grids of LMDZ. Colors represent simulated surface and air temperatures; arrows represent wind direction.
© L. Fairhead (IPSL/LMD)
The model has been widely used to model climate (IPCC, 2001, 2007) as well as the transport and chemistry of gases and particles (Peylin et al., 2005; Bousquet et al., 2005, 2006; Rivier et al., 2006; Hauglustaine et al., 2004; Folberth et al., 2005). LMDz-INCA has been running operationally since 2006 and calculates a 5-day forecast of the main atmospheric trace gases every day (http://www-lsceinca.cea.fr). For greenhouse gases, we have been using LMDz at LSCE since 2003 in an offline mode named LMDZt (with pre-calculated transport fields), which saves computational time compared to the full GCM.
A few main changes have occurred since the first use of LMDz in the CCDAS. The model used for the first CARBONES re-analysis is based on LMDz version 3. Recent and ongoing developments at the LMD laboratory have led to an improved version, LMDz version 4, with the following changes:
The updated version has led to a significant improvement in the simulated atmospheric CO2 concentrations. An internal report, which can be requested from Philippe Bousquet at LSCE (email@example.com), compares the two LMDz versions and two other state-of-the-art global transport models (TM3 and TM5) against atmospheric CO2 data using a standard set of fluxes. Model skill for the simulation of i) large-scale atmospheric features (TRANSCOM-I experiment), ii) upper-air CO2 data, iii) vertical profiles within the planetary boundary layer (PBL), and iv) continuous surface CO2 records is on average improved with the new LMDz version. Within the continental PBL, the performance of the standard LMDZ was limited by excessive diffusion, which smoothed concentration gradients too efficiently. The new version 4, with an improved PBL scheme, represents the synoptic variations of CO2 concentrations as well as other state-of-the-art models such as TM5. This improvement is crucial for properly assimilating all surface CO2 concentration data within the CCDAS.
The model is freely available under a free-software license: CECILL. It can be accessed through: lmdz.lmd.jussieu.fr/
Bousquet P., Hauglustaine D.A., Peylin P., et al. (2005), Two decades of OH variability as inferred by an inversion of atmospheric transport and chemistry of methyl chloroform, Atmospheric Chemistry and Physics, 5: 2635-2656
Bousquet P., Ciais P., Miller J.B., et al. (2006), Contribution of anthropogenic and natural sources to atmospheric methane variability, Nature, 443(7110): 439-443
Folberth G., Hauglustaine D.A., Ciais P., et al. (2005), On the role of atmospheric chemistry in the global CO2 budget, Geophysical Research Letters, 32(8): L08801
Hauglustaine D.A., Hourdin F., Jourdain L., et al. (2004), Interactive chemistry in the Laboratoire de Meteorologie Dynamique general circulation model: Description and background tropospheric chemistry evaluation, Journal of Geophysical Research - Atmosphere, 109(D4): D04314
Hourdin F., Armengaud A. (1999), The use of finite-volume methods for atmospheric advection of trace species. Part I: Test of various formulations in a general circulation model, Monthly Weather Review, 127(5): 822-837
Hourdin, F. D., F. Couvreux, and L. Menut (2002), Parameterization of the dry convective boundary layer based on a mass flux representation of thermals, J Atmos Sci, 59, 1105-1123
Hourdin, F., et al. (2006), The LMDZ4 general circulation model: climate performance and sensitivity to parametrized physics with emphasis on tropical convection, Climate Dynamics, 27, 787-813
IPCC (2001). Climate Change 2001: The Scientific Basis. Contribution of Working Group I to the Third Assessment Report of the Intergovernmental Panel on Climate Change [Houghton, J.T., Y. Ding, D.J. Griggs, M. Noguer, P.J. van der Linden, X. Dai, K. Maskell, and C.A. Johnson (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 881pp
IPCC (2007). Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change [Solomon, S., D. Qin, M. Manning (eds.)]
Peylin P, Rayner PJ, Bousquet P, et al. (2005), Daily CO2 flux estimates over Europe from continuous atmospheric measurements: 1, inverse methodology, Atmospheric Chemistry and Physics, 5: 3173-3186
Rivier L., Ciais P., Hauglustaine D.A., et al. (2006), Evaluation of SF6, C2Cl4, and CO to approximate fossil fuel CO2 in the Northern Hemisphere using a chemistry transport model, Journal of Geophysical Research - Atmosphere, 111(D16): D16311
Tiedtke M. (1989), A comprehensive mass flux scheme for cumulus parameterization in large-scale models, Monthly Weather Review, 117(8): 1779-1800
MODIS collection 5 surface reflectance data (2000-2010) in the red (R) and near-infrared (NIR) bands at 5 km resolution are used to optimise the phenology-related parameters of ORCHIDEE.
The data have been processed by LSCE at global scale to correct for directional effects, following Vermote et al. (2009). From these data the Normalised Difference Vegetation Index (NDVI) is calculated using the following equation:
NDVI = (NIR - R) / (NIR + R)
The NDVI is used as a measure of vegetation "greenness", and can thus be used as a proxy for the phenological cycle of the vegetation (Tucker, 1979).
The data are then averaged at the spatial resolution of the model grid and are interpolated to a daily timeseries by fitting a polynomial of order 3 to the 11 valid values that were closest to (and including) the observation. If there is a gap in the observations of more than 15 days there is no interpolation.
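The temporal interpolation can be sketched as a local cubic fit. The version below is one possible reading of the procedure, not the processing code actually used: for each day it fits a polynomial of order 3 to the 11 valid observations nearest in time and evaluates it on that day, and it leaves a day empty when it falls inside an observation gap longer than 15 days. The function name, the NaN convention for empty days and the assumption of sorted observation dates are all hypothetical.

```python
import numpy as np

def daily_ndvi(obs_days, obs_ndvi, n_fit=11, order=3, max_gap=15):
    """Interpolate valid NDVI observations to a daily time series.

    obs_days : sorted day numbers of the valid observations
    obs_ndvi : NDVI values on those days
    Returns (days, values); values are NaN inside gaps longer than max_gap.
    """
    obs_days = np.asarray(obs_days, dtype=float)
    obs_ndvi = np.asarray(obs_ndvi, dtype=float)
    days = np.arange(int(obs_days[0]), int(obs_days[-1]) + 1)
    values = np.full(days.shape, np.nan)
    for i, day in enumerate(days):
        # Index of the first observation on or after this day.
        nxt = np.searchsorted(obs_days, day)
        in_gap = obs_days[nxt] != day and obs_days[nxt] - obs_days[nxt - 1] > max_gap
        if in_gap:
            continue  # no interpolation across long gaps
        # The n_fit observations closest in time (including the day itself).
        nearest = np.argsort(np.abs(obs_days - day))[:n_fit]
        # Centre the time axis on the target day for better conditioning.
        coeffs = np.polyfit(obs_days[nearest] - day, obs_ndvi[nearest], order)
        values[i] = np.polyval(coeffs, 0.0)
    return days, values
```

Centring the time axis on the target day means the fitted value is simply the constant term of the polynomial, which keeps the fit well conditioned for large day numbers.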
The NDVI data have been assessed using the MODIS quality flags. A value is used only if:
In addition, a cloud test and a snow test are performed, and pixels with solar zenith angles above 75° and viewing zenith angles above 60° are removed. The data have a noise range of ~0.025 to 0.03, with the highest values in densely forested areas (Vermote et al., 2009).
Previous studies have shown the MODIS LAI/fAPAR product to suffer from poor temporal consistency, due to contamination by clouds and residual atmospheric effects, and from inaccuracies in the absolute values of LAI (Yang et al., 2006; Weiss et al., 2007; Garrigues et al., 2008). Because of this lack of confidence, a simple linear relationship between the modeled fAPAR and the satellite-derived NDVI observations is used instead, based on studies such as Knyazikhin et al. (1998). The prior error on the fAPAR-derived observations was set to the root mean squared error (RMSE) between these "observations" and the prior simulation of fAPAR.
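The two operations described here, a linear mapping between NDVI and fAPAR and an RMSE-based prior error, can be sketched as follows. The slope and intercept are hypothetical placeholders, since the coefficients of the relationship are not given in the text, and the function names are illustrative only.

```python
import numpy as np

def ndvi_to_fapar(ndvi, slope, intercept):
    """Linear NDVI-to-fAPAR mapping; slope and intercept are placeholders."""
    return slope * np.asarray(ndvi, dtype=float) + intercept

def prior_observation_error(fapar_obs, fapar_prior):
    """Root mean squared error between the fAPAR "observations" and the
    prior model simulation of fAPAR."""
    diff = np.asarray(fapar_obs, dtype=float) - np.asarray(fapar_prior, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))

# Hypothetical numbers, for illustration only.
fapar_obs = ndvi_to_fapar([0.25, 0.5, 0.75], slope=1.0, intercept=0.0)
sigma_obs = prior_observation_error(fapar_obs, [0.35, 0.4, 0.85])
```

Setting the error to the prior RMSE makes the observation weight depend on how far the prior model already is from the data.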
Tucker, C. J. (1979). Red and photographic infrared linear combinations for monitoring vegetation. Remote Sensing of Environment, 8, 127-150.
Vermote, E., C.O. Justice and F-M Breon (2009), Towards a generalized approach for correction of the BRDF effect in MODIS directional reflectances, IEEE Transactions on Geoscience and Remote Sensing, 47, 3, 898-908.
We use a set of atmospheric CO2 concentration data from selected stations around the world. This data stream integrates the fluxes over large areas and provides the basis for correctly optimizing the large-scale global patterns of carbon fluxes. Figure 1 below displays the location of the stations.
Figure 1: Location of the stations used in the current version of the CCDAS
We use the same data as Chevallier et al. (2010). These data come from three large databases: the NOAA Earth System Research Laboratory (ESRL) archive, the CarboEurope IP project, and the World Data Centre for Greenhouse Gases (WDCGG) of the World Meteorological Organization (WMO) Global Atmosphere Watch Programme. The three databases include both in situ measurements made by automated quasi-continuous analyzers and air samples collected in flasks and later analysed at central facilities. The data treatments are fully discussed in Chevallier et al. (2010).
The atmospheric CO2 fluxes and CO2 concentrations are linked by the LMDz model (Hourdin et al., 2006) in the optimisation process. The errors in the LMDz model are included in the observational errors following Tarantola (2005). The treatment of these errors follows that of Chevallier et al. (2010). Values range from 0.37 ppm to about 30 ppm or more depending on the temporal resolution of the observations. The large values for some observations compensate for the absence of explicit correlations in the assigned transport model errors for temporally dense data.
The data are available for most stations under an open data policy with a specific fair use agreement.
Most data used for this particular project come from:
Chevallier, F., et al., CO2 surface fluxes at grid point scale estimated from a global 21 year reanalysis of atmospheric measurements, Journal Of Geophysical Research, 115, D21307, doi:10.1029/2010JD013887, 2010
Hourdin, F. et al., The LMDZ4 general circulation model: climate performance and sensitivity to parametrized physics with emphasis on tropical convection, Climate Dynamics, 27(7), 787-813, 2006.
Tarantola, A., Inverse problem theory and methods for model parameters estimation, Society for Industrial and Applied Mathematics, Philadelphia, ISBN 0-89871-572-5, 2005
Eddy covariance flux measurements for Net Carbon Exchange and for Latent Heat flux from the global network of observation sites are used to constrain ecosystem physiology and fast processes from the synoptic variations to the seasonality of fluxes in ORCHIDEE. Note that we do not consider the sensible heat flux measurements, as the overall objectives of CARBONES concern the carbon balance and not the energy balance.
We use harmonized, quality-checked and gap-filled data (LEVEL4) from a new global synthesis called the LaThuile dataset. These new data are being made available during the course of the project. For other sites, the use of the data is negotiated with the PIs. This dataset forms a unique collection of 600 site-years of online hourly measurements of CO2, water vapour, and heat fluxes. Figure 1 shows the location of the different sites. To avoid dealing with the large error correlations in both the model and the measurements, we use daily mean values of Net Ecosystem Exchange and Latent Heat flux in the CCDAS. Note that days with less than 80% of the half-hourly data are left out of the assimilation.
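The daily averaging with the 80% coverage rule can be sketched as below. This is a minimal illustration under stated assumptions: missing half-hourly values are marked as NaN, the series covers whole days of 48 values each, and rejected days are returned as NaN. The function name and these conventions are hypothetical.

```python
import numpy as np

def daily_means(half_hourly, min_fraction=0.8, per_day=48):
    """Daily mean of a half-hourly flux series.

    Days with less than min_fraction of valid (non-NaN) half-hourly
    values are set to NaN so that they can be left out of the assimilation.
    """
    values = np.asarray(half_hourly, dtype=float).reshape(-1, per_day)
    valid = ~np.isnan(values)
    counts = valid.sum(axis=1)                       # valid values per day
    sums = np.where(valid, values, 0.0).sum(axis=1)  # sum of valid values
    means = np.full(values.shape[0], np.nan)
    keep = counts >= min_fraction * per_day
    means[keep] = sums[keep] / counts[keep]
    return means

# Hypothetical two-day series: day 1 has 43 of 48 values, day 2 only 28.
series = np.ones(96)
series[:5] = np.nan
series[48:68] = np.nan
nee_daily = daily_means(series)
```

Here the first day is kept (about 90% coverage) and the second is rejected (about 58% coverage).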
Figure 1: Location of the different FLUXNET sites that are available in the FLUXNET database
Eddy covariance measurement errors consist of a random and a systematic component. The random error can be evaluated by exploiting the high temporal density of the dataset, interpreting observations made under the same conditions as repeated measurements. Errors are largely Gaussian distributed, with standard deviations that increase with the flux magnitude. Autocorrelation of fluxes is usually low, with a correlation coefficient below 0.6 at a lag of 0.5 h (Lasslop et al. 2008). Despite this low autocorrelation, model parameter optimizations with flux data have shown that using only every second or every third data point is reasonable, because the low autocorrelation could be due to the filling of data gaps (Lasslop et al. 2008).
Systematic errors in eddy covariance measurements are caused by advection, low turbulence or variable footprints of the measurement station during day- and night-time. These systematic errors dominate the overall uncertainty of annual total carbon fluxes while uncertainties from the random error component are negligible (Lasslop et al. 2010). Systematic errors can be quantified using datasets from multiple years and sites covering a range of plausible values of the wind velocity or by using only day-time observations (Lasslop et al. 2010). Nevertheless, overall relationships between estimated carbon fluxes.
In CARBONES, data authorized under the Free-Fair-Use policy have been used. As documented at www.fluxdata.org, the data available in this database (a subset of the LaThuile dataset) have been furnished by individual scientists who encourage their use under an open data policy that emphasizes the free and open exchange of scientific information.
The data are made freely available to the public and the scientific community in the belief that their wide dissemination will lead to greater understanding and new scientific insights and that global scientific problems require international cooperation.
Open access means that data are freely distributed without charge; there may be charges for the cost of reproduction and delivery when access is not web-based. Data download is unrestricted and requires only a free registration needed for web security reasons.
The FLUXNET participants that decided to share these data openly rely on the ethics and integrity of the users to assure that the data providers and FLUXNET receive fair credit for their work through inclusion of the text provided below in the acknowledgment.
The data users must send accepted papers or links to them to the fluxdata.org staff and PIs of the sites used in the paper. It is also recommended to contact the site PIs prior to publication to prevent potential misuse or misinterpretation of the data; if the work is based on only a few sites, this contact is strongly recommended.
Downloaded data cannot be redistributed to others and must not be redistributed via other websites, databases or any other storage system to prevent circulation of different versions of the datasets.
The following acknowledgment text has to be used with publications:
This work used eddy covariance data acquired by the FLUXNET community and in particular by the following networks: AmeriFlux (U.S. Department of Energy, Biological and Environmental Research, Terrestrial Carbon Program (DE-FG02-04ER63917 and DE-FG02-04ER63911)), AfriFlux, AsiaFlux, CarboAfrica, CarboEuropeIP, CarboItaly, CarboMont, ChinaFlux, Fluxnet-Canada (supported by CFCAS, NSERC, BIOCAP, Environment Canada, and NRCan), GreenGrass, KoFlux, LBA, NECC, OzFlux, TCOS-Siberia, USCCC. We acknowledge the financial support to the eddy covariance data harmonization provided by CarboEuropeIP, FAO-GTOS-TCO, iLEAPS, Max Planck Institute for Biogeochemistry, National Science Foundation, University of Tuscia, Université Laval and Environment Canada and US Department of Energy and the database development and technical support from Berkeley Water Center, Lawrence Berkeley National Laboratory, Microsoft Research eScience, Oak Ridge National Laboratory, University of California-Berkeley, University of Virginia.
Lasslop, G., M. Reichstein, J. Kattge, and D. Papale. 2008. Influences of observation errors in eddy flux data on inverse model parameter estimation. Biogeosciences, 5, 1311-1324.
Lasslop, G., M. Reichstein, D. Papale, A. D. Richardson, A. Arneth, A. Barr, P. Stoy, and G. Wohlfahrt. 2010. Separation of net ecosystem exchange into assimilation and respiration using a light response curve approach: critical issues and global evaluation. Global Change Biology, 16, 187-208.
Papale, D., M. Reichstein, E. Canfora, M. Aubinet, C. Bernhofer, B. Longdoz, W. Kutsch, S. Rambal, R. Valentini, T. Vesala, and D. Yakir. 2006. Towards a more harmonized processing of eddy covariance CO2 fluxes: algorithms and uncertainty estimation. Biogeosciences Discussions, 3, 961-992.
Reichstein, M., E. Falge, D. Baldocchi, D. Papale, R. Valentini, M. Aubinet, P. Berbigier, C. Bernhofer, N. Buchmann, T. Gilmanov, A. Granier, T. Grünwald, K. Havránková, D. Janous, A. Knohl, T. Laurela, A. Lohila, D. Loustau, G. Matteucci, T. Meyers, F. Miglietta, J.-M. Ourcival, S. Rambal, E. Rotenberg, M. Sanz, G. Seufert, F. Vaccari, T. Vesala, and D. Yakir. 2005. On the separation of net ecosystem exchange into assimilation and ecosystem respiration: review and improved algorithm. Global Change Biology, 11, 1424-1439.
Observed partial pressure of CO2 (pCO2) at the ocean surface is used to build the OCVR statistical ocean model that calculates air-sea carbon fluxes. The pCO2 data are directly assimilated in OCVR to produce global time-varying maps of pCO2, which are then used to calculate the net carbon flux.
We use two types of data streams:
Both data sets come from http://cdiac.ornl.gov/oceans/LDEO_Underway_Database/
The figure below illustrates the spatial coverage of these pCO2 data.
The approach used to derive the pCO2sw over the 20 year period involves several steps:
In a third step, we further adjust pCO2sw for the inter-annual variations using the raw pCO2sw data from the Takahashi database. A detailed description of the processes can be found in the "CCDAS description" deliverable.
As soon as they are available, pCO2 data from SOCAT will replace the Takahashi pCO2 dataset. These will include data from more than 1250 cruises from 1968 to 2007, with approximately 4.5 million measurements of various carbon parameters.
SOCAT brings together, in a common format, all publicly available fCO2 data for the surface oceans. (The fugacity of carbon dioxide, or fCO2, is the partial pressure of CO2 (pCO2) corrected for the non-ideal behaviour of the gas.) The data set will serve as a foundation upon which the community will continue to build in the future, based on agreed data and metadata formats and standard first-level quality-control procedures. In the near future, two distinct data products will be made available in SOCAT:
SOCAT contains about 50% more data than the Takahashi data set, which is why SOCAT-based air-sea CO2 flux estimates are expected to have smaller uncertainties. For the last version of OCVR (the ocean component of the CCDAS) we will use the fCO2 raw data set to train the neural network. The gridded product will be used for quality control.
The pCO2 data are freely available and accessible under: http://cdiac.ornl.gov/oceans/LDEO_Underway_Database/
The IERA meteorological data are used as boundary conditions for the ORCHIDEE land surface model and to guide the atmospheric transport model LMDz towards the actual weather. The variables have been extracted at 3-hourly temporal and 0.7° x 0.7° spatial resolution from the archive of the ECMWF interim reanalysis (ERA-I, Berrisford et al. 2009) over the period 1989-2009. We have chosen this dataset for its consistency over the 20-yr target period.
The uncertainties associated with the IERA data set are not negligible, especially for the precipitation fields. For instance, IERA shows large differences from the GPCC (Global Precipitation Climatology Centre) product (Schneider et al., 2008) over central Africa (Dee et al., 2011). These differences are indicative of higher uncertainties where radiosonde coverage is sparse.
Details on the known quality issues with IERA data can be found here: http://www.ecmwf.int/research/era/do/get/index/QualityIssues
Figure: Average precipitation rates (mm day-1) for the 21-year period 1989-2009 from (a) GPCC and (b) ERA-Interim, and decadal change in precipitation from (c) GPCC and (d) ERA-Interim, defined as the difference between the 2000-2009 average and the 1990-1999 average for each dataset (from Dee et al., 2011)
The data streams are freely available under: http://www.ecmwf.int/research/
Berrisford, P., et al. (2009), The ERA-Interim archive. ERA Report Series no. 1, 16 pp, Eur. Cent. for Medium-Range Weather Forecasting, Reading, U.K.
Dee, D. P., with 35 co-authors, 2011: The ERA-Interim reanalysis: configuration and performance of the data assimilation system. Quart. J. R. Meteorol. Soc., 137, 553-597
Schneider U, Fuchs T, Meyer-Christoffer A, Rudolf B. 2008. 'Global precipitation analysis products of the GPCC'. Deutscher Wetterdienst: Offenbach, Germany. Available at http://gpcc.dwd.de.
Specific fossil fuel emissions were constructed by the USTUTT/IER partner of CARBONES. These emissions are directly used in the CCDAS as imposed surface carbon fluxes.
For the global spatial distribution, in particular for a reanalysis, EDGAR 4.2 is assumed to deliver the most up-to-date information at the highest spatial resolution, 0.1° x 0.1°. Therefore, the gridded EDGAR 4.2 CO2 emissions are used as input for the temporal distribution over the globe. Because the EDGAR 4.2 emissions are gridded and resolved by sector, country- and sector-specific time profiles derived by IER can be applied to them. The basic approach to the temporal disaggregation is to apply three different types of time profiles, as indicated in equation 1.
Eq. 1: Principal approach of the temporal distribution
The method combines annual EDGAR 4.2 emissions with monthly profiles (sector, country, month and year specific), daily profiles (country, sector and day specific) and hourly profiles (country, day, hour and time zone specific) (see Eq. 1).
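The combination of the three profile types can be illustrated with a multiplicative sketch for a single grid cell and sector. This is one plausible reading of the approach, not a reproduction of Eq. 1: it assumes each profile is a set of fractional shares (months of the year, days of each month, hours of the day) that sum to one, so that the annual total is conserved. The function names and the flat test profiles are hypothetical.

```python
import calendar

def hourly_emissions(annual_total, monthly, daily, hourly, year):
    """Distribute an annual emission total over every hour of a year.

    monthly : 12 shares of the annual total (sum to 1)
    daily   : for each month, the shares of each day (each list sums to 1)
    hourly  : 24 shares of the daily total (sum to 1)
    Returns a dict keyed by (month, day, hour).
    """
    out = {}
    for month in range(1, 13):
        n_days = calendar.monthrange(year, month)[1]
        for day in range(1, n_days + 1):
            for hour in range(24):
                out[(month, day, hour)] = (
                    annual_total
                    * monthly[month - 1]
                    * daily[month - 1][day - 1]
                    * hourly[hour]
                )
    return out

def flat_profiles(year):
    """Uniform placeholder profiles, used only to check conservation."""
    monthly = [1.0 / 12.0] * 12
    daily = []
    for month in range(1, 13):
        n_days = calendar.monthrange(year, month)[1]
        daily.append([1.0 / n_days] * n_days)
    hourly = [1.0 / 24.0] * 24
    return monthly, daily, hourly

monthly, daily, hourly = flat_profiles(2008)
emissions = hourly_emissions(1000.0, monthly, daily, hourly, 2008)
```

In practice the shares would vary by country, sector and time zone as described above; the flat profiles only demonstrate that the hourly values add back up to the annual total.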
The time profiles are derived from statistical data sets as well as correlations. Examples of statistical data sets are Eurostat, ENTSO-E and the UN monthly bulletin. Currently, the temporal profiles for the globe are derived from data sets over Europe, extrapolated using information on climate zone, average monthly temperature for the seasonal cycles, and similarity in socio-economic parameters such as population and GDP.
The most important correlation is the temperature dependency of the energy consumption, that is, of the fuel use for energy supply, industrial processes and, in particular, residential heating. The influence of the temperature can be derived by correlating the time segment with the corresponding value. For example, the derivation of temperature-dependent monthly time profiles for the energy supply (power plants) is illustrated in figure 1 (left) for France. Figure 1a shows the correlation between the ambient temperature and the fossil fuel consumption in France, and figure 1b shows the hourly shares of residential fuel consumption for Germany.
Fig. 1: Temperature dependent derivation of temporal profiles (left yearly profile power plants in France, right daily profile for households in Germany)
The existing global spatially and temporally resolved emission models do not consider emission heights. The application of effective emission heights is therefore a major improvement of fossil fuel emission modelling at the global scale. Table 1 shows the effective emission heights considered, derived using information from EMEP and our own assumptions. The fossil fuel emission heights are divided into: i) cruise, climb and descent emissions from aircraft, and part of the emissions from combustion processes in the energy and manufacturing industries, above 781 m; ii) the main part of the emissions caused by combustion in the energy and manufacturing industries as well as households, between 92 m and 781 m; and iii) surface emissions such as transport, residential sectors and industrial processes, emitted below 92 m.
Tab. 1: Derived effective emission heights considered in the second delivery in the updated sector structure
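As an illustration of the three height classes, the following sketch assigns an effective emission height to a class using the 92 m and 781 m boundaries quoted above. The class labels and the treatment of values exactly on a boundary are assumptions of this sketch, since the text does not specify them.

```python
# Class boundaries in metres, taken from the text.
SURFACE_TOP_M = 92.0
MID_TOP_M = 781.0

def height_class(effective_height_m: float) -> str:
    """Return the emission height class for an effective emission height."""
    if effective_height_m < 0:
        raise ValueError("effective emission height must be non-negative")
    if effective_height_m < SURFACE_TOP_M:
        return "surface"  # transport, residential sectors, industrial processes
    if effective_height_m <= MID_TOP_M:
        return "mid"      # main part of energy/manufacturing combustion, households
    return "high"         # aircraft and part of energy/manufacturing combustion

def emissions_per_class(sources):
    """Sum emissions per class for (effective_height_m, emission) pairs."""
    totals = {"surface": 0.0, "mid": 0.0, "high": 0.0}
    for height_m, emission in sources:
        totals[height_class(height_m)] += emission
    return totals
```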
The global product from the second delivery, at 1 degree resolution with hourly time profiles, is shown in Figure 2. The figure shows the spatial and temporal distribution of fossil fuel emissions on a 1° x 1° grid for the different effective emission heights.
Fig. 2: Spatial distribution of the hourly fossil fuel emissions of the second delivery for three emission height classes, for January 16, 2008 at 00 UTC.
An uncertainty analysis of the EDGAR emissions is performed by JRC. The uncertainty of the spatial resolution will be analysed on the basis of model intercomparisons over Europe. The uncertainty of the temporal profiles can be assessed by comparing the model prediction with statistical data (for example on fuel use) that were not used to derive the temperature correlations.
The fossil fuel data derived in CARBONES can be distributed freely under the conditions listed below. The term data provider used hereafter refers to USTUTT/IER.
If users consider publishing work in which the CARBONES-derived fossil fuel inventory plays a significant role, the data provider reserves the right to ask for co-authorship that acknowledges the work performed by the data provider in preparing the emission inventories. This is important for the track record and to show funding agencies that the CO2 inventories are used to advance research. The data providers will gladly provide input for such a paper and will fulfil their tasks as co-authors. If the role is not significant, then proper reference and acknowledgement are sufficient. The data provider expects and trusts that, in the spirit of good cooperation, the data user will make a balanced judgment about whether the fossil fuel emissions prepared by the data provider played a significant role or not.
ENTSO-E. 2012. Hourly load values of all countries for a specific month. https://www.entsoe.eu/resources/data-portal/consumption/.
GISCO. 2010. Geographic Information System of the European Commission (EUROSTAT) (accessed 15 February 2012).
JRC - Climate Change Unit. EUROPA - EDGAR Methodology. http://edgar.jrc.ec.europa.eu/methodology.php#12sou (accessed 18 July 2011).
Peel, M. C., B. L. Finlayson, and T. A. McMahon. 2007. Updated world map of the Köppen-Geiger climate classification. Hydrology and Earth System Sciences, 11, 1633-1644.
Biomass burning emissions data from the Global Fire Emissions Database (GFED, version 3) are used. Monthly CO2 emission fields at 0.5° x 0.5° resolution over the 1997-2010 period are considered. Figure 1 illustrates the mean annual fire emissions over the globe from this GFED release.
Figure 1: Annual carbon emissions (as g C m-2 year-1), averaged over 1997-2010 from GFED database
The GFED data are broken down into six sectors: deforestation, peat fires, savanna fires, agriculture, forest fires, and woodland. With these data, we generated fluxes of CO2 emissions relevant for typical "burning - regrowth" processes, as detailed hereafter. These fluxes are then prescribed to the CCDAS over the period 1989-2009. For each year before 1997, we use the mean annual field derived from the fluxes over the 1997-2010 period. To account for fundamental differences between the six initial categories, we grouped the GFED emissions into three types and applied a specific treatment to each type:
The overall biomass burning flux considered in the CCDAS for the optimization process is the sum of the three fluxes as described above.
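The gap-filling rule for years before 1997 and the final summation of the three fluxes can be sketched as follows. This is a minimal illustration, not the project's processing code: each field is represented as a flat list of grid-cell values, and the function names are hypothetical.

```python
def mean_annual_field(fields_by_year):
    """Cell-wise mean over all years available in fields_by_year."""
    years = sorted(fields_by_year)
    n_cells = len(fields_by_year[years[0]])
    return [
        sum(fields_by_year[y][i] for y in years) / len(years)
        for i in range(n_cells)
    ]

def prescribed_field(year, fields_by_year, first_observed=1997):
    """Return the observed field, or the multi-year mean before first_observed."""
    if year < first_observed:
        return mean_annual_field(fields_by_year)
    return fields_by_year[year]

def total_flux(*components):
    """Cell-wise sum of the component fluxes (one list per flux type)."""
    return [sum(values) for values in zip(*components)]
```

For example, with fields for 1997-2010 stored in `fields_by_year`, `prescribed_field(1990, fields_by_year)` returns the 1997-2010 mean field, while `prescribed_field(2005, fields_by_year)` returns the 2005 field unchanged.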
The uncertainties in the derived biomass burning fluxes were difficult to assess. The GFED data do not provide uncertainties in the emissions. However, uncertainties in the burnt areas are given. We are currently working on the possibility to use this information and to derive an uncertainty on the biomass burning fluxes to be used in the final CCDAS version (i.e. with the optimization of the biomass emission for several regions).
The biomass burning data are freely available under: http://www.globalfiredata.org/
Houghton, R. A. (2003) Revised estimates of the annual net flux of carbon to the atmosphere from changes in land use and land management 1850-2000. Tellus 55B: 378-390.
The CARBONES forest biomass data consist of a series of gridded global maps of forest area, growing stock, and aboveground, belowground and total biomass and carbon for the period 1950-2010, as well as 5-year changes in these variables. They are based on a database of forest area and growing stock, compiled from a series of international assessments by FAO, MCPFE (now Forest Europe) and UNECE. Data from the different assessments are harmonised as far as possible to reflect both forest area and other wooded land, and to be comparable between countries and assessments.
Data series per country on area and growing stock were checked for outliers. Values were linearly interpolated to create continuous time series. Time series were extended by holding the first or last known observation constant. Growing stocks were converted to aboveground, belowground and total biomass and carbon using default biomass expansion factors from the IPCC Good Practice Guidance (2006).
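The interpolation and extension rules above can be sketched as follows: linear interpolation between known observations and constant extension before the first and after the last observation. The function name and the example values are hypothetical.

```python
def fill_series(known, years):
    """Fill a yearly series from sparse observations.

    known: dict mapping observation year -> value
    years: iterable of years for which a value is wanted
    """
    obs_years = sorted(known)
    first, last = obs_years[0], obs_years[-1]
    filled = {}
    for year in years:
        if year <= first:
            filled[year] = known[first]   # constant extension backwards
        elif year >= last:
            filled[year] = known[last]    # constant extension forwards
        else:
            # Linear interpolation between the two bracketing observations.
            for lo, hi in zip(obs_years, obs_years[1:]):
                if lo <= year <= hi:
                    weight = (year - lo) / (hi - lo)
                    filled[year] = known[lo] + weight * (known[hi] - known[lo])
                    break
    return filled

# Hypothetical growing stock observations for one country.
series = fill_series({1990: 100.0, 2000: 200.0}, range(1985, 2006, 5))
```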
To downscale the statistical data to a finer scale, two remote-sensing based map products have been combined: For Europe (including Turkey and the European part of Russia) a forest map of 1km resolution (EFI 2011) was used and the Global Land Cover 2000 (GLC2000) dataset was applied for all other continents (JRC 2003). Finally, the combined forest raster map was projected into Mollweide projection, an equal area projection that is suitable for global analysis.
Polygon data with historic country boundaries were used to link the forest statistics to the countries. Five different polygon datasets have been compiled that consider the major country border changes over time worldwide.
To downscale the statistical/interpolated forest area values, a ratio was calculated between these values and the total forest area in the input forest raster map, for each country and year. This ratio was used to scale the raster values in the forest map so that the forest area per country in the output map would fit with the statistical/interpolated forest area value for that country.
Statistical/interpolated values of other forest parameters were divided by the interpolated forest area values to derive values per hectare by country and year. These values were then multiplied with the previously calculated forest area maps at 1km scale for each year. The resulting raster maps show the respective forest parameter at 1km level. All output rasters were then aggregated to the common 1-degree grid by summing up raster cell values for each grid cell. All global outputs are projected in Mollweide projection, Datum WGS84.
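The two downscaling steps can be sketched for a single year as follows. This is a simplified illustration under stated assumptions: the raster is a flat list of cells, each cell belongs to exactly one country, and all function and variable names are hypothetical.

```python
def scale_forest_area(cell_area_ha, cell_country, stat_area_ha):
    """Rescale per-cell forest area so that country totals match the statistics."""
    raster_total = {}
    for area, country in zip(cell_area_ha, cell_country):
        raster_total[country] = raster_total.get(country, 0.0) + area
    # Ratio between the statistical value and the raster total, per country.
    ratio = {
        country: stat_area_ha[country] / total
        for country, total in raster_total.items()
        if total > 0
    }
    # Countries without any forest in the raster keep zero area.
    return [a * ratio.get(c, 0.0) for a, c in zip(cell_area_ha, cell_country)]

def distribute_parameter(scaled_area_ha, cell_country, stat_value, stat_area_ha):
    """Spread a country-level parameter over cells via its per-hectare value."""
    return [
        area * stat_value[country] / stat_area_ha[country]
        for area, country in zip(scaled_area_ha, cell_country)
    ]

# Hypothetical example: three cells, two countries.
areas = scale_forest_area([10.0, 30.0, 20.0], ["A", "A", "B"], {"A": 80.0, "B": 10.0})
stock = distribute_parameter(areas, ["A", "A", "B"], {"A": 800.0, "B": 50.0}, {"A": 80.0, "B": 10.0})
```

By construction, the scaled cell areas of each country sum to the statistical forest area, and the distributed parameter sums to the country-level statistical value.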
Figure 1: Mean aboveground biomass for a 5-year period centred on 2005.
Within the CCDAS system, the forest biomass data are primarily used as validation. However, they are suitable as well to be used in the assimilation phase.
The underlying data are of varying quality, especially for the earlier assessments. Additional processing steps such as interpolation and gridding add uncertainty. We have attached uncertainty classes based on 1) basic uncertainty in the data, 2) methodological improvements over time and 3) interpolation.
This product has been described in more detail in Hengeveld et al. (in preparation).
The maps and the database can be downloaded at the CARBONES portal or at:
http://opendap.cgi-systems.nl/thredds/catalog/projecten/EuropeanForest/carbones/catalog.html. This product can be used free of charge, under the condition that it is properly referenced (Hengeveld et al. in prep).
Additional information on the forest biomass dataset can be found in a powerpoint presentation, available to download via this link.
This study was part of the CARBONES project (‘30-year re-analysis of carbon fluxes and pools over Europe and the Globe’, EU Contract No. 242316) and the GHG-Europe project (‘Greenhouse gas management in European land use systems’, EU Contract No. 244122).
Hengeveld, G.M.H., K. Gunia, M. Didion, S. Zudin, A.P.P.M. Clerkx, M.J. Schelhaas. Gridded maps as derived from a compilation of global forest biomass estimates 1950-2010. To be submitted to Earth System Science Data.
EFI, 2011. Forest Map of Europe. European Forest Institute and European Commission, Joint Research Centre. URL: http://www.efi.int/portal/virtual_library/information_services/mapping_services/forest_map_of_europe/
IPCC, 2006. Default biomass conversion and expansion factors. IPCC Guidelines for National Greenhouse Gas Inventories, Volume 4 – Agriculture, Forestry and Other Land Use, Table 4.5. Intergovernmental Panel on Climate Change.
JRC, 2003. Global Land Cover 2000 database. European Commission, Joint Research
Several soil organic carbon data sets have been used and combined to evaluate the soil carbon estimated with ORCHIDEE in the CCDAS.
Both databases contain information on soil parameters to a depth of 1 metre (organic carbon, pH, water storage capacity, soil depth, cation exchange capacity of the soil and the clay fraction, total exchangeable nutrients, lime and gypsum contents, sodium exchange percentage, salinity, textural class and granulometry). In each soil unit up to 9 different soils are represented and their respective proportion is given.
WISE5min Version 1.0 (Batjes, 2006)
The spatial resolution of this database is 5 arc minutes. The profile is provided in 20 cm intervals to a depth of 1 m. Standard deviations are also given for this dataset.
The database draws information from WISE version 3.0, which holds 10,100 globally distributed soil profiles.
Figure 1: WISE5min version 1.0
HWSD Version 1.1 (Nachtergaele et al., 2009)
The spatial resolution of the database is 1 km (30 arc seconds by 30 arc seconds). The profile is divided in topsoil (0 - 30 cm) and subsoil (30 - 100 cm).
The database draws information from:
Figure 2: HWSD version 1.1. Extended inland white areas are greater than 80 kg m-2.
Generally, soil data are associated with a very high error due to the high heterogeneity of soils. Furthermore, the extrapolation of data for the generation of the spatial global dataset from measured soil profiles is merely based on the soil type and does not take into account climate, land cover or management.
The WISE5min dataset provides a standard deviation for all soils (Figure 3). It shows that most soils have an error of more than 10% of the reported SOC value. Jobbagy and Jackson (2000) even report an error of 21% for the first metre. A further extrapolation of the data to a depth of 2 m (the depth of ORCHIDEE simulation results) will increase the error to 24%.
Figure 3: Standard deviation of WISE5min in % of SOC in the grid.
The two datasets used in this project are freely available:
Batjes, N. ISRIC-WISE Derived Soil Properties on a 5 by 5 Arc-minutes Global Grid (ver. 1.0). Wageningen, The Netherlands, 2006.
Jobbagy, E. G., and R. B. Jackson. 'The Vertical Distribution of Soil Organic Carbon and Its Relation to Climate and Vegetation'. Ecological Applications, 10, no. 2 (2000): 423-436.
Nachtergaele, F.O., H. van Velthuizen, and L. Verelst. Harmonized World Soil Database Version 1.1, 2009.
Global spatial-temporal patterns of ecosystem carbon fluxes were estimated using different sources of observational data. These data-oriented estimates will be used as an independent dataset to evaluate gross carbon flux estimates from CARBONES.
Global spatial-temporal patterns of carbon and energy fluxes were created using a machine-learning algorithm that upscales site-level eddy covariance measurements to the globe based on satellite and meteorological observations (Jung et al. 2011). Using this algorithm, patterns of gross primary productivity, ecosystem respiration and latent and sensible heat at 0.5° x 0.5° spatial resolution and monthly time step (1982-2010) were produced (Fig. 1).
Figure 1: Maps of upscaled gross primary productivity for January and July 1982 (Jung et al. 2011)
Site-level estimates of gross primary productivity (GPP) and ecosystem respiration (Reco) were derived from the FLUXNET eddy covariance measurements according to the method of Lasslop et al. (2010). A model tree ensemble (MTE) was trained against these site-level GPP estimates, with satellite observations of FAPAR (fraction of absorbed photosynthetically active radiation) and meteorological data as explanatory variables. In a next step, the trained MTEs were applied to gridded patterns of these explanatory variables to derive global patterns of GPP and Reco (Jung et al. 2009, Jung et al. 2011).
Upscaled estimates of gross primary productivity and ecosystem respiration can be used to benchmark results of the CARBONES CCDAS. These datasets can be used to compare mean annual spatial patterns, the seasonality or relationships between carbon fluxes and climate patterns.
The main source of uncertainty in the MTE upscaling products originates from the representativeness of the FLUXNET eddy covariance station network. Some wide geographical regions are not represented by measurement stations (e.g. Africa, Siberia, South America, Tropical Asia). Nevertheless, the FLUXNET stations cover a wide range of climatological conditions. Hence, the environmental representativeness is better than the geographical representativeness. The extrapolation of carbon fluxes to environmental conditions that are not represented by the FLUXNET dataset mainly causes the uncertainty in flux estimates.
The uncertainty of upscaled fluxes was estimated based on the distribution of the predicted fluxes from the model tree ensemble (Jung et al. 2009). Additionally, the error of upscaled fluxes was estimated against eddy covariance site observations using cross-validation (Jung et al. 2011). Based on these validation studies, uncertain aspects and robust patterns of the MTE upscaled products were identified.
Uncertain aspects are trends and anomalies of GPP estimates as well as estimates of net ecosystem exchange and ecosystem respiration. The magnitude of anomalies and inter-annual variability is substantially underestimated, as comparisons with ecosystem models, atmospheric inversions and cross-validation have shown (Jung et al. 2011). As trends are usually calculated from anomalies, trend estimates from this dataset are uncertain as well. Estimates of mean annual upscaled ecosystem respiration have higher certainty than anomalies but still cannot be considered robust patterns, and they might have an underestimation bias of 5-10%. Thus, anomalies and trends of GPP and Reco from MTE should not be used as a robust benchmark dataset for CARBONES results.
Robust patterns that can be extracted from the MTE dataset are mean annual patterns and the mean seasonal cycle of GPP. Mean annual GPP estimates from MTE compare well with satellite-derived fluorescence measurements that are directly linked to photosynthesis (Frankenberg et al. 2011). The uncertainty of mean annual GPP was quantified based on the spread of GPP estimates from the ensemble (Fig. 2). Uncertainties are high in some tropical regions and in regions with low vegetation cover like in inner Australia or Siberian Tundra (Fig. 2). Considering these uncertainties, mean annual and seasonal patterns of gross primary productivity can be used as independent datasets to evaluate process-model simulations (e.g. Bonan et al. 2011) or CCDAS results of gross primary production.
Figure 2: Spatial distribution of mean annual GPP (top) and associated uncertainties (bottom) as upscaled from the FLUXNET by the MTE.
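The idea of quantifying uncertainty from the spread of an ensemble can be sketched as follows. The sample standard deviation is used here as the spread measure, which is an assumption of this sketch; the text does not state which measure was used for the MTE product. The function name is hypothetical.

```python
from statistics import mean, stdev

def ensemble_mean_and_spread(members):
    """Per-cell mean and sample standard deviation across ensemble members.

    members: list of ensemble members, each a list of per-cell values
             (at least two members are needed for the standard deviation).
    """
    cells = list(zip(*members))  # regroup values cell by cell
    return [mean(c) for c in cells], [stdev(c) for c in cells]

# Hypothetical mean annual GPP from three ensemble members for two grid cells.
gpp_mean, gpp_spread = ensemble_mean_and_spread(
    [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
)
```

Cells where the members disagree get a large spread, which corresponds to the high-uncertainty regions described in the text.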
The MTE upscaled patterns of carbon and water fluxes are available for scientific use on request to Max Planck Institute for Biogeochemistry, Jena, Germany (Martin Jung). Users of the dataset should consider and refer to results as described in Jung et al. (2009, 2010 and 2011).
Bonan, G.B., et al. (2011): Improving canopy processes in the Community Land Model version 4 (CLM4) using global flux patterns empirically inferred from FLUXNET data. Journal of Geophysical Research - Biogeosciences, 116, G02014.
Frankenberg, C., et al. (2011): New global observations of the terrestrial carbon cycle from GOSAT: Patterns of plant fluorescence with gross primary productivity. Geophysical Research Letters, 38, L17706, doi:10.1029/2011GL048738.
Jung, M., Reichstein, M., Bondeau, A. (2009): Towards global empirical upscaling of FLUXNET eddy covariance observations: validation of a model tree ensemble approach using a biosphere model. Biogeosciences, 6, 2001-2013.
Jung, M., et al. (2010): Recent decline in the global land evapotranspiration trend due to limited moisture supply. Nature, 467, 951-954.
Jung, M., et al. (2011): Global patterns of land-atmosphere fluxes of carbon dioxide, latent heat, and sensible heat derived from eddy covariance, satellite, and meteorological observations. Journal of Geophysical Research - Biogeosciences, 116, doi:10.1029/2010JG001566.
The air-sea fluxes from CARBONES are evaluated against independent ocean fluxes, estimated for instance from ocean interior inversion.
The fluxes based on the inversion represent the climatological mean fluxes for the 1990s. They can be scaled to any year thanks to the explicit estimation of the anthropogenic CO2 flux component (Mikaloff-Fletcher et al., 2006). Compared with the Takahashi fluxes, they have the advantage that they tend to have a smaller error and that the error covariance matrix is well known. Figure 3 illustrates, for several basins, the mean air-sea fluxes from the Takahashi climatology and from the "ocean interior" inversion together with their error bars.
Figure 1: Annual mean CO2 fluxes from ocean to atmosphere for year 2000
Independent satellite measurements as well as atmospheric CO2 data not used in the assimilation process can be used to evaluate the CCDAS output.
At a later stage of the project, we will use a second vegetation index i) to evaluate the optimized ORCHIDEE model output (namely the normalized temporal variations of fAPAR) and ii) possibly to assimilate it in place of the MODIS NDVI. The data streams that will be considered are:
Atmospheric CO2 data
For instance, we will use vertical CO2 profiles measured during specific campaigns, or measurements from the CONTRAIL database.