Data Assimilation in WRF: An Overview
One of the greatest benefits of using atmospheric computer models is the ability to test experimentally how new and unusual forcings influence weather and climate. Have you ever wanted to include additional data in your WRF run? Perhaps you have sea surface temperature, a specific precipitation dataset, or other kinds of data. Maybe your research question requires a special dataset in your simulation. This post provides an overview of data assimilation techniques to help you get started.
Sea-Surface Temperature
Although WRF doesn’t require sea-surface temperature (SST) data to run, it can be very beneficial for longer simulations (like those you would do for regional climate modeling). Therefore, WRF was designed to incorporate SST into simulations seamlessly. The WRF developers even created an online tutorial with a link to free data to help you get started. If you are new to WRF, this is probably the easiest way to get started with data assimilation.
WRFDA
Did you know that WRF has a data assimilation component available? It’s called WRFDA and is described in chapter 6 of the WRF User’s Guide. The chapter provides step-by-step descriptions of how to assimilate a few commonly used types of data, such as radiance, radar, and precipitation. It even provides some free input data you can use in your run, so this is an excellent place to start.
Be aware that even if you have already installed the full version of WRF, you will still need to download and install the WRFDA component. Don’t worry, though: if you managed to get WRF installed, the assimilation component should be pretty straightforward to install and should integrate seamlessly.
Modifying the Source Code
This approach is the most difficult, but it gives you the most flexibility. For example, let’s say you want to see whether the heat released by a volcanic eruption affected the weather in your domain. You would want to replace the land-surface temperature for that one specific grid cell, which means finding the variable in the source code and modifying it. If you are interested, this is exactly what I did for my master’s thesis, which has been published and is free to access.
In this example, the heat from the volcano is derived from satellite data. Once you have obtained a reasonable dataset, you will have to interpolate it to match the model time step (“time_step” in your namelist file, given in seconds) over the full length of the run, whether that is days, hours, or minutes. The number of data values you have must match the number of iterations WRF will use.
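As a rough illustration of that interpolation step, here is a minimal Python sketch. It assumes the satellite values are available as a simple time series, that plain linear interpolation is acceptable for your case, and that you want one value per model step including the initial time. The function name and the sample numbers are made up for the example and are not part of WRF.

```python
import numpy as np

def interpolate_to_model_steps(obs_times_s, obs_values, run_seconds, time_step_s):
    """Linearly interpolate observations onto every model time step.

    obs_times_s : observation times in seconds since the start of the run (increasing)
    obs_values  : observed values at those times (e.g. surface temperature in K)
    run_seconds : total simulated time in seconds
    time_step_s : model time step in seconds (the namelist "time_step")
    """
    n_steps = int(run_seconds // time_step_s)
    # One entry per model step, plus the initial time at t = 0
    model_times = np.arange(n_steps + 1) * time_step_s
    # np.interp holds the first/last observed value outside the observed range
    values = np.interp(model_times, obs_times_s, obs_values)
    return model_times, values

# Hypothetical example: three satellite overpasses during a 6-hour run
obs_times = np.array([0.0, 10800.0, 21600.0])   # seconds
obs_temps = np.array([300.0, 330.0, 310.0])     # kelvin
times, temps = interpolate_to_model_steps(
    obs_times, obs_temps, run_seconds=21600, time_step_s=60
)
print(len(temps), temps[0], temps[-1])
```

Comparing the length of the result with the number of steps your run will take is a quick way to catch a mismatch before you touch the model code.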
You also need to decide how to introduce the data into WRF for the run. This requires thoughtful consideration of the components of WRF as well as the parameterization schemes you decided to use (which you specified in your namelist). Eventually, you will have to explore the WRF source code for the equations and variables you are interested in. However, if you are continually opening and closing files, you risk making accidental changes that alter how the model works or even prevent it from running altogether. So I suggest using an online copy of the source, such as this one, so you can search the files without putting your own WRF code at risk. The files there are also easily sorted, so you don’t have to change directories as you search. Once you know exactly what you want to modify, you can edit the correct file in the copy of the code on your computer.
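If you would rather search a local copy, one option is a small script that only ever opens the files for reading. The sketch below is one way to do that in Python. The directory name "WRF/phys" and the list of file suffixes are assumptions about how your checkout is laid out, so adjust them to your install.

```python
import re
from pathlib import Path

def find_in_source(root, pattern, suffixes=(".F", ".F90", ".f90")):
    """Return (path, line number, text) for every source line matching pattern.

    Files are opened read-only, so the source tree is never modified.
    """
    regex = re.compile(pattern, re.IGNORECASE)  # Fortran is case-insensitive
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        with open(path, "r", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    hits.append((str(path), number, line.rstrip()))
    return hits

if __name__ == "__main__":
    # Hypothetical location of the physics source; change to match your setup
    for path, number, text in find_in_source("WRF/phys", r"\bTBOUND\s*="):
        print(f"{path}:{number}: {text}")
```

The regular expression here looks for assignments to TBOUND, but any variable name can be substituted.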
To incorporate heat from the volcano, you change the long-wave radiation output from the one grid cell that contains the volcano. Assuming you are using the RRTM scheme, you would search that source code for the variable describing surface temperature. The arguments are listed at the beginning of the RRTM subroutine:
SUBROUTINE RRTM (TTEN,GLW,OLR,TSFC,CLDFRA,T,Tw,QV,QC, &
                 QR,QI,QS,QG,P,Pw,DZ, &
                 EMISS,R,G, &
                 kts,kte )
As you would expect, temperature is identified with a “T”, so we will look for where T is first used in calculations later in the code. If you scroll down, you should come across the following:
CALL MM5ATM (CLDFRA,O3PROF,T,Tw,TSFC,QV,QC,QR,QI,QS,QG, &
P,Pw,DZ,EMISS,R,G, &
PAVEL,TAVEL,PZ,TZ,CLDFRAC,TAUCLOUD,COLDRY, &
WKL,WX,TBOUND,SEMISS, &
kts,kte )
This shows that T (temperature) and TSFC are passed on to a subroutine called MM5ATM, so that is where to look next. We can go into that code and search for temperature there again. We find comments in the code saying that the surface temperature is being set, which is done in the statement:
TBOUND = TSFC
where TSFC is the input value, taken either from the startup data or from what was calculated in previous iterations of the run. Therefore, TSFC is the variable we need to replace with our dataset.
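First, bring your volcano heat data into the program. Say the routine that reads your dataset is called “volcano” and the variable it returns is “heat”. Also, assume the volcano is in the grid cell i=45, j=48 at the surface (k=0 in this example; this is something you would have to determine based on how you set up your domain location, size, and resolution). Because the value is assigned directly to TBOUND, “heat” has to be a temperature in the same units as TSFC. Note also that the argument lists shown above carry only the vertical bounds kts and kte, so the horizontal indices i and j may have to be passed down from the calling routine before you can test them here. With that in place, you would insert:
IF (i == 45 .AND. j == 48) THEN  ! only insert the new data if the grid cell is the volcano
   ! TBOUND is the surface boundary value, so no test on k is needed here
   CALL volcano (heat)           ! your own routine; returns the value for the current time step
   TBOUND = heat
ELSE
   TBOUND = TSFC                 ! otherwise, do what it normally does
END IF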
Then, after recompiling WRF so the change takes effect, you should be ready to run with your new data!
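The grid indices used above (i=45, j=48) have to come from your own domain. One way to find them is to look for the grid cell whose centre is closest to the volcano's coordinates. The Python sketch below assumes you have 2D latitude and longitude arrays for your domain (in WRF files these are typically the XLAT and XLONG fields, shaped south_north by west_east). To stay self-contained it builds a synthetic regular grid instead of reading a file, and the volcano coordinates are made up for illustration.

```python
import numpy as np

def nearest_grid_cell(lat2d, lon2d, target_lat, target_lon):
    """Return the 1-based (i, j) of the grid cell closest to the target point.

    lat2d, lon2d : 2D arrays of cell-centre coordinates in degrees,
                   shaped (south_north, west_east)
    i counts west_east and j counts south_north, as in the Fortran code.
    """
    lat = np.radians(lat2d)
    lon = np.radians(lon2d)
    tlat = np.radians(target_lat)
    tlon = np.radians(target_lon)
    # Haversine formula: great-circle angular distance to every cell centre
    a = (np.sin((lat - tlat) / 2.0) ** 2
         + np.cos(lat) * np.cos(tlat) * np.sin((lon - tlon) / 2.0) ** 2)
    dist = 2.0 * np.arcsin(np.sqrt(a))
    j0, i0 = np.unravel_index(np.argmin(dist), dist.shape)
    # Convert from Python's 0-based indices to Fortran's 1-based indices
    return int(i0) + 1, int(j0) + 1

# Synthetic 0.1-degree grid standing in for XLAT / XLONG
lats = np.linspace(10.0, 20.0, 101)
lons = np.linspace(-100.0, -90.0, 101)
lon2d, lat2d = np.meshgrid(lons, lats)

# Hypothetical volcano location
i, j = nearest_grid_cell(lat2d, lon2d, target_lat=14.72, target_lon=-95.58)
print(i, j)
```

For nested domains, repeat the lookup with the latitude and longitude fields of the domain you are actually modifying.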
Data assimilation is not an easy thing to do. I suggest running WRF a few times before trying to assimilate any new data. At the very least, this will help you feel comfortable with WRF and learn what reasonable output looks like, so that you can more easily identify erroneous results.
Also, start out with easier assimilations, such as SST, which is already built into WRF, or another variable using the WRFDA component. Editing the source code is a more advanced technique that requires a good understanding of Fortran and the WRF parameters.
Let me know if you try anything interesting. Happy assimilating!
-Morgan