ERDDAP's "files" system lets you browse a virtual file system and download source
data files. Hopefully, this is a familiar, easy system that you can use with your
favorite web browser or, if you prefer, from a command line program like curl.
ERDDAP was designed around the idea that most datasets are huge, so most users
need or want just the subset of a dataset that they are interested in
(e.g., a smaller geographic area, a smaller time range, or not all of the data variables).
But we understand that some users actually do want an entire dataset, or at least
the part of it that is stored in a subset of the source data files. If that's you, then the "files"
system may be for you. One advantage of the "files" system is that you can see each
file's Last Modified time (Zulu time zone), so it is easy to see if a file has been
changed.
To use the "files" system, just click. On any "files" web page, you can:
- Click on a heading (Name, Last modified, Size, or Description) to sort the items
by that attribute. Clicking repeatedly on one heading toggles the sort order
(ascending or descending). Note that "Last modified" uses the Zulu time zone.
- Click on a directory name to go to that directory.
- Click on a file name to download that file.
For datasets available via ERDDAP's tabledap or griddap, ERDDAP administrators
can set up ERDDAP to change a dataset's metadata and variable names on-the-fly
so that you, the user, see an improved version of the dataset's metadata. But in
"files", you will see the original metadata and variable names, so don't be surprised
if they are different! If you aren't comfortable dealing with the different metadata
and variable names, you might prefer using the dataset's Data Access Form instead.
Similarly, when you request a subset of data from one of ERDDAP's Data Access Forms,
you can specify the file type (e.g., .nc, .csv, .json, .mat) that you want to receive in
response. Naturally, the source data files available via "files" are available in only
one file type. If you aren't happy with the source file's file type, you might prefer
using the dataset's Data Access Form instead.
Some datasets in this ERDDAP aren't available via the "files" system. Common reasons
include:
- The dataset's data doesn't come from files (e.g., it comes from a database
or Cassandra, or from a remote ERDDAP, THREDDS, or GrADS data server).
- The immediate source files are .ncml files which specify how to modify the
actual data files on-the-fly.
- The ERDDAP administrator chose not to make the source data files available.
If the source files for a dataset that you want aren't available, you can email
the administrator of this ERDDAP, webmaster at bios dot edu,
to request that they be made available, but there is usually a reason why they aren't
already available.
We understand that some users might prefer that ERDDAP offer files via FTP instead
of HTTP as is done by "files". Sorry. Hopefully, you'll be able to do what you need
to do with the current "files" system.
If you want to download a series of files from ERDDAP, you don't have to request each
file's ERDDAP URL in your browser, sitting and waiting for each file to download.
Instead, you have two options:
- If you are comfortable writing computer programs (e.g., with C, Java, Python, Matlab, R),
you can write a program with a loop that imports all of the desired data files.
- If you are comfortable with command line programs (just running a program,
or using bash or tcsh scripts in Linux or Mac OS X, or batch files in Windows),
you can use curl to save results files from ERDDAP into files on your hard drive,
without using a browser or writing a computer program.
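As an illustration of the first option, here is a minimal Python sketch of such a loop. It assumes the files follow the NDBC_<station>_met.nc naming pattern used in the curl examples in this document. The function names (build_file_urls, download_all) and the output directory are hypothetical choices for this sketch, not part of ERDDAP. Files are fetched one at a time, in keeping with the request to run just one script or command at a time.

```python
import shutil
import urllib.request
from pathlib import Path

# Base URL taken from the curl examples in this document.
BASE_URL = "https://coastwatch.pfeg.noaa.gov/erddap/files/cwwcNDBCMet/"


def build_file_urls(station_ids, base_url=BASE_URL):
    """Return (url, local_name) pairs for files named NDBC_<id>_met.nc."""
    return [(f"{base_url}NDBC_{sid}_met.nc", f"{sid}.nc") for sid in station_ids]


def download_all(pairs, out_dir="ndbc"):
    """Download the files sequentially, one request at a time."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    for url, local_name in pairs:
        target = out_path / local_name
        # Stream the response to disk so large files are not held in memory.
        with urllib.request.urlopen(url) as response, open(target, "wb") as f:
            shutil.copyfileobj(response, f)
        print(f"saved {target}")


# Example call (performs real downloads, so it is left commented out here):
# download_all(build_file_urls(["41004", "41009", "41010"]))
```

Separating URL construction from downloading makes it easy to print and check the URLs before any file is requested.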
ERDDAP+curl is amazingly powerful and allows you to
use ERDDAP in many new ways. To install curl:
- On Linux and Mac OS X, curl is probably already installed as /usr/bin/curl.
- On Windows, or if your computer doesn't have curl already, you need to
download curl and install it. To get to a command line in Windows, click on "Start" and type
"cmd" into the search text field.
("Win32 - Generic, Win32, binary (without SSL)" worked for me in Windows 7.)
Please be kind to other ERDDAP users: run just one script or curl command at a time.
Instructions for using curl are on the curl man page and in the curl tutorial.
But here is a quick tutorial related to using curl with ERDDAP:
- To download and save one file, use
curl -g "erddapUrl" -o fileDir/fileName.ext
where -g disables curl's globbing feature,
erddapUrl is any ERDDAP URL that requests a data or image file, and
-o fileDir/fileName.ext specifies the name for the file that will be created.
For example,
curl -g "https://coastwatch.pfeg.noaa.gov/erddap/files/cwwcNDBCMet/NDBC_41004_met.nc" -o ndbc/41004.nc
In curl, as in many other programs, the query part of the erddapUrl must be
percent encoded:
all characters in parameter values (the parts after '=' signs) other than A-Za-z0-9_-!.~'()*
must be encoded as %HH, where HH is the 2 digit hexadecimal value of the character
(for example, a space becomes %20). Characters above #127 must be converted to UTF-8 bytes, then each UTF-8
byte must be percent encoded (ask a programmer for help). There are
web sites that percent encode and decode for you.
If you get the URL from your browser's address text field, this may already be done.
- To download and save many files in one step, use curl with the globbing feature
enabled:
curl "erddapUrl" -o fileDir/fileName#1.ext
Since the globbing feature treats the characters [, ], {, and } as special, you must also
percent encode any literal occurrences of them in the erddapUrl as %5B, %5D, %7B, and %7D, respectively.
Fortunately, these are almost never in "files" file names.
Then, in the erddapUrl, replace a zero-padded number (for example, 01) with a
range of values (for example, [01-15]), or replace a substring (for example, 41004)
with a list of values (for example, {41004,41009,41010}).
The #1 within the output fileName causes the current value of the range or
list to be put into the output fileName. For example,
curl "https://coastwatch.pfeg.noaa.gov/erddap/files/cwwcNDBCMet/NDBC_{41004,41009,41010}_met.nc" -o ndbc/#1.nc
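If you would rather not percent encode by hand, the rule described above can be sketched in Python with the standard library's urllib.parse.quote. This is only one possible approach. The helper name encode_value is hypothetical, and the set of extra "safe" characters is copied from the list of unencoded characters given above. Apply it to each parameter value separately, not to the whole URL, because it would also encode the '=' and '&' separators.

```python
from urllib.parse import quote

# Characters listed above as not needing encoding, beyond the letters,
# digits, and "_.-~" that quote() always leaves unchanged.
EXTRA_SAFE = "!'()*"


def encode_value(value):
    """Percent-encode one parameter value (the part after an '=' sign)."""
    # quote() converts non-ASCII characters to UTF-8 bytes, then encodes
    # each byte as %HH.
    return quote(value, safe=EXTRA_SAFE)


print(encode_value("a b"))    # a%20b
print(encode_value("[01]"))   # %5B01%5D
print(encode_value("{x}"))    # %7Bx%7D
print(encode_value("é"))      # %C3%A9
```

Because [, ], {, and } are not in the safe set, the same helper also produces the %5B, %5D, %7B, and %7D encodings that curl's globbing feature requires for literal occurrences of those characters.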
For most common image and video file types, the "files" system will now display a '?' icon
to the left of the file name. If you hover over that, you will see a popup window showing
the image or an audio or video player.
Similarly, for a few audio file types (notably .mp3, .ogg, and .wav), you will see an
audio control which allows you to listen to the audio file.
These preview features will only work for certain file types, in certain browsers,
in certain operating systems.
They rely on browser features, so they are largely out of our control.
Alternatively, if you click on the link for an image, audio, or video file, a viewer
or player will open in a separate window. (If your browser asks you what you want
to do with the file, tell it to handle the media file itself (not via other software),
and tell it to remember this choice so that it will be used automatically in the future.)
Unlike requests for most of the other resources in ERDDAP, a request for a file from the
"files" system may include a "Range" request in the header which specifies a range of bytes to
be returned, instead of the whole file.
See Byte_serving.
This is used by some client software (for example, audio and video players in web browsers)
to request chunks of the file instead of the whole file.
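To make the "Range" header concrete, here is a minimal Python sketch of how a client might ask for just part of a file. The helper names (build_range_request, fetch_range) are hypothetical, and the sketch assumes the standard HTTP form of the header, "bytes=first-last", with both byte positions inclusive.

```python
import urllib.request


def build_range_request(url, first_byte, last_byte):
    """Build a GET request asking for bytes first_byte..last_byte (inclusive)."""
    headers = {"Range": f"bytes={first_byte}-{last_byte}"}
    return urllib.request.Request(url, headers=headers)


def fetch_range(url, first_byte, last_byte):
    """Send the request and return (HTTP status code, body bytes)."""
    request = build_range_request(url, first_byte, last_byte)
    with urllib.request.urlopen(request) as response:
        return response.status, response.read()


# Example call (performs a real request, so it is left commented out here):
# status, data = fetch_range(
#     "https://coastwatch.pfeg.noaa.gov/erddap/files/cwwcNDBCMet/NDBC_41004_met.nc",
#     0, 1023)
```

In standard HTTP, a server that honors the range replies with status 206 (Partial Content), while a reply with status 200 means the whole file was sent instead, so checking the status is a reasonable precaution.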
Some software that is normally used to read data from a local data file can
also work with a remote file if the server supports Byte Range requests.
So, in general, such software can work with remote files served by ERDDAP's "files" system.
For example, Ferret and ncview can read data from local and remote .nc files.
However, Byte Range requests are slow and inefficient compared to local access.
So the more you work with a remote file, the more sense it makes to download
the file so you can access the local file quickly and efficiently.