R/state_database.R, R/state_dataverse.R, R/state_local_files.R
state.Rd
Determine what data files are currently on dataverse, in the local files, or in the local database.
get_db_state(db_path = find_db())
get_dvn_state(icews_doi = get_doi(), server = Sys.getenv("DATAVERSE_SERVER"))
get_local_state(raw_file_dir = find_raw())
db_path: Path to SQLite database files.
icews_doi: DOI of the main ICEWS repo on Dataverse, see get_doi()
server: For unit tests only; the default is the same as the dataverse::get_dataset() default.
raw_file_dir: Directory containing raw data files
For get_dvn_manifest, a tibble with the following columns:
dvn_repo: "historic" or "weekly", see get_doi()
dvn_file_label: the file label on dataverse, possibly non-unique
dvn_file_id: the integer file ID on dataverse
file_name: the normalized, unique file name, see normalize_label()
For get_local_state and get_db_state, a tibble with columns:
file_name: the full source data file name, e.g. "events.1995.20150313082510.tab"; see normalize_label()
The data files (tab-separated files, ".tab") on dataverse that contain the raw event data follow a common format denoting the set of events contained in a file and which version of the event data and/or file dump they correspond to. For example, "events.1995.20150313082510.tab" contains events for 1995 and the version is denoted by the timestamp, "20150313082510".
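As a rough illustration of this naming scheme, the sketch below splits a file name into its event set and version parts using base R only. The helper parse_file_name() is hypothetical and not part of the package, and it assumes every name follows the "events.<set>.<timestamp>.tab" pattern of the example above, which may not hold for all files on dataverse.

```r
# Hypothetical helper (not part of the package): split a file name of the
# assumed form "events.<set>.<timestamp>.tab" into its components.
parse_file_name <- function(file_name) {
  # Split each name on literal dots
  parts <- strsplit(file_name, ".", fixed = TRUE)
  data.frame(
    file_name = file_name,
    # Second component: the set of events in the file, e.g. the year
    event_set = vapply(parts, function(p) p[2], character(1)),
    # Third component: the version timestamp
    version   = vapply(parts, function(p) p[3], character(1)),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_file_name("events.1995.20150313082510.tab")
print(parsed)
```

Names that do not match the assumed pattern would yield NA components here; a real implementation would need to handle them explicitly.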
The download and update functions (update_icews(), download_data()) will recognize which event sets are locally available or still need to be downloaded, and whether any local event sets have been superseded by a new version on dataverse.
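The kind of comparison described above can be sketched with plain set operations on file names. This is only an illustration under stated assumptions, not the package's actual logic: all file names except the 1995 dataverse example are made up, the event_set() helper is hypothetical, and in practice the two vectors would presumably come from the file_name columns returned by get_dvn_state() and get_local_state().

```r
# Made-up inputs for illustration; stand-ins for get_dvn_state()$file_name
# and get_local_state()$file_name.
remote_files <- c("events.1995.20150313082510.tab",
                  "events.1996.20150313082528.tab")
local_files  <- c("events.1995.20140101000000.tab")

# Hypothetical helper: extract the event set (second dot-separated part),
# assuming the "events.<set>.<timestamp>.tab" pattern.
event_set <- function(x) {
  vapply(strsplit(x, ".", fixed = TRUE), function(p) p[2], character(1))
}

# Files on dataverse that are not available locally
to_download <- setdiff(remote_files, local_files)

# Local files no longer listed on dataverse ...
outdated <- setdiff(local_files, remote_files)
# ... whose event set exists remotely under a different version
superseded <- outdated[event_set(outdated) %in% event_set(remote_files)]

print(to_download)
print(superseded)
```

Comparing on the full file name treats any change in the version timestamp as a new file, which is why the local 1995 file shows up as superseded while both remote files are queued for download.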
# Remote (DVN) state
# get_dvn_state()
#
# Local file state
# get_local_state()
#
# Database state
# get_db_state()