Example Data Files
The andorR package includes several raw data files in different formats to demonstrate the data loading functions. These files can be located with the system.file() function and then passed to the appropriate loading function.
Included Files
The following example data files are included in the inst/extdata directory:
- bse.csv
- ethical_path.csv
- ethical.csv
- ethical.json
- ethical.yml
- lupus.csv
- ms.csv
- unesco.yml
- woah.yml
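The files listed above can also be enumerated from an R session. The following is a minimal sketch that assumes andorR is installed; it relies only on base R, and the variable names are illustrative.

```r
# Locate the extdata directory of the installed package.
# system.file() returns "" when the package or directory cannot be found.
extdata_dir <- system.file("extdata", package = "andorR")

if (nzchar(extdata_dir)) {
  # List the example files shipped with the package
  example_files <- list.files(extdata_dir)
} else {
  # Fall back to an empty vector so the rest of a script can still run
  example_files <- character(0)
  message("andorR does not appear to be installed; no example files found.")
}

print(example_files)
```

Note that the inst/ prefix is dropped at installation, which is why the path passed to system.file() starts at "extdata".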
File documentation
Ethical investment decision tree (ethical)
Introduction
Making ethical investments requires careful consideration of multiple aspects of the investment target, such as:
- Financial Viability
- Environmental Stewardship
- Social Responsibility
- Corporate Governance
- Profitability and Growth
- Solvency and Stability
This hypothetical decision tree uses these concepts to illustrate the functionality of andorR.
Formats
This data is available in the following formats:
- Comma Separated Value (.csv) format, with data arranged in a relational structure.
- YAML (.yml) format, with data arranged in a hierarchical structure.
- JSON (.json) format, with data arranged in a hierarchical structure.
- CSV (ethical_path.csv) format, with data arranged in a node path structure.
World Organisation for Animal Health notifiable disease list criteria (woah)
Introduction
The World Organisation for Animal Health (WOAH) maintains a list of notifiable diseases to support surveillance and reporting of diseases that have an impact on trade. The criteria for the inclusion of a disease in the list are included in the Terrestrial Animal Health Code.
Format
This data is provided in YAML (.yml) format, which clearly represents the nested, hierarchical structure of the listing criteria.
The attributes at each level include:
- name : a short name for each node
- question : the question for leaves
- rule : the rule (AND or OR) for nodes
- nodes : children nodes or leaves at the next level of the hierarchy
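The attribute layout above maps naturally onto a nested R list. The following is a minimal sketch of that shape; the node names and questions are hypothetical and are not taken from woah.yml, and leaf_questions() is an illustrative helper, not part of andorR.

```r
# Hypothetical tree using the attribute names described above:
# nodes carry a rule and child nodes, leaves carry a question.
tree <- list(
  name = "Root",
  rule = "AND",
  nodes = list(
    list(name = "Leaf 1", question = "Is condition 1 met?"),
    list(
      name = "Branch",
      rule = "OR",
      nodes = list(
        list(name = "Leaf 2", question = "Is condition 2 met?"),
        list(name = "Leaf 3", question = "Is condition 3 met?")
      )
    )
  )
)

# Recursively collect the questions attached to the leaves.
# An element without a `nodes` attribute is treated as a leaf.
leaf_questions <- function(node) {
  if (is.null(node$nodes)) {
    return(node$question)
  }
  unlist(lapply(node$nodes, leaf_questions))
}

questions <- leaf_questions(tree)
print(questions)
```

The same walk applies to any depth of nesting, since each level repeats the name, rule and nodes pattern until a leaf with a question is reached.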
Bovine Spongiform Encephalopathy surveillance decision tree (bse)
Introduction
Surveillance for BSE uses a targeted, risk-based approach. A decision tree may be used to determine if an animal is a suitable target for surveillance. This tree is a theoretical composite based on information from three main sources.
Format
This file is in Comma Separated Value (.csv) format, with data arranged in a relational structure. The column headers are:
- id : a unique serial numeric identifier for each row (node or leaf)
- name : a short name for each node
- question : the question for leaves
- rule : the rule (AND or OR) for nodes
- parent : the id of a node’s parent
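To make the relational layout concrete, the sketch below builds a small data frame with the same column headers and looks up the children of a node through the parent column. The rows are hypothetical and are not taken from bse.csv; representing the root's parent as NA is also an assumption, and children_of() is an illustrative helper, not part of andorR.

```r
# Hypothetical rows using the column headers described above
nodes <- data.frame(
  id = 1:5,
  name = c("Root", "Leaf 1", "Branch", "Leaf 2", "Leaf 3"),
  question = c(NA, "Is condition 1 met?", NA,
               "Is condition 2 met?", "Is condition 3 met?"),
  rule = c("AND", NA, "OR", NA, NA),
  parent = c(NA, 1, 1, 3, 3),  # assumption: the root has no parent (NA)
  stringsAsFactors = FALSE
)

# Return the names of all rows whose parent is the given id
children_of <- function(df, node_id) {
  df$name[!is.na(df$parent) & df$parent == node_id]
}

root_children <- children_of(nodes, 1)
branch_children <- children_of(nodes, 3)

print(root_children)
print(branch_children)
```

Each row stores only a pointer to its parent, so the hierarchy is recovered by following the parent ids rather than by nesting.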
Sources
- World Organisation for Animal Health (WOAH): The Terrestrial Animal Health Code chapter 11.4 provides standards for trade in animals and animal products, and for surveillance for BSE at the global level.
- United States Department of Agriculture (USDA): The US National Bovine Spongiform Encephalopathy Surveillance Plan provides technical information about surveillance for BSE in the US.
- European Union (EU): Regulation (EC) No 999/2001 is the foundation for European surveillance for BSE and other transmissible spongiform encephalopathies.
The tree was developed with the assistance of an LLM. The author has expertise in BSE surveillance.
Multiple sclerosis diagnostic decision tree (ms)
Introduction
Multiple Sclerosis (MS) is a complex neurological condition. Its diagnosis is a process of deduction, requiring evidence of central nervous system lesions that are separated in both anatomical location (dissemination in space) and time (dissemination in time), while also ruling out other conditions that can mimic MS. This tree is a simplified model of this diagnostic process.
Format
This data is available in Comma Separated Value (.csv) format, with data arranged in a relational structure.
Source
This tree is a simplified model based on the core principles of the 2017 revisions to the McDonald criteria for the diagnosis of Multiple Sclerosis.
The tree was developed by the author (with no expertise in MS) with the assistance of an LLM, for the purpose of illustrating the use of the andorR package.
Systemic Lupus Erythematosus diagnostic decision tree (lupus)
Introduction
Systemic Lupus Erythematosus (SLE) is a multi-system autoimmune disease often called ‘the great imitator’ due to its wide range of symptoms. Modern diagnosis uses a criteria-based scoring system. This tree models the logic of combining clinical and immunological evidence to reach a classification of SLE.
Format
This data is available in Comma Separated Value (.csv) format, with data arranged in a relational structure.
Source
This tree is a simplified model based on the principles of the 2019 European League Against Rheumatism/American College of Rheumatology (EULAR/ACR) classification criteria for SLE.
The tree was developed by the author (with no expertise in SLE) with the assistance of an LLM, for the purpose of illustrating the use of the andorR package.
UNESCO World Heritage nomination decision tree (unesco)
Introduction
The process of nominating a site for the UNESCO World Heritage List is a complex, evidence-intensive procedure. This decision tree models the core logic an expert committee would follow, based on the official UNESCO Operational Guidelines. It helps structure the assessment of a site’s Outstanding Universal Value (OUV), its adherence to the formal selection criteria, its integrity, and its protection and management framework.
Format
This data is provided in YAML (.yml) format, which clearly represents the nested, hierarchical structure of the nomination criteria.
Source
The logic for this tree is based on the official Operational Guidelines published by the UNESCO World Heritage Centre. The ten selection criteria are detailed in Paragraph 77 of those guidelines.
This tree was developed by the author (with no expertise in cultural heritage) with the assistance of an LLM, for the purpose of illustrating the use of the andorR package.
Example: Loading a File
CSV relational format
To load the example CSV file in relational format from within the package, you would use the following code:
library(andorR)  # makes load_tree_csv() and the other loaders available
path <- system.file("extdata", "ethical.csv", package = "andorR")
my_tree <- load_tree_csv(path)
CSV node path format
To load the example CSV file in node path format from within the package, you would use the following code:
path <- system.file("extdata", "ethical_path.csv", package = "andorR")
df <- read.csv(path)
my_tree <- load_tree_df_path(df)
YAML format
To load the example YAML file from within the package, you would use the following code:
path <- system.file("extdata", "ethical.yml", package = "andorR")
my_tree <- load_tree_yaml(path)
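In the examples above, system.file() silently returns an empty string if the file or package cannot be found, which can lead to a confusing error later in the loading function. The sketch below shows one way to fail early using the mustWork argument of system.file(). The helper find_example() is hypothetical and not part of andorR, and the file name no_such_file.csv is deliberately invalid to demonstrate the error path.

```r
# Hypothetical helper: return the path to an example file,
# or stop with a clear message if it cannot be found.
find_example <- function(file, package = "andorR") {
  tryCatch(
    # mustWork = TRUE makes system.file() raise an error instead of returning ""
    system.file("extdata", file, package = package, mustWork = TRUE),
    error = function(e) {
      stop("Example file '", file, "' not found in package '", package, "'.",
           call. = FALSE)
    }
  )
}

# Demonstrate the failure path with a file name that does not exist.
# The error is caught here so that the script keeps running.
result <- tryCatch(
  find_example("no_such_file.csv"),
  error = function(e) conditionMessage(e)
)
print(result)
```

With andorR installed, a call such as find_example("ethical.yml") would return a path that can be passed to load_tree_yaml().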