Exporting Marginal Distributions

Obtaining Marginal Distributions

Marginal distributions should first be obtained using the get_marginal_distributions() function.

To obtain the marginal distributions for all variables, specify only the dataset:

library(RESIDE)
marginals <- get_marginal_distributions(IST)

To obtain marginal distributions for selected variables, specify them using the variables parameter:

library(RESIDE)
marginals <- get_marginal_distributions(
  IST,
  variables = c(
    "SEX",
    "AGE",
    "ID14",
    "RSBP",
    "RATRIAL",
    "SET14D",
    "DSIDED"
  )
)

Printing the Marginal Distributions Prior to Export

Marginal distributions can be printed as they are generated by setting the print parameter:

library(RESIDE)
marginals <- get_marginal_distributions(
  IST,
  print = TRUE
)

Or from a stored marginals object:

library(RESIDE)
marginals <- get_marginal_distributions(IST)
print(marginals)

Exporting Marginal Distributions

Marginal distributions can be exported using the export_marginal_distributions() function, specifying the marginal distributions (generated by `get_marginal_distributions()`) and a folder path:

library(RESIDE)
marginals <- get_marginal_distributions(IST)
export_marginal_distributions(
  marginals,
  folder_path = "/Users/ryan/marginals"
)

This folder should exist and not contain any previously exported marginal distributions. You can create the folder automatically using the create_folder parameter:

library(RESIDE)
marginals <- get_marginal_distributions(IST)
export_marginal_distributions(
  marginals,
  folder_path = "/Users/ryan/marginals",
  create_folder = TRUE
)
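Because the export expects a folder that exists and holds no earlier export, it can help to check the folder beforehand. The following is a minimal sketch in base R only; it is not part of RESIDE, the temporary path is a placeholder, and export_marginal_distributions() may well perform its own validation.

```r
# Hypothetical target folder: a unique path under the session's temporary
# directory, so the sketch can be run anywhere. Replace with your own path.
folder_path <- tempfile("marginals_")

# Create the folder if it does not exist yet (base R, not RESIDE).
if (!dir.exists(folder_path)) {
  dir.create(folder_path, recursive = TRUE)
}

# List any CSV files already present; an earlier export would show up here.
existing_csv <- list.files(folder_path, pattern = "\\.csv$")
folder_is_clean <- length(existing_csv) == 0

if (!folder_is_clean) {
  message("Folder already contains: ", paste(existing_csv, collapse = ", "))
}
```

If folder_is_clean is FALSE, choosing a new folder or moving the old files aside avoids mixing two exports.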

Files created by `export_marginal_distributions()`

The following files will be created by the export_marginal_distributions() function:

binary_variables.csv - Contains the marginal distributions for binary variables including:
- Variable Name
- Mean
- Number of Missing Observations
categorical_variables.csv - Contains the marginal distributions for categorical variables including:
- Variable Name & Category Name
- Number of Observations in Each Category
- NB Missing Observations are coded as a separate category labelled missing.
continuous_variables.csv - Contains the marginal distributions for continuous variables including:
- Variable Name
- Transformed Mean
- Transformed Standard Deviation
- Number of Missing Observations
- Number of Decimal Places
continuous_quantiles.csv - Contains the Quantile mapping to allow for back transformation. For each continuous variable this contains:
- The original quantile value
- The transformed quantile value
- An epsilon value to indicate the amount of thinning applied
summary.csv - Contains an overall summary of the dataset including:
- Number of Rows
- Number of Columns
- Variable Names (for validation)
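To illustrate how a quantile mapping can support back transformation in principle, the sketch below interpolates between paired original and transformed quantile values using base R's approx(). The numbers, the column names original and transformed, and the interpolation approach are all assumptions made for illustration; this is not RESIDE's implementation and does not reflect the actual layout of continuous_quantiles.csv.

```r
# Hypothetical quantile mapping for one continuous variable.
# Values and column names are made up for illustration.
quantile_map <- data.frame(
  original    = c(10, 20, 35, 60, 120),
  transformed = qnorm(c(0.05, 0.25, 0.50, 0.75, 0.95))
)

# Values on the transformed scale that we want back on the original scale.
z <- c(-1, 0, 1)

# Linear interpolation between the mapped quantiles; rule = 2 clamps values
# outside the mapped range to the nearest end point.
back_transformed <- approx(
  x = quantile_map$transformed,
  y = quantile_map$original,
  xout = z,
  rule = 2
)$y
```

A denser set of quantiles gives a closer approximation of the original distribution, which is presumably why the amount of thinning is recorded alongside the mapping.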

These files should then be sent to the user.

NB If there are no variables of a certain type the corresponding file will not be created.
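Since some files may legitimately be absent, a quick check of which ones were written can be useful before sharing the folder. This is a minimal base R sketch, not a RESIDE function; the file names are those listed in this section, and the folder path is a placeholder to be replaced with the folder_path used for the export.

```r
# File names as listed in this section.
expected_files <- c(
  "binary_variables.csv",
  "categorical_variables.csv",
  "continuous_variables.csv",
  "continuous_quantiles.csv",
  "summary.csv"
)

# Hypothetical export folder; replace with the folder_path used for export.
folder_path <- file.path(tempdir(), "marginals")

# TRUE/FALSE for each expected file, named by file name.
present <- file.exists(file.path(folder_path, expected_files))
names(present) <- expected_files
print(present)
```

A FALSE entry is expected whenever the dataset has no variables of the corresponding type.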
