Estimate the goodness-of-fit between tree models and data

Estimate the goodness-of-fit between tree models and data.

treefit(
  target,
  name = NULL,
  perturbations = NULL,
  normalize = NULL,
  reduce_dimension = NULL,
  build_tree = NULL,
  max_p = 20,
  n_perturbations = 20
)

Arguments

target	The target data to be estimated. It must be one of the following:
  `list(counts=COUNTS, expression=EXPRESSION)`: You must specify at least one of `COUNTS` and `EXPRESSION`. Both are `matrix` objects whose rows correspond to samples such as cells and whose columns correspond to features such as genes. `COUNTS` holds count data such as the number of genes expressed. `EXPRESSION` holds normalized count data.
  A `Seurat` object.
name	The name of `target` as a string.
perturbations	How to perturb the target data. If this is `NULL`, all available perturbation methods are used. You can specify the perturbation methods to use as a `list`. Here are available methods:
normalize	How to normalize count data. If this is `NULL`, the default normalization is applied. You can specify a function that normalizes count data.
reduce_dimension	How to reduce the dimension of expression data. If this is `NULL`, the default dimensionality reduction is applied. You can specify a function that reduces the dimension of expression data.
build_tree	How to build a tree from expression data. If this is `NULL`, a minimum spanning tree (MST) is built. You can specify a function that builds a tree from expression data.
max_p	How many low-dimensional Laplacian eigenvectors to use. The default is 20.
n_perturbations	How many times to perturb. The default is 20.
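
The `target` and `normalize` arguments can be illustrated with a small base R sketch. The matrix values, the `log_normalize` helper, and its `scale` parameter are hypothetical and chosen only for illustration. The sketch also assumes that a custom `normalize` function receives the count matrix and returns a matrix of the same shape, which this page does not state explicitly.

```r
# Toy count matrix: 4 samples (rows, e.g. cells) x 3 features (columns, e.g. genes).
counts <- matrix(
  c(10, 0, 5,
     3, 7, 0,
     0, 2, 8,
     6, 6, 6),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cell", 1:4), paste0("gene", 1:3))
)

# Hypothetical normalizer: scale each sample to a common total, then log-transform.
# Assumption: treefit passes the count matrix and expects a same-shaped matrix back.
log_normalize <- function(counts, scale = 1e4) {
  totals <- rowSums(counts)          # one total per sample (row)
  log1p(counts / totals * scale)     # row i is divided by totals[i]
}

expression <- log_normalize(counts)

# Either list matches the documented shape of `target`.
target_counts <- list(counts = counts)
target_expression <- list(expression = expression)

# With treefit installed, a call might look like this (not run here):
# fit <- treefit::treefit(target_counts, name = "toy", normalize = log_normalize)
```

Supplying `expression` directly skips normalization, which may be preferable when the data has already been normalized elsewhere.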

Value

The estimation result as a treefit object. It has the following attributes:

max_cca_distance: The maximum canonical correlation analysis (CCA) distance results as a data.frame.
rms_cca_distance: The root mean square canonical correlation analysis (CCA) distance results as a data.frame.
n_principal_paths_candidates: The candidates for the number of principal paths.

The data.frames in max_cca_distance and rms_cca_distance have the same structure. They have the following columns:

p: Dimensionality of the feature space of tree structures.
mean: The mean of the target distance values.
standard_deviation: The standard deviation of the target distance values.
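
As a rough sketch of how the columns above might be used, the following base R code picks the row with the smallest mean distance. The data.frame here is a stand-in with made-up placeholder values, not real treefit output, and the `lowest_mean_row` helper is hypothetical. It also assumes that a smaller mean distance indicates a more stable fit and that the result is accessed list-style, for example `fit$max_cca_distance`.

```r
# Placeholder values for illustration only; they are not real treefit output.
# In practice this might come from `fit$max_cca_distance` (assumed access style).
distance <- data.frame(
  p = 1:4,
  mean = c(0.30, 0.12, 0.18, 0.25),
  standard_deviation = c(0.05, 0.02, 0.04, 0.06)
)

# Hypothetical helper: return the row with the smallest mean distance.
lowest_mean_row <- function(distance) {
  distance[which.min(distance$mean), , drop = FALSE]
}

best <- lowest_mean_row(distance)
print(best)
```

The same helper works unchanged on rms_cca_distance, since both data.frames share the same columns.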

Examples

# Generate star tree data that has normalized expression values,
# not count data.
star <- treefit::generate_2d_n_arms_star_data(300, 3, 0.1)
# Estimate tree-likeness of the tree data.
fit <- treefit::treefit(list(expression=star))
