Configuration#
Customize survey settings and data processing through configuration files.
Configuration File#
The main configuration file is located at:
sandbox/config.toml
This file defines:
Survey question mappings
Categorical variables and their valid choices
Numerical variables for statistical analysis
Data processing rules
Configuration Structure#
Questions Section#
Defines the mapping between question IDs and descriptions:
[questions]
q01 = "Gender identity"
q02 = "Gender expression"
q03 = "Geographic region"
# ... more questions
Categorical Headers#
Lists variables that should be treated as categorical (discrete) data:
[categorical_headers]
default = ["q01", "q02", "q03_regional", "q03_subregional"]
These variables are used for chi-square tests and categorical analysis.
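To make the chi-square use concrete, here is a minimal sketch of how a Pearson chi-square statistic can be computed from a contingency table of two categorical variables. It is not taken from the tool's code: the function name `chi_square_statistic` and the sample counts are assumptions made for illustration.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table of counts.

    Assumes every row and column total is non-zero (hypothetical helper,
    not part of the ti tool).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Illustrative counts only: rows could be answers to q01, columns answers to q02.
counts = [[10, 20], [20, 10]]
chi2 = chi_square_statistic(counts)
dof = (len(counts) - 1) * (len(counts[0]) - 1)  # degrees of freedom
print(f"chi2={chi2:.3f}, dof={dof}")
```

A larger statistic for a given number of degrees of freedom indicates a stronger departure from independence between the two variables.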
Numerical Headers#
Lists variables that should be treated as numerical (continuous) data:
[numerical_headers]
default = ["q10", "q13", "sentiment_score"]
Data Rules#
Define transformations applied during data preparation:
[data_rules]
# Define categorical values for specific questions
# Define clustering rules for derived columns
# Define binning rules for numerical data
Using Custom Configuration#
Load Custom Config#
To use a custom configuration file:
cd custom_directory
poetry run ti config --load_from /path/to/config
Output Configuration#
Specify output directory for processed data:
poetry run ti prepare data.csv --write-dir /path/to/output
Input Directory#
Specify custom input directory:
poetry run ti prepare data.csv --read_from /path/to/input
Configuration with Plugins#
When using a custom survey schema with the --plugin option, you can still override configuration:
poetry run ti prepare data.csv \
--plugin plugins.custom_survey.CustomSchema \
--load_from /path/to/config
Survey Schema Configuration#
The survey schema (defined in plugins) determines:
Value Replacements - Standardize response values
Geographic Splitting - Split regional data into components
Clustering Rules - Create derived composite columns
Binning Rules - Convert numerical data to categories
See Plugin Development for details on schema customization.
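To illustrate what a binning rule does conceptually, the sketch below maps a numerical value to a category label using sorted bin edges. The edges, labels, and the function name `bin_value` are assumptions chosen for the example; the actual schema API is described in the plugin documentation and may differ.

```python
from bisect import bisect_right

# Hypothetical bin edges and labels; a real schema would define its own.
EDGES = [18, 30, 50]
LABELS = ["under 18", "18-29", "30-49", "50+"]


def bin_value(value, edges=EDGES, labels=LABELS):
    """Return the label of the bin that contains value.

    Each edge is the inclusive lower bound of the bin to its right,
    so there must be exactly one more label than there are edges.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly len(edges) + 1 labels")
    return labels[bisect_right(edges, value)]


for sample in (17, 18, 49, 50):
    print(sample, "->", bin_value(sample))
```

Using `bisect_right` makes each edge belong to the bin above it, so a value of exactly 18 falls into "18-29" rather than "under 18".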
Best Practices#
Version Control - Keep configuration files in Git
Separate Configs - Use different configs for different surveys
Document Changes - Comment configuration changes in commits
Validate Data - Run ti config to verify configuration is loaded correctly
Test First - Test configuration with sample data before processing large datasets
Troubleshooting#
Configuration Not Found#
Ensure that config.toml exists in your current directory, or specify its path explicitly:
poetry run ti config --load_from /path/to/config
Invalid Categories#
Check that all categorical variables are properly defined in config.toml:
poetry run ti config --choices
Missing Questions#
Verify that all question IDs in the data match those defined in the configuration:
poetry run ti config --questions