titanite.core.security#
Privacy-safe handling of sensitive survey data.
This module provides utilities for protecting survey response confidentiality during aggregation and publication.
Module Contents#
Classes#
SecureDataHandler | Utilities for privacy-safe data operations.
API#
- class titanite.core.security.SecureDataHandler#
Utilities for privacy-safe data operations.
All methods are static — no instance state needed. Focuses on:
Safe data loading
Cell suppression (n < threshold removal)
Anonymization (sensitive column removal)
Examples
df = SecureDataHandler.load_sensitive_data("data.csv")
suppressed = SecureDataHandler.suppress_small_cells(df, threshold=5)
safe = SecureDataHandler.anonymize_for_publication(
    suppressed, sensitive_columns=["timestamp", "q15"]
)
- static load_sensitive_data(filepath: str | pathlib.Path) → pandas.DataFrame#
Load a CSV file safely (read-only, no side effects).
Parameters
filepath : str or Path
    Path to the CSV file to load.
Returns
pd.DataFrame
    Loaded data.
Raises
FileNotFoundError
    If the file does not exist.
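The documented contract (read a CSV, raise FileNotFoundError for a missing path) can be sketched as follows. This is an illustration only: it assumes the loader checks the path and then delegates to pandas.read_csv, and the name load_csv_readonly is hypothetical, not part of titanite.

```python
from pathlib import Path
import tempfile

import pandas as pd


def load_csv_readonly(filepath):
    """Hypothetical stand-in for load_sensitive_data (assumed behavior)."""
    path = Path(filepath)
    if not path.exists():
        # Mirror the documented FileNotFoundError for a missing file.
        raise FileNotFoundError(f"File not found: {path}")
    # read_csv only reads the file; nothing is written back to disk.
    return pd.read_csv(path)


# Demonstration with a throwaway CSV file.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "data.csv"
    csv_path.write_text("q1,count\nagree,12\ndisagree,3\n")
    df = load_csv_readonly(csv_path)
    try:
        load_csv_readonly(Path(tmp) / "missing.csv")
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True
```

Accepting both str and Path and normalizing with Path(...) up front keeps the existence check and the read on the same object.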
- static suppress_small_cells(data: pandas.DataFrame, threshold: int = 5, count_column: str = 'count') → pandas.DataFrame#
Apply cell suppression: remove rows where count < threshold.
Rows with very small counts can single out individual respondents in aggregated results, so removing them is a standard disclosure-control step before a statistical release.
Parameters
data : pd.DataFrame
    Aggregated (crosstab or grouped) DataFrame.
threshold : int, optional
    Minimum cell count to retain, by default 5.
count_column : str, optional
    Name of the column holding counts, by default "count".
Returns
pd.DataFrame
    Filtered DataFrame with small cells removed. If count_column is not found, returns data unchanged with a warning.
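The filtering rule described above can be sketched in a few lines of pandas. This is not titanite's implementation: suppress_small_cells_sketch is a hypothetical name, and the use of warnings.warn for the missing-column case is an assumption about how the warning is issued.

```python
import warnings

import pandas as pd


def suppress_small_cells_sketch(data, threshold=5, count_column="count"):
    """Hypothetical re-implementation of the documented behavior."""
    if count_column not in data.columns:
        # Documented fallback: warn and return the input unchanged.
        warnings.warn(f"Column {count_column!r} not found; data returned unchanged.")
        return data
    # Drop rows with count < threshold; a count equal to the threshold is kept.
    return data[data[count_column] >= threshold]


counts = pd.DataFrame(
    {"region": ["north", "south", "east"], "count": [12, 3, 5]}
)
kept = suppress_small_cells_sketch(counts, threshold=5)

# Missing count column: the input comes back as-is, with a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    unchanged = suppress_small_cells_sketch(counts, count_column="n")
```

With threshold=5, the "south" row (count 3) is dropped while "east" (count 5) survives, because the documented rule removes only counts strictly below the threshold.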
- static anonymize_for_publication(data: pandas.DataFrame, sensitive_columns: list[str]) → pandas.DataFrame#
Remove sensitive columns before publication.
Strips personally identifiable information and free-text responses that could compromise respondent confidentiality.
Parameters
data : pd.DataFrame
    DataFrame to anonymize.
sensitive_columns : list[str]
    Column names to remove (e.g., ["timestamp", "q15", "q16"]).
Returns
pd.DataFrame
    Copy of data with sensitive columns dropped (if they exist).
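A minimal sketch of the column-dropping behavior, assuming it is equivalent to pandas' DataFrame.drop with errors="ignore". The function name anonymize_sketch and the sample data are hypothetical and used only for illustration.

```python
import pandas as pd


def anonymize_sketch(data, sensitive_columns):
    """Hypothetical re-implementation of the documented behavior."""
    # drop returns a new DataFrame; errors="ignore" skips absent column names.
    return data.drop(columns=sensitive_columns, errors="ignore")


responses = pd.DataFrame(
    {"timestamp": ["2024-01-01"], "q1": ["agree"], "q15": ["free text"]}
)
# "q16" is not in the frame and is silently skipped.
public = anonymize_sketch(responses, ["timestamp", "q15", "q16"])
```

Here public keeps only the q1 column, and responses itself is left untouched, matching the documented "copy of data" return value.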