titanite.core.security

Privacy-safe handling of sensitive survey data.

This module provides utilities for protecting survey response confidentiality during aggregation and publication.

Module Contents

Classes

SecureDataHandler

Utilities for privacy-safe data operations.

API

class titanite.core.security.SecureDataHandler

Utilities for privacy-safe data operations.

All methods are static, so no instance state is needed. The class focuses on:

  • Safe data loading

  • Cell suppression (removal of rows with n < threshold)

  • Anonymization (sensitive column removal)

Examples

    from titanite.core.security import SecureDataHandler

    df = SecureDataHandler.load_sensitive_data("data.csv")
    suppressed = SecureDataHandler.suppress_small_cells(df, threshold=5)
    safe = SecureDataHandler.anonymize_for_publication(
        suppressed, sensitive_columns=["timestamp", "q15"]
    )

static load_sensitive_data(filepath: str | pathlib.Path) -> pandas.DataFrame

Load a CSV file safely (read-only, no side effects).

Parameters

filepath : str or Path
    Path to the CSV file to load.

Returns

pd.DataFrame
    Loaded data.

Raises

FileNotFoundError
    If the file does not exist.
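The documented contract (accept a str or Path, return a DataFrame, raise FileNotFoundError for a missing file) can be illustrated with a minimal sketch. The function load_csv_sketch below is a hypothetical stand-in written for this example. It assumes the loader wraps pandas.read_csv and is not the library's actual implementation.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csv_sketch(filepath):
    """Hypothetical stand-in; assumes a thin wrapper around pandas.read_csv."""
    path = Path(filepath)  # accepts both str and Path inputs
    if not path.exists():
        raise FileNotFoundError(f"File not found: {path}")
    return pd.read_csv(path)


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "responses.csv"
    csv_path.write_text("group,count\na,7\nb,2\n")
    loaded = load_csv_sketch(csv_path)

    # A missing file surfaces as FileNotFoundError, as documented.
    try:
        load_csv_sketch(Path(tmp) / "missing.csv")
    except FileNotFoundError:
        missing_handled = True
    else:
        missing_handled = False
```

Callers that cannot guarantee the file exists may want to catch FileNotFoundError explicitly, as the sketch does.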

static suppress_small_cells(data: pandas.DataFrame, threshold: int = 5, count_column: str = 'count') -> pandas.DataFrame

Apply cell suppression: remove rows where count < threshold.

Used to prevent individual identification in aggregated results. This is essential for privacy protection in statistical releases.

Parameters

data : pd.DataFrame
    Aggregated (crosstab or grouped) DataFrame.
threshold : int, optional
    Minimum cell count to retain, by default 5.
count_column : str, optional
    Name of the column holding counts, by default "count".

Returns

pd.DataFrame
    Filtered DataFrame with small cells removed. If count_column is not found, returns data unchanged with a warning.
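To make the suppression rule concrete, here is a minimal sketch of the documented behavior: rows with count < threshold are dropped, and a missing count column yields a warning and the unchanged input. The function suppress_small_cells_sketch and the sample data are hypothetical. The real method may differ in details such as the warning type or index handling.

```python
import warnings

import pandas as pd


def suppress_small_cells_sketch(data, threshold=5, count_column="count"):
    """Hypothetical stand-in mirroring the documented behavior."""
    if count_column not in data.columns:
        # Documented fallback: warn and return the input unchanged.
        warnings.warn(f"Column {count_column!r} not found; data returned unchanged.")
        return data
    # Keep rows whose count is at least the threshold (drops count < threshold).
    return data[data[count_column] >= threshold]


counts = pd.DataFrame(
    {"region": ["north", "south", "east"], "count": [12, 3, 5]}
)
kept = suppress_small_cells_sketch(counts, threshold=5)
print(kept["region"].tolist())  # ['north', 'east']
```

Note that a cell whose count equals the threshold is retained, since only counts strictly below the threshold are removed.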

static anonymize_for_publication(data: pandas.DataFrame, sensitive_columns: list[str]) -> pandas.DataFrame

Remove sensitive columns before publication.

Strips personally identifiable information and free-text responses that could compromise respondent confidentiality.

Parameters

data : pd.DataFrame
    DataFrame to anonymize.
sensitive_columns : list[str]
    Column names to remove (e.g., ["timestamp", "q15", "q16"]).

Returns

pd.DataFrame
    Copy of data with sensitive columns dropped (if they exist).
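The "if they exist" behavior can be sketched with pandas alone. The function anonymize_sketch and the sample responses below are hypothetical. The sketch assumes the method behaves like DataFrame.drop with errors="ignore", which the source does not confirm.

```python
import pandas as pd


def anonymize_sketch(data, sensitive_columns):
    """Hypothetical stand-in; assumes drop-with-ignore semantics."""
    # errors="ignore" skips listed names that are absent from the frame,
    # and drop returns a new DataFrame, leaving the input untouched.
    return data.drop(columns=sensitive_columns, errors="ignore")


responses = pd.DataFrame(
    {
        "timestamp": ["2024-01-01", "2024-01-02"],
        "q1": ["agree", "disagree"],
        "q15": ["free text A", "free text B"],
    }
)
# "q16" is not a column here, so it is skipped without raising.
public = anonymize_sketch(responses, ["timestamp", "q15", "q16"])
print(list(public.columns))  # ['q1']
```

Because missing names are skipped silently under this assumption, a typo in sensitive_columns would leave the intended column in place, so checking the resulting column list before publication is a reasonable precaution.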