Cho et al. ‘describe a protocol for large-scale genome-wide analysis that facilitates quality control and population stratification correction […] while maintaining the confidentiality of underlying genotypes and phenotypes. […] This approach may help to make currently restricted data available to the scientific community and could potentially enable secure genome crowdsourcing, allowing individuals to contribute their genomes to a study without compromising their privacy.’
In this preprint, Bonte et al. describe both a homomorphic encryption approach and a secure multiparty computation approach and provide efficient implementations.
Jagadeesh et al. encode an individual’s functional variants as a binary vector. They then use Yao’s protocol, a form of secure multiparty computation, to identify relevant coincidences between pools of such vectors.
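To make the binary-vector encoding concrete, here is a minimal plaintext sketch of the kind of function such a protocol could evaluate. It is an illustration under assumptions, not the authors' implementation: the catalogue entries and the helper names `encode` and `shared_variants` are hypothetical, and the computation runs in the clear, whereas a real deployment would evaluate it inside Yao's garbled circuits so that no party sees another's vector.

```python
# Hypothetical variant catalogue; position i of a vector stands for CATALOGUE[i].
CATALOGUE = ["GENE_A:v1", "GENE_A:v2", "GENE_B:v1", "GENE_C:v1"]
INDEX = {name: i for i, name in enumerate(CATALOGUE)}


def encode(variants):
    """Return a 0/1 list with a 1 at every catalogue position the individual carries."""
    vec = [0] * len(CATALOGUE)
    for v in variants:
        vec[INDEX[v]] = 1
    return vec


def shared_variants(vectors):
    """Component-wise AND: the variants carried by every individual in the pool."""
    common = [int(all(bits)) for bits in zip(*vectors)]
    return [CATALOGUE[i] for i, bit in enumerate(common) if bit]


patients = [
    encode(["GENE_A:v1", "GENE_B:v1"]),
    encode(["GENE_B:v1", "GENE_C:v1"]),
    encode(["GENE_A:v2", "GENE_B:v1"]),
]
print(shared_variants(patients))  # ['GENE_B:v1']
```

Only the final intersection is meant to be revealed; the individual vectors are the private inputs.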
Çetin et al. present ‘a novel string matching protocol to enable privacy-preserving queries on homomorphically encrypted data. [Their] protocol combines state-of-the-art techniques from homomorphic encryption and private set intersection protocols to minimize the computational and communication cost.’
The authors ‘propose a novel approach that combines efficient string data structures such as the Burrows–Wheeler transform with cryptographic techniques based on additive homomorphic encryption. [They] assume that the sequence data is searchable in efficient iterative query operations over a large indexed dictionary, for instance, from large genome collections and employing the (positional) Burrows–Wheeler transform. [They] use a technique called oblivious transfer that is based on additive homomorphic encryption to conceal the sequence query and the genomic region of interest in positional queries.’
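The idea of oblivious transfer built on additive homomorphic encryption can be sketched as follows. This is a toy illustration, not the authors' protocol: it uses a textbook Paillier scheme with small fixed primes (insecure, chosen only so the example runs instantly), assumes a semi-honest client, and all function names are hypothetical. The client sends an encrypted one-hot selection vector; the server combines it with its table without learning which entry was selected.

```python
import math
import secrets

# Toy parameters (assumption): two Mersenne primes, far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1


def keygen(p=P, q=Q):
    """Textbook Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Enc(m) = (1 + m*n) * r^n mod n^2 for a random r."""
    r = secrets.randbelow(n - 1) + 1
    return (1 + m * n) * pow(r, n, n * n) % (n * n)


def decrypt(n, priv, c):
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n


def server_answer(n, query, table):
    """Homomorphically compute Enc(sum_i b_i * table[i]) from the encrypted bits b_i."""
    acc = 1
    for c, value in zip(query, table):
        acc = acc * pow(c, value, n * n) % (n * n)
    return acc


def select_item(table, index):
    """Client side: fetch table[index] without revealing index to the server."""
    n, priv = keygen()
    query = [encrypt(n, 1 if i == index else 0) for i in range(len(table))]
    return decrypt(n, priv, server_answer(n, query, table))


print(select_item([7, 11, 13, 17], 2))  # 13
```

The server only sees ciphertexts of the selection bits, and the client decrypts a single value. Communication here is linear in the table size; practical schemes reduce this cost.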
Lu et al. ‘propose encryption of all genotype and phenotype data. To allow the cloud to perform meaningful computation in relation to the encrypted data, [they] use a fully homomorphic encryption scheme. Noting that [they] can evaluate typical statistics for GWAS from a frequency table, [their] solution evaluates frequency tables with encrypted genomic and clinical data as input. [They] propose to use a packing technique for efficient evaluation of these frequency tables.’
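The remark that typical GWAS statistics follow from a frequency table can be illustrated in the clear. The sketch below is an assumption-laden example rather than the authors' method: it builds a 2×2 allele-count table (cases/controls by alternate/reference allele) and computes the standard Pearson chi-square statistic from it. In the proposed setting the counting step would run on encrypted inputs; the function names and the 0/1/2 genotype coding are choices made for this example.

```python
def allele_table(genotypes, phenotypes):
    """Build [[case_alt, case_ref], [ctrl_alt, ctrl_ref]].

    genotypes: alternate-allele counts per individual (0, 1 or 2).
    phenotypes: 1 for case, 0 for control.
    """
    table = [[0, 0], [0, 0]]
    for g, y in zip(genotypes, phenotypes):
        row = 0 if y == 1 else 1
        table[row][0] += g      # alternate alleles
        table[row][1] += 2 - g  # reference alleles
    return table


def chi_square(table):
    """Pearson chi-square for a 2x2 table: N(ad - bc)^2 / (row and column totals)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("degenerate table: a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


table = allele_table([2, 2, 1, 0, 0, 1], [1, 1, 1, 0, 0, 0])
print(table)              # [[5, 1], [1, 5]]
print(chi_square(table))  # 16/3, about 5.33
```

Because the statistic depends only on the four counts, a server needs to produce just the (encrypted) table; the final arithmetic can be done after decryption.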
Truncated hash values of haplotype segments are used as privately known ‘genome sketches’. These serve as fuzzy extractors to decode publicly known ‘secure genome sketches’ revealing information only between related individuals.
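The first step, deriving a sketch from truncated hashes of haplotype segments, might look roughly like the following. This is only a hedged illustration of hashing and truncation: the window length, the 16-bit truncation, the use of SHA-256 and the function names are all assumptions, and the fuzzy-extractor step that decodes the public ‘secure genome sketches’ is not shown.

```python
import hashlib


def genome_sketch(haplotype, segment_len=8, bits=16):
    """Hash each fixed-length segment and keep only the top `bits` bits."""
    sketch = []
    for start in range(0, len(haplotype) - segment_len + 1, segment_len):
        segment = haplotype[start:start + segment_len]
        # Include the position so identical alleles at different loci hash differently.
        digest = hashlib.sha256(f"{start}:{segment}".encode()).digest()
        sketch.append(int.from_bytes(digest[:4], "big") >> (32 - bits))
    return sketch


def matching_segments(sketch_a, sketch_b):
    """Count positions where two sketches agree."""
    return sum(x == y for x, y in zip(sketch_a, sketch_b))


alice = "01101001" + "11110000" + "00110011"
cousin = "10010110" + "11110000" + "01010101"  # shares the middle segment with alice

print(matching_segments(genome_sketch(alice), genome_sketch(cousin)))
```

Shared segments always produce equal sketch entries, while unrelated segments agree only through rare truncation collisions, which is what lets relatives recognise overlap.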