Researchers Introduce MaskAnyone Toolkit to Minimize Privacy Risks


Audio-visual data offers invaluable insights into human behavior and communication but raises significant privacy concerns. To address this, researchers have proposed MaskAnyone, a toolkit for de-identifying individuals in audio-visual data while preserving the data's utility.

By combining face-swapping, auditory masking, and real-time bulk processing, MaskAnyone addresses the ethical and legal challenges associated with human subject data.


The toolkit is designed to be scalable, customizable, and user-friendly, promoting ethical data sharing in the social and behavioral sciences. Video de-identification involves concealing or replacing individuals to protect privacy while preserving video utility.

Hiding techniques, including blurring, pixelation, and inpainting, obscure identities but can discard useful behavioral information. Masking, conversely, aims to retain essential attributes by replacing faces or generating digital avatars.
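As an illustration of the "hiding" family of techniques, the sketch below pixelates a frame by averaging over fixed-size tiles using only NumPy. This is a generic de-identification trick, not MaskAnyone's actual implementation; the function name and block size are choices made for this example.

```python
import numpy as np

def pixelate(frame: np.ndarray, block: int = 16) -> np.ndarray:
    """Pixelate a frame by averaging over block x block tiles.

    Coarse tiles destroy fine facial detail while keeping the
    rough scene layout -- a simple 'hiding' de-identification.
    """
    h, w = frame.shape[:2]
    # Crop to a multiple of the block size for clean tiling.
    h2, w2 = h - h % block, w - w % block
    out = frame[:h2, :w2].copy()
    # Split into (rows of tiles, tile height, cols of tiles, tile width, channels).
    tiles = out.reshape(h2 // block, block, w2 // block, block, -1)
    means = tiles.mean(axis=(1, 3), keepdims=True)
    # Fill each tile with its mean color.
    out = np.broadcast_to(means, tiles.shape).reshape(h2, w2, -1)
    return out.astype(frame.dtype)

frame = np.arange(32 * 32 * 3, dtype=np.uint8).reshape(32, 32, 3)
masked = pixelate(frame, block=8)
```

The larger the block size, the stronger the identity suppression and the greater the loss of detail, which is exactly the privacy-utility trade-off the article describes.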


Landmark detection models like MediaPipe’s BlazePose are crucial for accurate person localization and pose estimation, enabling techniques like face swapping. 

Despite these advances, balancing privacy, utility, and real-time performance remains challenging.

Voice data inherently contains personally identifiable information, necessitating techniques to obscure it while preserving linguistic and prosodic elements. 

Spectral Modification, Pitch Shifting, and Voice Conversion are common methods, each with trade-offs between privacy and utility.
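To make the pitch-shifting trade-off concrete, here is a deliberately naive NumPy sketch that shifts pitch by resampling. It is not MaskAnyone's voice-masking code: plain resampling also changes duration, which is why production pipelines use phase vocoders or neural voice conversion instead.

```python
import numpy as np

def naive_pitch_shift(signal: np.ndarray, semitones: float) -> np.ndarray:
    """Shift pitch by resampling with linear interpolation.

    Caveat: resampling alters duration along with pitch; real
    voice de-identification systems decouple the two.
    """
    factor = 2 ** (semitones / 12.0)            # frequency ratio per semitone
    n_out = int(len(signal) / factor)           # shorter signal -> higher pitch
    old_idx = np.arange(len(signal))
    new_idx = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(new_idx, old_idx, signal)

sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 220.0 * t)            # one second of a 220 Hz tone
shifted = naive_pitch_shift(tone, semitones=4)  # raise by a major third
```

Shifting by a few semitones obscures speaker identity to a degree while keeping the words intelligible, but prosody and timing suffer, which is the utility cost the article refers to.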

Voice-privacy challenges have highlighted these trade-offs, with participating systems employing models such as x-vectors, Gaussian mixture models, and deep neural networks.

Despite advancements, a standardized benchmark for evaluating utility preservation and privacy assurance in voice de-identification remains absent. 

MaskAnyone Architecture

MaskAnyone is a modular, extensible toolkit, designed following Design Science Research principles, for ethical, robust, and effective audio-visual data management and sharing.

It incorporates sophisticated features like 3D tracking and real-time processing in response to user feedback from researchers and data stewards. 

The toolkit offers diverse masking methods to balance open science and privacy, protecting against data breaches while maintaining data integrity. This aligns with FAIR principles and supports the development of thematic digital competence centers.

Multimodal Masking Process

MaskAnyone offers comprehensive video masking capabilities, employing YOLOv8 and MediaPipe for person detection. Hiding strategies include blackout, blurring, contouring, and inpainting to de-identify individuals.
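The blackout strategy is the simplest of these to sketch. Assuming bounding boxes in `(x1, y1, x2, y2)` pixel coordinates, as a detector such as YOLOv8 would produce (detection itself is out of scope here, and this is an illustrative sketch rather than the toolkit's code):

```python
import numpy as np

def blackout(frame: np.ndarray, boxes) -> np.ndarray:
    """Black out detected person regions of a frame.

    `boxes` holds (x1, y1, x2, y2) pixel coordinates, e.g. from
    an object detector's person class.
    """
    out = frame.copy()
    for x1, y1, x2, y2 in boxes:
        out[y1:y2, x1:x2] = 0   # zero every channel inside the box
    return out

# A uniform gray test frame with one "detected person" region.
frame = np.full((48, 64, 3), 200, dtype=np.uint8)
masked = blackout(frame, [(10, 5, 30, 40)])
```

Blurring and contouring replace the `= 0` assignment with a blur kernel or a silhouette fill over the same detected region; the detection step stays identical.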

Masking strategies preserve information through skeletonization, face mesh extraction, holistic landmark detection, face swapping, avatar generation, and blendshapes. 

Voice masking options include preservation, removal, and conversion for audio privacy control, while the toolkit provides flexibility in balancing privacy and utility through various techniques and parameters. 

According to the paper, a preliminary evaluation framework for video masking tools was developed to balance privacy preservation and utility retention. 

Automated evaluation metrics, including mean average precision (mAP) for object detection and re-identification precision for masking, were employed. Agreement between emotion classifications on original and masked footage served as a utility proxy.
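Re-identification precision can be illustrated as the fraction of masked clips that a face-recognition model still assigns to the correct identity, so lower is better for privacy. This is an illustrative definition of the metric, not the paper's evaluation code; the identity labels are invented for the example.

```python
def reid_precision(predicted_ids, true_ids):
    """Fraction of masked clips a recognizer still matches to the
    correct identity; lower values indicate stronger masking."""
    hits = sum(p == t for p, t in zip(predicted_ids, true_ids))
    return hits / len(true_ids)

# Recognizer guesses on masked footage vs. ground-truth identities.
pred = ["alice", "bob", "carol", "bob"]
truth = ["alice", "dave", "carol", "erin"]
score = reid_precision(pred, truth)   # 2 of 4 clips re-identified
```

A masking method that drives this score toward chance level while keeping emotion-classification agreement high would sit at the favorable end of the privacy-utility trade-off the evaluation probes.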

Human evaluation underscored the importance of usability and the trade-off between privacy and utility.

Face-swapping significantly reduced re-identification while preserving emotional cues, but further research is needed to refine masking techniques and validate the findings.
