Neosync: Open-source data anonymization, synthetic data orchestration


Neosync is an open-source, developer-centric solution designed to anonymize PII, generate synthetic data, and synchronize environments for improved testing and debugging.

What you can do with Neosync

Safely test code with production data: Anonymize sensitive production data to safely use it locally, enhancing testing and developer workflows.

Reproduce production bugs: Create safe, representative data subsets by anonymizing and subsetting production data to reproduce and resolve bugs in local environments.

Enhance lower-level environments with high-quality data: Hydrate staging and QA environments with production-like data to catch issues before they reach production.

Ensure compliance with regulations: Use anonymized and synthetic data to reduce compliance risks and simplify adherence to laws like GDPR, HIPAA, DPDP, and FERPA.

Seed development databases: Generate synthetic data to populate development databases for unit testing, demos, and other use cases.
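To make the subsetting use case above more concrete, here is a minimal sketch of a subset driven by a plain SQL filter. It is not Neosync's implementation: the `customers` and `orders` tables, the `region` filter, and the in-memory SQLite database are all assumptions chosen for illustration. The idea is to select parent rows with a query, then pull only the child rows that reference them, so the subset has no dangling foreign keys.

```python
import sqlite3

# Hypothetical source database standing in for production (illustration only).
src = sqlite3.connect(":memory:")
src.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    total REAL
);
INSERT INTO customers VALUES (1, 'eu'), (2, 'us'), (3, 'eu');
INSERT INTO orders VALUES (10, 1, 9.5), (11, 2, 20.0), (12, 3, 5.0), (13, 2, 7.25);
""")

# The subset is defined by an ordinary SQL filter on the parent table.
subset_customers = src.execute(
    "SELECT id, region FROM customers WHERE region = ? ORDER BY id", ("eu",)
).fetchall()

# Child rows are pulled only for the selected parents, so no foreign key dangles.
ids = [row[0] for row in subset_customers]
placeholders = ",".join("?" for _ in ids)
subset_orders = src.execute(
    "SELECT id, customer_id, total FROM orders "
    f"WHERE customer_id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()

print(subset_customers)
print(subset_orders)
```

A real subsetting tool would also follow foreign keys across many tables and handle circular references; this sketch covers a single parent-child pair only.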

Key features

  • Generate synthetic data based on your schema
  • Anonymize existing production data for a better developer experience
  • Subset your production database for local and CI testing using any SQL query
  • Async pipeline that automatically handles job retries, failures, and playback using an event-sourcing model
  • Referential integrity preserved automatically across your data
  • Declarative, GitOps-based configs that run as a step in your CI pipeline to hydrate your CI database
  • Pre-built data transformers for all major data types
  • Custom data transformers using JavaScript or LLMs
  • Pre-built integrations with Postgres, MySQL, and S3
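As a rough illustration of how anonymization and referential integrity can coexist, the sketch below uses keyed, deterministic pseudonymization: the same input always maps to the same replacement, so values that join two tables still match after masking. This is one common technique, not a description of Neosync's transformers. The key, the helper names (`pseudonymize`, `anonymize_email`), and the sample rows are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key for the demo; a real key would come from a secret store.
SECRET_KEY = b"demo-only-key"


def pseudonymize(value: str, prefix: str = "user") -> str:
    """Map a value to a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:12]}"


def anonymize_email(email: str) -> str:
    """Replace an email with a deterministic fake address on a reserved domain."""
    return pseudonymize(email.lower()) + "@example.com"


# Two hypothetical tables that reference each other through the email column.
users = [
    {"id": 1, "email": "alice@corp.com"},
    {"id": 2, "email": "bob@corp.com"},
]
orders = [
    {"order_id": 10, "customer_email": "alice@corp.com"},
    {"order_id": 11, "customer_email": "Alice@corp.com"},
]

anon_users = [{**u, "email": anonymize_email(u["email"])} for u in users]
anon_orders = [
    {**o, "customer_email": anonymize_email(o["customer_email"])} for o in orders
]

# Both orders still join to the first user after masking.
for order in anon_orders:
    print(order["order_id"], order["customer_email"] == anon_users[0]["email"])
```

Because the mapping is deterministic, anyone holding the key can re-derive a pseudonym from a known input, so the key needs the same protection as the original data.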

Neosync is available for free download on GitHub.
