Showing 60 open source projects for "deduplication"

  • 1
    ipwb

    A distributed and persistent archive replay system using IPFS

    InterPlanetary Wayback (ipwb) facilitates permanence and collaboration in web archives by disseminating the contents of WARC files into the IPFS network. IPFS is a peer-to-peer content-addressable file system that inherently allows deduplication and facilitates opt-in replication. To take advantage of that deduplication, ipwb splits the header and payload of WARC response records before disseminating them into IPFS, builds a CDXJ index with references to the IPFS hashes returned, and combines the header and payload from IPFS at the time of replay. An important aspect of archival replay systems is rewriting resource references for proper memento reconstruction, so that they are dereferenced from the archive at around the same datetime as the root memento and not from the live site (where the resource might have changed or gone missing). ...
    Downloads: 0 This Week
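    The split-and-index idea described above can be sketched as follows. This is a minimal illustration, not ipwb's code: `ContentStore` is a hypothetical in-memory stand-in for a content-addressable store (it does not call IPFS), SHA-256 hex digests stand in for IPFS hashes, and the index is a plain list of tuples rather than real CDXJ. The helper names `archive` and `replay` are assumptions made for the example.

```python
import hashlib
import json


class ContentStore:
    """Hypothetical in-memory stand-in for a content-addressable store."""

    def __init__(self):
        self.blocks = {}

    def add(self, data: bytes) -> str:
        # The key is derived from the content, so identical data is stored once.
        key = hashlib.sha256(data).hexdigest()
        self.blocks[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self.blocks[key]


def archive(store, index, uri, timestamp, header: bytes, payload: bytes):
    """Store header and payload separately and record both references."""
    entry = {"header": store.add(header), "payload": store.add(payload)}
    index.append((uri, timestamp, json.dumps(entry)))


def replay(store, index, uri, timestamp):
    """Look up a capture and recombine its header and payload."""
    for u, ts, entry in index:
        if (u, ts) == (uri, timestamp):
            refs = json.loads(entry)
            return store.get(refs["header"]) + store.get(refs["payload"])
    raise KeyError((uri, timestamp))
```

    Under these assumptions, two captures of the same page whose headers differ only in the `Date` field share one stored payload: the store ends up holding two header blocks and a single payload block.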
  • 2
    Note67

    A private, local meeting notes assistant

    ...Users can record meetings directly from their microphone, view live transcriptions, filter by speaker, and export structured summaries, making it useful for professionals who need searchable, organized records of discussions. It also features thoughtful signal processing such as voice activity detection and echo deduplication to improve transcription accuracy, and provides standard note-taking features.
    Downloads: 4 This Week
  • 3
    Kopia

    Cross-platform backup tool for Windows, macOS and Linux

    ...Kopia has both a command-line interface for automation and scripting as well as a graphical UI for interactive use, making it suitable for advanced users and those who prefer visual tools. Its architecture supports end-to-end encryption, optional compression, and deduplication, so multiple backups can share data efficiently, and repositories can be stored securely even in untrusted locations.
    Downloads: 2 This Week
  • 4
    wagmi

    React Hooks for Ethereum

    wagmi is a collection of React Hooks containing everything you need to start working with Ethereum. wagmi makes it easy to "Connect Wallet," display ENS and balance information, sign messages, interact with contracts, and much more, all with caching, request deduplication, and persistence. We create a wagmi Client and pass it to the WagmiConfig React Context. The client is set up to use the ethers Default Provider and automatically connect to previously connected wallets. Next, we use the useConnect hook to connect an injected wallet (e.g. MetaMask) to the app. Finally, we show the connected account's address with useAccount and allow them to disconnect with useDisconnect. ...
    Downloads: 1 This Week
  • 5
    Bareos

    Bareos is a cross-network Open Source backup solution

    Bareos (Backup Archiving Recovery Open Sourced) is an enterprise-grade open-source backup solution forked from Bacula. It offers robust backup, restore, and archiving features for Linux, Windows, and macOS systems. Bareos supports encrypted and deduplicated backups across networks, making it ideal for managing large infrastructure backups.
    Downloads: 10 This Week
  • 6
    DOLMA

    Data and tools for generating and inspecting OLMo pre-training data

    DOLMA is a toolkit for generating and inspecting large-scale pre-training datasets for language models such as OLMo.
    Downloads: 0 This Week
  • 7
    ClusterFuzz

    Scalable fuzzing infrastructure

    ClusterFuzz is a scalable fuzzing infrastructure that finds security and stability issues in software. Google uses ClusterFuzz to fuzz all Google products and as the fuzzing backend for OSS-Fuzz. ClusterFuzz provides many features that help integrate fuzzing seamlessly into a software project's development process. It can run on a cluster of any size (e.g. the OSS-Fuzz instance runs on 100,000 VMs) and offers fully automatic bug filing, triage, and closing for various issue trackers (e.g. Monorail, Jira)....
    Downloads: 5 This Week
  • 8
    OpenArchiver

    An open-source platform for legally compliant email archiving

    ...It’s designed for scenarios where reliable, tamper-proof archiving and full-text search across both emails and attachments are essential for legal discovery, compliance, or long-term records retention. The platform combines a modern web UI with powerful backend services, including fast indexing, deduplication, encryption at rest, and asynchronous ingestion workflows, making it suitable for both small teams and enterprise deployments. Beyond simply capturing email, it emphasizes security and auditability with features like secure storage formats, file integrity verification, and detailed audit trails of user interactions.
    Downloads: 7 This Week
  • 9
    SWR

    React Hooks library for remote data fetching

    The name “SWR” is derived from stale-while-revalidate, an HTTP cache invalidation strategy popularized by HTTP RFC 5861. SWR is a strategy to first return the data from cache (stale), then send the fetch request (revalidate), and finally deliver the up-to-date data. With SWR, components get a stream of data updates constantly and automatically, and the UI will always be fast and reactive. With just one single line of code, you can simplify the logic of data fetching in your project,...
    Downloads: 4 This Week
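    The stale-then-revalidate order described above can be sketched in a few lines. This is a language-agnostic illustration written in Python, not the SWR library's API: the class name `SWRCache`, its `get` method, and the `on_update(data, is_stale)` callback are all hypothetical, and revalidation here is synchronous, whereas a real client would typically fetch asynchronously.

```python
class SWRCache:
    """Hypothetical sketch of the stale-while-revalidate order of operations."""

    def __init__(self, fetcher):
        self._fetcher = fetcher  # function key -> fresh data (assumed by this sketch)
        self._cache = {}

    def get(self, key, on_update):
        # 1. Stale: deliver cached data immediately, if there is any.
        if key in self._cache:
            on_update(self._cache[key], True)
        # 2. Revalidate: fetch the current data (synchronously in this sketch).
        fresh = self._fetcher(key)
        self._cache[key] = fresh
        # 3. Deliver the up-to-date data.
        on_update(fresh, False)
```

    On the first request for a key there is nothing cached, so the callback fires once with fresh data. On later requests it fires twice: first with the cached value flagged as stale, then with the newly fetched value.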
  • 10
    syzkaller

    syzkaller is an unsupervised coverage-guided kernel fuzzer

    syzkaller is Google’s coverage-guided, feedback-driven kernel fuzzer designed to uncover reliability and security bugs in operating system kernels at scale. It automatically generates, mutates, and minimizes system call programs, then drives them through a specialized executor (syz-executor) to exercise deep kernel paths. The system integrates tightly with sanitizers such as KASAN, KMSAN, KCSAN, and UBSAN to surface memory safety, concurrency, and undefined behavior issues with actionable...
    Downloads: 3 This Week
  • 11
    OpenMeter

    Metering and Billing for AI, API and DevOps

    OpenMeter is an open-source metering and billing platform designed to collect, aggregate, and analyze usage events from APIs, cloud infrastructure, and software services in real time, enabling flexible usage-based billing for SaaS, AI, and DevOps offerings. It supports high-scale event ingestion and deduplication to accurately record how customers consume billable resources such as API calls, compute time, or storage, and then correlates that usage with payment systems and billing plans to automate invoicing and revenue recognition. The system includes metering, storage, cataloging of products and pricing rules, and tools to enforce limits or quotas, supporting both self-service customer portals and internal dashboards. ...
    Downloads: 0 This Week
  • 12
    Pachyderm

    Data-Centric Pipelines and Data Versioning

    ...Automatic immutable data lineage and data versioning of all data types. Autoscaling and parallel processing built on Kubernetes for resource orchestration. Uses standard object stores for data storage with automatic deduplication. Runs across all major cloud providers and on-premises installations. Automatic and intelligent versioning of even the largest data sets of unstructured and structured data. Git-like structure enables effective team collaboration. Full versioning for metadata including all analysis, parameters, artifacts, models, and intermediate results. ...
    Downloads: 0 This Week
  • 13
    Nuke

    Image loading system

    Nuke ILS provides an efficient way to download and display images in your app. It's easy to learn and use thanks to a clear and concise API. Its architecture enables many powerful features while offering virtually unlimited possibilities for customization. Despite the number of features, the framework is lean and compiles in just under 3 seconds. Nuke has an automated test suite 2x the size of the codebase itself, ensuring excellent reliability. Every feature is carefully designed and...
    Downloads: 1 This Week
  • 14
    NeMo Curator

    Scalable data preprocessing and curation toolkit for LLMs

    NeMo Curator is a Python library specifically designed for fast and scalable dataset preparation and curation for large language model (LLM) use-cases such as foundation model pretraining, domain-adaptive pretraining (DAPT), supervised fine-tuning (SFT) and parameter-efficient fine-tuning (PEFT). It greatly accelerates data curation by leveraging GPUs with Dask and RAPIDS, resulting in significant time savings. The library provides a customizable and modular interface, simplifying pipeline...
    Downloads: 0 This Week
  • 15
    BmuS

    BmuS - Powerful Linux backup program with deduplication, encryption & more

    ...Visit the Quick Start Guide and FAQ to learn how to install Docker and BmuS on macOS or Windows. https://www.youtube.com/watch?v=ksfYJlpqfCw BmuS features encryption, deduplication, cloud backups and much more. One of the key features that has received special attention (or is it called “Love”?) is the dashboard ( https://www.back-me-up-scotty.com/dashboards/bmus_dashboard.html ), which is probably the most distinctive feature of BmuS, apart from the fact that only a few backup tools can back up both files and MySQL databases.
    Downloads: 5 This Week
  • 16
    zpaqfranz

    Zpaq compatible archiver for Win, Linux, Free/OpenBSD, Solaris & MacOS

    The Ultimate Archiver: zpaq fork with deduplication & versioning. Forget pruning! Get forever storage of your files with thousands of versions. Far more efficient than Time Machine or ZFS snapshots, it is perfect for VM backups and permanent archiving, effortlessly handling TBs and millions of files. Optimized for cloud/NAS/USB with ultra-low bandwidth, military-grade encryption, and 1GB/s+ speeds on modern hardware.
    Downloads: 24 This Week
  • 17
    Blazefox

    Blazefox: Fast, safe file deduplication and management tool for Linux

    Blazefox is a powerful and efficient file deduplication and management tool designed for both everyday users and system administrators. It scans directories recursively, identifies duplicate files using advanced content hashing, and provides safe options to remove, move, or rename duplicates. Blazefox supports flexible filtering with regex/glob patterns, interactive and dry-run modes, and advanced conflict resolution strategies, ensuring no data is lost accidentally.
    Downloads: 0 This Week
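    Duplicate detection by content hashing, as described above, can be sketched as follows. This is a minimal illustration and not Blazefox's implementation: the function names `file_digest` and `find_duplicates`, the choice of SHA-256, and the size pre-filter are all assumptions made for the example. It only reports duplicate groups and never deletes anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Recursively group files under root by content; return groups of 2+ paths."""
    # Cheap first pass: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

    Returning groups instead of acting on them mirrors the dry-run idea: the caller can inspect each group before choosing to remove, move, or rename anything.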
  • 18
    KemonoDownloader

    Kemono Downloader - A cross-platform Python app built with PyQt6

    Welcome to Kemono Downloader, a versatile Python-based desktop application built with PyQt6, designed to download content from Kemono.su. This tool enables users to archive individual posts or entire creator profiles from services like Patreon, Fanbox, and more, supporting a wide range of file types with customizable settings and advanced features.
    Downloads: 192 This Week
  • 19
    Kindle Mate(KMate)

    Kindle clippings and Kindle Vocabulary Builder manager

    KMate (formerly Kindle Mate only for Windows) is a newly redesigned mate for Kindle users to manage notes, vocabulary, and knowledge—your ideal companion for deep reading and language learning with Kindle, helping you rediscover the value of your notes and vocabulary. Its legacy version (1.38), which I made nearly 10 years ago, has long been a classic global tool for Kindle users to organize notes, learn vocabulary, and manage knowledge. ## KMate 1.38 for Windows Full Installer(legacy...
    Downloads: 32 This Week
  • 20
    ByteSync

    All-in-One Sync, Backup & Deduplication Tool

    ByteSync is a powerful open-source tool for remote file comparison and synchronization, designed to transfer only what truly matters: the differences. Whether you're syncing large files, backing up remote folders, or handling extensive sets of small files, ByteSync delivers speed, security, and efficiency—without unnecessary data transfers. Compatible with Windows, macOS, and Linux, ByteSync leverages cloud-based orchestration with end-to-end encryption, eliminating the need for VPNs or...
    Downloads: 1 This Week
  • 21
    XSIBackup-App

    Back up and replicate Linux files, databases and ©VMWare ©ESXi VMs

    Free virtual appliance to back up and replicate Linux servers and ©VMWare ©ESXi virtual machines from version 5.1 to 8.0. https://33hops.com/xsibackup-app-detailed-installation-instructions.html ©XSIBackup-App connects to multiple Linux or ©ESXi servers and backs up VMs and files to local disk, NFS, iSCSI, Samba, etc., or to any Linux or ©ESXi server over IP; it only needs the SSH port open. This appliance is based on CentOS 7 and has root access. You can install any available...
    Downloads: 6 This Week
  • 22
    text-dedup

    All-in-one text de-duplication

    ...It supports Jaccard similarity thresholding, parallel execution, and flexible deduplication strategies, making it ideal for cleaning web-scraped data, language model training datasets, or document archives.
    Downloads: 2 This Week
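    Jaccard similarity thresholding, mentioned above, can be illustrated with a small sketch. This is not text-dedup's API or algorithm: the function names `shingles`, `jaccard`, and `dedup`, the word 3-gram shingling, and the 0.5 threshold are assumptions chosen for the example, and the pairwise comparison is quadratic, so it only suits small inputs.

```python
def shingles(text, n=3):
    """Return the set of word n-grams in text (lowercased)."""
    words = text.lower().split()
    if len(words) < n:
        return {tuple(words)}
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dedup(docs, threshold=0.5, n=3):
    """Keep a document only if it is below the threshold against every kept one."""
    kept, kept_shingles = [], []
    for doc in docs:
        s = shingles(doc, n)
        if all(jaccard(s, t) < threshold for t in kept_shingles):
            kept.append(doc)
            kept_shingles.append(s)
    return kept
```

    For example, "the quick brown fox jumps over the lazy dog" and the same sentence ending in "cat" share 6 of 8 distinct word 3-grams, a similarity of 0.75, so with a threshold of 0.5 only the first of the two is kept.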
  • 23

    onlyjobs-desktop

    Private AI job tracker with Gmail sync that runs on your local machine

    ...Using a powerful 3B parameter AI model that runs entirely on your Mac, it intelligently classifies emails into job applications, interviews, offers, and rejections without sending data to any servers. Key features include one-click Gmail sync supporting multiple accounts, smart deduplication that merges related job emails, and instant classification of company names, positions, and application status. View original emails with a single click to verify classifications. Built with Electron and React for a smooth native experience, OnlyJobs Desktop prioritizes privacy - all AI processing happens locally on your device. ...
    Downloads: 0 This Week
  • 24
    Underscore Backup

    Private, secure backups in the cloud.

    Private, secure cloud backups of any size, with minimal resource usage, exactly the way you want them. Your data is encrypted by default, before it leaves your network, using encryption keys available only to you. Backed up data is signed to ensure protection from tampering. Data can be stored with any amount of redundancy and even in multiple backup locations simultaneously. No information about the contents of your data is saved in any backup location. Only you can download and...
    Downloads: 4 This Week
  • 25
    orogene

    Makes `node_modules/` happen. Fast. No fuss

    Orogene is a next-generation package manager designed for Node.js environments, focusing on speed, efficiency, and seamless integration with tools that utilize node_modules/, such as bundlers and CLI applications. It employs a central store for dependencies, deduplicates packages, and leverages copy-on-write techniques on supported filesystems to minimize disk usage and accelerate loading times. Orogene aims to provide a robust and user-friendly experience, ensuring that developers can...
    Downloads: 0 This Week