Showing 75 open source projects for "simd"

  • 1
    Simd Library

    C++ image processing and machine learning library using SIMD

    ...The algorithms are optimized using different SIMD CPU extensions. In particular, the library supports the following CPU extensions: SSE, AVX, AVX-512, and AMX for x86/x64, and NEON for ARM. The Simd Library has a C API and also contains useful C++ classes and functions to facilitate access to the C API. The library supports dynamic and static linking, 32-bit and 64-bit Windows and Linux, MSVS, G++ and Clang compilers, MSVS projects, and CMake build systems.
    Downloads: 1 This Week
  • 2
    Datalevin

    A simple, fast and versatile Datalog database

    Datalevin is an open-source Datalog-based database written in Clojure that runs natively on top of LMDB. It supports full ACID transactions, schema-less EDN data storage, vector search with SIMD acceleration, and text-based querying. Usable as an embedded library or as a client/server database with RBAC access, it acts like SQLite or Datomic on a ledger of immutable datoms, plus modern features like vector and full-text search.
    Downloads: 7 This Week
  • 3
    ispc

    Intel SPMD Program Compiler

    ...Under the SPMD model, the programmer writes a program that generally appears to be a regular serial program, though the execution model is actually that a number of program instances execute in parallel on the hardware. ispc compiles a C-based SPMD programming language to run on the SIMD units of CPUs and GPUs; it frequently provides a 3x or more speedup on architectures with 4-wide vector SSE units and 5x-6x on architectures with 8-wide AVX vector units, without any of the difficulty of writing intrinsics code. Parallelization across multiple cores is also supported by ispc, making it possible to write programs whose performance scales with both the number of cores and the vector unit size. ...
    Downloads: 4 This Week
  • 4
    LoopVectorization.jl

    Macro(s) for vectorizing loops

    LoopVectorization.jl is a Julia package for accelerating numerical loops by automatically applying SIMD (Single Instruction, Multiple Data) vectorization and other low-level optimizations. It analyzes loops and generates highly efficient code that leverages CPU vector instructions, making it ideal for performance-critical computing in fields such as scientific computing, signal processing, and machine learning.
    Downloads: 0 This Week
  • 5
    Ultralight

    Lightweight, high-performance HTML renderer for game developers

    ...Official API for C and C++, with bindings for more. Render web-content on the GPU via Direct3D, Metal, OpenGL, or your own engine for unmatched visual performance. Render web-content on the CPU via SIMD/parallel for incredibly easy integration with any environment (including server-side!). Ultralight is engineered for peak performance, ensuring minimal CPU and memory usage. Customize low-level platform functionality, integrate JavaScript directly with native code, dive deep into performance tuning, and more. Built for maximum portability, optimized for PCs, game consoles, TVs, and embedded systems.
    Downloads: 17 This Week
  • 6
    node-rs

    Node.js bindings for Rust crates

    When Node.js meets Rust. Bindings that expose Rust crates to Node.js using napi-rs.
    Downloads: 0 This Week
  • 7
    QSV

    Blazing-fast Data-Wrangling toolkit

    qsv is a fast, command-line CSV data toolkit written in Rust that extends the capabilities of xsv. It’s designed to make working with CSV files at scale easy and efficient, offering over 40 powerful subcommands for tasks like querying, sampling, splitting, deduplicating, and more. qsv is ideal for data engineers, analysts, and developers who need high-performance CSV manipulation on the command line.
    Downloads: 8 This Week
  • 8
    torchvision

    Datasets, transforms and models specific to Computer Vision

    The torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision. We recommend Anaconda as the Python package management system. Torchvision currently supports the following image backends: Pillow (default); Pillow-SIMD, a much faster drop-in replacement for Pillow with SIMD, which is used as the default if installed; accimage, which, if installed, can be activated by calling torchvision.set_image_backend('accimage'); libpng, which can be installed via conda (conda install libpng) or any of the package managers for Debian-based and RHEL-based Linux distributions; and libjpeg, which can be installed via conda (conda install jpeg) or any of the package managers for Debian-based and RHEL-based Linux distributions. ...
    Downloads: 2 This Week
  • 9
    Claude-Flow

    The leading agent orchestration platform for Claude

    ...The platform supports both quick swarm tasks and persistent multi-agent sessions known as hives, facilitating distributed AI collaboration with persistent contextual memory. At its core, Claude-Flow integrates Dynamic Agent Architecture (DAA) for self-organizing agent management, neural pattern recognition accelerated by WebAssembly SIMD, and a SQLite-based memory system for context retention and knowledge persistence across tasks. It automates development workflows via pre- and post-operation hooks, providing seamless coordination, code formatting, validation, and performance optimization.
    Downloads: 1 This Week
  • 10
    HighwayHash

    Fast strong hash functions: SipHash/HighwayHash

    HighwayHash is a fast, keyed hash function intended for scenarios where you need strong, DoS-resistant hashing without the full overhead of a general-purpose cryptographic hash. It’s designed to defeat hash-flooding attacks by mixing input with wide SIMD operations and a branch-free inner loop, so adversaries can’t cheaply craft many colliding keys. The implementation targets multiple CPU families with vectorized code paths while keeping a portable fallback, yielding high throughput across platforms. It exposes simple one-shot and streaming APIs, so you can hash short keys or long byte streams with the same function. ...
    Downloads: 0 This Week
  • 11
    Polars

    Dataframes powered by a multithreaded, vectorized query engine

    Polars is a high-performance, multi-language DataFrame library built in Rust using Apache Arrow. It delivers blazing-fast, vectorized, and parallel data manipulation with both eager and lazy execution, making it an excellent tool for data processing in Python, Rust, Node.js, R, and SQL contexts.
    Downloads: 2 This Week
  • 12
    Sonic JSON

    A blazingly fast JSON serializing & deserializing library

    A blazingly fast JSON serializing & deserializing library, accelerated by JIT (just-in-time compiling) and SIMD (single-instruction-multiple-data).
    Downloads: 0 This Week
  • 13
    Compute Library

    The Compute Library is a set of computer vision and machine learning functions

    The Compute Library is a set of computer vision and machine learning functions optimized for both Arm CPUs and GPUs using SIMD technologies. The library provides superior performance to other open-source alternatives and immediate support for new Arm® technologies, e.g. SVE2.
    Downloads: 7 This Week
  • 14
    ReverseDiff

    Reverse Mode Automatic Differentiation for Julia

    ReverseDiff is a fast and compilable tape-based reverse mode automatic differentiation (AD) library that implements methods to take gradients, Jacobians, Hessians, and higher-order derivatives of native Julia functions (or any callable object, really). While performance can vary depending on the functions you evaluate, the algorithms implemented by ReverseDiff generally outperform non-AD algorithms in both speed and accuracy.
    Downloads: 0 This Week
  • 15
    simdjson

    Parsing gigabytes of JSON per second

    JSON is everywhere on the Internet. Servers spend a *lot* of time parsing it. We need a fresh approach. The simdjson library uses commonly available SIMD instructions and microparallel algorithms to parse JSON 4x faster than RapidJSON and 25x faster than JSON for Modern C++. The simdjson library uses three-quarters fewer instructions than the state-of-the-art parser RapidJSON. To our knowledge, simdjson is the first fully-validating JSON parser to run at gigabytes per second (GB/s) on commodity processors. ...
    Downloads: 1 This Week
  • 16
    QuestDB

    An open source SQL database designed to process time series data

    ...These extensions make it simple to correlate data from multiple sources using relational and time series joins. QuestDB achieves high performance from a column-oriented storage model, massively-parallelized vector execution, SIMD instructions, and various low-latency techniques. The entire codebase was built from the ground up in Java and C++, with no dependencies, and is 100% free from garbage collection. We provide a live demo provisioned with the latest QuestDB release and sample datasets.
    Downloads: 2 This Week
  • 17
    StringZilla

    10x faster string search, split, sort, and shuffle for long strings

    ...It matches the first few letters of words with hyper-scalar code to achieve memcpy speeds. The implementation fits into a single C99 header file and uses different SIMD flavors and SWAR on older platforms. The Str class is designed to replace long Python str strings and wrap our C-level API. The File class, on the other hand, memory-maps a file from persistent memory without loading a copy of it into RAM. The contents of that file remain immutable, and the mapping can be shared by multiple Python processes simultaneously. ...
    Downloads: 0 This Week
  • 18
    Numba

    NumPy aware dynamic Python compiler using LLVM

    Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. Numba translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library. Numba-compiled numerical algorithms in Python can approach the speeds of C or FORTRAN. You don't need to replace the Python interpreter, run a separate compilation step, or even have a C/C++ compiler installed. Just apply one of the Numba decorators to your...
    Downloads: 0 This Week
  • 19
    XNNPACK

    High-efficiency floating-point neural network inference operators

    ...Rather than serving as a standalone ML framework, XNNPACK provides high-performance computational primitives—such as convolutions, pooling, activation functions, and arithmetic operations—that are integrated into higher-level frameworks like TensorFlow Lite, PyTorch Mobile, ONNX Runtime, TensorFlow.js, and MediaPipe. The library is written in C/C++ and designed for maximum portability, efficiency, and performance, leveraging platform-specific SIMD instruction sets (e.g., NEON, AVX, WebAssembly SIMD) for optimized execution. It supports NHWC tensor layouts and allows flexible striding along the channel dimension to efficiently handle channel-split and concatenation operations without additional cost.
    Downloads: 2 This Week
  • 20
    Zerocopy

    Zerocopy makes zero-cost memory manipulation effortless

    Zerocopy is a Rust library designed to make zero-cost memory manipulation both safe and effortless. It allows developers to reinterpret or convert raw byte sequences into structured types—and vice versa—without writing unsafe code directly. The crate provides safe abstractions for transmuting data while preserving Rust’s strict safety guarantees, removing the need for manual memory manipulation. Zerocopy introduces a suite of conversion traits such as TryFromBytes, FromBytes, IntoBytes, and...
    Downloads: 1 This Week
  • 21
    OpenGL Mathematics

    Highly Optimized Graphics Math (glm) for C

    Highly optimized 2D|3D math library, also known as OpenGL Mathematics (glm) for `C`. cglm provides a lot of utilities that make math operations fast and quick to write. It is community-friendly; feel free to raise any issues or bugs you have encountered. Almost all functions (inline versions) and parameters are documented inside the corresponding headers. OpenGL-related functions are dropped to make this lib platform/third-party independent. Make sure you have the latest version and feel free to report...
    Downloads: 1 This Week
  • 22
    UniSIMD-assembler

    SIMD macro assembler unified for ARM, MIPS, PPC and x86

    UniSIMD assembler is a high-level C/C++ macro assembler framework unified across ARM, MIPS, POWER and x86 architectures. It establishes a subset of both BASE and SIMD instruction sets with a clearly defined common API, so that application logic can be written and maintained in one place without code replication. The assembler itself isn't a separate tool, but rather a collection of C/C++ header files, which applications need to include directly in order to use. At present, Intel SSE/SSE2/SSE4 and AVX/AVX2/AVX-512 (32/64-bit x86 ISAs), ARMv7 NEON/NEONv2, ARMv8 AArch32 and AArch64 NEON, SVE (32/64-bit ARM ISAs), MIPS 32/64-bit r5/r6 MSA and POWER 32/64-bit VMX/VSX (little/big-endian ISAs) are mostly implemented (with horizontal reductions), although scalar improvements and wider SIMD vectors with zeroing/merging predicates in 3/4-operand instructions are planned as extensions to the current 2/3-operand SPMD-driven vertical SIMD ISA. ...
    Downloads: 1 This Week
  • 23
    QuadRay-engine

    Realtime raytracer using SIMD on ARM, MIPS, PPC and x86

    QuadRay engine is a realtime raytracing project aimed at full SIMD utilization on ARM, MIPS, POWER and x86 architectures. The efficient use of SIMD is achieved by processing four rays at a time to match SIMD register width (hence the name). The rendering core of the engine is written in a unified SIMD assembler allowing single assembler code to be compatible with different processor architectures, thus reducing the need to maintain multiple parallel versions. ...
    Downloads: 0 This Week
  • 24
    Vector Pascal is a language targeting SIMD multi-core instruction sets such as AVX and SSE2 or x86-64-v3. It has a SIMD compiler which supports parallel vector operations, loop unrolling, common subexpression elimination, etc. It is implemented in Java.
    Downloads: 8 This Week
  • 25
    sleef

    Vectorized libm

    SLEEF stands for SIMD Library for Evaluating Elementary Functions. SLEEF implements vectorized versions of all C99 math functions that use the SIMD instructions of modern processors to make computation more efficient. The library also includes vectorized DFT subroutines.
    Downloads: 0 This Week