arm-neon

ncnn is a high-performance neural network inference framework optimized for the mobile platform

inference high-preformance simd arm-neon deep-learning artificial-intelligence Android iOS ncnn vulkan neural-network caffe mxnet PyTorch onnx darknet Tensorflow mlir Keras RISC-V

C++

22127

4327

2 days ago

microsoft / DirectXMath

DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps

Microsoft directx simd neon cpp-library sse avx avx2 clang msvc xbox Desktop Universal Windows Platform arm-neon arm64

C++

1713

250

3 days ago

ashvardanian / SimSIMD

Up to 200x faster dot products and similarity metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for AVX2, AVX-512, NEON, SVE, and SVE2 📐

arm-neon arm-sve Assembly avx2 distance-calculation monitoring neon simd simd-instructions information-retrieval NumPy SciPy similarity-measures similarity-search vector-search avx512 blas blas-libraries float16 bfloat16

1513

3 days ago

Tencent / FeatherCNN

FeatherCNN is a high performance inference engine for convolutional neural networks.

convolutional-neural-networks inference-engine caffe Android iOS arm-neon

C++

1219

281

6 years ago

kimwalisch / primesieve

🚀 Fast prime number generator

prime-numbers math primes sieve avx512 arm-neon arm-sve number-theory stress-testing primesieve benchmark

C++

1028

125

4 days ago

spnda / fastgltf

A modern C++17 glTF 2.0 library focused on speed, correctness, and usability

C++ gltf gltf-loader simd gltf2 gltf2-loader khronos cpp-library serialization avx sse arm-neon JSON cpp17-library neon cpp20-modules

C++

425

17 days ago

WojciechMula / sse-popcount

SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html

arm-neon sse avx2 avx512 aarch64

C++

349

2 years ago

OAID / Caffe-HRT

Heterogeneous Run Time version of Caffe. It adds heterogeneous capabilities to Caffe and uses a heterogeneous computing infrastructure framework to speed up deep learning on Arm-based heterogeneous embedded platforms. It also retains all the features of the original Caffe architecture, so users can deploy their applications seamlessly.

arm caffe arm-neon machine-learning artificial-intelligence dnn cnn

C++

269

7 years ago

rogerou / Arm-neon-intrinsics

Arm NEON documentation and the meaning of its instructions

arm-neon doc

245

6 years ago

AI-performance / embedded-ai.bench

Benchmark for embedded-AI deep learning inference engines, such as NCNN, TNN, MNN, and TensorFlow Lite.

inference ncnn mnn tnn arm arm-neon Android iOS tensorflow-lite artificial-intelligence benchmark

Python

204

5 years ago

cdl-saarland / rv

RV: A Unified Region Vectorizer for LLVM

vectorization simd LLVM compiler avx512 avx2 spmd arm-neon openmp

C++

112

4 months ago

OAID / MXNet-HRT

Heterogeneous Run Time version of MXNet. It adds heterogeneous capabilities to MXNet and uses a heterogeneous computing infrastructure framework to speed up deep learning on Arm-based heterogeneous embedded platforms. It also retains all the features of the original MXNet architecture, so users can deploy their applications seamlessly.

arm arm-neon machine-learning artificial-intelligence mxnet opencl dnn cnn

C++

8 years ago

xxxxyu / FlexNN

Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices"

cnn dnn mobile Android arm-neon inference ncnn neural-network

8 months ago

wx257osn2 / qoixx

Single-header, quite fast QOI (Quite OK Image Format) implementation written in C++20

image-compression lossless-image-compression C++ simd avx2 arm-neon arm-sve neon

C++

5 months ago

OAID / TensorFlow-HRT

Heterogeneous Run Time version of TensorFlow. It adds heterogeneous capabilities to TensorFlow and uses a heterogeneous computing infrastructure framework to speed up deep learning on Arm-based heterogeneous embedded platforms. It also retains all the features of the original TensorFlow architecture, so users can deploy their applications seamlessly.

arm Tensorflow arm-neon machine-learning artificial-intelligence dnn cnn

C++

8 years ago

Ferki-git-creator / NZ1

NZ1 - NanoZip 1: ultra-fast, dependency-free, portable C compression library optimized for embedded and high-performance use. Full docs: https://ferki-git-creator.github.io/nz1-site/

arm-neon avx2 c-library compression crc32 data-compression embedded embedded-systems game-development Internet of things lz77 Microcontroller no-dependencies portable real-time simd

1 month ago