NDK Programmer's Guide
|
Android NDK r3 added support for the new 'armeabi-v7a' ARM-based ABI that allows native code to use two useful instruction set extensions:
VFPv3, which provides hardware FPU registers and computations, to boost floating point performance significantly.
More specifically, by default 'armeabi-v7a' only supports VFPv3-D16 which only uses/requires 16 hardware FPU 64-bit registers.
More information about this can be read in docs/CPU-ARCH-ABIS.html
The ARMv7 Architecture Reference Manual also defines another optional instruction set extension known as "ARM Advanced SIMD", nick-named "NEON". It provides:
Not all ARMv7-based Android devices will support NEON, but those that do may benefit in significant ways from the scalar/vector instructions.
The NDK supports the compilation of modules or even specific source files with support for NEON. What this means is that a specific compiler flag will be used to enable the use of GCC ARM Neon intrinsics and VFPv3-D32 at the same time. The intrinsics are described here:
http://gcc.gnu.org/onlinedocs/gcc/ARM-NEON-Intrinsics.html
Using LOCAL_ARM_NEON
Define LOCAL_ARM_NEON to 'true' in your module definition, and the NDK will build all its source files with NEON support. This can be useful if you want to build a static or shared library that specifically contains NEON code paths.
.neon
suffix:When listing sources files in your LOCAL_SRC_FILES variable, you now have the option of using the .neon suffix to indicate that you want to corresponding source(s) to be built with Neon support. For example:
LOCAL_SRC_FILES := foo.c.neon bar.c
Will only build 'foo.c' with NEON support.
Note that the .neon suffix can be used with the .arm suffix too (used to specify the 32-bit ARM instruction set for non-NEON instructions), but must appear after it.
In other words, 'foo.c.arm.neon' works, but 'foo.c.neon.arm' does NOT.
Neon support only works when targeting the 'armeabi-v7a' or 'x86' ABI, otherwise the NDK build scripts will complain and abort. Neon is partially supported on x86 via translation header (To learn more about it, see x86). It is important to use checks like the following in your Android.mk:
# define a static library containing our NEON code ifeq ($(TARGET_ARCH_ABI),$(filter $(TARGET_ARCH_ABI), armeabi-v7a x86)) include $(CLEAR_VARS) LOCAL_MODULE := mylib-neon LOCAL_SRC_FILES := mylib-neon.c LOCAL_ARM_NEON := true include $(BUILD_STATIC_LIBRARY) endif # TARGET_ARCH_ABI == armeabi-v7a || x86
As said previously, NOT ALL ARMv7-BASED ANDROID DEVICES WILL SUPPORT NEON ! It is thus crucial to perform runtime detection to know if the NEON-capable machine code can be run on the target device.
To do that, use the cpufeatures library that comes with this NDK. To learn more about it, see docs/CPU-FEATURES.html.
You should explicitly check that android_getCpuFamily() returns ANDROID_CPU_FAMILY_ARM, and that android_getCpuFeatures() returns a value that has the ANDROID_CPU_ARM_FEATURE_NEON flag set, as in:
#include <cpu-features.h> ... ... if (android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM && (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0) { // use NEON-optimized routines ... } else { // use non-NEON fallback routines instead ... } ...
Look at the source code for the "hello-neon" sample in this NDK for an example on how to use the 'cpufeatures' library and Neon intrinsics at the same time.
This implements a tiny benchmark for a FIR filter loop using a C version, and a NEON-optimized one for devices that support it.