Issue caught with the Golang runtime, which makes heavy use of the SIGURG
signal for scheduling. Messages are sometimes not received. Technically,
socket_base_t::process_commands() returns failure even if some commands were
processed, because the next message could not be received from the mailbox
while the call was interrupted.
Solution: retry receiving from mailbox with zero timeout after EINTR.
Signed-off-by: Ilya Kondrashkin <ikondrashkin@nfware.com>
`gcc-13` added an assert to standard headers to make sure custom
allocators provide their own implementation of the `rebind` type instead
of relying on an inherited `rebind`. gcc change:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=64c986b49558a7
Without the fix build fails on this week's `gcc-13` as:
[ 92%] Building CXX object tests/CMakeFiles/test_security_curve.dir/test_security_curve.cpp.o
In file included from /<<NIX>>/gcc-13.0.0/include/c++/13.0.0/ext/alloc_traits.h:34,
from /<<NIX>>/gcc-13.0.0/include/c++/13.0.0/bits/stl_uninitialized.h:64,
from /<<NIX>>/gcc-13.0.0/include/c++/13.0.0/memory:69,
from tests/../src/secure_allocator.hpp:42,
from tests/../src/curve_client_tools.hpp:49,
from tests/test_security_curve.cpp:53:
/<<NIX>>/gcc-13.0.0/include/c++/13.0.0/bits/alloc_traits.h: In instantiation of 'struct std::__allocator_traits_base::__rebind<zmq::secure_allocator_t<unsigned char>, unsigned char, void>':
/<<NIX>>/gcc-13.0.0/include/c++/13.0.0/bits/alloc_traits.h:94:11: required by substitution of 'template<class _Alloc, class _Up> using std::__alloc_rebind = typename std::__allocator_traits_base::__rebind<_Alloc, _Up>::type [with _Alloc = zmq::secure_allocator_t<unsigned char>; _Up = unsigned char]'
/<<NIX>>/gcc-13.0.0/include/c++/13.0.0/bits/alloc_traits.h:228:8: required by substitution of 'template<class _Alloc> template<class _Tp> using std::allocator_traits< <template-parameter-1-1> >::rebind_alloc = std::__alloc_rebind<_Alloc, _Tp> [with _Tp = unsigned char; _Alloc = zmq::secure_allocator_t<unsigned char>]'
/<<NIX>>/gcc-13.0.0/include/c++/13.0.0/ext/alloc_traits.h:126:65: required from 'struct __gnu_cxx::__alloc_traits<zmq::secure_allocator_t<unsigned char>, unsigned char>::rebind<unsigned char>'
/<<NIX>>/gcc-13.0.0/include/c++/13.0.0/bits/stl_vector.h:88:21: required from 'struct std::_Vector_base<unsigned char, zmq::secure_allocator_t<unsigned char> >'
/<<NIX>>/gcc-13.0.0/include/c++/13.0.0/bits/stl_vector.h:423:11: required from 'class std::vector<unsigned char, zmq::secure_allocator_t<unsigned char> >'
tests/../src/curve_client_tools.hpp:64:76: required from here
/<<NIX>>/gcc-13.0.0/include/c++/13.0.0/bits/alloc_traits.h:70:31: error: static assertion failed: allocator_traits<A>::rebind_alloc<A::value_type> must be A
70 | _Tp>::value,
| ^~~~~
The change adds a trivial `rebind` definition with the expected return type
and satisfies the conversion requirements.
init_dependency_root(): Moved to android_build_helper.sh
ANDROID_DEPENDENCIES_DIR: Added to specify where dependencies are stored
when downloaded automatically.
ANDROID_BUILD_DIR: Changed the default in ci_build.sh.
ci_build.sh now configures these 2 variables to use a subfolder of the local
clone by default, instead of /tmp (except for the Android NDK).
This makes downloaded dependencies and generated binaries easier to find.
It also avoids user permission conflicts, or conflicts between 2
different clones of LIBZMQ (for instance).
2 more functions are added:
- android_download_library():
Similar to android_clone_library(), but fetches a tarball and uncompresses it.
So far, only .tar.gz and .tgz archives are supported, but this could easily
be extended, if needed.
- android_init_dependency_root():
Initializes or checks XXX_ROOT, where XXX is a dependency name.
Enhanced version of init_android_root() in build.sh (which can then be dropped).
This version is now also applicable to the CZMQ & ZYRE CI build scripts for Android.
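The tarball handling described in the list above can be sketched as follows. This is only an illustration: `extract_archive` and its arguments are hypothetical names, the download step is left out, and the real helper may be organised differently.

```shell
# Hypothetical sketch: unpack a dependency tarball into a target folder.
# Only .tar.gz and .tgz are accepted, matching the limitation stated above.
extract_archive() {
    local archive="$1" dest="$2"

    case "$archive" in
        *.tar.gz|*.tgz) ;;
        *)
            echo "Unsupported archive type: $archive" >&2
            return 1
            ;;
    esac

    # Create the destination first, then unpack the gzip-compressed tarball.
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}
```

Supporting another format would mean adding a branch to the `case` statement with the matching `tar` option.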
With these changes, this function is now able to build LIBZMQ but also
(almost) CZMQ, ZYRE (and their dependencies) in native or cross compilation
modes.
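The "initialize or check XXX_ROOT" behaviour can be sketched as below. The function name `init_dependency_root_sketch` and the default folder passed as second argument are assumptions made for illustration, not the helper's actual code.

```shell
# Hypothetical sketch: make sure <NAME>_ROOT is usable.
#   $1 = dependency name (e.g. LIBSODIUM), $2 = default folder if unset.
init_dependency_root_sketch() {
    local name="$1" default_dir="$2" var value

    # The name is used to build a variable name, so only accept identifiers.
    case "$name" in
        ''|[0-9]*|*[!A-Za-z0-9_]*)
            echo "Invalid dependency name: $name" >&2
            return 1
            ;;
    esac

    var="${name}_ROOT"
    eval "value=\${$var:-}"          # read <NAME>_ROOT indirectly

    if [ -z "$value" ]; then
        # Not provided by the user: initialize it with the default folder.
        value="$default_dir"
        eval "$var=\$value"
    elif [ ! -d "$value" ]; then
        # Provided by the user: it must point to an existing folder.
        echo "$var points to a missing folder: $value" >&2
        return 1
    fi

    export "$var"
}
```

A caller would invoke it once per dependency, for instance `init_dependency_root_sketch LIBSODIUM "${ANDROID_DEPENDENCIES_DIR}/libsodium"`, before cloning or downloading anything.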
When two parties are trying to establish a ZMTP connection and perform a
handshake in which one party selects handshake version 2 or lower, the
handshake timer always fires (by default after 30s), as the timer never
gets cancelled.
Solution: Cancel the handshake timer after the handshake has been done.
Recent PRs introduced duplicate code in the Android helper file.
A duplicate "Initialization" sequence can be observed after the helper functions:
########################################################################
# Initialization
...
# (Empty string indicates no failure)
ANDROID_BUILD_FAIL=()
Seen when trying to port the latest LIBZMQ Android PRs to CZMQ & ZYRE.
Solution: Remove the duplicate code.
Solution: Update image.
Added references to tested versions (Ubuntu, Debian) and tested NDK.
Careful though: Android changed its NDK naming between 22 and 23.
Among them, CC, LD, CFLAGS, ... could be useful for other tools.
Reason: Many are calculated as "local" to a particular function, which makes
them unavailable outside this helper function.
Solution: Export more variables (CC, LD, CFLAGS, ...).
New exported variables are prefixed with ANDROID_BUILD_xxx.
This naming is expected to avoid any conflicts/problems with other tools:
- ANDROID_BUILD_CC
- ANDROID_BUILD_LD
- ANDROID_BUILD_CFLAGS
- ...
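A minimal sketch of the idea, assuming a hypothetical helper `android_build_set_env_sketch` and placeholder values: the values are computed inside a function but exported under the prefixed names, so callers and child processes can read them without the usual CC/LD/CFLAGS being overwritten.

```shell
# Hypothetical sketch: publish toolchain settings under ANDROID_BUILD_xxx names.
# The argument values are placeholders, not real NDK paths.
android_build_set_env_sketch() {
    # 'export' (rather than 'local') keeps the values visible after the
    # function returns, and passes them on to child processes.
    export ANDROID_BUILD_CC="$1"
    export ANDROID_BUILD_LD="$2"
    export ANDROID_BUILD_CFLAGS="$3"
}

android_build_set_env_sketch "clang" "ld" "-O2 -fPIC"

# A child process sees the prefixed variables; its own CC/LD are untouched.
sh -c 'echo "child sees: ${ANDROID_BUILD_CC} ${ANDROID_BUILD_CFLAGS}"'
```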
Solution: Implement the same kind of mechanism as CZMQ & ZYRE currently use, with an enhancement.
Enhancement: When required and if LIBSODIUM is not set, the build tool checks
for an already existing clone, next to LIBZMQ. This mechanism is close
to what is done by LIBZMQ/CZMQ/ZYRE for their dependencies.
Additionally: Do not copy the current source tree to any 'cache' folder. Use
the current folder, but make sure everything is cleaned before compilation is
launched.
This is a lot safer when building different clones in parallel...
Enhancement to be ported to CZMQ/ZYRE via ZProject.
Solution: Add a trace function.
This requires moving the CI build helper code/check/init/... to the end of
the helper file.
This new function is available for (and also used by) build.sh.
Output looks like:
LIBZMQ (x86_64) - Blah ...
To be ported to CZMQ/ZYRE via ZProject.
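A trace function producing that output format could look like the sketch below. The function name and the two variables holding the project name and architecture are hypothetical; only the output layout comes from the description above.

```shell
# Hypothetical sketch: prefix every trace with the project and architecture.
# TRACE_PROJECT and TRACE_ARCH are illustrative names, set by the caller.
android_trace_sketch() {
    printf '%s (%s) - %s\n' "$TRACE_PROJECT" "$TRACE_ARCH" "$*"
}

TRACE_PROJECT="LIBZMQ"
TRACE_ARCH="x86_64"
android_trace_sketch "Cleaning build folders"
# prints: LIBZMQ (x86_64) - Cleaning build folders
```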
* Problem: Android NDK 22 download broken since support of NDK 23.
Due to the PR to support NDK 23.
With NDK 23, the archive file name has changed.
This change is handled by the PR to support NDK 23, but now only 23 and later
are supported.
Also, NDK 23 support introduced a 2nd occurrence of the variable
HOST_PLATFORM, with another value. Since one occurrence is exported,
this may confuse future developers (and it actually confused me).
Solution: Code review
The 1st occurrence is simply dropped, and the surrounding algorithm is changed
so that there is no longer any need for a 'host_platform' kind of variable.
2nd occurrence is renamed to ANDROID_BUILD_PLATFORM.
Note that 'HOST' is replaced by 'BUILD', as this is the common naming
when talking about the build/compilation machine, when cross compiling.
A dedicated function is created in the helpers, to actually download
the NDK. As this function is made 'public', more checks are performed.
Note:
To be ported to CZMQ & ZYRE, via ZPROJECT, where the NDK is downloaded
in 2 different files.
* Problem: Android build environment variables need clarifications.
Reason: They are scattered and initialized throughout the code.
Solution: Declare, initialize and document environment variables at the top of build.sh.
Side effect: This also serves as documentation.
* Problem: Android APP fails to load ZMQ (ARM64 only)
Seen with physical Android devices running ARM64.
Not seen with ARM, X86 or X86_64.
Any Android APP loading ZMQ fails with:
```
[FATAL] Couldn't load library library zmq from jar. Dependency is required!
```
Unpack zyre-android-2.0.1.jar, find libzmq.so for ARM64 and look for missing
symbols:
```
prompt> unzip zyre-android-2.0.1.jar
prompt> cd lib/arm64-v8a
prompt> nm --undefined-only ./libzmq.so | head
U __aarch64_ldadd4_acq
U __aarch64_ldadd4_acq_rel
U __aarch64_ldadd4_rel
U __aarch64_ldadd4_relax
U __aarch64_ldadd8_acq_rel
U __aarch64_ldadd8_relax
U __aarch64_swp8_acq
U __aarch64_swp8_acq_rel
U __aarch64_swp8_rel
U __aarch64_swp8_relax
prompt>
```
More symbols are missing, but these are the ones relevant to this issue.
These symbols are present in libc++_shared, but not exported:
```
prompt> nm libc++_shared.so | grep aarch64
00000000000ee6d0 t __aarch64_cas1_acq_rel
00000000000ee7a0 t __aarch64_cas8_acq_rel
00000000001028f0 b __aarch64_have_lse_atomics
00000000000ee840 t __aarch64_ldadd4_acq_rel
00000000000ee810 t __aarch64_ldadd4_rel
00000000000ee8a0 t __aarch64_ldadd8_acq_rel
00000000000ee870 t __aarch64_ldadd8_relax
00000000000ee7e0 t __aarch64_swp8_acq_rel
prompt>
```
The issue has also been reported on the web, with GCC & CLANG:
- https://bugzilla.redhat.com/show_bug.cgi?id=1830472
- cea175b838
- ...
Solution: Add `-mno-outline-atomics` to CXXFLAGS (FLUTTER fix).
Additionally, NDK_NUMBER had to be introduced.
This variable is calculated in `android_build_helper.sh`.
It represents the numeric form of NDK_VERSION:
```
NDK_VERSION --> NDK_NUMBER
android-ndk-r25 --> 2500
android-ndk-r23c --> 2303
android-ndk-r22 --> 2200
android-ndk-r21e --> 2105
... and so on
```
This will help with a few other things (NDK download?).
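One way to derive such a number is sketched below. It assumes the version is always of the form `android-ndk-r<major>[<letter>]`; the function name `ndk_number_sketch` is hypothetical and the actual computation in `android_build_helper.sh` may differ.

```shell
# Hypothetical sketch: android-ndk-r23c --> 2303 (major * 100 + letter rank).
ndk_number_sketch() {
    local version major letter minor

    version="${1#android-ndk-r}"      # "23c"
    major="${version%%[a-z]*}"        # "23"
    letter="${version#"$major"}"      # "c", or empty when there is no letter
    minor=0

    if [ -n "$letter" ]; then
        # printf with a leading quote yields the character code: a=97, b=98...
        minor=$(( $(printf '%d' "'$letter") - 96 ))
    fi

    echo $(( major * 100 + minor ))
}

ndk_number_sketch "android-ndk-r23c"   # prints 2303
```

A numeric form makes comparisons straightforward, for example `[ "$(ndk_number_sketch "$NDK_VERSION")" -ge 2300 ]` to branch on NDK 23 or later.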
Scenario:
```
export CURL=libsodium
cd zyre/builds/android
./ci_build.sh
```
Result:
```
Android (arm) build failed for the following reasons:
Found no library named libzmq.so libsodium.so
/home/stephan/git/zproject-android-testing/libzmq/builds/android/prefix/arm/lib/libzmq.so libsodium.so
```
Caused by PR #4437, where the 2nd commit was to fix Sonatype findings.
Lesson learnt: Not always a good idea to add double quotes around variables ...
Solution: Make VERIFY an array, so that Sonatype won't complain.
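The difference between a quoted string and an array can be illustrated with the sketch below, which requires bash. `count_args` is a hypothetical stand-in for whatever consumes the list of libraries to verify.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: why a quoted string breaks a multi-library check.

count_args() { echo "$#"; }

# A space-separated string, once quoted, is passed as ONE argument, so the
# check looks for a single file literally named "libzmq.so libsodium.so".
VERIFY_STRING="libzmq.so libsodium.so"
count_args "${VERIFY_STRING}"     # prints 1

# An array expands to one argument per element, even when quoted.
VERIFY=("libzmq.so" "libsodium.so")
count_args "${VERIFY[@]}"         # prints 2
```

With the array form, the expansion stays quoted (which keeps static analysis quiet) while each library is still checked separately.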
Seen when someone has to relaunch `ci_build.sh` manually, for troubleshooting
or experiments: ci_build.sh stops because the libraries are already built.
Solution: Clean more temporary/build folders before build.
Note:
To be ported to ZYRE/CZMQ via ZProject.
When called from ZYRE/CZMQ, it's difficult to identify which build script is
being executed.
Solution: Modify each `echo` trace to show the project name and Android architecture in progress.
Note:
To be ported to ZYRE/CZMQ via ZProject.