Author Topic: (solved) Driver fails on recent debian unstable armhf  (Read 12718 times)

ElmerFudd

  • Newbie
  • *
  • Posts: 7
(solved) Driver fails on recent debian unstable armhf
« on: May 12, 2012, 04:41:33 PM »
I had it running fine when the first unofficial Debian armhf port came along (November 2011).

But something seems to have changed in Debian with my fresh install. First, the installer failed to detect arm-sysv, since the binary that is supposed to detect it required libc.so.6 and a few other libs to be in /lib, and their location has changed in Debian.
A few symlinks later I got it installed, but it will not preload libmediaclient.so.

I've built a small executable that I could strace, and found that the program exits immediately after loading the .so:

---- program ----
#include <dlfcn.h>
#include <stdio.h>

int main()
{
  dlopen("/opt/lib/libmediaclient.so",RTLD_LAZY);
  printf("Opened\n");
  return 0;
}
---- program ----

Compiled with gcc test.c -ldl -o test

Never prints Opened, the return code is 1, if it helps.
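
For comparison, here is a minimal sketch of the same test with error reporting. It shows what a normal load failure looks like: dlopen returns NULL and dlerror describes the reason. The helper name try_load is hypothetical and not part of the driver or of the original test.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: try to load a shared object and report the
 * loader's own error text if dlopen() returns NULL.
 * Returns 0 on success, -1 on failure. */
int try_load(const char *path)
{
  void *handle = dlopen(path, RTLD_LAZY);
  if (handle == NULL) {
    /* A well-behaved loader failure ends up here. */
    fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
    return -1;
  }
  printf("Opened %s\n", path);
  dlclose(handle);
  return 0;
}
```

Called as try_load("/opt/lib/libmediaclient.so") from main, an ordinary failure would print the dlerror text. If neither message appears, as with the original test, the process ended inside dlopen itself.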

---- strace snippet ----
open("/opt/lib/libmediaclient.so", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0004\31\0\0004\0\0\0"..., 512) = 512
lseek(3, 58456, SEEK_SET)               = 58456
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1000) = 1000
lseek(3, 58201, SEEK_SET)               = 58201
read(3, "A/\0\0\0aeabi\0\1%\0\0\0\5ARM9TDMI\0\6\2\10\1\t\1"..., 48) = 48
exit_group(1)                           = ?
---- strace snippet ----

The driver works fine on the old, broken Debian install from November. What could possibly cause this? If it were a dependency issue, I believe dlopen would have reported it, and it definitely would not have exited the process.
Do you have any clues as to what could have caused this, and how to fix it?

It fails the same way with the newest driver as well as with the one used on the old debian install.
« Last Edit: June 05, 2012, 07:16:04 AM by Sundtek »

Sundtek

  • Administrator
  • Hero Member
  • *****
  • Posts: 8512
Re:Driver fails on recent debian unstable armhf
« Reply #1 on: May 12, 2012, 11:42:13 PM »
Hmm. We have several arm builds included. If possible try to contact us via skype chat (sundtek) or webchat http://chat.sundtek.de / irc.freenode.net #sundtek
Failure is a good thing! I'll fix it

ElmerFudd

  • Newbie
  • *
  • Posts: 7
Re:Driver fails on recent debian unstable armhf
« Reply #2 on: May 13, 2012, 02:06:14 PM »
It does the same on Ubuntu 12.04.

Both use glibc 2.15; I think the problem lies there.

On glibc 2.13, dlopen only reads the first part of the file:

---- strace snippet ----
open("/opt/lib/libmediaclient.so", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0004\31\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=55360, ...}) = 0
---- strace snippet ----

Versus 2.15:

---- strace snippet ----
open("/opt/lib/libmediaclient.so", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0004\31\0\0004\0\0\0"..., 512) = 512
lseek(3, 58456, SEEK_SET)               = 58456
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1000) = 1000
lseek(3, 58201, SEEK_SET)               = 58201
read(3, "A/\0\0\0aeabi\0\1%\0\0\0\5ARM9TDMI\0\6\2\10\1\t\1"..., 48) = 48
exit_group(1)                           = ?
---- strace snippet ----

With 2.15 it never reaches fstat, but simply exits the application from within dlopen. Maybe something is wrong with the file? My guess is that it reads something it doesn't like in those last 48 bytes, but why not just return an error?
Those 48 bytes are the .ARM.attributes section of the ELF binary according to objdump, whatever that is.
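
As an illustration of what such a section can carry, here is a rough sketch that walks an .ARM.attributes blob and looks for the attribute describing how floating-point arguments are passed. It assumes the layout and tag numbers of the ARM EABI build-attributes addenda (format byte 'A', vendor "aeabi", file scope tag 1, Tag_ABI_VFP_args = 28, where 0 means the base convention and 1 means VFP registers) and little-endian data. The function name arm_abi_vfp_args is hypothetical, and this is not the code eglibc runs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag numbers assumed from the ARM EABI build-attributes addenda. */
enum { TAG_FILE = 1, TAG_ABI_VFP_ARGS = 28, TAG_COMPATIBILITY = 32 };

/* Read a little-endian 32-bit value. */
static uint32_t read_le32(const uint8_t *p)
{
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read one ULEB128 number at buf[*pos]; returns -1 if it runs past end. */
static int read_uleb128(const uint8_t *buf, size_t end, size_t *pos, uint32_t *out)
{
  uint32_t value = 0;
  unsigned shift = 0;
  while (*pos < end) {
    uint8_t byte = buf[(*pos)++];
    if (shift < 32)
      value |= (uint32_t)(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0) {
      *out = value;
      return 0;
    }
  }
  return -1;
}

/* Skip a NUL-terminated string; returns -1 if no terminator before end. */
static int skip_ntbs(const uint8_t *buf, size_t end, size_t *pos)
{
  while (*pos < end)
    if (buf[(*pos)++] == 0)
      return 0;
  return -1;
}

/* Tags whose value is a string rather than a ULEB128 number. */
static int tag_is_string(uint32_t tag)
{
  return tag == 4 || tag == 5 || (tag > 32 && (tag & 1));
}

/* Hypothetical helper: returns the file-scope Tag_ABI_VFP_args value,
 * 0 if the tag is absent (its default), or -1 if the data is malformed. */
int arm_abi_vfp_args(const uint8_t *sec, size_t len)
{
  if (len < 1 || sec[0] != 'A')
    return -1;
  size_t pos = 1;
  while (len - pos >= 4) {               /* one vendor subsection per pass */
    uint32_t sub_len = read_le32(sec + pos);
    if (sub_len < 4 || sub_len > len - pos)
      return -1;
    size_t sub_end = pos + sub_len;
    size_t p = pos + 4;
    const char *vendor = (const char *)(sec + p);
    if (skip_ntbs(sec, sub_end, &p) != 0)
      return -1;
    int is_aeabi = strcmp(vendor, "aeabi") == 0;
    while (is_aeabi && sub_end - p >= 5) { /* scope byte + 32-bit size */
      uint8_t scope = sec[p];
      uint32_t ss_len = read_le32(sec + p + 1);
      if (ss_len < 5 || ss_len > sub_end - p)
        return -1;
      size_t ss_end = p + ss_len;
      size_t q = p + 5;
      while (scope == TAG_FILE && q < ss_end) {
        uint32_t tag, val;
        if (read_uleb128(sec, ss_end, &q, &tag) != 0)
          return -1;
        if (tag_is_string(tag)) {
          if (skip_ntbs(sec, ss_end, &q) != 0)
            return -1;
          continue;
        }
        if (read_uleb128(sec, ss_end, &q, &val) != 0)
          return -1;
        /* Tag_compatibility carries a number followed by a string. */
        if (tag == TAG_COMPATIBILITY && skip_ntbs(sec, ss_end, &q) != 0)
          return -1;
        if (tag == TAG_ABI_VFP_ARGS)
          return (int)val;
      }
      p = ss_end;
    }
    pos = sub_end;
  }
  return 0;
}
```

To try it on a real library, the section bytes would first have to be extracted, for example by reading the 48 bytes at the offset shown in the strace output into a buffer and passing that buffer and its length to the function.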
« Last Edit: May 13, 2012, 06:50:23 PM by ElmerFudd »

Sundtek

  • Administrator
  • Hero Member
  • *****
  • Posts: 8512
Re:Driver fails on recent debian unstable armhf
« Reply #3 on: May 14, 2012, 07:19:37 AM »
Just contact us via chat, maybe we need to add another toolchain to our driver - basically that's no problem for us.

ElmerFudd

  • Newbie
  • *
  • Posts: 7
Re:Driver fails on recent debian unstable armhf
« Reply #4 on: May 14, 2012, 07:57:18 PM »
I'm on irc as ElmerFudd, but I'm trying to debug the issue within glibc. I'm having a hard time getting my head around the code, but I think I'm getting the hang of it.
So far I have found out that both Ubuntu and Debian use eglibc and not glibc. Using the LD_DEBUG environment variable and strace, I've narrowed the code down to elf/dl-load.c so far.

I'll find time tomorrow to figure out where in eglibc it exits any application that tries to load libmediaclient, using the tried-and-proven printf method.
I think it is a bug in eglibc, to be honest. If you have any pointers on how to debug (e)glibc, they would be greatly appreciated.
I'm idling on irc :)

ElmerFudd

  • Newbie
  • *
  • Posts: 7
Re:Driver fails on recent debian unstable armhf
« Reply #5 on: May 17, 2012, 01:55:30 PM »
I've been trying to debug eglibc for the last couple of days. Here is what I have done:

Getting the source with apt-get source, and building it manually.
Debugging my executable that loads libmediaclient.so manually works fine with my build, using the steps to debug glibc here :
http://sourceware.org/glibc/wiki/Debugging/Loader_Debugging#Debugging_With_an_Alternate_Loader

When I do an strace I can see that it doesn't load the arm attributes with my own build.

When looking at the source code I simply cannot see where it is supposed to look up the arm attributes. But here are some facts :
It NEVER reaches any code within the so itself.
The application crashes way before it even tries to resolve dependencies from the so.
The application loads fine using my own build of eglibc from apt.
When running my own build I can see it never checks the ARM attributes of libmediaclient.so when doing an strace.

I'm stuck. Please pm me any thoughts on freenode, I'm online as ElmerFudd.

Sundtek

  • Administrator
  • Hero Member
  • *****
  • Posts: 8512
Re:Driver fails on recent debian unstable armhf
« Reply #6 on: May 17, 2012, 04:28:39 PM »
You received a message on freenode.

ElmerFudd

  • Newbie
  • *
  • Posts: 7
Re:Driver fails on recent debian unstable armhf
« Reply #7 on: May 17, 2012, 10:36:22 PM »
I found the smoking gun - debian and ubuntu have added a check for vfp vs. softfp.

Apparently there is an arm attribute that tells whether an elf binary is compiled with vfp support. Some wise person thought it would be a good idea to exit the process instead of failing with wrong arch upon loading an elf.
I'll try to figure out how to make a bug report to both debian and ubuntu about this, since it is unacceptable that dlopen can exit an application. I mean, if you try to load a ppc elf on an x86 you will get an error, and not have your entire application closed.

Anywho, this means you will have to have 2 armsysv in the driver - one for softfp and one compiled with vfp support - even though you don't use fp in function calls.

I'll be happy to test out binaries on my trimslice, but I'm not willing to give access to it over ssh - sorry.
As for a qemu image, I've failed to find a kernel that will work with armhf and ubuntu/debian.

Sundtek

  • Administrator
  • Hero Member
  • *****
  • Posts: 8512
Re:Driver fails on recent debian unstable armhf
« Reply #8 on: May 18, 2012, 03:49:56 AM »
No problem, this is a bug in Debian.

If HWfloat instructions are issued, an exception should be raised in the kernel and the corresponding command should fall back to softfloat automatically.

ElmerFudd

  • Newbie
  • *
  • Posts: 7
Re:Driver fails on recent debian unstable armhf
« Reply #9 on: May 18, 2012, 11:45:20 PM »
Just to recap on what we discussed on irc.

It is a bug in Debian that it exits the application, but the check itself is valid. Floating-point operations are not the problem.
The reason for the check is that the ABI for passing floats and doubles is different between softfp and hardfloat.
In hardfloat mode, vfp registers are used for passing floats and doubles, and they don't exist in softfp CPUs.
Previously this was only a problem if floats or doubles were used in function calls to and from the .so, but now a check has been introduced at load time.
This means the driver will need to support both the old softfloat ABI (with integer registers used for doubles) and the new ARM hardfloat ABI.
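
To see which of the two conventions a given build targets, one option is to ask the compiler. The sketch below assumes GCC or another compiler that defines the __arm__ and __ARM_PCS_VFP macros. The function name float_abi_name is hypothetical.

```c
/* Hypothetical helper: name the floating-point calling convention this
 * file was compiled for, based on compiler-defined macros. */
const char *float_abi_name(void)
{
#if defined(__arm__) && defined(__ARM_PCS_VFP)
  /* Hard-float ABI (armhf): float/double arguments use VFP registers. */
  return "ARM hard-float calling convention";
#elif defined(__arm__)
  /* soft or softfp: float/double arguments use integer registers. */
  return "ARM soft-float calling convention";
#else
  return "not a 32-bit ARM build";
#endif
}
```

Printing the result with puts(float_abi_name()) from a program built with the same toolchain and flags as a library gives a quick hint about which ABI that library was built for.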

ElmerFudd

  • Newbie
  • *
  • Posts: 7
Re:Driver fails on recent debian unstable armhf
« Reply #10 on: June 04, 2012, 07:54:05 PM »
This issue is now resolved with the latest drivers from last night. Big thanks to mrec for adding armhf support to the driver.
I'm now recording 1 show and watching 2 shows to test my new installation, and it works like a charm.

On a side note, my bug report for the exit-process issue in eglibc is now closed in Debian, meaning that from the next version it should refuse to load ELF files with the wrong ABI instead of exiting the process.