libobjc2-clang

Riccardo Mottola-5
Hi,

I am doing some testing of libobjc2 + clang on NetBSD/amd64.
As we know, the full libobjc2 testsuite passes and most apps appear to work.

There are some oddities though, like this:

Unknown protocol version
[1]   Abort trap (core dumped) Vespucci

as soon as I launch it, it crashes.

Starting program: /Local/Tools/Vespucci
Unknown protocol version
Program received signal SIGABRT, Aborted.
0x00007a34713678aa in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0  0x00007a34713678aa in _lwp_kill () from /usr/lib/libc.so.12
#1  0x00007a347136715a in abort () from /usr/lib/libc.so.12
#2  0x00007a3471c18d56 in init_protocols () from /System/Library/Libraries/libobjc.so.4.6
#3  0x00007a3471c18b0d in objc_init_protocols () from /System/Library/Libraries/libobjc.so.4.6
#4  0x00007a3471c12647 in objc_load_class () from /System/Library/Libraries/libobjc.so.4.6
#5  0x00007a3471c18235 in __objc_load () from /System/Library/Libraries/libobjc.so.4.6
#6  0x00007a34725730a4 in ?? () from /System/Library/Libraries/libgnustep-base.so.1.27.0
#7  0x00007a3472ed9c00 in ?? ()
#8  0x00007a347256ff69 in _init () from /System/Library/Libraries/libgnustep-base.so.1.27.0
#9  0x0000000000000000 in ?? ()


Ideas?

Riccardo

Re: libobjc2-clang

David Chisnall-7
Hi,

From the backtrace, this looks as if it's the v2 ABI.  The assert
that's firing is here:

https://github.com/gnustep/libobjc2/blob/ed8eec6c6aa82b049fc8292d0c247b8cd6c2fddc/protocol.c#L224

So it's finding an isa pointer for a protocol that is neither one of
the known protocol classes nor a known version number.  So the
question is... what is it?  Can you walk up the stack to that point and see?

David

On 10/06/2020 18:15, Riccardo Mottola wrote:

> Hi,
>
> I am testing a bit libobjc2 + clang on NetBSD/amd64
> As we know, the full libobjc2 testsuite passes and most apps appear to
> work.
>
> There are some oddities though, like this:
>
> Unknown protocol version[1]   Abort trap (core dumped) Vespucci
>
> as soon as I launch it, it crashes.
>
> Ideas?
>
> Riccardo
>

Re: libobjc2-clang

Riccardo Mottola-5
Hi David,

David Chisnall wrote:

>
> From the back trace, this looks as if it's the v2 ABI.  The assert
> that's firing is here:
>
> https://github.com/gnustep/libobjc2/blob/ed8eec6c6aa82b049fc8292d0c247b8cd6c2fddc/protocol.c#L224 
>
>
> So it's finding an isa pointer for a protocol that is neither one
> of the known protocol classes nor a known version number. So the
> question is... what is it?  Can you walk up the stack to that point
> and see?

I tried recompiling base with debug symbols, but that gives no extra
information; the stack trace shows nothing new.

I tried SWK/Vespucci on NetBSD (another system) with GCC and it works
fine. It also works on FreeBSD with clang/libobjc2.
So the issue is "here"... I wonder why.

Unknown protocol version
Program received signal SIGABRT, Aborted.
0x00007afc75d678aa in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0  0x00007afc75d678aa in _lwp_kill () from /usr/lib/libc.so.12
#1  0x00007afc75d6715a in abort () from /usr/lib/libc.so.12
#2  0x00007afc76618d56 in init_protocols ()
    from /System/Library/Libraries/libobjc.so.4.6
#3  0x00007afc76618b0d in objc_init_protocols ()
    from /System/Library/Libraries/libobjc.so.4.6
#4  0x00007afc76612647 in objc_load_class ()
    from /System/Library/Libraries/libobjc.so.4.6
#5  0x00007afc76618235 in __objc_load ()
    from /System/Library/Libraries/libobjc.so.4.6
#6  0x00007afc76dacbcd in objcv2_load_function ()
    from /System/Library/Libraries/libgnustep-base.so.1.27.0
#7  0x00007afc76dab994 in ?? ()
    from /System/Library/Libraries/libgnustep-base.so.1.27.0
#8  0x00007afc777e9c00 in ?? ()
#9  0x00007afc76da8629 in _init ()
    from /System/Library/Libraries/libgnustep-base.so.1.27.0
#10 0x0000000000000000 in ?? ()

I suppose I should be able to inspect frame #6 and see what it is loading?

I tried compiling with no optimization and with debug info, but I can't
get a better stack trace.

Perhaps the problem is in libobjc? How can I compile libobjc2 in debug
mode? I think I need to activate CMAKE_ASM_FLAGS_DEBUG somehow.

Riccardo

Re: libobjc2-clang

Wolfgang Lux


> On 12.06.2020 at 18:34, Riccardo Mottola <[hidden email]> wrote:
>
>
> Perhaps libobjc? How can I compile libobjc2 in debug? I think I need to activate CMAKE_ASM_FLAGS_DEBUG somehow

No. But you want to reconfigure libobjc2 with CMAKE_BUILD_TYPE=RelWithDebInfo (or, if you absolutely insist on turning off optimization, CMAKE_BUILD_TYPE=Debug).

Wolfgang


Re: libobjc2-clang

Riccardo Mottola-5
Hi,

Wolfgang Lux wrote:
>> Perhaps libobjc? How can I compile libobjc2 in debug? I think I need to activate CMAKE_ASM_FLAGS_DEBUG somehow
> No. But you want to reconfigure libobjc2 with CMAKE_BUILD_TYPE=RelWithDebInfo (or, if you absolutely insist on turning off optimization, CMAKE_BUILD_TYPE=Debug).

Thank you. I used "Debug" to be most extreme (and also to exclude issues
with optimizations, which never hurts).
It still crashes, and now the backtrace shows:

0x0000753c801678aa in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0  0x0000753c801678aa in _lwp_kill () from /usr/lib/libc.so.12
#1  0x0000753c8016715a in abort () from /usr/lib/libc.so.12
#2  0x0000753c80a18d56 in init_protocols (protocols=0x753c816c91b0 <objc_protocol_list>)
     at /home/multix/code/gnustep-vcs/libobjc2/protocol.c:225
#3  0x0000753c80a18b0d in objc_init_protocols (protocols=0x753c816c91b0 <objc_protocol_list>)
     at /home/multix/code/gnustep-vcs/libobjc2/protocol.c:258
#4  0x0000753c80a12647 in objc_load_class (class=0x753c816c91d0 <._OBJC_CLASS_NSAffineTransform>)
     at /home/multix/code/gnustep-vcs/libobjc2/class_table.c:465
#5  0x0000753c80a18235 in __objc_load (init=0x753c816ba9a0 <objc_init>)
     at /home/multix/code/gnustep-vcs/libobjc2/loader.c:268
#6  0x0000753c812baced in objcv2_load_function () from /System/Library/Libraries/libgnustep-base.so.1.27.0
#7  0x0000753c812b9ab4 in ?? () from /System/Library/Libraries/libgnustep-base.so.1.27.0
#8  0x0000753c81cf7c00 in ?? ()
#9  0x0000753c812b6749 in _init () from /System/Library/Libraries/libgnustep-base.so.1.27.0
#10 0x0000000000000000 in ?? ()

How can I print this out?
(gdb) p protocols
$4 = (struct objc_protocol_list *) 0x753c816c91b0 <objc_protocol_list>

I just want to know which "list" this is and what should be loaded; I
suppose there is an error here.

Riccardo

Re: libobjc2-clang

Wolfgang Lux

> On 17.06.2020 at 13:28, Riccardo Mottola <[hidden email]> wrote:
>
> Hi,
>
> Wolfgang Lux wrote:
>>> Perhaps libobjc? How can I compile libobjc2 in debug? I think I need to activate CMAKE_ASM_FLAGS_DEBUG somehow
>> No. But you want to reconfigure libobjc2 with CMAKE_BUILD_TYPE=RelWithDebInfo (or, if you absolutely insist on turning off optimization, CMAKE_BUILD_TYPE=Debug).
>
> thank you. I used "Debug" to be most extreme (and also to exclude issues with optimizations, which never hurts).
> It still crashes and so:
>
> 0x0000753c801678aa in _lwp_kill () from /usr/lib/libc.so.12
> (gdb) bt
> #0  0x0000753c801678aa in _lwp_kill () from /usr/lib/libc.so.12
> #1  0x0000753c8016715a in abort () from /usr/lib/libc.so.12
> #2  0x0000753c80a18d56 in init_protocols (protocols=0x753c816c91b0 <objc_protocol_list>)
>     at /home/multix/code/gnustep-vcs/libobjc2/protocol.c:225
> #3  0x0000753c80a18b0d in objc_init_protocols (protocols=0x753c816c91b0 <objc_protocol_list>)
>     at /home/multix/code/gnustep-vcs/libobjc2/protocol.c:258
> #4  0x0000753c80a12647 in objc_load_class (class=0x753c816c91d0 <._OBJC_CLASS_NSAffineTransform>)
>     at /home/multix/code/gnustep-vcs/libobjc2/class_table.c:465
> #5  0x0000753c80a18235 in __objc_load (init=0x753c816ba9a0 <objc_init>)
>     at /home/multix/code/gnustep-vcs/libobjc2/loader.c:268
> #6  0x0000753c812baced in objcv2_load_function () from /System/Library/Libraries/libgnustep-base.so.1.27.0
> #7  0x0000753c812b9ab4 in ?? () from /System/Library/Libraries/libgnustep-base.so.1.27.0
> #8  0x0000753c81cf7c00 in ?? ()
> #9  0x0000753c812b6749 in _init () from /System/Library/Libraries/libgnustep-base.so.1.27.0
> #10 0x0000000000000000 in ?? ()
>
> how can I print out this?
> (gdb) p protocols
> $4 = (struct objc_protocol_list *) 0x753c816c91b0 <objc_protocol_list>

Since protocols here is a pointer to a struct, you'd do it in the obvious way: p *protocols. However, the really interesting value would be that of *aProto (aProto itself is set to protocols->list[i] inside the loop where the code crashes).

> just to know which "list" and what should be loaded, I suppose there is an error here.
>
> Riccardo

Wolfgang

Re: libobjc2-clang

David Chisnall-7
In reply to this post by Riccardo Mottola-5
Wow, I ignored email for a week and there are 50 unread emails in my GNUstep folder!  Great to see some renewed interest and activity in the project!

> On 17 Jun 2020, at 12:28, Riccardo Mottola <[hidden email]> wrote:
>
> Hi,
>
> Wolfgang Lux wrote:
>>> Perhaps libobjc? How can I compile libobjc2 in debug? I think I need to activate CMAKE_ASM_FLAGS_DEBUG somehow
>> No. But you want to reconfigure libobjc2 with CMAKE_BUILD_TYPE=RelWithDebInfo (or, if you absolutely insist on turning off optimization, CMAKE_BUILD_TYPE=Debug).
>
> thank you. I used "Debug" to be most extreme (and also to exclude issues with optimizations, which never hurts).

Debug doesn’t actually make much of a difference to overall performance.  The most performance-critical bits are hand-coded assembly.

> It still crashes and so:
>
> 0x0000753c801678aa in _lwp_kill () from /usr/lib/libc.so.12
> (gdb) bt
> #0  0x0000753c801678aa in _lwp_kill () from /usr/lib/libc.so.12
> #1  0x0000753c8016715a in abort () from /usr/lib/libc.so.12
> #2  0x0000753c80a18d56 in init_protocols (protocols=0x753c816c91b0 <objc_protocol_list>)
>     at /home/multix/code/gnustep-vcs/libobjc2/protocol.c:225
> #3  0x0000753c80a18b0d in objc_init_protocols (protocols=0x753c816c91b0 <objc_protocol_list>)
>     at /home/multix/code/gnustep-vcs/libobjc2/protocol.c:258
> #4  0x0000753c80a12647 in objc_load_class (class=0x753c816c91d0 <._OBJC_CLASS_NSAffineTransform>)
>     at /home/multix/code/gnustep-vcs/libobjc2/class_table.c:465
> #5  0x0000753c80a18235 in __objc_load (init=0x753c816ba9a0 <objc_init>)
>     at /home/multix/code/gnustep-vcs/libobjc2/loader.c:268
> #6  0x0000753c812baced in objcv2_load_function () from /System/Library/Libraries/libgnustep-base.so.1.27.0
> #7  0x0000753c812b9ab4 in ?? () from /System/Library/Libraries/libgnustep-base.so.1.27.0
> #8  0x0000753c81cf7c00 in ?? ()
> #9  0x0000753c812b6749 in _init () from /System/Library/Libraries/libgnustep-base.so.1.27.0
> #10 0x0000000000000000 in ?? ()
>
> how can I print out this?
> (gdb) p protocols
> $4 = (struct objc_protocol_list *) 0x753c816c91b0 <objc_protocol_list>
>
> just to know which "list" and what should be loaded, I suppose there is an error here.

This is the protocol list that is currently being initialised.  It would be interesting to see what `i` is (I guess 0), because that tells us if we’re failing on the first protocol in a class or a subsequent one.  Then look at `version`, which should be a protocol version number.  If it isn’t a small value (<16), look at `aProto->isa`.

Can you also confirm whether you’ve built with OLDABI_COMPAT enabled?  If you didn’t, then you’ll see this message if you try to load any code that isn’t compiled with the v2 ABI.

David


Re: libobjc2-clang

Riccardo Mottola-5
Hi David, hi Wolfgang,


David Chisnall wrote:
> Wow, I ignored email for a week and there are 50 unread emails in my GNUstep folder!  Great to see some renewed interest and activity in the project!

Well, perhaps one of the effects of the virus lockdown was to have more time
to hack again! At least for me that is the case... I updated a lot of systems
that I had not had time for, wrote code and discovered new pitfalls.
I hope you are all fine.

>
>> how can I print out this?
>> (gdb) p protocols
>> $4 = (struct objc_protocol_list *) 0x753c816c91b0 <objc_protocol_list>
>>
>> just to know which "list" and what should be loaded, I suppose there is an error here.
> This is the protocol list that is currently being initialised.  It would be interesting to see what `i` is (I guess 0), because that tells us if we’re failing on the first protocol in a class or a subsequent one.  Then look at `version`, which should be a protocol version number.  If it isn’t a small value (<16), look at `aProto->isa`.

you guessed correctly and the protocol is also a very common one, NSCopying

(gdb) p *protocols
$4 = {next = 0x0, count = 2, list = 0x7b8aa32ea1c0 <objc_protocol_list+16>}

(gdb) p i
$1 = 0
(gdb) p aProto
$2 = (struct objc_protocol *) 0x7b8aa39138b8 <._OBJC_PROTOCOL_NSCopying>


version is small:
(gdb) p version
$3 = 0



>
> Can you also confirm whether you’ve built with OLDABI_COMPAT enabled?  If you didn’t, then you’ll see this message if you try to load any code that isn’t compiled with the v2 ABI.

I confirm OLDABI_COMPAT is on. I did not touch it. LEGACY_COMPAT is off.

I only changed the linker option and then the build type.

Further question: this is a from-scratch install, all compiled with the
same version of clang. Can I assume that all libraries have the same ABI
or not? Maybe there is a makefile issue somewhere?

Thanks,

Riccardo


Re: libobjc2-clang

David Chisnall-7
Thanks!

> On 18 Jun 2020, at 12:52, Riccardo Mottola <[hidden email]> wrote:
>
> you guessed correctly and the protocol is also a very common one, NSCopying

So, part of the question is whether this is the first time we’re seeing this or not.  Can you stick a watchpoint on the isa pointer and restart it, see if it’s modified before here?

> (gdb) p *protocols
> $4 = {next = 0x0, count = 2, list = 0x7b8aa32ea1c0 <objc_protocol_list+16>}
>
> (gdb) p i
> $1 = 0
> (gdb) p aProto
> $2 = (struct objc_protocol *) 0x7b8aa39138b8 <._OBJC_PROTOCOL_NSCopying>
>
>
> version is small:
> (gdb) p version
> $3 = 0

That’s very odd.  Here’s the definition of the enum:

https://github.com/gnustep/libobjc2/blob/369c84db35a6a1e94f8a4689a695fabdac056166/protocol.h#L26

The isa pointer for each protocol is initially set to one of those enum values (2, 3, or 4) by the compiler and is then set to a proper Objective-C class.  It should never end up 0.  It’s possible that something has corrupted memory, or that we’ve read just the low 32 bits after this has been set to a 64-bit address that happens to have nothing in the low 32 bits, but it seems quite unlikely.


>> Can you also confirm whether you’ve built with OLDABI_COMPAT enabled?  If you didn’t, then you’ll see this message if you try to load any code that isn’t compiled with the v2 ABI.
>
> I confirm OLDABI_COMPAT is on. I did not touch it. LEGACY_COMPAT is off.
>
> I only changed the linker option and then the build type.
>
> Further question: this is a from-scratch install, all compiled with the same version of clang, can I assume that all libraries have the same ABI or not? maybe there is a makefile issue somewhere?

Should be.  You can see if __objc_exec_class is called - that’s the entry point used by old ABI code.

David


Re: libobjc2-clang

Riccardo Mottola-5
Hi David,

David Chisnall wrote:
>
>> On 18 Jun 2020, at 12:52, Riccardo Mottola <[hidden email]> wrote:
>>
>> you guessed correctly and the protocol is also a very common one, NSCopying
> So, part of the question is whether this is the first time we’re seeing this or not.  Can you stick a watchpoint on the isa pointer and restart it, see if it’s modified before here?

as simple as this?
(gdb) watch aProto->isa
Hardware watchpoint 4: aProto->isa

I reran the program but it falls through to the crash, so it appears
that it does not get changed, or I did not set it correctly.


> That’s very odd.  Here’s the definition of the enum:
>
> https://github.com/gnustep/libobjc2/blob/369c84db35a6a1e94f8a4689a695fabdac056166/protocol.h#L26
>
> The isa pointer for each protocol is initially set to one of those enum values (2, 3, or 4) by the compiler and is then set to a proper Objective-C class.  It should never end up 0.  It’s possible that something has corrupted memory or that we’ve just read the low 32 bits this has been set to a 64-bit address that happens to have nothing in the low 32 bits, but it seems quite unlikely.
>

I see, this is very strange. What is "really" strange is that this
is the amd64 architecture and a known compiler, and the same code works on
FreeBSD (and Linux IIRC).
Would NetBSD make the difference? Or the fact that this is "genuine AMD" and
not Intel? That would be very strange.

>
>> I confirm OLDABI_COMPAT is on. I did not touch it. LEGACY_COMPAT is off.
>>
>> I only changed the linker option and then the build type.
>>
>> Further question: this is a from-scratch install, all compiled with the same version of clang, can I assume that all libraries have the same ABI or not? maybe there is a makefile issue somewhere?
> Should be.  You can see if __objc_exec_class is called - that’s the entry point used by old ABI code.

(gdb) b __obj_exec_class
Function "__obj_exec_class" not defined.

Apparently it is not even defined, but this is strange, since I compiled
with OLDABI_COMPAT:

I tried being more explicit:
(gdb) b loader.c:328
Breakpoint 3 at 0x72b92fe1859f: file
/home/multix/code/gnustep-vcs/libobjc2/loader.c, line 330.

and re-run the program, it does not get into that function, so we can
assume it is new code.

Riccardo

Re: libobjc2-clang

Wolfgang Lux


> On 18.06.2020 at 17:21, Riccardo Mottola <[hidden email]> wrote:
>
> Hi David,
>
> David Chisnall wrote:
>>
>>> On 18 Jun 2020, at 12:52, Riccardo Mottola <[hidden email]> wrote:
>>>
>>> you guessed correctly and the protocol is also a very common one, NSCopying
>> So, part of the question is whether this is the first time we’re seeing this or not.  Can you stick a watchpoint on the isa pointer and restart it, see if it’s modified before here?
>
> as simple as this?
> (gdb) watch aProto->isa
> Hardware watchpoint 4: aProto->isa
>
> I reran the program but it falls through to the crash, so it appears that it does not get changed, or I did not set it correctly.
>
>
>> That’s very odd.  Here’s the definition of the enum:
>>
>> https://github.com/gnustep/libobjc2/blob/369c84db35a6a1e94f8a4689a695fabdac056166/protocol.h#L26
>>
>> The isa pointer for each protocol is initially set to one of those enum values (2, 3, or 4) by the compiler and is then set to a proper Objective-C class.  It should never end up 0.  It’s possible that something has corrupted memory or that we’ve just read the low 32 bits this has been set to a 64-bit address that happens to have nothing in the low 32 bits, but it seems quite unlikely.
>>
>
> I see, this is very strange. What is "really" strange is that that this is amd64 bit architecture and a known compiler, the same code works on FreeBSD (and Linux IIRC).
> NetBSD would make the difference? or that this is "genuine AMD" and not intel? would be very strange.
>
>>
>>> I confirm OLDABI_COMPAT is on. I did not touch it. LEGACY_COMPAT is off.
>>>
>>> I only changed the linker option and then the build type.
>>>
>>> Further question: this is a from-scratch install, all compiled with the same version of clang, can I assume that all libraries have the same ABI or not? maybe there is a makefile issue somewhere?
>> Should be.  You can see if __objc_exec_class is called - that’s the entry point used by old ABI code.
>
> (gdb) b __obj_exec_class
> Function "__obj_exec_class" not defined.

The correct name would have been __objc_exec_class.

> Apparently it is not even defined, but this is strange, since I compiled with OLDABI_COMPAT:
>
> I tried being more explicit:
> (gdb) b loader.c:328
> Breakpoint 3 at 0x72b92fe1859f: file /home/multix/code/gnustep-vcs/libobjc2/loader.c, line 330.
>
> and re-run the program, it does not get into that function, so we can assume it is new code.

Anyway, I was able to reproduce this issue on NetBSD 9.
First of all, getting the configuration right seems to be a royal PITA. So what I did was:

1. Configure libobjc2 with
  cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCXX_RUNTIME_LIB=/usr/lib/libstdc++.so -DCMAKE_BUILD_TYPE=RelWithDebInfo
The CXX_RUNTIME_LIB option is necessary because cmake by default wants to use /usr/lib/libsupc++.a (as Riccardo already noticed earlier).
The CMAKE_BUILD_TYPE option is only needed to debug the runtime itself (but that was of course necessary in this case).

2. Configure gnustep-make with
  ./configure CC=clang CXX=clang --with-layout=gnustep --with-library-combo=ng
The --with-library-combo option (yes, ng is sufficient, it's an alias for ng-gnu-gnu) is necessary to detect the new runtime system. ISTR that this would be detected automatically when configuring with clang, but apparently it didn't work for me on NetBSD.

3. Install GNU binutils via pkgin, because the NetBSD loader seems to generate invalid object files for subprojects, too.

4. Build gnustep-base and gnustep-back with the command
  env AUXILIARY_LDFLAGS=-fuse-ld=/usr/pkg/gnu/bin/ld.gold make
to make clang use gold instead of the standard linker. You can use
  AUXILIARY_LDFLAGS=-fuse-ld=/usr/pkg/gnu/bin/ld.gold make
instead if you're not using csh as I do.

This was at least enough to get Ink compiled and running.

I'm still facing problems with my favorite toy project StepTalk because apparently it doesn't get the compilation flags right and omits the dependent libraries for at least the StepTalk framework itself. In effect that means that the framework can be initialized before gnustep-base and hence all protocols have their isa pointer set to 0, which obviously causes problems later.

Wolfgang




Re: libobjc2-clang

Wolfgang Lux


> On 28.06.2020 at 21:21, Wolfgang Lux <[hidden email]> wrote:
>
> I'm still facing problems with my favorite toy project StepTalk because apparently it doesn't get the compilation flags right and misses out the dependent libraries on at least the StepTalk framework itself. In effect that means that the framework can be initialized before gnustep-base and hence all protocols have their isa pointer set to 0, which obviously causes problems later.

And another problem I'm facing is that exceptions aren't caught but rather crash the program. For instance, given the simple test program
  #import <Foundation/Foundation.h>

  int
  main()
  {
    @autoreleasepool {
      @try {
        [NSException raise: @"Test" format: @"Test"];
      }
      @catch (NSException *e) {
        NSLog(@"Caught exception %@", e);
      }
    }
    return 0;
  }
I would expect a message 'Caught exception Test' in the shell, but on NetBSD the program simply runs into the uncaught exception handler and then aborts:
(gdb) bt
#0  0x000074bd057678aa in _lwp_kill () from /usr/lib/libc.so.12
#1  0x000074bd0576715a in abort () from /usr/lib/libc.so.12
#2  0x000074bd06a28411 in _terminate () at NSException.m:1370
#3  0x000074bd06a283cd in _NSFoundationUncaughtExceptionHandler (
    exception=0x74bd06708ea8) at NSException.m:1395
#4  0x000074bd06a26ec6 in callUncaughtHandler (value=0x74bd06708ea8)
    at NSException.m:1415
#5  0x000074bd064191f1 in objc_exception_throw (object=0x74bd06708ea8)
    at /home/wlux/src/GNUstep/libobjc2/eh_personality.c:243
#6  0x000074bd06a27aff in -[NSException raise] (self=0x74bd06708ea8,
    _cmd=0x74bd06dbe078 <objc_selector_raise_v160:8>) at NSException.m:1586
#7  0x000074bd06a2713f in +[NSException raise:format:arguments:] (
    self=0x74bd06d52b18 <._OBJC_CLASS_NSException>,
    _cmd=0x74bd06dc08d8 <.objc_selector_raise:format:arguments:_v400:81624[1{__va_list_tag=II^v^v}]32>, name=0xa9979f4000000024, format=0xa9979f4000000024,
    argList=0x7f7fff19e870) at NSException.m:1465
#8  0x000074bd06a270b7 in +[NSException raise:format:] (
    self=0x74bd06d52b18 <._OBJC_CLASS_NSException>,
    _cmd=0x6014a0 <objc_selector_raise:format:_v320:81624>,
    name=0xa9979f4000000024, format=0xa9979f4000000024) at NSException.m:1450
#9  0x0000000000400eb1 in gnustep_base_user_main () at testCatch.m:8
#10 0x000074bd06abd6da in main (argc=1, argv=0x7f7fff19e948,
    env=0x7f7fff19e958) at NSProcessInfo.m:1008
#11 0x0000000000400dad in ___start ()
#12 0x00007f7f5460e918 in ?? () from /libexec/ld.elf_so
#13 0x0000000000000001 in ?? ()
#14 0x00007f7fff19efb8 in ?? ()
#15 0x0000000000000000 in ?? ()


Re: libobjc2-clang

Riccardo Mottola-5
In reply to this post by Wolfgang Lux
Hi Wolfgang.


It is good news that you are able to reproduce this. It means it is not just
an artifact of my setup!


Wolfgang Lux wrote:

>
>> Apparently it is not even defined, but this is strange, since I compiled with OLDABI_COMPAT:
>>
>> I tried being more explicit:
>> (gdb) b loader.c:328
>> Breakpoint 3 at 0x72b92fe1859f: file /home/multix/code/gnustep-vcs/libobjc2/loader.c, line 330.
>>
>> and re-run the program, it does not get into that function, so we can assume it is new code.
> Anyway, I was able to reproduce this issue on NetBSD 9.
> First of all getting the configuration right seems to be a royal PITA. So what I did was:
>
> 1. Configure libobjc2 with cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCXX_RUNTIME_LIB=/usr/lib/libstdc++.so -DCMAKE_BUILD_TYPE=RelWithDebInfo
> The CXX_RUNTIME_LIB option is necessary because cmake by default wants to use /usr/lib/libsupc++.a (as Riccardo already noticed earlier).
> The CMAKE_BUILD_TYPE is only necessary to debug the runtime itself (but that was of course necessary in this case).
>
> 2. Configure gnustep-make with ./configure CC=clang CXX=clang --with-layout=gnustep --with-library-combo=ng
> The --with-library-combo option (yes, ng is sufficient, it's an alias for ng-gnu-gnu) is necessary to detect the new runtime system. ISTR, that this would be detected automatically when configuring with clang, but apparently it didn't work for me on NetBSD.

Also on FreeBSD... and generally I found that it is best to set clang+ng explicitly.
Otherwise you get the compatibility GNU runtime.
I always hoped David would fix it... I'd like to see if I can use
libobjc2 with gcc, but that is another story entirely.


> 3. Install GNU binutils via pkgin, because the NetBSD loader seems to generate invalid objects files for subprojects, too.

Yes that is mandatory to get the gold linker too!

>
> 4. Build gnustep-base and gnustep-back with the command
>    env AUXILIARY_LDFLAGS=-fuse-ld=/usr/pkg/gnu/bin/ld.gold make
> to make clang use gold instead of the standard linker. You can use
>    AUXILIARY_LDFLAGS=-fuse-ld=/usr/pkg/gnu/bin/ld.gold make
> instead if you're not using a csh as I do.

I think I only set this for libobjc2... or in any case, wherever I had
to set the linker separately, I set it.

>
> This was at least enough to get Ink compiled and running.
>
> I'm still facing problems with my favorite toy project StepTalk because apparently it doesn't get the compilation flags right and misses out the dependent libraries on at least the StepTalk framework itself. In effect that means that the framework can be initialized before gnustep-base and hence all protocols have their isa pointer set to 0, which obviously causes problems later.
>


I have most applications running... only SWK+Vespucci give these strange
errors.
Are these linker differences perhaps also behind the issues Patrick is
seeing? I do wonder.

Anyway, do you have further ideas on this strange failure? Does it
happen for any other app? Ink is working, but what about the rest?

Also, one more piece of bad news, since David asked me: there is no
Valgrind on NetBSD/amd64. There was a port for i386, but it never made it
into the official tree. Too "linux-like", apparently.

Riccardo


Re: libobjc2-clang

Wolfgang Lux


> On 29.06.2020 at 23:23, Riccardo Mottola <[hidden email]> wrote:
>
> I have most applications running... only SWK+Vespucci give these strange errors.
> Are these linker differences perhaps also behind the issues Patrick is seeing? I do wonder.
>
> Anyway, do you have further ideas on this strange failure? does it happen for any other app? Ink is working, but the rest?

I haven't tried this for other apps, but for SWK+Vespucci the problem is one of initialization order (the same one I've seen, and fixed, for StepTalk): the __objc_load function is called for libSimpleWebKit *before* libgnustep-base, which means that all protocol references in libSimpleWebKit are set to nil because the actual protocol classes are not yet known. The underlying problem is that the SimpleWebKit framework fails to list the libraries it depends on. This sloppy behavior used to work well on *ix (but not on Windows, and not without extra flags on macOS either, although in the latter case gnustep-make adds them for you), but it no longer works with version 2.x of the gnustep runtime. So it's important to add
  «target»_LIBRARIES_DEPEND_UPON += -lgnustep-base
to all library and framework projects to make them compatible with clang+ng. In this particular case, adding
  SimpleWebKit_LIBRARIES_DEPEND_UPON += -lgnustep-base
to Sources/GNUmakefile and recompiling SimpleWebKit from scratch should get you straight.
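
One way to check whether a library actually records its dependency on gnustep-base is to look at the NEEDED entries in its dynamic section. The following is only a rough sketch: needs_lib is a hypothetical helper name (not part of gnustep-make), the library path in the usage comment is a placeholder, and it assumes a readelf that prints GNU-style "(NEEDED) ... Shared library: [name]" lines.

```shell
#!/bin/sh
# needs_lib NAME
# Reads `readelf -d` output on stdin and succeeds (exit status 0) if it
# contains a NEEDED entry for a shared library whose name starts with NAME.
# NAME is used as a basic regular expression, so a "." matches any character.
# Hypothetical helper for illustration only.
needs_lib() {
    grep -q "(NEEDED).*\[$1"
}

# Usage (the path is a placeholder, adjust it to your installation):
#   readelf -d /path/to/libSimpleWebKit.so | needs_lib libgnustep-base \
#       && echo "gnustep-base is listed" \
#       || echo "gnustep-base is missing"
```

If the entry is missing, the dynamic loader has no reason to initialize libgnustep-base first, which would be consistent with the load order described above.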

Wolfgang

PS Incidentally, I've noticed that you don't actually need to use gold: /usr/pkg/gnu/bin/ld.elf would work as well. Apparently the current NetBSD pkgsrc comes with a recent enough version of binutils where the issue with ld -r has been fixed.



Re: libobjc2-clang

Riccardo Mottola-5
Wolfgang Lux wrote:
> I haven't tried this for other apps, but for SWK+Vespucci the problem is one of initialization order (the same one I've seen, and fixed, for StepTalk): the __objc_load function is called for libSimpleWebKit *before* libgnustep-base, which means that all protocol references in libSimpleWebKit are set to nil because the actual protocol classes are not yet known. The underlying problem is that the SimpleWebKit framework fails to list the libraries it depends on. This sloppy behavior used to work well on *ix (but not on Windows, and not without extra flags on macOS either, although in the latter case gnustep-make adds them for you), but it no longer works with version 2.x of the gnustep runtime. So it's important to add
>    «target»_LIBRARIES_DEPEND_UPON += -lgnustep-base
> to all library and framework projects to make them compatible with clang+ng. In this particular case, adding
>    SimpleWebKit_LIBRARIES_DEPEND_UPON += -lgnustep-base
> to Sources/GNUmakefile and recompiling SimpleWebKit from scratch should get you straight.

That's a little inconvenient, but it makes sense. I wonder how much stuff
needs to be fixed.

I added (inside the subproject, not the top-level of the Framework):

LIBRARIES_DEPEND_UPON = $(FND_LIBS) $(OBJC_LIBS)

>
> Wolfgang
>
> PS Incidentally, I've noticed that you don't actually need to use gold: /usr/pkg/gnu/bin/ld.elf would work as well. Apparently the current NetBSD pkgsrc comes with a recent enough version of binutils where the issue with ld -r has been fixed.

I need it for libobjc2, not for other stuff.