Discussion:
libc blocking functions/with_libc inside thread
Boris Mulder
2017-06-23 12:24:01 UTC
Hi all,

We are trying to get OpenVPN to work under Genode 17.05 again. One issue we
encountered was that lwip sockets seem to function incorrectly, so we
tried switching to lxip sockets.

However, sockets in lxip require the use of the with_libc() function.
When I put this around the openvpn_main() call it gives me the error:

Error: void Libc::Kernel::run(Libc::Application_code&) called from
non-kernel context

The catch here is that I'm calling all this from inside the code of a
Genode::Thread which executes the main function of OpenVPN. It seems
that I cannot use with_libc from any thread other than the entrypoint
thread.

Calling with_libc inside Component::construct() does not help the
new thread either: the thread still fails when calling socket().

Is there any way of using sockets correctly inside another thread?
--
Met vriendelijke groet / kind regards,

Boris Mulder

Cyber Security Labs B.V. | Gooimeer 6-31 | 1411 DD Naarden | The Netherlands
+31 35 631 3253 (office)
Christian Helmuth
2017-06-23 12:40:12 UTC
Hello Boris,
Post by Boris Mulder
> However, sockets in lxip require the use of the with_libc() function.
> Error: void Libc::Kernel::run(Libc::Application_code&) called from
> non-kernel context
> The catch here is that I'm calling all this from inside the code of a
> Genode::Thread which executes the main function of openvpn. It seems
> that I cannot use with_libc inside another thread than the entrypoint
> thread.
We also identified that exposing with_libc() in the Libc API was
the wrong direction. Therefore, we'll work on moving this aspect back
into the libc internals in the future. You're right that with_libc is
neither permitted nor needed for other threads/pthreads in libc
applications. Blocking situations are handled differently for threads
besides the main entrypoint. But the main entrypoint thread needs to
complete I/O operations by handling the I/O signals of the Genode
sessions used by the component.
Post by Boris Mulder
> calling with_libc inside Component::construct() does not work for the
> new thread. It will fail when calling socket().
> Is there any way of using sockets correctly inside another thread?
How does socket() fail if you do not wrap the call with with_libc()?
I'd expect the thread to open a socket_fs file and maybe block for the
I/O operation to complete. Also, is there any reason to use a
Genode::Thread for code that uses POSIX interfaces only (besides the
admittedly more concise syntax compared to pthread_create())?

Greets
--
Christian Helmuth
Genode Labs

https://www.genode-labs.com/ · https://genode.org/
https://twitter.com/GenodeLabs · /ˈdʒiː.nəʊd/

Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden
Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth
Boris Mulder
2017-06-23 13:21:28 UTC
Hello,
Post by Christian Helmuth
> How does socket() fail if you do not wrap the call with with_libc()?
> I'd expect the thread to open a socket_fs file and maybe block for the
> I/O operation to complete. Also, is there any reason to use a
> Genode::Thread which uses POSIX interfaces only (beside the admittedly
> more concise syntax compared to pthread_create())?
Basically, if I do not use with_libc, the call to socket() will hang
forever inside the first read() call to the socket file. The reason I
used a Genode::Thread was that the OpenVPN port already did that. Do you
think using a pthread might be better in this case?
Christian Helmuth
2017-06-23 13:28:18 UTC
Hey,
Post by Boris Mulder
> Basically if I do not use with_libc, the call to socket() will hang
> forever inside the first read() call to the socket file.
So, which code does your initial entrypoint execute? As I wrote
before, the initial entrypoint is responsible for completing the I/O
operations in Libc components. In other words, if the initial
entrypoint does not block and then handle libc I/O signals, other
threads blocked in the libc will never resume.
Post by Boris Mulder
> The reason I used a Genode::Thread was because openvpn already did
> that. Do you think using a pthread might be better in this case?
No, I was just curious ;-)

Greets
Boris Mulder
2017-06-23 13:59:53 UTC
The entrypoint creates the root component, spawns the thread and
returns. It will then handle RPC requests, as entrypoints do IIRC.

The program acts as a server (serving Nic sessions asynchronously) and
as a client to the lxip VFS through libc. The code can be found in [1].

How can I have the entrypoint handle I/O signals in libc while also
being able to serve clients in Genode?


[1]
https://github.com/genodelabs/genode/blob/master/repos/ports/src/app/openvpn/main.cc
Christian Helmuth
2017-06-26 08:57:53 UTC
Hello Boris,
Post by Boris Mulder
> The entrypoint creates the root component, spawns the thread and
> returns. It will then handle RPC requests, as entrypoints do IIRC.
> The program acts as a server (serving Nic sessions asynchronously) and
> as a client to lxip vfs with libc. the code can be found in [1].
> How can I have the entrypoint handle I/O signals in libc while also
> being able to serve clients in Genode?
This should happen automatically under the hood, as libc processes
signals in ordinary I/O signal handlers in the entrypoint.

Are you able to run the scenario on Linux and inspect the
processing of both threads via GDB? I fear that I cannot help with
specifics of OpenVPN, but I may be able to give guidance given more
details about the blocking situation. It may be interesting to know
whether any network packets reach the OpenVPN code.

Greets
Boris Mulder
2017-06-26 12:46:08 UTC
Hello Christian,

Actually, the OpenVPN code hangs once it calls the libc socket()
function. Internally, this function calls a blocking write(), and this
write() is handled by Libc::Kernel.

So OpenVPN does not send or receive any packets yet, as it is blocked at
socket().

Earlier, we used lwip as the socket library. When we did that,
socket() (and connect() in TCP mode) did work, but it failed to send any
initial data to the server, likewise blocking in some function.

We are reaching the limit of our knowledge of the Genode libc and the
side effects of the asynchronous entrypoint. At this point our debugging
went down into the libc kernel, and there is a limit to how deep we can go.
Help on this topic would be appreciated.

We uploaded the new 17.05-ready code of OpenVPN (including a run script
which can be run through make run/openvpn) to
https://github.com/nlcsl/genode/tree/openvpn_17.05 .

If you have the time, could you try to run it and see if it is possible
to let it produce a single UDP packet? For this, it is not necessary to
set up a server. From there, we could pick it up again.

We appreciate it,

Boris
Christian Helmuth
2017-06-26 14:17:31 UTC
Hello Boris,
Post by Boris Mulder
> Actually the OpenVPN code hangs once it calls the libc socket()
> function. Internally, this function calls a blocking write(), and this
> write() is handled by Libc::Kernel.
Thanks to the test case you provided and the hint about the "blocking
write", I was able to confirm my suspicion about the blocker in your
scenario. A rough sketch of my solution can be found here:

https://github.com/chelmuth/genode/commits/openvpn_17.05.

The issue is the unfortunate interplay of I/O-signal handling in the
initial entrypoint and the current implementation of the VFS plugin,
which interfaces with our file-system session. In the case of a
"blocking write", the VFS plugin calls
wait_and_dispatch_one_io_signal() directly on the initial entrypoint.
In your scenario, this results in the initial-entrypoint thread and the
OpenVPN thread racing on the handling of the first I/O signal. As the
entrypoint always wins, the OpenVPN thread is blocked until another
I/O signal occurs (which may never happen in the startup phase).

The sketched solution just reverses the roles of the first and second
application thread. Now, the initial entrypoint implements OpenVPN
(handling its own I/O signals) and the additional entrypoint
implements the NIC server (with root and session component).
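
The race and the role reversal described above can be pictured with a
toy model. The sketch below is plain C++ and deliberately not Genode
code: Component_model, writer_resumes(), and the rule that a competing
entrypoint always takes the pending signal first are hypothetical names
and assumptions made for illustration, based only on the description in
this message.

```cpp
#include <cassert>

// Toy model only: none of these names are Genode APIs.
struct Component_model
{
    unsigned pending_io_signals = 0;     // I/O signals queued for the component
    bool     write_acked        = false; // set by the I/O signal handler

    // The file-system server acknowledges the write: one I/O signal arrives.
    void server_acks_write() { ++pending_io_signals; }

    // Consume one pending signal and run its handler. Returns false if
    // nothing is pending, i.e. the caller would have to keep blocking.
    bool dispatch_one_io_signal()
    {
        if (pending_io_signals == 0)
            return false;

        --pending_io_signals;
        write_acked = true; // the handler marks the write as complete
        return true;
    }
};

// Returns true if the thread that issued the blocking write gets to resume.
bool writer_resumes(bool entrypoint_competes_for_signal)
{
    Component_model component;

    // The writer has submitted its request and now waits for an I/O signal.
    component.server_acks_write();

    // Assumption taken from the description: a competing entrypoint
    // always wins the race for the single pending signal.
    if (entrypoint_competes_for_signal) {
        bool const dispatched = component.dispatch_one_io_signal();
        assert(dispatched);
        (void)dispatched;
    }

    // Assumption: the writer re-checks its condition only after it has
    // dispatched a signal itself, so a consumed signal leaves it blocked.
    bool const writer_woke_up = component.dispatch_one_io_signal();
    return writer_woke_up && component.write_acked;
}
```

In this model, writer_resumes(true) stands for the original layout
(OpenVPN in a secondary thread next to the initial entrypoint), and
writer_resumes(false) stands for the reversed layout, in which the
thread doing the blocking write is the only one dispatching its I/O
signals.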

I hope this helps.

Greets
genode-main mailing list
https://lists.sourceforge.net/lists/listinfo/genode-main
Martijn Verschoor
2017-06-29 14:43:44 UTC
Hi,

Christian, thanks a lot for your speedy refactoring of the OpenVPN port to run the OpenVPN code in the main thread. The OpenVPN code no longer blocks on opening a socket and now tries to set up a VPN connection with the configured server. Unfortunately, we are now stumbling upon two new problems.

With OpenVPN configured to use UDP, the OpenVPN component starts the TLS handshake but fails. After some debugging, we noticed a pattern of retransmissions by the OpenVPN client. It appears to us that the OpenVPN client cannot read incoming packets from the socket until after it has written to the socket again (which happens due to retransmission after a timeout). If you are interested, take a look at the attached pcap in Wireshark and notice the duplication of messages. For reference, I also added a pcap of the OpenVPN port on 16.05.

We also notice that OpenVPN reads on the socket are non-blocking, as shown by the large number of READ (len -1) debug messages. This was previously not the case.

With OpenVPN configured to use TCP, the TLS handshake and key exchange complete successfully, yielding an OpenVPN connection between client and server. We would now expect the corresponding Nic session to become available for the Genode client that issued the Nic session request, but this is not the case. Instead, the client blocks indefinitely on the creation of the Nic::Connection. In the OpenVPN server, Root::_create_session returns and the Root calls _ep.manage(..) etc. What could keep the constructor of Nic::Connection blocking? Is this somehow related to the new asynchronous session creation process?

Met vriendelijke groet / kind regards,

Martijn Verschoor

Cyber Security Labs B.V. | Gooimeer 6-31 | 1411 DD Naarden | The Netherlands
+31 35 631 3253 (office) | +31 616 014 087 (mobile)
