Discussion:
Intercepting ram session of child
Denis Huber
2016-08-14 16:03:02 UTC
Hello Genode community,

I am working on a checkpointing component which runs in Genode's
userland, on Genode 16.05 on the foc kernel (pbxa9 build). I want to
provide custom Ram, Pd, and Cpu sessions for a child component (target)
to intercept the remote procedure calls and keep track of the state of the
child. But I am having trouble creating such sessions.

I looked into the gdb_monitor and noux components and tried to adopt the
concepts used. In the gdb_monitor, I found the pd_session_component
without a pd_root implementation to be a very promising approach.

It creates a custom Gdb_monitor::Pd_session_component, which wraps a
Genode::Pd_connection object. In its constructor, the component calls the
manage method of an entrypoint on itself, so that remote procedure calls
to this object can be handled.
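
The wrapping idea can be sketched in plain C++ without any Genode headers. This is only a toy model: Ram_interface, Parent_ram, and Monitored_ram are hypothetical names, not Genode classes, and the entrypoint/manage step is left out entirely. It merely shows a wrapper that records each request and forwards it to the real session.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a RAM session interface. This is NOT Genode's
// Ram_session; all names here are hypothetical.
struct Ram_interface
{
    virtual ~Ram_interface() = default;
    virtual int alloc(std::size_t size) = 0;   // returns a dataspace id
};

// Stand-in for the real session provided by the parent.
struct Parent_ram : Ram_interface
{
    int next_id = 1;
    int alloc(std::size_t) override { return next_id++; }
};

// Intercepting wrapper: forwards each request and records it.
struct Monitored_ram : Ram_interface
{
    struct Record { std::size_t size; int id; };

    Ram_interface      &parent;
    std::vector<Record> log;

    explicit Monitored_ram(Ram_interface &p) : parent(p) { }

    int alloc(std::size_t size) override
    {
        int const id = parent.alloc(size);   // forward to the real session
        log.push_back({size, id});           // remember it for a checkpoint
        return id;
    }
};
```

A client that is handed a Monitored_ram instead of the Parent_ram sees the same interface, while the monitor accumulates the list of allocations.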

I tried to adopt this concept for a custom Ram session. I created a very
short test program [1] (Run script in [2]) which creates a custom
Ram_session_component and tries to transfer quota from its environment
to the custom Ram session component. The quota transfer fails with error
code -1, which means the recipient of the transfer is not a valid
session component [3].

I think the problem is that I have not created a valid capability for
the custom Ram_session_component. What important fact am I missing in my
implementation? I am grateful for every hint you give me :)

[1]
https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/d2c6705bb9951f58d3b01f8e35fa68197cc73cd7/src/random/main.cc

[2]
https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/d2c6705bb9951f58d3b01f8e35fa68197cc73cd7/run/random.run

[3]
https://github.com/genodelabs/genode/blob/16.05/repos/base/src/core/ram_session_component.cc#L75


Kind regards,
Denis
Norman Feske
2016-08-17 12:06:22 UTC
Hi Denis,

first, thank you for providing a test case in such a nicely condensed
form. It is great for pin-pointing the problem.
Post by Denis Huber
I tried to adopt this concept for a custom Ram session. I created a very
short test program [1] (Run script in [2]) which creates a custom
Ram_session_component and tries to transfer quota from its environment
to the custom Ram session component. The quota transfer fails with error
code -1, which means the recipient of the transfer is not a valid
session component [3].
I think the problem is that I have not created a valid capability for
the custom Ram_session_component. What important fact am I missing in my
implementation? I am grateful for every hint you give me :)
Core's RAM service does not know the meaning behind the 'ram_special'
capability because this capability does refer to an RPC object living
inside your component, not inside core. Core can transfer quota only
between RAM sessions known to core. It tries to look up a RAM session
for the supplied capability argument but cannot find one. In contrast,
if you change the transfer-quota line to

log("t: ", env.ram().transfer_quota(ram_impl._parent_ram, 4096));

the operation succeeds because 'ram_impl._parent_ram' refers to a RAM
session provided by core.
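
As a rough illustration of why the first call fails and the second succeeds, here is a toy model in plain C++. It is not core's implementation: Cap, Toy_ram_service, and the -2 error value are assumptions made up for this sketch. Only the -1 result for an unknown recipient mirrors the behaviour discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// A capability is modelled as a plain integer key (assumption).
using Cap = int;

// Toy RAM service that can only move quota between sessions it provides.
struct Toy_ram_service
{
    std::map<Cap, std::size_t> quota;   // sessions known to this service

    // Returns 0 on success, -1 if the recipient is unknown to this
    // service, -2 (hypothetical) if the donor is unknown or too poor.
    int transfer_quota(Cap from, Cap to, std::size_t amount)
    {
        auto src = quota.find(from);
        auto dst = quota.find(to);
        if (dst == quota.end()) return -1;
        if (src == quota.end() || src->second < amount) return -2;
        src->second -= amount;
        dst->second += amount;
        return 0;
    }
};
```

In this model, a capability created by some other component is simply a key that is absent from the map, so the transfer is rejected before any quota moves.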

If you want the child subsystem to deal with multiple RAM sessions (such
as 'env.ram()' and a manually created RAM connection), the "non-root"
approach can no longer be applied. In order to allow the RAM sessions to
refer to each other, they need to be provided by the same RAM service.
I.e., when creating the child, you will have to equip it with a RAM
session of your RAM service as done, for example, by noux. The
'_resources.ram' of the child [1] is a locally-provided RAM-session object.

Btw, you will encounter a very similar problem with the PD-session
argument for the 'Cpu_session::create_thread' operation. When forwarding
this RPC call to core, you will need to replace the capability argument
supplied by the client (which refers to your virtualized PD session) by
the "real" PD capability as known by core.
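
The capability substitution could look roughly like the following toy model. It uses no Genode API: Toy_core_cpu, Intercepting_cpu, and the integer Cap type are hypothetical, and the real 'Cpu_session::create_thread' takes more arguments than shown here.

```cpp
#include <cassert>
#include <map>
#include <set>

// A capability is modelled as a plain integer key (assumption).
using Cap = int;

// Toy core CPU service: accepts only PD capabilities it knows.
struct Toy_core_cpu
{
    std::set<Cap> known_pds;
    bool create_thread(Cap pd) { return known_pds.count(pd) == 1; }
};

// Intercepting CPU session: translates the client's capability for the
// virtualized PD session into the real one before forwarding.
struct Intercepting_cpu
{
    Toy_core_cpu      &core;
    std::map<Cap, Cap> real_pd;   // virtual PD cap -> core's PD cap

    explicit Intercepting_cpu(Toy_core_cpu &c) : core(c) { }

    bool create_thread(Cap virtual_pd)
    {
        auto it = real_pd.find(virtual_pd);
        if (it == real_pd.end()) return false;   // not one of our PDs
        return core.create_thread(it->second);   // forward the real cap
    }
};
```

Forwarding the client's capability unchanged would fail for the same reason as the quota transfer: core has no session registered under that key.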

Could this explanation answer your question?

[1]
https://github.com/genodelabs/genode/blob/master/repos/ports/src/noux/child.h#L164

Best regards
Norman
--
Dr.-Ing. Norman Feske
Genode Labs

http://www.genode-labs.com · http://genode.org

Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden
Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth

------------------------------------------------------------------------------
Denis Huber
2016-08-18 08:28:34 UTC
Hello Norman,

thank you for your well-written answer :)
Post by Norman Feske
Core's RAM service does not know the meaning behind the 'ram_special'
capability because this capability does refer to an RPC object living
inside your component, not inside core. Core can transfer quota only
between RAM sessions known to core. It tries to look up a RAM session
for the supplied capability argument but cannot find one. In contrast,
if you change the transfer-quota line to
log("t: ", env.ram().transfer_quota(ram_impl._parent_ram, 4096));
the operation succeeds because 'ram_impl._parent_ram' refers to a RAM
session provided by core.
This means that Rpc_objects can communicate if they are managed by the
same Entrypoint, but not if they are managed by different Entrypoints.
Just out of curiosity: Can I delegate a capability from one Entrypoint to
another?
Post by Norman Feske
If you want the child subsystem to deal with multiple RAM sessions (such
as 'env.ram()' and a manually created RAM connection), the "non-root"
approach can no longer be applied. In order to allow the RAM sessions to
refer to each other, they need to be provided by the same RAM service.
I.e., when creating the child, you will have to equip it with a RAM
session of your RAM service as done, for example, by noux. The
'_resources.ram' of the child [1] is a locally-provided RAM-session object.
Btw, you will encounter a very similar problem with the PD-session
argument for the 'Cpu_session::create_thread' operation. When forwarding
this RPC call to core, you will need to replace the capability argument
supplied by the client (which refers to your virtualized PD session) by
the "real" PD capability as known by core.
Could this explanation answer your question?
Thanks, this explanation helped me to solve the initial problem and I
could also implement the PD and CPU Rpc_objects (without root objects),
which intercept the child's session methods.

Now my problem is the following: If I apply your solution and start a
child with a custom RAM Rpc_object, then the program hangs during child
creation. I have created a simple program [1] (child component
[2], run script [3]) to demonstrate the error. The output is

[init -> random] All is fine until now!
Error: Test execution timed out
Makefile:261: recipe for target 'run/random' failed
make: *** [run/random] Error 254

This demonstrates that all resource-object creations before the child
object creation passed without errors. But the program hangs during the
creation of the child. What could be the error?


Best regards
Denis

[1]
https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/5676b3c5732e01f2dc02fb81082f5c38bd23f86b/src/random/main.cc

[2]
https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/5676b3c5732e01f2dc02fb81082f5c38bd23f86b/src/sheep_counter/main.cc

[3]
https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/5676b3c5732e01f2dc02fb81082f5c38bd23f86b/run/random.run


------------------------------------------------------------------------------
Denis Huber
2016-08-18 10:19:59 UTC
I have found the error and corrected the source code [1].

[1]
https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/4c02e6c865518cbae0c1648a346cabd0385fe175/src/random/main.cc

The error was that I used the same Entrypoint both for managing my custom
Rpc_object<Ram_session> and for passing it to the child constructor.

I think the child constructor created an Rpc_object<Parent>, which is
managed by the child's Entrypoint. Then the Rpc_object<Parent> called an
RPC method of my Rpc_object<Ram_session>, which is also managed by
this Entrypoint. Thus, the Entrypoint blocked on the remote procedure
call to Rpc_object<Ram_session>, waiting for itself to finish the call.
The Rpc_object<Ram_session> needs this Entrypoint to serve the call, but
the Entrypoint was blocked. Thus, the call went on waiting for the
Entrypoint.

Now Rpc_object<Parent> is waiting for Rpc_object<Ram_session> and vice
versa => deadlock.
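
The situation can be mimicked with a small toy model in plain C++. It is not Genode's Entrypoint: Toy_entrypoint and create_child are hypothetical, and instead of really blocking, the model returns false at the point where a synchronous RPC to a busy single-threaded entrypoint would never return.

```cpp
#include <cassert>
#include <functional>

// Toy single-threaded entrypoint that serves one request at a time.
struct Toy_entrypoint
{
    bool busy = false;

    // Returns false where a real synchronous RPC would block forever,
    // namely when the entrypoint is already busy serving a request.
    bool call(std::function<bool()> const &handler)
    {
        if (busy) return false;
        busy = true;
        bool const ok = handler();
        busy = false;
        return ok;
    }
};

// The Parent handler runs in 'parent_ep' and issues an RPC to the RAM
// object, which is served by 'ram_ep'.
inline bool create_child(Toy_entrypoint &parent_ep, Toy_entrypoint &ram_ep)
{
    return parent_ep.call([&] {
        return ram_ep.call([] { return true; });
    });
}
```

Calling create_child with the same entrypoint for both roles fails in this model, whereas two distinct entrypoints succeed.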

Please correct me, if I am wrong :)


Best regards
Denis
------------------------------------------------------------------------------
Norman Feske
2016-08-19 14:49:03 UTC
Hi Denis,
Post by Denis Huber
This means, the Rpc_objects can communicate, if they are managed by the
same Entrypoint, but not if they are in different Entrypoints. Just out
of curiosity: Can I delegate a capability from one Entrypoint to another?
I am not sure what you mean by "communicate" exactly here. The important
point is that core's RAM session tried to look up another RAM session by
using the capability it received (as argument to the transfer-quota
call) as key. The operation to look up an RPC object by a given
capability is provided by the entrypoint (via the 'apply' method).
However, the entrypoint knows only those RPC objects that it manages. So
when presented with a capability that refers to an RPC object managed by
some other entrypoint (in your case, the entrypoint living in your
"random" program), it finds no matching RPC object among its managed
objects.
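
A minimal sketch of such a lookup, written as a toy model in plain C++: Lookup_entrypoint and Toy_rpc_object are hypothetical and only imitate the idea of an 'apply' method that hands the functor either the matching object or a null pointer. The actual Genode signature may differ.

```cpp
#include <cassert>
#include <map>
#include <string>

// A capability is modelled as a plain integer key (assumption).
using Cap = int;

struct Toy_rpc_object { std::string name; };

// Toy entrypoint that knows only the objects it manages itself.
struct Lookup_entrypoint
{
    std::map<Cap, Toy_rpc_object *> managed;

    void manage(Cap cap, Toy_rpc_object &obj) { managed[cap] = &obj; }

    // Calls 'fn' with the managed object, or with nullptr if the
    // capability refers to an object of some other entrypoint.
    template <typename FN>
    int apply(Cap cap, FN const &fn)
    {
        auto it = managed.find(cap);
        return fn(it == managed.end() ? nullptr : it->second);
    }
};
```

With one entrypoint standing for core and another for the "random" program, applying the capability of the custom RAM object to core's entrypoint yields the null case, which a caller can map to an error code such as -1.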
Post by Denis Huber
I think, the child constructor created an Rpc_object<Parent>, which
is managed by the child's Entrypoint. Then the Rpc_object<Parent>
called an RPC method from my Rpc_object<Ram_session>, which also is
managed by this Entrypoint. Thus, the Entrypoint blocked on the
remote procedure call to Rpc_object<Ram_session> waiting for itself
to finish the call from Rpc_object<Ram_session>. The
Rpc_object<Ram_session> needs to use this Entrypoint, but it was
blocked. Thus, it went waiting for the Entrypoint.
Exactly!
Post by Denis Huber
Now Rpc_object<Parent> is waiting for Rpc_object<Ram_session> and
vice versa => deadlock.
Please correct me, if I am wrong :)
Your analysis is flawless. :-)

For this exact reason, the Genode::Child constructor takes the RAM
session (to be used locally to initialize the child) and RAM-session
capability as two distinct arguments instead of internally creating a
Ram_session_client object with the capability.
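
The design choice can be pictured with the following toy model. It does not reproduce the real Genode::Child constructor, whose argument list is longer. Toy_ram, Toy_child, and the 4096-byte allocation are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// A capability is modelled as a plain integer key (assumption).
using Cap = int;

// Toy RAM session object living in the parent's address space.
struct Toy_ram
{
    std::size_t used = 0;
    void alloc(std::size_t n) { used += n; }
};

struct Toy_child
{
    Cap const ram_cap;   // handed to the child, never invoked locally

    // 'local_ram' is used through an ordinary C++ call, so no entrypoint
    // has to dispatch anything while the child is being constructed.
    Toy_child(Toy_ram &local_ram, Cap cap) : ram_cap(cap)
    {
        local_ram.alloc(4096);   // e.g., memory for the child's startup
    }
};
```

Because the constructor works on the reference, the RAM object may be served by any entrypoint, including the one that is busy creating the child.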

Cheers
Norman
------------------------------------------------------------------------------