Discussion: Genode in distributed scenarios and ROM proxy implementation
Johannes Schlatow
2016-03-20 00:30:38 UTC
Hi,

As I'm beginning to use Genode in a distributed scenario, I have had some thoughts about the use of proxy components to transmit particular session interfaces over an arbitrary communication medium. The general concept is that a client component can connect to a server component running on a different Genode system by using a proxy component on each side.

For starters, I implemented this for the ROM session. I call this implementation "remote ROM": The remote ROM server is instantiated on the server side and connects to a local ROM service. It relays the content of a particular ROM module to the remote ROM client, which is instantiated on another Genode system. The remote ROM client receives the updated ROM content from the remote ROM server and provides a local ROM service.

In order to generalise the implementation, I separated the network-specific part into a backend library so that the backend can be tailored to the particular communication medium, protocol, and policy. I think this renders the implementation highly flexible. As a proof of concept, I added a backend implementation (nic_ip) that uses a Nic service to transmit network packets with a static IPv4 header.
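To give a rough idea, a backend boils down to an interface along the following lines (a hypothetical sketch; the names are illustrative and not taken from the actual remoterom sources):

  #include <cstddef>

  namespace Remote_rom {

    /* sketch of a pluggable transport backend */
    struct Backend
    {
      virtual ~Backend() { }

      /* callback interface invoked on packet arrival from the peer */
      struct Receiver
      {
        virtual void receive(void const *data, std::size_t size) = 0;
      };

      /* register the component that consumes incoming data */
      virtual void receiver(Receiver &r) = 0;

      /* transmit (a fragment of) a ROM-module update to the peer */
      virtual void send(void const *data, std::size_t size) = 0;
    };
  }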

In my opinion, the remote ROM implementation already enables quite a lot of distributed scenarios, as it allows full publisher-subscriber communication of arbitrary data without adding too many interdependencies between the systems.

The implementation can be found here:
https://github.com/ValiValpas/genode-world/tree/remoterom

*Please feel free to adapt/contribute/improve this concept and its implementation.*
Menno Valkema
2016-03-21 11:52:22 UTC
Hi Johannes,

This is great. Currently we're trying to accomplish a similar thing for
RPC calls and signals. Your approach of statically configured proxies
also made the most sense to us: it is what we did for RPC calls, and
we're now trying to use it for signals. Combined with your work, it
seems we have together made the first steps towards distributed use of
the fundamental IPC mechanisms provided by Genode (shared memory, RPC,
and signals).

For RPC calls we took a similar approach using a client_proxy and a
server_proxy. Say we have a session called abc_session; this is how the
components collaborate:
1) The abc RPC client connects to the client_proxy, which behaves like
an abc_session session component.
2) The client_proxy marshals the calls into command structures with
opcodes and arguments (see the sketch below) and forwards these over a
Nic connection.
3) The server_proxy receives the commands over a Nic connection,
unmarshals them, and forwards the requests to the server while behaving
as an abc_session client.
4) The server handles the request and returns.
5) The return value is forwarded to the client in the same way the
commands were transmitted.
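For illustration, such a command structure could look roughly like this (a hypothetical sketch; the opcodes and field names are made up):

  #include <cstdint>

  /* hypothetical opcodes, one per RPC function of abc_session */
  enum class Opcode : std::uint16_t { CALL_FOO = 1, CALL_BAR = 2 };

  /* fixed-layout command header as it travels over the Nic session */
  struct Command
  {
    Opcode        opcode;       /* which RPC function is invoked   */
    std::uint16_t num_args;     /* number of arguments that follow */
    std::uint32_t payload_size; /* size of trailing argument data  */
    /* argument data follows the header within the packet */
  } __attribute__((packed));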

While we both have some form of serialized commands (in your case
handled by the IP backend, with an enum identifying the action being
sent over), we chose, for now, not to specify any transportation
mechanism beyond a bare Nic session, and to send raw data over Nic
sessions. In case we need an IP layer, we plan to move it into a
separate component that knows how to handle IP, IPsec, or some other
protocol. However, I like your implementation's idea of using various
backends that handle different transportation types.

Some challenges we're still looking into:
- Manual marshalling. Right now we marshal the RPC calls manually.
Genode already has a great marshalling mechanism in place; however, I
didn't manage to re-use it, so for now I do it by hand. This seems like
a bit of a waste, so at a later stage I hope to look into this again.
- Cross-CPU-architecture usage of arguments. How do we handle integer
arguments, or structs with integer arguments, between big- and
little-endian architectures, or between systems with different notions
of word length and struct alignment? (One common mitigation is sketched
after this list.)
- We're still working on signals.
- Routing. Presently we statically configure how the client_proxy and
the server_proxy are connected. It would be better if we had more
flexibility here, like what is now provided by init.
- Capabilities. It would be great to have some sort of distributed
capability system for stricter control over who talks to whom.
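Regarding the endianness part, one common mitigation is to define the wire format with fixed-width integers in one explicit byte order, so both peers agree regardless of their native endianness (an illustrative sketch):

  #include <cstdint>

  /* serialize a 32-bit value in big-endian byte order */
  inline void put_u32(std::uint8_t *dst, std::uint32_t v)
  {
    dst[0] = std::uint8_t(v >> 24);
    dst[1] = std::uint8_t(v >> 16);
    dst[2] = std::uint8_t(v >>  8);
    dst[3] = std::uint8_t(v);
  }

  /* deserialize a 32-bit big-endian value */
  inline std::uint32_t get_u32(std::uint8_t const *src)
  {
    return (std::uint32_t(src[0]) << 24) | (std::uint32_t(src[1]) << 16)
         | (std::uint32_t(src[2]) <<  8) |  std::uint32_t(src[3]);
  }

Word-length and alignment differences can be handled the same way: the wire format uses only fixed-width types at explicitly defined offsets, never in-memory struct layouts.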

Regards,
Menno

PS. I went through your source code to see how you implemented things,
and it's quite clear what's going on. However, when I tried to run the
server by typing 'make run/test-remoterom_backend_nic_ip_server', some
methods being called seemed to be missing (calculate_checksum, for
example), and the compiler exited with an error. It seems some of the
work is not in the GitHub repository yet?
--
Cyber Security Labs B.V. | Gooimeer 6-31 | 1411 DD Naarden | The
Netherlands | https://nlcsl.com/ | +31 35 631 3253 (office)
Johannes Schlatow
2016-03-21 16:01:22 UTC
Hi Menno,

On Mon, 21 Mar 2016 12:52:22 +0100
Post by Menno Valkema
- Manual marshalling. Right now we marshal the RPC calls manually.
Genode already has a great marshalling mechanism in place; however, I
didn't manage to re-use it, so for now I do it by hand. This seems like
a bit of a waste, so at a later stage I hope to look into this again.
I also started on a generic proxy implementation for RPC some time ago and made some progress on the marshalling. However, I dismissed the idea, as RPC over a network didn't strike me as a very good idea, especially when distributed capabilities are required.
Nevertheless, I might have some code buried in my repositories. Let me know if you're interested.
Post by Menno Valkema
PS. I went through your source code to see how you implemented things,
and it's quite clear what's going on. However, when I tried to run the
server by typing 'make run/test-remoterom_backend_nic_ip_server', some
methods being called seemed to be missing (calculate_checksum, for
example), and the compiler exited with an error. It seems some of the
work is not in the GitHub repository yet?
Oh, sorry. I forgot to mention that this requires some modifications to the Ipv4_packet class, see:
https://github.com/genodelabs/genode/pull/1915
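For reference, the checksum in question is the textbook one's-complement IPv4 header checksum (RFC 1071). A minimal sketch of what such a helper computes (illustrative only, not the code from the pull request):

  #include <cstdint>
  #include <cstddef>

  /* sum all 16-bit words of the header (checksum field zeroed),
     fold the carries, and return the one's complement */
  std::uint16_t ipv4_header_checksum(std::uint16_t const *hdr,
                                     std::size_t num_words)
  {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < num_words; i++)
      sum += hdr[i];

    while (sum >> 16)
      sum = (sum & 0xffff) + (sum >> 16);

    return std::uint16_t(~sum);
  }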
Menno Valkema
2016-03-23 08:58:21 UTC
Hi Johannes,

If you have the sources for the marshalling conveniently lying around,
please share; I'd like to have a look at them.

Thanks, Menno
Post by Johannes Schlatow
Nevertheless, I might have some code buried in my repositories. Let me
know if you're interested.
Norman Feske
2016-03-23 08:42:33 UTC
Hi Menno,

thank you for chiming in. Great that we can get a joint discussion going!
Post by Menno Valkema
This is great. Currently we're trying to accomplish a similar thing for
RPC calls and signals. Your approach of statically configured proxies
also made the most sense to us: it is what we did for RPC calls, and
we're now trying to use it for signals. Combined with your work, it
seems we have together made the first steps towards distributed use of
the fundamental IPC mechanisms provided by Genode (shared memory, RPC,
and signals).
Intuitively, I see the appeal of proxying those low-level mechanisms:
once these three mechanisms are covered, any session interface could be
reused in a distributed fashion. Practically, however, I doubt that this
approach is the best way to go, because it discards a substantial
benefit that Genode gives us: the knowledge about the semantics of the
information to transmit (1). At the same time, it will ultimately become
complex (2). Let me illustrate both problems separately.

1) Exploiting our knowledge about the transferred information

For the sake of argumentation, let us take a look at the framebuffer
session interface. It basically consists of an RPC call to request a
shared-memory buffer (as a dataspace) and an RPC call for reporting
rectangular areas within the framebuffer to be updated. The actual
pixels are transferred via the shared-memory buffer. Signals are used to
let the client synchronize its output to the refresh rate of the
framebuffer. With the approach of merely wrapping the three low-level
communication mechanisms into network packets, the interface would rely
on RPC via TCP, a distributed-shared-memory technique, and signaling,
possibly also via a TCP connection.
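For orientation, the session described above has roughly the following shape (a simplified sketch with placeholder capability types, not a verbatim copy of Genode's actual header):

  /* placeholder capability types standing in for Genode's real ones */
  struct Dataspace_capability { };
  struct Signal_context_capability { };

  struct Framebuffer_session
  {
    virtual ~Framebuffer_session() { }

    /* RPC: hand out the shared-memory pixel buffer as a dataspace */
    virtual Dataspace_capability dataspace() = 0;

    /* RPC: report a rectangular area of the buffer as updated */
    virtual void refresh(int x, int y, int w, int h) = 0;

    /* register a signal handler for display-synchronization events */
    virtual void sync_sigh(Signal_context_capability sigh) = 0;
  };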

This is not how a network protocol for remote desktops is usually
designed. Instead, such protocols leverage knowledge about the specific
domain, in this case the transfer of pixels. Exploiting these known
semantics, they can compress the pixels via pixel-oriented compression
algorithms, drop intermediate frames when detecting low network
bandwidth, and possibly decide not to proxy the sync signals over the
network at all because the network latency would render them useless
anyway. Instead, we would produce artificial periodic sync events at the
local proxy.

Similar observations can be made for the other session interfaces as
well. E.g., it is perfectly fine to drop network packets of a NIC
session, but it would be wrong to do that for block requests. When
looking at the file-system session, the issue becomes even more
apparent. The file-system session uses shared memory to carry payload
between client and server. Using the low-level proxy approach, we would
again employ a distributed-shared-memory mechanism instead of
straightforward packet-based communication. This is just wrong.

In short: Genode's session interfaces are painfully bad as network
protocols because they are not designed as network protocols.

Hence, instead of proxying the low-level mechanisms, I warmly recommend
taking a step back and solving the proxying problem for the different
session interfaces individually. Granted, this approach does not "scale"
with a growing number of session interfaces. On the other hand, we don't
actually have a scalability problem. The number of session interfaces
has remained almost constant over the past years, so it is unlikely that
we will suddenly see an influx of new ones. In practice, we are talking
about approx. 10 session interfaces (Terminal, ROM, Report, Framebuffer,
Input, Nic, Block, File-system, LOG) to cover. Consequently, the
presumed generality of the low-level proxy approach does not solve an
actual scalability problem.

As another benefit of solving the proxying in a session-specific way, we
can reuse existing protocols in a straightforward manner, e.g., just
using VNC for proxying the framebuffer session. This would, as a side
effect, greatly improve the interoperability of distributed Genode
systems with existing infrastructures.

2) Complexity

As you mentioned, Genode relies on three low-level communication
mechanisms (synchronous RPC, signals, and shared memory). Genode's
session interfaces rely on certain characteristics of those mechanisms:

* Low latency of RPC. Usually, the synchronous nature of RPC calls is
  leveraged by the underlying kernel to make scheduling decisions. E.g.,
  NOVA and base-hw transfer the scheduling context between the client
  and the server to attain low latency. On the network, this assumption
  no longer holds.

* The order of memory accesses to shared memory must be preserved. E.g.,
  in packet-stream-based session interfaces, packet descriptors are
  enqueued into the request queue _after_ the payload has been written
  into the bulk buffer (see the sketch after this list). If this order
  is not preserved (as with most attempts at distributed shared memory),
  the server could observe a request before the associated bulk-buffer
  data is current.

* Signals must not be dropped.
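To illustrate the ordering requirement, here is a toy single-producer sketch using plain C++ atomics (the names and the queue layout are made up, standing in for the real packet-stream implementation):

  #include <atomic>
  #include <cstring>

  struct Descriptor { unsigned long offset; unsigned long size; };

  char                  bulk_buffer[1 << 20]; /* shared payload buffer */
  Descriptor            queue[64];            /* shared request queue  */
  std::atomic<unsigned> tail { 0 };           /* producer index        */

  void submit(void const *payload, unsigned long size, unsigned long offset)
  {
    /* 1. write the payload into the shared bulk buffer */
    std::memcpy(bulk_buffer + offset, payload, size);

    /* 2. fill in the descriptor */
    unsigned const t = tail.load(std::memory_order_relaxed);
    queue[t % 64] = Descriptor { offset, size };

    /* 3. publish: the release store guarantees that steps 1 and 2
          are visible to the consumer before the new tail index is */
    tail.store(t + 1, std::memory_order_release);
  }

On a single machine, release/acquire semantics over shared memory make this cheap. A distributed-shared-memory layer would have to provide the same guarantee across the network, which is where the complexity explodes.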

When attempting to proxy the low-level mechanisms, those characteristics
must be preserved. This is extremely complicated, especially for shared
memory. The resulting sophistication would ultimately defeat Genode's
biggest advantage, which is its low complexity.
Post by Menno Valkema
While we both have some form of serialized commands (in your case
handled by the IP backend, with an enum identifying the action being
sent over), we chose, for now, not to specify any transportation
mechanism beyond a bare Nic session, and to send raw data over Nic
sessions. In case we need an IP layer, we plan to move it into a
separate component that knows how to handle IP, IPsec, or some other
protocol.
That sounds very cool!
Post by Menno Valkema
- Manual marshalling. Right now we marshal the RPC calls manually.
Genode already has a great marshalling mechanism in place; however, I
didn't manage to re-use it, so for now I do it by hand. This seems like
a bit of a waste, so at a later stage I hope to look into this again.
It is tempting. But as I discussed above, I think that the attempt to
leverage Genode's RPC at the transport level for the distributed
scenario is futile.
Post by Menno Valkema
- Cross-CPU-architecture usage of arguments. How do we handle integer
arguments, or structs with integer arguments, between big- and
little-endian architectures, or between systems with different notions
of word length and struct alignment?
- We're still working on signals.
- Routing. Presently we statically configure how the client_proxy and
the server_proxy are connected. It would be better if we had more
flexibility here, like what is now provided by init.
- Capabilities. It would be great to have some sort of distributed
capability system for stricter control over who talks to whom.
The endianness issues, the signals, and the capabilities are further
indicators that this may be the wrong approach.

Your remark about routing and dynamics points, of course, to a
limitation of the static proxies (which are actually modeled after
device drivers, which also operate on static resources: the devices). I
propose not to take on this problem before we have developed a good
understanding of our requirements. E.g., I don't foresee that we will
distribute a Genode system arbitrarily. Instead, we will intuitively
select our "cut points" in ways that minimize remote communication. So
maybe the flexibility of init's routing won't be needed, or we will
discover that we need some other kind of flexibility.

Best regards
Norman
--
Dr.-Ing. Norman Feske
Genode Labs

http://www.genode-labs.com · http://genode.org

Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden
Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth