Hydro_intro



Introduction to Hydro


Hydro is an implementation of the ICE protocol for Objective Caml. Hydro consists of a runtime that manages connections, implements the protocol details, and defines the basic object architecture. Furthermore, Hydro includes a compiler, hydrogen, that reads the ICE interface definition language "Slice" and outputs O'Caml code for the language mapping.

ICE was invented by the company ZeroC, which offers bindings for C++, C#, Java, Python, Ruby, and PHP. There are also product extensions like IceGrid. While Hydro is compatible with ICE on the Slice and binary protocol level, its implementors have nothing to do with ZeroC, and have gone their own way in designing the runtime and the language mapping. In fact, Hydro is a clean-room implementation based on the ICE User Manual.

We cannot explain the ICE architecture here. You can find an excellent description in the ICE User Manual.

The hydrogen compiler

hydrogen generates the language mapping code for a Slice file. The usage is quite simple:

 hydrogen file.ice 

This outputs an O'Caml module interface as file.mli and an implementation as file.ml. You may use the C preprocessor in the Slice file, so you can refer to other files. However, Hydro does not support separate compilation of Slice modules. For example, if file.ice contains Slice modules like

   module M1 {
     ...
   };

   module M2 {
     ...
   };

the generated mapping code is not structured by O'Caml modules M1 and M2, and you cannot O'Caml-compile M1 and M2 separately and link them later. hydrogen always produces a single O'Caml output module that is named after the single Slice input file. The Slice modules M1 and M2 only appear as part of the generated identifiers. (The reason for this is that Slice modules have C++ namespace semantics, which means it is possible to reopen them at any time and to add further definitions to them. It is also allowed to use forward declarations across Slice modules. Neither is supported by O'Caml modules.)
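As an illustration of how Slice module names can end up inside flat identifiers, the following sketch turns a scoped Slice name into the kind of name seen later in this text (for example Sample_F for ::Sample::F). The function flatten_scoped_name is hypothetical and is not part of hydrogen; the real compiler may treat special cases differently.

```ocaml
(* Hypothetical helper, not hydrogen's actual code: flatten a scoped
   Slice name such as "::Sample::F" into "Sample_F". *)
let flatten_scoped_name (id : string) : string =
  String.split_on_char ':' id           (* "::Sample::F" -> ["";"";"Sample";"";"F"] *)
  |> List.filter (fun part -> part <> "")  (* drop the empty pieces between colons *)
  |> String.concat "_"

let sample_f = flatten_scoped_name "::Sample::F"
let ice_object = flatten_scoped_name "::Ice::Object"
```

Prefixes such as pr_ or po_ would then be put in front of the flattened name, as in pr_Sample_F.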

The Hydro runtime

Programs can use the runtime by including -package hydro in the ocamlfind invocation.

The following modules are important for using Hydro:

The Hydro runtime is written in an asynchronous style. This means that the runtime can handle input and output in parallel for any number of connections. It is nevertheless single-threaded, so for many uses it is not necessary to deal with the complexity of multi-threading. For example, you can invoke methods of several remote objects in parallel by first submitting all method calls and then waiting until all responses have arrived. Technically, this is achieved by using the Equeue infrastructure of Ocamlnet: The user provides a Unixqueue.event_system to Hydro, and this event system coordinates all activities. Of course, the user can also decide to do only synchronous method calls (one call after the other). In order to get a Unixqueue.event_system, simply call

 let esys = Unixqueue.create_unix_event_system() 

once in your program and use this single esys for all your Hydro invocations. (Ocamlnet users know this.)
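The "submit everything first, then wait" style can be pictured with a toy model. The sketch below is not the Equeue or Unixqueue API and performs no I/O; the record type event_system and the functions submit and run are invented for illustration only. It merely shows that, in a single thread, submitted work is queued and nothing happens until the event loop runs.

```ocaml
(* Toy model only: a single-threaded queue of pending jobs. All names
   here are hypothetical and unrelated to the real Unixqueue module. *)
type event_system = { pending : (unit -> unit) Queue.t }

let create_event_system () = { pending = Queue.create () }

(* Register a call together with the callback that receives its response. *)
let submit esys ~call ~on_response =
  Queue.add (fun () -> on_response (call ())) esys.pending

(* Process jobs until none are left. *)
let run esys =
  while not (Queue.is_empty esys.pending) do
    (Queue.pop esys.pending) ()
  done

let responses : int32 list ref = ref []
let esys = create_event_system ()

(* Submit two calls; nothing is executed yet. *)
let () =
  submit esys
    ~call:(fun () -> Int32.add 42l 16l)
    ~on_response:(fun r -> responses := r :: !responses);
  submit esys
    ~call:(fun () -> Int32.sub 42l 16l)
    ~on_response:(fun r -> responses := r :: !responses)

let pending_before_run = Queue.length esys.pending

(* Wait for all responses by running the loop once. *)
let () = run esys
let results = List.rev !responses
```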

The runtime needs an in-memory representation of the Slice definition. This representation is called system, and it can also be enriched by custom object constructors. Of course, hydrogen makes it simple to create a system, just do

 let sys = Hydro_lm.create_system() in
   M.fill_system sys;
   (* ...further sys modifications if necessary... *)
 

where M is the name of the hydrogen-generated O'Caml module.

Getting started on the client side

Let's assume we have a simple Slice interface m.ice:

 
   module Sample {
     interface F {
       int add(int x, int y);
     };
   };

Furthermore, we assume that there is a remote object called Adder on host "worker", port 4711, and Adder implements this interface. How do we call add synchronously?

Technically, we need a local proxy for Adder that supports the invocation of add. Such a proxy can only live in a proxy environment, so we first have to create one:

  let proxy_env =
    let esys = Unixqueue.create_unix_event_system () in
    let sys = Hydro_lm.create_system() in
    M.fill_system sys;
    let cp = Hydro_params.client_params() in
    let pool = Hydro_proxy.pool() in
    let resolver = Hydro_proxy.proxy_resolver cp in
    let conf = Hydro_proxy.proxy_conf() in
    ( object
        method event_system = esys
        method system = sys
        method proxy_resolver = resolver
        method client_pool = pool
        method default_proxy_conf = conf
      end : Hydro_proxy.proxy_env_t
    )

As you see, the environment is just a container for all relevant session variables. The pool manages the active connections to ICE servers. The resolver is a facility that finds a connection for a given reference to a remote object. (Note that you can use Hydro_locator.proxy_resolver instead, which has enhanced capabilities.)

Then, we have to describe the server socket where the remote object lives:

  let endpoints =
    `Endpoints [| `TCP ( object
                           method host = "worker"
                           method port = 4711
                           method timeout = 60l
                           method compress = false
                         end : Hydro_types.tcp_endpoint
                       )
               |]
 

(Note: Currently, compress=false is required because Hydro does not support compression yet.)

Finally, we create the full address of the remote object:

  let addr =
    ( object
        method id = ( object method name = "Adder"
                             method category = ""
                      end )
        method facet = None
        method secure = false
        method mode = `Twoway
        method parameters = endpoints
      end : Hydro_types.proxy_addr
    )

(Note: Currently, secure=false and mode=`Twoway are the only supported options for addressing the object.)

Now the language mapping comes into play. After doing

 hydrogen m.ice 

you get an O'Caml interface m.mli with 70 lines, and the corresponding implementation m.ml has as many as 216 lines. Of course, a lot of this is generated just in case it is needed. For instance, code for exception handling is emitted although no exceptions are defined in m.ice.

The relevant part of m.mli is:

  type pr_Sample_F = [ `Ice_Object | `Sample_F  ] Hydro_lm.proxy_reference

  class type r_Sample_F_add =
    object
      method hydro_response : Hydro_lm.client_response
      method result : int32
    end

  class type poi_Sample_F =
    object
      inherit poi_Ice_Object
      method add : int32 ->
                   int32 -> r_Sample_F_add Hydro_lm.call_suspension_t
    end

  class type po_Sample_F =
    object
      inherit Hydro_proxy.proxy_t
      inherit poi_Sample_F
      method hydro_proxy_reference : pr_Sample_F
    end

  val pc_Sample_F : Hydro_proxy.proxy_env_t -> pr_Sample_F -> po_Sample_F

  val unchecked_pr_Sample_F : 't Hydro_lm.proxy_reference -> pr_Sample_F

What does this mean? The address addr is an untyped name of the proxy. The generated module defines pr_Sample_F, which is basically also an address, but it has an attached type parameter [`Ice_Object|`Sample_F] (note that this is a polymorphic variant used as phantom parameter). Effectively, the values `Ice_Object and `Sample_F are the names of the supported interfaces (by definition, every interface is a descendant of the predefined ::Ice::Object, the root of the inheritance hierarchy). The type [`Ice_Object|`Sample_F] enumerates all interfaces the proxy supports (there are several because of interface inheritance). This becomes clearer when we define a second interface G (inside module Sample) as

  interface G extends F {
    int sub(int x, int y);
  };

For G we would get in m.mli:

  type pr_Sample_G =
    [ `Ice_Object | `Sample_F | `Sample_G ] Hydro_lm.proxy_reference

Here, the names of three interfaces appear since every proxy for G is implicitly a proxy for F, and also a proxy for ::Ice::Object. This corresponds to the fact that pr_Sample_G is a subtype of pr_Sample_F, which is a subtype of pr_Ice_Object (see its definition in the runtime module Hydro_lm_IceObject).
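The subtyping relation between such references can be reproduced in a few lines of plain O'Caml. The sketch below is a toy model, not the definition used in Hydro_lm: the module Ref and its functions are hypothetical, and it assumes that the phantom parameter is declared contravariant, which is one way to make a reference with more interface names a subtype of one with fewer.

```ocaml
(* Toy model of typed references. The phantom parameter 'a lists the
   supported interfaces. Declaring it contravariant (-'a) is an
   assumption of this sketch, not a statement about Hydro_lm. *)
module Ref : sig
  type -'a t
  val of_name : string -> [ `Ice_Object ] t
  val unchecked_g : 'a t -> [ `Ice_Object | `Sample_F | `Sample_G ] t
  val name : 'a t -> string
end = struct
  type 'a t = string          (* the parameter does not occur in the data *)
  let of_name n = n
  let unchecked_g r = r
  let name r = r
end

type pr_f = [ `Ice_Object | `Sample_F ] Ref.t
type pr_g = [ `Ice_Object | `Sample_F | `Sample_G ] Ref.t

(* A function that insists on a reference supporting interface F. *)
let call_add (r : pr_f) : string = "add on " ^ Ref.name r

let g : pr_g = Ref.unchecked_g (Ref.of_name "Adder")

(* A G reference can be coerced to an F reference at no run-time cost. *)
let g_as_f = (g : pr_g :> pr_f)
let result = call_add g_as_f
```

Passing the plain result of Ref.of_name to call_add is rejected by the type checker in this model, which is the point of the phantom parameter.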

Now, how can we get a value of pr_Sample_F? We do first

  let pr = Hydro_lm.pr_of_address addr

but this creates only an [ `Ice_Object ] proxy_reference, i.e. the parameter is wrong, or rather too unspecific. We have to downcast this typed proxy reference. In general, ICE defines two ways of downcasting proxies: as an unchecked cast or as a checked cast. In a checked cast, the server is asked whether the remote object is really an instance of the desired interface. Currently, Hydro does not implement checked casts, so we can only do

  let pr' = M.unchecked_pr_Sample_F pr

and have finally a pr_Sample_F reference.

As mentioned, such references are only typed incarnations of addresses. We still have no way of calling a method. In order to do this, we have to create the proxy object:

  let po = M.pc_Sample_F proxy_env pr'

This object of type po_Sample_F is now a live proxy, i.e. it can connect to the server, manage connections, and of course also invoke methods. Note that it is still unknown whether the remote object exists. This is only checked on the first method call.

The type po_Sample_F again reflects the inheritance relation. As F is a descendant of ::Ice::Object, the type po_Sample_F is a subtype of po_Ice_Object. For the second interface G, po_Sample_G would be a subtype of po_Sample_F.

We can call a number of methods on po. Some are predefined in Hydro_proxy.proxy_t, some in Hydro_lm_IceObject, and the others are added in the generated code. The predefined methods include

The generated methods:

The add method has the strange type

   method add : int32 -> int32 -> r_Sample_F_add Hydro_lm.call_suspension_t

The parameters are clearly x and y, but what do we get as result? The point is that add only prepares the invocation, but does not do it immediately. Because of this, you get a call suspension object, and this allows you (1) to modify call parameters like timeouts, and (2) to select between a synchronous and an asynchronous call.

We do a synchronous call:

 
  let r = (po # add 42l 16l) # scall

This statement returns only when the response (or error code) has arrived. r has type r_Sample_F_add, and you get the main result by doing

  let z = r#result

which is hopefully 58l. Note that result can also throw exceptions. In case the method has output parameters, these additional output values are accessible as out_name, e.g. r#out_remark if there was an output parameter remark.
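The idea that add only prepares an invocation can be modelled without any networking. The following sketch is a toy: the class call_suspension is invented here and only mimics the scall method named above; it is not the Hydro_lm.call_suspension_t interface. The counter shows that nothing is executed until scall is invoked.

```ocaml
(* Toy model of a call suspension. The class and its constructor
   argument are hypothetical; only the method name scall is taken
   from the text above. *)
class ['r] call_suspension (invoke : unit -> 'r) = object
  method scall : 'r = invoke ()   (* perform the call now and return the result *)
end

(* Counts how many invocations really took place. *)
let invocations = ref 0

(* Stand-in for a generated proxy method: builds a suspension only. *)
let add (x : int32) (y : int32) : int32 call_suspension =
  new call_suspension (fun () -> incr invocations; Int32.add x y)

let suspension = add 42l 16l
let invocations_before = !invocations   (* still 0: nothing was sent *)
let z = suspension#scall
let invocations_after = !invocations
```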

Basic language mapping

The mapping for most types is straightforward:

Exceptions

O'Caml does not support subtyping for exceptions, i.e. we cannot give a nice exception hierarchy here. Predefined exceptions are:

Furthermore, the exception hierarchy in the Slice definition is mapped to a single generated User_exception. An example shows how this works. Given a Slice definition

  exception X {
    string text;
  };

  exception Y extends X {
    string detail;
  };

this is mapped to

  type exception_name = [ `X | `Y ]

  and exception_ops =
    < exn_name : exception_name;
      exn_id : string;
      is_X : bool;
      as_X : t_X;
      is_Y : bool;
      as_Y : t_Y;
    >

  and t_X =
    < hydro_ops : exception_ops;
      text : string
    >

  and t_Y =
    < hydro_ops : exception_ops;
      text : string;
      detail : string;
    >

  and user_exception =
    < hydro_ops : exception_ops >

  exception User_exception of user_exception

Note that the <...> notation means O'Caml object types. They are seldom used, but quite useful in this context.

Now when you catch a User_exception, how can you distinguish between the several Slice exceptions, and how can you get the arguments? Do it this way:

  try
     ...
  with
  | User_exception ue when ue#hydro_ops#is_Y ->
      let y = ue#hydro_ops#as_Y in
      printf "Exception Y: text = %s detail = %s" y#text y#detail
  | User_exception ue when ue#hydro_ops#is_X ->
      let x = ue#hydro_ops#as_X in
      printf "Exception X: text = %s" x#text

Of course, we test for Y first because, due to the exception hierarchy, all Y exceptions are also X exceptions (i.e. is_X is true for all Y exceptions).

Alternatively:

  try
     ...
  with
  | User_exception ue  ->
    ( match ue#hydro_ops#exn_name with
      | `X ->
           let x = ue#hydro_ops#as_X in
           printf "Exception X: text = %s" x#text
      | `Y ->
           let y = ue#hydro_ops#as_Y in
           printf "Exception Y: text = %s detail = %s" y#text y#detail
    )

The latter is advantageous when you want to ensure that all possible exceptions are caught. exn_name always returns the name of the exception that was really thrown, so the order of X and Y does not matter here.
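To try both styles without a server, the mapped types can be written out by hand. The sketch below is not hydrogen output: the constructors make_x and make_y are hypothetical, the exn_id strings are placeholders, and raising Invalid_argument from as_Y on an X exception is an assumption made only for this example.

```ocaml
type exception_name = [ `X | `Y ]
and exception_ops =
  < exn_name : exception_name; exn_id : string;
    is_X : bool; as_X : t_X; is_Y : bool; as_Y : t_Y >
and t_X = < hydro_ops : exception_ops; text : string >
and t_Y = < hydro_ops : exception_ops; text : string; detail : string >
and user_exception = < hydro_ops : exception_ops >

exception User_exception of user_exception

(* Hand-written stand-ins for the generated constructors. *)
let rec make_x (text : string) : t_X =
  object method hydro_ops = ops_x text method text = text end
and ops_x (text : string) : exception_ops =
  object
    method exn_name : exception_name = `X
    method exn_id = "::X"                       (* placeholder id *)
    method is_X = true
    method as_X : t_X = make_x text
    method is_Y = false
    method as_Y : t_Y = invalid_arg "not a Y"   (* assumption of this sketch *)
  end

let rec make_y (text : string) (detail : string) : t_Y =
  object
    method hydro_ops = ops_y text detail
    method text = text
    method detail = detail
  end
and ops_y (text : string) (detail : string) : exception_ops =
  object
    method exn_name : exception_name = `Y
    method exn_id = "::Y"                       (* placeholder id *)
    method is_X = true                          (* every Y is also an X *)
    method as_X : t_X = (make_y text detail : t_Y :> t_X)
    method is_Y = true
    method as_Y : t_Y = make_y text detail
  end

(* Style 1: guards, most specific exception first. *)
let describe_guarded (f : unit -> unit) : string =
  try f (); "no exception" with
  | User_exception ue when ue#hydro_ops#is_Y ->
      let y = ue#hydro_ops#as_Y in
      Printf.sprintf "Y: text = %s detail = %s" y#text y#detail
  | User_exception ue when ue#hydro_ops#is_X ->
      let x = ue#hydro_ops#as_X in
      Printf.sprintf "X: text = %s" x#text

(* Style 2: exhaustive match on the name of the thrown exception. *)
let describe_by_name (f : unit -> unit) : string =
  try f (); "no exception" with
  | User_exception ue ->
      (match ue#hydro_ops#exn_name with
       | `X -> Printf.sprintf "X: text = %s" (ue#hydro_ops#as_X)#text
       | `Y ->
           let y = ue#hydro_ops#as_Y in
           Printf.sprintf "Y: text = %s detail = %s" y#text y#detail)

let raise_x () = raise (User_exception (make_x "oops" : t_X :> user_exception))
let raise_y () =
  raise (User_exception (make_y "oops" "disk full" : t_Y :> user_exception))
```

Both functions give the same answers here; the second one additionally makes the compiler warn when a newly added exception name is not handled.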

Objects: Client view

As explained, proxies are handles for remote objects. They are the most frequent way to access objects. There is, however, another way: One can also send the object over the connection, and access it directly.

An object can have instance variables and operations (methods). If the object is sent, the instance variables are transferred, and another version of the object is created on the other side. This copy is initialized with the instance variables. Of course, any method invocation happens on the copy then.

Sending objects can be interesting as a means to transfer structured values as a whole (e.g. trees or graphs). It is not interesting to make the methods locally available (there is no way to send code). Because of this, we ignore the object methods for now, and focus on the instance variables.

For example, the Slice definition is:

  module Sample {
    class C {
      int x;
      string y;
    };

    class D extends C {
      bool z;
    };

    interface F {
      void foo(C c);
    };
  };

We have two classes C and D, and D is a subtype of C. We also have an interface F, and we assume we can access a remote object that exposes F via a proxy for F.

The method foo takes an argument - this may either be an instance of class C or of its descendant D. ICE demands that if you call foo with a descendant of the declared class type, all of the descendant must be transferred to the remote side, and it must also be recoverable if the remote side knows about D. Hydro supports this.

Note that this means that Hydro needs to support downcasts: When a peer gets a D object when it expects only C, it must be possible to test for the presence of D, and if successful, to uncover D.

This makes the language mapping of objects a bit complicated. O'Caml does not support downcasts for its own classes, so we have to generate emulation code.

So, what is C mapped to? Let's have a look at the core definitions:

  type or_Sample_C

  class type od_Sample_C =
    object
      inherit od_Ice_Object
      method x : int32 ref
      method y : string ref
    end

  class type o_Sample_C =
    object
      inherit od_Sample_C
      inherit Hydro_lm.object_base
    end

  val wrap_Sample_C : o_Sample_C -> or_Sample_C
  val unwrap_Sample_C : or_Sample_C -> o_Sample_C
  val as_Sample_C : #Hydro_lm.object_base -> o_Sample_C

  class mk_od_Sample_C : int32 * string * unit -> od_Sample_C
  class mk_Sample_C : #od_Sample_C -> o_Sample_C
  class restore_Sample_C : Hydro_types.sliced_value -> o_Sample_C

There are five types that play a role:

If you only want to create a pure data object, you can simply combine wrap_Sample_C, mk_Sample_C, and mk_od_Sample_C:

  let od = new mk_od_Sample_C (34l, "Hello", ()) in
  let o = new mk_Sample_C od in
  (* "or" is a reserved word in O'Caml, hence the name or_c *)
  let or_c = wrap_Sample_C o

If the object had operations, mk_Sample_C would define dummy implementations for these operations that always fail for any invocation. But you can override these dummies by inheriting from mk_Sample_C:

  class my_C od =
  object
    inherit mk_Sample_C od
    method bar = ...
  end

In order to look at the instance variables of an object or_c that you receive (a value of type or_Sample_C), simply unwrap it:

  let o = unwrap_Sample_C or_c in
  let x = !(o#x)

Now, how to override methods in objects you receive? After unwrapping you always get the generated version of o_Sample_C. It is possible to change that by defining a custom constructor for restoring received objects:

  class my_restore_C sv =
  object
    inherit restore_Sample_C sv
    method bar = ...
  end

  let () =
    Hashtbl.replace sys#ctors "::Sample::C"
      (fun sv -> (new my_restore_C sv :> Hydro_lm.object_base))

The restore_Sample_C class is used for restoring received objects. It is entered by default in the sys#ctors hash table. By replacing it there, you can select your own variant.
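The mechanism of replacing a default constructor in a hash table can be tried in isolation. In the sketch below everything is a toy stand-in: sliced_value, object_base, restore_c and the table ctors are defined locally and only imitate the roles of their Hydro counterparts.

```ocaml
(* Toy stand-ins, defined locally for this sketch. *)
type sliced_value = (string * string) list   (* field name, field value *)

class type object_base = object
  method describe : string
end

(* Plays the role of the generated restore class. *)
class restore_c (sv : sliced_value) = object
  method describe =
    Printf.sprintf "generated C with %d field(s)" (List.length sv)
end

(* Custom variant that overrides a method of the generated class. *)
class my_restore_c sv = object
  inherit restore_c sv as super
  method! describe = "custom " ^ super#describe
end

(* Constructor table keyed by Slice type id. *)
let ctors : (string, sliced_value -> object_base) Hashtbl.t = Hashtbl.create 16

(* The default entry, as the generated code would register it. *)
let () =
  Hashtbl.replace ctors "::Sample::C"
    (fun sv -> (new restore_c sv :> object_base))

(* Look up the constructor for a type id and build the object. *)
let restore (type_id : string) (sv : sliced_value) : object_base =
  (Hashtbl.find ctors type_id) sv

let sv = [ ("x", "34"); ("y", "Hello") ]
let before = (restore "::Sample::C" sv)#describe

(* Replace the default entry with the custom constructor. *)
let () =
  Hashtbl.replace ctors "::Sample::C"
    (fun sv -> (new my_restore_c sv :> object_base))

let after = (restore "::Sample::C" sv)#describe
```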

Finally, how do the announced downcasts work? For example, we can create a D object and upcast it to C by:

  let od_d = new mk_od_Sample_D (true, (34l, "Hello", ())) in
  let o_d = new mk_Sample_D od_d in
  let o_c = (o_d :> o_Sample_C)

This is the usual O'Caml coercion. We get the original typing back by doing

  let o_d' = as_Sample_D o_c
 

It can only be decided at runtime whether the downcast is possible. If it fails, we get the exception Invalid_coercion. In this example, it always succeeds because we know o_c is in reality a D object. Now, o_d and o_d' are really the same object:

  o_d # x := 35l;
  print_string (Int32.to_string (! (o_d' # x)))

will print 35. Even o_d = o_d' will return true - the object remains the same, only the typing changes.

Note that the as_... functions can also be used for upcasting instead of the :> operator. Of course, the latter is a no-op whereas the as_... function can be expensive.
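One possible way to emulate such downcasts in plain O'Caml is to let every object return itself wrapped in a constructor of an extensible variant. The sketch below is self-contained and is not the code that hydrogen generates; the type dyn, the method as_dyn, and the simplified classes mk_c and mk_d are hypothetical.

```ocaml
exception Invalid_coercion

(* Extensible variant: one constructor per class, carrying the object. *)
type dyn = ..

class type object_base = object
  method as_dyn : dyn
end

class type o_c = object
  inherit object_base
  method x : int32 ref
  method y : string ref
end

class type o_d = object
  inherit o_c
  method z : bool ref
end

type dyn += C of o_c | D of o_d

class mk_c ((x0 : int32), (y0 : string)) : o_c = object (self)
  val x = ref x0
  val y = ref y0
  method x = x
  method y = y
  method as_dyn = C (self :> o_c)   (* wrap this very object *)
end

class mk_d ((z0 : bool), ((x0 : int32), (y0 : string))) : o_d = object (self)
  val x = ref x0
  val y = ref y0
  val z = ref z0
  method x = x
  method y = y
  method z = z
  method as_dyn = D (self :> o_d)
end

(* Run-time checked cast to D; fails for objects that are not a D. *)
let as_d (o : #object_base) : o_d =
  match o#as_dyn with
  | D d -> d
  | _ -> raise Invalid_coercion

let d_obj = new mk_d (true, (34l, "Hello"))
let c_view = (d_obj :> o_c)    (* ordinary upcast, no run-time work *)
let d_again = as_d c_view      (* emulated downcast *)

let () = d_obj#x := 35l
let seen_through_downcast = !(d_again#x)

(* A real C object cannot be downcast to D. *)
let c_only_fails =
  try ignore (as_d (new mk_c (1l, "c"))); false
  with Invalid_coercion -> true
```

In this model the downcast returns the very same object, so changes made through one view are visible through the other.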