(* $Id: seqdb_archive.mli 16167 2008-01-17 23:16:02Z gerd $ *)

(** Archives of element files ("gar" format) *)

(** An archive is a sequence of elements. Every element has a name, a
    small amount of metadata (timestamp, comment), and payload data.
    The payload can optionally be stored with GZIP compression.

    When an archive is opened for reading, the elements are read from the
    end to the beginning. If a name occurs several times in the archive,
    the entry that was stored last is the one that is found.

    Archives can only be modified by appending new elements at the end.

    Archives can be stored in three ways:
    - As a normal file. In this case, the module optionally locks the file
      when it is opened for writing, so that two writers cannot modify it
      at the same time. There is no locking for readers.
    - As a file in a {!Seqdb_fsys_ht} user-space file system. There is no
      additional locking in this case because the file system module
      already performs adequate locking.
    - As a string. Such archives can only be opened for reading.

    The byte-stream representation is identical in all three storage forms.

    Conventionally, the file suffix for these archives is "gar". If stored
    in a file system, the file type 'a' is used, but no name suffix.
 *)

type t
  (** Type of opened archives *)

type mode = [ `Rdonly | `Rdwr | `Rdwr_exclusive ]
  (** How to open:
      - [`Rdonly]: only for reading
      - [`Rdwr]: for reading and writing (appending)
      - [`Rdwr_exclusive]: as [`Rdwr], but the file is write-locked before
        it is accessed. If another writer already holds the lock, the call
        waits until the lock can be acquired. This mode is only available
        if the archive is stored as a regular file.
   *)

(** About write locking: The lock is released the next time the
  * file is closed. (As POSIX file locks are used, it is
  * sufficient that _any_ file descriptor for that file is closed to
  * release the lock. Note that this behaviour may introduce subtle
  * bugs into your program if you open archive files several times.)
  *
  * It is generally not required to lock files for reading.
 *)

type entry
  (** Type of a member element *)

type properties =
    { entry_name : string;       (** unique name of the entry *)
      entry_timestamp : int64;   (** timestamp when the entry was added *)
      entry_gzip : bool;         (** whether the entry is stored compressed *)
      entry_comment : string;    (** an arbitrary comment string *)
    }

val openfile : mode -> string -> t
  (** Open the file. For [mode=`Rdwr] the file is created if it does not
      exist yet
   *)

val openfile_filesys : 'a Seqdb_fsys_types.file_system -> mode -> string -> t
  (** Open the file. For [mode=`Rdwr] the file is created if it does not
      exist yet. [mode=`Rdwr_exclusive] is handled like [`Rdwr].
   *)

val openstring : string -> t
  (** Open the archive given as string (always [`Rdonly]) *)

val close : t -> unit
  (** Closes the file (and if locks are held, they are released) *)

val contents : t -> entry list
  (** Return the entries of the archive *)

val info : entry -> properties
  (** Return table-of-contents information about the entry *)

val read : entry -> Netchannels.in_obj_channel
  (** Read the contents of the entry. If the entry is gzip'ed, the contents
      are returned uncompressed
   *)

val append : t -> string -> int64 -> bool -> bool -> string ->
               Netchannels.out_obj_channel
  (** [let ch = append arch entry_name entry_timestamp entry_gzip data_gzip comment]:
    * Appends data to
    * the archive. The data must be written to the returned [ch], and [ch]
    * must be closed (using [close_out]) afterwards.
    * [entry_gzip] says whether the entry is to be stored compressed.
    * [data_gzip] says whether the data written to the [out_obj_channel]
    * are already compressed.
    *
    * Limitation: If [data_gzip && not entry_gzip], the decompression is fully
    * done in memory, not chunk by chunk as for the other combinations.
    *
    * Only one entry can be appended or read at a time,
    * i.e. the next entry cannot be added or read before all data for the
    * previous one have been written or read and [ch] is closed.
   *)

val keep_page_cache_clean : t -> unit
  (** Remove all cached pages for [t] from the page cache. Applies only if
      [t] is a file
   *)

(* CHECK: keep_page_cache_clean only makes sense for synced files *)
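
(** Usage sketch, not part of the original interface. It only illustrates
    how the functions declared in this file might be combined to append one
    entry and read it back. Assumptions: this interface is compiled as
    module [Seqdb_archive] (inferred from the file name); the timestamp is
    taken to be seconds since the epoch, which this file does not specify;
    the file name, entry name and comment are made up; the [Unix] library
    and Ocamlnet's [Netchannels] are linked. If the name already occurs in
    the archive, which of the matching entries [List.find] returns depends
    on the order of [contents], which is not documented here.

    {[
      let () =
        (* Append one entry. [`Rdwr] creates the file if it does not exist. *)
        let arch = Seqdb_archive.openfile `Rdwr "example.gar" in
        let now = Int64.of_float (Unix.time ()) in
        (* entry_gzip = true: store compressed;
           data_gzip = false: the data written below are uncompressed *)
        let ch =
          Seqdb_archive.append arch "hello.txt" now true false "first entry" in
        ch # output_string "Hello, archive!\n";
        (* [ch] must be closed before the next entry is appended or read *)
        ch # close_out ();
        Seqdb_archive.close arch;

        (* Reopen read-only and look the entry up by name. *)
        let arch = Seqdb_archive.openfile `Rdonly "example.gar" in
        let entry =
          List.find
            (fun e ->
               (Seqdb_archive.info e).Seqdb_archive.entry_name = "hello.txt")
            (Seqdb_archive.contents arch) in
        (* [read] returns the payload uncompressed *)
        let inp = Seqdb_archive.read entry in
        let data = Netchannels.string_of_in_obj_channel inp in
        inp # close_in ();
        Seqdb_archive.close arch;
        print_string data
    ]}

    The sketch closes and reopens the archive between writing and reading
    because this file does not say whether [contents] on a handle opened
    with [`Rdwr] reflects entries appended through the same handle.
 *)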