External Data Representation and Marshalling

Chamod Malintha
5 min read · Jun 21, 2020

Information stored in running programs is represented as data structures, whereas the information in messages is a sequence of bytes. Whatever form of communication is used, data structures must be flattened before transmission and rebuilt on arrival. The individual primitive data items transmitted in messages can be values of many different types, and not all computers store primitive values such as integers in the same order.

The representation of floating-point numbers also differs between architectures. There are two ways of ordering the bytes of an integer: big-endian order, in which the most significant byte comes first, and little-endian order, in which it comes last. Another issue is the set of codes used to represent characters: for example, most applications on systems such as UNIX use ASCII character coding with one byte per character, whereas the Unicode standard allows text in many different languages to be represented and, in encodings such as UTF-16, takes two or more bytes per character.
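To make the byte-order and character-coding differences concrete, here is a minimal Java sketch. The class name `ByteOrderDemo` and the sample values are only illustrative assumptions; the sketch uses the standard `java.nio.ByteBuffer` API to lay out the same integer in both orders and compares how many bytes the same text takes in ASCII and in UTF-16.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class ByteOrderDemo {
    // Encode a 32-bit integer into 4 bytes using the given byte order.
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Render bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int value = 0x01020304; // illustrative sample value
        // Most significant byte first: 01 02 03 04
        System.out.println("big-endian:    " + hex(encode(value, ByteOrder.BIG_ENDIAN)));
        // Most significant byte last: 04 03 02 01
        System.out.println("little-endian: " + hex(encode(value, ByteOrder.LITTLE_ENDIAN)));

        String text = "Smith"; // illustrative sample text
        // One byte per character in ASCII: 5
        System.out.println("ASCII bytes:  " + text.getBytes(StandardCharsets.US_ASCII).length);
        // Two bytes per character in UTF-16 (no byte-order mark): 10
        System.out.println("UTF-16 bytes: " + text.getBytes(StandardCharsets.UTF_16BE).length);
    }
}
```

If a sender and a receiver disagree about either choice, the same bytes decode to a different value, which is why an agreed external representation is needed.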

Marshalling is the process of taking a collection of data items and assembling them into a form suitable for transmission in a message. Unmarshalling is the process of disassembling them on arrival to produce an equivalent collection of data items at the destination. Marshalling therefore consists of translating structured data items and primitive values into an external data representation. Similarly, unmarshalling consists of generating primitive values from their external data representation and rebuilding the data structures.
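As a rough illustration of the two directions, the sketch below flattens a small record into a byte array and rebuilds it again. The `Person` record and the class name `MarshalDemo` are hypothetical, and the wire format is simply whatever `java.io.DataOutputStream` produces (big-endian integers and length-prefixed strings), not any of the standards discussed in this article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class MarshalDemo {
    // Hypothetical structured data item, used only for this illustration.
    static final class Person {
        final String name;
        final String place;
        final int year;

        Person(String name, String place, int year) {
            this.name = name;
            this.place = place;
            this.year = year;
        }
    }

    // Marshalling: flatten the structure into a sequence of bytes.
    static byte[] marshal(Person p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(p.name);   // 2-byte length, then the characters
            out.writeUTF(p.place);
            out.writeInt(p.year);   // 4 bytes, most significant byte first
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Unmarshalling: read the fields back in the same order and rebuild the structure.
    static Person unmarshal(byte[] message) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(message));
            String name = in.readUTF();
            String place = in.readUTF();
            int year = in.readInt();
            return new Person(name, place, year);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] message = marshal(new Person("Smith", "London", 1984));
        Person copy = unmarshal(message);
        System.out.println(message.length + " bytes -> " + copy.name + ", " + copy.place + ", " + copy.year);
    }
}
```

Note that the receiver has to read the fields in exactly the order the sender wrote them; nothing in the bytes themselves says what they mean.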

Three approaches to external data representation and marshalling are discussed here.

· CORBA’s Common Data Representation (CDR)

This is concerned with an external representation for the structured and primitive types that can be passed as the arguments and results of remote method invocations in CORBA. It can be used by a variety of programming languages.

· Object serialization in Java

This is concerned with the flattening and external data representation of any single object or tree of objects that may need to be transmitted in a message or stored on disk. It is for use only by Java.

· XML (Extensible Markup Language)

This defines a textual format for representing structured data. It was originally intended for documents containing textual self-describing structured data (for example, documents accessible on the Web), but it is now also used to represent the data sent in messages exchanged by clients and servers in web services.

In the first two cases, marshalling and unmarshalling are intended to be carried out by a middleware layer without any involvement on the part of the application programmer. Even in the case of XML, which is textual and therefore more accessible to hand-encoding, software for marshalling and unmarshalling is available for all commonly used platforms and programming environments. Because marshalling requires attention to every detail of the representation of the primitive components of composite objects, the process is likely to be error-prone if carried out by hand. Compactness is another issue that can be addressed in the design of automatically generated marshalling procedures.

In the first two approaches, the primitive data types are marshalled into a binary form. In the third approach (XML), the primitive data types are represented textually. The textual representation of a data value is generally longer than the equivalent binary representation. The HTTP protocol is another example of the textual approach.
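The size difference is easy to see with a single integer. The following sketch is only illustrative: the class name `BinaryVsTextDemo` and the `year` element name are assumptions, and the binary form is a plain 4-byte big-endian integer rather than a complete CDR or Java serialization stream.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class BinaryVsTextDemo {
    // Binary form: a fixed 4 bytes, whatever the value is.
    static byte[] asBinary(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Textual form: decimal digits wrapped in a hypothetical <year> element.
    static byte[] asXmlText(int value) {
        String xml = "<year>" + value + "</year>";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        int year = 1984;
        System.out.println("binary: " + asBinary(year).length + " bytes");  // 4 bytes
        System.out.println("text:   " + asXmlText(year).length + " bytes"); // 17 bytes
    }
}
```

Most of the extra bytes are the tags, which is the price paid for a message that describes itself.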

CORBA’s Common Data Representation (CDR)

CORBA CDR is the external data representation defined with CORBA 2.0. CDR can represent all of the data types that can be used as arguments and return values in remote invocations in CORBA. These consist of 15 primitive types and a range of composite types. Each argument or result in a remote invocation is represented by a sequence of bytes in the invocation or result message.

The marshalling operations can be generated automatically from the specification of the types of the data items to be transmitted in a message. The types of the data structures and of the basic data items are described in CORBA IDL, which provides a notation for describing the types of the arguments and results of RMI methods. In other words, CORBA IDL can be used to describe the data structures carried in a message.
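To give a feel for what such generated marshalling code might produce, here is a simplified Java sketch written in the style of CDR. It is not a conforming CDR encoder: it assumes big-endian byte order and ASCII strings, omits the message header, and the `Person` struct in the comment, the class name `CdrStyleDemo`, and its methods are all hypothetical. It follows the commonly described rules that an unsigned long is aligned on a 4-byte boundary and that a string is sent as a length (counting a terminating zero byte) followed by its characters.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical IDL type that the marshalling below corresponds to:
//   struct Person { string name; string place; unsigned long year; };
class CdrStyleDemo {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    // Pad with zero bytes until the stream length is a multiple of the boundary.
    private void align(int boundary) {
        while (out.size() % boundary != 0) {
            out.write(0);
        }
    }

    // 4-byte integer, aligned on a 4-byte boundary (big-endian is assumed here).
    void writeULong(int value) {
        align(4);
        byte[] b = ByteBuffer.allocate(4).putInt(value).array();
        out.write(b, 0, b.length);
    }

    // Length (including the terminating zero byte), then the characters, then the zero byte.
    void writeString(String s) {
        byte[] chars = s.getBytes(StandardCharsets.US_ASCII);
        writeULong(chars.length + 1);
        out.write(chars, 0, chars.length);
        out.write(0);
    }

    byte[] toByteArray() {
        return out.toByteArray();
    }

    // Fields are written in the order in which the struct declares them.
    static byte[] marshalPerson(String name, String place, int year) {
        CdrStyleDemo cdr = new CdrStyleDemo();
        cdr.writeString(name);
        cdr.writeString(place);
        cdr.writeULong(year);
        return cdr.toByteArray();
    }

    public static void main(String[] args) {
        byte[] message = marshalPerson("Smith", "London", 1984);
        System.out.println(message.length + " bytes"); // 28 bytes
    }
}
```

The byte sequence carries no field names or type tags, so the receiver can only unmarshal it if it already knows the IDL definition.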

Java object serialization

In Java, the term serialization refers to the activity of flattening an object or a connected set of objects into a serial form that is suitable for storing on disk or transmitting in a message.

Java object serialization uses reflection to find out the class name of the object to be serialized and the names, types, and values of its instance variables. That is all that is needed for the serialized form. When an object is serialized, its class information is written out, followed by the types and names of its instance variables; if an instance variable belongs to a new class, that class's information is written out too. This recursive procedure continues until the class information and the types and names of the instance variables of all the necessary classes have been written out. Each class is given a handle, and no class is written to the byte stream more than once. For deserialization, the class name in the serialized form is used to create a class. This is then used to create a new constructor with argument types corresponding to those specified in the serialized form. Finally, the new constructor is used to create a new object with instance variables whose values are read from the serialized form.

The serialization and deserialization of the arguments and results of remote invocations are generally carried out automatically by the middleware, without any participation by the application programmer. If necessary, programmers with special requirements can write their own versions of the methods that read and write objects.
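Outside of middleware, the same mechanism can be used directly through `java.io.ObjectOutputStream` and `java.io.ObjectInputStream`. The sketch below is a minimal example; the `Person` class and the wrapper class `SerializationDemo` are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationDemo {
    // A class must implement Serializable before its instances can be serialized.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String place;
        final int year;

        Person(String name, String place, int year) {
            this.name = name;
            this.place = place;
            this.year = year;
        }
    }

    // Flatten an object (and everything it references) into a byte array.
    static byte[] serialize(Object obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuild an equivalent object from its serialized form.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize(new Person("Smith", "London", 1984));
        Person copy = (Person) deserialize(data);
        System.out.println(copy.name + ", " + copy.place + ", " + copy.year);
    }
}
```

The serialized bytes include the class name and field descriptions, so the receiver does not need to be told in advance what type of object is arriving, only to have the class available.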

Extensible Markup Language (XML)

XML is a markup language defined by the World Wide Web Consortium (W3C) for general use on the Web. In general, a markup language is a textual encoding that represents both a text and details of its structure or appearance. Both XML and HTML are derived from the Standard Generalized Markup Language (SGML). HTML was designed for defining the appearance of web pages. XML was designed for writing structured documents for the Web.

XML is extensible in the sense that users can define their own tags, in contrast to HTML, which uses a fixed set of tags. However, if an XML document is intended to be used by more than one application, the applications must agree on the names of the tags. For example, clients usually use SOAP messages to communicate with web services. SOAP is an XML format whose tags are published for use by web services and their clients.

Some external data representations, such as CORBA CDR, do not need to be self-describing, because it is assumed that the client and server exchanging a message have prior knowledge of the order and types of the information it contains. XML, however, is intended to be used by multiple applications for different purposes. This is made possible by the provision of tags, together with the use of namespaces to define the meaning of the tags. In addition, tags enable an application to select only the parts of a document that it needs to process, and that selection is not affected by the addition of information relevant to other applications.
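The point about selecting only the parts an application needs can be sketched with the DOM parser that ships with Java. In the example below, the `person` document, its tag names, and the class name `XmlSelectDemo` are hypothetical; the `nickname` element stands in for information added for some other application.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class XmlSelectDemo {
    // Return the text of the first element with the given tag, or null if there is none.
    static String firstElementText(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList matches = doc.getElementsByTagName(tag);
            return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
        } catch (Exception e) {
            // Parser configuration, I/O, and malformed-XML errors all end up here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical document; <nickname> is extra data meant for another application.
        String xml = "<person id=\"123456789\">"
                + "<name>Smith</name>"
                + "<place>London</place>"
                + "<year>1984</year>"
                + "<nickname>Smitty</nickname>"
                + "</person>";

        // This application only cares about the name and the year.
        System.out.println(firstElementText(xml, "name")); // Smith
        System.out.println(firstElementText(xml, "year")); // 1984
    }
}
```

Because the lookup is by tag name rather than by position, the extra element does not disturb the code that reads `name` and `year`.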

Chamod Malintha

Software Engineer | BSc. (Hons.) in Software Engineering | University of Kelaniya, Sri Lanka