The SmartFrog Reference Manual
A guide to programming with the SmartFrog Framework
For SmartFrog Version 3.06
Localized for UK English / A4 Paper
Part 1: An Introduction to SmartFrog
Introduction
This manual is aimed at those wanting to use and understand the workings of SmartFrog. It is not a basic tutorial, though hopefully it is not too obscure, either. The notation is described fully, as is the component model. The framework, however, is only outlined. For a detailed reference description of the framework APIs, users should refer to the accompanying Javadoc files.
The manual is divided into several sections:
- The aims of the SmartFrog system: defining the basic goals of the system, thus ensuring that there is an awareness of these aims to aid in understanding the technical details.
- The SmartFrog notation, describing the details and semantics of the first configuration description notation to be supported by the SmartFrog framework; other notations are in preparation but are not included in this manual.
- The SmartFrog component model and framework, defining how to write components and run them within the SmartFrog system.
- The SmartFrog security infrastructure, describing how SmartFrog ensures that systems are appropriately protected.
A separate document covers the details of installing and running the SmartFrog system. A number of examples are also provided and documented as part of the framework.
This document contains sections that assume differing levels of knowledge and familiarity with the SmartFrog system. It is suggested that a first-time user read only those parts that are essential before experimenting, and then progress to more advanced topics as familiarity develops. To aid in this, sections or sub-sections are tagged with one of the following labels: basic, advanced and expert, indicating progressively more advanced topics. If a section is tagged as a particular level of complexity, and a sub-section is considered to be of a higher level, the sub-section will be tagged with this higher level.
Aims Of The SmartFrog Framework - Basic
Configuration
For many years HP Labs has been involved in the development of large-scale distributed systems, and in particular management and measurement systems. From this experience, it became clear that configuration is often the major hurdle in the development, adoption and use of such large systems. This experience is supported by evidence from other domains, such as telecom service platforms, large scale e-service hosting environments, and so on. The weight of evidence clearly indicates that many of the problematic aspects of developing, delivering and maintaining such systems are resolved by the introduction of a well-designed, intuitive configuration system. These observations led to the development of the SmartFrog configuration framework described in this manual.
There are several significant reasons for investing in a powerful and flexible configuration environment, which in combination illustrate why this area is in many cases essential for the success of a large system. These are discussed below, as a clear understanding of these reasons helps in determining the requirements for a supporting environment.
Increased operational reliability
Configuration errors are the major cause of system failure. It is no coincidence that at least one system development inside HP has termed the development of a tailored configuration system its ‘high-availability programme’. It is pointless spending money on expensive replicated databases and computation if they contain wrong data, or are carrying out the wrong calculations. From hard experience, the developers of that system know that the human element is by far the weakest point in any system of even moderate complexity.
Many systems are required to be resilient to a (small) number of failures, providing support for dynamic system reconfiguration in the case of such failures. This should be provided via failure detection mechanisms triggering re-configuration actions within the system components themselves (such as instigating fail-over) and through the configuration system to ensure a consistent view of the current configuration and to provide appropriate re-configuration policy (for example, where to create the replacement components in the case of a processor failure).
Improved quality
After examining the architecture and design of several large-scale systems it became clear that the developers of the various component sub-systems had each created their own configuration infrastructure, often not realizing that this area is of great importance to the overall system. Each makes separate decisions as to format of the data, how it is stored, and so on. In addition, since some aspects of configuration such as configuration description or failure detection and recovery can be extremely complex, the separate development groups frequently do not utilize best practice.
Reduced cost
Costs can arise for several reasons and in several areas such as development, installation and maintenance. For each of these, providing well-defined best-practice procedures and well-implemented support environments for configuration can save significant time and hence money. From experience with several systems, the majority of support calls for these systems (and hence source of recurring cost to the platform provider) come from configuration issues.
Assured correctness and consistency
Validation rules need to be provided to ensure that a configuration is correct before it is deployed into the running system. These rules should include dependencies between various system components (e.g. version dependencies) as well as rules governing repetition (e.g. each web server should run the xxx process and …), replication (e.g. two cooperating instances of this component should exist for reliability...), location (e.g. this component should be close to the database...), and so on. Tools for modelling and reasoning about the configurations are required.
Given a configuration that has been defined and validated, the configuration must then be correctly and verifiably instantiated, preferably automatically, with appropriate error handling in the case of failure. Discovery services must be present to enable binding of services to each other as defined in the configuration, and status monitoring capabilities are required to provide management tools with the ability to monitor the overall state of the system and to ensure it is correct with respect to the desired configuration.
Complex systems may in fact be impossible to configure manually if the requirements change faster than individuals' ability to track these changes and carry out the complex reconfiguration tasks. In these cases, automated, adaptive configuration, driven from general rules and auto discovery, is the only solution.
Increased security
System configurations are vital to the integrity of the system. Consequently, in many environments where physical and network isolation cannot be guaranteed, a high level of basic system security must be provided. This involves protecting not only the configuration data itself from unauthorized access but also the run-time environment, including discovery protocols, component instantiation services, management services and so on. It is typically hard to provide a secure environment when many independent and diverse techniques are used to provide the configuration, so again a single solution implementing best practice is an essential step to ensure system integrity.
Improved Customer Experience
A major issue to be considered in designing systems is that different classes of user have different requirements. All too frequently, the configuration information is designed for the convenience of the system developer not the system operator. Data is required in a form that often does not reflect the skills of the administrator, or maybe is replicated in several files, or distributed over many processors, each of which can lead to a slow and error-prone configuration process. Configuration should be done in ways useful to the operator and adapted to the system and not by expecting the operator to adapt. This can be expensive and hard to implement unless there is extensive support for the systems developers.
The SmartFrog Framework
SmartFrog is a framework for the development of configuration-driven systems. It was originally designed as a framework for building and managing large monitoring systems where flexible configurations are essential. SmartFrog is currently in use within several products, though it is not a product in its own right.
The name reflects its basic design concept – the Smart Framework for Object Groups. It defines systems and sub-systems as collections of software components with certain properties. The framework provides mechanisms for describing these component collections, deploying and instantiating them and then managing them during their entire lifecycle.
The framework consists of three major aspects:
- The SmartFrog configuration description environment, consisting of a description notation and tools to enable the storage, validation and manipulation of these descriptions.
- The SmartFrog component model, defining the interfaces that a software component (or a management adapter for a component) should implement. These interfaces are to support the various lifecycle operations such as creation, versioning and termination, as well as management actions such as accessing status information.
- The SmartFrog configuration management system, which uses these descriptions and management adapters to instantiate the software components and to monitor them throughout their lifecycle in a secure way, including an integrated run-time environment providing capabilities such as discovery and naming.
Notation
The SmartFrog ‘notation’ is in fact defined as a set of open data structures. In principle, this definition can support a number of parsers that provide different textual versions of the notation (for example using XML as a surface syntax). Additionally, it is possible to develop GUI tools that allow the users to “drag-and-drop” their configurations using the data structures as the common form. At this stage, no generic GUI tools are available for SmartFrog, though experimental versions have been built; such tools are usually best tailored to a specific class of system.
The notation is object-oriented, supporting inheritance and extension of configuration descriptions. These descriptions consist of component definitions, associations and relationships between the components, and workflows associated with the lifecycle of the components and the system as a whole. The descriptions may be parameterized enabling multiple instantiations with different configuration data, and validations may be provided which verify that these instances are correct before an attempt is made to deploy the configuration.
The current version of SmartFrog, though in principle able to support multiple textual languages, just provides its own specialized notation “out-of-the-box”. Others are in preparation for future releases.
The notation is not used to define behaviour, merely the structure of collections of components and their relationships with other collections. It is not a programming language. The behavioural part of a component is assumed to be defined in an existing programming language (such as C or Java) and the component will be started as needed by the SmartFrog configuration management system. Currently only Java is tightly integrated. Java adaptors must be used to wrap code written in other languages, and these are relatively simple to implement.
Components
The component model supported by SmartFrog is a simple, extensible set of interfaces providing access to key management actions – such as instance creation, configuration, termination, and so on. A component may be fully integrated (i.e. it may implement the defined management interfaces directly, and hence be written in Java) or it may be independent, in which case a management adapter must be provided. Several standard management adapters or base integrated components have been written to provide common behaviours and these may be extended or modified as appropriate.
Each component (or adapter) must implement a standard lifecycle, implemented as a set of action routines that the environment invokes in the appropriate order and at the right time to carry out the configuration or other management task required. The lifecycle process is governed and controlled by the definition of workflows within the SmartFrog system to provide a very flexible and adaptable environment for carrying out the various configuration tasks.
A complete set of APIs is available to the components that allow them to access the configuration information, locate other components as defined in the configuration and to alter the running configuration if so desired.
Environment
The SmartFrog configuration and management infrastructure is supported by a collection of services, such as:
- deployment – the distribution of code, configuration data and the instantiation of components in the right place with certain ‘transactional’ guarantees
- discovery and naming – providing a number of binding services to allow components to locate each other and communicate
- management – every component is manageable via tools provided with the framework, via the web, or other consoles (if so configured) with no developer effort.
These services are incorporated so as to provide a seamless and coherent programming and configuration model. The benefits of this approach are in providing configuration abstractions to component developers that allow multiple configurations of different scale to be produced without altering the components in any way. The environment is broken into several well-defined functional units, each of which has some specific role to play. Furthermore each of these operates through well-defined and open interfaces, so it is easy to replace the existing functional units, or even to make the selection of which functional unit to use part of the configuration description.
For example, suppose a component, say an SS7 stack, requires the use of another, a real-time database for storing connection information to help the recovery process in the case of system failure. This may be done in many ways. For example, the database could name itself under some well-known name in some well-known naming service, and the stack could find it there. Alternatively, the system may use SLP discovery to locate the database, or perhaps look in a file for this location information. Each approach has advantages in different system contexts, but the programmer typically has to decide up front which to support.
Not so with the SmartFrog integrated environment. The SmartFrog system supports the notion of a binding and provides multiple ways – determined by the environment and driven by the configuration descriptions – for these bindings to be resolved. This includes all the above approaches and others may be added as required. So a programmer need only obtain its binding from the environment and the precise mechanism is handled by the SmartFrog environment as defined by the configuration.
SmartFrog is a framework, and is designed to make it easy to provide additional binding mechanisms as they are required – for example changing the naming service or adding a specialized binding service which uses some other technologies such as databases or directories.
This is equally true of the other services. Consider deployment; it is possible to provide different mechanisms for ensuring that a component is created in the right place. For instance, it might be by hostname, or perhaps by some computer’s role within the system, or perhaps it needs to be close to another existing component. Each of these location mechanisms may be integrated into the run-time environment and then referenced freely within the configuration descriptions.
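As a rough illustration of the binding idea described above, the following Java sketch tries a list of interchangeable lookup mechanisms in turn, so that the caller only asks for a binding and never sees which mechanism supplied it. The names BindingSketch, BindingMechanism and table, and the location string, are assumptions made for this example; they are not part of the SmartFrog API, for which the Javadoc remains the reference.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: none of these types are part of the SmartFrog API. */
final class BindingSketch {

    /** One pluggable way of locating a component, e.g. a naming service, discovery or a file. */
    interface BindingMechanism {
        Optional<String> locate(String componentName);
    }

    /** Tries each configured mechanism in order and returns the first location found. */
    static Optional<String> resolve(String componentName, List<BindingMechanism> mechanisms) {
        for (BindingMechanism mechanism : mechanisms) {
            Optional<String> location = mechanism.locate(componentName);
            if (location.isPresent()) {
                return location;
            }
        }
        return Optional.empty();
    }

    /** A mechanism backed by a fixed table, standing in for a naming service or a file. */
    static BindingMechanism table(Map<String, String> entries) {
        return name -> Optional.ofNullable(entries.get(name));
    }

    public static void main(String[] args) {
        // The order and choice of mechanisms would come from the configuration, not the code.
        List<BindingMechanism> configured = List.of(
                table(Map.of()),                            // a naming service with no entry
                table(Map.of("database", "hostB:5000")));   // a location file (made-up address)
        System.out.println(resolve("database", configured).orElse("unbound")); // prints hostB:5000
    }
}
```

In this sketch, adding a further mechanism means adding one more element to the configured list; the code that asks for the binding is unchanged.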
Final Comments
The design goals for SmartFrog were to produce a very lightweight and flexible configuration and management infrastructure capable of scaling from small systems to very large. This has been achieved through the use of the framework concept and providing users with the ability to alter the low-level semantics by replacing functional units, yet providing standard capabilities by offering default implementations of these units. The system also provides a flexible configuration description notation, with potential for multiple textual or GUI syntaxes to be used targeted at specific system architectures.
Applications of SmartFrog have clearly demonstrated that systems are more quickly implemented using the technology, and that the structure imposed upon the implementations by the use of SmartFrog is beneficial to long-term reliability, usability and manageability.
The Anatomy of SmartFrog
This section attempts to lay out the main aspects of the SmartFrog service deployment framework, describe their relationships, and map them into the structure of the reference document.
As described in the introduction, it consists of three main aspects:
- The SmartFrog notation, a language in which to describe the configurations, also known as service descriptions.
- The SmartFrog component model, the way in which programmers write components that are created and managed by SmartFrog as part of a service and which can interact with the system. These are deployed according to the service description.
- The SmartFrog runtime, the collection of services that exist as part of the SmartFrog system. This is also known as the deployment engine, though strictly this is a misnomer since it is in reality a collection of predefined components.
These various components can be seen from the following outline diagram of a SmartFrog system.
These three aspects will now be examined in a little greater detail.
Building Systems with SmartFrog
Part 2: The SmartFrog Notation and Core Data Model
Introduction
The statement that there is a SmartFrog notation is a simplification of reality. SmartFrog may support many notations, though it provides a 'standard' primary notation out of the box. To enable this, SmartFrog provides a well-defined interface between the language processing parts of SmartFrog and the run-time as a well-defined data model: the set of Java classes that must be used by the language processing to represent the data delivered to the runtime system.
Roughly speaking, the model of SmartFrog language handling is shown in the following diagram:
As is illustrated in this diagram, there may be many notations, each with its own language processing, which at the back-end of that processing produces an instance of the data model that can be understood by the remainder of the SmartFrog system. Alternatively, programs such as a drag-and-drop GUI can produce the data in the correct form directly.
To support the development and use of additional languages, the SmartFrog framework provides a rudimentary structure for integrating language processors. A language processor is assumed to consist of three major steps: parsing, executing some processing phases, and then conversion to the standard data format. The set of processing phases is assumed to be language specific, and may be empty.
This is illustrated in the diagram above, also showing the associated Java calls used within the framework. These are not important at this stage and are explained in detail later in the reference manual.
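The three steps can be pictured with a small Java sketch, given here purely as an illustration under stated assumptions. It uses a toy notation of "name value;" pairs, a single example phase that replaces a value naming another attribute with that attribute's value, and an unmodifiable map as a stand-in for the standard data format. The class and method names are hypothetical and are not the Java calls used within the framework.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of parse, phases, convert; not the framework's classes or methods. */
final class LanguageProcessorSketch {

    /** Step 1: parse "name value;" pairs into an ordered parse tree (here just a map). */
    static Map<String, String> parse(String text) {
        Map<String, String> tree = new LinkedHashMap<>();
        for (String statement : text.split(";")) {
            String[] parts = statement.trim().split("\\s+", 2);
            if (parts.length == 2) {
                tree.put(parts[0], parts[1]);
            }
        }
        return tree;
    }

    /** An example phase: a value that names another attribute is replaced by that attribute's value. */
    static Map<String, String> resolveLinks(Map<String, String> tree) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : tree.entrySet()) {
            out.put(e.getKey(), tree.getOrDefault(e.getValue(), e.getValue()));
        }
        return out;
    }

    /** Step 2: run the language-specific phases in order; the list may be empty. */
    static Map<String, String> runPhases(Map<String, String> tree,
                                         List<UnaryOperator<Map<String, String>>> phases) {
        Map<String, String> current = tree;
        for (UnaryOperator<Map<String, String>> phase : phases) {
            current = phase.apply(current);
        }
        return current;
    }

    /** Step 3: convert to the form handed on to the rest of the system (here a read-only copy). */
    static Map<String, String> toCoreForm(Map<String, String> tree) {
        return Collections.unmodifiableMap(new LinkedHashMap<>(tree));
    }

    /** Runs all three steps with the single example phase. */
    static Map<String, String> process(String text) {
        List<UnaryOperator<Map<String, String>>> phases =
                List.of(LanguageProcessorSketch::resolveLinks);
        return toCoreForm(runPhases(parse(text), phases));
    }

    public static void main(String[] args) {
        System.out.println(process("x 42; y x;")); // prints {x=42, y=42}
    }
}
```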
Note that the core data model and the primary notation are closely coupled. This means that in effect the core model can in some ways be seen as a true subset of the primary notation – it could be unparsed into the primary notation and parsed back directly into the core form without requiring any language processing.
Indeed, the two are sufficiently close that the Java classes that are used to directly represent parse-trees of the notation are derived from those of the core model, and much of the same terminology is used in both. So for example, an attribute-set in both is called a Component Description, the only difference being that in the primary notation this may have a super-type from which it inherits, whereas in the core model it may not.
Each notation is assumed to have an associated name, and this name is used in the construction of a parser (selected via a standard language-name to parser-classname mapping). Furthermore, if text files or URLs are handled by the SmartFrog system, the extension associated with that file is assumed to indicate the name of the notation in use. Thus for the primary notation, files should end with “.sf”.
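The lookup described above, from file extension to notation name to parser class, might be sketched in Java as follows. This is only an illustration: the class NotationLookupSketch, the table PARSER_CLASSES and the parser class name in it are placeholders invented for the example, and the real mapping is the one defined by the framework.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of choosing a parser from a file or URL extension. */
final class NotationLookupSketch {

    /** Placeholder language-name to parser-classname table; the class name is made up. */
    static final Map<String, String> PARSER_CLASSES =
            Map.of("sf", "example.PrimaryNotationParser");

    /** Returns the notation name implied by a file name or URL: the text after the last dot. */
    static Optional<String> notationOf(String fileOrUrl) {
        int lastSlash = fileOrUrl.lastIndexOf('/');
        int lastDot = fileOrUrl.lastIndexOf('.');
        if (lastDot <= lastSlash || lastDot == fileOrUrl.length() - 1) {
            return Optional.empty(); // no extension in the final path segment
        }
        return Optional.of(fileOrUrl.substring(lastDot + 1));
    }

    /** Looks up the parser class name for a file, or empty if the notation is unknown. */
    static Optional<String> parserClassFor(String fileOrUrl) {
        return notationOf(fileOrUrl).map(PARSER_CLASSES::get);
    }

    public static void main(String[] args) {
        System.out.println(notationOf("http://host/descriptions/system.sf").orElse("unknown"));
    }
}
```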
Once converted to the core format, the data represented may be used in several ways:
- It can be data that is passed to components in the same way as any other data. Indeed many of the components provided as part of the SmartFrog distribution exchange such data through their APIs.
- It can represent the set of components that should be deployed by the SmartFrog run-time.
Now the second case is in fact just a special case of the first, where the data is passed to one of the standard SmartFrog 'Compound' components, such as the ProcessCompound, that understands how to interpret these descriptions as that of a distributed set of components. This duality is described in the following diagram:
The reference manual covers the primary notation and the core data model.
- The primary notation is covered in this part of the reference manual. This is the only notation covered in the reference manual.
- The programming model for interacting with the language framework is given in section REF. This provides details of how to invoke a parser for a specific notation, how to drive the phase-resolution steps of the language, and finally how to convert to the standardized form for handling within the rest of the SmartFrog framework.
- The core data model is described in section REF. This is only a partial description and the primary source of this information should be the Javadoc for the classes involved.
The Primary SmartFrog Notation
Background
The primary SmartFrog notation has been designed to provide users of the SmartFrog framework with a simple, yet powerful, attribute description language. As such, the language has similar aims to those of XML, though it predates XML by a couple of years. There are a number of significant differences between XML and the SmartFrog notation that are worth explaining, and this is done in section 16.
The primary notation is designed to be very close to the common core data model, but provides a number of additional important usability features, including inheritance, linking, attribute placement, functions and various types of well-formedness predicate. Indeed all of the classes used by the parser to represent the abstract syntax tree are derived from those of the core data model.
Attributes
A SmartFrog description consists of an ordered collection of attributes. The attributes are ordered because several of the operations in the SmartFrog framework require an order, for example the order in which the configuration should be instantiated.
Each attribute has a name and a value, this value being either a simple value (integer, string, etc.), or an ordered collection of attributes known as a component description. This recursion provides a tree of attributes, the leaves of which are the basic values. A value may also be provided by reference to another attribute. This is described by the following BNF, where Stream indicates the entry point to the SmartFrog parser.
Stream ::= AttributeList
AttributeList::= (Attribute
| #include String
| ; // allow arbitrary extra ";"
)*
Attribute::= Name Value
Name::= -- | (WORD [ : Name ])
Value::= Component
| SimpleValue ;
| ; // instance of SFNull
From this it is clear that the input to the parser is a collection of attributes, each named and having an optional value. If the value is not present, the value is defined to be an instance of the class SFNull (note that the other way of defining a value of class SFNull is to use the basic value NULL). The reason for providing this feature is to enable the use of attributes where the presence of the attribute is what is important, not its value.
The syntax for a name will be covered later, but for now it can be considered to be either a simple sequence of letters and digits, starting with a letter, or the double hyphen (“--”). The double hyphen is for use at times when the attribute name is not important and so a new unique name is generated and used. This is particularly useful with the function syntax described in Section 5, and most specifically the nary operators.
Include files are covered in more detail in section 3.1, but in general they consist of parseable SmartFrog text which is parsed as an attribute list and unpacked into place within the containing attribute list.
Values can be divided up into two main categories: nested attribute sets (components) and the rest (simple values) which include numbers, strings, vectors of these, and so on. In addition it is possible not to provide a value for the attribute, or more precisely to give a null value to it (an instance of the SFNull class). This is captured by the third clause of the BNF for values above.
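The shape of the data described in this section, an ordered collection of named attributes whose values are simple values, nested collections or a null marker, can be pictured with the following Java sketch. It is an illustration only: AttributeTreeSketch and NULL_VALUE are names invented for the example, an insertion-ordered map stands in for a component description, and the framework's own classes (including SFNull) are not used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of an attribute tree; not the framework's data model classes. */
final class AttributeTreeSketch {

    /** Stand-in for the null value; the framework has its own SFNull class. */
    static final Object NULL_VALUE = new Object() {
        @Override
        public String toString() {
            return "NULL";
        }
    };

    /** A component description modelled as an insertion-ordered map from name to value. */
    static Map<String, Object> description() {
        return new LinkedHashMap<>();
    }

    /** Builds a two-level tree: simple values are leaves, nested descriptions are inner nodes. */
    static Map<String, Object> example() {
        Map<String, Object> server = description();
        server.put("portNum", 4074);
        server.put("hostname", "ahost.smartfrog.org");
        server.put("isHighPriority", NULL_VALUE); // presence matters, the value does not

        Map<String, Object> root = description();
        root.put("timeout", 30);                  // made-up attribute for illustration
        root.put("server", server);
        return root;
    }

    public static void main(String[] args) {
        // Iteration follows definition order, which is what order-sensitive operations rely on.
        example().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```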
Simple Values
Values are expressible in several syntactic forms.
SimpleValue::= Basic
| Reference
| Operator
| IfThenElse
| Vector
Basic Values
The primary way is to provide a basic value, a literal syntactic form for the basic core values in the SmartFrog language. The syntax for the basic values is best given by example.
Integer: 345
Long: 65325L or 65325l
Float: 34.76F or 34.76f or 34.76E-10F or 34.76e+10f or 34.76E10f
Double: 1534.45 or 1534.45D or 1534.45d or 1534.45E10 or 1534.45E-10D
String: "this is a string"
Multi-line String: ## This is a string
Over many lines #
Boolean: true
SFNull: NULL // alternatively, leave the value empty
Byte Array: #HEX#AB348eAb#
Consequently, an example of a piece of SmartFrog text is as follows:
portNum 4074;
hostname "ahost.smartfrog.org";
isHighPriority false;
This defines three attributes with the appropriate values.
In addition to these basic values, it is also possible to give vectors of basic values (as opposed to the more extensive vector syntax given below). These vectors are limited to containing basic values, and other vectors of basic values.
userList [| "fred", "harry" |];
empty [| |];
listOfLists [| [| 1,2,3 |], [| 4,5,6 |] |];
The full syntax for the basic values is
Basic::= String
| Number
| Boolean
| ByteArray
| [| [Basic (, Basic)*] |]
| NULL
Number::= DOUBLE
| FLOAT
| INTEGER
| LONG
String::= STRING // "....."
| MULTILINESTRING // ##....#
Boolean::= true | false
ByteArray::= #HEX#....#
| #DEC#....#
| #OCT#....#
| #BIN#....#
| #B64#....#
Note that byte arrays will be definable as hexadecimal (HEX), decimal (DEC), octal (OCT), binary (BIN) and base64 (B64); however, only hexadecimal is currently implemented. Depending on the definitional form, the characters that may be used and the number that must be present are different. White space characters are ignored so that neat tabbed layouts may be used. Each byte array literal is treated in the syntax as a single token.
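To make the hexadecimal form concrete, the following Java sketch decodes a #HEX#...# literal into bytes, ignoring white space between the digits. It is a sketch under assumptions rather than the parser's implementation: the class name HexByteArraySketch is invented, and the requirement for an even number of hexadecimal digits is an assumption of the example, not something stated in this manual.

```java
/** Hypothetical decoder for the #HEX#...# byte array form; not the SmartFrog parser. */
final class HexByteArraySketch {

    static byte[] decode(String literal) {
        if (!literal.startsWith("#HEX#") || !literal.endsWith("#") || literal.length() < 6) {
            throw new IllegalArgumentException("not a #HEX#...# literal: " + literal);
        }
        // Drop the delimiters, then ignore any white space used for layout.
        String digits = literal.substring(5, literal.length() - 1).replaceAll("\\s+", "");
        if (digits.length() % 2 != 0) {
            // Assumption of this sketch: two hexadecimal digits per byte.
            throw new IllegalArgumentException("odd number of hexadecimal digits");
        }
        byte[] bytes = new byte[digits.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            int high = Character.digit(digits.charAt(2 * i), 16);
            int low = Character.digit(digits.charAt(2 * i + 1), 16);
            if (high < 0 || low < 0) {
                throw new IllegalArgumentException("not a hexadecimal digit in: " + literal);
            }
            bytes[i] = (byte) ((high << 4) | low);
        }
        return bytes;
    }

    public static void main(String[] args) {
        System.out.println(decode("#HEX#AB348eAb#").length); // prints 4
    }
}
```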
References
The second form of simple value provided by the language is the reference. A reference is a link between the value of one attribute and that of another. This allows data to be defined in one place and reused in many, easing maintenance issues for descriptions. References are dealt with in more detail in section 2.2.4, but as a first example the use of a name in the value position provides the link to the attribute of the same name. So for example:
x 42;
y x;
defines y to be the same value as x, namely 42.
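The effect of this simplest kind of reference can be modelled in a few lines of Java. The sketch below covers only the case shown here, a bare name linking to an attribute of the same name in the same collection; the fuller reference forms are not modelled, and the names ReferenceSketch and Ref are hypothetical rather than framework classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of simple name references; not the framework's Reference class. */
final class ReferenceSketch {

    /** A value that links to another attribute by name. */
    static final class Ref {
        final String target;

        Ref(String target) {
            this.target = target;
        }
    }

    /** Follows links until a non-reference value is reached; fails on unknown names or cycles. */
    static Object resolve(Map<String, Object> attributes, String name) {
        Object value = attributes.get(name);
        int hops = 0;
        while (value instanceof Ref) {
            String target = ((Ref) value).target;
            // More hops than attributes can only mean the links form a cycle.
            if (!attributes.containsKey(target) || ++hops > attributes.size()) {
                throw new IllegalStateException("unresolvable reference: " + target);
            }
            value = attributes.get(target);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = new LinkedHashMap<>();
        attributes.put("x", 42);            // x 42;
        attributes.put("y", new Ref("x"));  // y x;
        System.out.println(resolve(attributes, "y")); // prints 42
    }
}
```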
Operators
The remaining three forms of value definition are syntactic sugar for the use of functions. The semantics of functions are outlined in section 5 and described in detail in section . However, their syntax is
Operator::=
(
(UnaryOp SimpleValue )
| (SimpleValue [ BinaryOp SimpleValue ])
| (SimpleValue [ (NaryOp SimpleValue)* ])
)
UnaryOp::= !
BinaryOp::= - | / | == | != | >= | > | <= | <
NaryOp::= + | * | ++ | <> | && | ||
This states that the use of an operator is always defined within brackets (...) and that there are three types of operator: unary, binary and nary. Although with the nary operators, more than one instance of the operator symbol is present, it must always be the same operator; they cannot be mixed. However, other operators may be nested within another set of ( ). The following examples may help to make the syntax clear:
aTruthValue true;
anotherValue (! aTruthValue ); // the only unary operator: not
aNumber 45;
aMinus (100 - aNumber); // a binary operator
aSum (aNumber + aMinus + 100); // an nary operator
These operators are all converted at time of parsing into the template representation of a function, and hence at no time will operators appear in a description generated from the parsed form.
Note that attribute names can contain rather a large number of special symbols, such as “+” and “-”. This means that there is a danger that an operator may lexically stick to a name if not separated from it by white space. As a consequence, it is good practice always to use white space around operator symbols.
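The rule that an nary expression repeats a single operator across all of its operands can be sketched in Java as follows. This is an illustration only, restricted to integer “+” and “*”; NaryOperatorSketch and its methods are hypothetical names, and the sketch does not reproduce the function templates that the parser actually generates.

```java
import java.util.List;

/** Hypothetical evaluator for one bracketed nary expression; not the parser's function form. */
final class NaryOperatorSketch {

    /** Checks that every operator symbol inside one pair of brackets is the same, and returns it. */
    static String singleOperator(List<String> operatorsInBrackets) {
        if (operatorsInBrackets.isEmpty()) {
            throw new IllegalArgumentException("no operator present");
        }
        String first = operatorsInBrackets.get(0);
        for (String operator : operatorsInBrackets) {
            if (!operator.equals(first)) {
                throw new IllegalArgumentException("operators may not be mixed: " + operatorsInBrackets);
            }
        }
        return first;
    }

    /** Applies one nary operator across all operands, left to right. */
    static int applyNary(String operator, List<Integer> operands) {
        switch (operator) {
            case "+":
                return operands.stream().mapToInt(Integer::intValue).sum();
            case "*":
                return operands.stream().mapToInt(Integer::intValue).reduce(1, (a, b) -> a * b);
            default:
                throw new IllegalArgumentException("not handled in this sketch: " + operator);
        }
    }

    public static void main(String[] args) {
        int aNumber = 45;
        int aMinus = 100 - aNumber;                              // (100 - aNumber)
        String operator = singleOperator(List.of("+", "+"));     // (aNumber + aMinus + 100)
        System.out.println(applyNary(operator, List.of(aNumber, aMinus, 100))); // prints 200
    }
}
```

An expression that needs both “+” and “*” would be written with a nested set of brackets, each of which is then checked and evaluated on its own.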
If-Then-Else
Similarly to operators, if-then-else expressions are shorthand for the template form. This is described in detail in section . The syntax for this expression form is
IfThenElse::= IF SimpleValue
THEN SimpleValue
ELSE SimpleValue
FI
The line breaks are, of course, optional. The “if” value is a boolean and, depending on the result, the expression takes the value of the “then” or the “else” value. The FI is merely a closing keyword. An example of its use is:
val1 42;
val2 43;
diff IF (val1 > val2) THEN (val1 - val2) ELSE (val2 - val1) FI;
Vectors
The final form of simple value is the vector. Vectors are lists of values and are constructed using the vector function described in section . However, to simplify its use, the following syntactic form has been provided.
Vector::= [ [SimpleValue ( , SimpleValue) *] ]
Thus a vector is a sequence of values separated by “,” and delimited by “[ ]”. If no value is provided within the vector, an empty vector is returned. Vectors may be nested to produce vectors of vectors. Example uses of vectors are:
v1 [1,2,3];
v2 [9,8,7];
v3 [v1, v2]; // same as [[1,2,3],[9,8,7]]
Note that there are two syntaxes for vectors. The one given here provides the ability to embed references and therefore requires a degree of processing (known as resolution); it is parsed into the use of the vector function rather than directly into a vector. The other form, using the “[| |]” delimiters, parses directly into a vector and hence may not have references within the definition. The reason for having the simpler form in addition to the more general form is that there are times when the fully processed (resolved) data structures need to be unparsed, and then re-parsed at some future time without further transformation. An example of this is during the signing of a description for security purposes.
Values of Other Classes
The set of values that can be described by the use of the language is limited to a few basic classes and collections of these. It would be useful to be able to include values from other classes in Java. In principle, these can be generated in functions, or in some user-defined phase, and added to the attribute sets. However, this causes problems for SmartFrog, in particular for those aspects of security where descriptions transformed to core form need to be signed, since signing is restricted to the known classes.
Consequently the conversion to the core form ensures that the values represented in the attribute sets, the component descriptions, are limited to these core classes. If other values need to be held within the tree, it is recommended that they are held in serialized form within a ByteArray value. This will need to be deserialized by the component at the time of deployment.
Component Descriptions
Attributes may have values that are collections of other attributes, known as component descriptions. They obtain their name from the fact that they may be interpreted by the framework as the description of a component, though they may equally be used to describe structured data.
A component description consists of two parts, a reference to another component description to act as a source of attributes (the type), and a collection of attributes that are then added to, or override, the attributes of the referenced collection (the body). The syntax is:
Component::= extends [LAZY] Type Body
Type::= [ NULL | BaseReference ]
Body ::= ( ; | { AttributeList } )
The LAZY keyword may be largely ignored; it is merely a tag and only has a semantic effect during the deployment of a SmartFrog application. Extension of an existing LAZY component description does not inherit the tag.
Both the reference and the attribute list are effectively optional. If neither is present, the resultant attribute list is defined to be empty. The syntax is most easily explained through an example:
SFService extends { // an implicit extension of NULL
portNum 4047;
hostname "ahost.smartfrog.org";
administrators ["patrick"];
}
UseableService extends SFService {
// an extension of the previous component
portNum 4048; // override the definition of portNum
users ["fred", "harry"]; // add a new attribute
}
The text consists of two attributes, both of which have values that are collections of attributes. The second of these, UseableService, is defined as an extension of the first, SFService, with two attributes added to or overwriting those inherited. The text is semantically identical to the following:
SFService extends {
portNum 4047;
hostname "ahost.smartfrog.org";
administrators ["patrick"];
}
UseableService extends {
portNum 4048;
hostname "ahost.smartfrog.org";
administrators ["patrick"];
users ["fred", "harry"];
}
Note that the attributes in a component description are ordered and that when an attribute is overwritten it maintains its position, but when it is a new attribute it is added to the end. The process of expansion of the inheritance in this way is known as Type Resolution and is explained further below.
Note also that the parsed stream is considered to be in an implicit, anonymous (i.e. not named in an outer component description), component description known as ROOT.
The example is also shown in the diagram. It clearly shows that there are two kinds of relationship between component descriptions. One is the containment relationship, where a component description contains an attribute that is itself a component description. The second is the inheritance or extension relationship. This second class of relationship is one that can be transformed, by type resolution, to an equivalent one containing no extension (also indicated by the NULL extension).
Whilst the extension relationship is merely a convenient way of defining attributes, the containment hierarchy is a more fundamental construct. It should be noticed that the containment hierarchy effectively provides a naming scheme by which attributes may be referenced. In this it is similar to other such named hierarchies, such as the directory hierarchies common in file systems.
Types vs. Prototypes
SmartFrog does not define types for attributes and components. Rather it defines the notion of a prototype (c.f. the programming language Self). Each attribute whose value is a component description may be considered as a prototype for another: it may be taken and modified as appropriate to provide the value for the new attribute. The mechanism for this is the extends construct.
Any attribute whose value is a component description may be, at a later juncture, selected and modified to provide a new component description to be bound to a name. This new attribute may be further modified by subsequent attributes. In this way, it is possible to provide partial definitions, with default values for attributes, to be completed or specialized when used. This provides a simple template mechanism for components.
Consequently, there are no separate spaces of types and instances; every component is logically an instance, but may also be a prototype for another. However, it is clear that in providing descriptions, some components will be defined with the intention that they be used as prototypes for other components, whilst others will be defined without that expectation. Whilst this may appear strange in the first instance, it turns out to be one of the main strengths of the SmartFrog notation.
References
References may occur in three places in the syntax: as the name of an attribute – known as a placement, as a reference to the extended component (the prototype) of a component description, and as an attribute value referring to another attribute whose value is to be copied – known as a link.
The primary purpose of a reference is to indicate a path through the containment hierarchy defined by the components. In this, it is similar to the notion of path common in the file systems of operating systems such as Linux. A path defines a traversal of the directory hierarchy, a structure similar to the component hierarchy.
The syntax for references is as follows:
Reference::= [LAZY] BaseReference
BaseReference::= ReferencePart ( : ReferencePart)*
ReferencePart::= ROOT
| PARENT
| (ATTRIB WORD)
| (HERE WORD)
| THIS
| WORD
| (PROPERTY WORD)
| (IPROPERTY WORD)
| (HOST (WORD | STRING))
| PROCESS
Thus, a reference is a colon-separated list of parts each of which indicates a step in the path through the containment tree. Examples of references are:
PARENT:PARENT:foo:bar
ATTRIB a:b
ROOT
x
HOST "15.144.56.65":foo:bar
Normally a reference indicates a path through the containment tree to an attribute whose value should be copied, or a component description in which an attribute should be placed. These references are “resolved” during the language processing to eliminate them and to carry out the appropriate copying or placement.
However, occasionally the reference itself is the desired value, or the reference cannot be resolved during language processing as the data referenced is not available until a later stage. Under these circumstances, the keyword “LAZY” is prefixed to the reference to indicate that the reference resolution should be delayed.
The general rule for the interpretation of a reference is that the reference is evaluated in a context (a component description somewhere in the description containment tree), and that each step moves the context to a possibly different component for the remainder of the reference to be evaluated. This is equivalent to path evaluation in a Linux file system: the path is evaluated in a current directory, and each part of the path moves the context to another directory.
The semantics of each of the reference parts is as follows: starting at the component in which the reference is defined…
-
PARENT - move context to the parent (container) component if it exists, fail otherwise (c.f. Linux “..”)
-
HERE WORD - look for the attribute named “word” in the current context, fail otherwise
-
ATTRIB WORD - look for the attribute named “word” in the current context or anywhere in the containment hierarchy (the closest is chosen), move to the context defined by this attribute, fail if no attribute is found in the containment hierarchy
-
ROOT - switch context to the outer-most component, normally the implicit root component (c.f. Linux “/”)
-
THIS – keep the context the same, don't switch (c.f. Linux “.”)
-
WORD – the interpretation of the WORD depends on the location. If it is the only part in the reference, or the first part, it is interpreted as ATTRIB. If it is the second or later part of a reference it is interpreted as HERE.
Some examples of references (in this case link references) are as follows:
The arrows in the left-hand text show the path followed as the references are resolved to obtain the referenced attribute values, noting that the resolution of ref3 will follow the resolution of ref2. The contexts traversed as the resolutions progress are shown boxed and the right-hand text shows the result of resolving the three links.
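The rules for the structural reference parts can also be illustrated by a small description. The following is a sketch written for this purpose, not a reproduction of the figure; the attribute names are invented, and the values in the comments are those that follow from the rules as stated above.

```
sfConfig extends {
    a 1;
    inner extends {
        b 2;
        ref1 a;                      // first part is a WORD, so ATTRIB a:
                                     // not in inner, found in sfConfig, value 1
        ref2 PARENT:a;               // up to sfConfig, then HERE a, value 1
        ref3 HERE b;                 // b in the current context, value 2
        ref4 THIS:b;                 // stay in inner, then HERE b, value 2
        ref5 ROOT:sfConfig:inner:b;  // from the root downwards, value 2
    }
}
```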
In addition to these structural reference parts, there are four others that are not appropriate for all circumstances and are not related to the containment hierarchy. These are
-
PROPERTY WORD – return the value of the Java system property named WORD. It may only occur at the end of a reference, and only in a link. Syntactically it may occur anywhere; however, the remainder of the link is ignored. It is usually used in conjunction with LAZY. Without LAZY, the value of the property at the time of parsing will be used; with LAZY, the application run-time value of the property will be used when the link is resolved – see section 3. The property is always a string.
-
IPROPERTY WORD – as for PROPERTY, but the property is interpreted as indicating an integer which is parsed and returned as such.
-
HOST (WORD | STRING) – switch to the context of the process compound on the host named WORD (or STRING, which must be used if supplying an IP address, but may also be used with a host name). This reference part is also used in conjunction with the LAZY keyword and only in links. It is used to provide a naming service for applications within the SmartFrog system. Again, without LAZY the parser will look up the value in the remote process compound; with LAZY this will be done at run-time when the link is resolved – see section 3.
-
PROCESS – switch to the context of the process compound of the current process. This is also used in conjunction with the LAZY keyword and only in links. It is used to provide a naming service for applications within the SmartFrog system. Again, without LAZY the parser will look up the value in the current process compound at the time of parsing; with LAZY this will be done at run-time when the link is resolved – see section 3.
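A sketch of how these four parts might appear in links is given below. The property names userName and servicePort, and the paths foo:bar and foo, are hypothetical and serve only to show the syntactic shape described above.

```
sfConfig extends {
    user   LAZY PROPERTY userName;            // string property, read at run-time
    port   LAZY IPROPERTY servicePort;        // as above, but parsed as an integer
    remote LAZY HOST "15.144.56.65":foo:bar;  // STRING form, since an IP address is given
    local  LAZY PROCESS:foo;                  // via the process compound of the current process
}
```

Without the LAZY keyword, each of these links would instead be looked up at the time of parsing, as described in the list above.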
The above rules determine the general interpretation of references. However, each of the syntactic contexts has its own slight semantic variation; these variations appear in the detailed definition of the semantics for references.
Reference Elimination – Resolution
The key to the semantics of the SmartFrog notation is the process by which references are eliminated. This is necessary for each of the three syntactic locations where references may occur – prototype references, placement references and link references. The process by which references are eliminated is known as reference resolution. However, each type of reference has a different notion of resolution and so each has a specific resolution action – known respectively as type resolution, placement resolution and link resolution. The last of these was historically known as deployment resolution; this old name appears in parts of the API and is kept for backward compatibility. The resolution steps are described in more detail in the next few sub-sections, and then revisited as a whole to examine their interaction with each other.
Prototype References
References to prototypes, as defined in the following syntactic context,
Component ::= extends [LAZY] BaseComponent
BaseComponent ::= [Reference] ( ; | { AttributeList } )
are resolved as described above. In particular, if the reference to the prototype consists of a single WORD part, it is interpreted as ATTRIB WORD in the usual way.
Thus, the following are equivalent
Foo extends Bar { …}
Foo extends ATTRIB Bar {…}
This provides a greater degree of convenience when referring to a prototype, as prototypes are most often defined in the outermost implicit root context, and frequently in an included file. Interpreting the word as ATTRIB, rather than adding an implicit ROOT reference part to the front, ensures that global definitions of prototypes at the top level may be locally overridden if required.
The following example demonstrates most of the situations:
Foo extends { a 1; }
Bar extends {
foo extends Foo;
}
Baz extends {
Foo extends {
b 2;
}
foo1 extends Foo; // recall - this is equivalent to ATTRIB Foo
foo2 extends ROOT:Foo;
foo3 extends PARENT:Foo;
foo4 extends PARENT:PARENT:Foo;
}
After type resolution, which includes the merging and overwrite of attributes as described in section 2.2.2, the example is equivalent to:
Foo extends { a 1; }
Bar extends {
foo extends { a 1; } // ATTRIB Foo finds the outermost
}
Baz extends {
Foo extends { b 2; }
foo1 extends { b 2; } // ATTRIB Foo finds the closest enclosing
foo2 extends { a 1; } // ROOT:Foo finds the one in the root
foo3 extends { b 2; } // PARENT:Foo finds that in the parent
foo4 extends { a 1; } // PARENT:PARENT:Foo finds that in
// the root (in this case)
}
Placement References
An attribute’s name may be a reference, as described in the syntactic clauses
Attribute ::= Name Value
Name ::= BaseReference
This is not completely accurate: the syntax in fact limits a name to a reference containing only WORD parts; the other reference parts are considered erroneous.
The resolution of the reference is again largely as described above, with the following modification.
The last reference part of the reference must be a WORD and is treated differently. This word part is not strictly part of the reference, but is used to identify the name of an attribute that is to be created (as opposed to referenced) in the context of the prefix part of the name reference. Thus in the attribute definition
foo:baz:bar 42;
the foo:baz is a reference to a location, and bar is the name of the attribute to be created in that context.
In most cases, the name consists only of that final WORD leaving the prefix reference empty, indicating the current context. Thus, the attribute is defined in that current context. Where a non-empty reference prefixes the final word, the reference is used to determine the appropriate context and the attribute with the given name is placed into that context.
Consider the example
Service extends {
portNum 4089;
}
Service:portNum 4074;
Service:hostname "ahost.smartfrog.org";
The prefix reference Service: is de-referenced to indicate the Service attribute. The two prefixed attributes are therefore placed within that reference context, overriding or placed at the end of the context as appropriate. Thus, the example is roughly equivalent to the following (there are some differences in their behaviour as prototypes):
Service extends {
portNum 4074;
hostname "ahost.smartfrog.org";
}
The act of placing the attributes into a location is known as placement resolution, and it occurs simultaneously with the removal of the reference-prefixed attribute from its defining context.
Placement of attributes can lead to a great deal of confusion if not used properly. It reacts in interesting ways with type resolution; this interaction is explained in the section on resolution.
Link And LAZY Link References
Frequently, attributes need to take on the same values as other attributes. This can be for many reasons:
-
to avoid repetition of values at many points in a description making it easier to maintain that description
-
to hide the structure of the description from a program; explained further in section 3.
-
to provide a means of simple parameterization; explained further in the section .
This association between the value of one attribute and that of another is defined by providing a reference in the place of a value of the attribute. This reference is resolved relative to the context at the point of definition.
Consider the following example, in which a server and a client both need to know the TCP/IP port on which the server will listen.
System extends {
server extends {
portNum 4089;
}
client extends {
portNum ATTRIB server:portNum;
}
}
The system contains a server and a client. The server and client both have an attribute portNum, with that of the client being defined as a link to that of the server.
There is a resolution step, known as link resolution (and occasionally deployment resolution), which replaces references by the values that they reference. During the resolution phase, chains of links are resolved appropriately.
In the above example, the definition of System is equivalent to the following:
System extends {
server extends {
portNum 4089;
}
client extends {
portNum 4089;
}
}
Consequently, both the server and client share the same value, and maintenance is eased in that, should the port number need to be changed, this need happen in only one place in the description.
It is frequently the case that the link itself is required as a value; i.e. the link should not be resolved to the value that it might refer to within the description. This reference may then be used within a SmartFrog application after deployment, for resolution at run-time rather than at the time of parsing the description. The primary use for this is described in section 3.
In order to provide a reference value, rather than have it resolved to the value of another attribute during link resolution, the keyword LAZY may be prefixed to the link to indicate that the link resolution should not resolve the link. An example of this is:
System extends {
server extends {
foo 42;
}
client extends {
myServer LAZY ATTRIB server;
}
}
In this case, the client’s attribute myServer is a reference to the server, not a copy of the server component. As is, resolution will have no effect, as the link will be left to be the attribute value. If the keyword LAZY had not been present, the following would have been the result of resolution:
System extends {
server extends {
foo 42;
}
client extends {
myServer extends {
foo 42;
}
}
}
The word LAZY is an indication that it will be resolved at run-time – so far as the notation is concerned, this means that the link is the value.
Comments
The SmartFrog notation follows most modern languages in providing both end-of-line comments and multi-line bounded comments. The syntax for these is identical to that of Java, namely
-
// this is a comment to the end of the line
-
/* this is a comment which is terminated by */
Include Files
A stream of text may reference include files at certain points in that text. Unlike a C include file, though, the include file is not merely textually embedded into the original stream. Rather the include file is itself parsed (and must be syntactically correct) as a stream in its own right. Every stream must parse as a collection of attribute definitions, and this is equally true of the include files.
Include files may only be used within attribute lists (i.e. at the top level or within a component definition). The collection of attributes from the include file is simply added to the attribute list being parsed in the container stream.
Consider the following example:
-
file foo.sf contains:
foo extends {
a 42;
}
-
the primary stream is:
#include "foo.sf"
system extends {
myFoo extends foo;
#include "foo.sf"
}
After the parsing is complete (but before type resolution), the following is obtained:
foo extends {
a 42;
}
system extends {
myFoo extends foo;
foo extends {
a 42;
}
}
It should be noted that because includes may occur within other component descriptions, this may be used as a naming mechanism to prevent clashes of attribute name within multiple include files. Consider
-
file foo1.sf contains
foo extends { a 42; }
-
file foo2.sf contains
foo extends { b 42; }
-
the primary stream contains
foo1 extends { #include "foo1.sf" }
foo2 extends { #include "foo2.sf" }
sfConfig extends {
bar extends ATTRIB foo1:foo;
baz extends ATTRIB foo2:foo;
}
If the includes had not been buried within separately named components, but both had been included into the top level, only the second of the two mentioned foo attributes would have been available for extension. The second would override the first.
sfConfig
A stream contains a whole collection of attributes at the top level. Most are merely there to act as building blocks – prototypes for building others. Typically, there is only a single attribute that is the essence of the description – that which describes the desired configuration and is not merely a building block on the way. By convention in SmartFrog, the reserved attribute name sfConfig defines this special attribute and all the tools provided respect this convention.
Thus, when a stream is parsed to an attribute set, the top-level attribute sfConfig defines the system; the rest are ignored, apart from providing definitions for extensions and other resolutions. This is equivalent to the Java language use of the “special” method main(…) to indicate the entry point to a program. The entry point to a configuration description is sfConfig.
Thus in the following example, the attributes def1, def2 and def3 are only present for the purposes of defining sfConfig, and it is only this last attribute that represents the actual configuration description.
def1 extends Prim {…}
def2 extends Compound {
foo extends Prim {…}
bar extends Prim {…}
}
def3 extends Prim {…}
sfConfig extends Compound {
d1 extends def1;
d2 extends def2;
d3 extends def3;
}
Resolution – Semantics For The SmartFrog Notation
Resolution is the process by which the raw SmartFrog definitions, with their extensions, placements and links, are turned into the set of attributes that they semantically represent.
In addition to these three steps, there are other steps (phases) in the complete semantic manipulation of the SmartFrog notation, such as function resolution, predicate checking and any user-defined phases. These are described in separate sections, as they are less central to understanding the language.
There are two ways of representing the semantics, both roughly equivalent.
-
By defining how the value of an attribute identified by a reference is obtained from a description; defining the semantics by providing a function from reference to value for all possible references.
-
By defining a set of transformation rules that eliminate the complexity of the typing (by expansion), placement (by relocation) and linking (by value copy), resulting in a normalized form of a description containing merely a hierarchical set of attribute lists.
Either of these two forms of semantic definition would do; however, the definition of the semantics through transformation has a distinct advantage: these transformations are required in practice and hence are implemented within the SmartFrog system. Thus, an understanding of these transformations is essential to the use of SmartFrog.
The three transformation steps are known in SmartFrog as resolution steps. These are respectively type resolution, placement resolution and link resolution. They are carried out in that order: first the types are expanded, then attributes placed into the correct context from the context in which they were defined, and finally links are resolved.
It should be noted that the entire description is type and place resolved, but only the top-level sfConfig attribute is normally link resolved. In general if the other top-level attributes are link resolved, errors will occur; they are only present to be available as prototypes. Further, unnecessary work will have been done.
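The following sketch suggests why link-resolving the other top-level attributes can fail. The names Service, basePort and service are hypothetical, and the outcome in the comments is what the rules stated in this manual imply rather than recorded output.

```
Service extends {
    portNum PARENT:basePort;   // link: only meaningful once Service is
                               // used inside a container defining basePort
}
sfConfig extends {
    basePort 4000;
    service extends Service;
}

// Link-resolving sfConfig: the copied link PARENT:basePort is evaluated
// from sfConfig:service, so PARENT is sfConfig and the value found is 4000.
//
// Link-resolving the top-level Service: PARENT is the root, which holds
// no basePort attribute, so the link could not be resolved.
```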
The algorithms defined here for the transformations are the result of much empirical experimentation. Some alternative transformation algorithms produce more regular semantics, others are more efficient. However, those presented here are a balance between performance and semantic simplicity. They provide a great deal of control over the semantics of the resolution process.
Type Resolution
Type resolution is the expansion of the prototype reference optionally provided in the extends part of a component description. The syntactic form for a component description is roughly
name extends Reference { AttributeList }
The reference refers to a prototype that is to be extended by the attributes in the provided attribute list. Type resolution is a depth-first pass over the root component description, visiting the attributes in the order of their definition. For each component description visited, it consists of:
-
Copying the prototype indicated by the reference, creating a new component description
-
Replacing the attribute values of the new component description also mentioned in the attribute list (i.e. the value, but not the order, changes)
-
Adding the remaining attributes at the end of the new component description.
-
Type-resolving each of the component description’s attributes if they are component descriptions.
If the prototype reference indicates a component description that is not yet resolved, it resolves it first before copying: i.e. each type resolution is carried out with respect to the location where the prototype is defined. The other point to note is that if the reference is only a word, it is interpreted as ATTRIB word for the purposes of locating the prototype for the component description.
If, at the end of the process, one or more component descriptions have failed to resolve, in that their prototypes cannot be found, the whole resolution process ceases and an exception is thrown indicating the missing prototypes and the locations at which they are referenced.
Note that any references that may be copied as part of the extension process are not modified. Hence, copied placements are now relative to the new location and copied links similarly. Prototype references are never copied since a prototype is always resolved before copy.
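The copy, replace and append steps, together with the rule that an unresolved prototype is resolved before it is copied, can be traced on a small chain of extensions. This is an illustrative sketch with invented names; the resolved form in the comments is derived by applying the steps above by hand.

```
A extends B { z 30; x 1; }   // B is not yet resolved when A is visited
B extends C { y 2; }
C extends { z 3; }

// Resolving A first requires B, which in turn requires C:
//   C extends { z 3; }
//   B extends { z 3; y 2; }         // copy of C, with y added at the end
//   A extends { z 30; y 2; x 1; }   // copy of B: z replaced in place,
//                                   // x added at the end
```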
Placement Resolution
Placement resolution is the process by which the attributes are placed into the correct location. Attributes are named, and this name may contain a reference to a component description as well as the name by which it is to be known in that component description. If the reference is not present, the attribute is assumed to be in the correct component description as defined.
Thus in the example attribute declaration:
foo:bar:baz 42;
The foo:bar: defines the target component description, and baz defines the name for the attribute in that component description.
Placement resolution is the transformation process that results in the attribute definitions being removed from their point of definition and placed in the target component descriptions. The process is multi-pass. Each pass:
-
traverses the component description hierarchy depth first, visiting the attributes in the order of definition (as determined by type resolution)
-
examines each attribute visited: if it should be placed elsewhere, an attempt is made to do so; if the attempt fails, the attribute is left as is.
The pass is repeated until one of the following occurs:
-
there are no placements left to transform
-
no placements have been successfully carried out, and at least one placement has failed
In the first case, placement resolution has completed successfully; in the second it has not, and an error is generated.
To see why multiple passes are necessary, consider the following:
foo extends {
a 21;
}
foo:bar:a 42;
foo:bar extends { b 34; }
In the first pass, the attribute foo:bar:a is first to be placed, but it fails since foo does not yet contain foo:bar as a component description. Also in the first pass, but later since it is defined later, foo:bar is placed, giving
foo extends {
a 21;
bar extends { b 34; }
}
foo:bar:a 42;
This leaves a placement incomplete so a second pass is required. This time it succeeds, resulting in
foo extends {
a 21;
bar extends {
b 34;
a 42;
}
}
This order dependency does not have much of an effect, except for when two identically named attributes are placed into the same component description. At this point understanding the order of resolution becomes important.
Since placement resolution is carried out after type resolution, the following consequences should be noted:
-
As type resolution is carried out before placement, attributes placed into a prototype will not be inherited by those extending the prototype.
-
Again, as type resolution is carried out before placement, do not place an attribute that is to be used as a super-type; it will not be found.
-
Wherever possible, placement should be restricted to referencing downwards into a structure from the point of attribute definition. Descriptions can be very hard to understand if PARENT, ROOT or ATTRIB are used in a placement reference; this is particularly so within a component description to be used as a type. As a consequence, this release of SmartFrog does not permit these reference parts to be used in a placement.
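The first of these consequences can be seen in a small sketch. The attribute names are invented for the illustration, and the result in the comments follows from the stated ordering of the resolution steps.

```
Service extends {
    portNum 4089;
}
Service:hostname "ahost.smartfrog.org";   // placed into the prototype
myService extends Service;

// Type resolution copies Service before the placement is carried out,
// so the result would be equivalent to:
//   Service   extends { portNum 4089; hostname "ahost.smartfrog.org"; }
//   myService extends { portNum 4089; }
```

To give myService the hostname attribute as well, the placement would have to target myService itself (for example myService:hostname), or the attribute would have to be defined in the body of Service.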
The reason why type resolution is done before placement resolution is that the normal use for placement is to “fill-in” empty “attribute slots” in a prototype. As each instance of the prototype will in general need differently filled slots, placement must be done after the type has been resolved for each instance.
Note that placement of attributes whose values are links does not modify the links to correct for the new location. Thus, links are resolved with respect to where they are placed, not where they are defined.
Link Resolution
Link resolution is the most straightforward of the three forms of resolution; all links are resolved in their location after type and place resolution, and the referenced value replaces the link as the value of the attribute. There are a number of points to note:
-
Only links that are not LAZY are resolved; those that are LAZY are left unresolved with the link itself being the value.
-
If the value of the referenced attribute is itself a link, that link is resolved first and the result of that resolution is used.
-
Links are always resolved in the contexts in which they are located after the type and placement resolution phases are over, not necessarily those in which they were defined.
-
Links referring to an attribute whose value is a LAZY link will leave the LAZY link unchanged, this being the attribute’s value.
-
In resolving a link, the value of the attribute referenced is not copied, but shared, at the original point of definition if this is relevant (e.g. for component descriptions and their parent). Thus any operation that affects the value of this data has an impact on all parts of the tree that share this data. The only operations that affect attribute values in this way are functions (or possibly a user phase).
Sharing has almost no effect on the language semantics unless the data shared is a component description. In this case the parent of the data remains that of the location of definition. This has an impact on how links within that component description are resolved, using the original parent, and not relative to the context in which the link was defined.
An explanation of the consequences of sharing is given in section REF.
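The points above concerning chains of links and LAZY links can be illustrated as follows. This is a minimal sketch with invented attribute names; the values in the comments are those implied by the rules listed above.

```
sfConfig extends {
    a 42;
    b a;         // plain link: resolved to 42
    c b;         // link to a link: b is resolved first, so c is also 42
    d LAZY a;    // LAZY link: left unresolved, the link itself is the value
    e d;         // link to an attribute whose value is a LAZY link:
                 // e takes that LAZY link, unchanged, as its value
}
```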
The Difference Between Types and Links
On the surface, there are many similarities between the definitions of x and y in:
Foo extends {
a 10;
}
x extends Foo;
y Foo;
Both appear to end up with the definition of a component description containing a.
One obvious difference is that, since type resolution and link resolution occur on either side of placement resolution, a placement into Foo will affect y but not x. However, there are more subtle differences to do with the sharing of data with links, rather than the copying of data with extends. Consider the following example:
data 1;
Foo extends {
a data;
}
example extends {
data 100;
x extends Foo;
y Foo;
}
In this definition, example:x:a has the value 100, whereas example:y:a has the value 1. The reason for this discrepancy is that the extends copies the definition of Foo and the following link resolution for data is done relative to the copy's location. The link, on the other hand, simply links to the definition of Foo in its existing position, and there the value of data on resolution is 1.
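The distinction can be modelled in a few lines of Java. The following is only a sketch under stated assumptions: Link, Desc and CopyVersusLink are hypothetical names invented for illustration and are not SmartFrog classes, links are followed on demand rather than replaced in place, and copy() is shallow. It reproduces the example above, with x built as a copy parented in example and y sharing the original Foo.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: Link, Desc and CopyVersusLink are hypothetical, not SmartFrog API.
final class Link {
    final String target;
    Link(String target) { this.target = target; }
}

class Desc {
    Desc parent; // enclosing description; null at the root
    final Map<String, Object> attrs = new LinkedHashMap<>();

    // Define an attribute here; a description defined here is parented here.
    Desc define(String name, Object value) {
        if (value instanceof Desc) ((Desc) value).parent = this;
        attrs.put(name, value);
        return this;
    }

    // Model of a resolved link: the value is shared and keeps its original parent.
    Desc share(String name, Object value) {
        attrs.put(name, value);
        return this;
    }

    // Model of "extends": a fresh description with the same attribute definitions
    // (shallow: nested descriptions are not handled in this sketch).
    Desc copy() {
        Desc c = new Desc();
        c.attrs.putAll(attrs);
        return c;
    }

    // Search this description, then each enclosing one, for a name.
    Object lookup(String name) {
        for (Desc d = this; d != null; d = d.parent) {
            if (d.attrs.containsKey(name)) return d.attrs.get(name);
        }
        throw new IllegalStateException("unresolved link: " + name);
    }

    // Value of an attribute, following a link relative to this description.
    Object resolve(String name) {
        Object v = attrs.get(name);
        return (v instanceof Link) ? lookup(((Link) v).target) : v;
    }
}

class CopyVersusLink {
    static Desc build() {
        Desc root = new Desc().define("data", 1);
        Desc foo = new Desc().define("a", new Link("data"));
        root.define("Foo", foo);
        Desc example = new Desc().define("data", 100);
        root.define("example", example);
        example.define("x", foo.copy()); // x extends Foo;
        example.share("y", foo);         // y Foo;
        return root;
    }
}
```

In this model, resolving a through x searches example first and finds 100, while resolving a through y searches from the original Foo and finds the root value 1.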
The difference can also be highlighted using a function such as next, which returns a different value at each use. Consider the following description:
#include "org/smartfrog/functions.sf"
example extends {
x extends next;
y extends x;
z x;
}
Assuming that this is the first use of next, example:x will have the value 1, example:y will have the value 2, but example:z will have the value 1. This is because it shares the result of the function bound to example:x.
Note that at the very end of the language processing as part of the conversion to the core data model, the sharing is eliminated and each attribute will have its own copy of the value. This is explained in detail in section REF.
Template Parameterization Pattern
When extending a prototype, it is normal to override the values of certain attributes to customize the prototype to its actual use. The simplest way is to extend with the replacement attribute – however this only works for a top-level attribute. Modification of attributes deep in the structure requires the placement of the overriding attribute into the correct context, as in the example:
Service extends {
hostname "localhost";
portNum 4567;
}
ServicePair extends {
service1 extends Service ;
service2 extends Service ;
}
sfConfig extends ServicePair {
// user needs to know structure of ServicePair
service1:hostname "riker.smartfrog.org";
service2:hostname "ackbar.smartfrog.org";
}
This works adequately, and has the advantage that any attribute in the structure may be changed if necessary, but it has the disadvantage that using the ServicePair prototype requires knowledge of its structure. Under normal circumstances, some attributes are expected to change and others are not, so it would be good if the description could be parameterized on the former. However, the normal form of parameterization provided by functions in programming languages is not a good fit for the SmartFrog notation semantics, so the language instead provides a way of hiding the structure of a description and making it easier to override “deep” attributes.
This technique, more of a pattern for the use of links, is shown in the following example:
Service extends {
hostname "localhost"; // default value
portNum 4567;
}
ServicePair extends {
s1Host "localhost"; // provide default value
s2Host "localhost";
service1 extends Service { hostname s1Host; } // lift attribute
service2 extends Service { hostname s2Host; } // ditto
}
sfConfig extends ServicePair {
// user needn’t know structure of ServicePair
s1Host "riker.smartfrog.org";
s2Host "ackbar.smartfrog.org";
}
It is clear that the use of ServicePair requires only the extension with top-level attributes to set the attributes deeply defined in the Service prototype. This pattern, of the use of links lifting an attribute value to one provided in the outermost context, is called the parameterization pattern and is very frequently used.
Note that if a default value for a lifted attribute is not given within the description (in this case ServicePair provides defaults for both the lifted attributes s1Host and s2Host), a deploy resolution error will occur if the parameter is not provided at time of use, since the value to resolve the link will not be found.
Functions and operators
SmartFrog provides users with a small number of predefined functions to improve the expressiveness of the descriptions. In addition, it provides mechanisms by which users may add their own functions, effectively providing an escape mechanism into Java. These functions, whilst not part of the SmartFrog language, are provided for convenience. The mechanism, a special case of a more general phase mechanism, is described in detail in the section on Primary Language Processing.
Functions appear, to the language, as predefined component descriptions that may be extended; the parameters are given as named attributes within the body of that description. For example, a use of the string concatenate function is
#include "/org/smartfrog/functions.sf" // the standard functions
val 42;
myString extends concat {
-- "the meaning of life is ";
-- val;
}
that results in the value of the myString attribute being "the meaning of life is 42". The names of the attributes have no effect in this case, the strings being concatenated in the order of definition, but may be important for some other functions.
Functions are evaluated inner-first, providing for the nesting of function application, and are evaluated after all the other resolution steps have been completed. The definitions are themselves affected by these resolutions. Thus a function may be extended, with the resultant extension also being a function. The current set of predefined functions is given in section 14.
In order to make the use of functions more natural, some syntactic forms are provided that appear to be infix or prefix operators. However, these are simply translated into the relevant template form during parsing. A more complete description of this process is given in section 14.
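Inner-first evaluation can be pictured with a small Java model. This is a sketch only: FnDesc is a hypothetical class, not part of SmartFrog, and the only function it models is a concatenation that joins attribute values in their order of definition. Each nested description is evaluated and replaced by its result before the enclosing function is applied.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: FnDesc is a hypothetical class, not part of SmartFrog.
class FnDesc {
    final boolean isConcat; // true if this description is a use of "concat"
    final Map<String, Object> attrs = new LinkedHashMap<>();

    FnDesc(boolean isConcat) { this.isConcat = isConcat; }

    FnDesc with(String name, Object value) {
        attrs.put(name, value);
        return this;
    }

    // Evaluate a value: plain values are returned unchanged; descriptions have
    // their attributes evaluated first (inner-first), then the function applied.
    static Object evaluate(Object value) {
        if (!(value instanceof FnDesc)) return value;
        FnDesc d = (FnDesc) value;
        for (Map.Entry<String, Object> e : d.attrs.entrySet()) {
            e.setValue(evaluate(e.getValue())); // replace attribute by its result
        }
        if (!d.isConcat) return d; // an ordinary description stays a description
        StringBuilder sb = new StringBuilder();
        for (Object v : d.attrs.values()) sb.append(v); // order of definition
        return sb.toString();
    }
}
```

For instance, evaluating an outer concat whose second attribute is itself a concat first reduces the inner description to a string, and only then joins the outer values.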
Predicates, Assertions and Schemas
It is frequently useful to be able to define a set of well-formedness conditions on the use of a template in order to guarantee that its use is correct. However, this should be done in a way in which all the benefits of template extension are not lost. To this end, an additional phase, similar to that defined for functions, is included which will check predicates defined and attached to a template.
There are three predicate types provided as part of the SmartFrog framework. These are the assertion predicates, schema predicates and the TBD (to be defined) predicates.
The most flexible predefined predicate supplied by the SmartFrog framework is the schema, a description that describes the set of attributes a template should contain. Users may add their own predicate types through a similar escape mechanism to Java provided for functions. Schemas are described in detail in section 15.
Schemas are best described through the use of an example, in this case of a template for a web server component. The example defines a schema for a web server template, and defines the template linked to the schema.
// the definition of schemas
#include "/org/smartfrog/predicates.sf"
WebServerSchema extends Schema {
port extends Integer;
directory extends OptionalString;
}
WebServerTemplate extends Prim {
schema extends WebServerSchema;
port 80; // default value
}
Note that the name for the attribute linking the template to its schema need not be, as in this case, schema. Indeed, a template may have more than one schema attached as attributes, in which case the uses of the templates are checked against all schemas attached. Schemas must extend the base schema template Schema.
Schemas may be extended in the same way as other templates, and their uses may easily be extended through placement as illustrated in the following examples.
// the definition of schemas
#include "/org/smartfrog/predicates.sf"
ThreadedWebServerSchema extends WebServerSchema {
minimumThreads extends Integer;
}
ThreadedWebServerTemplate extends WebServerTemplate {
// overwrite existing schema with extended schema
schema extends ThreadedWebServerSchema;
minimumThreads 7;
}
AlternativeThreadedWebServerTemplate extends WebServerTemplate {
// add to existing schema
schema:minimumThreads extends Integer;
minimumThreads 7;
}
Note that schemas are entirely optional and need be used only if desired. The value of a schema is that it provides a strict definition, and potentially the type, of the attributes of a component, both required and optional. This should make the component easier to work with, and so benefit its users.
Similarly to schemas, assertions are descriptions that are interpreted as a predicate. An assertion consists of a description that contains attributes that should all evaluate to true - any attribute that evaluates to false, or indeed any other value, is considered to be an assertion failure. The names of these boolean attributes are not significant other than as documentation. There is an implicit conjunction (and) between the various assertion attributes given.
An assertion description must extend Assertions, and must be included in the description to which it applies in the same way as a schema.
An example of an assertion is
// the definition of assertion
#include "/org/smartfrog/predicates.sf"
WebServerAssertion extends Assertions {
portValid ((port == 80) || (port == 8080) || (port == 8088));
}
WebServerTemplate extends Prim {
schema extends WebServerSchema;
assert extends WebServerAssertion;
port 80; // default value
}
In the same way that attributes may be added to an existing schema, attributes may also be placed into an “Assertions” description, or more than one “Assertions” may be provided.
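The implicit conjunction over assertion attributes can be sketched in Java as follows. This is an illustrative model rather than the framework's implementation: AssertionCheck is a hypothetical class, and the assertion description is represented simply as a map from attribute names to their already evaluated values. Any value other than true, including a non-boolean one, counts as a failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only: AssertionCheck is hypothetical, not part of SmartFrog.
class AssertionCheck {
    // Names of the attributes that fail: anything other than Boolean true fails.
    static List<String> failures(Map<String, Object> assertionAttrs) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Object> e : assertionAttrs.entrySet()) {
            if (!Boolean.TRUE.equals(e.getValue())) failed.add(e.getKey());
        }
        return failed;
    }

    // The assertion holds only if every attribute is true (implicit "and").
    static boolean holds(Map<String, Object> assertionAttrs) {
        return failures(assertionAttrs).isEmpty();
    }
}
```

Returning the names of the failing attributes, rather than a single boolean, reflects their role as documentation when an assertion fails.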
The TBD predicate is used to indicate that a specific attribute still requires to be assigned a value. If it has not been assigned, and an attempt is made to use it, an appropriate error message is given.
An example of the predicate is as follows:
#include "/org/smartfrog/predicates.sf"
aTemplate extends Prim {
sfClass "org.smartfrog....";
anAttribute TBD;
}
sfConfig extends Compound {
anInstance extends aTemplate;
anotherInstance extends aTemplate {
anAttribute 45;
}
}
Here, the attribute anAttribute of aTemplate is defined as TBD, so any use of the template that does not set this value will generate an error. In the definition of sfConfig, the first use, to define anInstance, is erroneous whereas the second to define anotherInstance is valid.
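One way to picture the check is as a walk over the description that reports every attribute still marked as TBD. The Java below is a sketch under assumptions: TbdCheck and its TBD marker object are hypothetical, nested descriptions are modelled as maps, and all unset attributes are reported in one pass, which may differ from when the framework actually raises its error.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only: TbdCheck and its TBD marker are hypothetical, not SmartFrog API.
class TbdCheck {
    static final Object TBD = new Object(); // stands in for the TBD predicate

    // Collect the paths of attributes still set to TBD, descending into
    // nested descriptions (modelled here as nested maps).
    static List<String> unset(String path, Map<String, Object> desc) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Object> e : desc.entrySet()) {
            String p = path.isEmpty() ? e.getKey() : path + ":" + e.getKey();
            Object v = e.getValue();
            if (v == TBD) {
                out.add(p);
            } else if (v instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) v;
                out.addAll(unset(p, child));
            }
        }
        return out;
    }
}
```

Applied to a model of the sfConfig above, only the path through anInstance is reported, since anotherInstance overrides the attribute with 45.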
The TBD attribute ("To be determined") is a simple substitute for the more rigorous schema declaration. Note that the type of the attribute is not defined, which can be a useful feature.
Mapping to the Core Data Model
The attribute sets produced by the above phases are now simple enough to be mapped into the core data structures supported by the SmartFrog runtime. These data structures do not support extension, placements, functions or predicates – so all these have to be resolved away. Links are supported, but they are considered as values and have no further special meaning – they are all assumed to be LAZY links.
The translation into these core data structures is therefore straight-forward apart from one additional point: the structures produced by the phases can share data, but this is eliminated by copying. If this copying involves Component Descriptions, these are also parented into the part of the tree into which they are being copied.
The reason for this sharing elimination is to do with the semantics of the distributed system. Whilst all the data is local it could make sense to share data as it is more efficient, although care has to be taken when data is changed behind the scenes with side-effects on other parts of the tree. However, when parts of the tree get mapped to different processes during deployment, the data has to be copied and the sharing broken in any case. To ensure a common semantics between local and remote deployments, separate copies are taken at all times.
This sharing elimination is illustrated by the following diagram. Note that the parent link back from the foo attribute's data only exists if the attribute is itself an attribute set (a component description).
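In code terms, sharing elimination amounts to a deep copy in which every nested description is re-parented at its new location. The following Java is a minimal sketch, assuming an acyclic tree: TreeNode is a hypothetical class, not the SmartFrog ComponentDescription, and simple values such as numbers and strings are treated as immutable and therefore left uncopied.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: TreeNode is hypothetical, not the SmartFrog ComponentDescription.
class TreeNode {
    TreeNode parent; // null at the root
    final Map<String, Object> attrs = new LinkedHashMap<>();

    // Store a value without changing its parent, so a node can be shared.
    TreeNode put(String name, Object value) {
        attrs.put(name, value);
        return this;
    }

    // Copy this node so that no nested node is shared; each nested copy is
    // parented at the place into which it is copied. Assumes no cycles.
    TreeNode unshare(TreeNode newParent) {
        TreeNode copy = new TreeNode();
        copy.parent = newParent;
        for (Map.Entry<String, Object> e : attrs.entrySet()) {
            Object v = e.getValue();
            copy.attrs.put(e.getKey(),
                    (v instanceof TreeNode) ? ((TreeNode) v).unshare(copy) : v);
        }
        return copy;
    }
}
```

If two attributes of the root refer to the same node before the copy, they refer to two distinct nodes afterwards, each with the new root as its parent.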
Primary Language Processing
Phases are a way of transforming the SmartFrog parse tree into the final form ready for deployment (or other purpose). Each phase is a pass over the component description hierarchy carrying out an action controlled, in the case of user-defined phases, by attributes defined within the descriptions.
Under normal circumstances users will not need to know about phases or how to modify or adapt them; the default collection of phases is already correct for most purposes.
The predefined phases for the default language are as follows:
-
type – carry out type resolution on the component description hierarchy; this is predefined and does not rely on attributes in the tree to trigger it.
-
place – carry out place resolution on the component description hierarchy; this is predefined and does not rely on attributes in the tree to trigger it.
-
link – carry out link resolution on the component description hierarchy; this is predefined and does not rely on attributes in the tree to trigger it.
-
sfConfig – not really a phase, rather it controls where the phases are applied. Its effect is that for the remaining phases in the current phase list, they are only applied to the sfConfig attribute.
-
print – again, not really a phase, but it triggers the printing of the tree to the standard output. This provides a debugging mechanism as it can be placed between any other phases to view the intermediate state of the tree.
-
function – in reality a user-defined phase, but one which is provided by default. It causes all the functions to be evaluated. It is triggered in the same way as the other user-defined phases, by the occurrence of attributes with the name phase.function.
-
predicate – also a user-defined phase which is provided by default. It causes all predicates to be checked and errors reported. The schema mechanism is an instance of the use of the predicate phase, though others may be added by users. The phase is triggered in the same way as other user-defined phases, by the occurrence of the attributes with the name phase.predicate.
Phases are triggered in a specific order, as determined by the top-level attribute phaseList. If the attribute is not present, it is as though the attribute were defined as follows:
phaseList ["type", "place", "sfConfig", "link", "function", "predicate"];
This default definition provides the semantics described in section 1.
In addition to the pre-defined phases, a user may introduce their own. User phases are defined as follows:
-
A class must be created which implements the interface PhaseAction in package org.smartfrog.sfcore.parser. The interface is fully defined in the Javadoc, but in summary, it provides two methods:
-
forComponent – which initializes the instance of the action with the component description on which it is to operate
-
doit – which triggers the action of the phase.
-
In whichever component description the action must take place, an attribute whose name starts with the string phase.nnn must be provided, set to the string containing the class name, where nnn is the desired name of the phase.
-
The phaseList attribute must be set at the top level of the description, containing the phase name nnn at the appropriate point relative to the other phases. It is recommended that this is placed after all the standard resolution phases, though occasionally it may be necessary to place the phase earlier.
There are a few points to notice. Firstly, the descriptions are traversed depth-first so the inner descriptions are visited before the outer. This makes sense for functions, for example, that are evaluated from the inside. The second point is that the action is independent of the phase, in that the attribute name determines the phase; the action is determined by the attribute value. Thus, it is possible for the same action to be used in two different phases, and for different actions to be invoked in the same phase – as is the case with all functions. It is also possible to have more than one action for each phase in a component description since the attribute name merely needs to start with the phase.nnn string so several may be provided.
Note that both the phaseList attribute and the phase.nnn attributes are removed from the description after the action is invoked.
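The traversal rules above can be sketched as a small driver in Java. This is a toy model under stated assumptions: PhaseDesc and PhaseRunner are hypothetical classes, a registry of named actions stands in for loading a PhaseAction class by name, and the predefined phases (type, place, link and so on) are not modelled, so naming them in the phase list has no effect here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy model only: PhaseDesc and PhaseRunner are hypothetical, not SmartFrog API.
class PhaseDesc {
    final String name;
    final Map<String, Object> attrs = new LinkedHashMap<>();
    PhaseDesc(String name) { this.name = name; }
}

class PhaseRunner {
    // Stands in for instantiating the action class named by the attribute value.
    final Map<String, Consumer<PhaseDesc>> actions = new HashMap<>();

    // Run the phases in the order given by the phase list.
    void run(PhaseDesc root, List<String> phaseList) {
        for (String phase : phaseList) visit(root, "phase." + phase);
    }

    private void visit(PhaseDesc d, String prefix) {
        // Depth-first: inner descriptions are visited before the outer one.
        for (Object v : new ArrayList<>(d.attrs.values())) {
            if (v instanceof PhaseDesc) visit((PhaseDesc) v, prefix);
        }
        // Every attribute whose name starts with the prefix triggers an action.
        for (String key : new ArrayList<>(d.attrs.keySet())) {
            if (!key.startsWith(prefix)) continue;
            Object actionName = d.attrs.get(key);
            Consumer<PhaseDesc> action = actions.get(actionName);
            if (action == null) {
                throw new IllegalStateException("unknown action: " + actionName);
            }
            action.accept(d);
            d.attrs.remove(key); // the phase attribute is removed after use
        }
    }
}
```

Because the attribute name selects the phase and the attribute value selects the action, the same registered action can be attached to several phases, and several attributes sharing the phase prefix can attach different actions to one description.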
Consider the following example. A class is provided that adds the sfProcessHost attribute (used to determine on which host a component should be deployed) to a component description, based on the value of an attribute sfLogicalHost. It maps the logical host to the physical host in some way not defined here – say by using the method mapHost.
The class might be defined as follows:
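package org.smartfrog.example;

import org.smartfrog.sfcore.componentdescription.ComponentDescription;
import org.smartfrog.sfcore.parser.PhaseAction; // package as stated above
import org.smartfrog.sfcore.reference.Reference;

// Phase action that derives sfProcessHost from sfLogicalHost.
// Exception handling is omitted for brevity.
class MapHost implements PhaseAction {
ComponentDescription cmp = null;
// remember the description on which this action operates
public void forComponent (ComponentDescription c) {
cmp = c;
}
// resolve the logical host and add the mapped physical host
public void doit() {
String logicalHost = (String) cmp.sfResolve(
Reference.fromString("sfLogicalHost"));
cmp.addAttribute("sfProcessHost", mapHost(logicalHost));
}
private String mapHost(String logical) { … }
}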
This class may then be used in a description, to be acted on in the phase mapHosts, as follows
phaseList ["type", "place", "sfConfig", "link",
"function", "predicate", "mapHosts"];
MappedCompound extends Compound {
phase.mapHosts "org.smartfrog.example.MapHost";
}
sfConfig extends MappedCompound {
sfLogicalHost "databaseHost";
component1 extends Prim { …}
component2 extends Prim { … }
}
The phase list adds the mapping phase to the end, providing for the host mapping. The MappedCompound, when used, carries its phase attribute with it. Consequently, it is now contained within sfConfig. Thus during that last phase, sfConfig will be mapped to the correct physical host.
Functions
Functions are evaluated during a predefined phase, named function, with the effect that an attribute obtains the value of the evaluated function. To make new functions easier to write, a predefined abstract PhaseAction called BaseFunction is provided in the package org.smartfrog.sfcore.languages.sf.functions.
New functions should extend the class BaseFunction and provide the method doFunction(), returning the result of the function as an Object. Any attribute may be accessed during the evaluation process.
BaseFunction is documented in the Javadoc and predefined functions are documented in section 14.
Predicates
Predicates are evaluated during a predefined phase, named predicate, with the effect that the associated predicate class is evaluated and any errors are reported to the user by throwing an appropriate exception. Most predicates will be instances of Schema; however, users may define their own. To make user-defined predicates easier to write, a class BasePredicate is provided in the package org.smartfrog.sfcore.languages.sf.predicates.
New predicates should extend the class BasePredicate and provide the method doPredicate(), throwing the exception
SmartFrogCompileResolutionException
if there is an error. Any attribute may be accessed during the predicate evaluation.
BasePredicate is documented in the Javadoc and the predefined predicate Schema is documented in section 15.
Programming with the Parser
Background
The SmartFrog framework is designed to support a range of possible languages to define configurations for the deployment engine to instantiate. The languages are all required to follow a common model for their processing, and to eventually produce data structures that are suitable for the deployment system. The default language is the base SmartFrog language defined above, which uses the file extension “.sf”.
The first stage of language processing is the parser – a tool for turning text into data structures for further processing. The parser interface allows programmers to select the parser by the language type of the file (as defined by its file extension), by direct selection, or simply by using the default (sf) parser.
After parsing, the data structures produced must implement an interface for driving the remaining resolution phases. This interface is
org.smartfrog.sfcore.parser.Phases
Following the invocation of the various phases, the data is converted into a hierarchy of data supporting the ComponentDescription interface, which may then be passed to the deployment system.
Using this model, it is reasonably easy to define a new language and integrate it into the system. The default SF language is the first such, but others, such as XML-based languages or the more advanced SF2 language currently under development, are also possible.
The remainder of this section describes how to invoke the parser, how to step the language data structures through the various processing phases, and finally the nature of the resultant ComponentDescription data structures.
Summary of Language Processing
All of the tools provided with the SmartFrog system handle a SmartFrog text in an identical way to produce a fully resolved deployable description. The process is basically:
-
parse the text stream to produce hierarchical data structures
-
carry out all the phases, which for the default primary language are
-
type resolve the root
-
place resolve the root
-
extract attribute sfConfig from the root
-
link resolve sfConfig
-
evaluate any functions in sfConfig
-
check predicates and schemas in sfConfig
-
convert to standard data model, creating simple normalised attribute tree
The Parser
The SmartFrog parser is implemented as a Java class with a method to parse an InputStream, producing an instance of the class ComponentDescription, the Java class that represents the parsed text and allows programmatic manipulation of the information. Any InputStream may be used; thus the parser may be invoked on a String, a File, a URL, or indeed any object that provides a stream model.
During parsing, a number of include files or URLs

