ObjexxFCL 3.0
 

Developers Guide

This guide contains supplementary information for project developers who wish to understand the design and inner workings of the ObjexxFCL, and for developers whose ObjexxFCL license allows modification of the ObjexxFCL code.

ObjexxFCL has been modernized and extended relative to earlier releases. In particular the FArray class template hierarchy is notably more complex and subtle. For this reason care should be exercised before changing the ObjexxFCL code.

The Users guide is a prerequisite for this guide.


ObjexxFCL Organization

The ObjexxFCL is organized into the following source modules:

Module                     Description
ObjexxFCL                  ObjexxFCL declarations
ObjexxFCL.Project          ObjexxFCL Project-specific declarations
byte                       Single-byte integer
ubyte                      Single-byte unsigned integer
CArray                     C-style array wrapper
CArrayP                    C-style array wrapper/proxy
ChunkVector                Chunk-contiguous 1D vector class template
Chunk                      ChunkVector 1D Chunk vector class template
ChunkExponent              ChunkVector exponent wrapper class
FArray                     FArray abstract base class template
FArray.all                 All-dimension FArray master wrapper
FArray.io                  All-dimension FArray stream output master wrapper
FArrayN                    ND FArray abstract base class template
FArrayND                   ND Real FArray class template
FArrayNP                   ND Proxy FArray class template
FArrayNA                   ND Argument FArray class template
FArrayN.all                ND FArray master wrapper
KeyFArrayND                ND Key-indexed FArray class template
FArrayN.io                 ND FArray stream output
FArrayInitializer          FArray initializer class template
FArraySection              FArray section class template
FArrayTraits               FArray type traits class template and specializations
IndexRange                 Index range class
StaticIndexRange           Static index range class
DynamicIndexRange          Dynamic index range class
InitializerSentinel        Array initializer sentinel class
ProxySentinel              Proxy array sentinel class
Star                       Assumed-size upper index sentinel class
Dimension                  Dimensional parameter class
DimensionExpression        DimensionExpression hierarchy base class
DimensionExpressionCon     Constant-valued DimensionExpression class
DimensionExpressionSum     Sum DimensionExpression class
DimensionExpressionSub     Subtraction DimensionExpression class
DimensionExpressionMul     Multiplication DimensionExpression class
DimensionExpressionDiv     Division DimensionExpression class
DimensionExpressionMin     Minimum binary function DimensionExpression class
DimensionExpressionMax     Maximum binary function DimensionExpression class
DimensionExpressionPow     Power binary function DimensionExpression class
DimensionExpressionSquare  Square unary function DimensionExpression class
DimensionExpressionCube    Cube unary function DimensionExpression class
DimensionExpressions       DimensionExpression operators and master wrapper
Observer                   Combined Subject-Observer base class
ObserverSingle             Single Observer class
ObserverMulti              Multiple Observer class
ObserverGraph              Observer dependency graph class
ObserverMediator           Observer mediator namespace/functions
Fstring                    Fstring classes/functions
Cstring                    C-style string (char*) memory managed wrapper class
string.functions           Useful std::string functions
char.functions             Useful char functions
format                     C++ stream formatted input/output
Fmath                      Math intrinsics/other functions
rvalue_cast                rvalue cast to reference function template
array.iterator             C array begin and end iterator functions
Time_Date                  Time and date functions
TypeTraits                 Type traits class template and specializations

Source modules may have header and implementation files or just header files. Only the header files for the modules in green would normally be included directly in project code, but the other headers can be used if desired. Classes intended for use in project code are forward declared in headers of the form Class.fwd.hh along with typedef names that are provided for coding convenience.


ObjexxFCL Applications

The ObjexxFCL is compatible with common 32-bit and 64-bit platforms. Very large (64-bit size type) FArray and ChunkVector arrays are supported on 64-bit platforms, but indexing into each dimension of an FArray is still done with int types, so each dimension's index range is limited to the range of a (32-bit) int.

The ObjexxFCL is currently intended for use with single-threaded applications and is not thread safe.

The ObjexxFCL can be built into a shared library or dynamic link library (DLL) but such use should be carefully validated on each platform/compiler combination to assure proper functioning. Using a shared library built with one compiler with executables built with another version of that compiler or a different compiler may not work due to the use of different C++ ABIs.


ObjexxFCL Design

Everything in the ObjexxFCL lives in the ObjexxFCL namespace. Normally projects would bring everything into visibility with a "using namespace ObjexxFCL;" directive as in the ObjexxFCL.Project.hh header provided. Even with such a directive the ObjexxFCL:: prefix can be used for explicit disambiguation.

The design of the ObjexxFCL is focused on providing near-seamless Fortran migration support and near-Fortran performance. It is not intended to provide a complete linear algebra library or high-level matrix operations. Programming errors in the use of the ObjexxFCL are caught by assertions rather than by always-on run-time checks so that release builds are not slowed down; this requires testing to be done with debug builds, which enable the assertion checks. C++ exceptions are not used, to avoid both their performance cost and the burden on project code of handling them.


Dimension

Dimension objects are size parameters for use in the FArray index ranges. Dimension expressions can be formed by combining Dimensions and constants with common mathematical operators { + - * / } and these expressions can be used in index range specifications. Dimensions automatically notify dependent index ranges and FArrays when set/changed via an Observer pattern framework (see below) and those FArrays will resize and reinitialize themselves as needed when notified of a change to a Dimension upon which they depend. Also, note that:

  • The default action when a size-modifying function is called on a Dimension is to notify its Observers even if the Dimension's internal or external value hasn't changed. The reason is that real FArrays can have initializers that get run whenever the array is resized, even when the size didn't change and no allocation occurs. Without this behavior client code depending on that initialization would have to check if the Dimension was actually changed and if not trigger the initialization process through another type of notify operation. Dimension assignment functions are provided that only notify-if-changed ("nic") for use when re-initialization is not required when a Dimension's value hasn't changed.
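The difference between the default notification and the notify-if-changed variant can be illustrated with a minimal, self-contained sketch. It does not use the ObjexxFCL API: SketchDimension, SketchArray, assign, assign_if_changed, and on_resize are hypothetical names chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Dimension (not the ObjexxFCL class)
class SketchDimension {
public:
    using Callback = std::function<void(int)>;

    void attach(Callback cb) { observers_.push_back(std::move(cb)); }

    // Default behavior: always notify, even if the value is unchanged,
    // so that observers can re-run their initializers
    void assign(int v) { value_ = v; notify(); }

    // Notify-if-changed ("nic") behavior: silent when the value is unchanged
    void assign_if_changed(int v) {
        if (v != value_) { value_ = v; notify(); }
    }

    int value() const { return value_; }

private:
    void notify() const { for (auto const& cb : observers_) cb(value_); }

    int value_ = 0;
    std::vector<Callback> observers_;
};

// Hypothetical array whose initializer runs on every notification
struct SketchArray {
    std::vector<double> data;
    int init_count = 0;

    void on_resize(int n) {
        assert(n >= 0); // debug-only check
        data.assign(static_cast<std::size_t>(n), 0.0); // resize and reinitialize
        ++init_count;
    }
};
```

With an array attached through d.attach([&a](int n) { a.on_resize(n); }), calling d.assign(3) twice reinitializes the array twice, whereas a following d.assign_if_changed(3) leaves the array contents alone.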


IndexRange

The IndexRange classes encapsulate the arbitrary index range that Fortran arrays can use for each dimension (unlike the zero-based array indexing of native C/C++ arrays).

StaticIndexRange holds a non-dynamic range that can be explicitly changed but is not automatically updated via the Dimension system's Observer pattern. Dimension expressions can be assigned to StaticIndexRanges but those ranges take the expressions' current values and are not notified of Dimension changes. StaticIndexRange is used for the argument FArray index ranges since those arrays are not designed for automatic redimensioning.

DynamicIndexRange holds a dynamic range that is automatically updated by changes to any Dimensions that it depends on. DynamicIndexRange is used for the real and proxy FArray index ranges. When a DynamicIndexRange changes its Dimensions it prevents automatic notification updates by them and handles the notification itself.

Zero-sized index ranges are indicated by index ranges of the form [lower,lower-1]. (Zero-sized FArrays are supported.)

"Unbounded" index ranges, having unknown upper bounds, have an index range of the form [lower,lower-2] with size given by a constant named npos that is defined as -1 cast to the unsigned type size_t (4294967295 for a 32-bit size_t). Unbounded proxy or argument FArrays are created when a bare array element is passed (the "faster" method) to a proxy or argument array pass-by-value function argument.
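The zero-sized and unbounded conventions can be summarized in a small sketch. SketchRange and sketch_npos are hypothetical names, not the ObjexxFCL IndexRange classes, and the sketch ignores integer overflow near the limits of int.

```cpp
#include <cassert>
#include <cstddef>

// Size reported for unbounded ranges: -1 cast to size_t (hypothetical name)
constexpr std::size_t sketch_npos = static_cast<std::size_t>(-1);

// Hypothetical inclusive index range [l,u] using the conventions above
struct SketchRange {
    int l;
    int u;

    bool unbounded() const { return u == l - 2; }  // [l,l-2]: upper bound unknown
    bool zero_sized() const { return u == l - 1; } // [l,l-1]: empty range

    std::size_t size() const {
        assert(u >= l - 2); // anything lower is not a legal range in this sketch
        return unbounded() ? sketch_npos : static_cast<std::size_t>(u - l + 1);
    }

    // An unbounded range can only check the lower bound
    bool contains(int i) const { return i >= l && (unbounded() || i <= u); }
};
```

For example, SketchRange{1,5} has size 5, SketchRange{3,2} has size 0, and SketchRange{1,-1} is unbounded and reports sketch_npos.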


FArray Hierarchy

Design

The FArray hierarchy is designed to achieve a number of goals:

  • Fortran-compatibility:
    • Column-major, contiguous storage
    • Arbitrary index ranges for each dimension
    • Array passing "tricks"
  • Fast, near-Fortran run-time performance

The data is stored in a dynamically allocated array that is owned by the corresponding real array and pointed to, but not owned, by any proxy or argument arrays that might refer to the real array. The column-major ordering is obtained by the formulas giving the index into the linear array from the set of array dimensional indexes.
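For a rank-3 array with index ranges [l1,u1], [l2,u2], [l3,u3] and sizes s1 = u1-l1+1 and s2 = u2-l2+1, the standard column-major mapping is index = (i1-l1) + s1*((i2-l2) + s2*(i3-l3)), so the first index varies fastest. A minimal sketch of that formula follows; Range and column_major_index are hypothetical names, not ObjexxFCL functions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical inclusive index range [l,u]; assumes u >= l
struct Range {
    int l;
    int u;
    std::size_t size() const { return static_cast<std::size_t>(u - l + 1); }
    bool contains(int i) const { return l <= i && i <= u; }
};

// Linear index of element (i1,i2,i3) in column-major order
inline std::size_t column_major_index(Range r1, Range r2, Range r3,
                                      int i1, int i2, int i3) {
    // Debug-only bounds check, compiled out when NDEBUG is defined
    assert(r1.contains(i1) && r2.contains(i2) && r3.contains(i3));
    std::size_t const o1 = static_cast<std::size_t>(i1 - r1.l);
    std::size_t const o2 = static_cast<std::size_t>(i2 - r2.l);
    std::size_t const o3 = static_cast<std::size_t>(i3 - r3.l);
    return o1 + r1.size() * (o2 + r2.size() * o3);
}
```

With ranges [1,2], [1,3], [0,1] the element (2,1,0) maps to linear index 1 and the last element (2,3,1) maps to 11, the final slot of the 12-element data array.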

On the assumption that subscripting calls are the most common and performance-critical, the FArray design uses some cached values to speed up the subscript operations. The sizes of all but the last dimension's index ranges are cached and an offset pointer into the data array is cached. The const subscript operator returns its value by reference, which may have a small performance cost for built-in numeric types on some platforms but is necessary to support the passing of array elements to array arguments. Linear (one-dimensional) indexing is provided for very fast access to a sequence of array elements whose linear index is easily calculated.
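The idea behind the cached values can be sketched for a rank-2 array. Folding the lower bounds into one precomputed shift reduces each subscript call to a multiply, an add, and a subtract. This sketch caches an integer shift instead of a shifted pointer, to stay within defined pointer arithmetic; Sketch2D, extent, and linear are hypothetical names and this is not the FArray implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rank-2 column-major array with cached subscript values
class Sketch2D {
public:
    Sketch2D(int l1, int u1, int l2, int u2)
     : l1_(l1), u1_(u1), l2_(l2), u2_(u2),
       s1_(extent(l1, u1)),   // cached size of the first dimension
       shift_(l1 + s1_ * l2), // cached combined lower-bound offset
       data_(static_cast<std::size_t>(s1_ * extent(l2, u2)))
    {}

    // (i-l1) + s1*(j-l2) rewritten as i + s1*j - shift
    double& operator()(int i, int j) {
        assert(l1_ <= i && i <= u1_ && l2_ <= j && j <= u2_); // debug-only
        return data_[static_cast<std::size_t>(i + s1_ * j - shift_)];
    }

    // Linear (one-dimensional) access, zero-based in this sketch
    double& linear(std::size_t k) {
        assert(k < data_.size());
        return data_[k];
    }

    std::size_t size() const { return data_.size(); }

private:
    // Number of elements in [l,u]; this sketch assumes non-empty ranges
    static std::ptrdiff_t extent(int l, int u) {
        assert(u >= l);
        return static_cast<std::ptrdiff_t>(u) - l + 1;
    }

    int l1_, u1_, l2_, u2_;
    std::ptrdiff_t s1_;
    std::ptrdiff_t shift_;
    std::vector<double> data_;
};
```

Because operator() returns a reference, the address of an element can be taken and handed to code that treats it as the start of a sub-array, which is what the argument arrays described below rely on.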

The proxy FArrays and argument FArrays are proxy objects that provide a view to the data of another array but act as if it is their own data. Proxy FArrays differ from argument FArrays in that they can reattach themselves automatically to the new data of a reallocated source array and will adjust their dimensions if Dimension objects used in their index ranges are changed. Argument FArrays are statically dimensioned and cannot reattach to arrays that are resized: they are quickly constructed for use in function argument lists.

When a real or proxy FArray changes its IndexRange(s) it prevents automatic notification updates and does a manual update after all the changes for efficiency.

Real, proxy, and argument FArrays can be passed by reference when the function array type will always match that of the caller; a reference to the common FArray base class can be passed any FArrays of the same rank and can perform all of the common array operations.

The proxy/argument arrays "work" by grabbing a pointer to the passed array, array section, or element and, when possible, extracting the size of the actual data section. They can then reinterpret the pointed-to data as an array of their declared rank. Since function argument declarations cannot contain constructor arguments, when the passed array is not of the same rank and dimensions as the argument array, or when array sections or elements will be passed, it is necessary to set the argument array dimensions with a call to the dimension member function, as in A.dimension(3,4), before the array is used in the function.

When array elements are passed to argument arrays the argument array can only extract the address of that element for its data pointer (another reason the const subscript operator returns by reference) and it has indeterminate size. The dimension call can set a size but this cannot be checked against the actual underlying data array. The loss of size information will propagate through subsequent passing of that argument array, eliminating the possibility of bounds checking for those arrays. For this reason the argument member function, a, is provided to pass "safer", as in A.a(2,3): an FArraySection object is constructed and passed that contains the data pointer and size information. There is a performance cost for the construction of FArraySections that remains in release builds so there is a definite tradeoff.
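The tradeoff between passing a bare element and passing a section can be sketched as follows. SketchArg2D, SketchSection, unknown_size, and size_known are hypothetical stand-ins for illustration, not the ObjexxFCL classes, and the sketch uses 1-based indexes only.

```cpp
#include <cassert>
#include <cstddef>

// Marker for "size not known" (hypothetical name)
constexpr std::size_t unknown_size = static_cast<std::size_t>(-1);

// Hypothetical section: a data pointer plus the number of elements behind it
struct SketchSection {
    double* data;
    std::size_t size;
};

// Hypothetical rank-2 argument array: a non-owning, 1-based view
class SketchArg2D {
public:
    // "Faster" path: only the element address is available, size is unknown
    explicit SketchArg2D(double& element)
     : data_(&element), size_(unknown_size) {}

    // "Safer" path: the section carries the size of the remaining data
    explicit SketchArg2D(SketchSection s)
     : data_(s.data), size_(s.size) {}

    // Set the dimensions; checkable only when the size is known
    void dimension(std::size_t n1, std::size_t n2) {
        assert(size_ == unknown_size || n1 * n2 <= size_);
        n1_ = n1;
        n2_ = n2;
    }

    double& operator()(std::size_t i, std::size_t j) {
        assert(1 <= i && i <= n1_ && 1 <= j && j <= n2_); // debug-only
        return data_[(i - 1) + n1_ * (j - 1)];
    }

    bool size_known() const { return size_ != unknown_size; }

private:
    double* data_;
    std::size_t size_;
    std::size_t n1_ = 0;
    std::size_t n2_ = 0;
};
```

Given a 12-element store, SketchArg2D a(store[4]) followed by a.dimension(2,3) views elements 4 through 9 as a 2x3 array, but a.dimension(4,4) would also be accepted because nothing records that only 8 elements remain. Constructing from SketchSection{&store[4], 8} lets the debug build reject the oversized request.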

Array assignment operators have value semantics and will not resize a real array or attach an argument array to a new array. This means that arrays used in assignments must have compatible dimensions. Real arrays can be resized by the dimension member function: this will invalidate any of their argument arrays.

In order to achieve maximal run-time performance no array bounds checks are performed for any FArray classes in release builds (when NDEBUG is defined). Bounds are checked by asserts in debug builds. All new code using FArrays should be tested with debug builds.

The FArray implementation is heavily tuned for performance and thus has some unusual features:

  • Protected data and some manually inlined functions are used to improve the performance of non-inlining debug builds.
  • Overrides of non-virtual functions are used to allow more efficient calls to be made from the concrete FArray classes.

Extensions

The FArray hierarchy could be extended in many ways for specific applications:

  • Alternate bounds-checked subscript functions (like std::vector::at): this requires the use of C++ exceptions with some performance impact.
  • Additional whole-array operations: sums, multiplication, inner products. Avoiding temporary arrays where possible is important for performance (expression templates may be worthwhile).
  • An optional data-preserving resize during automatic redimensioning policy.
  • Linear algebra operations: Gaussian and iterative solvers, inversion, etc.

Objexx has developed some of these and can develop custom extensions for clients.

In many cases it may make more sense to interface with existing matrix and numerical libraries. FArrays provide access to their column-major data arrays so they can work directly with libraries that accept arrays with this ordering. Copy in/out semantics will be required to interface with other array representations such as nested std::vectors, TNT, and Blitz++: this should be done as non-member functions declared and defined in separate files to avoid unnecessary dependencies.
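A copy in/out interface of this kind can be sketched as a pair of non-member functions working on a raw column-major buffer. The names copy_out, copy_in, and Nested are hypothetical, and the way the data pointer is obtained from an FArray is left out because it depends on the ObjexxFCL API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Nested = std::vector<std::vector<double>>; // nested[row][col]

// Copy out: column-major buffer -> nested vectors
inline Nested copy_out(double const* col_major,
                       std::size_t rows, std::size_t cols) {
    Nested nested(rows, std::vector<double>(cols));
    for (std::size_t j = 0; j < cols; ++j) {
        for (std::size_t i = 0; i < rows; ++i) { // walk the buffer contiguously
            nested[i][j] = col_major[i + rows * j];
        }
    }
    return nested;
}

// Copy in: nested vectors -> column-major buffer of at least rows*cols elements
inline void copy_in(Nested const& nested, double* col_major) {
    std::size_t const rows = nested.size();
    for (std::size_t i = 0; i < rows; ++i) {
        assert(nested[i].size() == nested[0].size()); // rectangular input
        for (std::size_t j = 0; j < nested[i].size(); ++j) {
            col_major[i + rows * j] = nested[i][j];
        }
    }
}
```

Keeping such functions in their own header means that only the code that needs the conversion pays for including the other library's headers.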


Observer-Based Dynamic Sizing

Dimensions automatically notify dependent index ranges and FArrays when set/changed via an Observer pattern framework and those FArrays will resize and reinitialize themselves as needed when notified of a change to a Dimension upon which they depend. Notification must take place in topologically sorted order (handled by the ObserverGraph class) since Dimensions and Dimension expressions can be interdependent and an FArray may depend on the same Dimension in multiple ways. Also, note that:

  • Observer is a combined Subject-Observer base class for Dimension, DynamicIndexRange, and the real and proxy FArrays.
  • ObserverSingle is a lightweight Observer base class for classes that can have at most one observer per object (such as DynamicIndexRange).
  • ObserverMulti is an Observer base class for classes that can have multiple observers.
  • SetWrapper is an insulating wrapper for std::set (used by ObserverMulti) that can be forward declared unlike std::set to reduce the significant compilation time cost of including <set> in many sources.
  • The default action when a size-modifying function is called on a Subject (Dimension) is to notify its Observers even if the Subject's internal or external value hasn't changed. The reason is that real FArrays can have initializers that get run whenever the array is resized, even when the size didn't change and no allocation occurs. Without this behavior client code depending on that initialization would have to check if the Subject was actually changed and if not trigger the initialization process through another type of notify operation. Size-modifying functions can be added, as provided for Dimension, that only notify-if-changed for use when re-initialization is not required when a Subject's value hasn't changed.
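The need for topologically sorted notification can be illustrated with a small sketch that uses Kahn's algorithm on the part of the dependency graph reachable from the changed Subject. This is a generic technique chosen for illustration and is not necessarily how ObserverGraph is implemented; Graph and notification_order are hypothetical names, nodes are plain integer ids, and the graph is assumed to be acyclic.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <vector>

// Dependency graph: subject id -> ids of its direct observers
using Graph = std::map<int, std::vector<int>>;

// Order in which the observers reachable from source should be updated so
// that each is updated once, after everything it depends on
inline std::vector<int> notification_order(Graph const& g, int source) {
    // Pass 1: find reachable nodes and count incoming edges among them
    std::map<int, int> pending;
    std::set<int> seen{source};
    std::vector<int> stack{source};
    while (!stack.empty()) {
        int const s = stack.back();
        stack.pop_back();
        auto const it = g.find(s);
        if (it == g.end()) continue;
        for (int const o : it->second) {
            ++pending[o];
            if (seen.insert(o).second) stack.push_back(o);
        }
    }

    // Pass 2: Kahn's algorithm, releasing a node when all its subjects are done
    std::vector<int> order;
    std::queue<int> ready;
    ready.push(source);
    while (!ready.empty()) {
        int const s = ready.front();
        ready.pop();
        if (s != source) order.push_back(s);
        auto const it = g.find(s);
        if (it == g.end()) continue;
        for (int const o : it->second) {
            if (--pending[o] == 0) ready.push(o);
        }
    }
    assert(order.size() + 1 == seen.size()); // every reachable node was ordered
    return order;
}
```

For example, let node 0 be a Dimension observed by an index range (1) and by a Dimension expression (2), let the expression be observed by a second index range (3), and let an array (4) observe both ranges. The array then appears once, last in the order, even though it depends on the Dimension through two paths.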

ChunkVector

ChunkVector is designed to support very large 1D arrays. It uses a std::vector of Chunk objects of user-controllable size to avoid trying to allocate massive contiguous blocks of memory in a possibly fragmented memory environment. By using power-of-two Chunk sizes the 2-level indexing can be done with bit shift operations and provides speed competitive with that of std::vector. As of v.2.4.0 ChunkVector was rewritten to hold Chunk objects, which handle their own memory management, so that some problems with using nested std::vectors for the Chunks could be avoided, including:

  • No control over the possible excess capacity in each Chunk (without expensive shrink operations)
  • No way to avoid initialization of elements of built-in value types
  • No bounds checking in debug builds
  • No non-preserving resize operations

ChunkVector::resize was written to take advantage of the ability to swap the old Chunks into the new outer std::vector instead of expensively copying each Chunk as std::vector::resize would do.
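The two-level indexing and the swap-based resize can be sketched together. This is a simplified illustration with hypothetical names (SketchChunkVector, n_chunks), not the ChunkVector implementation: every chunk is allocated at full size and value-initialized, and resize preserves the data in retained chunks.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical chunked vector of double with 2^exponent elements per chunk
class SketchChunkVector {
    using Chunk = std::unique_ptr<double[]>;

public:
    SketchChunkVector(std::size_t size, unsigned exponent)
     : exponent_(exponent), mask_(make_mask(exponent)), size_(0) {
        resize(size);
    }

    // Shift selects the chunk, mask selects the element within it
    double& operator[](std::size_t i) {
        assert(i < size_); // debug-only bounds check
        return chunks_[i >> exponent_][i & mask_];
    }

    // Retained chunks are swapped into the new outer vector, not copied
    void resize(std::size_t new_size) {
        std::size_t const chunk_size = mask_ + 1;
        std::size_t const n = (new_size + mask_) >> exponent_; // round up
        std::vector<Chunk> fresh(n);
        for (std::size_t c = 0; c < n; ++c) {
            if (c < chunks_.size()) {
                fresh[c].swap(chunks_[c]); // cheap pointer swap
            } else {
                fresh[c].reset(new double[chunk_size]()); // zeroed new chunk
            }
        }
        chunks_.swap(fresh);
        size_ = new_size;
    }

    std::size_t size() const { return size_; }
    std::size_t n_chunks() const { return chunks_.size(); }

private:
    static std::size_t make_mask(unsigned exponent) {
        assert(exponent < 8 * sizeof(std::size_t));
        return (static_cast<std::size_t>(1) << exponent) - 1;
    }

    unsigned exponent_;
    std::size_t mask_;
    std::size_t size_;
    std::vector<Chunk> chunks_;
};
```

With an exponent of 2 (chunks of 4 elements), a vector of size 10 uses 3 chunks, and element 5 lives at offset 1 of chunk 1. After resizing to 20 elements the address of element 5 is unchanged because its chunk was swapped, not reallocated.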


Fstring

The Fstring system includes the Fstring and Fsubstring classes and some nonmember Fortran intrinsic and additional functions.

The Fstring class has the semantics of a Fortran string: fixed length, indexed from 1, and trailing spaces ignored in comparisons.

Fstring uses the operator[] for index-based access to the characters as does C++'s std::string. For Fortran-like efficiency bounds checking is performed by asserts in debug builds only: code using Fstring should be tested with debug builds. Fstring could be extended to provide an always-bounds-checked indexing member function (like std::string::at): this would require using C++ exceptions with some performance impact.
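The Fortran string semantics described above can be sketched with a small class built on std::string. SketchFstring and its members are hypothetical names used for illustration and do not reproduce the Fstring interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical fixed-length, 1-based, space-padded string
class SketchFstring {
public:
    explicit SketchFstring(std::size_t len) : s_(len, ' ') {}

    SketchFstring(std::size_t len, std::string const& init) : s_(len, ' ') {
        assign(init);
    }

    // Assignment never changes the length: truncate or pad with spaces
    void assign(std::string const& v) {
        for (std::size_t i = 0; i < s_.size(); ++i) {
            s_[i] = (i < v.size()) ? v[i] : ' ';
        }
    }

    // 1-based indexing, bounds checked in debug builds only
    char& operator[](std::size_t i) {
        assert(1 <= i && i <= s_.size());
        return s_[i - 1];
    }

    std::size_t length() const { return s_.size(); }

    // Length without trailing spaces
    std::size_t len_trim() const {
        std::size_t const p = s_.find_last_not_of(' ');
        return (p == std::string::npos) ? 0 : p + 1;
    }

    // Comparison ignores trailing spaces
    friend bool operator==(SketchFstring const& a, SketchFstring const& b) {
        return a.s_.compare(0, a.len_trim(), b.s_, 0, b.len_trim()) == 0;
    }

private:
    std::string s_;
};
```

Two such strings of lengths 5 and 8 that both hold "ab" followed by spaces compare equal, while assigning "abcdef" to a string of length 3 stores only "abc".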

Fstring concatenation uses operator+ as does std::string.

Fstring interoperates with std::string and C style strings (char *) with constructor, assignment, and generator functions.

Fstring transparently generates Fsubstrings when the single or double index operator() is used, as in name(2,5) or name(7). The single index version generates the tail substring starting at the index (analogous to name(7:) in Fortran). Fsubstrings are not designed for explicit use in project code, although they can be used that way.
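A substring that acts as a view into its parent string can be sketched as follows. For brevity the parent is a plain std::string; SketchSubstring, sub, and tail are hypothetical names, and the assumption that assignment through the view pads or truncates in Fortran fashion belongs to this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical 1-based substring view [l,u] into a parent string
class SketchSubstring {
public:
    SketchSubstring(std::string& parent, std::size_t l, std::size_t u)
     : parent_(&parent), l_(l), len_(u >= l ? u - l + 1 : 0) {
        assert(1 <= l && u <= parent.size()); // debug-only range check
    }

    // Write through to the parent, truncating or space-padding to fit
    SketchSubstring& operator=(std::string const& v) {
        for (std::size_t i = 0; i < len_; ++i) {
            (*parent_)[l_ - 1 + i] = (i < v.size()) ? v[i] : ' ';
        }
        return *this;
    }

    // Copy of the viewed characters
    std::string str() const { return parent_->substr(l_ - 1, len_); }

private:
    std::string* parent_;
    std::size_t l_;
    std::size_t len_;
};

// Analogous to name(l,u)
inline SketchSubstring sub(std::string& s, std::size_t l, std::size_t u) {
    return SketchSubstring(s, l, u);
}

// Analogous to name(l), the tail starting at index l
inline SketchSubstring tail(std::string& s, std::size_t l) {
    return SketchSubstring(s, l, s.size());
}
```

With name holding "ABCDEFGHIJ", sub(name, 2, 5).str() gives "BCDE" and tail(name, 7).str() gives "GHIJ". Assigning "xy" through sub(name, 2, 5) changes name to "Axy  FGHIJ", leaving its length unchanged.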