The Data Map
The data map is a string parameter that describes how columns in the source are mapped onto identifiers in AIMMS. You can add it to DataLink and give it a name. Then this data map can be used for reading and writing by using its name.
What is a data map
There are two kinds of data maps: an old one referred to as the “classic data map”, and a new one referred to as the “data map”. The classic data map was created by reusing parts of the flatfilereader
and augmenting it with support for other providers. The classic data map therefore still has elements that are there for historic reasons. Parts that are no longer used by DataLink still exist for backward compatibility of existing models. The new data map was introduced because the structure of the classic data map, particularly when it comes to column numbers, was too rigid for some new features.
The “Classic” Data Map
A classic data map is a string parameter that can be defined as:
StringParameter SP_ClassicDataMap {
IndexDomain: (
dl::dt, ! Table name
dl::idn, ! Identifier name
dl::cn, ! Column number
dl::dn ! Domain number
);
}
The dl::dt is the index of the set dl::DataTables. We have to make sure that the table name exists in that set before we can use it to specify the data map. We can do this by
dl::DataTables += {'TableNameInSource'} ;
which will add the table name TableNameInSource to the set dl::DataTables.
Suppose we have a set identifier S_TheSet
in AIMMS, and a parameter P_TheParameter
indexed over S_TheSet
. Then we can specify the data map as:
SP_ClassicDataMap(dl::dt,dl::idn,dl::cn,dl::dn) := data {
! table , identifier , C, D : column name
( TableNameInSource, S_TheSet , 1, 1 ) : "ColumnNameInSource",
( TableNameInSource, P_TheParameter , 2, 0 ) : "OtherColumnNameInSource",
};
The numbers 1 and 2 are column numbers. For each table they start at 1 and count up, with all the domains (sets) first, followed by all the parameters. The numbers 1 and 0 are the domain numbers. If the domain number equals the column number, the column is a set that can serve as a domain. If the domain number is zero, the column is a parameter that has all the sets in the table as its domain.
The string values “ColumnNameInSource” and “OtherColumnNameInSource” are the names in the data source.
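For instance, a table with two domain columns and one parameter column would be numbered as in the sketch below; the names OtherTable, S_RowSet, S_ColSet and P_Value are illustrative assumptions, with P_Value indexed over both sets.
dl::DataTables += {'OtherTable'} ;

SP_ClassicDataMap(dl::dt,dl::idn,dl::cn,dl::dn) := data {
    ! table      , identifier , C, D : column name
    ( OtherTable , S_RowSet   , 1, 1 ) : "Row",    ! first domain column
    ( OtherTable , S_ColSet   , 2, 2 ) : "Col",    ! second domain column
    ( OtherTable , P_Value    , 3, 0 ) : "Value"   ! parameter column, numbered after all domains
};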
Add the Map
To use a data map we first have to add it to DataLink and give it a name; this name is then used to tell DataLink which data map we want to use. We can add a data map using the following function:
- dl::AddDataSourceMapping(MapName, DataMap, ColDepend, TableAttribute, ColAttribute)
- Arguments
MapName – String representing the name of the data map
DataMap – The data map we add to DataLink
ColDepend – not used
TableAttribute – not used
ColAttribute – Column attributes (optional)
If we want to use the data map SP_ClassicDataMap
that we specified before and give it the name “TheMapping”, we can do
dl::AddDataSourceMapping(
"TheMapping", ! The name of this data map
SP_ClassicDataMap, ! The data map
dl::DependEmpty, ! not used
dl::TableAttributesEmpty, ! not used
dl::ColAttributeEmpty ! Column Attribute
);
The input arguments ColDepend and TableAttribute are not used; for these, the empty placeholder values dl::DependEmpty and dl::TableAttributesEmpty are passed.
For the column attributes an empty placeholder dl::ColAttributeEmpty is used as well, but here we can also specify attributes that the provider can use. If we define a string parameter SP_ColAttr as:
StringParameter SP_ColAttr {
IndexDomain: (
dl::dt, ! Table name
dl::cn, ! Column number
dl::colattr ! Attribute type
);
}
We could set some attributes for the P_TheParameter column:
SP_ColAttr := data{
! table name , C , attribute : value
( 'TableNameInSource' , 2 , 'Width' ) : "8",
( 'TableNameInSource' , 2 , 'Precision' ) : "2"
};
When we replace the placeholder dl::ColAttributeEmpty by SP_ColAttr as the argument for AddDataSourceMapping, these attribute values will be passed on to the provider.
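The call then looks the same as before, with the last argument replaced:
dl::AddDataSourceMapping(
    "TheMapping",             ! The name of this data map
    SP_ClassicDataMap,        ! The data map
    dl::DependEmpty,          ! not used
    dl::TableAttributesEmpty, ! not used
    SP_ColAttr                ! Column attributes, passed on to the provider
);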
Internally, DataLink keeps track of a list of data maps and their names. A data map can be removed with:
- dl::RemoveDataSourceMapping(MapName)
- Arguments
MapName – String representing the name of the data map
Note
If a procedure in AIMMS contains a call to dl::AddDataSourceMapping, it can quite easily happen that we call it with the same data map name when we rerun the procedure. To prevent DataLink from throwing errors, it is possible to always call dl::RemoveDataSourceMapping before calling dl::AddDataSourceMapping, as in the sketch below.
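A minimal sketch of this remove-then-add pattern, reusing the map name "TheMapping" from before:
! Always remove first, so a rerun does not add the same name twice
dl::RemoveDataSourceMapping("TheMapping");

dl::AddDataSourceMapping(
    "TheMapping",             ! The name of this data map
    SP_ClassicDataMap,        ! The data map
    dl::DependEmpty,          ! not used
    dl::TableAttributesEmpty, ! not used
    dl::ColAttributeEmpty     ! Column Attribute
);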
The New Data Map
The new data map was recently introduced to add functionality that is very hard to implement using the classic data map. Some other future enhancements were also taken into account in designing this data map.
A data map is a string parameter that can be defined as:
StringParameter SP_DataMap {
IndexDomain: (
dl::dt, ! Table name
dl::idn, ! Identifier name
dl::xd, ! eXtra data map number
dl::dma ! Data map attribute
);
}
The third and fourth elements have changed. The third element is an integer value that is used for some new functionality described later. For now we can leave it at 0, to get the same setup as with the classic data map.
The fourth element is the data map attribute. This is a string indicating what kind of information about the table and identifier we are specifying. The most important, and mandatory, attribute is colname, which tells that we are specifying the column name in the source. This mechanism also allows us to specify column attributes in the data map itself, instead of in a separate string parameter as in the classic data map configuration.
The first two elements have not changed, so we also have to make sure that the table name is added to the set dl::DataTables:
dl::DataTables+={'TableName'}; ! define a tablename
We use the same example as in the description of the classic data map. We assume that we have a set identifier S_TheSet
in AIMMS, and a parameter P_TheParameter
indexed over S_TheSet
. Then we can specify the data map as:
dl::DataMap := data{
! table name , identifier , X , attribute : value
( 'TableName' , 'S_TheSet' , 0 , 'colname' ): "ColumnNameInSource",
( 'TableName' , 'P_TheParameter' , 0 , 'colname' ): "OtherColumnNameInSource"
};
Here, instead of first declaring the string parameter, we use the built-in dl::DataMap. This is possible because, after we have added the map to DataLink, we no longer need it and can reuse the same parameter to add more data maps.
If we also want to specify some column attributes we can do:
dl::DataMap := data{
! table name , identifier , X , attribute : value
( 'TableName' , 'S_TheSet' , 0 , 'colname' ) : "ColumnNameInSource",
( 'TableName' , 'P_TheParameter' , 0 , 'colname' ) : "OtherColumnNameInSource",
( 'TableName' , 'P_TheParameter' , 0 , 'width' ) : "8" ,
( 'TableName' , 'P_TheParameter' , 0 , 'precision') : "2"
};
We can see here how the attributes work. Basically it allows us, for the combination TableName, P_TheParameter and 0, to specify three different properties. (This already hints at the role of the extra number, because it will allow us to specify the same attribute for a table/identifier pair more than once.)
For the new style data map a new function is created to add it to DataLink:
- dl::AddDataMap(MapName, DataMap)
- Arguments
MapName – String representing the name of the data map
DataMap – The data map (new style) we add to DataLink
So we can do,
dl::AddDataMap("TheMapping",dl::DataMap);
Again we have to make sure that the data map name does not already exist in DataLink, so we can make the call
dl::RemoveDataSourceMapping("TheMapping");
Since this function only takes a string as argument, we can use the same function to remove a new style data map as well as a classic data map.
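Putting the pieces together, a sketch of the full flow for the new style data map:
! Make sure the name does not already exist in DataLink
dl::RemoveDataSourceMapping("TheMapping");

! (Re)fill the built-in data map parameter
dl::DataMap := data{
    ! table name  , identifier       , X , attribute : value
    ( 'TableName' , 'S_TheSet'       , 0 , 'colname' ) : "ColumnNameInSource",
    ( 'TableName' , 'P_TheParameter' , 0 , 'colname' ) : "OtherColumnNameInSource"
};

! Add it to DataLink under the name "TheMapping"
dl::AddDataMap("TheMapping", dl::DataMap);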
Tip
Keep in mind that the data map is just a string parameter with 4 indices. The order in which we specify everything does not matter. For large complicated data maps you may want to stick to a fixed strategy of ordering. You could group it based on the data map attribute, or you can decide to keep all specifications of an identifier close to each other.
Parameters with Column Indices
One of the limitations of the classic data map is that we need to assign one column number to each parameter. This means a parameter can only be associated with one column in the source, which is a serious limitation.
The Problem
Let’s look at a simple example to understand this limitation. Consider we have a model with
Set S_FoodTypes {
Index: f;
}
Set S_Nutrients {
Index: n;
}
Parameter P_NutrientValue{
IndexDomain: (f,n);
}
To read P_NutrientValue from a source, the table must have at least the columns shown in the one-column table below.
| FoodType        | Nutrients     | NutrientValue |
|-----------------|---------------|---------------|
| Big Mac         | Protein       | 25            |
| Big Mac         | Fat           | 22            |
| Big Mac         | Carbohydrates | 44            |
| Quarter Pounder | Protein       | 32.4          |
| Quarter Pounder | Fat           | 25            |
| etc.            | etc.          | etc.          |
This table is rather awkward to work with and, more importantly, when the source is supplied by another program it is unlikely that it has this form. A more natural way to present this data is shown in the multi-column table.
| FoodType        | Protein | Fat  | Carbohydrates |
|-----------------|---------|------|---------------|
| Big Mac         | 25      | 22   | 44            |
| Quarter Pounder | 32.4    | 25   | 40.4          |
| French Fries    | 5       | 21   | 54            |
| etc.            | etc.    | etc. | etc.          |
There are two main differences between these two tables.
- The “data” values from the “Nutrients” column in the one-column table are now column names in the multi-column table.
- The multi-column table no longer has a column “NutrientValue”, while three columns now hold data for/from P_NutrientValue.
The Solution
We want to read all the data in the multi-column table into one parameter P_NutrientValue, but now the values are split among different columns. For this we introduced the extra index dl::xd in the new data map. This makes it possible to specify more than one column name for a single parameter. Consider the following data map.
dl::DataMap := data{
! table name , identifier , X , attribute : value
( 'TableName' , 'S_FoodTypes' , 0 , 'colname' ) : "FoodType",
( 'TableName' , 'P_NutrientValue' , 1 , 'colname' ) : "Protein",
( 'TableName' , 'P_NutrientValue' , 2 , 'colname' ) : "Fat" ,
( 'TableName' , 'P_NutrientValue' , 3 , 'colname' ) : "Carbohydrates"
};
Here the values 1, 2 and 3 for dl::xd have no special meaning; any positive integer is allowed. The purpose of these numbers is to allow us to specify multiple columns per table/identifier pair, so they only have to be distinct for each column. When DataLink encounters such nonzero numbers, then:
1. DataLink sees the nonzero positive value for dl::xd and concludes that P_NutrientValue is NOT a normal single-column parameter.
2. DataLink looks at the index domain of parameter P_NutrientValue and sees that it has f and n as index domain.
3. DataLink looks at all sets defined for table “TableName” and finds only set S_FoodTypes with index f.
4. DataLink understands that the columns correspond to elements from the set with n as index.
This procedure will only work if the parameter has an index in its index domain that is not mapped to a column in the data map. The location of this index can be anywhere. So if C is the column index, we could have a parameter P(i,j,C), or P(i,C,j), or P(C,i,j).
Tip
The recommended location of the column index is the last element of the index domain. So P(i,j,C) would be preferred over the other possibilities. For reading this is not important. For writing, the table structure has to be constructed from the data in AIMMS, and when the column index is last, the order in which DataLink receives the data most closely resembles the row structure of the table.
The IdxMap Attribute
If we use a parameter with column index then the ‘colname’ attribute specifies the column name in the source. This will also be used as the value for the corresponding index. Suppose we had the set S_Nutrients
defined as:
Set S_Nutrients {
Index: n;
Definition: {
{'p','f','c'}
}
}
If we want to read from the multi-column table we need to somehow tell which column corresponds to which element in S_Nutrients. For this we introduced the data map attribute idxmap to map the index value to a column.
dl::DataMap := data{
! table name , identifier , X , attribute : value
( 'TableName' , 'S_FoodTypes' , 0 , 'colname' ) : "FoodType",
( 'TableName' , 'P_NutrientValue' , 1 , 'colname' ) : "Protein", ! xd = 1: column name "Protein" in the source
( 'TableName' , 'P_NutrientValue' , 1 , 'idxmap' ) : "p", ! xd = 1: index n has value 'p'
( 'TableName' , 'P_NutrientValue' , 2 , 'colname' ) : "Fat" ,
( 'TableName' , 'P_NutrientValue' , 2 , 'idxmap' ) : "f",
( 'TableName' , 'P_NutrientValue' , 3 , 'colname' ) : "Carbohydrates",
( 'TableName' , 'P_NutrientValue' , 3 , 'idxmap'  ) : "c"
};
In this table for ‘TableName’, ‘P_NutrientValue’ and ‘dl::xd = 1’ we see both a colname
and an idxmap
specified.
Here we say that column “Protein” in the source corresponds to P_NutrientValue(f,'p')
in the AIMMS model.
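For example, with this map the first data row of the multi-column table ends up in AIMMS as P_NutrientValue('Big Mac','p') = 25, P_NutrientValue('Big Mac','f') = 22 and P_NutrientValue('Big Mac','c') = 44.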
Valid tables and their domains
DataLink reads and writes tables row by row. This requires that for each parameter all elements of its index domain are present in that row. As a result, parameters depend on the presence of columns corresponding to their index domain, so there is a restriction on the kinds of tables that can be mapped by a data map.
We will call all columns representing a set in AIMMS “Domain Columns”, because their data will also be passed on as values for the index domains of the parameters.
Indices from the same set
In the classic data map we explicitly spell out which column has what domain number. This allows us to use the same set as domain multiple times.
Suppose we have a parameter P(i,j)
and a set:
Set S {
Index: i,j;
}
Then in the data map below it is clear that data from domain column “Si” is mapped onto the first domain index of P(i,j) and data from domain column “Sj” is mapped onto the second domain index:
ClassicDataMap := data{
( 'TableOne' , 'S' , 1 , 1 ) : "Si", ! Set S
( 'TableOne' , 'S' , 2 , 2 ) : "Sj", ! Also set S
( 'TableOne' , 'P' , 3 , 0 ) : "P"  ! Parameter column, numbered after the domain columns
};
We cannot do the same in the new data map because we do not specify the domain number. Instead of using the name of the set S, we can use the names of the indices i and j to map them onto the domain columns Si and Sj:
dl::DataMap := data{
( 'TableOne' , 'i' , 0 , 'colname' ) : "Si", ! Index i ( of set S )
( 'TableOne' , 'j' , 0 , 'colname' ) : "Sj", ! Index j ( also of set S )
( 'TableOne' , 'P' , 0 , 'colname' ) : "P"
};
Parameters in different tables
Because each table is read row by row, we cannot reuse a domain column from a different table. Suppose we want to read parameters P(i) and Q(i) from two different tables. Then we run into problems when we try to create a classic data map:
ClassicDataMap := data{
! table one
( 'TableOne' , 'S' , 1 , 1 ) : "S",
( 'TableOne' , 'P' , 2 , 0 ) : "P",
! table two
( 'TableTwo' , 'Q' , 1 , 0 ) : "Q" ! Obvious: Cannot read this because it is missing the S column
};
Here it becomes immediately clear that we are missing a domain column for the second table. If Q has a domain, we need a column with domain number 1 (and thus column number 1), so Q itself cannot have column number 1.
In the new data map this is less obvious. DataLink will deduce which domain columns correspond to the index domain so we no longer have to express this explicitly. Even if we cannot see this in the data map, we still cannot have a parameter without the required domain column in the table:
dl::DataMap := data{
! table one
( 'TableOne' , 'S' , 0 , 'colname' ) : "S",
( 'TableOne' , 'P' , 0 , 'colname' ) : "P",
! table two
( 'TableTwo' , 'Q' , 0 , 'colname' ) : "Q" ! Not so obvious: Cannot read this because it is missing the S column
};
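For completeness, a sketch of a data map that can be read, assuming the source table TableTwo also contains a column (here also called "S") for the set S:
dl::DataMap := data{
    ! table one
    ( 'TableOne' , 'S' , 0 , 'colname' ) : "S",
    ( 'TableOne' , 'P' , 0 , 'colname' ) : "P",
    ! table two
    ( 'TableTwo' , 'S' , 0 , 'colname' ) : "S", ! domain column present, so Q(i) can be read
    ( 'TableTwo' , 'Q' , 0 , 'colname' ) : "Q"
};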