The data validation rules discussed in this section are subject to change as the system continues to be developed.
The data in GROUPMEMBERS is the result of an automated analysis called smearing, which spreads the group composition survey results over larger time intervals. GROUPMEMBERS contains one row for each individual that the analysis determines to be a member of the surveyed group, for each analyzed group composition time interval.
At present there is extremely limited automated group composition analysis and the GROUPMEMBERS table contains one row per RAW_GROUPMEMBERS row.
An individual can be recorded present in a group at most once -- the combination of GID and the related CHIMPIDS.AnimID must be unique.
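The uniqueness rule above is stated in terms of AnimID rather than the ChimpID code stored on the row. The following is a minimal Python sketch of the check, not the database's actual implementation. It assumes that rows are available as (GID, ChimpID) pairs and that a dictionary maps each ChimpID code to its CHIMPIDS.AnimID; the function name, the sample codes, and the possibility of two codes sharing one AnimID are illustrative assumptions.

```python
from collections import Counter


def duplicate_members(rows, animid_of):
    """Return the (GID, AnimID) pairs that occur more than once.

    rows      -- iterable of (gid, chimpid) pairs (hypothetical layout)
    animid_of -- dict mapping a ChimpID code to its AnimID (hypothetical)
    """
    # Count by AnimID, not by ChimpID code, as the rule requires.
    counts = Counter((gid, animid_of[chimpid]) for gid, chimpid in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up example: two codes that resolve to the same AnimID.
animid_of = {"AA": 10, "AB": 10, "CC": 11}
rows = [(1, "AA"), (1, "AB"), (1, "CC"), (2, "AA")]
print(duplicate_members(rows, animid_of))
```

In this made-up example the individual with AnimID 10 is recorded twice in group 1 under different codes, which is the case a check on the ChimpID code alone would miss.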
A warning is generated when an individual is in a
group comp, when that individual is less than
5 years old, and their current mother is not in the
group comp. The current mother is the biological mother (the
mother indicated when the young individual has a
BIOGRAPHY_DATA.MomID
which is not NULL) if the young individual is never adopted or
adopted after the date of the group comp. The current mother is
the adoptive mother if the date of the group comp is after the
date of permanent adoption (after the, non-NULL,
ADOPTIONS.ADate).
In the case of multiple permanent adoptions the current mother
is the most recent adoptive mother as of the date of the group
comp. When the biological mother is unknown and there are no
known adoptive parents then there are no requirements regarding
presence of a parent. These checks are performed by the warning
system and so do not occur until the warning system is
run.
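The rules for choosing the current mother can be sketched in Python as follows. This is an illustrative reading of the text above, not the warning system's code. The Adoption class and the current_mother function are hypothetical names, only permanent adoptions are assumed to be passed in, and returning "no requirement" on the adoption date itself is an assumption based on the endpoint rule described later in this section.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class Adoption:
    """Hypothetical stand-in for one permanent ADOPTIONS row."""
    adoptive_mother: str
    adate: Optional[date]  # ADOPTIONS.ADate; None models NULL


def current_mother(mom_id: Optional[str],
                   adoptions: Sequence[Adoption],
                   comp_date: date) -> Optional[str]:
    """Return the current mother on comp_date, or None if nothing is required."""
    dated = [a for a in adoptions if a.adate is not None]
    # Assumption: on the adoption date itself the requirement is loosened.
    if any(a.adate == comp_date for a in dated):
        return None
    # Adoptions dated before the group comp; the most recent one wins.
    earlier = [a for a in dated if a.adate < comp_date]
    if earlier:
        return max(earlier, key=lambda a: a.adate).adoptive_mother
    # Never adopted, or adopted only after the group comp: biological mother.
    # mom_id is None when BIOGRAPHY_DATA.MomID is NULL (mother unknown).
    return mom_id
```

A caller would compare the returned ID against the members of the group comp and raise the warning only when the young individual is less than 5 years old and the result is not None.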
The reverse is also true; a warning is generated when the current “mother” is present in the group composition and any of her offspring less than 5 years old are not present. This applies to offspring, whether biological or permanently adopted on or before the date of the follow, who are in the study as of the date of the follow. As above, in the case of adoption, only the current mother/offspring relationship is considered. As above, these checks are performed by the warning system and so do not occur until the warning system is run.
Temporary adoptions do not generate warnings, with the exception that a warning is generated if the focals of a follow are involved in a temporary adoption and either the mother or the infant appears in the group comp without the other.
When an adoption is temporary the infant need not be with the temporary adoptive parent. Significantly, the infant need not be with any parent. The presumption is that temporary adoption may be due to infant abandonment, and the establishment of a temporary adoption is what the system could (but does not) use to flag this situation. But unless a temporary adoption changes at a later date into a permanent adoption, temporary adoptions have no end date.
Rather than have temporary adoptions create situations where the temporarily adopted individual thenceforth is not required to be with any parent, the system uses the existence of a mother/infant follow involving a temporary parent as the flag which triggers suspension of young-infants-must-be-with-parents warnings.
At the various time interval endpoints, the date of permanent adoption, the date of entry into the study, and the date of departure from the study, the enforced rules regarding chimps less than 5 years and their real or permanently adoptive mothers are interpreted in such a way as to loosen the requirements and exclude the date of the interval endpoint.[27] This is because individuals may be born, die, be adopted, etc., at any point during the day.
Because the system sometimes but not always expects offspring less than 5 years old to be part of their “mother's” group composition, care must be taken during analysis to ensure that results are not skewed based on whether or not the offspring's presence is forced.
Presence in the study population is determined by whether the date of the follow is after an individual's BIOGRAPHY_DATA.Entrydate and before the individual's BIOGRAPHY_DATA.Departdate. Because the individual need not be in the study population for the entire day on the date of entry into the study population or the date of departure from the study population the above rules regarding offspring less than 5 years old do not require the mother to be with the offspring on these two days. Unless care is taken during analysis this censoring could lead to biased results.
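The exclusion of the entry and departure dates can be illustrated with a short Python sketch. The function name is hypothetical and both dates are assumed to be non-NULL; how a NULL Departdate is treated is not covered here.

```python
from datetime import date


def in_study(entrydate: date, departdate: date, follow_date: date) -> bool:
    """True when the follow falls strictly between entry and departure.

    Both endpoints are excluded, so the mother/offspring rules are not
    applied on the day of entry or the day of departure.
    """
    return entrydate < follow_date < departdate


entry, depart = date(2001, 3, 1), date(2005, 9, 30)
print(in_study(entry, depart, date(2001, 3, 1)))   # day of entry
print(in_study(entry, depart, date(2003, 1, 15)))  # mid-study
```

An analysis that counts mother/offspring co-occurrence should therefore decide explicitly how to handle follows on these two endpoint dates.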
The perineal swelling
(Swelling) value must be NULL
unless the individual is female. The perineal swelling may not be
more than 0 unless the individual is at
least 7 years old.
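As a rough illustration, the two swelling rules could be checked as in the Python sketch below. The function name and its arguments are hypothetical; in particular, how sex and age are stored and computed in the database is not specified here, so they are passed in as already-derived values.

```python
from typing import List, Optional


def swelling_errors(swelling: Optional[int],
                    is_female: bool,
                    age_years: float) -> List[str]:
    """Return the rule violations for one Swelling value (None models NULL)."""
    errors: List[str] = []
    if swelling is None:
        return errors  # NULL is always acceptable
    if not is_female:
        errors.append("Swelling must be NULL unless the individual is female")
    if swelling > 0 and age_years < 7:
        errors.append("Swelling may not be more than 0 before 7 years of age")
    return errors
```

Note that a value of 0 is allowed for a female of any age; only values above 0 are restricted to individuals at least 7 years old.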
A unique integer identifying the row recording an
individual's presence in the group composition survey.
This column is automatically maintained by the
database, cannot be changed, and must not be
NULL.
A unique integer identifying the group composition survey
during which the individual was found. This column may not be NULL.
The identifier assigned to the group member. The ChimpID code of the individual.
This column may not be NULL.
A Boolean flag indicating whether the identity of the
individual is tentative. TRUE when the individual's identity
is not certain, FALSE when certain.
This column may not be NULL.
The data in GROUPS is the result of an automated analysis called smearing, which spreads the group composition survey results over larger time intervals. GROUPS contains one row for every time interval which is automatically analyzed.
At present there is extremely limited automated group composition analysis and the GROUPS table contains one row per RAW_GROUPS row.
The group composition membership (a GROUPS row) is related to behavioral intervals (INTERVALS rows) based on the group composition time (the RAW_GROUPS.Time value). All intervals with times from the group comp time (inclusive) up to the next group comp time (exclusive) are connected to the group comp.
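The relationship between interval times and group comp times amounts to a lookup over half-open time ranges. The Python sketch below shows one way to express it. It is an illustration only: times are modeled as minutes since midnight, the function name is hypothetical, and the group comp times of a single follow are assumed to be sorted in ascending order.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def group_comp_for_interval(comp_times: Sequence[int],
                            interval_time: int) -> Optional[int]:
    """Return the index of the group comp covering interval_time, or None.

    A group comp covers the range [its time, next group comp time).
    """
    # bisect_right counts the comps at or before interval_time;
    # the last of those is the one that covers the interval.
    i = bisect_right(comp_times, interval_time) - 1
    return i if i >= 0 else None


comp_times = [600, 660]  # group comps at 10:00 and 11:00
for t in (599, 600, 659, 660):
    print(t, group_comp_for_interval(comp_times, t))
```

An interval that starts before the first group comp of the follow has no covering group comp, which the sketch reports as None.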
Each group composition survey must record the presence of at least one individual, the individual who has the role of mother in the follow -- there must be a related row on the GROUPMEMBERS table with the mother's AnimID related to the ChimpID value. This check is performed upon transaction commit when rows are added to the GROUPS table or when rows are deleted from the GROUPMEMBERS table or when rows are inserted or updated on the FOLLOWPARTS table or when rows are updated on the INTERVALS table, and performed immediately when a row is updated on the GROUPMEMBERS table.[28]
Each group composition survey must be related to at least one follow interval -- there must be at least one related row on the INTERVALS table. This check is performed upon transaction commit.
Note that there are some periods of time when group composition was not collected. There are no rules regarding when this occurred; the system permits group composition to be missing no matter the date.[29]
Given the data integrity rules and their manner of
enforcement, group composition survey data must be added to the
database within a single transaction, as follows: Create a
GROUPS row and then those related GROUPMEMBERS
rows required to satisfy the data integrity requirements, update
the related INTERVALS rows with the new
GID value, then relate the group
composition survey data to a follow either by updating (a)
previously created INTERVALS row(s) inserted
with a NULL
INTERVALS.GID
value or by creating new INTERVALS row(s) with the
GID value of the new GROUPS row.
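The single-transaction requirement can be sketched with Python's standard sqlite3 module. This is a toy illustration, not the system's schema or upload code. The table layouts are heavily simplified, the IID and GMID columns and the add_group_comp function are hypothetical, and the explicit check inside the function merely stands in for the integrity checks the real database performs at commit.

```python
import sqlite3

# Simplified, hypothetical schema; the real tables have more columns.
SCHEMA = """
CREATE TABLE "GROUPS" (GID INTEGER PRIMARY KEY);
CREATE TABLE GROUPMEMBERS (GMID INTEGER PRIMARY KEY,
                           GID INTEGER NOT NULL, ChimpID TEXT NOT NULL);
CREATE TABLE INTERVALS (IID INTEGER PRIMARY KEY, GID INTEGER);
"""


def add_group_comp(conn, members, interval_ids):
    """Add one group comp atomically; everything rolls back on failure."""
    with conn:  # commits on success, rolls back if an exception escapes
        gid = conn.execute('INSERT INTO "GROUPS" DEFAULT VALUES').lastrowid
        conn.executemany(
            "INSERT INTO GROUPMEMBERS (GID, ChimpID) VALUES (?, ?)",
            [(gid, chimpid) for chimpid in members])
        conn.executemany(
            "UPDATE INTERVALS SET GID = ? WHERE IID = ?",
            [(gid, iid) for iid in interval_ids])
        # Stand-in for the commit-time checks described above.
        if not members or not interval_ids:
            raise ValueError("group comp needs a member and an interval")
    return gid
```

Because all three steps share one transaction, a failed check leaves no orphaned GROUPS or GROUPMEMBERS rows behind.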
RAW_GROUPMEMBERS contains one row for each individual found to be a member of the surveyed group for each group composition survey.
The data in RAW_GROUPMEMBERS is interpreted and “smeared” into the GROUPMEMBERS table, which provides a more direct link between the follow intervals and the group composition data. However, although the RAW_GROUPMEMBERS data is copied into the GROUPMEMBERS table, any sort of intelligent smearing is not yet implemented.
Almost none of the data validation checks described in this section are implemented. They are all subject to change.
An individual can be recorded present in a group at most once -- the combination of RGID and the related CHIMPIDS.AnimID must be unique.
See the GROUPMEMBERS table documentation for information regarding when mothers are expected to be present in a group composition with their young offspring.
Because offspring less than 5 years old are sometimes but not always forced to be part of their “mother's” group composition, care must be taken during analysis to ensure that results are not skewed based on whether or not the offspring's presence is forced.
Presence in the study population is determined by whether the date of the follow is after an individual's BIOGRAPHY_DATA.Entrydate and before the individual's BIOGRAPHY_DATA.Departdate. Because the individual need not be in the study population for the entire day on the date of entry into the study population or the date of departure from the study population the above rules regarding offspring less than 5 years old do not require the mother to be with the offspring on these two days. Unless care is taken during analysis this censoring could lead to biased results.
The perineal swelling
(Swelling) value must be NULL
unless the individual is female. The perineal swelling may not be
more than 0 unless the individual is at
least 7 years old.
A unique integer identifying the row recording an
individual's presence in the group composition survey.
This column is automatically maintained by the
database, cannot be changed, and must not be
NULL.
A unique integer identifying the group composition survey
during which the individual was found. This column may not be NULL.
The identifier assigned to the group member. The ChimpID code of the individual.
This column may not be NULL.
Code indicating the source that places the individual in the group. The legal values for this column are defined by the GM_ORIGINS support table.
This column may not be NULL.
A Boolean flag indicating whether the identity of the
individual is tentative. TRUE when the individual's identity
is not certain, FALSE when certain.
This column may not be NULL.
RAW_GROUPS contains one row for every survey of group composition.
The data in RAW_GROUPS is interpreted and “smeared” into the GROUPS table, which provides a more direct link between the follow intervals and the group composition data. However, this is not yet implemented.
Almost none of the data validation checks described in this section are implemented. They are all subject to change.
Each group composition survey must record the presence of at least one individual, the individual who has the role of mother in the follow -- there must be a related row on the RAW_GROUPMEMBERS table with the mother's AnimID related to the ChimpID value. This check is performed upon transaction commit when rows are added to the RAW_GROUPS table or when rows are deleted from the RAW_GROUPMEMBERS table or when rows are inserted or updated on the FOLLOWPARTS table or when rows are updated on the INTERVALS table, and performed immediately when a row is updated on the RAW_GROUPMEMBERS table.[30]
Each group composition survey must be related to at least one follow sheet -- there must be at least one related row on the Sheets table. This check is performed upon transaction commit.
The time of the group composition must be unique per follow; the combination of Time and the related INTERVALS.FollowID must be unique.
Note that there are some periods of time when group composition was not collected. There are no rules regarding when this occurred; the system permits group composition to be missing no matter the date.[31]
Given the data integrity rules and their manner of
enforcement, group composition survey data must be added to the
database within a single transaction, as follows: Create a
RAW_GROUPS row and then those related RAW_GROUPMEMBERS
rows required to satisfy the data integrity requirements, update
the related INTERVALS rows with the new
GID value, then relate the group
composition survey data to a follow either by updating (a)
previously created INTERVALS row(s) inserted
with a NULL
INTERVALS.GID
value or by creating new INTERVALS row(s) with the
GID value of the new RAW_GROUPS row.
A unique integer identifying the group composition survey.
This column is automatically maintained by the
database, cannot be changed, and must not be
NULL.
A Boolean flag indicating whether one or more baboons are
present. This column may be NULL when it is not known whether
a baboon is present.
In the data converted from MS Access this column is set
to FALSE whenever there is no data. This may or may not
indicate that baboons are absent; analysis may show that this
simply means that the data was not collected. At the time of
this writing the database content has not been updated to
reflect when group composition surveys flagged the presence of
baboons and when they did not.
The integer identifying the template sheet on which the group composition was recorded.
This column may not be NULL.
Integer identifying the processed (“smeared”) group survey analysis row into which the raw group survey data was incorporated.
When group survey analysis is automated this column will
not be allowed to be NULL. In the meantime this column may be
NULL when there is no automated analysis.
[27] At any rate this is the intended design. These sorts of corner cases are always tricky and there may be bugs.
[28] The infant and the sibling need not be in the group.
[29] The system takes advantage of this; because group composition is not required the group composition can be uploaded independently, after upload of the follow related data.
[30] The infant and the sibling need not be in the group.
[31] The system takes advantage of this; because group composition is not required the group composition can be uploaded independently, after upload of the follow related data.