Keywords: Family Tree Software | Cycle Detection | GEDCOM Format | Event-Driven Model | Data Validation
Abstract: This article examines errors caused by cycle detection in family tree software. After analyzing the limitations of the GEDCOM format, it proposes a less restrictive data model based on real-world events. The article details how event-driven modeling can replace strict assertion validation to handle complex scenarios such as consanguineous relationships, and describes a concrete approach to visualizing duplicate nodes.
Problem Background and Challenges
In family tree software development, developers frequently encounter program errors caused by cycle detection. A typical scenario is a user entering family data that contains consanguineous relationships, such as a parent and child having offspring together. In such cases, traditional family tree assertions flag contradictions like "X cannot be both father and grandfather of Y," even though the data is accurate, rendering the software unusable for that family.
Limitations of the GEDCOM Format
The GEDCOM (Genealogical Data Communication) format, a widely used standard for exchanging family tree data, builds strong assumptions about family relationships into its records. Three limitations stand out. First, in versions up to 5.5.1 the family record is modeled around one husband role and one wife role, which fits same-sex relationships poorly. Second, it offers no dedicated support for the consanguineous marriages that were historically common, so software built around it often treats them as errors. Most importantly, the tree-shaped assumptions commonly associated with it are frequently violated in the real world.
From a historical perspective, consanguineous unions were more prevalent in the 18th and 19th centuries than commonly assumed. These limitations of the GEDCOM format present major challenges for software developers handling real-world data.
Event-Driven Model for the Real World
To address these issues, we recommend adopting an event-based modeling approach for family trees. The core of this method shifts focus from static family relationships to dynamic life events. Key events include:
- Birth events: Recording individual birth information
- Marriage events: Documenting various forms of unions
- Adoption events: Handling non-biological family relationships
- Death events: Recording the end of an individual's life
At the code implementation level, we can define a base event class:
    #include <QDateTime>
    #include <QList>

    class Person;  // assumed to be defined elsewhere

    // Base class for anything that happens to one or more people.
    class FamilyEvent {
    public:
        enum EventType { BIRTH, MARRIAGE, ADOPTION, DEATH };

        FamilyEvent(EventType type, const QDateTime& timestamp,
                    const QList<Person*>& involvedPersons);
        virtual ~FamilyEvent() = default;

        EventType getType() const { return m_type; }
        const QDateTime& getTimestamp() const { return m_timestamp; }
        const QList<Person*>& getInvolvedPersons() const { return m_involvedPersons; }

    private:
        EventType m_type;
        QDateTime m_timestamp;
        QList<Person*> m_involvedPersons;
    };
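To illustrate why events are sufficient, parent-child links can be derived by replaying the event log instead of being stored as a tree. The following is a minimal sketch under stated assumptions: it uses plain standard C++ rather than Qt so that it stands alone, it identifies people by name strings, and the names Event, parentsFromEvents, and exampleLog are hypothetical and not part of the classes above.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the article's FamilyEvent.
enum class EventType { Birth, Marriage, Adoption, Death };

struct Event {
    EventType type;
    int year;                         // coarse timestamp, enough for the sketch
    std::string subject;              // child for Birth/Adoption, deceased for Death
    std::vector<std::string> others;  // parents for Birth/Adoption, spouses for Marriage
};

// Rebuilds the child -> parents relation by replaying the event log.
// No tree shape is assumed: a person may appear as a parent in any number
// of events, including births of their own grandchildren.
std::map<std::string, std::set<std::string>>
parentsFromEvents(const std::vector<Event>& log) {
    std::map<std::string, std::set<std::string>> parents;
    for (const Event& e : log) {
        if (e.type == EventType::Birth || e.type == EventType::Adoption) {
            assert(!e.subject.empty());  // such an event must name the child
            parents[e.subject].insert(e.others.begin(), e.others.end());
        }
    }
    return parents;
}

// Example log: X and M are the parents of A; X and A are the parents of Y.
std::vector<Event> exampleLog() {
    return {
        {EventType::Birth, 1800, "A", {"X", "M"}},
        {EventType::Birth, 1820, "Y", {"X", "A"}},
    };
}
```

Calling parentsFromEvents(exampleLog()) yields X as a parent of both A and Y, so X is father and grandfather of Y without any assertion being violated: the log simply records two births.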
Strategies for Relaxing Assertion Validation
Traditional strict assertion validation often proves counterproductive in family tree software. We suggest converting hard errors into soft warnings, providing users with an "add anyway" option. This strategy balances the necessity of data validation with the complexities of the real world.
In implementation, a validation service class can be created:
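    #include <QString>

    // Person and RelationshipType are assumed to be declared elsewhere.
    class RelationshipValidator {
    public:
        // Note: ERROR clashes with a macro from <windows.h>; rename it if that header is included.
        enum ValidationResult { VALID, WARNING, ERROR };

        ValidationResult validateRelationship(Person* person1, Person* person2,
                                              RelationshipType type);
        QString getWarningMessage() const { return m_warningMessage; }
        QString getErrorMessage() const { return m_errorMessage; }

    private:
        // Hard errors: relationships that cannot exist.
        bool checkLogicalImpossibilities(Person* p1, Person* p2);
        // Soft warnings: rare relationships that do occur in real families.
        bool checkUnusualButPossible(Person* p1, Person* p2);

        QString m_warningMessage;
        QString m_errorMessage;
    };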
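The split between the two checks can be sketched as follows. This is an illustration under assumptions, not the validator above: it uses standard C++ instead of Qt, keys people by name, and the names ParentMap, Verdict, isAncestor, validateParentLink, and shouldStore are hypothetical. The one case treated as logically impossible here is a person becoming their own ancestor; an ancestor reached through a second line of descent is only a warning.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using ParentMap = std::map<std::string, std::set<std::string>>;  // child -> parents

enum class Verdict { Valid, Warning, Error };

// True if `ancestor` can be reached from `person` by following parent links.
bool isAncestor(const ParentMap& parents, const std::string& person,
                const std::string& ancestor) {
    std::set<std::string> seen;
    std::vector<std::string> stack{person};
    while (!stack.empty()) {
        const std::string current = stack.back();
        stack.pop_back();
        if (!seen.insert(current).second) continue;  // already expanded
        const auto it = parents.find(current);
        if (it == parents.end()) continue;
        for (const std::string& p : it->second) {
            if (p == ancestor) return true;
            stack.push_back(p);
        }
    }
    return false;
}

// Classifies the proposed link "newParent is a parent of child".
Verdict validateParentLink(const ParentMap& parents, const std::string& newParent,
                           const std::string& child) {
    assert(!newParent.empty() && !child.empty());
    // Logically impossible: the child would become their own ancestor.
    if (newParent == child || isAncestor(parents, newParent, child)) {
        return Verdict::Error;
    }
    // Unusual but possible: newParent is already an ancestor of child,
    // e.g. X would be both father and grandfather of Y.
    if (isAncestor(parents, child, newParent)) {
        return Verdict::Warning;
    }
    return Verdict::Valid;
}

// A warning blocks the edit only until the user picks "add anyway";
// an error always blocks it.
bool shouldStore(Verdict verdict, bool userChoseAddAnyway) {
    switch (verdict) {
        case Verdict::Valid:   return true;
        case Verdict::Warning: return userChoseAddAnyway;
        case Verdict::Error:   return false;
    }
    return false;  // not reached; keeps compilers quiet
}
```

With A recorded as a child of X and Y as a child of A, proposing X as a parent of Y produces a warning that the user can override, while proposing Y as a parent of X is rejected outright.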
Visualization Solutions
For display issues caused by cyclic relationships, we propose duplicate node visualization techniques. When cyclic relationships are detected, the system can draw multiple instances of the same individual at different positions, using highlighting effects to indicate that these instances represent the same person.
The key implementation lies in node management:
    #include <QList>
    #include <QPair>

    // Person and RelationshipType are assumed to be declared elsewhere.
    // One drawn node; several nodes may refer to the same Person.
    class FamilyTreeNode {
    public:
        explicit FamilyTreeNode(Person* person, int instanceId = 0);

        void setHighlighted(bool highlighted);
        void addConnection(FamilyTreeNode* other, RelationshipType type);

        Person* getPerson() const { return m_person; }
        int getInstanceId() const { return m_instanceId; }  // distinguishes duplicates
        bool isHighlighted() const { return m_highlighted; }

    private:
        Person* m_person;
        int m_instanceId;
        bool m_highlighted = false;
        QList<QPair<FamilyTreeNode*, RelationshipType>> m_connections;
    };
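One possible way to produce such duplicate instances is to unfold the descendant graph into a display tree, emitting a new node each time a person is reached by a different path. The sketch below assumes standard C++ rather than Qt and name strings as person identifiers; ChildMap, DisplayNode, unfold, buildDisplayNodes, and needsHighlight are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using ChildMap = std::map<std::string, std::vector<std::string>>;  // parent -> children

// One drawn box. The same person may own several boxes.
struct DisplayNode {
    std::string person;
    int instanceId;  // 0 for the first occurrence, 1 for the second, ...
    int depth;       // generations below the chosen root
};

// Walks the descendant graph from `person` and emits one DisplayNode per path.
// `path` holds the people between the root and the current node, so a
// directed loop in bad data cannot cause endless recursion.
void unfold(const ChildMap& children, const std::string& person, int depth,
            std::vector<std::string>& path, std::map<std::string, int>& seenCount,
            std::vector<DisplayNode>& out) {
    for (const std::string& p : path) {
        if (p == person) return;  // person is already on this path
    }
    out.push_back({person, seenCount[person]++, depth});
    const auto it = children.find(person);
    if (it == children.end()) return;
    path.push_back(person);
    for (const std::string& child : it->second) {
        unfold(children, child, depth + 1, path, seenCount, out);
    }
    path.pop_back();
}

std::vector<DisplayNode> buildDisplayNodes(const ChildMap& children,
                                           const std::string& root) {
    assert(!root.empty());
    std::vector<std::string> path;
    std::map<std::string, int> seenCount;
    std::vector<DisplayNode> out;
    unfold(children, root, 0, path, seenCount, out);
    return out;
}

// A person is highlighted when more than one instance was drawn.
bool needsHighlight(const std::vector<DisplayNode>& nodes, const std::string& person) {
    int count = 0;
    for (const DisplayNode& n : nodes) {
        if (n.person == person) ++count;
    }
    return count > 1;
}
```

If X has children A and Y, and A also has child Y, unfolding from X draws Y twice: once under A and once directly under X. Because every path gets its own copy of the subtree, heavily intermarried families can produce many nodes, so a real layout may need to cap the depth or collapse repeated subtrees.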
Rethinking Data Structures
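Family trees are not trees in the data-structure sense. Biological parentage forms a directed acyclic graph (DAG): each person has two biological parents, and two lines of descent may converge on the same ancestor. Such convergence, as in a cousin marriage, looks like a cycle only when the direction of the parent-child edges is ignored; a true directed cycle would mean that someone is their own ancestor. Recognizing this distinction is crucial for designing appropriate data structures.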
Proper data structure selection should be based on:
- Support for multiple parents
- Detecting and reporting cycles rather than preventing the data from being entered
- Providing flexible relationship type definitions
- Supporting both timeline views and traditional tree views
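The first two criteria can be sketched with a child-to-parents map and a depth-first search that reports directed cycles in data that is already stored, for instance after an import. This is an illustrative sketch in standard C++; ParentMap, Mark, hasDirectedCycle, containsDirectedCycle, and cycleCheckDemo are hypothetical names, and people are identified by name strings as a simplifying assumption.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using ParentMap = std::map<std::string, std::set<std::string>>;  // child -> parents

enum class Mark { InProgress, Done };

// Depth-first search along parent links. Reaching a person whose search is
// still in progress means someone is recorded as their own ancestor.
bool hasDirectedCycle(const ParentMap& parents, const std::string& person,
                      std::map<std::string, Mark>& marks) {
    const auto found = marks.find(person);
    if (found != marks.end()) return found->second == Mark::InProgress;
    marks[person] = Mark::InProgress;
    const auto it = parents.find(person);
    if (it != parents.end()) {
        for (const std::string& p : it->second) {
            if (hasDirectedCycle(parents, p, marks)) return true;
        }
    }
    marks[person] = Mark::Done;
    return false;
}

// Checks the whole graph; the data itself is never modified or rejected.
bool containsDirectedCycle(const ParentMap& parents) {
    std::map<std::string, Mark> marks;
    for (const auto& entry : parents) {
        if (hasDirectedCycle(parents, entry.first, marks)) return true;
    }
    return false;
}

// Usage: converging lines of descent are not a cycle, self-ancestry is.
void cycleCheckDemo() {
    ParentMap data;
    data["A"] = {"X", "M"};  // X and M are the parents of A
    data["Y"] = {"X", "A"};  // X is both father and grandfather of Y
    assert(!containsDirectedCycle(data));
    data["X"] = {"Y"};       // Y recorded as a parent of X: impossible
    assert(containsDirectedCycle(data));
}
```

The search treats the consanguineous case as ordinary data and flags only the loop in which a person would descend from themselves, which matches the distinction between unusual and impossible relationships.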
Implementation Recommendations and Best Practices
In practical development, we recommend the following strategies:
- Use event sourcing patterns to record all family changes
- Implement configurable validation rule systems
- Provide multiple visualization layout algorithms
- Support format conversion during data import/export
- Establish user feedback mechanisms to collect edge cases
By adopting this real-world event-based modeling approach, family tree software can better handle various complex scenarios while maintaining code robustness and maintainability. The key lies in balancing data integrity with real-world complexity, providing users with tools that are both accurate and flexible.