+ 1
AI Based Family Tree and Social Network
I am trying to build a platform where people can sign up, create their family trees, and add existing family members on the platform. As more people connect, the system will find more connections. Which AI algorithm should be used?
1 Answer
+ 2
It sounds like you're trying to add a feature that helps a user equate family members they add with people that were added by other users.
When finding a match for a given person, I would design a heuristic function to estimate the likelihood that 2 persons in the database represent the same person in reality. The heuristic result could be between 0 and 1. I might also decide on cases where 2 persons are too different to be considered any further. For example, mismatched sex seems like reasonable proof they're not the same person. Simple rules to discard potential matches could improve the efficiency of your search by skipping the larger calculations in your heuristic function. The graphical user interface would then show the results as a sorted and paginated list. The user could pull up this search for any person selected from their family tree and request that the two people be merged.
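To make that concrete, here is a minimal sketch of such a heuristic with cheap rejection rules and a sorted, paginated result list. The `Person` fields, the 5-year cutoff, and the weights are all assumptions picked for illustration, not tuned values.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Person:
    # Hypothetical record layout; the field names are assumptions.
    name: str
    sex: Optional[str] = None
    birth_year: Optional[int] = None
    birth_place: Optional[str] = None


def match_score(a: Person, b: Person) -> float:
    """Estimate, in [0, 1], how likely a and b are the same real person."""
    # Cheap rejection rule: a known sex mismatch ends the comparison early.
    if a.sex and b.sex and a.sex != b.sex:
        return 0.0
    # Second cheap rule (assumed threshold): birth years more than 5 apart.
    if (a.birth_year is not None and b.birth_year is not None
            and abs(a.birth_year - b.birth_year) > 5):
        return 0.0

    # Placeholder weights that sum to 1; exact equality stands in for
    # whatever fuzzier comparison the real heuristic would use.
    score = 0.0
    if a.name.lower() == b.name.lower():
        score += 0.5
    if a.birth_year is not None and a.birth_year == b.birth_year:
        score += 0.3
    if (a.birth_place and b.birth_place
            and a.birth_place.lower() == b.birth_place.lower()):
        score += 0.2
    return score


def ranked_matches(target: Person, candidates: List[Person],
                   page: int = 0, page_size: int = 10) -> List[Tuple[float, Person]]:
    """Return one page of candidates, best score first, rejected ones dropped."""
    scored = [(match_score(target, c), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s > 0.0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    start = page * page_size
    return scored[start:start + page_size]
```

Calling `ranked_matches(selected_person, everyone_else)` would give the first page of suggestions to show next to a "merge" button.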
Sex, names, dates of birth, and general locations of birth should combine to make a nearly unique identity for every person. Some aspects that complicate the problem would be:
- Names being spelled differently or incorrectly.
- Some data not being filled in consistently. Maybe one user adds the name and nothing more. Another user adds a date of birth and just initials instead of a name.
- Some people change their names, so you may need to store multiple names for each person.
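One possible way to cope with incomplete records and multiple names is to compare only the fields both records actually have, and to treat names as a set. The sketch below assumes records are plain dicts with hypothetical keys `names`, `birth_date`, and `birth_place`, and uses made-up weights. It does not try to match initials against full names.

```python
from typing import Optional

# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"name": 0.5, "birth_date": 0.3, "birth_place": 0.2}


def field_agreement(a: dict, b: dict) -> Optional[float]:
    """Weighted agreement over fields present in BOTH records.

    Returns None when the records share no comparable field, so the caller
    can tell "no evidence" apart from "evidence of a mismatch".
    """
    total = 0.0   # weight of the fields we were able to compare
    agreed = 0.0  # weight of the compared fields that matched

    # Each record may hold several names (birth name, married name, ...).
    names_a = {n.lower() for n in a.get("names", [])}
    names_b = {n.lower() for n in b.get("names", [])}
    if names_a and names_b:
        total += WEIGHTS["name"]
        if names_a & names_b:  # any shared name counts as agreement
            agreed += WEIGHTS["name"]

    for field in ("birth_date", "birth_place"):
        va, vb = a.get(field), b.get(field)
        if va is None or vb is None:
            continue  # missing on one side: neither for nor against
        total += WEIGHTS[field]
        if va == vb:
            agreed += WEIGHTS[field]

    if total == 0.0:
        return None
    return agreed / total
```

Normalizing by the weight of the compared fields keeps a sparse record from being penalized for what it simply doesn't say, though a score built on a single field deserves less trust than one built on three.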
For names being spelled differently or incorrectly, you could use the length of the longest common subsequence (https://en.wikipedia.org/wiki/Longest_common_subsequence_problem) to measure similarity.
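As a rough sketch, the longest common subsequence length can be computed with the standard dynamic-programming recurrence and then scaled into a 0 to 1 similarity. Dividing by the longer name's length is my own normalization choice, and `name_similarity` is a hypothetical helper name.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def name_similarity(a: str, b: str) -> float:
    """LCS length scaled to [0, 1] by the longer name (assumed normalization)."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))
```

With this, `name_similarity("Jon", "John")` comes out at 0.75, since the common subsequence "jon" has length 3 and the longer name has 4 letters.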
I would record information about every pair of people that gets merged by an end user. I wouldn't apply any sophisticated AI techniques to it immediately, but once I had snapshots of thousands of pairs that were merged manually, I could use them to improve the heuristic function. Supervised learning techniques or simple statistics could make the heuristic reflect more accurately how much each attribute contributes to deciding a match.
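As one example of what that supervised step might look like, the sketch below fits a small logistic regression with plain gradient descent, so the weights of the heuristic are learned from recorded decisions instead of being hand-picked. Logistic regression is just one option I'm picking for illustration. The feature layout (name similarity, same birth year, same birth place) and the tiny data set are invented for the example, and it assumes you also log pairs that users rejected so there are negative examples.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features: List[List[float]], labels: List[int],
                   epochs: int = 2000, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss w.r.t. the linear output
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights: List[float], bias: float, x: List[float]) -> float:
    """Learned match probability for one feature vector."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Invented toy data: [name_similarity, same_birth_year, same_birth_place].
# Label 1 = a user merged the pair, 0 = a user rejected the suggestion.
X = [[0.95, 1, 1], [0.90, 1, 0], [0.85, 1, 1], [1.00, 0, 1],
     [0.30, 0, 0], [0.40, 1, 0], [0.20, 0, 1], [0.50, 0, 0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_logistic(X, y)
```

The learned `predict` could then replace the hand-written score, and the size of each weight gives a hint about which attributes users actually rely on when they merge people.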