Word Similarity/Relatedness Using the Wu & Palmer Algorithm
Wu & Palmer – Word Similarity
The Wu & Palmer measure calculates relatedness by considering the depths of the two synsets in the WordNet taxonomy, along with the depth of their LCS (Least Common Subsumer), the deepest ancestor that both synsets share.
The formula is score = 2 * depth(lcs) / (depth(s1) + depth(s2)).
This means that 0 < score <= 1. The score can never be zero because the depth of the LCS is never zero (the depth of the root of a taxonomy is one). The score is one if the two input concepts are the same.
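To make the formula and its bounds concrete, here is a minimal sketch that evaluates it from depths that are already known. The class and method names (WuPalmerFormula, score) are hypothetical helpers written for illustration, not part of WS4J, and the depths in main are made-up values.

```java
// Hypothetical helper, not part of WS4J: evaluates the Wu & Palmer
// formula from precomputed depths (the taxonomy root has depth 1).
final class WuPalmerFormula {

    // score = 2 * depth(lcs) / (depth(s1) + depth(s2))
    static double score(int depthLcs, int depth1, int depth2) {
        // The LCS is an ancestor of (or equal to) both concepts,
        // so it can never be deeper than either of them.
        if (depthLcs < 1 || depth1 < depthLcs || depth2 < depthLcs) {
            throw new IllegalArgumentException(
                "expected 1 <= depth(lcs) <= depth(s1), depth(s2)");
        }
        return 2.0 * depthLcs / (depth1 + depth2);
    }

    public static void main(String[] args) {
        // Identical concepts: the LCS is the concept itself, so the score is 1.
        System.out.println(score(5, 5, 5));
        // Concepts that share only the root: the score is small but still above 0.
        System.out.println(score(1, 4, 6));
    }
}
```

Because depth(lcs) is at least 1 and never exceeds either concept's depth, the result always falls in the range 0 < score <= 1 described above.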
Required library: WS4J (WordNet Similarity for Java)
Input parameters: two words, each with its part of speech.
Returns: The return value is the relatedness score. If no path exists between the two word senses, then a negative number is returned. If an error occurs, then the error level is set to non-zero and an error string is created.
Example:
Word 1: cancer [POS – Noun]
Word 2: disease [POS – Noun]
Initialization of WordNet database
ILexicalDatabase db = new NictWordNet();
RelatednessCalculator rc = new WuPalmer(db);
Word Similarity Method
public double wordSimilarity(String word1, POS posWord1, String word2, POS posWord2) {
    double maxScore = 0D;
    try {
        // Use the most frequent sense (MFS) setting to speed up the calculation.
        WS4JConfiguration.getInstance().setMFS(true);
        // Fetch every synset of each word for the given part of speech.
        List<Concept> synsets1 = (List<Concept>) db.getAllConcepts(word1, posWord1.name());
        List<Concept> synsets2 = (List<Concept>) db.getAllConcepts(word2, posWord2.name());
        // Score every synset pair and keep the highest score.
        for (Concept synset1 : synsets1) {
            for (Concept synset2 : synsets2) {
                Relatedness relatedness = rc.calcRelatednessOfSynset(synset1, synset2);
                double score = relatedness.getScore();
                if (score > maxScore) {
                    maxScore = score;
                }
            }
        }
        System.out.println("Similarity score of " + word1 + " & " + word2 + " : " + maxScore);
    } catch (Exception e) {
        logger.error("Exception : ", e);
    }
    // The original snippet ended without returning a value or closing the method.
    return maxScore;
}
Similarity score for cancer and disease is 0.88
Description:
- Initialize the WordNet database and the WuPalmer object.
- Set MFS (Most Frequent Sense) to true, which speeds up the calculation.
- Get the synsets of each input word for its part of speech.
- Iterate over every pair of synsets and calculate the relatedness score of each pair.
- Return the maximum score across all synset pairs.
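The last two steps can be tried without WordNet installed. The sketch below keeps the same loop shape but swaps the WS4J call for a pluggable scorer, so everything in it (the class MaxOverSenses, the toy scorer, the sense labels) is a stand-in chosen for illustration, not WS4J API. It also shows one way to call a non-static wordSimilarity from main: create an instance first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

// Stand-in for the real setup: the scorer replaces
// rc.calcRelatednessOfSynset(s1, s2).getScore() from the method above.
final class MaxOverSenses {

    private final ToDoubleBiFunction<String, String> scorer;

    MaxOverSenses(ToDoubleBiFunction<String, String> scorer) {
        this.scorer = scorer;
    }

    // Non-static, like the method above, so it must be called on an instance.
    double wordSimilarity(List<String> senses1, List<String> senses2) {
        double maxScore = 0D;
        for (String s1 : senses1) {
            for (String s2 : senses2) {
                double score = scorer.applyAsDouble(s1, s2);
                if (score > maxScore) {
                    maxScore = score;
                }
            }
        }
        return maxScore;
    }

    public static void main(String[] args) {
        // Toy scorer with made-up values, only to exercise the loop.
        ToDoubleBiFunction<String, String> toy =
            (a, b) -> a.equals(b) ? 1.0 : 0.5;
        MaxOverSenses sim = new MaxOverSenses(toy);
        double best = sim.wordSimilarity(
            Arrays.asList("word1#n#1", "word1#n#2"),
            Arrays.asList("word2#n#1"));
        System.out.println("Best score over all sense pairs: " + best);
    }
}
```

Taking the maximum over all sense pairs means the word-level score reflects the closest pair of meanings; if either word has no senses, this sketch returns 0.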
Below is the calculation trace from http://ws4jdemo.appspot.com/
WuPalmer(cancer#n#1, disease#n#1) = 0.88
T1 = HyperTrees(cancer#n#1) =
[1] *ROOT*#n#1 < entity#n#1 < abstraction#n#6 < attribute#n#2
< state#n#2 < condition#n#1 < physical_condition#n#1
< pathological_state#n#1 < ill_health#n#1 < illness#n#1
< growth#n#6 < tumor#n#1 < malignant_tumor#n#1 < cancer#n#1
[2] *ROOT*#n#1 < entity#n#1 < abstraction#n#6 < attribute#n#2
< state#n#2 < condition#n#1 < physical_condition#n#1
< pathological_state#n#1 < ill_health#n#1 < illness#n#1
< disease#n#1 < malignancy#n#1 < malignant_tumor#n#1
< cancer#n#1
T2 = HyperTrees(disease#n#1) =
[1] *ROOT*#n#1 < entity#n#1 < abstraction#n#6 < attribute#n#2
< state#n#2 < condition#n#1 < physical_condition#n#1
< pathological_state#n#1 < ill_health#n#1 < illness#n#1
< disease#n#1
Lowest Common Subsumer(s) = argmax(depth(subsumer(T1,T2))) = { subsumer(T1[2], T2[1]) } = { disease#n#1 }
DepthLCS = depth(disease#n#1) = 11
Depth1 = min(depth( {tree in T1 | tree contains LCS } )) = 14
Depth2 = min(depth( {tree in T2 | tree contains LCS } )) = 11
Score = 2 * DepthLCS / ( Depth1 + Depth2 ) = 2 * 11 / (14 + 11) = 0.88
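The trace can be reproduced with plain lists. The sketch below hard-codes the three hypernym paths shown above (sense numbers dropped for brevity) and makes a simplifying assumption: the LCS of two root-first paths is the last node of their common prefix. The class and method names are hypothetical, and none of this reads from WordNet or calls WS4J.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: paths are copied from the trace, not read from WordNet.
final class HypernymPathDemo {

    static final String ILLNESS = "*ROOT*,entity,abstraction,attribute,state,condition,"
        + "physical_condition,pathological_state,ill_health,illness";

    // T1: the two hypernym paths of cancer; T2: the single path of disease.
    static final List<List<String>> CANCER = Arrays.asList(
        path(ILLNESS + ",growth,tumor,malignant_tumor,cancer"),
        path(ILLNESS + ",disease,malignancy,malignant_tumor,cancer"));
    static final List<List<String>> DISEASE = Arrays.asList(
        path(ILLNESS + ",disease"));

    static List<String> path(String csv) {
        return Arrays.asList(csv.split(","));
    }

    // Length of the common prefix of two root-first paths, which is the
    // depth of their deepest shared ancestor under this simplification.
    static int commonPrefixLength(List<String> a, List<String> b) {
        int n = Math.min(a.size(), b.size());
        int i = 0;
        while (i < n && a.get(i).equals(b.get(i))) {
            i++;
        }
        return i;
    }

    // Length of the shortest path that contains the given concept.
    static int minDepthContaining(List<List<String>> paths, String concept) {
        int best = Integer.MAX_VALUE;
        for (List<String> p : paths) {
            if (p.contains(concept)) {
                best = Math.min(best, p.size());
            }
        }
        return best;
    }

    static double wuPalmer(List<List<String>> t1, List<List<String>> t2) {
        int depthLcs = 0;
        String lcs = null;
        // Try every pair of paths and keep the deepest shared ancestor.
        for (List<String> p1 : t1) {
            for (List<String> p2 : t2) {
                int d = commonPrefixLength(p1, p2);
                if (d > depthLcs) {
                    depthLcs = d;
                    lcs = p1.get(d - 1);
                }
            }
        }
        if (lcs == null) {
            return -1.0; // no shared ancestor; this sentinel is the sketch's own choice
        }
        int depth1 = minDepthContaining(t1, lcs);
        int depth2 = minDepthContaining(t2, lcs);
        return 2.0 * depthLcs / (depth1 + depth2);
    }

    public static void main(String[] args) {
        // LCS = disease (depth 11), Depth1 = 14, Depth2 = 11, so this prints 0.88.
        System.out.println(wuPalmer(CANCER, DISEASE));
    }
}
```

The first cancer path shares only the ten nodes up to illness with the disease path, while the second shares eleven, which is why the LCS comes from T1[2].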
Comments:
- Jothi G, April 7, 2016, 5:43 am: Sir, I am new to Java. This is our main class: public static void main(String[] args) { wordSimilarity("cancer", POS.n, "disease", POS.n); } But it is not working. The error is: Exception in thread "main" java.lang.RuntimeException: Uncompilable source code - non-static method wordSimilarity(java.lang.String,edu.cmu.lti.jawjaw.pobj.POS,java.lang.String,edu.cmu.lti.jawjaw.pobj.POS) cannot be referenced from a static context at similarity1.main(similarity1.java:28) Java Result: 1. What is the correct way to call this function? Thank you.
- Sagar Gole, October 29, 2015, 10:11 am: You can create your own class and add the above code to it. First initialize the ILexicalDatabase and RelatednessCalculator objects, and then call the wordSimilarity method in your main method, like wordSimilarity("cancer", POS.n, "disease", POS.n); Thanks, SJGole
- jothi, October 29, 2015, 7:04 am: Sir, where is the main class in your code? Please give your full code. Thank you.